I have sorted a CSV file the way I want it and appended a column so that I can sort my data properly. However, after using concat (I think this is where the issue is, anyway), the headings in the output CSV file have been changed to (0L, 'HeadingTitle'). I just want them to be HeadingTitle.
import numpy as np
import pandas as pd
import pandas.util.testing as tm; tm.N = 3
data = pd.DataFrame.from_csv('MYDATA.csv')
byqualityissue = data.groupby(["CompanyName","QualityIssue"]).size()
df = pd.DataFrame(byqualityissue)
formatted = df.unstack(level=-1)
formatted[np.isnan(formatted)] = 0
includingtotals = pd.concat([formatted,pd.DataFrame(formatted.sum(axis=1),columns=['Total'])],axis=1)
sorted = includingtotals.sort_index(by=['Total'], ascending=[False])
#del sorted['Total']
sorted.to_csv('byqualityissue.csv')
The output headings are:
CompanyName, (0L, 'Equipment'), (0L, 'User'), (0L, 'Neither'), Total
How do I modify this so that I only have the heading titles?
Edit: If I print sorted.columns the output is
Index([(0, u'Equipment'), (0, u'User'), (0, u'Neither'), u'Total'], dtype='object')
In the line df = pd.DataFrame(byqualityissue) you don't give the column a name, so it takes the default value 0. Then, when you call unstack, the result has hierarchical columns with 0 in the first level. To fix this you can substitute the previous line with
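One replacement that avoids the unnamed column, offered as a sketch rather than the exact line intended above, is to call unstack on the Series returned by size() directly instead of wrapping it in a DataFrame first. The sample data below is hypothetical and stands in for MYDATA.csv, and sort_values is used on the assumption of a recent pandas version, where sort_index(by=...) is no longer available.

```python
import pandas as pd

# Hypothetical sample data standing in for MYDATA.csv
data = pd.DataFrame({
    "CompanyName": ["A", "A", "B", "B", "B", "C"],
    "QualityIssue": ["Equipment", "User", "User", "User", "Neither", "Equipment"],
})

byqualityissue = data.groupby(["CompanyName", "QualityIssue"]).size()

# Wrapping the unnamed Series in a DataFrame creates a column called 0,
# so unstack produces two-level columns such as (0, 'Equipment').
nested = pd.DataFrame(byqualityissue).unstack(level=-1)

# Unstacking the Series itself keeps the columns single-level;
# fill_value=0 replaces the missing combinations instead of NaN.
formatted = byqualityissue.unstack(level=-1, fill_value=0)
formatted["Total"] = formatted.sum(axis=1)
result = formatted.sort_values(by="Total", ascending=False)

print(list(nested.columns))
print(list(result.columns))
```

If the two-level columns already exist, another option would be to flatten them afterwards with formatted.columns = formatted.columns.droplevel(0) before adding the Total column.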