I have a dataframe:
df1=pd.DataFrame({
'ID':[101,102],
'Name':['Axel','Bob'],
'US':['GrA','GrC'],
'Europe':['GrB','GrD'],
'AsiaPac':['GrZ','GrF']
})
Which I want to change to this:
df2=pd.DataFrame({
'ID':[101,101,101,102,102,102],
'Name':['Axel','Axel','Axel','Bob','Bob','Bob'],
'Region':['US','Europe','AsiaPac','US','Europe','AsiaPac'],
'Group':['GrA','GrB','GrZ','GrC','GrD','GrF']
})
How do I do it? There is a crosstab function in pandas but it doesn't do this. In Qlik I would simply do
Crosstable(Region,Group,2)
LOAD
ID,
Name,
US,
Europe,
AsiaPac
And I would go from df1 to df2. How can I do this in python (pandas or otherwise)?
This is essentially reshaping your data from a wide format to a long format, as it's known in R parlance. In pandas, you can do this with
pd.melt
:If you need your columns sorted on
ID
orName
andGroup
, as in your example output, you can add.sort_values()
to the expression: