Pivot or Melt a Pandas DataFrame

Question

Pivot or Melt a Pandas DataFrame

65 views Asked by alter_k At 02 December 2023 at 19:00

I have a DataFrame that has the following columns: 'Series Name', 'Country Name', '2000', '20001' , ..., '2018'. In 'Series Name' I have the variables that I want to analize (for example, 'PBI', 'Surface Area', etc). So, in 'Series Name' I have repeated values (PBI for each Country, for example).

Original DataFrame

Series Name	Country Name	2000 [YR2000]	2001 [YR2001]
Superficie (kilómetros cuadrados)	Estados Unidos	9632030	9632030
Superficie (kilómetros cuadrados)	Unión Europea	4384964,995	4384984,995
Superficie (kilómetros cuadrados)	Japón	377800	377880

I want to reorder the DataFrame in this way: - Columns: 'Country Name', 'Year' and all the unique values of 'Series Name' - Year column: take the the values from 2000 to 2018 for each Country - 'Series Name' columns : each column ('PBI', 'Surface' etc) has the value for the respective Year and Country.

Country Name	Year	Superficie (kilómetros cuadrados)
Estados Unidos	2000	9632030
Estados Unidos	2001	9632030
Unión Europea	2000	4384964,995
Unión Europea	2001	4384964,995
Japón	2000	377800
Japón	2001	377800

I tried with melt, but I really don´t know how to use it:

indicadores_clean_pv = pd.melt(indicadores_clean, id_vars=['Country Name'], var_name='Año')

What I Tried

I´ve also tried :

indicadores_clean_pv.melt(id_vars='Country Name', var_name='Year').groupby(['Country Name','Year']).agg({'Series Name':'first'}).reset_index()

But I can't put the 'Series Name' values in the columns

Can anyone help me? Thanks in advance!

Original Q&A

There are 2 answers

**Corralien** · Answer 1 · 2023-12-02T20:19:23+00:00

You can use pivot_table:

out = (df.pivot_table(index='Country Name', columns='Series Name', 
                      aggfunc='first', sort=False)
         .rename_axis(columns=['Year', None])
         .stack('Year').reset_index()
         .assign(Year=lambda x: x['Year'].str.split().str[0]))

Output:

>>> out
     Country Name  Year Superficie (kilómetros cuadrados)
0  Estados Unidos  2000                           9632030
1  Estados Unidos  2001                           9632030
2   Unión Europea  2000                       4384964,995
3   Unión Europea  2001                       4384984,995
4           Japón  2000                            377800
5           Japón  2001                            377880

**Andrej Kesely** · Answer 2 · 2023-12-02T20:32:56+00:00

You can try simple .stack()/.unstack():

out = (
    df.set_index(["Country Name", "Series Name"])
    .stack()
    .unstack(level=1)
    .rename_axis(columns=None, index=["Country Name", "Year"])
    .reset_index()
)

print(out)

Prints:

     Country Name  Year Superficie (kilómetros cuadrados)
0  Estados Unidos  2000                           9632030
1  Estados Unidos  2001                           9632030
2           Japón  2000                            377800
3           Japón  2001                            377880
4   Unión Europea  2000                       4384964,995
5   Unión Europea  2001                       4384984,995

TechQA.

Pivot or Melt a Pandas DataFrame

There are 2 answers

Related Questions in PYTHON

Related Questions in PANDAS

Related Questions in DATAFRAME

Related Questions in PIVOT-TABLE

Related Questions in MELT

Popular Questions

Trending Questions