Python DataFrame - Boxplot of a data column (which contains 20+ values in the form of a list)

I am a bit new to this.

Consider a dataframe:

df = pd.DataFrame({'info':['1a','1b','1a','1b','1a','1b'],'list1': np.random.randint(0, 10, (6,15)).tolist()},index=list('AABBCC'))

print(df)

I would like to create a boxplot (preferably using seaborn) with a "hue" applied to the 'info' column, so that the box (or boxen) plots for each index value A, B, C sit next to one another in a single plot whose size is set with plt.figure(figsize=(X, Y)).

I am familiar with plotting from a dataframe where each cell holds a single value. However, I am struggling to plot it elegantly when each cell holds a list.

There is 1 answer

Ian Thompson

You'll need to explode your list1 series into long or "tidy" data before using seaborn.boxplot.
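As a small illustration of what exploding does, here is a sketch on a toy frame (the frame `small` and its values are made up for this example, not taken from the question). Each list element becomes its own row, and the index label and the other columns are repeated for it. Note that the exploded column comes back with `object` dtype, so casting it to a numeric type before plotting is a reasonable precaution.

```python
import pandas as pd

# Toy frame (hypothetical values): two rows sharing index label "A",
# each holding a list in the "list1" column.
small = pd.DataFrame(
    {"info": ["1a", "1b"], "list1": [[1, 2, 3], [4, 5]]},
    index=["A", "A"],
)

# One row per list element; "info" and the index label are repeated.
long = small.explode("list1")
print(long)

# explode leaves the column as object dtype, so cast it to integers.
long = long.astype({"list1": int})
print(long.dtypes)
```

Applied to the frame from the question, the same step looks like this: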

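from matplotlib import pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns


# Same frame as in the question: each cell of "list1" holds a list of 15 ints.
df = pd.DataFrame(
    data={
        "info": ["1a", "1b", "1a", "1b", "1a", "1b"],
        "list1": np.random.randint(0, 10, (6, 15)).tolist(),
    },
    index=list("AABBCC"),
)
# One row per list element; explode returns object dtype, so cast back to int.
df = df.explode(column="list1").astype({"list1": int})
# One box per (index label, info) pair, with the "info" boxes side by side.
sns.boxplot(data=df, x=df.index, y="list1", hue="info")
plt.show()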

[Image: the resulting boxplot]
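The question also asks for control over the figure size. One possible way to get that, sketched here under a few assumptions, is to create the figure and axes with `plt.subplots(figsize=...)` and pass the axes to seaborn through the `ax` argument. The helper name `boxplot_from_lists`, the column name `"group"` used for the former index, and the default size of (10, 6) are all made up for this sketch.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns


def boxplot_from_lists(df, list_col, hue_col, figsize=(10, 6)):
    """Draw one box per (index label, hue value) from a column of lists.

    Hypothetical helper: returns the Axes and the long-form frame it plotted.
    """
    long_df = (
        df.explode(list_col)          # one row per list element
        .rename_axis("group")         # "group" is an assumed name for the index
        .reset_index()                # turn the index into a regular column
        .astype({list_col: float})    # explode leaves object dtype
    )
    fig, ax = plt.subplots(figsize=figsize)
    sns.boxplot(data=long_df, x="group", y=list_col, hue=hue_col, ax=ax)
    return ax, long_df


rng = np.random.default_rng(0)
df = pd.DataFrame(
    {
        "info": ["1a", "1b", "1a", "1b", "1a", "1b"],
        "list1": rng.integers(0, 10, (6, 15)).tolist(),
    },
    index=list("AABBCC"),
)
ax, long_df = boxplot_from_lists(df, "list1", "info", figsize=(10, 6))
# Call plt.show() afterwards to display the figure when running interactively.
```

Moving the index into a named column avoids passing `df.index` directly and gives the x-axis a readable label. Swapping `sns.boxplot` for `sns.boxenplot` in the helper should give the boxen variant mentioned in the question, since it accepts the same `data`, `x`, `y`, `hue`, and `ax` arguments.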