I keep running into an issue where I group data by certain columns but cannot figure out how to plot the grouped data.

Here is my Data

For example,

import plotly.plotly as py
import plotly.graph_objs as go

xs = df['region'].values
ys = df['AveragePrice'].values
data = [go.Bar(
    x=xs,
    y=ys,
    marker={
        'color': ys,
        'colorscale': 'Viridis'
    }
)]

layout = {
    'xaxis': {
        'categoryorder': 'array',
        'categoryarray': [x for _, x in sorted(zip(ys, xs))]
    }
}

fig = go.FigureWidget(data=data, layout=layout)
fig

This works, but doesn't show what I really want. ^

import plotly.plotly as py
import plotly.graph_objs as go
df1 = df.groupby(['region'])['AveragePrice'].mean()
xs = df1['region'].values
ys = df1['AveragePrice'].values
data = [go.Bar(
    x=xs,
    y=ys,
    marker={
        'color': ys,
        'colorscale': 'Viridis'
    }
)]

layout = {
    'xaxis': {
        'categoryorder': 'array',
        'categoryarray': [x for _, x in sorted(zip(ys, xs))]
    }
}

fig = go.FigureWidget(data=data, layout=layout)
fig

This gives me a key error. ^

1 Answer

Oysiyl On

You need to add .reset_index() to your groupby call. Without it, the result is only a pd.Series with region as its index, not a table with a column named region that you can select:

region
A    1.340
B    1.005
C    1.280
Name: AveragePrice, dtype: float64

So for plotting you need to convert the output of the groupby call back to a pd.DataFrame. Otherwise you can't assign x and y from columns, because those columns don't exist, and you get this error:

KeyError: 'region'
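
To see why the lookup fails, here is a minimal sketch. The sample values are made up for illustration (the asker's real DataFrame is not shown), and the name grouped is only a placeholder:

```python
import pandas as pd

# Hypothetical sample data; the real DataFrame from the question is not shown.
df = pd.DataFrame({
    "region": ["A", "A", "B", "B", "C"],
    "AveragePrice": [1.33, 1.35, 0.93, 1.08, 1.28],
})

# Selecting one column and aggregating yields a Series, not a DataFrame.
grouped = df.groupby(["region"])["AveragePrice"].mean()

print(type(grouped).__name__)  # Series
print(grouped.index.name)      # region

# 'region' is the index of the Series, so a label lookup for it fails.
try:
    grouped["region"]
except KeyError as err:
    print("KeyError:", err)

# The group labels and means are still reachable without reset_index():
xs = grouped.index.values
ys = grouped.values
```

If you would rather keep the Series, reading x from grouped.index and y from grouped.values as in the last two lines should also work for the plot.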

With .reset_index():

df1 = df.groupby(['region'])['AveragePrice'].mean().reset_index()

  region  AveragePrice
0      A         1.340
1      B         1.005
2      C         1.280

Now you have a pd.DataFrame that you can use exactly as in your first block of code (assign x to one column, y to another, and so on). Your code will run and produce a bar chart with the regions on x and the mean AveragePrice of each region on y.