Plotly boxplot: groupby option?


I have two boolean variables and I am trying to create a grouped boxplot. Each group should represent one of the variables and contain two boxes, one for TRUE and one for FALSE. Instead, I get two groups, one for TRUE and one for FALSE, each containing two boxes, one per variable, as in the attached graph:

[Image: boxplot]

I understand that groups are derived from the x-axis values. But how can I make Plotly treat the variable names as the groups? The code I used to produce this output:

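# Legacy plotly API (plotly.plotly and the Data/Marker/YAxis wrappers).
# The import alias is assumed from the ploteczki.plot call at the end.
import plotly.plotly as ploteczki
from plotly.graph_objs import Box, Data, Figure, Layout, Marker, YAxis

# raw_matrix is assumed to be a pandas DataFrame loaded earlier.
# Each trace uses a boolean column as x, so TRUE/FALSE become the groups.
trace3 = Box(
    y=raw_matrix.TPS,
    x=raw_matrix.noClassGc,
    name='noClassGc',
    marker=Marker(
        color='#3F5D7D'
    )
)

trace4 = Box(
    y=raw_matrix.TPS,
    x=raw_matrix.aggresiveOpts,
    name='aggresiveOpts',
    marker=Marker(
        color='#0099FF'
    )
)

data = Data([trace3, trace4])
layout = Layout(
    yaxis=YAxis(
        title='confidence',
        zeroline=False
    ),
    boxmode='group',
    boxgroupgap=0.5
)

fig = Figure(data=data, layout=layout)
plot_url = ploteczki.plot(fig, filename='Performance by categoricals parameters')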

There are 2 answers

Answer by etpinard

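You need to rearrange your data arrays so that each Box trace corresponds to one boolean value (TRUE or FALSE) and its 'x' coordinates are the variable names 'noClassGc' and 'aggresiveOpts'. With boxmode='group', the 'x' values define the groups and each trace contributes one box per group.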

This IPython notebook shows you how to do so.
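
As a rough illustration of that rearrangement (not taken from the linked notebook), here is a minimal sketch in plain Python. The sample rows, the helper name grouped_box_traces, and the use of figure dicts instead of the Box/Layout wrappers are all assumptions made for the example; only the column names come from the question.

```python
# Hypothetical sample rows standing in for raw_matrix (the values are made up).
rows = [
    {"TPS": 10.0, "noClassGc": True, "aggresiveOpts": False},
    {"TPS": 12.0, "noClassGc": False, "aggresiveOpts": False},
    {"TPS": 15.0, "noClassGc": True, "aggresiveOpts": True},
    {"TPS": 11.0, "noClassGc": False, "aggresiveOpts": True},
]


def grouped_box_traces(rows, value_key, flag_keys):
    """Build one box trace per boolean value; x holds the variable names.

    Hypothetical helper: it returns plain dicts in plotly's figure-dict
    shape rather than Box objects.
    """
    traces = []
    for flag_value in (True, False):
        xs, ys = [], []
        for key in flag_keys:
            for row in rows:
                if row[key] == flag_value:
                    xs.append(key)             # group label = variable name
                    ys.append(row[value_key])  # measurement for that box
        traces.append(
            {"type": "box", "name": str(flag_value), "x": xs, "y": ys}
        )
    return traces


traces = grouped_box_traces(rows, "TPS", ["noClassGc", "aggresiveOpts"])

# boxmode='group' places the True and False boxes side by side in each group.
figure = {"data": traces, "layout": {"boxmode": "group"}}
```

Each trace now has one 'x' entry per observation, naming the variable the observation belongs to, so the groups on the x-axis are 'noClassGc' and 'aggresiveOpts'. In a recent plotly release, the figure dict should be accepted by plotly.graph_objects.Figure(figure); the legacy Box and Data wrappers take the same x, y and name values.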

Answer by neda

Alternatively, to show each variable as its own group of TRUE/FALSE boxes, you can assign each trace to its own x-axis. Here is an example:

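# Legacy plotly API; raw_matrix_TPS, raw_matrix_noClassGc and
# raw_matrix_aggresiveOpts are assumed to be defined earlier.
import plotly.plotly as py
from plotly.graph_objs import Box, Data, Figure, Layout, Marker, XAxis, YAxis

# First variable: drawn on the default x-axis ('x').
trace0 = Box(
    y=raw_matrix_TPS,
    x=raw_matrix_noClassGc,
    name='noClassGc',
    marker=Marker(
        color='#3F5D7D'
    )
)
# Second variable: drawn on a second x-axis ('x2').
trace1 = Box(
    y=raw_matrix_TPS,
    x=raw_matrix_aggresiveOpts,
    name='aggresiveOpts',
    xaxis='x2',
    marker=Marker(
        color='#0099FF'
    )
)
data = Data([trace0, trace1])
# The two x-axes split the plot width, so each variable gets its own panel.
layout = Layout(
    xaxis=XAxis(
        domain=[0, 0.55]
    ),
    xaxis2=XAxis(
        domain=[0.55, 1]
    ),
    yaxis=YAxis(
        title='confidence',
        zeroline=False
    ),
    boxmode='group',
    boxgroupgap=0.5
)
fig = Figure(data=data, layout=layout)
plot_url = py.plot(fig, filename='Performance by categoricals parameters')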

Here is the link to the plot

To learn more, you can check out the Plotly Python reference.