Seaborn violinplots not showing correctly


I am using Seaborn's violinplot() and swarmplot() to show data from a DataFrame. The swarmplot works fine, but I am having trouble with the violinplot.

I create all plots in a loop and most of them show up as expected, but in a few figures the violins are missing. The same thing happens when I plot that subset of the data outside the loop.

DataFrame for plots with errors.

Thank you very much for your time!

    CDS source
0   3158    nature
1   2879    DTU
2   2881    DTU
3   3103    dairy
4   2992    nature
5   3127    dairy
6   3127    nature
7   2879    dairy
8   3116    nature
9   3091    nature
10  3014    dairy
11  3003    nature
12  2951    dairy
13  3161    nature
14  2960    nature
15  2971    nature
16  3138    nature
17  3153    nature
18  2878    DTU
19  2882    DTU
20  2880    DTU
21  2880    DTU
22  2942    nature
23  3027    dairy
24  3021    dairy
25  3395    nature
26  3160    nature
27  2997    nature
28  3094    nature
29  2798    nature
30  3082    dairy
31  3061    nature
32  2912    nature
33  2952    nature
34  3154    nature
35  3158    nature
36  2980    dairy
37  3069    dairy
38  3080    nature
39  2880    DTU
40  3301    nature
41  3042    nature
42  3154    nature
43  3034    nature
44  2983    dairy
45  2981    nature
46  3049    nature
47  3090    dairy
48  2987    nature
49  2828    nature
50  2924    nature
51  3108    dairy
52  3128    nature
53  3030    nature
54  3120    nature
55  3176    nature
56  3185    nature
57  3205    nature
58  2987    nature
59  2900    nature
60  3247    nature
61  3144    nature
62  3092    nature
63  2944    dairy
64  3284    nature
65  2947    nature
66  3185    dairy
67  2715    dairy
68  2924    nature
69  2929    nature
70  2961    nature
71  3172    nature
72  3161    nature
73  3200    nature
74  2913    nature
75  3157    nature
76  2965    nature
77  2940    nature
78  3104    dairy
79  3015    dairy
80  3022    dairy
81  3119    nature
82  3189    dairy
83  3179    nature

Code:

import matplotlib.pyplot as plt
import seaborn as sns

# df holds one row per sample, with columns 'species', 'source' and 'CDS'
for species in listofspecies:
    dfplot = df[df['species'].isin([species])]
    ax = sns.violinplot(data=dfplot, x='source', y='CDS', order=["dairy", "DTU", "nature"], inner=None)
    ax = sns.swarmplot(data=dfplot, x='source', y='CDS', order=["dairy", "DTU", "nature"], color="white", edgecolor="black", linewidth=0.7)
    plt.show()
    plt.clf()

Error violin plot

Correct:

Correct violin plot

There is 1 answer

JohanC On BEST ANSWER

There are very few data points with source == 'DTU'; moreover, their 'CDS' values are very close to each other. The central violin plot ends up with a height of almost zero.
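To make that concrete, here is a small sketch that compares the spread of the 'DTU' values with the 'dairy' values, using the numbers copied from the table in the question. The names cds_by_source and value_range are only illustrative, and the 'nature' rows are left out to keep the listing short.

```python
# CDS values copied from the question's table for two of the three sources.
cds_by_source = {
    "DTU": [2879, 2881, 2878, 2882, 2880, 2880, 2880],
    "dairy": [3103, 3127, 2879, 3014, 2951, 3027, 3021, 3082, 2980, 3069,
              2983, 3090, 3108, 2944, 3185, 2715, 3104, 3015, 3022, 3189],
}


def value_range(values):
    """Distance between the largest and smallest value (illustrative helper)."""
    return max(values) - min(values)


# The range is a rough proxy for how tall each violin will be on the y-axis.
spread = {source: value_range(values) for source, values in cds_by_source.items()}

for source, height in spread.items():
    print(f"{source}: n={len(cds_by_source[source])}, range={height}")
# DTU: n=7, range=4
# dairy: n=20, range=474
```

The 'DTU' values span only 4 units while the 'dairy' values span 474, so on a shared y-axis the 'DTU' violin is close to a flat line. The drawn violin extends a little past the data because of the kernel density smoothing, but not enough to change the picture.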

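violinplot() has a parameter scale, which defaults to 'area': every violin is drawn with the same area. Because the 'DTU' violin is almost flat, it has to be very wide to reach that area, and the two much taller violins have to be very narrow to match it. Setting scale='width' gives all violins an equal width instead: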

ax = sns.violinplot(data=dfplot, x='source', y="CDS", order=["dairy", "DTU", "nature"],
                    inner=None, scale='width')
ax = sns.swarmplot(data=dfplot, x='source', y='CDS', order=["dairy", "DTU", "nature"],
                   color=("white"), edgecolor="black", linewidth=0.7, ax=ax)

The image on the left is the generated plot; the image on the right is zoomed in to a very restricted y-region concentrated on the 'DTU' violin plot.

example plot
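One caveat for newer installations: seaborn 0.13 renamed the scale parameter to density_norm, and the old name now triggers a deprecation warning. The sketch below picks the keyword from the installed version. The helper names width_norm_kwargs and plot_equal_width_violins are hypothetical, and the version parsing assumes a version string that starts with "major.minor".

```python
import re


def width_norm_kwargs(seaborn_version):
    """Return the violinplot keyword that gives every violin the same width.

    Assumes seaborn_version starts with "major.minor", e.g. "0.13.2".
    """
    match = re.match(r"(\d+)\.(\d+)", seaborn_version)
    if match is None:
        raise ValueError(f"unrecognised version string: {seaborn_version!r}")
    major_minor = (int(match.group(1)), int(match.group(2)))
    if major_minor >= (0, 13):
        return {"density_norm": "width"}  # name used from seaborn 0.13 on
    return {"scale": "width"}  # name used by earlier releases


def plot_equal_width_violins(dfplot):
    """Draw the violins and the swarm on one Axes (needs seaborn, matplotlib)."""
    # Imported here so the helper above can be used without plotting libraries.
    import matplotlib.pyplot as plt
    import seaborn as sns

    order = ["dairy", "DTU", "nature"]
    ax = sns.violinplot(data=dfplot, x="source", y="CDS", order=order,
                        inner=None, **width_norm_kwargs(sns.__version__))
    sns.swarmplot(data=dfplot, x="source", y="CDS", order=order,
                  color="white", edgecolor="black", linewidth=0.7, ax=ax)
    plt.show()
```

Inside the loop from the question, calling plot_equal_width_violins(dfplot) would replace the two plotting calls and plt.show().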