How to assign a list of colors to a specific bar in a stacked bar chart

I have a stacked bar chart showing the count of different comorbidities grouped by sex.

Female is represented by 'pink', Male by 'lightskyblue'.

To make it more obvious that 'None' is not just another comorbidity, I want 'None' (patient had no comorbidity) to use a different set of colors to make it stand out: 'lightcoral' for Female and 'royalblue' for Male.

This is my current stacked bar chart using colors = ['lightskyblue', 'pink']:

comorbidity_counts_by_gender = df.groupby('sex_female')[comorbidity_columns].sum()

comorbidity_counts_by_gender = comorbidity_counts_by_gender[comorbidity_counts_by_gender.sum().sort_values(ascending=True).index].T

colors = ['lightskyblue', 'pink']
bars = comorbidity_counts_by_gender.plot(kind='barh', stacked=True, figsize=(8, 8), color=colors, width=0.8)

plt.title('Distribution of Comorbidities by Gender')
plt.xlabel('Count')
plt.ylabel('')
bars.legend(title='Gender', loc="upper left", bbox_to_anchor=(1, 1), frameon=False)

plt.show()

No matter what I try, I can't seem to assign a different pair of colors to the 'None' bar. Here are two approaches I tried that didn't work:

Try 1:

colors = [['royalblue', 'lightcoral'] if comorbidity == 'None' else ['lightskyblue', 'pink'] for comorbidity in comorbidity_counts_by_gender.index]

This results in ValueError: Invalid color ['lightskyblue', 'pink']

Try 2:

colors = []
for comorbidity in comorbidity_counts_by_gender.index:
  if comorbidity == 'None':
    colors = ['royalblue', 'lightcoral']
  else:
    colors = ['lightskyblue', 'pink']

This always uses ['lightskyblue', 'pink'] for every bar, including 'None'.

Answer by Matt Pitkin (best answer)

Here's an example that changes the colours of the appropriate bars directly:

import pandas as pd
from matplotlib import pyplot as plt


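def setcolors(ax, name="None", colors=("royalblue", "lightcoral")):
    """
    Function to set the colours for the bars for a given category name.
    """

    # get the text of the y-axis tick labels (one per category)
    ytl = [t.get_text() for t in ax.get_yticklabels()]
    numlabels = len(ytl)

    # find the index of the given named label, failing loudly if it is absent
    # (otherwise the last bar would be recoloured by mistake)
    if name not in ytl:
        raise ValueError(f"No bar labelled {name!r} found")
    i = ytl.index(name)
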
    # get the matplotlib rectangle objects representing the bars
    # (note this relies on nothing else having been added to the plot)
    rects = ax.get_children()[0:2 * numlabels]
    nrects = [rects[i], rects[numlabels + i]]
    
    # loop over the two bars for the given named label and change the colours
    for rect, color in zip(nrects, colors):
        rect.set_color(color)
        rect.set_edgecolor("none")


# some mock data
df = pd.DataFrame(
    {
        "Male": [5, 1, 3, 1],
        "Female": [4, 2, 2, 0]
    },
    index=["Smoking", "Hypertension", "None", "Hyperthyroidism"],
)

bars = df.plot(kind="barh", stacked=True, color=["lightskyblue", "pink"])

# change the colors
setcolors(bars)

plt.show()

Note that, by default (I think) the Rectangle objects representing the bars should be the first things in the list returned by get_children(). But, if you add further things to the plot then this may not be the case.
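
If the ordering of get_children() is a concern, one alternative is to go through ax.containers instead. This is only a sketch: it assumes the bars were drawn by a pandas bar plot, so that the axes holds one matplotlib BarContainer per DataFrame column, with the patches in each container following the row order of the DataFrame. The helper name recolor_row is made up for this example.

```python
import pandas as pd
from matplotlib import pyplot as plt


def recolor_row(ax, data, row_label, colors):
    """
    Recolour the stacked segments of one row (hypothetical helper).

    Assumes ax.containers holds one BarContainer per column of `data`,
    in column order, which is how a pandas bar plot draws its bars.
    """
    # position of the row within the DataFrame index
    row = data.index.get_loc(row_label)

    # one container per column; pick the patch belonging to this row
    for container, color in zip(ax.containers, colors):
        container.patches[row].set_facecolor(color)


# the same mock data as above
df = pd.DataFrame(
    {
        "Male": [5, 1, 3, 1],
        "Female": [4, 2, 2, 0],
    },
    index=["Smoking", "Hypertension", "None", "Hyperthyroidism"],
)

ax = df.plot(kind="barh", stacked=True, color=["lightskyblue", "pink"])

# colours are given in column order: Male first, then Female
recolor_row(ax, df, "None", ["royalblue", "lightcoral"])
```

Call plt.show() afterwards to display the figure. The legend still shows the original two colours, since only the individual patches are changed.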