How to add trendlines to stacked bar charts

I want to make a stacked bar chart from a pandas DataFrame and connect equivalent categories between bars with trendlines. However, I want the trendlines to connect the upper and lower edges of each category (as can easily be done in Excel), not its mean value (as is the case in most answers to similar questions on Stack Overflow).

Here is an example image (generated with Excel) of what I want to achieve: (image not shown)

How can this best be achieved?


Edit: GitHub Copilot gave me the following suggestion, which ALMOST does what I want:

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# Assuming df is your DataFrame and it's indexed properly
df = pd.DataFrame({
    'Category1': [1, 2, 3, 4],
    'Category2': [2, 3, 4, 5],
    'Category3': [3, 4, 5, 6]
}, index=['Bar1', 'Bar2', 'Bar3', 'Bar4'])

ax = df.plot(kind='bar', stacked=True)

# Calculate the cumulative sum for each category
df_cumsum = df.cumsum(axis=1)

# Iterate over each category
for i, category in enumerate(df.columns):
    # Get the y coordinates for the upper boundary of each "sub-bar"
    y_upper = df_cumsum[category].values
    # Get the y coordinates for the lower boundary of each "sub-bar"
    y_lower = y_upper - df[category].values
    # Get the x coordinates for each point
    x_coords = np.arange(len(df))
    # Plot the line connecting the upper boundaries
    plt.plot(x_coords, y_upper, marker='o')
    # Plot the line connecting the lower boundaries
    plt.plot(x_coords, y_lower, marker='o')

plt.show()

However, as you can see in the resulting figure, it connects the middles of the upper edges of each category's sub-bar. How can I connect the left and right CORNERS of each category's sub-bar?

(image not shown)

Additional info: in my specific case, no NaN or negative values occur.

Edit 2: I DO have values of zero, though.

There are 2 answers

BigBen On BEST ANSWER

One approach might be to iterate over ax.patches and, for each pair of neighbouring bars, take the top-right corner of the left patch and the top-left corner of the right patch. These become the left and right endpoints of the line segment connecting the two bars:

for i in range(len(ax.patches)-1):
    if (i + 1) % len(df) != 0:
        left = ax.patches[i].get_corners()[2]
        right = ax.patches[i+1].get_corners()[3]
        ax.plot([left[0], right[0]], [left[1], right[1]], c='gray', lw=0.5)

Output: (image not shown)

Importantly, this assumes no NaNs (or zero values) in your data. You also haven't specified what should happen for negative values.
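
The index arithmetic above depends on how the patches are ordered. As far as I can tell, pandas draws a stacked bar chart one column at a time, so ax.patches holds every bar of Category1 first, then every bar of Category2, and so on. That is why each len(df)-th patch is skipped: it is the last bar of its category and has no neighbour to its right. A minimal sketch to inspect this, reusing the example data from the question (the printed layout and the name n_bars are only for illustration):

```python
import pandas as pd
import matplotlib.pyplot as plt

# Example data from the question
df = pd.DataFrame(
    {
        "Category1": [1, 2, 3, 4],
        "Category2": [2, 3, 4, 5],
        "Category3": [3, 4, 5, 6],
    },
    index=["Bar1", "Bar2", "Bar3", "Bar4"],
)
ax = df.plot.bar(stacked=True)

n_bars = len(df)
for i, patch in enumerate(ax.patches):
    # Assumption: patches are grouped per column, so indices 0..3 are
    # Category1 on Bar1..Bar4, indices 4..7 are Category2, and so on.
    category = df.columns[i // n_bars]
    bar = df.index[i % n_bars]
    # Rectangle.get_corners() runs anti-clockwise from (x0, y0):
    # [0] bottom-left, [1] bottom-right, [2] top-right, [3] top-left
    top_left = patch.get_corners()[3]
    top_right = patch.get_corners()[2]
    print(
        f"{i:2d} {category} {bar} "
        f"top-left=({top_left[0]:.2f}, {top_left[1]:.2f}) "
        f"top-right=({top_right[0]:.2f}, {top_right[1]:.2f})"
    )
```

If the printed categories and heights do not line up with your data, the modulo test in the loop above would need adjusting.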

jov14 On

Building on BigBen's extremely helpful answer, and with a lot of additional help from more experienced people, I now have the following function, which also accounts for zero values and which I wanted to share here (I haven't checked it for negative or NaN values, though):

import matplotlib.pyplot as plt

def plot_connected_stacked_barchart2(df):
    ax = df.plot.bar(stacked=True)
    n_bars = len(df)

    def top_corner(idx, corner):
        # If this patch has zero height, step down the stack (one category
        # lower in the same bar) until a non-zero patch or the bottom
        # category is reached, and use that patch's top corner instead.
        # Checking idx >= n_bars first keeps the index from going negative
        # and silently wrapping around to an unrelated patch.
        while idx >= n_bars and ax.patches[idx].get_height() == 0:
            idx -= n_bars
        return ax.patches[idx].get_corners()[corner]

    for i in range(len(ax.patches) - 1):
        # Skip the last bar of each category: it has no neighbour to its right
        if (i + 1) % n_bars != 0:
            left = top_corner(i, 2)       # top-right corner of the current patch
            right = top_corner(i + 1, 3)  # top-left corner of the next patch
            ax.plot([left[0], right[0]], [left[1], right[1]], c='gray', lw=0.5)

    plt.show()
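
An alternative that avoids walking the patch list at all might be to compute the edge heights directly from the data. In the pandas versions I have looked at, a zero-height segment can end up drawn at the base of the bar instead of at the running total, which seems to be why the raw patch corners are off for zeros. The cumulative sum of the DataFrame does not have that problem. The sketch below is only that, a sketch: the function name and the width parameter are my own, it assumes no NaN or negative values, and it assumes pandas centres bar j at x = j (which is why width is passed explicitly rather than relying on the default).

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def plot_connected_stacked_bars(df, width=0.5):
    """Stacked bar chart with lines joining the top corners of each category.

    Hypothetical helper; assumes no NaN and no negative values in df.
    Returns the Axes and the list of connector lines that were drawn.
    """
    ax = df.plot.bar(stacked=True, width=width)

    # Top edge of category k in bar j is the running total across columns
    tops = df.cumsum(axis=1).to_numpy()  # shape (n_bars, n_categories)
    x = np.arange(len(df))               # assumed bar centres: 0, 1, 2, ...

    connectors = []
    for k in range(tops.shape[1]):
        for j in range(len(df) - 1):
            # From the top-right corner of bar j to the top-left corner of bar j+1
            connectors.extend(
                ax.plot(
                    [x[j] + width / 2, x[j + 1] - width / 2],
                    [tops[j, k], tops[j + 1, k]],
                    c="gray",
                    lw=0.5,
                )
            )
    return ax, connectors


# Example with zeros, including a zero in the bottom category
df_zeros = pd.DataFrame(
    {"A": [1, 0, 3], "B": [0, 2, 0], "C": [2, 2, 0]},
    index=["x", "y", "z"],
)
ax, connectors = plot_connected_stacked_bars(df_zeros)
```

Call plt.show() afterwards to display the figure. The lower edge of each category is the upper edge of the one beneath it, so only upper edges are drawn; the bottom of the lowest category is the axis itself.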