I want to make a stacked bar chart from a pandas DataFrame and connect equivalent categories between bars with trendlines. However, I want the trendlines to connect the upper and lower edges of each category (as can easily be done in Excel), not its mean value (as is the case in most answers to similar questions on Stack Overflow).
Here is an example image (generated with Excel) of what I would want to achieve:
How can this be best achieved?
Edit: GitHub Copilot gives me the following suggestion, which ALMOST does what I want:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
# Assuming df is your DataFrame and it's indexed properly
df = pd.DataFrame({
'Category1': [1, 2, 3, 4],
'Category2': [2, 3, 4, 5],
'Category3': [3, 4, 5, 6]
}, index=['Bar1', 'Bar2', 'Bar3', 'Bar4'])
ax = df.plot(kind='bar', stacked=True)
# Calculate the cumulative sum for each category
df_cumsum = df.cumsum(axis=1)
# Iterate over each category
for i, category in enumerate(df.columns):
    # Get the y coordinates for the upper boundary of each "sub-bar"
    y_upper = df_cumsum[category].values
    # Get the y coordinates for the lower boundary of each "sub-bar"
    y_lower = y_upper - df[category].values
    # Get the x coordinates for each point
    x_coords = np.arange(len(df))
    # Plot the line connecting the upper boundaries
    plt.plot(x_coords, y_upper, marker='o')
    # Plot the line connecting the lower boundaries
    plt.plot(x_coords, y_lower, marker='o')
plt.show()
However, as you can see in the resulting figure, it connects the middle of the upper edge of each category's sub-bar. How can I connect the left and right CORNERS of each category's sub-bar?
Additional Info: In my specific case, no NaN or negative values occur
Edit 2: I DO have values of zero though...
One approach might be to iterate over ax.patches and get the top right and left corners of each pair of bars, which respectively become the left and right coordinates of the line segment connecting the bars.
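A minimal sketch of that idea, using the example DataFrame from the question. It assumes that pandas adds the rectangles to ax.patches one column (category) at a time, with one rectangle per row, so the patches can be sliced into groups of len(df). The `segments` list is only a hypothetical helper for inspecting the result, and the line colour and width are arbitrary choices.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    'Category1': [1, 2, 3, 4],
    'Category2': [2, 3, 4, 5],
    'Category3': [3, 4, 5, 6]
}, index=['Bar1', 'Bar2', 'Bar3', 'Bar4'])

ax = df.plot(kind='bar', stacked=True)
n_bars = len(df)

# Assumption: patches are ordered category by category, one rectangle per bar.
patches = list(ax.patches)
segments = []  # hypothetical helper: collects the ((x0, y0), (x1, y1)) pairs drawn

for start in range(0, len(patches), n_bars):
    group = patches[start:start + n_bars]  # all sub-bars of one category
    for left, right in zip(group[:-1], group[1:]):
        x0 = left.get_x() + left.get_width()  # right edge of the left bar
        x1 = right.get_x()                    # left edge of the right bar
        lower = (left.get_y(), right.get_y())
        upper = (left.get_y() + left.get_height(),
                 right.get_y() + right.get_height())
        # Draw one connector for the lower corners and one for the upper corners
        for y0, y1 in (lower, upper):
            ax.plot([x0, x1], [y0, y1], color='black', linewidth=0.8)
            segments.append(((x0, y0), (x1, y1)))
```

Call plt.show() or ax.figure.savefig(...) afterwards to view the figure.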
Importantly, this assumes no NaNs (or zero values) in your data. You also haven't specified what should happen for negative values.
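Since the question mentions that zeros do occur, one possible alternative is to skip the patches and compute the corners directly from the cumulative sums, which does not depend on how zero-height bars are drawn. This is only a sketch under stated assumptions: the DataFrame below is made-up example data containing zeros, all values are non-negative with no NaNs, and the bar width is passed explicitly so that the corner x positions are known.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical example data with some zero values
df = pd.DataFrame({
    'Category1': [1, 0, 3, 4],
    'Category2': [2, 3, 0, 5],
    'Category3': [3, 4, 5, 0]
}, index=['Bar1', 'Bar2', 'Bar3', 'Bar4'])

bar_width = 0.5  # set explicitly so the corner positions below match the bars
ax = df.plot(kind='bar', stacked=True, width=bar_width)

# Upper and lower edge of every sub-bar: rows are bars, columns are categories
upper = df.cumsum(axis=1).to_numpy(dtype=float)
lower = upper - df.to_numpy(dtype=float)

x = np.arange(len(df))  # bars are centred on 0, 1, 2, ...
half = bar_width / 2

for col in range(df.shape[1]):
    for edge in (lower[:, col], upper[:, col]):
        for j in range(len(df) - 1):
            # From the right corner of bar j to the left corner of bar j + 1
            ax.plot([x[j] + half, x[j + 1] - half],
                    [edge[j], edge[j + 1]],
                    color='black', linewidth=0.8)
```

With a zero value, the upper and lower connectors of that category simply meet at the same point on that bar.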