How can I plot an errorbar-style scatter plot using min & max values instead of mean & std over time from a dataframe in Python?


Let's say I have the following dataframe:

+-------------------+--------+--------+--------+
|timestamp          |average |min     |max     |
+-------------------+--------+--------+--------+
|2021-08-11 04:05:06|2.0     |1.8     |2.2     |
|2021-08-11 04:15:06|2.3     |2.0     |2.7     |
|2021-08-11 09:15:26|2.5     |2.3     |2.8     |
|2021-08-11 11:04:06|2.3     |2.1     |2.6     |
|2021-08-11 14:55:16|2.6     |2.2     |2.9     |
|2021-08-13 04:12:11|2.1     |1.7     |2.3     |
+-------------------+--------+--------+--------+

I want to plot the average values as a scatter plot, plot the min and max columns as dashed lines, and then fill the areas between average & max and between average & min separately, as shown in Fig. 1.

I found some close examples:


I aim to develop this into something like plt.errorbar() (which deals with mean and std; example 1, example 2), but I just want to illustrate min and max values instead of mean & std over time, as follows:

Fig. 1: without errorbar style. Fig. 2: with errorbar style.

Sadly, I could not find an example of the output in Fig. 2, since errorbar examples normally show mean and std with the data points. For Fig. 1, this post, although written for a different language, covers part of what I want, but the two areas should be filled with different colors (light red and light blue) separately.

Any help and guidance will be appreciated.


There are 2 answers

3
r-beginners On BEST ANSWER

Since you have not specified which visualization library to use, I created the graph with plotly.graph_objects. The graph consists of a scatter trace of the average (line and markers) with error bars, plus line traces for the maximum, average, and minimum. The average line is drawn a second time, invisibly, so that the areas up to the maximum and down to the minimum can be filled separately. The only difference from the output in your question is the color of the lines in the error bars. plotly automatically handles time series data on the x-axis.

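import plotly.graph_objects as go

# df is the DataFrame from the question (columns: timestamp, average, min, max)
fig = go.Figure()
# average as line + markers, with asymmetric error bars reaching up to max and down to min
fig.add_trace(go.Scatter(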
    x=df['timestamp'],
    y=df['average'],
    marker=dict(size=10, color='black'),
    line_color='blue',
    name='average',
    error_y=dict(
        type='data',
        array=df['max']-df['average'],
        arrayminus=df['average']-df['min'],
        visible=True)
    ))

# dashed max line; fill='tonexty' fills the area down to the previous trace (average)
fig.add_trace(go.Scatter(
    x=df['timestamp'],
    y=df['max'],
    mode='lines',
    line=dict(color='rgba(255,0,0,1.0)', dash='dash'),
    fill='tonexty',
    fillcolor='rgba(255,0,0,0.5)',
    name='max'
))
# invisible copy of the average line, used only as the baseline for the min fill
fig.add_trace(go.Scatter(
    x=df['timestamp'],
    y=df['average'],
    mode='lines',
    line_color='rgba(0,0,0,0)',
    fill='tonexty',
    fillcolor='rgba(0,0,0,0)',
    showlegend=False
))
# dashed min line; fill='tonexty' fills the area up to the invisible average line
fig.add_trace(go.Scatter(
    x=df['timestamp'],
    y=df['min'],
    mode='lines',
    line=dict(color='rgba(0,0,255,1.0)', dash='dash'),
    fill='tonexty',
    fillcolor='rgba(0,0,255,0.5)',
    name='min'
))

fig.show()
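The snippet above assumes that `df` already exists. A minimal sketch of building it from the table in the question, assuming pandas is available and that the timestamps should be parsed as datetimes (the names `upper_err` and `lower_err` are hypothetical and only for illustration):

```python
import pandas as pd

# Data copied from the table in the question.
df = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2021-08-11 04:05:06",
        "2021-08-11 04:15:06",
        "2021-08-11 09:15:26",
        "2021-08-11 11:04:06",
        "2021-08-11 14:55:16",
        "2021-08-13 04:12:11",
    ]),
    "average": [2.0, 2.3, 2.5, 2.3, 2.6, 2.1],
    "min": [1.8, 2.0, 2.3, 2.1, 2.2, 1.7],
    "max": [2.2, 2.7, 2.8, 2.6, 2.9, 2.3],
})

# Distances from the average to each extreme. These are the values passed
# to error_y as `array` (upward) and `arrayminus` (downward).
upper_err = df["max"] - df["average"]
lower_err = df["average"] - df["min"]

print(df.assign(upper_err=upper_err, lower_err=lower_err))
```

Because the error bars are given as distances rather than absolute values, both differences should be non-negative whenever min <= average <= max holds for every row.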


D-E-N On

First of all, you should provide copy-and-paste-ready data, so you can get more help from others ;)

If I understand you correctly, you can build your plot step by step, and this should lead to the desired output:

from io import StringIO

import matplotlib.pyplot as plt
import pandas as pd

# You should give data in a better form to make it easier to help you, so somebody can copy/paste data
data = StringIO("""
timestamp|average|min|max
2021-08-11 04:05:06|2.0|1.8|2.2
2021-08-11 04:15:06|2.3|2.0|2.7
2021-08-11 09:15:26|2.5|2.3|2.8
2021-08-11 11:04:06|2.3|2.1|2.6
2021-08-11 14:55:16|2.6|2.2|2.9
2021-08-13 04:12:11|2.1|1.7|2.3
""")

# parse the timestamps as datetimes so the x-axis is a real time axis
df = pd.read_table(data, delimiter="|", parse_dates=["timestamp"])

# draw area between max/average and min/average
plt.fill_between(x='timestamp', y1='average', y2='max', data=df, color="lightcoral")
plt.fill_between(x='timestamp', y1='average', y2='min', data=df, color="lightblue")

# draw dashed lines of min/max
plt.plot(df["timestamp"], df["max"], "r--")
plt.plot(df["timestamp"], df["min"], "b--")

# draw vertical lines from min/max to average
plt.vlines(df["timestamp"], df["max"], df["average"], color="red")
plt.vlines(df["timestamp"], df["min"], df["average"], color="blue")

# draw dots of average
plt.plot(df["timestamp"], df["average"], "k.")

plt.show()

For me, it gives the following result:

[image: resulting plot]

I didn't prettify the rest of the plot, but I think adding nice axes and so on won't be a problem for you.
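As a possible alternative to the two plt.vlines calls, errorbar accepts asymmetric errors as a pair of arrays (distances below the point first, then distances above), so min and max can stand in for mean and std. A minimal sketch, assuming the same data as above and a single color for the bars (the names `lower`, `upper`, `ax`, and `container` are hypothetical and only for illustration):

```python
import matplotlib.pyplot as plt
import pandas as pd

# Data copied from the table in the question.
df = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2021-08-11 04:05:06", "2021-08-11 04:15:06", "2021-08-11 09:15:26",
        "2021-08-11 11:04:06", "2021-08-11 14:55:16", "2021-08-13 04:12:11",
    ]),
    "average": [2.0, 2.3, 2.5, 2.3, 2.6, 2.1],
    "min": [1.8, 2.0, 2.3, 2.1, 2.2, 1.7],
    "max": [2.2, 2.7, 2.8, 2.6, 2.9, 2.3],
})

# errorbar wants distances from the plotted point, not absolute min/max values.
lower = (df["average"] - df["min"]).to_numpy()
upper = (df["max"] - df["average"]).to_numpy()

fig, ax = plt.subplots()

# Shaded bands: average..max in light red, average..min in light blue.
ax.fill_between(df["timestamp"], df["average"], df["max"], color="lightcoral")
ax.fill_between(df["timestamp"], df["average"], df["min"], color="lightblue")

# Dashed outlines for max and min.
ax.plot(df["timestamp"], df["max"], "r--", label="max")
ax.plot(df["timestamp"], df["min"], "b--", label="min")

# Average points with asymmetric bars: yerr is [distances below, distances above].
container = ax.errorbar(
    df["timestamp"], df["average"],
    yerr=[lower, upper],
    fmt="ko", ecolor="gray", capsize=3, label="average",
)

ax.legend()
fig.autofmt_xdate()  # rotate the date labels so they do not overlap
```

Call plt.show() afterwards to display the figure. Unlike the vlines version, each bar here gets one color for both halves; if the red/blue split matters, the vlines approach is the simpler choice.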