I have some data that includes dates in a year-quarter format.
Sorting the dataframe works fine. However, when plotting the data, Plotly
automatically reorders the x-axis, placing dates that have missing values for one of the types at the end instead of adhering to the desired order.
# Example data that is not yet ordered on 'Date'
import pandas as pd
import plotly.express as px
df = pd.DataFrame([
['2021-Q4', 'A', 1],
['2021-Q4', 'B', 5],
['2022-Q1', 'B', 5],
['2023-Q2', 'B', 3],
['2023-Q3', 'B', 16],
['2022-Q2', 'B', 4],
['2022-Q2', 'A', 1],
['2022-Q3', 'B', 5],
['2022-Q4', 'B', 6],
['2022-Q4', 'A', 4],
['2023-Q1', 'A', 1],
['2023-Q1', 'B', 9],
['2023-Q3', 'A', 1]
], columns=['Date', 'Type', 'Count'])
# We explicitly order the data
# Note that 2022-Q1 now sits between 2021-Q4 and 2022-Q2
# (key receives the whole Series, so the .str accessor is needed to strip the 'Q')
df = df.sort_values('Date', key=lambda s: s.str.replace('Q', ''))
# Yet the x-axis breaks the chronology: 2022-Q1 is drawn at the end
fg = px.bar(df, x="Date", y="Count", color="Type", barmode="group")
fg.show()
My desired behavior is that the x-axis keeps the same order as the df has after applying sort_values. Instead, dates with missing data are placed at the end, out of chronological order. How can I override this behavior?
The layout can be updated to specify the category order:
This outputs:
Please refer to the Plotly documentation on categorical axes for more information on how to deal with sorted or ordered categories.