This question is a follow-up to "How to read a .txt file to graph a plot".
I have a file with time series data in the following format:
00:01:28,102,103,103 20-03-2024
00:02:16,111,110,110
00:02:33,108,109,109
00:02:49,107,108,108
...24 hours read... # not in the measurement file
23:58:54,111,112,112
23:59:11,109,110,110
23:59:47,115,116,117
00:00:04,115,116,116 21-03-2024
00:00:20,121,122,120
00:00:36,124,125,125
...24 hours read...
23:59:02,115,115,116
23:59:19,114,114,114
23:59:51,113,114,115
00:00:07,113,114,115 22-03-2024
00:00:24,116,117,115
00:00:45,115,115,116
...24 hours read...
23:59:08,101,101,100
23:59:32,103,103,102
23:59:48,102,102,102
...Next day...
Each line includes a timestamp, three numerical readings, and occasionally a date indicating the start of a new day. I am trying to plot this data with pandas and matplotlib but encounter two main issues: the x-axis labels (hours) overlap and the plot loads slowly.
Here's my current approach to plotting:
import matplotlib.dates as mdates
import matplotlib.pyplot as plt

# C0Temp..C5Temp and LineColorTemp1Text are defined earlier in my script (not shown)
plt.figure(figsize=(15, 9))
plt.xlabel('Day')
plt.ylabel('Voltage')
# Plot three series from the data
plt.plot(C0Temp, C1Temp, label="Voltage", color=LineColorTemp1Text)
plt.plot(C2Temp, C3Temp, label="Max", color='r')
plt.plot(C4Temp, C5Temp, label="Min", color='g')
plt.legend()
# Attempt to format x-axis to handle daily data
locator = mdates.AutoDateLocator(minticks=12, maxticks=24)
plt.gca().xaxis.set_major_locator(locator)
plt.xticks(rotation=45)
I'm looking for guidance on how to effectively plot this data day by day or even across months, ensuring the x-axis labels are readable and the plot loads efficiently.
Because the text file does not have a uniform format, it needs to be parsed line by line. Doing so handles the variations in the data: some lines carry a date and others do not, and some lines are not data at all (e.g., "24 hours read..." and "Next day"). While reading each line, the script separates data entries from comments, and the most recent date seen is carried forward to the undated lines that follow it, so every reading ends up with a complete timestamp. The result is a structured dataset ready for analysis and visualization, despite the irregularities in the file.
My recommendation is to standardize the measurement output format.
Parse File
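A minimal sketch of the parsing step, assuming the date is written as day-month-year (as in 20-03-2024) and applies to every following line until the next dated line. The names `LINE_RE` and `parse_lines` are illustrative, and the sample lines are copied from the question:

```python
import re
from datetime import datetime

# time, three integer readings, and an optional trailing date (assumed dd-mm-yyyy)
LINE_RE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}),(\d+),(\d+),(\d+)(?:\s+(\d{2}-\d{2}-\d{4}))?\s*$"
)


def parse_lines(lines):
    """Return a list of (timestamp, v1, v2, v3) tuples from raw text lines."""
    records = []
    current_date = None  # last date seen; carried forward to undated lines
    for raw in lines:
        match = LINE_RE.match(raw.strip())
        if match is None:
            continue  # skip non-data lines such as "...24 hours read..."
        time_str, v1, v2, v3, date_str = match.groups()
        if date_str is not None:
            current_date = date_str
        if current_date is None:
            continue  # no date seen yet, so a full timestamp cannot be built
        stamp = datetime.strptime(f"{current_date} {time_str}", "%d-%m-%Y %H:%M:%S")
        records.append((stamp, int(v1), int(v2), int(v3)))
    return records


sample = """\
00:01:28,102,103,103 20-03-2024
00:02:16,111,110,110
...24 hours read...
23:59:47,115,116,117
00:00:04,115,116,116 21-03-2024
00:00:20,121,122,120
""".splitlines()

records = parse_lines(sample)
```

For a real file, the same function can be fed an open file handle, e.g. `with open(path) as fh: records = parse_lines(fh)`, where `path` is whatever your measurement file is called.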
Create DataFrame
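A short sketch of building the DataFrame, assuming the parsed records are tuples of a full timestamp followed by the three readings. The column names `voltage`, `max` and `min` are my assumption, borrowed from the legend labels in the question; the order of the three readings in the file is not stated, so adjust as needed:

```python
from datetime import datetime

import pandas as pd

# Records in the shape a line-by-line parser would produce
# (values taken from the sample lines in the question).
records = [
    (datetime(2024, 3, 20, 0, 1, 28), 102, 103, 103),
    (datetime(2024, 3, 20, 0, 2, 16), 111, 110, 110),
    (datetime(2024, 3, 20, 23, 59, 47), 115, 116, 117),
    (datetime(2024, 3, 21, 0, 0, 4), 115, 116, 116),
]

# Column names are assumptions; the file itself does not name the readings.
df = pd.DataFrame(records, columns=["time", "voltage", "max", "min"])
df = df.set_index("time").sort_index()
```

Using the timestamp as a `DatetimeIndex` lets matplotlib treat the x-axis as real dates, which is what makes date locators and formatters work later on.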
Plot
The plot draws each column of the DataFrame with a small marker at every data point, on a figure of fixed size with labeled axes. Major ticks on the x-axis show dates in '%Y-%m-%d' format, and minor ticks mark the time every 4 hours. Major tick labels are rotated 90 degrees and centered, while minor tick labels remain horizontal and centered. Grid lines are drawn for both major and minor ticks, styled differently to distinguish days from times. Finally, the layout is adjusted so the rotated labels are not cut off.
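One way to sketch a plot along these lines is shown here. The data is a synthetic placeholder (not real measurements), the column names are assumptions, and I call `ax.plot` directly rather than `DataFrame.plot` so that the matplotlib date locators apply without interference from pandas' own date handling:

```python
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Synthetic placeholder data: two days of readings at 30-minute steps.
index = pd.date_range("2024-03-20", periods=96, freq="30min")
df = pd.DataFrame({"voltage": [100 + i % 10 for i in range(96)]}, index=index)
df["max"] = df["voltage"] + 1
df["min"] = df["voltage"] - 1

fig, ax = plt.subplots(figsize=(15, 9))
for column in df.columns:
    ax.plot(df.index, df[column], marker=".", markersize=3, label=column)
ax.set_xlabel("Day")
ax.set_ylabel("Voltage")
ax.legend()

# Major ticks: one per day, labelled with the date.
ax.xaxis.set_major_locator(mdates.DayLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m-%d"))
# Minor ticks: every 4 hours, skipping midnight so they do not sit on the date label.
ax.xaxis.set_minor_locator(mdates.HourLocator(byhour=range(4, 24, 4)))
ax.xaxis.set_minor_formatter(mdates.DateFormatter("%H:%M"))

# Rotate the date labels; leave the time labels horizontal.
ax.tick_params(axis="x", which="major", labelrotation=90)
ax.tick_params(axis="x", which="minor", labelrotation=0)

# Different grid styles distinguish day boundaries from 4-hour marks.
ax.grid(which="major", linestyle="-", linewidth=0.8)
ax.grid(which="minor", linestyle=":", linewidth=0.5)

fig.tight_layout()
```

Call `plt.show()` or `fig.savefig(...)` afterwards to view the result. For plots spanning months, the 4-hour minor ticks would likely need a coarser interval to stay readable.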
Numerous questions already address plotting with pandas DataFrames and formatting the datetime x-axis of a pandas DataFrame. I encourage you to explore these resources and adjust the plot according to your requirements. For further plotting inquiries or specific adjustments, please consider posting a new question with a reference to the existing discussions.