I am having trouble plotting a single graph that shows the clear-sky model for a specific location together with the global irradiance measured on the same day.
I need help because I've only been studying this for a relatively short time and I've been stuck on it for two weeks now. I've already consulted many sources and tried several changes to the code, but I still haven't succeeded.
I collected data provided by a measurement station and used pandas to find the day with the maximum GHI for the location. My goal is to model what a clear-sky day would look like, using the Ineichen and Solis models, and compare the result with the measured GHI data for that day.
Either I get stuck in a chain of errors, or the generated graph does not match what I expect, or the measured data is simply not plotted.
I am including my full code below, as well as the CSV file with the data. I would be very grateful if someone could help me.
**csv file:** https://drive.google.com/file/d/1Dfrx1Mj7hcb4p5eRnyZYRTLBRhmESxfn/view?usp=sharing
**my code:**
import pandas as pd
import pvlib
from pvlib import clearsky, atmosphere, solarposition
from pvlib.location import Location
from pvlib.iotools import read_tmy3
import matplotlib.pyplot as plt
location = Location(latitude=-15.60083, longitude=-47.71306, tz="America/Sao_Paulo", altitude=1023, name="Brasilia station")
data = pd.read_csv('brjan19.csv', index_col=0, sep=';', usecols = ['id', 'year', 'day', 'glo'], low_memory=False)
data = data.rename(columns={'glo': 'ghi'})
data['ghi'] = pd.to_numeric(data['ghi'], errors='coerce')
# Create the date column by specifying the year (2019), month (January) and day
data['date'] = pd.to_datetime(data['year'].astype(str) + '-01-' + data['day'].astype(str), format='%Y-%m-%d')
# Select data only for the 13th and 14th
data_subset = data[(data['date'].dt.day == 13) | (data['date'].dt.day == 14)]
# Set 'date' as the index
data_subset.set_index('date', inplace=True)
# Resample to 2 minutes and calculate the average
ghi = data_subset['ghi'].resample('2T').mean()
# Get Clear Sky values
cs = location.get_clearsky(ghi.index)
cs = cs.loc[ghi.index]
#print(data_subset)
# Plot
plt.figure(figsize=(10, 4))
cs['ghi'].plot(label='Clear Sky GHI')
ghi.plot(label='Measured GHI')
plt.title('Ineichen, Brasília station GHI (Days 13 and 14)')
plt.ylabel('Irradiance $(W/m^2)$')
plt.xlabel('Time')
plt.legend(loc='lower right')
plt.show()
When I got home and could analyze things more calmly and run some tests, I noticed another problem. The highest GHI values recorded for this day exceed 1000 W/m^2, but these values are not being plotted on the graph.
I tried removing the decimal points to see if that would solve the problem. However, after this change only the measured GHI data was plotted, and the clear-sky model came out as 0.0.
I also tried creating a separate variable in the dataframe to plot the GHI values on their own, but that was unsuccessful too. I'll leave a printout of what the expected result for the GHI would look like.
To understand what is wrong, try going step by step through the code and examining the value of each variable. By doing this, you can see that the `ghi` variable is messed up. Why is this? Well, it's because the dates you're using don't have hour/minute values, only the day. You have to read and make use of the `minuto` column in the data file too. Here is a revision of your code that I think does what you want:
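Before the full revision, it may help to see the underlying issue in isolation. The following is only a minimal sketch on made-up data, not the station file. The column names `year`, `day` and `minuto` mirror the ones discussed above, but the values are synthetic, and it assumes that `minuto` counts minutes elapsed since midnight, which you should check against the file's documentation.

```python
import pandas as pd

# Synthetic stand-in for the station file: two days, one reading every 30 minutes.
# Assumption: `minuto` is the number of minutes since midnight (0..1439).
rows = [
    {"year": 2019, "day": day, "minuto": minute, "ghi": float(minute % 720)}
    for day in (13, 14)
    for minute in range(0, 1440, 30)
]
data = pd.DataFrame(rows)

# Date-only timestamps, built the same way as in the question:
# every row of a given day ends up with the same timestamp (midnight).
date_only = pd.to_datetime(
    data["year"].astype(str) + "-01-" + data["day"].astype(str),
    format="%Y-%m-%d",
)

# Full timestamps: add the minute-of-day offset to the date.
timestamp = date_only + pd.to_timedelta(data["minuto"], unit="min")

ghi_daily = pd.Series(data["ghi"].to_numpy(), index=pd.DatetimeIndex(date_only))
ghi_full = pd.Series(data["ghi"].to_numpy(), index=pd.DatetimeIndex(timestamp))

# Count the resampled bins that actually contain data in each case.
bins_daily = int(ghi_daily.resample("2min").mean().notna().sum())
bins_full = int(ghi_full.resample("2min").mean().notna().sum())

print("distinct timestamps without minuto:", ghi_daily.index.nunique())
print("distinct timestamps with minuto:   ", ghi_full.index.nunique())
print("non-empty 2-minute bins:", bins_daily, "vs", bins_full)
```

With date-only timestamps, all readings of a day collapse onto midnight, so resampling leaves one averaged value per day and the daily profile is lost. Once the minute offset is added, each reading keeps its own time of day and can be compared with a clear-sky curve computed for the same index.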