I am trying to run online change point detection on the trend component of a time series, so that seasonality does not cause false positives. seasonal_decompose from statsmodels returns NaN values for the trend component at the beginning and end of the series, because it uses a centered moving average (CMA) under the hood. Is there a way to get the trend without losing any data? I want to check for changes at a daily level, so I need trend values for the most recent observations.
I have also tried STL from statsmodels and ran into the same issue.
Additional details:
I have a df with dates and corresponding values for over 4 years (sample below):
| date | value |
|---|---|
| 01.01.2019 | 50 |
| 02.01.2019 | 51 |
| 03.01.2019 | 52 |
| 04.01.2019 | 53 |
I am then decomposing the signal using seasonal_decompose from statsmodels and then applying change point detection from ruptures:
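import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import ruptures as rpt
from statsmodels.tsa.seasonal import seasonal_decompose

# Parse the dates (the sample is day-first, DD.MM.YYYY) and index by date
df['date'] = pd.to_datetime(df['date'], dayfirst=True)
df.set_index('date', inplace=True)
ts = df['value'].values

# Additive decomposition of the daily series with a yearly period
result = seasonal_decompose(ts, model='additive', period=365)
trend_component = result.trend

# Drop the NaN edges of the trend and reshape to (n_samples, 1) for ruptures
valid_indices = ~np.isnan(trend_component)
trend_component_np = trend_component[valid_indices].reshape(-1, 1)

# PELT change point detection; indices are relative to the NaN-stripped array
detector = rpt.Pelt(model="l2").fit(trend_component_np)
change_points = detector.predict(pen=150)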
When I plot the trend and residual components, both contain NaN values for the first and last 6 months, while the seasonal component is fully populated. I have more than 2 full cycles of data, so why is this happening?
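To illustrate where the NaN edges seem to come from, here is a minimal NumPy sketch of a centered moving average with an odd window of 365 points. It is not the statsmodels implementation: `centered_moving_average` is a hypothetical helper and the data is synthetic. Each trend value needs 182 points on either side, so the first and last 182 days cannot be computed regardless of how many cycles the series covers.

```python
import numpy as np

def centered_moving_average(x, period):
    """Centered moving average for an odd period; edges are left as NaN."""
    x = np.asarray(x, dtype=float)
    half = (period - 1) // 2  # points needed on each side of position t
    out = np.full(x.shape, np.nan)
    # mode="valid" keeps only positions where the full window fits in the data
    out[half:len(x) - half] = np.convolve(x, np.ones(period) / period, mode="valid")
    return out

# Synthetic stand-in for about 4 years of daily values (not the real data)
n = 4 * 365
t = np.arange(n)
values = 50 + 0.01 * t + 5 * np.sin(2 * np.pi * t / 365)

trend = centered_moving_average(values, period=365)
nan_head = int(np.argmax(~np.isnan(trend)))        # number of leading NaNs
nan_tail = int(np.argmax(~np.isnan(trend[::-1])))  # number of trailing NaNs
print(nan_head, nan_tail)  # prints: 182 182
```

With a period of 365 the gap is about half a year at each end, which would match what the plot shows.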
I am also open to alternative methods.
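One candidate that might suit online detection is a trailing (one-sided) moving average, which only uses past values and therefore stays defined up to the latest day. The sketch below is only an illustration under assumptions: the data is synthetic and `trailing_moving_average` is a hypothetical helper, not a library function. seasonal_decompose also accepts `two_sided=False` and `extrapolate_trend` arguments, which may be worth trying for the same purpose.

```python
import numpy as np

def trailing_moving_average(x, period):
    """Mean of the latest `period` points; only the first period - 1 entries are NaN."""
    x = np.asarray(x, dtype=float)
    out = np.full(x.shape, np.nan)
    # Position t averages x[t - period + 1 : t + 1], so no future data is needed
    out[period - 1:] = np.convolve(x, np.ones(period) / period, mode="valid")
    return out

# Synthetic stand-in for about 4 years of daily values (not the real data)
n = 4 * 365
t = np.arange(n)
values = 50 + 0.01 * t + 5 * np.sin(2 * np.pi * t / 365)

trend = trailing_moving_average(values, period=365)
# NaNs only at the start; the most recent days all have a trend value
print(np.isnan(trend[:364]).all(), np.isnan(trend[364:]).any())  # prints: True False
```

The trade-off is lag: a trailing window of 365 days estimates the trend as it was roughly 182 days earlier, so a change point would likely show up smoothed and with a delay.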