I'm working on a TFT model and I'm stuck on this error. I'm trying to load a custom dataset into TimeSeriesDataSet, but the constructor fails. I've attached the head of my dataset; the actual dataset is much bigger (over 1 million rows), so there should be enough data to build series.
My code to create the dataset:
from pytorch_forecasting import TimeSeriesDataSet

time_category_columns = ['woy', 'dow', 'hod', 'moh', 'som']
not_real = ['timestamp', 'target', 'last_price', 'time_idx'] + time_category_columns
time_reals = [x for x in train.columns if x not in not_real]

training = TimeSeriesDataSet(
    train,
    time_idx="time_idx",
    target="target",
    group_ids=time_category_columns,
    max_encoder_length=50,
    max_prediction_length=1,
    static_categoricals=[],
    time_varying_known_categoricals=time_category_columns,
    time_varying_known_reals=["time_idx"],
    time_varying_unknown_categoricals=[],
    time_varying_unknown_reals=time_reals,
    add_relative_time_idx=True,
    add_target_scales=True,
    add_encoder_length=True,
    allow_missing_timesteps=True,
)
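If I understand the warning below correctly, every unique combination of the time-category columns becomes its own series when they are used as group_ids, so most groups end up shorter than the max_encoder_length + max_prediction_length = 51 rows one sample needs. A quick check I ran to confirm this (my own diagnostic, not part of the original code):

# Count rows per group as defined by my group_ids; if most groups have
# fewer than 51 rows, TimeSeriesDataSet drops them from its index,
# which would match the warning below.
group_sizes = train.groupby(time_category_columns).size()
print(group_sizes.describe())
print((group_sizes < 50 + 1).mean())  # fraction of groups too short to use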
This is the output and error message:
/usr/local/lib/python3.10/dist-packages/pytorch_forecasting/data/timeseries.py:1281: UserWarning: Min encoder length and/or min_prediction_idx and/or min prediction length and/or lags are too large for 1072897 series/groups which therefore are not present in the dataset index. This means no predictions can be made for those series. First 10 removed groups: [{'__group_id__woy': '5', '__group_id__dow': 'Friday', '__group_id__hod': '00', '__group_id__moh': '00', '__group_id__som': '03'}, {'__group_id__woy': '5', '__group_id__dow': 'Friday', '__group_id__hod': '00', '__group_id__moh': '00', '__group_id__som': '04'}, {'__group_id__woy': '5', '__group_id__dow': 'Friday', '__group_id__hod': '00', '__group_id__moh': '00', '__group_id__som': '05'}, {'__group_id__woy': '5', '__group_id__dow': 'Friday', '__group_id__hod': '00', '__group_id__moh': '00', '__group_id__som': '07'}, {'__group_id__woy': '5', '__group_id__dow': 'Friday', '__group_id__hod': '00', '__group_id__moh': '00', '__group_id__som': '08'}, {'__group_id__woy': '5', '__group_id__dow': 'Friday', '__group_id__hod': '00', '__group_id__moh': '00', '__group_id__som': '09'}, {'__group_id__woy': '5', '__group_id__dow': 'Friday', '__group_id__hod': '00', '__group_id__moh': '00', '__group_id__som': '10'}, {'__group_id__woy': '5', '__group_id__dow': 'Friday', '__group_id__hod': '00', '__group_id__moh': '00', '__group_id__som': '11'}, {'__group_id__woy': '5', '__group_id__dow': 'Friday', '__group_id__hod': '00', '__group_id__moh': '00', '__group_id__som': '13'}, {'__group_id__woy': '5', '__group_id__dow': 'Friday', '__group_id__hod': '00', '__group_id__moh': '00', '__group_id__som': '14'}]
warnings.warn(
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
<ipython-input-18-62548dcd8bea> in <cell line: 2>()
1 # Let's create a Dataset
----> 2 training = TimeSeriesDataSet(
3 train,
4 time_idx="time_idx",
5 target="target",
1 frames
/usr/local/lib/python3.10/dist-packages/pytorch_forecasting/data/timeseries.py in __init__(self, data, time_idx, target, group_ids, weight, max_encoder_length, min_encoder_length, min_prediction_idx, min_prediction_length, max_prediction_length, static_categoricals, static_reals, time_varying_known_categoricals, time_varying_known_reals, time_varying_unknown_categoricals, time_varying_unknown_reals, variable_groups, constant_fill_strategy, allow_missing_timesteps, lags, add_relative_time_idx, add_target_scales, add_encoder_length, target_normalizer, categorical_encoders, scalers, randomize_length, predict_mode)
479
480 # create index
--> 481 self.index = self._construct_index(data, predict_mode=self.predict_mode)
482
483 # convert to torch tensor for high performance data loading later
/usr/local/lib/python3.10/dist-packages/pytorch_forecasting/data/timeseries.py in _construct_index(self, data, predict_mode)
1288 )
1289 assert (
-> 1290 len(df_index) > 0
1291 ), "filters should not remove entries all entries - check encoder/decoder lengths and lags"
1292
AssertionError: filters should not remove entries all entries - check encoder/decoder lengths and lags
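If I read this right, the warning says all 1,072,897 groups were removed from the dataset index, and the assertion then fails because no samples are left at all.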
A sample from my dataset: https://drive.google.com/file/d/1lj8G3x-ubwYs2o8bMztXlm1kOuQBP9hQ/view?usp=sharing
I've tried different parameters and modifying the dataset, asked ChatGPT, and searched the web, with no luck.
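What I suspect and plan to try next: use a single constant group id so the whole dataset is treated as one long series, and keep the time categories only as known categoricals. A minimal sketch under that assumption (the "series" column is something I'd add myself; it is not in my dataset):

train = train.copy()
train["series"] = "0"  # hypothetical constant id: treat all rows as one continuous series

training = TimeSeriesDataSet(
    train,
    time_idx="time_idx",
    target="target",
    group_ids=["series"],  # one long series instead of millions of tiny groups
    max_encoder_length=50,
    max_prediction_length=1,
    time_varying_known_categoricals=time_category_columns,
    time_varying_known_reals=["time_idx"],
    time_varying_unknown_reals=time_reals,
    add_relative_time_idx=True,
    add_target_scales=True,
    add_encoder_length=True,
    allow_missing_timesteps=True,
)

Is this the right way to model it, or am I misusing group_ids in some other way?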