Facebook Prophet Future Dataframe


I have monthly data for the last 5 years, and I am using it to build a forecasting model with fbprophet. The last 5 months of my data are as follows:

data1['ds'].tail()

Out[86]:
55   2019-01-08
56   2019-01-09
57   2019-01-10
58   2019-01-11
59   2019-01-12

I fitted the model on this data and created a future prediction dataframe.

model = Prophet(
    interval_width=0.80,
    growth='linear',
    daily_seasonality=False,
    weekly_seasonality=False,
    yearly_seasonality=True,
    seasonality_mode='additive'
)

# fit the model to data
model.fit(data1)

future_data = model.make_future_dataframe(periods=4, freq='m', include_history=True)

After December 2019, I need the first four months of the next year. Instead, it adds four months in the same year, 2019.

future_data.tail()

    ds
59  2019-01-12
60  2019-01-31
61  2019-02-28
62  2019-03-31
63  2019-04-30

How can I get the first 4 months of the next year in the future dataframe? Is there a specific parameter to adjust the year?

There are 2 answers

Alex Punnen On

I stumbled on this question while searching for the appropriate frequency string for minutes.

As per the docs, the datetime needs to be in YYYY-MM-DD format:

The input to Prophet is always a dataframe with two columns: ds and y. The ds (datestamp) column should be of a format expected by Pandas, ideally YYYY-MM-DD for a date or YYYY-MM-DD HH:MM:SS for a timestamp. The y column must be numeric, and represents the measurement we wish to forecast.

Your 2019-01-12 (December 2019) written as YYYY-MM-DD is 2019-12-01. Using that as the start date:

>>> dates = pd.date_range(start='2019-12-01',periods=4 + 1,freq='M')
>>> dates
DatetimeIndex(['2019-12-31', '2020-01-31', '2020-02-29', '2020-03-31',
               '2020-04-30'],
              dtype='datetime64[ns]', freq='M')

Other frequency strings are listed in the pandas docs below; the Prophet docs do not spell them out for Python.

https://pandas.pydata.org/docs/reference/api/pandas.tseries.frequencies.to_offset.html

>>> dates = pd.date_range(start='2022-03-17 11:40:00',periods=10 + 1,freq='min')
>>> dates
DatetimeIndex(['2022-03-17 11:40:00', '2022-03-17 11:41:00',
               '2022-03-17 11:42:00', '2022-03-17 11:43:00',
              ..],
              dtype='datetime64[ns]', freq='T')
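
To check which offset a frequency string maps to before handing it to Prophet, `to_offset` from the page linked above can be called directly. This is only a small sketch; the list of strings is an arbitrary selection for illustration.

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset

# Resolve a few frequency strings to their pandas offset objects
aliases = {name: to_offset(name) for name in ["min", "D", "MS"]}
for name, offset in aliases.items():
    print(name, "->", repr(offset))

# The resolved offset can be passed as freq in place of the string
dates = pd.date_range(start="2022-03-17 11:40:00", periods=3, freq=aliases["min"])
print(dates)
```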
Akash sharma On

The issue is the date format: 2019-01-12 (December 2019, per your question) is written as "%Y-%d-%m", but it is read as 12 January 2019. Prophet therefore generates month-end dates (the 'm' frequency) for the next 4 periods starting from January 2019.

Just for reference this is how the future dataframe is created by Prophet:

    dates = pd.date_range(
        start=last_date,
        periods=periods + 1,  # An extra in case we include start
        freq=freq)
    dates = dates[dates > last_date]  # Drop start if equals last_date
    dates = dates[:periods]  # Return correct number of periods

So the last date is whatever the parsed ds column says it is, and the future dataframe is extrapolated from that date.
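
The effect can be reproduced with plain pandas. The following is a minimal sketch that replays the snippet above outside Prophet; `future_dates` is a hypothetical helper name, and `pd.offsets.MonthEnd()` stands in for the 'm' frequency string from the question.

```python
import pandas as pd

def future_dates(last_date, periods, freq):
    """Replay of the date logic quoted above (hypothetical helper, not Prophet's API)."""
    dates = pd.date_range(start=last_date, periods=periods + 1, freq=freq)
    dates = dates[dates > last_date]  # drop start if it equals last_date
    return dates[:periods]            # keep the requested number of periods

month_end = pd.offsets.MonthEnd()

# Last training date as it was misread: 12 January 2019
misread = future_dates(pd.Timestamp("2019-01-12"), 4, month_end)

# Last training date as intended: 1 December 2019
intended = future_dates(pd.Timestamp("2019-12-01"), 4, month_end)

print(list(misread.strftime("%Y-%m-%d")))
print(list(intended.strftime("%Y-%m-%d")))
```

The first call gives the 2019 dates shown in the question. With the intended last date, a month-end frequency starts at 2019-12-31, because that date is still later than 2019-12-01.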

Solution: Change the date format in the training data to "%Y-%m-%d", so that December 2019 is stored as 2019-12-01.
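
A minimal sketch of that fix, assuming the ds column holds strings in year-day-month order as in the question. The small frame below is made up for illustration. Passing an explicit `format` to `pd.to_datetime` avoids the misreading. The month-start frequency 'MS' is a suggestion beyond the original answer, chosen because the data is dated on the first of each month.

```python
import pandas as pd

# Illustrative stand-in for the training data (values are made up)
data1 = pd.DataFrame({
    "ds": ["2019-01-10", "2019-01-11", "2019-01-12"],  # year-day-month strings
    "y": [1.0, 2.0, 3.0],
})

# Parse explicitly as year-day-month so 2019-01-12 becomes 1 December 2019
data1["ds"] = pd.to_datetime(data1["ds"], format="%Y-%d-%m")

# Month-start dates following the last training date
last_date = data1["ds"].max()
future = pd.date_range(start=last_date, periods=4 + 1, freq="MS")
future = future[future > last_date][:4]
print(future)
```

Once the column is parsed this way, calling `model.make_future_dataframe(periods=4, freq='MS')` on a model fitted to the corrected data should extend into January through April 2020.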