Dataframe - interpolate values based on inputs from another dataframe

Here is my dataframe:

import pandas as pd
dates = ('2020-09-24','2020-10-19','2020-12-17','2021-03-17','2021-06-17','2021-09-17','2022-03-17','2022-09-20','2023-09-19','2024-09-17','2025-09-17','2026-09-17','2027-09-17','2028-09-19','2029-09-18','2030-09-17','2031-09-17','2032-09-17','2035-09-18','2040-09-18','2045-09-19')
factors = ('1','0.999994','0.999875','1.000166','1.000303','1.000438','1.00056','1.000817','1.001046','1.001412','1.001525','1.001334','1.000685','0.999376','0.997456','0.994626','0.991244','0.986754','0.982072','0.962028','0.925136')
df = pd.DataFrame({'dates': pd.to_datetime(dates), 'factors': factors})
df['factors'] = df['factors'].astype(float)  # factors are given as strings; convert before interpolating
df = df.set_index('dates')
df

Here is another dataframe holding a time series at a fixed interval:

interpolated = pd.DataFrame(0, index=pd.date_range('2020-09-24', '2045-09-19', freq='3M'),columns=['result'])

The goal is to populate the second dataframe with the cubic spline interpolated values from the first table. Thanks for all the ideas

Attempt

interpolated['result'] = df['factors'].interpolate(method='cubic')

However, it gives only NaN values in the interpolated dataframe. I am not sure how to correctly refer to the first table.

There is 1 answer

ItsAnApe (best answer)

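First things first, the shapes don't match. Assigning a Series to a DataFrame column aligns on the index, and none of the dates in the index of df match the month-end dates in interpolated, so you just end up with NaN being filled in on every date. Note also that df['factors'].interpolate() only fills NaN values already present in df; it does not evaluate the curve at new dates. I think you want something more like merge or join, as described in this post: "Merging time series data by timestamp using numpy/pandas". merge and join will also be helpful.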
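
For the cubic spline itself, here is a minimal sketch of one possible approach. It is not the merge/join route suggested above: it assumes SciPy is installed and fits scipy.interpolate.CubicSpline (with its default boundary conditions) to the known points, then evaluates it on the target dates. Measuring time in days since the first date is an assumption made for illustration, as are the names origin, x_known, x_target and target_index. The target grid uses pd.offsets.MonthEnd(3), which should give the same month-end dates as freq='3M' in the question.

```python
import pandas as pd
from scipy.interpolate import CubicSpline

dates = ('2020-09-24', '2020-10-19', '2020-12-17', '2021-03-17', '2021-06-17',
         '2021-09-17', '2022-03-17', '2022-09-20', '2023-09-19', '2024-09-17',
         '2025-09-17', '2026-09-17', '2027-09-17', '2028-09-19', '2029-09-18',
         '2030-09-17', '2031-09-17', '2032-09-17', '2035-09-18', '2040-09-18',
         '2045-09-19')
factors = ('1', '0.999994', '0.999875', '1.000166', '1.000303', '1.000438',
           '1.00056', '1.000817', '1.001046', '1.001412', '1.001525',
           '1.001334', '1.000685', '0.999376', '0.997456', '0.994626',
           '0.991244', '0.986754', '0.982072', '0.962028', '0.925136')

# Source curve: index by date and convert the string factors to float.
df = pd.DataFrame({'factors': factors}, index=pd.to_datetime(dates)).astype(float)

# Target grid: every third month end between the first and last known dates.
target_index = pd.date_range('2020-09-24', '2045-09-19',
                             freq=pd.offsets.MonthEnd(3))

# CubicSpline needs numeric, strictly increasing x values,
# so express both sets of dates as days since the first known date (assumption).
origin = df.index[0]
x_known = (df.index - origin).days.to_numpy()
x_target = (target_index - origin).days.to_numpy()

# Fit the spline on the known points and evaluate it on the target dates.
spline = CubicSpline(x_known, df['factors'].to_numpy())
interpolated = pd.DataFrame({'result': spline(x_target)}, index=target_index)

print(interpolated.head())
```

Because all target dates fall between the first and last known dates, the spline only interpolates here; dates outside that range would be extrapolated and should be treated with caution.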