Interpolating along a table of values


I have a dataframe with 5 columns (a, b, c, d, e). I also have specific values for columns a, b and c, and need to interpolate along the dataframe to get the corresponding values for columns d and e.

As a simplified case, I tried the following:

import random

import numpy as np
import pandas as pd

# Known table: every column is a fixed multiple of a
a = np.array(random.sample(range(1, 1001), 100))
b = 2 * a
c = 3 * a
d = 4 * a
e = 5 * a

data = {'a': a,
        'b': b,
        'c': c,
        'd': d,
        'e': e}

df = pd.DataFrame(data)

# Rows where a, b and c are known but d and e are missing
to_interp = {'a': [200.0, 525.0],
             'b': [400.0, 1050.0],
             'c': [600.0, 1575.0]}

new_df = pd.DataFrame(to_interp)

# Append the new rows (d and e become NaN) and sort by a
df = pd.concat([df, new_df], ignore_index=True)
df.sort_values('a', inplace=True)

# Fill the missing d and e values
df['d'] = df['d'].interpolate(method='linear', limit_direction='forward', axis=0)
df['e'] = df['e'].interpolate(method='linear', limit_direction='forward', axis=0)

interpolated_values = df[df['a'].isin(to_interp['a'])][['a', 'b', 'c', 'd', 'e']].copy()

print(interpolated_values)

But for this simplified case I am getting:

       a       b       c       d       e
   200.0   400.0   600.0   800.0  1000.0
   525.0  1050.0  1575.0  2106.0  2632.5

which doesn't look right for the row with a=525: since d = 4*a and e = 5*a, I would expect d = 2100 and e = 2625.

I'm not sure what I'm doing wrong so any help would be appreciated.

Thank you!

There is 1 answer

mozway On

You should fix the seed with random.seed(0) and provide the exact expected output for clarity, but I imagine that you want to interpolate relative to a. In that case, set a as the index and use method='index'.
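To illustrate the difference, here is a minimal sketch on a made-up three-point series (the values are chosen for illustration and are not from the question's random data). method='linear' ignores the index and treats the rows as equally spaced, while method='index' uses the index values as the x-coordinates:

```python
import numpy as np
import pandas as pd

# Unevenly spaced x-values as the index; y = 4 * x, with the middle value missing.
s = pd.Series([400.0, np.nan, 4000.0], index=[100.0, 525.0, 1000.0])

# 'linear' treats rows as equally spaced, so the gap becomes the midpoint of its neighbours.
linear = s.interpolate(method='linear')

# 'index' weights by the actual index values (100, 525, 1000).
by_index = s.interpolate(method='index')

print(linear.loc[525.0])    # 2200.0, the midpoint of 400 and 4000
print(by_index.loc[525.0])  # 2100.0, i.e. 4 * 525
```

This is likely why the question's output is slightly off: with method='linear', the result depends only on the neighbouring rows, not on how far a=525 is from each of them.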

Then, merge and combine_first:

tmp = (pd.concat([df, new_df], ignore_index=True)
         .sort_values('a').set_index('a')
         .interpolate(method='index', limit_direction='forward', axis=0)
         .reset_index()
      )

out = new_df.combine_first(new_df[['a']].merge(tmp, how='left'))

Output (using random.seed(0) to define the input):

       a       b       c       d       e
0  200.0   400.0   600.0   800.0  1000.0
1  525.0  1050.0  1575.0  2100.0  2625.0
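
As an alternative that is not part of the approach above, the same values could probably be obtained without concatenating the frames by calling numpy's np.interp per column. This is only a sketch under the assumption that a is the sole interpolation variable; the small fixed table below is hypothetical and stands in for the random data:

```python
import numpy as np
import pandas as pd

# Hypothetical small table in place of the random sample; same column relations as the question.
a = np.array([100.0, 300.0, 500.0, 600.0])
df = pd.DataFrame({'a': a, 'b': 2 * a, 'c': 3 * a, 'd': 4 * a, 'e': 5 * a})

new_df = pd.DataFrame({'a': [200.0, 525.0],
                       'b': [400.0, 1050.0],
                       'c': [600.0, 1575.0]})

# np.interp expects the known x-values in increasing order.
known = df.sort_values('a')

out = new_df.copy()
for col in ['d', 'e']:
    # Linearly interpolate each target column against a.
    out[col] = np.interp(out['a'].to_numpy(),
                         known['a'].to_numpy(),
                         known[col].to_numpy())

print(out)
```

Note that np.interp clamps to the end values outside the range of the known a values rather than extrapolating.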