I have a pandas DataFrame, typically in this format:
1 2 3 4
0.1100 0.0000E+00 1.0000E+00 5.0000E+00
0.1323 7.7444E-05 8.7935E-01 1.0452E+00
0.1545 4.3548E-04 7.7209E-01 4.5432E-01
0.1768 1.2130E-03 6.7193E-01 2.6896E-01
0.1990 2.5349E-03 5.7904E-01 1.8439E-01
0.2213 4.5260E-03 4.9407E-01 1.3771E-01
What I would like to do is re-sample the column 1 (index) values from a list, for example represented by:
indexList = numpy.linspace(0.11, 0.25, 8)
Then I need the values for columns 2, 3 and 4 to be linearly interpolated from the input DataFrame (it is always only column 1 that I re-sample/reindex), and extrapolated if necessary, since the min/max values of my list are not necessarily within my existing column 1 (index). However, the key point is the interpolation part. I am quite new to Python, but I was thinking of using an approach like this:
- output_df = DataFrame.reindex(index=indexList) - this will give me mostly NaNs for columns 2-4.
- for index, row in output_df.iterrows():
  "function that calculates interpolated/extrapolated values from the DataFrame and inserts them at the correct row/column"
Somehow it feels like I should be able to use the .interpolate functionality, but I cannot figure out how. I cannot use it directly after re-indexing: it would be too inaccurate, since most of my entries in columns 2-4 will then be NaNs. The interpolation should be done between the two closest values of my initial DataFrame. Any good tips, anyone? (And if my format/intention is unclear, please let me know.)
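Assuming column 1 is the index, you can reindex your DataFrame on the union of the original index values and the list you created, then use interpolate to fill in the NaNs. Pass method='index' (or 'values') so that the interpolation uses the actual index values instead of treating the rows as equally spaced. Afterwards, select only the rows for the new index values.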
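A minimal sketch of this approach, using the sample data from the question. The names df, index_list, combined and output_df are just illustrative. Note that, as far as I know, interpolate does not extrapolate linearly: with default settings, points beyond the last original index value are filled with the last valid value, and points before the first one would stay NaN.

```python
import numpy as np
import pandas as pd

# Sample data from the question; column 1 is used as the index.
df = pd.DataFrame(
    {
        2: [0.0, 7.7444e-05, 4.3548e-04, 1.2130e-03, 2.5349e-03, 4.5260e-03],
        3: [1.0, 8.7935e-01, 7.7209e-01, 6.7193e-01, 5.7904e-01, 4.9407e-01],
        4: [5.0, 1.0452e+00, 4.5432e-01, 2.6896e-01, 1.8439e-01, 1.3771e-01],
    },
    index=[0.1100, 0.1323, 0.1545, 0.1768, 0.1990, 0.2213],
)

index_list = np.linspace(0.11, 0.25, 8)

# Sorted union of old and new index values, so every new point
# sits between its two nearest original neighbours.
combined = df.index.union(index_list)

# method="index" interpolates linearly against the index values
# rather than against row positions.
interpolated = df.reindex(combined).interpolate(method="index")

# Keep only the requested index values.
output_df = interpolated.reindex(index_list)
print(output_df)
```

If true linear extrapolation outside the original range is needed, one option (an assumption on my part, not part of the approach above) is to interpolate each column separately with scipy.interpolate.interp1d and fill_value="extrapolate".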