I am trying to interpolate missing values in every row of a data frame. I am using apply(data_final,2,na.approx). This does interpolate the values, but some of the results are out of range.

If I use na.approx(data_final[8,]) instead, I get different values for that row than I get for the same row from apply.

Also, if I do na.approx(data_final) I get the same result as apply(data_final,2,na.approx). That doesn't make sense to me, since apply is supposedly applying the na.approx function to every row in the data frame.

apply(data_final,2,na.approx)
[8,] 0.63 0.49 2.40 2.65 3.65 5.80 0.96 1.85 1.43 1.25 1.21 1.20 0.91 1.00 0.96 0.80 1.42 1.82 1.910

na.approx(data_final[8,])
 [1] 0.630 0.490 0.584 0.678 0.772 0.866 0.960 1.850 1.430 1.250 1.210 1.200 0.910 1.000 0.960 0.800 1.420 1.820 1.910 1.780 1.620
[22] 1.650 1.380 1.370

1 Answer

akrun

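Because na.approx works columnwise, not rowwise. Note also that apply(data_final, 2, na.approx) uses MARGIN = 2, which applies the function over columns (MARGIN = 1 is rows), so it does the same thing as na.approx(data_final). According to ?na.approx (from zoo), the usage is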

na.approx(object, ...)

and the parameter description is

If obj has more than one column, the above strategy is applied to each column.


Using a reproducible example

library(zoo)
df1 <- data.frame(col1 = c(2, NA, 3, 4), col2 = c(1, 3, NA, 2))
na.approx(df1)
#     col1 col2
#[1,]  2.0  1.0
#[2,]  2.5  3.0
#[3,]  3.0  2.5
#[4,]  4.0  2.0

Applying na.approx columnwise with sapply gives the same result:

sapply(df1, na.approx)
#     col1 col2
#[1,]  2.0  1.0
#[2,]  2.5  3.0
#[3,]  3.0  2.5
#[4,]  4.0  2.0
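
If the goal is to interpolate along each row instead, one option is apply with MARGIN = 1 and then transposing the result back. The following is only a minimal sketch: df2 is a hypothetical data frame made up for illustration (it is not the data_final from the question), and na.rm = FALSE is passed so that every row keeps its full length even if it starts or ends with NA.

```r
library(zoo)

# Hypothetical example data: each ROW has one gap to fill
df2 <- data.frame(a = c(1, 2), b = c(NA, 4), c = c(3, NA), d = c(4, 8))

# MARGIN = 1 passes each row to na.approx as a numeric vector.
# apply() returns one column per input row, so t() restores the
# original orientation. na.rm = FALSE keeps leading/trailing NAs
# in place instead of dropping them.
rowwise <- t(apply(df2, 1, na.approx, na.rm = FALSE))

rowwise
# row 1 (1, NA, 3, 4) becomes 1 2 3 4
# row 2 (2, 4, NA, 8) becomes 2 4 6 8
```

The result is a numeric matrix, not a data frame; wrap it in as.data.frame() if a data frame is needed.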