My data consists of instrument reads and instrument baselines. The baseline data is recorded only at discrete points and typically does not extend to the "ends" of the dataset (i.e. the first and last rows). Therefore I want to write a function that looks at the baseline column and copies the values of the earliest and latest baseline points to the very first and last rows of the dataset, so that I can interpolate between them with approx().
So far I have done this manually, as shown below, but I need to repeat this task over and over, so I'd like to turn it into a function. I checked other threads around here, and from what I read, I think the problem must have to do with the different ways of addressing columns and cells, especially when using self-made functions on data.frames.
Here is an example:
#Make Two data frames: one holds instrument data, and one holds some
#baseline calibration we need to extend to the ends of the dataset
time<-seq(1,100,1)
data1<-rnorm(n = 100,mean = 7.5, sd = 1.1)
table1<-data.frame(cbind(time, data1))
time<-data.frame("time"=seq(2,96,4))
data2<-(0.32*rnorm(n = 24, mean = 1, sd = 1))
table2<-cbind(time,data2)
rm(time)
#now merge the two tables
newtable<-merge(table1, table2, by="time", all=T)
#remove junk
rm(data1, data2,table1,table2)
#copy 3rd column for later testing
newtable$data3<-newtable$data2
#the old manual way to fill the first row
newtable$data2[1]<-newtable$data2[min(which(!is.na(newtable$data2)))]
#the old manual way to fill the last row
newtable$data2[nrow(newtable)]<-newtable$data2[max(which(!is.na(newtable$data2)))]
#Now I try with a function
endfill<-function(df, col){
#fill the first row
df[1,col] <- df[min(which(!is.na(df[[col]]))), col] # using = instead of <- has no effect
df[nrow(df),col]<-df[max(which(!is.na(df[[col]]))),col]
#
}
#I want to try my function on column 4:
endfill(df= newtable,col = 4)
#Does not work...
Another try:
endfill<-function(df, col){
#fill the first row
df$col[1] <- df[[col]] [min(which(!is.na(df[[col]])))] # using $names
#df[nrow(df),col]<-df[max(which(!is.na(df[[col]]))),col]
#
}
endfill(df= newtable,col = 4)
# :-(
In the function I have tried different ways of addressing cells, first with df$col[1], then with df[[col]][1], and mixed versions, but I seem to be missing something. When I run the function piece by piece, e.g. only the parts before and after the "<-", each piece makes sense, i.e. it returns NA for empty cells or the target value. But it seems impossible to do actual assignments?!
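A likely reason why the assignments seem to have no effect: R functions work on a copy of their arguments, so changes to df inside endfill() never reach newtable unless the function returns the modified data frame and the caller assigns it back. As written, each attempt only returns the value of its last assignment, invisibly. In the second attempt, df$col also refers to a column literally named "col", not to the column stored in the argument col. Below is a minimal sketch of a corrected version; the demo data frame and its values are made up for illustration.

```r
# Sketch of a corrected endfill(): modify the local copy, then return it.
endfill <- function(df, col) {
  filled <- which(!is.na(df[[col]]))   # rows that hold a baseline value
  if (length(filled) == 0) return(df)  # nothing to copy, return df unchanged
  df[1, col]        <- df[min(filled), col]  # earliest value -> first row
  df[nrow(df), col] <- df[max(filled), col]  # latest value -> last row
  df                                   # return the modified data frame
}

# Made-up example data (hypothetical values)
demo <- data.frame(time = 1:6, base = c(NA, 0.2, NA, NA, 0.5, NA))
demo <- endfill(demo, "base")          # assign the result back
demo$base
```

With the data from the question, the call would be newtable <- endfill(df = newtable, col = 4). Because the column is addressed with [[col]] and [ , col], col can be either a column number or a column name.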
Here is a solution with the function na.locf from the package zoo.
Created on 2024-02-26 with reprex v2.0.2
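The code of this answer is not shown above, so the following is only one possible sketch of how na.locf could be used for this task, not necessarily what the answer contained. It assumes the zoo package is installed; the vector x and its values are made up. na.locf() carries the last observation forward (or backward with fromLast = TRUE), and here only the two end values are taken from its result. For comparison, base approx() with rule = 2 extends the outermost values to the ends by itself.

```r
library(zoo)  # assumption: zoo is installed

x0 <- c(NA, NA, 0.2, NA, 0.5, NA)   # made-up baseline column
x  <- x0
n  <- length(x)

# Fill only the two ends: first value carried backward, last value forward
x[1] <- na.locf(x, fromLast = TRUE, na.rm = FALSE)[1]
x[n] <- na.locf(x, na.rm = FALSE)[n]

# Interpolate between the known points (approx ignores the remaining NAs)
y <- approx(seq_along(x), x, xout = seq_along(x))$y

# Alternative without filling the ends first: rule = 2 holds the
# nearest known value constant outside the range of the data
y2 <- approx(seq_along(x0), x0, xout = seq_along(x0), rule = 2)$y
```

Both y and y2 should contain the same interpolated baseline, so the end-filling step may not be needed at all if rule = 2 fits the use case.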