R: Add columns to a data frame on the fly


I'm new to R and to programming in general. I have several binary matrices of presence/absence data, with species as columns and plots as rows. I'm trying to use them in several dissimilarity indices, which require that they all have the same dimensions. Although there are always 10 plots, the number of columns varies depending on which species were observed at that particular time. My attempt to add the 'missing' columns to each matrix so I can perform the analyses went as follows:

df1 <- read.csv('file1.csv', header=TRUE)
df2 <- read.csv('file2.csv', header=TRUE)

newCol <- unique(append(colnames(df1),colnames(df2)))
diff1 <- setdiff(newCol,colnames(df1))
diff2 <- setdiff(newCol,colnames(df2))

for (i in 1:length(diff1)) {
  df1[paste(diff1[i])]
}
for (i in 1:length(diff2)) {
  df2[paste(diff2[i])]
}

No errors are thrown, but df1 and df2 both remain unchanged. I suspect my issue is with my use of paste, but I couldn't find any other way to add columns to a data frame on the fly like that. When added, the new columns should have 0s in the matrix as well, but I think that's the default, so I didn't add anything to specify it.

Thanks all.

Answer by akrun (best answer)

Using your code, you can generate the columns without the for loop by:

df1[, diff1] <- 0 #I guess you want `0` to fill those columns
df2[, diff2] <- 0

identical(sort(colnames(df1)), sort(colnames(df2)))
#[1] TRUE
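An expression such as df1[name] inside a loop only indexes the data frame; nothing is assigned, so it cannot add a column. The sketch below shows the assignment form end to end on two toy tables. The names a, b, all_species and the species labels are made up for illustration. It also reorders the columns, on the assumption that the dissimilarity functions compare the tables position by position rather than by column name.

```r
# Toy presence/absence tables (hypothetical species names, 3 plots each)
a <- data.frame(Species1 = c(1, 0, 1), Species2 = c(0, 0, 1))
b <- data.frame(Species2 = c(1, 1, 0), Species3 = c(0, 1, 0))

# Every species seen in either table
all_species <- union(colnames(a), colnames(b))

# Add the missing species as all-zero columns; the length check simply
# skips the assignment when nothing is missing
miss_a <- setdiff(all_species, colnames(a))
miss_b <- setdiff(all_species, colnames(b))
if (length(miss_a) > 0) a[miss_a] <- 0
if (length(miss_b) > 0) b[miss_b] <- 0

# Put the columns in the same order so the tables line up cell by cell
a <- a[all_species]
b <- b[all_species]
```

After this, a and b have the same dimensions and the same column order, so they can be passed to functions that expect matching matrices (via as.matrix if a matrix is required).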

Or, if you want to combine the datasets into one, you could use rbindlist from data.table with fill=TRUE:

library(data.table)
rbindlist(list(df1, df2), fill=TRUE)
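Note that fill=TRUE pads the columns a table lacks with NA, not 0, so for presence/absence data the padding probably needs recoding. A minimal sketch on toy data follows. The names a, b, combined and "survey" are hypothetical, and it assumes a data.table version that provides setnafill (1.12.4 or later).

```r
library(data.table)

# Toy presence/absence tables (hypothetical species names, 2 plots each)
a <- data.frame(Species1 = c(1, 0), Species2 = c(0, 1))
b <- data.frame(Species2 = c(1, 1), Species3 = c(0, 1))

# idcol adds a column recording which input table each row came from,
# taken from the names of the list
combined <- rbindlist(list(first = a, second = b), fill = TRUE, idcol = "survey")

# fill = TRUE pads absent species with NA; recode those as 0 (absence).
# setnafill works by reference on numeric columns, so skip the id column.
setnafill(combined, fill = 0, cols = setdiff(names(combined), "survey"))
```

The survey column makes it possible to split the combined table back into per-survey tables later, for example with split(combined, by = "survey").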

data

set.seed(22)
df1 <- as.data.frame(matrix(sample(0:1, 10*6, replace=TRUE), ncol=6,
  dimnames=list(NULL, sample(paste0("Species", 1:10), 6, replace=FALSE))))

set.seed(35)
df2 <- as.data.frame(matrix(sample(0:1, 10*8, replace=TRUE), ncol=8,
  dimnames=list(NULL, sample(paste0("Species", 1:10), 8, replace=FALSE))))