I have two data frames of equal dimensions. One contains a particular value (e.g. 'abc') in some cells, and I need the positions of those cells. The other contains entirely different values. I need to replace the values in the second data frame at the same positions where 'abc' occurs in the first.


df1 <- data.frame('1'=c('abc','bbb','rweq','dsaf','cxc','rwer','anc','ewr','yuje','gda'),

df2 <- data.frame('1'=c('green','black','white','yelp','help','green','red','brown','green','crack'),

I can find the sequential (linear) indices of 'abc', but which() returns them as a one-dimensional vector:

which(df1 == 'abc')
#[1]  1 24 35 45 63 69 70 73 85 95

And I don't know how to replace the values using this method.

The expected output is df2 with the 'green' values replaced only at the same positions as the 'abc' values in df1.

Note, however, that 'green' also appears in df2 at positions other than those where df1 contains 'abc'.

2 Answers

Elie Ker Arno

Here is one way to do it. Learn about the *apply family in R: I think it is the most useful group of functions in this language, whatever you plan to do ;) Also note that a data.frame is of type 'list', with one element per column.

df1[] <- lapply(df1, function(frame, pattern, replace){ # for each frame = column (df1[] keeps the data.frame structure):
  frame <- as.character(frame)                          # work on strings, not on factor levels
  matches <- which(frame == pattern)                    # which indexes of the column match the pattern
  if(length(matches) > 0)                               # if at least one index matches,
    frame[matches] <- replace                           # give it the value you want
  return(frame)                                         # return the column so it is written back to df1
}, pattern="abc", replace= "<whatYouWant>")             # don't forget this part: the needed arguments!
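
To carry the replacement over to the second data frame, the same column-by-column idea can be extended with Map(), which walks through the columns of both data frames in parallel. This is only a sketch under assumptions: the data frames `a` and `b`, the helper `replace_where` and the value "REPLACED" are hypothetical names made up for the illustration, and the columns are assumed to be character vectors rather than factors.

```r
# Hypothetical 3x2 data frames, used only for illustration
a <- data.frame(x = c("abc", "bbb", "cxc"),
                y = c("rwer", "abc", "gda"),
                stringsAsFactors = FALSE)
b <- data.frame(x = c("green", "black", "green"),
                y = c("white", "green", "red"),
                stringsAsFactors = FALSE)

# Hypothetical helper: replace entries of target_col wherever
# the corresponding entry of source_col equals pattern
replace_where <- function(source_col, target_col, pattern, replacement) {
  target_col[source_col == pattern] <- replacement
  target_col
}

# Map() pairs column 1 of a with column 1 of b, column 2 with column 2, ...
# b[] <- keeps b a data.frame instead of turning it into a plain list
b[] <- Map(replace_where, a, b,
           MoreArgs = list(pattern = "abc", replacement = "REPLACED"))
```

Only the cells of `b` that sit at the positions of "abc" in `a` are changed; the "green" in the third row of `b$x` is left alone because `a` does not hold "abc" there.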

David O

I don't think your problem is appropriately approached with the data in a data.frame. That introduces several complications. First, each variable (column) in the data frame is a factor with different levels! (This is the default for character columns in R versions before 4.0.0, where stringsAsFactors = TRUE.) Second, your code compares a list (the data.frame) with an atomic vector. The help page for the == operator states "... if the other is a list R attempts to coerce it to the type of the atomic vector ...". The help page also points out that factors get special handling in comparisons, where it first assumes you are comparing factor levels, which is what your code is doing.

I think you want to convert your data frames of identical dimensions to matrices first. If you need the result as a data.frame, convert it back afterwards as I show here, but be aware that the factor levels may have changed.

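# Starting with the values assigned to df1 and df2
  m1 <- as.matrix(df1)          # character matrices of identical dimensions
  m2 <- as.matrix(df2)
  index <- which(m1 == "abc")   # linear (column-major) positions of "abc" in m1
  m2[index] <- "abc"            # overwrite the same positions in m2; use whatever replacement value you need
  df2 <- as.data.frame(m2)      # convert back if a data.frame is required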
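
As a possible shortcut, the comparison itself can serve as the index: comparing a data frame with a string yields a logical matrix of the same shape, and that matrix can be used to assign into another data frame of equal dimensions. The sketch below assumes character (non-factor) columns; `d1`, `d2` and the value "REPLACED" are hypothetical and exist only for the illustration. It also shows `which(..., arr.ind = TRUE)`, which reports row/column pairs instead of a single linear index.

```r
# Hypothetical small data frames, for illustration only
d1 <- data.frame(x = c("abc", "bbb", "cxc"),
                 y = c("rwer", "abc", "gda"),
                 stringsAsFactors = FALSE)
d2 <- data.frame(x = c("green", "black", "green"),
                 y = c("white", "green", "red"),
                 stringsAsFactors = FALSE)

# Logical matrix with the same shape as d1: TRUE where the cell is "abc"
hits <- d1 == "abc"

# Row/column positions of the matches (a two-column matrix: row, col)
pos <- which(hits, arr.ind = TRUE)

# Assign into d2 only at the positions flagged in hits
d2[hits] <- "REPLACED"
```

Here `pos` lists the cells (row 1, column 1) and (row 2, column 2), and only those two cells of `d2` are overwritten. With factor columns the assignment would need the new value to be an existing level, so converting to character first is the safer route.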