How to recode one column conditional on another

Question

How to recode one column conditional on another

32 views Asked by user21027866 At 29 August 2023 at 15:54

I currently have a dataframe in R with a variable for names ("Name") and one that notes whether an individual is female ("Female") (1, NA). A given individual likely has multiple rows because they responded to the survey multiple times. However, sometimes, the "Female" column is incorrect and will not list an individual who has previously responded to being Female (1), as (1) and instead will have an NA.

So for example, if my dataset looks like this:

Name <- c("Claire Smith", "John Doe", "Serena Williams", "Claire Smith", "Claire Smith", "Serena Williams")
Female <- c(1, NA, NA, NA, NA, 1)
df <- data.frame(Name, Female)

I want it to check across all rows so that it will recognize that "Claire Smith" and "Serena Williams" do have at least one "1" associated for their "Names" and propagate the rest of the "Female" columns for these names as "1". So I would want the code to do this:

Name <- c("Claire Smith", "John Doe", "Serena Williams", "Claire Smith", "Claire Smith", "Serena Williams")
Female <- c(1, NA, 1, 1, 1, 1)
df <- data.frame(Name, Female)

I'm sure this is quite an easy fix but I can't seem to figure it out. Thank you very much in advance.

Original Q&A

There are 3 answers

Onyambu On 29 August 2023 at 16:01

library(tidyverse)
group_by(df, Name)%>%
   fill(Female, .direction = 'updown')

# A tibble: 6 × 2
# Groups:   Name [3]
  Name            Female
  <chr>            <dbl>
1 Claire Smith         1
2 John Doe            NA
3 Serena Williams      1
4 Claire Smith         1
5 Claire Smith         1
6 Serena Williams      1

jpsmith On 29 August 2023 at 16:22

A base R approach would be to use tapply and match:

xx <- tapply(df$Female, df$Name, sum, na.rm = TRUE)
df$Female2 <- xx[match(df$Name, names(xx))]

Output:

#              Name Female Female2
# 1    Claire Smith      1       1
# 2        John Doe     NA       0
# 3 Serena Williams     NA       1
# 4    Claire Smith     NA       1
# 5    Claire Smith     NA       1
# 6 Serena Williams      1       1

Note I made Female2 just for demonstration but could overwrite Female if desired.

Also, to preserve the NA instead of recoding males to 0, could do:

xx <- tapply(df$Female, df$Name, function(x) +any(x == 1))
df$Female2 <- xx[match(df$Name, names(xx))]

#              Name Female Female2
# 1    Claire Smith      1       1
# 2        John Doe     NA      NA
# 3 Serena Williams     NA       1
# 4    Claire Smith     NA       1
# 5    Claire Smith     NA       1
# 6 Serena Williams      1       1

**MrFlick** · Accepted Answer · 2023-08-29T16:00:54+00:00

Using a bit of dplyr you can use

library(dplyr)

df %>% 
  mutate(Female=any(Female==1)+0, .by=Name)
#              Name Female
# 1    Claire Smith      1
# 2        John Doe     NA
# 3 Serena Williams      1
# 4    Claire Smith      1
# 5    Claire Smith      1
# 6 Serena Williams      1

You just check for any "1" value for the name and propagate that to the group. The any() function returns a TRUE/FALSE value so we use +0 to turn TRUE into 1.

TechQA.

How to recode one column conditional on another

There are 3 answers

Related Questions in R

Related Questions in RECODE

Popular Questions

Trending Questions