I am trying to conditionally concatenate string variables using the tidyverse.
Here is the toy data:
library(tidyverse)

df <- tibble(id = paste0("id_", 1:4),
             outcome = rep(x = c("simon", "garfunkel"), times = 2),
             worth = rep(x = c("awesome", "disposable"), times = 2))
df
# id outcome worth
# <chr> <chr> <chr>
# 1 id_1 simon awesome
# 2 id_2 garfunkel disposable
# 3 id_3 simon awesome
# 4 id_4 garfunkel disposable
I can use unite() from tidyr to combine the id column and the worth column like so:
df %>%
  unite("id", c(id, worth))
# id outcome
# <chr> <chr>
# 1 id_1_awesome simon
# 2 id_2_disposable garfunkel
# 3 id_3_awesome simon
# 4 id_4_disposable garfunkel
But there are a few problems with this, some with the output and some with the way I generated it.
First, I would like to retain the original column, whereas unite() simply concatenates the two columns, dropping the originals. I tried using unite() within mutate(), but this generated an error.
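For reference, a minimal sketch of how the original columns could be kept. It relies on the `remove` argument of unite(); the column name `id_worth` is a hypothetical name picked for illustration. unite() works on a whole data frame rather than on vectors, which is presumably why calling it inside mutate() fails.

```r
library(tidyverse)

df <- tibble(id = paste0("id_", 1:4),
             outcome = rep(c("simon", "garfunkel"), times = 2),
             worth = rep(c("awesome", "disposable"), times = 2))

# remove = FALSE keeps the input columns alongside the combined one.
# "id_worth" is a hypothetical column name chosen for this sketch.
kept <- df %>%
  unite("id_worth", c(id, worth), remove = FALSE)

kept
```

This leaves id and worth untouched and stores the combined string in a separate column, so nothing is overwritten.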
Second, and most important, rather than simply concatenating a column, I would like to make the new concatenated id column a combination of the id column and the worth column, but conditional on the outcome column. I tried to do this using case_when() within mutate(), but got confused about where to put the paste0() function and/or whether unite() could be used inside case_when().
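To make the question concrete, here is a rough sketch of where paste0() might go: on the right-hand side of a case_when() branch, inside mutate(). The condition `outcome == "simon"` is only a placeholder assumption, since the exact rule depends on the real data.

```r
library(tidyverse)

df <- tibble(id = paste0("id_", 1:4),
             outcome = rep(c("simon", "garfunkel"), times = 2),
             worth = rep(c("awesome", "disposable"), times = 2))

# Each case_when() branch is `condition ~ value`; paste0() builds the value.
# The condition below is a placeholder, not the actual rule.
conditional <- df %>%
  mutate(id = case_when(
    outcome == "simon" ~ paste0(id, "_", worth),  # append worth for these rows
    TRUE ~ id                                     # leave all other ids unchanged
  ))

conditional
```

Because mutate() works on column vectors, paste0() fits here where unite() (which expects a data frame) would not.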
Third, and related to the second point, I need to concatenate only a part of the worth column into the id column, ideally using a regex substitution, capturing only the first x letters of the worth column.
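As a sketch of the partial concatenation, two possible ways to take the first few letters are shown below: by position with str_sub(), and with a regex substitution via str_replace(). The variable `n_letters` is a hypothetical stand-in for the "x" above, set to 4 here.

```r
library(tidyverse)

df <- tibble(id = paste0("id_", 1:4),
             outcome = rep(c("simon", "garfunkel"), times = 2),
             worth = rep(c("awesome", "disposable"), times = 2))

n_letters <- 4  # hypothetical stand-in for "x"

# Option 1: take the first n_letters characters by position
by_position <- df %>%
  mutate(id = paste0(id, "_", str_sub(worth, 1, n_letters)))

# Option 2: regex substitution that keeps only the captured prefix
prefix_pattern <- paste0("^(.{", n_letters, "}).*$")
by_regex <- df %>%
  mutate(id = paste0(id, "_", str_replace(worth, prefix_pattern, "\\1")))

by_position
```

Either paste0() call could also sit on the right-hand side of a case_when() branch if the prefix should only be added for some outcomes.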
Basically I need the new dataset to look like the dataframe below, but using conditional and string-concatenation mechanics:
tibble(id = paste0("id_", 1:4,
                   rep(c("_awes", "_disp"), times = 2)),
       outcome = rep(x = c("simon", "garfunkel"), times = 2),
       worth = rep(x = c("awesome", "disposable"), times = 2))
# id outcome worth
# <chr> <chr> <chr>
# 1 id_1_awes simon awesome
# 2 id_2_disp garfunkel disposable
# 3 id_3_awes simon awesome
# 4 id_4_disp garfunkel disposable
Any help much appreciated.
(p.s. apologies if you think Garfunkel was also awesome)