Suppose I have data with count and percent columns produced by some prior operations, where the column names are data dependent. My question is: how can I select columns containing a common string when I don't know the string in advance because it's data dependent? Here is some toy data where we do know the column names (but let's pretend we don't!):
library(tidyverse)
mtcars <- mtcars %>%
  mutate(cnt_something = sample(0:100, nrow(mtcars)),
         cnt_otherthing = sample(0:100, nrow(mtcars)),
         pct_something = paste0(cnt_something, "%"),
         pct_otherthing = paste0(cnt_otherthing, "%"))
So in real data the strings something and otherthing result from previous data-dependent steps, and there could actually be many columns, but I know there will always be a pair of columns of the form cnt_ and pct_. My question is therefore: how can I select the cnt_***** and pct_***** columns with matching *****'s for the next manipulation (e.g. paste0()) without knowing the different possible ***** strings?
The desired output is something like this:
                  result_something result_otherthing
Mazda RX4                 49 (49%)          82 (82%)
Mazda RX4 Wag             20 (20%)          72 (72%)
Datsun 710                37 (37%)          75 (75%)
Hornet 4 Drive            22 (22%)          85 (85%)
Hornet Sportabout         53 (53%)        100 (100%)
You can do a series of pivots to transform this data:
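A minimal sketch of one such pivot chain follows. It assumes every relevant column starts with `cnt_` or `pct_` and that each suffix appears exactly once per prefix. The names `df`, `car`, and `suffix`, and the fixed seed, are illustrative choices rather than anything from the question, and the row names are moved into a column before `mutate()` so they are not lost along the way.

```r
library(tidyverse)

set.seed(1)  # assumption: fixed seed only so the toy data is reproducible

# Rebuild the toy data, keeping the row names in a hypothetical `car` column
df <- mtcars %>%
  rownames_to_column("car") %>%
  mutate(cnt_something  = sample(0:100, n()),
         cnt_otherthing = sample(0:100, n()),
         pct_something  = paste0(cnt_something, "%"),
         pct_otherthing = paste0(cnt_otherthing, "%"))

result <- df %>%
  # keep only the id column and the cnt_/pct_ pairs, whatever their suffixes
  select(car, starts_with("cnt_"), starts_with("pct_")) %>%
  # ".value" splits each name into a prefix (which becomes the columns
  # `cnt` and `pct`) and a suffix (which becomes one row per pair)
  pivot_longer(-car,
               names_to = c(".value", "suffix"),
               names_pattern = "^(cnt|pct)_(.*)$") %>%
  # combine each matched pair, e.g. "49 (49%)"
  mutate(result = paste0(cnt, " (", pct, ")")) %>%
  select(car, suffix, result) %>%
  # spread the suffixes back out into result_* columns
  pivot_wider(names_from = suffix,
              values_from = result,
              names_prefix = "result_") %>%
  column_to_rownames("car")

head(result, 5)
```

Because the suffix is captured by the regular expression rather than written out, the same chain should work unchanged for any number of `cnt_`/`pct_` pairs.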
Note: in the tidyverse row names are discouraged, but since your example had them I used rownames_to_column and column_to_rownames from the tibble package to preserve them. This might not be necessary in your actual data.