I'm looking to separate out a column containing multiple comma-delimited responses into multiple columns. I'm using the cSplit_e function in the splitstackshape package. Unfortunately, some responses in the column contain commas within a single item, so I am trying to indicate that it should split only at commas that are not followed by a space.
This is the syntax that I've got right now:
cSplit_e(data=df,split.col="question",sep=",",type="character")
Which takes this:
Behavior; green, pink, blue,Sleep; indigo, violet, puce
And creates separate columns for:
question_Behavior; green
question_pink
question_blue
question_Sleep; indigo
question_violet
question_puce
But I want it to split into this:
question_Behavior; green, pink, blue
question_Sleep; indigo, violet, puce
I'm not sure how to indicate within the syntax of cSplit_e that I only want it to split at the commas that are immediately followed by not-whitespace, and would appreciate assistance!
An example dataframe:
id_num <- c("1","2","3","4","5")
question <- c("Behavior; green, pink, blue,Sleep; indigo, violet, puce","Behavior; green, pink, blue","","Sleep; indigo, violet, puce","Behavior; green, pink, blue,Sleep; indigo, violet, puce")
df <- data.frame(id_num,question)
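To make the desired split concrete, here is a minimal base R sketch (it does not use cSplit_e itself). It assumes a Perl-style lookahead, `,(?=\\S)`, is an acceptable way to express "a comma immediately followed by non-whitespace", and it only roughly mimics the 1/NA indicator columns; the `parts`, `items`, and `present` names are just for illustration.

```r
# Example data from the question (stringsAsFactors = FALSE keeps text as character)
id_num <- c("1", "2", "3", "4", "5")
question <- c(
  "Behavior; green, pink, blue,Sleep; indigo, violet, puce",
  "Behavior; green, pink, blue",
  "",
  "Sleep; indigo, violet, puce",
  "Behavior; green, pink, blue,Sleep; indigo, violet, puce"
)
df <- data.frame(id_num, question, stringsAsFactors = FALSE)

# Split only at commas that are immediately followed by a non-whitespace
# character; the lookahead (?=\\S) needs perl = TRUE
parts <- strsplit(df$question, ",(?=\\S)", perl = TRUE)

# Distinct items found across all rows
items <- sort(unique(unlist(parts)))

# One indicator column per item: 1 where the item is present, NA otherwise
for (item in items) {
  present <- vapply(parts, function(p) item %in% p, logical(1))
  df[[paste0("question_", item)]] <- ifelse(present, 1L, NA_integer_)
}
```

Commas followed by a space (as in "green, pink, blue") are left alone, so each response stays in one piece.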
If you don't mind using the tidyr package, here is a suggestion for a possible solution. Maybe it's not as elegant or simple as using the splitstackshape package, but I don't know it. I had to remove the id_num with empty values in both answers (id = 3).
My code:
Output:
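For reference, a minimal sketch of how a tidyr-based approach along these lines might look. This is an assumption, not necessarily the code this answer used: it drops the empty row (id_num 3), splits with `separate_rows()` on the lookahead pattern `,(?=\\S)`, and widens with `pivot_wider()`. The `answered`, `long`, `wide`, and `present` names are hypothetical.

```r
library(tidyr)

# Example data from the question
df <- data.frame(
  id_num = c("1", "2", "3", "4", "5"),
  question = c(
    "Behavior; green, pink, blue,Sleep; indigo, violet, puce",
    "Behavior; green, pink, blue",
    "",
    "Sleep; indigo, violet, puce",
    "Behavior; green, pink, blue,Sleep; indigo, violet, puce"
  ),
  stringsAsFactors = FALSE
)

# Drop rows with no response (id_num 3)
answered <- df[df$question != "", ]

# One row per item: split only at commas followed by a non-whitespace character
long <- separate_rows(answered, question, sep = ",(?=\\S)")

# Flag each item as present, then spread the items into their own columns
long$present <- 1L
wide <- pivot_wider(
  long,
  names_from = question,
  values_from = present,
  names_prefix = "question_"
)
```

Under these assumptions, `wide` has one row per remaining id_num and one `question_...` column per distinct item, with NA where an item was not selected.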