I have a data table with 20,000+ rows and a single column. The string in each row contains a different number of words. I want to split each string into words and put each word in its own column. I know how to do it one word at a time:
Data[, Word1 := as.character(lapply(strsplit(as.character(complaint), split = " "), "[", 1))]
(Data is my data table and complaint is the name of the column.)
Obviously, this is not efficient: I would have to repeat it for every word position, and each row has a different number of words.
Could you please tell me about a more efficient way to do this?
Check out cSplit from my "splitstackshape" package. It works on either data.frames or data.tables (but always returns a data.table).
Assuming KFB's sample data is at least slightly representative of your actual data, you can try:
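A minimal sketch of that call, using made-up sample data (the contents of `Data` below are an assumption for illustration, not KFB's actual sample). With the default wide direction, `cSplit` should replace the `complaint` column with one column per word, named `complaint_1`, `complaint_2`, and so on, padding shorter rows with `NA`:

```r
library(data.table)
library(splitstackshape)

# Hypothetical sample data: one column, a different number of words per row
Data <- data.table(
  complaint = c("late delivery", "item arrived broken today", "refund")
)

# Split on single spaces; each word goes into its own column
Wide <- cSplit(Data, "complaint", sep = " ")
Wide
```

The number of new columns is set by the row with the most words, so there is no need to know it in advance.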
Another (blazing) option is to use stri_split_fixed with simplify = TRUE (from "stringi"), which is likely to find its way into the "splitstackshape" code soon:
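A minimal sketch of this approach, again with made-up sample data and hypothetical column names `Word1`, `Word2`, and so on. With `simplify = TRUE`, `stri_split_fixed` returns a character matrix with one row per input string, and rows with fewer words are padded with empty strings (use `simplify = NA` if you would rather pad with `NA`):

```r
library(data.table)
library(stringi)

# Hypothetical sample data: one column, a different number of words per row
Data <- data.table(
  complaint = c("late delivery", "item arrived broken today", "refund")
)

# Character matrix: one row per string, one column per word position
Words <- stri_split_fixed(Data$complaint, " ", simplify = TRUE)

# Add the matrix columns to the data.table by reference
cols <- paste0("Word", seq_len(ncol(Words)))
Data[, (cols) := as.data.table(Words)]
Data
```

Because the split happens once for the whole column, this avoids repeating the `strsplit` call for every word position.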