Extract only words containing ASCII characters from vector of strings


I'm stuck on this, so any advice is welcome.

library(stringr)
b <- str_extract_all(c('hello ringпрг','trust'), regex("[a-z]+", ignore_case = TRUE))

Returns a list:

    List of 2
     $ : chr [1:2] "hello" "ring"
     $ : chr "trust"

But I want a character vector with one string per element of c('hello ringпрг','trust'), each holding that element's matched words joined together, i.e. "hello ring", "trust". Other functions and packages are welcome too.


There are 2 answers

Tyler Rinker (best answer)

Use sapply with paste as in:

b <- str_extract_all(c('hello ringпрг','trust'), regex("[a-z]+", TRUE))

sapply(b, paste, collapse = " ")

## [1] "hello ring" "trust" 
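
The same idea can be wrapped in a small helper. This is only a sketch: the name keep_ascii_words is hypothetical (it is not part of stringr), the third input string is made up for illustration, and the pattern is written as "[A-Za-z]+" without the case-insensitive flag so that only ASCII letters can match. vapply is used instead of sapply so the result is always a character vector, even when an element has no matches.

```r
library(stringr)

# Hypothetical helper, not part of stringr: keep only runs of ASCII letters
# from each string and join them with single spaces.
keep_ascii_words <- function(x) {
  # One list element per input string, each holding that string's matches.
  words <- str_extract_all(x, "[A-Za-z]+")
  # Collapse each element to exactly one string; no matches gives "".
  vapply(words, paste, character(1), collapse = " ")
}

x <- c("hello ringпрг", "trust", "привет")
keep_ascii_words(x)
# "hello ring" "trust" ""
```

The result has the same length as the input, so it can be assigned straight back into a data frame column.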
akrun

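We can use a character class that also allows spaces, so each run of letters and spaces comes back as a single match: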

unlist(str_extract_all(c('hello ringпрг','trust'), regex("[A-Za-z ]+", TRUE)))
#[1] "hello ring" "trust" 

Or use the pattern "[[:ascii:]]+" instead, which matches any run of ASCII characters (letters, digits, spaces and punctuation alike).
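
One caveat with unlist: if a string has non-ASCII text in the middle, str_extract_all returns several matches for it, so the flattened result can be longer than the input. A possible alternative, sketched here under the assumption that deleting the non-ASCII runs is acceptable, is to remove them with str_replace_all and tidy the whitespace with str_squish. The third input string is made up for illustration.

```r
library(stringr)

x <- c("hello ringпрг", "trust", "abc прг def")

# Delete every run of non-ASCII characters, then trim the ends and
# collapse repeated spaces left behind by the deletion.
ascii_only <- str_squish(str_replace_all(x, "[^[:ascii:]]+", ""))
ascii_only
# "hello ring" "trust" "abc def"
```

This keeps one output string per input string. Unlike the "[a-z]+" approach, it also keeps ASCII digits and punctuation.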