Removing values that contain a string using dplyr (R)


My current code removes only the values that have the exact value of "unassigned", whereas I want it to remove any value that contains "unassigned".

Here's my code

Newdata <- mydata %>%
  filter(taxon != "unassigned")

The column I'm looking to remove any "unassigned" values from is called taxon.

Thanks!

There are 2 answers

jasbner (best answer)

A grepl answer:

Newdata <- mydata %>%
  filter(!grepl("unassigned", taxon))
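
To see how the exact comparison and the substring match differ, here is a minimal sketch. The contents of mydata are made up for illustration (the real data is not shown in the question), and fixed = TRUE is an optional addition that treats the pattern as a literal string rather than a regular expression.

```r
library(dplyr)

# Hypothetical sample data; the real mydata is not shown in the question
mydata <- data.frame(
  taxon = c("Bacteria", "unassigned", "unassigned_genus", "Archaea unassigned", NA),
  count = c(10, 3, 5, 2, 1),
  stringsAsFactors = FALSE
)

# Exact comparison: drops only the row equal to "unassigned".
# It also drops the NA row, because NA != "unassigned" evaluates to NA.
exact <- mydata %>%
  filter(taxon != "unassigned")

# Substring match: drops every row whose taxon contains "unassigned".
# grepl() returns FALSE for NA, so the NA row is kept here.
Newdata <- mydata %>%
  filter(!grepl("unassigned", taxon, fixed = TRUE))
```

Note that the two approaches treat missing values differently, so if taxon can be NA you may want to handle those rows explicitly.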
Celi Manu

Try something like this:

library(tidyverse)
library(stringr)
# Create sample data
test <- c("hello", "world", "unassigned", "unassigned2", "unassigned3")
# Create data frame
df <- data.frame(test)
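# Keep the rows of "df" whose "test" column contains "unassigned".
# To remove those rows instead (as the question asks), negate the condition:
#   df %>% filter(!str_detect(test, "unassigned"))
df %>% filter(str_detect(test, "unassigned"))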

This outputs

         test
1  unassigned
2 unassigned2
3 unassigned3
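
Since the question asks to drop the matching rows rather than keep them, the same idea can be inverted. This is a sketch on the sample data above, assuming stringr 1.4.0 or later for the negate argument; the name kept is only illustrative.

```r
library(dplyr)
library(stringr)

# Same sample data as above
df <- data.frame(
  test = c("hello", "world", "unassigned", "unassigned2", "unassigned3"),
  stringsAsFactors = FALSE
)

# negate = TRUE makes str_detect() return TRUE for strings that do NOT match,
# so filter() keeps only the rows without "unassigned"
kept <- df %>%
  filter(str_detect(test, "unassigned", negate = TRUE))

# On older stringr versions, the equivalent is:
#   df %>% filter(!str_detect(test, "unassigned"))
```

Here kept should contain only the "hello" and "world" rows.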