I have a data frame df with some URLs. I want to extract the subcategories between the slashes in the URLs using stringr and str_extract.
My data looks like this:
Text URL
Hello www.facebook.com/group1/bla/exy/1234
Test www.facebook.com/group2/fssas/eda/1234
Text www.facebook.com/group-sdja/sdsds/adeds/23234
Texter www.facebook.com/blablabla/sdksds/sdsad
I now want to extract everything between .com/ and the next /.
I tried suburlpattern <- "^.com//{1,20}//$"
and df$categories <- str_extract(df$URL, suburlpattern)
But I only end up with NA in df$categories.
Any idea what I am doing wrong here? Is it my regex?
Any help is highly appreciated! Many thanks in advance.
This will return everything between the first pair of forward slashes.
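For reference, here is a minimal sketch of one pattern that behaves this way. It is not necessarily the exact code the answer had in mind. The attempted pattern "^.com//{1,20}//$" most likely fails because ^ and $ anchor the match to the start and end of the whole string, the unescaped . matches any character, and {1,20} only repeats the slash in front of it, so no URL can match. The sketch below assumes every URL contains the literal text ".com/" and uses a lookbehind followed by a run of non-slash characters; the sample data frame is rebuilt from the question.

```r
library(stringr)

# Sample data reconstructed from the question
df <- data.frame(
  Text = c("Hello", "Test", "Text", "Texter"),
  URL = c("www.facebook.com/group1/bla/exy/1234",
          "www.facebook.com/group2/fssas/eda/1234",
          "www.facebook.com/group-sdja/sdsds/adeds/23234",
          "www.facebook.com/blablabla/sdksds/sdsad"),
  stringsAsFactors = FALSE
)

# (?<=\\.com/) : lookbehind, requires a literal ".com/" just before the match
# [^/]+        : one or more characters that are not a forward slash
suburlpattern <- "(?<=\\.com/)[^/]+"

df$categories <- str_extract(df$URL, suburlpattern)
df$categories
# "group1" "group2" "group-sdja" "blablabla"
```

If the domain is not always .com, a pattern keyed on the first slash rather than on ".com/" may be a better fit, for example sub("^[^/]*/([^/]*).*$", "\\1", df$URL) in base R.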