I have a string from which I want to extract the city; in this example, the values of interest are 'Elland Rd' and 'Leeds'.
mystring = "0000\" club_info=\"Elland Rd, Leeds\" Pitch=\"100x50\""
city = gsub(".* club_info=\"(.*),(.+)\.*", "\\2", mystring) #cant get this part to work
My theory for getting the city is to match everything after the comma, up to the backslash, but I can't seem to get the regex to recognize the backslash.
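One thing worth checking first: in R source, `\"` is an escape sequence for a double quote, so the stored string should contain no backslash at all, only quotes. The sketch below illustrates that and then stops the city capture at the closing quote instead; the variable names `has_backslash`, `n_quotes`, and `city` and the pattern itself are illustrative assumptions, not the original poster's code.

```r
mystring <- "0000\" club_info=\"Elland Rd, Leeds\" Pitch=\"100x50\""

# Print the string as R actually stores it (no backslashes appear).
cat(mystring, "\n")

# Search for a literal backslash; fixed = TRUE disables regex interpretation.
has_backslash <- grepl("\\", mystring, fixed = TRUE)

# Count the literal double quotes that the \" escapes produced.
n_quotes <- lengths(regmatches(mystring, gregexpr('"', mystring, fixed = TRUE)))

# Capture everything after the comma (and optional spaces) up to the next quote.
city <- sub('.*club_info="[^,]*, *([^"]*)".*', "\\1", mystring)
```

Single quotes around the pattern avoid having to escape the double quotes inside it.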
I prefer `strcapture` for extracting multiple patterns over repeated `gsub`ing.

(It was not required to include the `Pitch=` in there, but I thought you might use it since it appears you're doing reductive `gsub`ing.)

FYI, `x2` here has a leading space; it could be handled in the regex, but if you are not 100% positive it's present in all cases, then it might be simpler to add `trimws(.)`.

In this case it does drop from a `data.frame` to a `list`, but I'm not certain you need a frame; a named list should suffice. If you really want it as a frame (and many of my use-cases really prefer that), just add `|> as.data.frame()` to the pipe.

Regex walk-through.
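As a minimal sketch of the `strcapture` approach with the pattern annotated piece by piece: the exact pattern, the prototype, and the column names `x1` and `x3` are assumptions made for illustration (only `x2` is named in the text above), and the native pipe `|>` assumes R 4.1 or later.

```r
mystring <- "0000\" club_info=\"Elland Rd, Leeds\" Pitch=\"100x50\""

# Assumed pattern with three capture groups:
#   .*club_info="   skip everything up to and including the opening quote
#   ([^,"]*)        group 1: characters before the comma (street)
#   ,               the literal comma separator
#   ([^"]*)         group 2: characters up to the closing quote (city)
#   " Pitch="       literal text between the two quoted fields
#   ([^"]*)         group 3: characters up to the next quote (pitch size)
#   ".*             the closing quote and anything after it
pattern <- '.*club_info="([^,"]*),([^"]*)" Pitch="([^"]*)".*'

# The prototype gives the names and types of the captured columns.
proto <- data.frame(x1 = character(0), x2 = character(0), x3 = character(0))

# One row per input string, one column per capture group.
out <- strcapture(pattern, mystring, proto)

# Trim whitespace in every column; lapply() returns a named list.
trimmed <- out |> lapply(trimws)

# Convert back to a data.frame if a frame is preferred.
trimmed_df <- out |> lapply(trimws) |> as.data.frame()
```

Trimming after the capture keeps the pattern simpler than trying to absorb an optional space inside the regex.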
Also, since we know that we'll have double quotes in the pattern and not single quotes, I chose to use single quotes as the outer string-defining demarcation. If we have both, or if you want to avoid double backslashes and the like, we can use R's "raw strings" instead, where `r"{` and `}"` are the open/close delimiters. I chose braces here since parens are visually confusing with the regex parens, though brackets (`r"[` / `]"`) and parens (`r"(` / `)"`) also work.
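To illustrate the quoting options, here is a small sketch writing one pattern three ways. The pattern is an assumption for illustration, and raw strings assume R 4.0.0 or later.

```r
# Single-quoted: the inner double quotes need no escaping.
p_single <- '.*club_info="([^,"]*),([^"]*)".*'

# Double-quoted: every inner double quote must be escaped.
p_double <- ".*club_info=\"([^,\"]*),([^\"]*)\".*"

# Raw string with brace delimiters: nothing inside needs escaping.
p_raw <- r"{.*club_info="([^,"]*),([^"]*)".*}"

# Raw strings also remove the need to double backslashes.
digits_raw <- r"{(\d+)x(\d+)}"
digits_esc <- "(\\d+)x(\\d+)"

# All three spellings produce the same character string.
same_pattern <- identical(p_single, p_double) && identical(p_single, p_raw)
same_digits <- identical(digits_raw, digits_esc)
```

Which form to use is mostly a readability choice, since the regex engine receives the same string either way.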