Say I have a string that looks like this:
iword/i
Here the tag is i. This is similar to an HTML tag, except without the <> angle brackets.
Or say I have
emword/em
Here the tag is em.
What I want is a pattern that removes these tags.
I'm testing this pattern:
<([A-Z][A-Z0-9]*)\b[^>]*>(.*?)</\1>
on http://rubular.com/, but it is not working properly.
Specifically, this is what I want to do in Objective-C:
NSString *string = @"iword/i";
// pattern is the regular expression I am looking for
NSError *error = nil;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:pattern options:NSRegularExpressionCaseInsensitive error:&error];
return [regex stringByReplacingMatchesInString:string options:0 range:NSMakeRange(0, string.length) withTemplate:@""];
which should remove everything except word.
You're going to need a complete list of the HTML tags you want to remove (i, em, b, what else?), because without the angle brackets you have to search for those specific tag names.
One way of doing this is:
\b(i|em|b)(\w*)\/(i|em|b)\b
(As you've seen before with Obj-C, you will likely need to double each \ when writing the pattern as a string literal.)
In action: http://regex101.com/r/qL3cU9
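For reference, here is a minimal sketch of how the pattern above might be used from Objective-C. The helper name StripBareTags is hypothetical, and the "$2" replacement template (which keeps the second capture group, the text between the tags) is an assumption about the intended substitution, since an empty template would delete the whole match, word included.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original post): strips bracket-less
// tags such as iword/i or emword/em and keeps only the wrapped word.
static NSString *StripBareTags(NSString *input) {
    // Same pattern as above, with every backslash doubled for the
    // Objective-C string literal.
    NSString *pattern = @"\\b(i|em|b)(\\w*)\\/(i|em|b)\\b";
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (regex == nil) {
        // The pattern failed to compile; return the text unchanged.
        return input;
    }
    // Assumption: "$2" inserts the second capture group (the word between
    // the tags) in place of each match.
    return [regex stringByReplacingMatchesInString:input
                                           options:0
                                             range:NSMakeRange(0, input.length)
                                      withTemplate:@"$2"];
}
```

Calling StripBareTags(@"emword/em") should give back just word. Note that the closing group is a second alternation, so a mismatched pair such as iword/em also matches; replacing the second (i|em|b) with the backreference \\1 would probably tighten that, though that change is not part of the pattern above.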