I noticed some very strange behavior when trying to match the <...> tags:
let s = "TEST \r\n\r\n<strong>more:</strong>"
let re = try! NSRegularExpression(pattern: "<.*?>")
let matches = re.matches(in: s, range: NSRange(location: 0, length: s.count))
This results in only 1 match (there should have been 2: <strong> and </strong>):
▿ 1 element
- 0 : <NSSimpleRegularExpressionCheckingResult: 0x600003be3ac0>{9, 8}{<NSRegularExpression: 0x600002019080> <.*?> 0x0}
However, when I remove the \r\n from the input text:
let s = "TEST <strong>more:</strong>"
I get the expected 2 matches!
▿ 2 elements
- 0 : <NSSimpleRegularExpressionCheckingResult: 0x600002e0ea00>{5, 8}{<NSRegularExpression: 0x6000035faaf0> <.*?> 0x0}
- 1 : <NSSimpleRegularExpressionCheckingResult: 0x600002e0ed40>{18, 9}{<NSRegularExpression: 0x6000035faaf0> <.*?> 0x0}
What is going on?
The problem is due to the way String encodes \r\n as a single Character. In your example there are 31 ASCII characters, but each \r\n pair is encoded as a single Character, so s.count is 29, while the NSString length, which is counted in UTF-16 code units, is 31.
The NSRange you calculate uses the Swift string length to specify a range in the NSString, which effectively removes the last two characters of the string when calculating the match. The closing > of </strong> falls outside the range, so the second tag is never matched. This can easily be confirmed by adding two or more characters to the end of the string and seeing that two matches are returned.
NSRange has an initializer, NSRange(_:in:), that calculates an NSRange from a Range<String.Index>, and when that is used your example produces two matches.
You should probably move to the new Swift regular expression API rather than use the older bridged NSString and NSRegularExpression.
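To make the mismatch concrete, here is a minimal sketch built on the string from the question. It compares the two lengths and then runs the same pattern over both ranges. The names characterCount, utf16Count, shortRange, fullRange and tags are only illustrative; the calls themselves are the standard Foundation NSRange(_:in:) and Range(_:in:) conversions.

```swift
import Foundation

let s = "TEST \r\n\r\n<strong>more:</strong>"

// Swift counts grapheme clusters: each "\r\n" pair is one Character.
let characterCount = s.count
// NSString and NSRange count UTF-16 code units: each "\r\n" pair is two units.
let utf16Count = s.utf16.count

let re = try! NSRegularExpression(pattern: "<.*?>")

// A range built from s.count stops two UTF-16 units short,
// so the closing ">" of "</strong>" is outside the searched range.
let shortRange = NSRange(location: 0, length: characterCount)
let shortMatches = re.matches(in: s, range: shortRange)

// A range converted from String indices covers the whole string.
let fullRange = NSRange(s.startIndex..<s.endIndex, in: s)
let fullMatches = re.matches(in: s, range: fullRange)

// Convert each NSRange back to a Swift range to extract the matched text.
let tags = fullMatches
    .compactMap { Range($0.range, in: s) }
    .map { String(s[$0]) }

print(characterCount, utf16Count)             // 29 31
print(shortMatches.count, fullMatches.count)  // 1 2
print(tags)                                   // ["<strong>", "</strong>"]
```

Using s.utf16.count as the length would also work here, but converting from String indices avoids having to remember which view NSRange expects.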