Confusion with case used by CFURLCreateStringByAddingPercentEscapes encoding


I want to URL-encode a string. My input string is "ChBdgzQ3qUpNRBEHB+bOXQNjRTQ="

The output I get is "ChBdgzQ3qUpNRBEHB%2BbOXQNjRTQ%3D", which is correct except for the case of the hex digits in the percent escapes.

Ideally, it should have been "ChBdgzQ3qUpNRBEHB%2bbOXQNjRTQ%3d", i.e. I should get %2b and %3d instead of %2B and %3D.

Could this be done?

The code I used is below:

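NSString* inputStr = @"ChBdgzQ3qUpNRBEHB+bOXQNjRTQ=";
// `encoding` is an NSStringEncoding defined elsewhere (not shown here).
// The fourth argument lists extra characters to escape, including '+' and '='.
NSString* outputStr = (NSString *)CFURLCreateStringByAddingPercentEscapes(NULL,
                                                                          (CFStringRef)inputStr,
                                                                          NULL,
                                                                          (CFStringRef)@"!*'\"();:@&=+$,/?%#[]% ",
                                                                          CFStringConvertNSStringEncodingToEncoding(encoding));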

There are 2 answers

Ja͢ck On

You can use a regular expression to post-process the encoded string:

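// Work on a mutable copy of the encoded string from the question.
NSMutableString *finalStr = [outputStr mutableCopy];
// Match the two uppercase hex digits that follow a percent sign.
NSRegularExpression *re = [[NSRegularExpression alloc] initWithPattern:@"(?<=%)[0-9A-F]{2}" options:0 error:nil];

for (NSTextCheckingResult *match in [re matchesInString:outputStr options:0 range:NSMakeRange(0, outputStr.length)]) {
    // The replacement has the same length, so ranges computed on outputStr remain valid in finalStr.
    [finalStr replaceCharactersInRange:match.range withString:[[outputStr substringWithRange:match.range] lowercaseString]];
}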

The code uses this regular expression:

(?<=%)[0-9A-F]{2}

It matches two hexadecimal characters, only if preceded by a percent sign. Each match is then iterated and replaced within a mutable string. We don't have to worry about offset changes because the replacement string is always the same length.

David H On

Another, perhaps more elegant but slower, way would be to loop over your string and convert it one character at a time: get the length of the string, then for each index from 0 to length-1 take a one-character substring and pass just that substring to CFURLCreateStringByAddingPercentEscapes. If the returned string has a length > 1, the character was encoded, and you can safely convert the result to lower case.

In all cases you append the returned (and possibly modified) string to a mutable string, and when done you have exactly what you want for any possible string. Even though this would appear to be a real processor hog, the reality is you would probably never notice the extra consumed cycles.
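The per-character idea can be sketched in plain C, which compiles unchanged in an Objective-C file. This is only an illustration under stated assumptions: a hand-rolled encoder stands in for CFURLCreateStringByAddingPercentEscapes so the example is self-contained, the function name percent_encode_lower is hypothetical, and the set of characters left unescaped (letters, digits, and - . _ ~) is an assumption that may not match what CoreFoundation keeps.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Percent-encode src one byte at a time, writing lowercase hex digits.
 * Assumption: only letters, digits, '-', '.', '_' and '~' are copied
 * through unchanged; every other byte becomes %xx.
 * Returns a malloc'd string the caller must free, or NULL on failure. */
char *percent_encode_lower(const char *src)
{
    static const char hex[] = "0123456789abcdef";
    size_t len = strlen(src);
    char *out = malloc(len * 3 + 1);   /* worst case: every byte escaped */
    if (out == NULL) {
        return NULL;
    }

    char *p = out;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)src[i];
        if (isalnum(c) || c == '-' || c == '.' || c == '_' || c == '~') {
            *p++ = (char)c;            /* kept as-is */
        } else {
            *p++ = '%';                /* escaped, lowercase hex */
            *p++ = hex[c >> 4];
            *p++ = hex[c & 0x0F];
        }
    }
    *p = '\0';
    return out;
}
```

With the input from the question, this sketch yields "ChBdgzQ3qUpNRBEHB%2bbOXQNjRTQ%3d". Because the hex digits are generated in lower case from the start, no second pass over the result is needed.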

Likewise, a second approach would be to just convert your whole string first, then copy it byte by byte to a mutable string, and if you find a "%", then turn the next two characters into lower case. Just a slightly different way to slice the problem.
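A minimal sketch of this second approach in plain C (usable from Objective-C as-is) might look like the following. The helper name lowercase_percent_escapes is hypothetical, and the sketch modifies a C string in place rather than copying into a new mutable string, which works because lowercasing never changes the length.

```c
#include <ctype.h>
#include <stddef.h>

/* Lowercase the two hex digits of every %XX escape in s, in place.
 * Characters outside an escape are left untouched, so uppercase
 * letters in the payload keep their case. */
void lowercase_percent_escapes(char *s)
{
    for (size_t i = 0; s[i] != '\0'; i++) {
        /* The && chain stops at the terminating NUL, so s[i + 2]
         * is only read when s[i + 1] is a hex digit. */
        if (s[i] == '%' &&
            isxdigit((unsigned char)s[i + 1]) &&
            isxdigit((unsigned char)s[i + 2])) {
            s[i + 1] = (char)tolower((unsigned char)s[i + 1]);
            s[i + 2] = (char)tolower((unsigned char)s[i + 2]);
            i += 2;   /* skip the two digits just handled */
        }
    }
}
```

Applied to a writable copy of "ChBdgzQ3qUpNRBEHB%2BbOXQNjRTQ%3D", this turns it into "ChBdgzQ3qUpNRBEHB%2bbOXQNjRTQ%3d". From Objective-C, one way to get such a buffer would be to duplicate the bytes returned by -UTF8String and build the result with +stringWithUTF8String: afterwards.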