NSString UTF8String error when the string contains Chinese characters


I am trying to encrypt data that may contain Chinese characters, but I keep getting null when I decrypt the string. The encryption scheme comes from our Android team, so I want to keep it the same. It looks like calling [[NSString alloc] initWithData:dataFrom64 encoding:NSUTF8StringEncoding] gives me an NSString representation of a UTF-8 string, and calling -[NSString UTF8String] on it returns something unexpected. I printed every intermediate value to see where it goes wrong; sorry for the mess. I can't figure out how to solve this and would really appreciate some help.

   NSLog(@"--------Test begins--------");
   NSString *chinese = @"abcd 測試";

   /** encrypt **/
   char const *testCStr = [testString UTF8String];
   char const *cStr = [chinese UTF8String];
   char *newCStr = (char*)calloc(sizeof(char), strlen(cStr));
   strcpy(newCStr, cStr);

   int lenStr = strlen(cStr);
   int lenKey = testString.length;

   for (int i = 0, j = 0; i < lenStr; i++, j++) {
      if (j >= lenKey) j = 0;
      newCStr[i] = cStr[i] ^ testCStr[j];
   }

   NSString *tempStr = [NSString stringWithUTF8String:[[NSString stringWithFormat:@"%s",newCStr] UTF8String]];
   NSData   *tempData = [tempStr dataUsingEncoding:NSUTF8StringEncoding];
   NSString *base64Str = [tempData base64EncodedString];
   char const *dataCStr = [tempData bytes];
   NSString* dataToStr = [[NSString alloc] initWithData:tempData
                                          encoding:NSUTF8StringEncoding];

   NSLog(@"chinese         : %@", chinese);
   NSLog(@"chinese utf8    : %s ", [chinese UTF8String]);
   NSLog(@"encrypted utf8  : %s ", newCStr);
   NSLog(@"--------Encrypt--------");
   NSLog(@"encrypted str   : %@", tempStr);
   NSLog(@"temp data bytes : %s", dataCStr);
   NSLog(@"data to str     : %@", dataToStr);
   NSLog(@"base64 data     : %@", base64Str);
   NSLog(@"data temp       : %@", tempData );

   /** decrypt**/
   NSData *dataFrom64 = [NSData dataFromBase64String:base64Str];
   NSString *strFromData = [[NSString alloc] initWithData:dataFrom64
                                             encoding:NSUTF8StringEncoding];
   char const *cStrFromData = [strFromData UTF8String];
   char *newStr2 = (char*)calloc(sizeof(char), strlen(cStrFromData));

   strcpy(newStr2, cStrFromData);

   for (int i = 0, j = 0; i < lenStr; i++, j++) {
      if (j >= lenKey) j = 0;
      newStr2[i] = cStrFromData[i] ^ testCStr[j];
   }

   NSLog(@"--------Decrypt--------");
   NSLog(@"data 64         : %@", dataFrom64 );
   NSLog(@"data 64 bytes   : %s", [dataFrom64 bytes]);
   NSLog(@"str from data   : %@", strFromData);
   NSLog(@"cStr from data  : %s", [strFromData UTF8String]);
   NSLog(@"decrypt utf8    : %s", newStr2);
   NSLog(@"decrypt str     : %@", [NSString stringWithUTF8String:newStr2]);

Here is the output:

   --------Test begins--------
   chinese         : abcd 測試
   chinese utf8    : abcd 測試 
   encrypted utf8  : #!B5aºÄõ–ôá 
   --------Encrypt--------
   encrypted str   : #!B5aºÄõ–ôá
   temp data bytes : #!B5aºÄõ–ôá6.889 WebSocke
   data to str     : #!B5aºÄõ–ôá
   base64 data     : IyFCNWHCusOEw7XigJPDtMOh
   data temp       : <23214235 61c2bac3 84c3b5e2 8093c3b4 c3a1>
   --------Decrypt--------
   data 64         : <23214235 61c2bac3 84c3b5e2 8093c3b4 c3a1>
   data 64 bytes   : #!B5aºÄõ–ôá
   str from data   : #!B5aºÄõ–ôá
   cStr from data  : #!B5aºÄõ–ôá
   decrypt utf8    : abcd òÇÙºÛî‚Äì√¥√°
   decrypt str     : (null)
   --------test ends--------

There is 1 answer

Martin R (best answer)

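The problem is that newCStr is not reliably null-terminated (the buffer is allocated with only strlen(cStr) bytes, so there is no room for the terminator that strcpy writes, and the XOR produces a zero byte wherever a plaintext byte equals the key byte), and after the XOR it no longer represents a valid UTF-8 string. In addition, the %s format specifier interprets the bytes in the system default encoding, not UTF-8. So this conversion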

NSString *tempStr = [NSString stringWithUTF8String:[[NSString stringWithFormat:@"%s",newCStr] UTF8String]];

is bound to fail (or give a wrong result).
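To make both failure modes concrete, here is a minimal sketch in plain C, which compiles unchanged inside an Objective-C file. The helpers `xor_bytes` and `looks_like_utf8` are hypothetical names introduced for illustration and are not part of Foundation; the validity check is deliberately simplified (it only inspects lead and continuation bytes).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: XOR `len` bytes of `src` with a repeating key
   and write the result to `dst` (which must hold at least `len` bytes). */
static void xor_bytes(const uint8_t *src, size_t len,
                      const uint8_t *key, size_t key_len, uint8_t *dst) {
    assert(key_len > 0);
    for (size_t i = 0; i < len; i++) {
        dst[i] = src[i] ^ key[i % key_len];
    }
}

/* Hypothetical helper: structural UTF-8 check. It verifies lead and
   continuation bytes only; overlong forms and surrogates are not rejected. */
static bool looks_like_utf8(const uint8_t *s, size_t len) {
    size_t i = 0;
    while (i < len) {
        size_t extra;                       /* continuation bytes expected */
        if (s[i] < 0x80)                extra = 0;
        else if ((s[i] & 0xE0) == 0xC0) extra = 1;
        else if ((s[i] & 0xF0) == 0xE0) extra = 2;
        else if ((s[i] & 0xF8) == 0xF0) extra = 3;
        else return false;                  /* stray continuation or bad lead byte */
        if (extra > len - i - 1) return false;      /* truncated sequence */
        for (size_t k = 1; k <= extra; k++) {
            if ((s[i + k] & 0xC0) != 0x80) return false;
        }
        i += extra + 1;
    }
    return true;
}
```

With the key `topsecret`, XORing the UTF-8 bytes of `abcd 測試` yields 0x85 as the sixth byte, a continuation byte with no lead byte, so `looks_like_utf8` returns false for the result. XORing `test` with the same key yields 0x00 as the first byte, so `strlen` on that result reports a length of 0.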

The following code avoids unnecessary conversions:

NSLog(@"--------Test begins--------");
NSString *plainText = @"abcd 測試";
NSString *keyString = @"topsecret";

/** encrypt **/
NSMutableData *plainData = [[plainText dataUsingEncoding:NSUTF8StringEncoding] mutableCopy];
NSData *keyData = [keyString dataUsingEncoding:NSUTF8StringEncoding];
uint8_t *plainBytes = [plainData mutableBytes];
const uint8_t *keyBytes = [keyData bytes];
for (int i = 0, j = 0; i < [plainData length]; i++, j++) {
    if (j >= [keyData length]) j = 0;
    plainBytes[i] ^= keyBytes[j];
}
NSString *base64Str = [plainData base64EncodedString];

NSLog(@"chinese         : %@", plainText);
NSLog(@"--------Encrypt--------");
NSLog(@"base64 data     : %@", base64Str);

/** decrypt**/
NSData *dataFrom64 = [NSData dataFromBase64String:base64Str];

NSMutableData *decodeData = [dataFrom64 mutableCopy];
uint8_t *decodeBytes = [decodeData mutableBytes];
for (int i = 0, j = 0; i < [decodeData length]; i++, j++) {
    if (j >= [keyData length]) j = 0;
    decodeBytes[i] ^= keyBytes[j];
}
NSString *decrypted = [[NSString alloc] initWithData:decodeData
                                              encoding:NSUTF8StringEncoding];
NSLog(@"--------Decrypt--------");
NSLog(@"decrypt str     : %@", decrypted);

Output:

--------Test begins--------
chinese         : abcd 測試
--------Encrypt--------
base64 data     : FQ0TF0WFysmc3ck=
--------Decrypt--------
decrypt str     : abcd 測試
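The same idea can be sketched at the C level, assuming you need to work on a raw buffer instead of NSMutableData: keep the length alongside the bytes and never pass the XORed data through `strlen`, `strcpy`, or a string conversion. The names `xor_in_place` and `round_trips` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: XOR a buffer in place with a repeating key.
   The length is passed explicitly, so 0x00 bytes in the data are harmless. */
static void xor_in_place(uint8_t *buf, size_t len,
                         const uint8_t *key, size_t key_len) {
    assert(key_len > 0);
    for (size_t i = 0; i < len; i++) {
        buf[i] ^= key[i % key_len];
    }
}

/* Hypothetical helper: returns 1 if encrypting and then decrypting `text`
   with `key` restores the original bytes, 0 otherwise. */
static int round_trips(const char *text, const char *key) {
    size_t len = strlen(text);      /* safe here: the plaintext is a real C string */
    size_t key_len = strlen(key);
    uint8_t buf[256];
    if (len > sizeof buf || key_len == 0) return 0;
    memcpy(buf, text, len);         /* copy exactly len bytes, no terminator needed */
    xor_in_place(buf, len, (const uint8_t *)key, key_len);   /* encrypt */
    xor_in_place(buf, len, (const uint8_t *)key, key_len);   /* decrypt */
    return memcmp(buf, text, len) == 0;
}
```

Because XOR with the same key is its own inverse, applying `xor_in_place` twice restores the input for any byte sequence, including ones that contain zero bytes after the first pass.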