Conversion from ASCII to UTF-16LE giving issues

I have written sample code, following the link, to convert ASCII to UTF-16LE using iconv, but the output shows only a single character followed by blank spaces. The code is below; please let me know where I'm going wrong.

#include <iconv.h>
#include <stdio.h>
#include <string.h>

int main()
{

  char Input[20];
  char Output[100];
  size_t insize,out_size;
  memset(Input,0,sizeof(Input));
  memset(Output,0,sizeof(Output));
  int nconv=0;
  char *Inptr;
  char *outptr;  

  printf("Input data :");
  scanf("%s",Input);

  iconv_t cd = iconv_open("UTF-16LE","ASCII");

  if(cd==(iconv_t)-1)
  {
     printf("iconv_open has failed ");
     return 0;
  }

  insize=strlen(Input);

  out_size=3*insize;

  Inptr =Input;

  outptr=(char *)Output;

  nconv=iconv(cd,&Inptr,&insize,&outptr,&out_size);

  if(nconv!=0)
  {
     printf("Unable to perform conversion ");
     return 0;
  }

  printf("\n Data After conversion from ASCII to UTF-16 = %s \n ",Output);


}

The output is given below:

Input data :Hello world

Data After conversion from ASCII to UTF-16 = H

There are 2 answers

Adrian McCarthy (best answer)

When you convert "Hello" to UTF-16LE, you end up with this byte sequence (shown in hex):

48 00 65 00 6C 00 6C 00 6F 00 00 00

The printf call says to print the string as though it's a regular zero-terminated character string. It sees 48 and prints an H, and then it sees 00 and it thinks it's done.

You need a print function that can interpret the string as UTF-16LE. There isn't a standard one in C.
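One way to confirm that the conversion itself succeeded is to dump the converted bytes in hex rather than printing them with %s. The following is only a minimal sketch: the helper names ascii_to_utf16le and print_hex are made up for illustration and are not part of iconv, and it assumes a platform where iconv is available without extra link flags (as with glibc).

```c
#include <iconv.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: convert a NUL-terminated ASCII string to UTF-16LE.
   Returns the number of bytes written to out, or (size_t)-1 on failure. */
size_t ascii_to_utf16le(const char *in, char *out, size_t out_cap)
{
    iconv_t cd = iconv_open("UTF-16LE", "ASCII");
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    char *inptr = (char *)in;    /* iconv takes char **; the input bytes are not modified */
    size_t inleft = strlen(in);  /* the terminating NUL is not converted */
    char *outptr = out;
    size_t outleft = out_cap;

    size_t rc = iconv(cd, &inptr, &inleft, &outptr, &outleft);
    iconv_close(cd);
    if (rc == (size_t)-1)
        return (size_t)-1;

    /* iconv decrements outleft, so the difference is the byte count written */
    return out_cap - outleft;
}

/* Hypothetical helper: print each byte as two hex digits. */
void print_hex(const char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        printf("%02X ", (unsigned char)buf[i]);
    printf("\n");
}
```

Calling ascii_to_utf16le("Hello", out, sizeof out) on a 64-byte buffer returns 10, and print_hex(out, 10) then prints 48 00 65 00 6C 00 6C 00 6F 00. The trailing 00 00 in the sequence above is not written here because strlen excludes the terminator; in the question's program it comes from the memset of Output.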

chux - Reinstate Monica

Issues: wrong scanf() and wrong printf() format specifier.

  1. scanf("%s",Input); only scans in non-whitespace. Entering "Hello world" will only read in "Hello". Suggest using fgets() instead.

  2. The %s in printf("\n Data ... %s \n ",Output); is for C strings, not for multi-byte Output. Add the following to see detail:

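    /* after iconv(), out_size holds the unused space; subtract it from the original 3*strlen(Input) */
    size_t written = 3*strlen(Input) - out_size;
    for (size_t i=0; i<written; i++)
      printf("%3zu:%3d\n", i, Output[i]);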
    
    0: 72
    1:  0
    2:101
    3:  0
    4:108
    5:  0
    6:108
    7:  0
    8:111
    9:  0
    
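  3. printf("\n Data ... %ls \n ",Output); appears to work on my machine (note the l). But %ls expects a wchar_t *, so I think this only works where the system's wide strings are 16-bit "UTF-16LE" (as on Windows); where wchar_t is 32 bits (as on typical Linux systems) it will not print correctly.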
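Points 1 and 3 above can be sketched without relying on %ls. This is an illustration under stated assumptions, not a definitive fix: read_line and utf16le_preview are hypothetical helper names, and the preview only handles code units in the printable ASCII range (which is all an ASCII source can produce), replacing anything else with '?'.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line, spaces included, and drop the
   trailing newline. Returns 1 on success, 0 on EOF or error. */
int read_line(FILE *fp, char *buf, int cap)
{
    if (fgets(buf, cap, fp) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0';
    return 1;
}

/* Hypothetical helper: decode UTF-16LE code units into a printable
   ASCII preview. dst must have room for len/2 + 1 bytes. */
void utf16le_preview(const unsigned char *src, size_t len, char *dst)
{
    size_t n = 0;
    for (size_t i = 0; i + 1 < len; i += 2) {
        /* little-endian: low byte first, high byte second */
        uint16_t unit = (uint16_t)(src[i] | (src[i + 1] << 8));
        dst[n++] = (unit >= 0x20 && unit < 0x7F) ? (char)unit : '?';
    }
    dst[n] = '\0';
}
```

In the question's program, read_line(stdin, Input, sizeof(Input)) would replace the scanf call, and the preview would be fed the bytes iconv wrote to Output together with the count of bytes actually written.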