Multibyte characters are corrupted to ???? when read from a database and posted to an ASP page using HttpURLConnection


In my Java code, I retrieve some multibyte data from a database and build an XML DOM with that data as the value of a node. I then convert the DOM to a String and post its bytes to an ASP page via HttpURLConnection. At the receiving end, however, the data appears as ???? instead of the multibyte values. Please suggest what to do.

Things I am already doing:

1) I have set -Dfile.encoding=UTF8 as a system property.
2) When using TransformerFactory to convert my XML DOM to a String, I have set

 transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8")

to make sure the encoding is correct there. Please suggest where I am going wrong.

@Jon Skeet A few more things to add here:
1) I am getting the data from the database correctly.
2) The transformed XML also appears to be correct; I checked by saving it to my local file system.

For posting, I was earlier using something like

 dout = new DataOutputStream(urlconn.getOutputStream());
 dout.write(strXML.getBytes());
 dout.write(strXML);

and the resulting data at the receiving end was converted to ?????, but then I switched to

 dout = new OutputStreamWriter(urlconn.getOutputStream(), "UTF8");
 dout.write(strXML);

With this change the data at the receiving end appears to be correct, but a problem now occurs in how the receiver handles it. In my receiving ASP code I am using objStream.WriteLine (oXMLDom.xml), and this call fails with an internal server error. Please suggest what is wrong with the second approach.


There is 1 answer

Jon Skeet (best answer)

There are lots of potential conversions going on there. You should verify the data at every step:

  • Check that you're getting it out of the database correctly
  • See what the transformed XML looks like
  • Watch what goes over the network (including HTTP headers)
  • Check exactly what you're getting in ASP
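For the network step, it can help to know in advance which bytes should appear on the wire. The following is a minimal sketch, not code from the question: WireBytes, utf8 and hex are hypothetical names, and the sample XML value is assumed for illustration. It encodes the string explicitly as UTF-8 and prints the bytes as hex so they can be compared with a network capture.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the original post: shows the exact bytes
// that would be written to the connection for a given XML string.
final class WireBytes {

    // Encode explicitly as UTF-8 instead of relying on the platform default.
    static byte[] utf8(String xml) {
        return xml.getBytes(StandardCharsets.UTF_8);
    }

    // Render bytes as space-separated upper-case hex, e.g. "C3 A9".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Sample value, assumed for illustration only.
        String xml = "<name>\u65E5</name>";
        System.out.println(hex(utf8(xml)));
    }
}
```

If the capture shows 3F where the hex dump shows a multi-byte sequence such as E6 97 A5, the characters were lost before or while the body was written. It is also worth checking that any charset declared in the request's Content-Type header agrees with the encoding actually used for the body.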

Don't just print out the strings as strings - log the Unicode value of each character, by casting it to int:

for (int i = 0; i < text.length(); i++)
{
    char c = text.charAt(i);
    log("Character " + c + " - " + (int) c);
}
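As an illustration of what those logged values can reveal, here is a small self-contained sketch. The class and method names are hypothetical, and ISO-8859-1 merely stands in for whatever lossy charset a real pipeline might be applying. When a String is encoded with a charset that cannot represent its characters, Java substitutes a literal question mark, which then shows up as the value 63.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not from the original answer.
final class QuestionMarkDemo {

    // Encode and then decode with the same charset, as one hop of a pipeline would.
    static String roundTrip(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset);
        return new String(bytes, charset);
    }

    // Same idea as the logging loop above, collected into a single string.
    static String charValues(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append((int) text.charAt(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "\u65E5\u672C"; // two CJK characters, assumed sample data

        // UTF-8 can represent both characters, so the values survive.
        System.out.println(charValues(roundTrip(text, StandardCharsets.UTF_8)));

        // ISO-8859-1 cannot, so each character is replaced by '?' (63).
        System.out.println(charValues(roundTrip(text, StandardCharsets.ISO_8859_1)));
    }
}
```

Seeing 63 in the log at a given step, where the previous step showed larger values, suggests the conversion between those two steps is where the data is being lost.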