I have problems reading text from an external XML file. Flash has no problem with ASCII characters (32-127), but it can't display extended characters (128-255). The XML contains, for example, „ (DEC: 132) and “ (DEC: 147). In the XML those characters are not visible, but they are still there, and Flash isn't able to show them. My approach was to get each charCode and convert it to a string, but that only works for printable characters.
var textToConvert:String = xml.parameters.text[1].value;
trace("LENGTH = " + textToConvert.length);
var test:String = "";
// Strings are zero-indexed, so start at 0 to include the first character
for (var i:int = 0; i < textToConvert.length; i++) {
    // DEC
    trace(textToConvert.charCodeAt(i));
    // OCT
    trace(textToConvert.charCodeAt(i).toString(8));
    // HEX
    trace(textToConvert.charCodeAt(i).toString(16));
    // Collect the HEX values in one string
    test += textToConvert.charCodeAt(i).toString(16);
    trace("SYMBOL : " + String.fromCharCode(textToConvert.charCodeAt(i)));
}
trace("TEST: " + test);
Result:
76
114
4c
SYMBOL : L
132
204
84
SYMBOL : (Not Visible)
Next, I tried prefixing each HEX value with the escape sequence "\x" and converting the result to a string, but that doesn't work either:
var s:String = "\x93\x93\x84\x93\x84";
var ba:ByteArray = new ByteArray();
ba.writeMultiByte(s, "ASCII");
trace(s);
This was my first approach (not working):
var byteArray:ByteArray = new ByteArray();
byteArray.writeMultiByte(textToConvert,"iso-8859-1");
trace("HIER: "+byteArray.readUTFBytes(byteArray.bytesAvailable));
What would be a universal approach to solve this problem?
This is the XML. It contains hidden extended characters (quotes), and I want to parse the values of the nodes including those characters:
Internally, AS3 strings are encoded as 16-bit Unicode, so they support your characters. Flash has also decoded the text correctly, since it reads the correct char code.
Does the font used for output have a glyph capable of rendering the character? This applies even to the AS3 console. Your char isn't "empty"; the output just can't draw it. If you change your trace to put quotes on either side of the character, you will see that it still writes the empty space.
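As a rough sketch of that check, the snippet below prints every character between brackets together with its code, so a character that cannot be drawn still shows up as a gap. The helper name dumpChars and the sample string are made up for illustration; in the question's code you would pass textToConvert instead.

```actionscript
// Hypothetical helper: trace each character between brackets, plus its code in hex.
function dumpChars(s:String):void {
    for (var i:int = 0; i < s.length; i++) {
        var code:Number = s.charCodeAt(i);
        // Pad the hex value to four digits, e.g. 84 -> 0084
        var hex:String = ("0000" + code.toString(16).toUpperCase()).substr(-4);
        trace("[" + s.charAt(i) + "] U+" + hex);
    }
}

// Sample input (an assumption): "L", then char code 132 (0x84), then "r"
dumpChars("L\u0084r");
```

If the brackets come out with nothing visible between them but the code is still printed, the character is in the string and only the rendering is missing.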
If you dump it to a TextField instead, using a font you know has the correct support, then it should work as expected.
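A minimal sketch of the TextField route, assuming the code runs as a frame script (or inside any DisplayObjectContainer) so that addChild is available. The font name "Arial" and the sample text are assumptions; use whatever font you know contains the glyphs.

```actionscript
import flash.text.TextField;
import flash.text.TextFieldAutoSize;
import flash.text.TextFormat;

// "Arial" is an assumption: any device font known to contain the glyphs will do.
var format:TextFormat = new TextFormat("Arial", 18);

var field:TextField = new TextField();
field.defaultTextFormat = format;          // set the format before assigning text
field.autoSize = TextFieldAutoSize.LEFT;   // grow the field to fit the text
field.text = "\u201EHello\u201C";          // sample text using the Unicode quotes U+201E and U+201C
addChild(field);
```

The sample uses the Unicode code points for the two quotes. Whether your own string renders the same way depends on which codes it actually holds and on the font you pick.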
If this doesn't meet your needs then you will need to do some kind of conversion. There is no generally accepted library to do this, as it depends on your needs. What should be done with single characters that typically need several characters to represent them? ø is normally translated to 'oe', but that may not be suitable in a fixed-length string. There is no equivalent for most Hebrew, Cyrillic, Arabic, etc. letters. What rules do you want to apply to those? You need to decide what you need and then do a conversion that matches those requirements (or pick a library that meets them).
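One way to make such rules explicit is a small lookup table from char code to replacement string. The sketch below is only an illustration: the names rules and convertText are hypothetical, and the table contents are assumptions. In particular, mapping 132 and 147 to the Unicode quotes U+201E and U+201C assumes the file was saved as Windows-1252, where those byte values stand for „ and “; the question does not state the file's encoding.

```actionscript
// Hypothetical rule table: char code -> replacement string. Fill it with your own rules.
var rules:Object = {};
rules[0x84] = "\u201E"; // 132: „ in Windows-1252 (assumed source encoding)
rules[0x93] = "\u201C"; // 147: “ in Windows-1252 (assumed source encoding)
rules[0xF8] = "oe";     // ø -> "oe", one possible transliteration rule

// Replace every character that has a rule; copy all other characters unchanged.
function convertText(input:String, table:Object):String {
    var out:String = "";
    for (var i:int = 0; i < input.length; i++) {
        // Object keys are strings, so look the code up as "132", "147", ...
        var key:String = String(input.charCodeAt(i));
        out += table.hasOwnProperty(key) ? table[key] : input.charAt(i);
    }
    return out;
}

// Sample input (an assumption): char code 132, "Hi", char code 147
trace(convertText("\u0084Hi\u0093", rules));
```

Note that a rule such as ø to "oe" changes the string length, which is exactly the kind of decision the table forces you to make up front.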