Two words with the same representation in UTF-8 have different representations in ASCII

Question


Asked by TJ1 on 12 December 2013 at 06:37

I have a Farsi word that, when displayed as UTF-8, looks like this:

"خطاب"

I have two versions of this word. In Notepad++ both are displayed as above when viewed as UTF-8. But if I view them in ANSI mode, for the first one I see:

ïºïºŽï»„ïº§

and for the other one I see:

Ø®Ø·Ø§Ø¨

Why does the same word have such different representations in ANSI? When I use PIL in Python to draw them, the result is correct for one and incorrect for the other.

I appreciate any help on this.


**jedivader** · Answer 1 · 2014-03-02T22:56:56+00:00

In Unicode, some characters can be represented in more than one way. In this case, the Arabic letters are represented with code points from the Arabic Presentation Forms-B block in the first case, and with code points from the regular Arabic block in the second case.
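To check which block a given string draws from, something like the following sketch may help. The two strings are built from the code points worked out further down in this answer; `block_of` and `describe` are hypothetical helper names, and the hard-coded ranges (U+0600..U+06FF for Arabic, U+FE70..U+FEFF for Arabic Presentation Forms-B) are the block boundaries from the Unicode standard, not something Python exposes directly.

```python
import unicodedata

# Code points as given later in this answer.
PRESENTATION = "\uFE8F\uFE8E\uFEC4\uFEA7"
REGULAR = "\u062E\u0637\u0627\u0628"


def block_of(ch):
    """Return the block name for the two Arabic blocks discussed here."""
    cp = ord(ch)
    if 0x0600 <= cp <= 0x06FF:
        return "Arabic"
    if 0xFE70 <= cp <= 0xFEFF:
        return "Arabic Presentation Forms-B"
    return "other"


def describe(text):
    """List (code point, block, character name) for every character."""
    return [
        (f"U+{ord(ch):04X}", block_of(ch), unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


for label, text in (("first file", PRESENTATION), ("second file", REGULAR)):
    print(label)
    for row in describe(text):
        print("  ", *row)
```

Running `describe` on text read from each file should show at a glance which of the two encodings a file uses.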

If you convert the text

ïºïºŽï»„ïº§

to a byte stream, you get

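EFBA8F EFBA8E EFBB84 EFBAA7

Notice that you are not seeing a character for the 8F byte in the text above. Assuming the ANSI code page is Windows-1252 (which the other characters suggest), 0x8F has no printable character assigned, so nothing is displayed for it.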

That byte stream is UTF-8-encoded text. Decoding it gives you the following Unicode code points:

FE8F FE8E FEC4 FEA7

You can match those in the Arabic Presentation Forms-B Block to form your Farsi text:

خطاب
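The byte-stream-to-code-point step can be reproduced in Python. This is only a sketch: it assumes the "ANSI" view in Notepad++ is Windows-1252 (`cp1252`), and it takes the first byte group as EF BA 8F, which is what U+FE8F encodes to in UTF-8. Python's `cp1252` codec treats 0x8F as undefined, so `errors="replace"` is used for the ANSI view.

```python
# Byte stream for the first text; U+FE8F encodes to EF BA 8F in UTF-8.
raw = bytes.fromhex("EFBA8F EFBA8E EFBB84 EFBAA7")

# Interpreting the bytes as UTF-8 yields the presentation-form code points.
text = raw.decode("utf-8")
code_points = [f"{ord(ch):04X}" for ch in text]
print(code_points)  # ['FE8F', 'FE8E', 'FEC4', 'FEA7']

# Interpreting the same bytes as Windows-1252 (assumed to be the "ANSI"
# code page) gives the mojibake; the undefined 0x8F byte becomes U+FFFD.
ansi_view = raw.decode("cp1252", errors="replace")
print(ansi_view)
```

The same bytes are read twice here; only the decoder changes, which is all the ANSI/UTF-8 switch in the editor does.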

You can follow the same process for the other text: Ø®Ø·Ø§Ø¨ gives you the byte stream D8AE D8B7 D8A7 D8A8, which is UTF-8-encoded text. Decoding it gives you the Unicode code points 062E 0637 0627 0628, which map to the regular Arabic block and again give you the text خطاب.
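If you need to compare the two versions in Python, one possible approach (a sketch, not necessarily what your PIL code needs) is to fold the presentation forms back to the regular letters with `unicodedata.normalize("NFKC", ...)`. For these particular code points the folded string comes out in the opposite order to the regular one, which suggests the first file stores the letters in visual (left-to-right) order. That is an inference from the code points, not something established above.

```python
import unicodedata

presentation = "\uFE8F\uFE8E\uFEC4\uFEA7"  # first file
regular = "\u062E\u0637\u0627\u0628"       # second file

# The second file's UTF-8 bytes, grouped per character.
regular_hex = " ".join(ch.encode("utf-8").hex().upper() for ch in regular)
print(regular_hex)  # D8AE D8B7 D8A7 D8A8

# NFKC replaces each presentation form with its regular Arabic letter.
folded = unicodedata.normalize("NFKC", presentation)
print([f"{ord(ch):04X}" for ch in folded])  # ['0628', '0627', '0637', '062E']

# Same letters, but in reverse order compared with the regular text.
print(folded == regular)        # False
print(folded[::-1] == regular)  # True
```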
