How to print Arabic text correctly in Python


I am using Python 2.7 and I am trying to print Arabic strings like this:

print "ذهب الطالب الى المدرسة"

It gives the following output:

ط°ظ‡ط¨ ط§ظ„ط·ط§ظ„ط¨ ط§ظ„ظ‰ ط§ظ„ظ…ط¯ط±ط³ط©

The goal is to print the text correctly, not to work out how to print each line. So how can I print the string, or the content of a text file, in its original form? Like this:

ذهب الطالب الى المدرسة

There are 8 answers

0
khelili miliana On

You need to add a few lines before your code:

import sys
reload(sys)
sys.setdefaultencoding('utf-8')  
print "ذهب الطالب الى المدرسة"
0
yorodm On

You can either prefix your string with u, like this:

print u"ذهب الطالب الى المدرسة"

or make your code compatible with Python 3 by putting this at the top of your file:

from __future__ import unicode_literals

Python 2.7 strings (known as byte strings, or bytes, in Python 3) hold raw bytes and know nothing about Unicode characters. Both the u prefix and the import statement turn your string literal into a unicode string.
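
A minimal sketch of the difference between the two string types, written so that it should run on both Python 2.7 and Python 3. The variable names are illustrative only:

```python
# -*- coding: utf-8 -*-
# A unicode string counts characters; its UTF-8 encoding counts bytes.
text = u"ذهب"                   # three Arabic letters
data = text.encode("utf-8")     # byte string (str in 2.7, bytes in 3)

char_count = len(text)          # number of characters
byte_count = len(data)          # number of bytes; each letter takes two in UTF-8
print("%d characters, %d bytes" % (char_count, byte_count))
```

This is why a plain byte-string literal can be misread: the console only sees the bytes and has to guess how to decode them.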

5
Mohammad Yusuf On

Try this:

print u"ذهب الطالب الى المدرسة"

Output:

ذهب الطالب الى المدرسة

Demo: https://repl.it/EuHM/0

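The default Python 2.7 string is a byte string. The literal holds the raw UTF-8 bytes from your source file, and the console then decodes those bytes with its own code page; the garbage in the question is what UTF-8 bytes look like when read as Windows-1256. UTF-8 itself can represent Arabic without any problem. If you prefix the literal with u, Python treats it as a unicode string and encodes it for the console when printing.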
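
As a sketch of what goes wrong, the garbled output from the question can be reproduced by encoding the text as UTF-8 and decoding the bytes as Windows-1256 (cp1256). That the asker's console uses this code page is an assumption, inferred from the shape of the output:

```python
# -*- coding: utf-8 -*-
text = u"ذهب الطالب الى المدرسة"

utf8_bytes = text.encode("utf-8")        # what a byte-string literal contains
garbled = utf8_bytes.decode("cp1256")    # what a cp1256 console would display

# The damage is reversible as long as no bytes were lost along the way.
restored = garbled.encode("cp1256").decode("utf-8")
print(restored == text)
```

Here garbled begins with the same characters as the output shown in the question, which suggests an encoding mismatch rather than corrupted data.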

2
Dan-Dev On

In Python 2.7, at the very top of your file you can declare:

# -*- coding: utf-8 -*-
print "ذهب الطالب الى المدرسة"

Updated:

If you can run this:

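# -*- coding: utf-8 -*-
import io  # the built-in open() in Python 2.7 has no encoding argument

s = u"ذهب الطالب الى المدرسة"  # unicode literal, so io.open can encode it
with io.open("file.txt", "w", encoding="utf-8") as myfile:
    myfile.write(s)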

If the generated file "file.txt" contains the correct string, then the problem lies with whatever you are displaying the text in, not with Python itself. You could try displaying it in something else, maybe even PyQt.
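
The same check can be done without opening the file in an editor. The sketch below writes the text as UTF-8, reads it back, and compares; round_trip is a hypothetical helper name, and a temporary directory is used only to keep the example self-contained:

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

def round_trip(text, path):
    """Write text to path as UTF-8, then read it back and return it."""
    with io.open(path, "w", encoding="utf-8") as f:
        f.write(text)
    with io.open(path, "r", encoding="utf-8") as f:
        return f.read()

text = u"ذهب الطالب الى المدرسة"
path = os.path.join(tempfile.mkdtemp(), "file.txt")

same = round_trip(text, path) == text
print(same)  # True means Python handled the text correctly
```

If this prints True but the console still shows garbage, the console or its font is the likely culprit.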

0
Mohamd Alhawi On
import sys
text = "اطبع هذا النص".encode("utf-8")

or

text = "اطبع هذا النص".encode()

then (in Python 3, where sys.stdout.buffer is available)

sys.stdout.buffer.write(text)

output

"اطبع هذا النص"
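
The idea above can be wrapped in a small function, sketched here for Python 3 only. write_utf8 is a hypothetical helper, and the in-memory stream stands in for a real console so the result can be inspected:

```python
import io
import sys

def write_utf8(text, stream=None):
    """Write text as UTF-8 bytes to a binary stream, bypassing stdout's encoding."""
    if stream is None:
        stream = sys.stdout.buffer   # binary layer under sys.stdout (Python 3)
    stream.write(text.encode("utf-8") + b"\n")
    stream.flush()

# Demonstration against an in-memory binary stream.
demo = io.BytesIO()
write_utf8("اطبع هذا النص", demo)
captured = demo.getvalue()
```

Calling write_utf8(text) with no stream sends the bytes straight to the terminal, which only helps if the terminal itself expects UTF-8.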
0
user16286011 On

You have two problems. First, you are using a non-Arabic font or non-Unicode text. Second, you need a function like this one, which takes "pure" (separate) Arabic letters and gives you "mixed" (joined) Arabic letters:

def mixARABIC(string2):
    import unicodedata
    string2 = string2.decode('utf8')
    new_string = ''
    for letter in string2:
        if ord(letter) < 256: unicode_letter = '\\u00'+hex(ord(letter)).replace('0x','')
        elif ord(letter) < 4096: unicode_letter = '\\u0'+hex(ord(letter)).replace('0x','')
        else: unicode_letter = '\\u'+unicodedata.decomposition(letter).split(' ')[1]
        new_string += unicode_letter
    new_string = new_string.replace('\u06CC','\u0649')
    new_string = new_string.decode('unicode_escape')
    new_string = new_string.encode('utf-8')
    return new_string
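
The key call in that function is unicodedata.decomposition, which maps an Arabic presentation-form character back to its base letter. A minimal Python 3 sketch of that single step follows; base_letter is a hypothetical helper and is not a replacement for the whole function:

```python
import unicodedata

def base_letter(ch):
    """Return the base letter of a presentation-form character, else ch itself."""
    parts = unicodedata.decomposition(ch).split()
    # Single-letter compatibility decompositions look like "<isolated> 0627".
    # Ligatures decompose to several code points and are left unchanged here.
    if len(parts) == 2 and parts[0].startswith("<"):
        return chr(int(parts[1], 16))
    return ch

alef_isolated = "\ufe8d"   # ARABIC LETTER ALEF ISOLATED FORM
print(unicodedata.decomposition(alef_isolated))   # <isolated> 0627
print(base_letter(alef_isolated) == "\u0627")     # True
```

Once the text consists of base letters, a renderer with Arabic support can choose the joined forms on its own.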
1
Waleed Mohammed On

The following code works:

import arabic_reshaper

text_to_be_reshaped =  'اللغة العربية رائعة'

reshaped_text = arabic_reshaper.reshape(text_to_be_reshaped)

rev_text = reshaped_text[::-1]  # slice backwards 

print(rev_text)
1
Jalal Razavi On

With these modules you can correct the shape and direction of your text. Just install them with pip and use them.

# install: pip install --upgrade arabic-reshaper
import arabic_reshaper

# install: pip install python-bidi
from bidi.algorithm import get_display

text = "ذهب الطالب الى المدرسة"
reshaped_text = arabic_reshaper.reshape(text)    # correct its shape
bidi_text = get_display(reshaped_text)           # correct its direction
print(bidi_text)
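
For background on why the direction needs correcting, every Unicode character carries a bidirectional class, and Arabic letters are marked as right-to-left. The Python 3 sketch below uses only the standard library to inspect those classes; bidi_classes and contains_rtl are hypothetical helper names, and this is not how python-bidi is implemented:

```python
import unicodedata

def bidi_classes(text):
    """Return the Unicode bidirectional class of each character in text."""
    return [unicodedata.bidirectional(ch) for ch in text]

def contains_rtl(text):
    """True if any character is right-to-left ('R') or Arabic letter ('AL')."""
    return any(cls in ("R", "AL") for cls in bidi_classes(text))

print(bidi_classes("\u0630 a"))   # ['AL', 'WS', 'L']
print(contains_rtl("hello"))      # False
```

A display that ignores these classes lays the letters out left to right, which is the situation the reshaping and reordering step is meant to work around.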