Writing Tamil characters to PDF using reportlab python module?

I have a UTF-8 encoded file with contents in Tamil (an Indian language). I have to read the contents of the file and make a PDF from them. I am using the reportlab Python module to do this.

I am able to open the file and read the contents, and printing them to the terminal displays them perfectly. However, when I write the contents to a PDF using reportlab, some characters come out wrong: for characters that are a composite of two 'character symbols', the order of the two symbols gets reversed within the composite character. I have set a Tamil font for the reportlab paragraph style. What am I missing?

import codecs
import random
from os import listdir
from os.path import isfile, join

from reportlab.lib.enums import TA_JUSTIFY
from reportlab.lib.pagesizes import A4
from reportlab.lib.styles import ParagraphStyle, getSampleStyleSheet
from reportlab.lib.units import inch
from reportlab.pdfbase import pdfmetrics
from reportlab.pdfbase.ttfonts import TTFont
from reportlab.platypus import SimpleDocTemplate, Paragraph, Spacer, PageBreak

# Register the Tamil TrueType font under the name used by the style below.
pdfmetrics.registerFont(TTFont('Latha', '/home/srinivas/Fonts/latha/latha.ttf'))

PATH = 'tamil_file'
num_sets = 1
pages_per_set = 12
num_articles_per_page = 2

styles = getSampleStyleSheet()
# fontName must match the registered font name ('Latha'); the snippet originally said 'Mangal'.
styles.add(ParagraphStyle(name='CustomPara', fontName='Latha', fontSize=14, alignment=TA_JUSTIFY, leading=24))

style = styles['CustomPara']
styleH = styles['Heading1']

# Assumption: articles are picked at random from the files in PATH
# (the selection code was not included in the posted snippet).
files = [f for f in listdir(PATH) if isfile(join(PATH, f))]

for set_idx in range(num_sets):
    doc = SimpleDocTemplate(str(set_idx) + '.pdf', pagesize=A4)
    story = []
    for page in range(pages_per_set):
        # Assumption: this inner loop was elided from the posted snippet.
        for article in range(num_articles_per_page):
            selected_file = random.choice(files)
            story.append(Spacer(1, 0.1 * inch))
            # Assumption: the file name stands in for the article heading.
            story.append(Paragraph(selected_file, styleH))
            story.append(Spacer(1, 0.1 * inch))
            lines = u''
            with codecs.open(join(PATH, selected_file), 'r', 'utf-8') as f:
                for l in f:
                    print(l)  # prints correctly in terminal
                    lines += l
            story.append(Paragraph(lines, style))
        story.append(PageBreak())
    doc.build(story)

Actual text: நாவல் மரத்தின் மருத்துவப் பயன்கள் போற்றத்தக்கவை

Saved wrong text: [image attachment, not reproduced here]

Note: If I copy the text from the PDF and paste it here, it displays fine (the wrong text above is an image attachment)!
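The note above suggests that the characters stored in the PDF are correct and that only their drawing order is off. A minimal sketch for checking this, using only the standard `unicodedata` module: it lists the code points of one composite character from the sample sentence. The helper name `describe` and the choice of syllable are illustrative, not part of the original code. In Unicode, a Tamil vowel sign is stored after its consonant even when it is drawn to the left of it, so a renderer that places glyphs in stored order without a text-shaping step would show exactly this kind of reversal. Whether that is what happens here is an assumption, not something confirmed in this thread.

```python
import unicodedata

# Last syllable of the sample sentence: "வை" (from "போற்றத்தக்கவை").
# The u'' prefix keeps the literal valid on both Python 2 and Python 3.
syllable = u"\u0bb5\u0bc8"


def describe(text):
    """Return (code point, Unicode name, general category) for each character."""
    return [
        ("U+%04X" % ord(ch), unicodedata.name(ch), unicodedata.category(ch))
        for ch in text
    ]


for row in describe(syllable):
    print(row)
```

The consonant (category `Lo`) comes first in the string and the vowel sign (category `Mc`, a spacing combining mark) second, although the vowel sign is displayed before the consonant. Running the same helper on the text read from the file and on the text copied back out of the PDF would show whether the stored order is identical in both.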
