Persian PDF to Word Converter.
I have created an application that converts a PDF file into a Word file. It has a problem with Persian PDF files: the conversion produces a Word file, but the text is not readable because the letters come out in the wrong order, as if they were written in reverse.
Here is the code I have written:
import os.path
from tkinter import *
from tkinter import ttk
import tkinter.filedialog as fd
from pdf2docx import Converter


def open_file():
    # Ask the user for a PDF; an empty string means the dialog was cancelled.
    pdf_file = fd.askopenfilename(filetypes=[('PDF files', '*.pdf')])
    if not pdf_file:
        return
    # Build the output name from the PDF's base name, so the .docx file is
    # written to the current working directory.
    docx_file = os.path.splitext(os.path.basename(pdf_file))[0] + '.docx'
    cv = Converter(pdf_file)
    try:
        cv.convert(docx_file)
    finally:
        # Release the PDF even if the conversion fails.
        cv.close()


# Build the window and its widgets.
root = Tk()
root.title('PDF2Word Converter')
label_title = ttk.Label(root, text='Welcome to PDF Converter!')
label_open_pdf = ttk.Label(root, text='Open PDF: ')
label_Developer = ttk.Label(root, text='Developer: Behrooz Sharify')
open_button = ttk.Button(root, text='Open', command=open_file)

# Lay the widgets out on a grid.
label_title.grid(row=0, column=1, columnspan=2)
label_open_pdf.grid(row=2, column=0)
label_Developer.grid(row=3, column=1, columnspan=2)
open_button.grid(row=2, column=3)

root.geometry('296x70')
root.resizable(False, False)
root.mainloop()
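The step in open_file that derives the output name can be pulled into a small helper and checked without the GUI. This is only a sketch: docx_name_for is a hypothetical name (it is not part of pdf2docx), and it assumes the .docx file should land in the current working directory, as in the code above.

```python
from pathlib import PurePath


def docx_name_for(pdf_path: str) -> str:
    """Return the .docx file name for a given PDF path (hypothetical helper)."""
    # with_suffix() swaps only the final extension, so '.pdf' appearing
    # elsewhere in the name, or an upper-case '.PDF', is handled correctly.
    return PurePath(pdf_path).with_suffix(".docx").name
```

Unlike str.replace('.pdf', '.docx'), this does not touch a '.pdf' that appears in the middle of the file name.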
You can refer to the picture I uploaded here for the output problem:
I have used pdf2docx as well, and it is far from perfect. I would recommend an online converter such as smallpdf.com or pdfocr.org; the latter lets you convert scanned PDFs to Word for free.
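Before switching tools, it may be worth checking what the converted text actually contains. Some PDFs store Persian glyphs as Arabic presentation forms (code points in U+FB50 to U+FDFF and U+FE70 to U+FEFC) instead of ordinary letters. The sketch below uses only the standard library and assumes you have already copied some of the converted text into a Python string; the function names are made up for illustration, and whether this is the cause of the problem in the question is an assumption. It does not repair text that is stored in reversed order.

```python
import unicodedata


def count_presentation_forms(text: str) -> int:
    """Count characters from the Arabic Presentation Forms A and B blocks."""
    return sum(
        1
        for ch in text
        if 0xFB50 <= ord(ch) <= 0xFDFF or 0xFE70 <= ord(ch) <= 0xFEFC
    )


def normalize_presentation_forms(text: str) -> str:
    """Map presentation forms back to base letters via NFKC normalization.

    Note: NFKC also folds other compatibility characters (for example
    full-width digits), so this is a diagnostic aid, not a lossless fix.
    """
    return unicodedata.normalize("NFKC", text)
```

If count_presentation_forms returns a large number for the converted text, the source PDF is probably encoding glyph shapes rather than letters, and a different converter or OCR is more likely to help than changes to the GUI code.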