Convert pdf to text python error

Question

1.5k views · Asked by Fakhriyanto on 09 June 2015 at 13:23

I want to convert a PDF to text and save the result in a specified directory.

This is the code I tried:

import os
import subprocess

def pdftotext(pdf):
    # insert your code here
    basename, _ = os.path.splitext(os.path.basename(pdf))
    subprocess.call(['pdftotext', '-enc', 'UTF-8',
                 pdf, os.path.join('c:\pdf\pydf\data', basename + '.txt')])

pdftotext("C:\\pdf\\pydf\\pdfs\\ipm.pdf")
with open(os.path.join('c:\\pdf\\pydf\\data', 'ipm.txt')) as infile:
   print(infile.read(1000))

But I get this error:

Traceback (most recent call last):
  File "C:/pdf/browser.py", line 10, in <module>
    pdftotext("C:\\pdf\\pydf\\pdfs\\ipm.pdf")
  File "C:/pdf/browser.py", line 8, in pdftotext
    pdf, os.path.join('c:\pdf\pydf\data', basename + '.txt'), '-'])
  File "C:\Python34\lib\subprocess.py", line 537, in call
    with Popen(*popenargs, **kwargs) as p:
  File "C:\Python34\lib\subprocess.py", line 859, in __init__
    restore_signals, start_new_session)
  File "C:\Python34\lib\subprocess.py", line 1112, in _execute_child
    startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified

What's wrong with my code?

There is 1 answer

**Visgean Skeloru** · Answer 1 · 2015-06-09T13:31:28+00:00

The path to the file is incorrect. Instead of C:\pdf\pydf\pdfs\ipm.pdf, use os.path.join('c:', 'pdfs', 'pdfs', 'ipm.pdf').
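
Besides the PDF path, it may be worth checking whether the pdftotext program itself can be found. In the traceback, the FileNotFoundError is raised inside subprocess (in `_execute_child`), which on Windows generally means the executable to launch was not located, not the file passed to it as an argument. The sketch below is one way to tell the two cases apart. It assumes the `pdftotext` command-line tool is meant to be installed and reachable through PATH. The `out_dir` and `tool` parameters are hypothetical additions for illustration and are not part of the original code.

```python
import os
import shutil
import subprocess


def pdftotext(pdf, out_dir, tool="pdftotext"):
    """Convert one PDF to UTF-8 text in out_dir and return the output path.

    `tool` is an assumed parameter: the name of the converter on PATH,
    or the full path to its executable.
    """
    # shutil.which returns None when the program cannot be found on PATH.
    # Without this check, subprocess raises FileNotFoundError before the
    # PDF is ever opened, which is easy to mistake for a bad PDF path.
    exe = shutil.which(tool)
    if exe is None:
        raise FileNotFoundError(
            "{!r} was not found on PATH; install it or pass its full path"
            .format(tool))

    # Check the input separately so the two failure modes are distinguishable.
    if not os.path.isfile(pdf):
        raise FileNotFoundError("input PDF does not exist: {}".format(pdf))

    os.makedirs(out_dir, exist_ok=True)
    basename, _ = os.path.splitext(os.path.basename(pdf))
    out_path = os.path.join(out_dir, basename + ".txt")

    # check_call raises CalledProcessError if the tool exits with an error.
    subprocess.check_call([exe, "-enc", "UTF-8", pdf, out_path])
    return out_path
```

With this version, a call such as `pdftotext(r"C:\pdf\pydf\pdfs\ipm.pdf", r"C:\pdf\pydf\data")` reports which of the two files is missing. Raw strings are used for the paths because, on Windows, `os.path.join('c:', 'pdfs')` produces the drive-relative path `c:pdfs` and not `c:\pdfs`.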
