PDF Tabular Data Extraction using pdftabextract

Asked by Havishaa Sharma on 12 July 2021 at 09:53 (272 views)

I am trying to extract tabular data from text-based PDFs. The PDFs come in different formats, so I need a generalised solution. I came across a library named "pdftabextract" for this task, but it was designed for scanned documents.

I want to use it for my text-based PDFs, but I don't know how.

Article link: https://datascience.blog.wzb.eu/2017/02/16/data-mining-ocr-pdfs-using-pdftabextract-to-liberate-tabular-data-from-scanned-documents/

The article above shows a step-by-step approach, but I don't know how to apply it to text-based PDFs. Please help.
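
To make the question more concrete, here is a minimal sketch of the kind of position-based grouping I have in mind. It does not use pdftabextract at all; it is plain Python, and it assumes the text boxes of a text-based PDF are available as XML with "top" and "left" attributes on each "text" element (the layout that poppler's "pdftohtml -xml" writes). The sample XML, the helper names (cluster_1d, nearest, boxes_to_table) and the gap thresholds are all made up for illustration and would need tuning per document.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the pdftohtml -xml layout (assumed, not from a real PDF).
SAMPLE_XML = """<pdf2xml>
<page number="1" top="0" left="0" height="1263" width="892">
<text top="100" left="50" width="40" height="12" font="0">Name</text>
<text top="101" left="200" width="30" height="12" font="0">Qty</text>
<text top="130" left="51" width="45" height="12" font="0">Apple</text>
<text top="130" left="202" width="10" height="12" font="0">3</text>
<text top="160" left="50" width="40" height="12" font="0">Pear</text>
<text top="161" left="201" width="10" height="12" font="0">7</text>
</page>
</pdf2xml>"""


def cluster_1d(values, max_gap):
    """Sort the values, split wherever two neighbours are more than
    max_gap apart, and return the mean of each resulting group."""
    centers, current = [], []
    for v in sorted(values):
        if current and v - current[-1] > max_gap:
            centers.append(sum(current) / len(current))
            current = []
        current.append(v)
    if current:
        centers.append(sum(current) / len(current))
    return centers


def nearest(centers, value):
    """Index of the cluster center closest to value."""
    return min(range(len(centers)), key=lambda i: abs(centers[i] - value))


def boxes_to_table(xml_text, row_gap=5, col_gap=20):
    """Assign every text box to a (row, column) cell based on its
    top/left position. The gap thresholds are illustrative guesses."""
    root = ET.fromstring(xml_text)
    boxes = [
        (float(t.get("left")), float(t.get("top")), "".join(t.itertext()).strip())
        for t in root.iter("text")
    ]
    rows = cluster_1d([top for _, top, _ in boxes], row_gap)
    cols = cluster_1d([left for left, _, _ in boxes], col_gap)
    table = [["" for _ in cols] for _ in rows]
    for left, top, text in boxes:
        r, c = nearest(rows, top), nearest(cols, left)
        # Several boxes may land in one cell; join them with a space.
        table[r][c] = (table[r][c] + " " + text).strip()
    return table


table = boxes_to_table(SAMPLE_XML)
for row in table:
    print(row)
```

This only works for tables whose cells are left-aligned in clean columns; right-aligned numbers, merged cells or multi-line cells would need something smarter, which is why I am asking whether pdftabextract can handle it.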

There are 0 answers
