Extracting text from a PDF file using Smalot/pdfparser returns an empty result

Question


635 views · Asked by Awan on 27 October 2023 at 23:17

I want to extract text from a PDF file using smalot/pdfparser, but I get an empty result for some files. The PDF file has no password and opens normally in Chrome. I've tried another PDF file and it works fine.

This is my code:

$parser = new \Smalot\PdfParser\Parser(); // Parse the PDF file using the Parser library
$pdf = $parser->parseFile($file);
$metaData = $pdf->getDetails();
print_r($metaData);
$pages = $pdf->getPages();
foreach ($pages as $page) {
    $text = $page->getText();
    echo "<div>" . $text . "</div>";
}
echo $file;

The result is just:

Array
Array
(
    [Producer] => cairo 1.17.4 (https://cairographics.org
    [Pages] => 1
)
<div></div>D:\web\D\public\pdf_po/123.pdf

Can anyone explain my problem? This is my PDF file: www.mediafire.com/file/azb7yddqo2ry55j/123.pdf/file

Original Q&A

There is 1 answer

**K J** · Answer 1 · 2023-10-28T17:42:52+00:00

PDFtoText

It should give you the best results, since text in a PDF has no table divisions.

Once that text layout is converted back into Word.txt, or into a form suitable for any other text processor, you can simply draw the tables or divisions around the text. The alternative is to import it into Excel or any other spreadsheet program.

From there it is easier to cut and paste real tabular data, or to use it in any other way. The main trick is to extract the text in exactly the grid-like way it is stored in the PDF, and the closest you can get to that is re-printing the text file via PDFtoText (there are many versions, so find one that suits your needs).
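
As a rough illustration of the suggestion above, here is a minimal PHP sketch that shells out to the pdftotext command-line tool when the parser returns no text. It assumes a pdftotext binary (for example from Poppler or Xpdf) is installed and on the PATH, and that the -layout option and "-" (write to standard output) behave as they do in those versions. The function names isBlankText, buildPdftotextCommand and extractWithPdftotext are made up for this example and are not part of smalot/pdfparser.

```php
<?php
// Sketch only: fall back to the pdftotext CLI when a parser yields no text.
// Assumption: pdftotext is installed and reachable through the PATH.

/** Return true when the extracted text is nothing but whitespace. */
function isBlankText(string $text): bool
{
    return trim($text) === '';
}

/** Build the command line: -layout keeps the grid-like layout, "-" sends the text to stdout. */
function buildPdftotextCommand(string $pdfPath): string
{
    return 'pdftotext -layout ' . escapeshellarg($pdfPath) . ' -';
}

/** Run pdftotext and return its output, or null if the command failed. */
function extractWithPdftotext(string $pdfPath): ?string
{
    exec(buildPdftotextCommand($pdfPath), $lines, $exitCode);

    return $exitCode === 0 ? implode("\n", $lines) : null;
}
```

In the question's loop this could be used as `if (isBlankText($text)) { $text = extractWithPdftotext($file); }`. If pdftotext also returns nothing, the file may simply not contain extractable text, but that has not been checked against the PDF linked in the question.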
