'Smalot PDF Parser' result: all text on the same line

So I installed PDF Parser (http://www.pdfparser.org/). I checked their website and tried the demo, which gave me the result I wanted. After hours of searching for how to use Composer, I finally managed to get it working. Now I'm stuck on the next problem: how to get the same result as the demo.

I used the example code given on the documentation page. It did extract the text, but all the text is on the same line. When I use the demo, every new page starts with a new paragraph and every piece of text is placed on a separate line. Code:

<?php

// Include Composer autoloader if not already done.
include 'vendor/autoload.php';

// Parse pdf file and build necessary objects.
$parser = new \Smalot\PdfParser\Parser();
$pdf    = $parser->parseFile('document.pdf');

// Retrieve all pages from the pdf file.
$pages  = $pdf->getPages();

// Loop over each page to extract text.
foreach ($pages as $page) {
    echo $page->getText();
}

?>

As I said, when I used the code above I got all the text on one line. My question is: how can I get the same result as the script on the demo page?

There is 1 answer

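I had the same issue. The extracted text presumably does contain line breaks, but a browser collapses plain newlines when it renders HTML, so everything appears on one line. Loop over the pages this way, using nl2br() to turn each newline into a <br /> tag: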

// Loop over each page to extract text.
foreach ($pages as $page) {
    echo nl2br($page->getText());
}
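
To also get each page as its own paragraph, as the question describes for the demo, something along these lines might work. This is only a sketch: renderPagesAsHtml() is a hypothetical helper, not part of PdfParser, and the sample strings stand in for the text returned by $page->getText(). It escapes the text with htmlspecialchars() before calling nl2br(), on the assumption that the PDF may contain characters such as < or & that would otherwise be read as markup.

```php
<?php

/**
 * Hypothetical helper (not part of PdfParser): turn the plain text of each
 * page into HTML, one <p> per page, with line breaks kept as <br /> tags.
 *
 * @param string[] $pageTexts Plain text per page, e.g. from $page->getText().
 */
function renderPagesAsHtml(array $pageTexts): string
{
    $html = '';
    foreach ($pageTexts as $text) {
        // Escape characters such as < and & first, so PDF text is not read as markup.
        $escaped = htmlspecialchars($text, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
        // nl2br() inserts <br /> before each newline; browsers otherwise collapse newlines.
        $html .= '<p>' . nl2br($escaped) . "</p>\n";
    }
    return $html;
}

// Stand-in strings so the sketch runs without a PDF. With the parser, you would
// collect $page->getText() for each page from $pdf->getPages() instead.
$samplePages = ["Title\nFirst line", "Page two: a < b"];
echo renderPagesAsHtml($samplePages);
```

If the output does not need to be HTML at all, wrapping the raw text in a <pre> element or sending it as text/plain are simpler alternatives that also preserve the newlines.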