Simple HTML DOM Memory issue

723 views Asked by At

I'm trying to use PHP Simple HTML Dom Parser to parse some information from SQL query results. But it seems, that there is some HUGE memory problem with it. I create an html table using the SQL query results and then export the html table to a csv file. I am really new to this so my code is not the most efficient one. When I my query results are small the csv file is created successfully. But when the query results are large, the exported csv file does not have any sql results and instead shows this :

Fatal error: Call to a member function find() on boolean in /opt/lampp/htdocs/test.php on line 101

This is my function that takes the sqlresult and creates an html table and then exports it into a csv file:

 echo sql_to_html_table($sqlresult, $delim="\n" );

function sql_to_html_table($sqlresult, $delim="\n") {
// starting table
include_once('simple_html_dom.php');
$htmltable =  "<table>" . $delim ;
$counter   = 0 ;
// putting in lines
//while( $row = $sqlresult->mysqli_fetch_assoc()  ){
while($row = mysqli_fetch_assoc($sqlresult)) {
if ( $counter===0 ) {
// table header
$htmltable .=   "<tr>"  . $delim;
foreach ($row as $key => $value ) {
      $htmltable .=   "<th>" . $key . "</th>"  . $delim ;
  }
  $htmltable .=   "</tr>"  . $delim ;
  $counter = 22;
 }
  // table body
  $htmltable .=   "<tr>"  . $delim ;
  foreach ($row as $key => $value ) {
      $htmltable .=   "<td>" . $value . "</td>"  . $delim ;
  }
  $htmltable .=   "</tr>"   . $delim ;
 }
 // closing table

$htmltable .=   "</table>"   . $delim ;
 // return
//return( $htmltable ) ;
$html = str_get_html($htmltable);

header('Content-type: application/ms-excel');
header('Content-Disposition: attachment; filename=sample.csv');

$fp = fopen("php://output", "w");

foreach($html->find('tr') as $element)
{
$td = array();
foreach( $element->find('th') as $row)
{
    $td [] = $row->plaintext;
}
fputcsv($fp, $td);
$td = array();
foreach( $element->find('td') as $row)
{
    $td [] = $row->plaintext;
}
fputcsv($fp, $td);
}
fclose($fp);
} 

I have tried throwing an exception after $html = str_get_html($htmltable); like this:

if (!str_get_html($htmltable)) {
throw new exception('exception') ; 
}

and when I try to run the code my browser gives me this error:

Fatal error: Uncaught exception 'Exception' with message 'exception' in /opt/lampp/htdocs/test.php:96 Stack trace: #0 /opt/lampp/htdocs/test.php(62): sql_to_html_table(Object(mysqli_result), '\n') #1 {main} thrown in /opt/lampp/htdocs/test.php on line 96

2

There are 2 answers

2
Peter Geer On

Looking at a copy of simple_html_dom.php from SourceForge, this sounds like expected behavior for a sufficiently big HTML string. I see that str_get_html() has a check that will cause it to return false if the size of the string is greater than MAX_FILE_SIZE. And MAX_FILE_SIZE is defined with:

define('MAX_FILE_SIZE', 600000);

So it looks like simple_html_dom won't handle any string bigger than about 600kb. Since that's a built-in limitation, I guess your options are to either try to change the limit and see what happens or use a different library.

Alternatively, you could just skip the HTML portion altogether. If you need to generate the HTML for other purposes, that's fine, but there's no reason you can't bypass this problem by just building the CSV directly from the database results rather than from the HTML.

0
Dan Belden On

Maybe this is a little easier to understand:

function sql_to_csv($sqlresult, $delim = "\n") {

    // Loop each result into a csv row string
    while($row = mysqli_fetch_assoc($sqlresult)) {

        // Create/reset a var to hold the csv row content
        $csvRow = '';

        // Append each column value comma separated
        // Be warned of column values containing commas
        foreach ($row AS $columnValue) {
            $csvRow .= $columnValue . ',';
        }

        // Remove the trailing comma from the final column
        rtrim($csvRow, ',');

        // Send your CSV row to the browser
        echo $csvRow . $delim;
    }

    header('Content-type: text/csv');
    header('Content-Disposition: attachment; filename=sample.csv');
}

There are various issues with this approach not limited to, large output buffers, columns with multi-commas ...etc I recognise these problems but wanted to give an early approach to the solution instead of a large block of text.

The easiest way to debug PHP code is to run it with de-bugg outputting, the following may help you if the above is not useful:

var_dump($variable);
exit;

This will enable you to see the contents of the variable at run time, and may give better indication to your exception, given the line-number in your exceptions.

Goodluck.