Tips on speeding up CAM::PDF when appending more than 1,000 files?

CAM::PDF from Chris Dolan has been a phenomenal asset for me. A recent project calls for combining more than 1,000 small PDF files into one big file.

All is well until the pages get up to more than 200, at which point it starts to slow down. Eventually, it takes about 30 seconds or more to append each additional file.

I'm running the following code after each append, hoping that clearing the cache will speed things up:

if ($PDF->needsSave()) { $PDF->cleansave() }

I have already reduced each of the small PDF files down to 45kb each.

Short of server upgrades, is there anything else I should do on the coding side to see improvements in speed?

Thanks in advance!

There is 1 answer

Chris Dolan On

Chris Dolan here. I've never tried using CAM::PDF at that scale, but I tested with a little 100kb PDF and could not reproduce the slowdown. I tested with this little program:

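use warnings;
use strict;
use CAM::PDF;

# Path of the PDF that gets appended to itself repeatedly
my $file = shift or die "Usage: $0 file.pdf\n";
my $in = CAM::PDF->new($file) or die "Cannot open $file\n";

for my $i (0..1000) {
   print "$i\n";   # progress output makes any slowdown visible
   $in->appendPDF(CAM::PDF->new($file));
}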

and it took about the same amount of time to append the 1000th file as the first one. Maybe there are some details of your specific PDFs that are causing pathological behavior in the library? Without more info it's really tough to say.
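To pin down where the slowdown starts, it may help to time each append individually. The following is only a sketch: `time_each` is a hypothetical helper (not part of CAM::PDF), and the CAM::PDF usage is left in comments so the block runs without the module installed.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical helper: call $step->($item) for every item and
# return the elapsed wall-clock seconds of each call, in order.
sub time_each {
    my ($step, @items) = @_;
    my @elapsed;
    for my $item (@items) {
        my $start = time;
        $step->($item);
        push @elapsed, time - $start;
    }
    return @elapsed;
}

# Possible use with CAM::PDF (assumes @files holds the input paths):
#   my $out  = CAM::PDF->new(shift @files) or die "Cannot open first file\n";
#   my @secs = time_each(sub { $out->appendPDF(CAM::PDF->new($_[0])) }, @files);
#   printf "%4d  %.3fs\n", $_ + 1, $secs[$_] for 0 .. $#secs;
```

Running this once with the `cleansave()` call inside the callback and once without it would show whether the per-append cost grows in both cases or only in one.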

Maybe the problem is that you're running out of memory and thrashing, but since the PDFs are so small I wouldn't have thought so. I assume you've tried with and without the cleansave()? As @zdmi says, combining them in a binary tree might help speed up some of the early combines but it might still be very slow combining the last few nodes.
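The binary-tree idea can be sketched roughly as follows. `merge_pairwise` is a hypothetical helper, not a CAM::PDF function; it takes any merge callback, so it can be tried on plain strings first. Whether it actually helps with these PDFs is untested.

```perl
use strict;
use warnings;

# Hypothetical helper: merge neighbours pairwise, round by round,
# until a single item is left. $merge->($left, $right) must return
# the combined item. Input order is preserved.
sub merge_pairwise {
    my ($merge, @items) = @_;
    return unless @items;
    while (@items > 1) {
        my @next;
        while (@items >= 2) {
            my ($left, $right) = splice @items, 0, 2;
            push @next, $merge->($left, $right);
        }
        push @next, @items;    # an odd item out carries over to the next round
        @items = @next;
    }
    return $items[0];
}

# Possible use with CAM::PDF (assumes @files holds the input paths;
# note that this keeps every document in memory at once):
#   my @docs = map { my $d = CAM::PDF->new($_) or die "Cannot open $_\n"; $d } @files;
#   my $all  = merge_pairwise(sub { $_[0]->appendPDF($_[1]); $_[0] }, @docs);
#   $all->cleanoutput('combined.pdf');   # write once, at the very end
```

Writing the result only once at the end, rather than calling `cleansave()` after every append, avoids re-serializing the growing document on each step. That may or may not be the cause of the slowdown described in the question.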