How can I read PDF document properties using Perl and CAM::PDF?

I want to read some PDF document properties with Perl. I already have CAM::PDF installed on my system.

Can this module read the properties of a PDF document? If so, could someone give an example or point me to the relevant subroutine?

Or should I use another module? If so, which one?

There are 2 answers

Chris Dolan (best answer)

I like the PDF::API2 answer from Sinan Ünür. PDF::API2 is awesome.

I'm the author of CAM::PDF. Sorry I missed this question earlier. CAM::PDF comes with a command-line tool, pdfinfo.pl, that extracts this sort of data.

My library does not support this officially, but it's easy to do if you don't mind hacking into internals.

#!perl -w
use strict;
use CAM::PDF;

my $infile = shift || die 'syntax...';
my $pdf = CAM::PDF->new($infile) || die "$CAM::PDF::errstr\n";

# Internals: the trailer's Info entry points at the document information dictionary.
my $info = $pdf->getValue($pdf->{trailer}->{Info});
if ($info) {
    for my $key (sort keys %{$info}) {
        my $value = $info->{$key};
        if ($value->{type} eq 'string') {
            print "$key: $value->{value}\n";
        } else {
            # Not a plain string, so just report the node type.
            print "$key: <$value->{type}>\n";
        }
    }
}

Sinan Ünür

I do not know much about CAM::PDF. However, if you are willing to install PDF::API2, you can do:

#!/usr/bin/env perl

use strict; use warnings;

use Data::Dumper;
use PDF::API2;

my $pdf = PDF::API2->open('U3DElements.pdf');

print Dumper { $pdf->info };

Output:

$VAR1 = {
          'ModDate' => 'D:20090427131238-07\'00\'',
          'Subject' => 'Adobe Acrobat 9.0 SDK',
          'CreationDate' => 'D:20090427125930Z',
          'Producer' => 'Acrobat Distiller 9.0.0 (Windows)',
          'Creator' => 'FrameMaker 7.2',
          'Author' => 'Adobe Developer Support',
          'Title' => 'U3D Supported Elements'
        };
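
The ModDate and CreationDate values above are raw PDF date strings rather than formatted timestamps. If you need them in a friendlier form, something like the following might do. It is only a minimal sketch: it assumes the usual D:YYYYMMDDHHmmSS layout followed by an optional Z or a +HH'mm' / -HH'mm' offset, and the helper names parse_pdf_date and pdf_date_to_iso8601 are made up for illustration (they are not part of PDF::API2 or CAM::PDF). It needs Perl 5.10 or later for the // operator.

```perl
use strict;
use warnings;

# Hypothetical helper: split a PDF date string into its fields.
# Returns a hash ref, or undef if the text does not look like a PDF date.
sub parse_pdf_date {
    my ($text) = @_;
    return unless defined $text;

    my ($year, $mon, $day, $hour, $min, $sec, $sign, $tz_hour, $tz_min) =
        $text =~ m{
            \A (?:D:)?
            (\d{4})
            (?: (\d{2})
              (?: (\d{2})
                (?: (\d{2})
                  (?: (\d{2})
                    (?: (\d{2}) )?
                  )?
                )?
              )?
            )?
            (?: ([-+Z]) (?: (\d{2}) '? (?: (\d{2}) '? )? )? )?
            \z
        }x
        or return;

    # Offset in minutes east of UTC; stays undef when no zone was given.
    my $offset;
    if (defined $sign) {
        $offset = ($tz_hour // 0) * 60 + ($tz_min // 0);
        $offset = -$offset if $sign eq '-';
        $offset = 0        if $sign eq 'Z';
    }

    return {
        year   => $year,
        month  => $mon  // '01',    # omitted fields fall back to defaults
        day    => $day  // '01',
        hour   => $hour // '00',
        minute => $min  // '00',
        second => $sec  // '00',
        utc_offset_minutes => $offset,
    };
}

# Hypothetical helper: render a PDF date string as an ISO 8601 timestamp.
sub pdf_date_to_iso8601 {
    my ($text) = @_;
    my $d = parse_pdf_date($text) or return;

    my $iso = sprintf '%04d-%02d-%02dT%02d:%02d:%02d',
        @{$d}{qw(year month day hour minute second)};

    my $off = $d->{utc_offset_minutes};
    return $iso       unless defined $off;    # no zone information
    return $iso . 'Z' if $off == 0;

    my $abs = abs $off;
    return sprintf '%s%s%02d:%02d',
        $iso, ($off < 0 ? '-' : '+'), int($abs / 60), $abs % 60;
}

print pdf_date_to_iso8601(q{D:20090427131238-07'00'}), "\n";   # 2009-04-27T13:12:38-07:00
print pdf_date_to_iso8601('D:20090427125930Z'), "\n";          # 2009-04-27T12:59:30Z
```

For anything beyond display (time zone arithmetic, comparisons), feeding the parsed fields into a proper date module is probably the better route.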