I have a few hundred PDFs that I need to rip the first page off of and then throw into Tabula.

I thought this might work using a combination of PDFTK and Apple Terminal:

for file in desktop/test/*.pdf ; do pdftk *-page1.pdf cat output combined.pdf ; done

but I get the result:

Error: Unable to find file.
Error: Failed to open PDF file: *-page1.pdf
Errors encountered. No output created.
Done. Input errors, so no output created.

It appears to be looking for one specific file and not all pdfs. Any ideas?

1 Answer

notautogenerated

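Your command fails because *-page1.pdf is passed to pdftk as an input file name, and no files matching it exist yet; the loop variable is never used. To take a page from each of several files, give every input file a handle and refer to its pages through that handle. For instance, if you had two files, you would write pdftk A=in1.pdf B=in2.pdf cat A1 B1 output out.pdf. For many files, here is a script that generates the command line automatically. Run it from the directory that contains the PDFs.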

handles=(); pages=(); n=0
for f in *.pdf; do
  h=$(tr '0-9' 'A-J' <<< "$n")   # digits to letters: 0 -> A, 12 -> BC
  handles+=("$h=$f"); pages+=("${h}1"); n=$((n+1))
done
pdftk "${handles[@]}" cat "${pages[@]}" output out.pdf
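
If it helps to see how the handles are named before running anything, here is a minimal sketch that only prints the handle each file would get. The function name handle_for and the file names in1.pdf, in2.pdf, in3.pdf are made up for illustration, and it assumes, as I understand pdftk's rules, that a handle may be one or more upper-case letters.

```shell
# Hypothetical helper (not part of pdftk): map each digit of an index
# to an upper-case letter, 0 -> A ... 9 -> J, so 12 becomes BC.
handle_for() {
  printf '%s\n' "$1" | tr '0-9' 'A-J'
}

# Print the handle each input would get; the file names are placeholders.
n=0
for f in in1.pdf in2.pdf in3.pdf; do
  echo "$(handle_for "$n")=$f"
  n=$((n + 1))
done
```

Because every index maps to a distinct string of letters, the handles stay unique however many files the glob matches.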