Command line software to batch convert TIFF to indexable PDF

Question


2.1k views Asked by William Seemann At 29 May 2012 at 14:58

I need a utility to batch convert TIFF files to indexable PDFs. The software needs to run on Linux and must work from the command line. It does not need to be open source. I've tried the conversion using tesseract and hocr2pdf; however, they produce PDFs with garbled text (note: the text is only garbled if you "select all" text in the PDF). I've found other utilities, but they only run under Windows or don't work from the command line. Thanks in advance.


There are 5 answers

Herr von Wurst On 29 May 2012 at 15:09

Mogrify should be able to help you:

http://linux.die.net/man/1/mogrify

thb On 29 May 2012 at 15:14

This answer is oblique and only partial. Disregard if it does not apply to you.

Such software may exist, but I am not familiar with it. If your need is strong enough that you are willing to write 2000 lines of code or so to meet it, then there is the Linux-oriented Libpoppler, which gives you an interface for writing a program that produces its own custom PDF, exactly the way you want it. Unfortunately, Libpoppler, though valuable, is not particularly pleasant to code against, and if you do code against it, you will probably find yourself reading long tracts of the PDF standard.

If you do write such software, you might consider publishing it as open source.

Good luck.

Tomato On 30 May 2012 at 12:05

This is exactly what you are looking for:

http://ocr4linux.com/en:start

A command-line OCR tool for Linux based on the best-on-the-market OCR from ABBYY. (Disclaimer: I work for ABBYY.)

Orsiris de Jong On 11 September 2016 at 15:51

I wrote a bash script that uses Tesseract 3 or Abbyy OCR 11. It can batch convert or run in directory monitor mode.

In your case:

pmocr.sh --batch --target=PDF /path/to/tiff/files

See the script here: https://github.com/deajan/pmOCR
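For readers who only want to see the general shape of such a batch run, here is a minimal sketch using plain Tesseract. It is not taken from the pmOCR script. It assumes Tesseract 3.03 or newer, where the "pdf" output config writes a PDF with a hidden text layer, and batch_ocr is a hypothetical helper name used only for illustration. Whether this route avoids the garbled "select all" text described in the question is not confirmed anywhere in this thread.

```shell
#!/usr/bin/env bash
# Sketch: convert every TIFF in a directory to a searchable PDF.
# Assumption: Tesseract >= 3.03 is installed and provides the "pdf" config.
# batch_ocr is a hypothetical helper name, not part of pmOCR or Tesseract.

batch_ocr() {
    local src_dir="$1" out_dir="$2" lang="${3:-eng}"
    local tiff base failed=0

    mkdir -p "$out_dir" || return 1

    # Only lowercase .tif/.tiff extensions are matched here.
    for tiff in "$src_dir"/*.tif "$src_dir"/*.tiff; do
        [ -e "$tiff" ] || continue      # skip a glob pattern that matched nothing

        base="$(basename "$tiff")"
        base="${base%.*}"               # strip the extension

        # Tesseract appends ".pdf" to the output base on its own.
        if ! tesseract "$tiff" "$out_dir/$base" -l "$lang" pdf; then
            echo "OCR failed: $tiff" >&2
            failed=1
        fi
    done

    return "$failed"
}
```

Usage, with placeholder paths: source the file and run batch_ocr /path/to/tiff/files /path/to/pdf/output eng. The function keeps going after a failed file and returns non-zero at the end, so one bad scan does not stop the whole batch.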


William Seemann On 03 July 2012 at 05:00 BEST ANSWER

After trying several tools (including Abbyy), I decided on Vividata. They have decent pricing, run under Linux, and don't have a page-per-year limit.
