Creating formatted PNG 'pages' of a document using Java/Scala


I have a document held in a scala.xml tree (easy to convert to anything else) that I'd like to turn into a series of PNG files.

For example, the document might look like this:

<doc
  title="My Document"
  author="John Doe"
  created="1 July 1977"
  published="19 July 1799"
>
  <section heading="An Analysis of Multiparticles"> <!-- Section 1 -->
    <p>Paragraph one goes here</p> <!-- INTRODUCTION! -->
    <p>Paragraph two goes here</p>
  </section>
  <section heading="Conclusion of Multiparticles"> <!-- Section 2 -->
    <p>Paragraph one goes here</p> <!-- INTRODUCTION! -->
    <p>Paragraph two goes here</p>
  </section>

</doc>

I'd then like to turn that document into a PNG that looks something like this, minus the red lines under made-up words (I'd supply the formatting rules, the typefaces to use, and so on): [image]

If possible, longer documents should be "paginated" into as many PNG files as needed, with the text flowing onto the next file after reaching, say, 500px of height.

If there is an existing Java library/package that does any part of this (or a couple that manage it all when put together), great! Otherwise, I'd like to know where I should start in writing something to do this in Scala (preferably) or Java.

Thanks!


There are 3 answers

AudioBubble On BEST ANSWER

You want to use the iText library. It lets you build and manipulate the document and generate a PDF from it; it is very advanced, very powerful, and very Java. Once you have a PDF, you can convert its pages to whatever format you like; there are plenty of PDF -> PNG tools available.

From the front page:

Developers can use iText to:

* Serve PDF to a browser
* Generate dynamic documents from XML files or databases
* Use PDF's many interactive features
* Add bookmarks, page numbers, watermarks, etc.
* Split, concatenate, and manipulate PDF pages
* Automate filling out of PDF forms
* Add digital signatures to a PDF file
Rex Kerr On

I suggest going via LaTeX with, for instance, http://htmltolatex.sourceforge.net/. Once there, you can set suitable page sizes, convert to PDF, explode the PDF into separate pages, and convert the pages to PNG at the size that you want.
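
A minimal sketch of driving that pipeline from the JVM, assuming the LaTeX file already exists and that `pdflatex` and poppler's `pdftoppm` are installed and on the PATH. Both tool choices are assumptions, not something named above; `pdftoppm -png` writes one PNG per PDF page, so it covers the "explode" and "convert" steps in one go. The class and method names are hypothetical. The page size is assumed to be set in the LaTeX source (for example with the geometry package), and the `-r` resolution then decides how many pixels each page gets.

```java
import java.io.File;
import java.io.IOException;
import java.util.List;

/** Sketch: drive a LaTeX -> PDF -> PNG pipeline by shelling out to external tools. */
final class LatexToPng {

    /** Command that compiles texFile into a PDF placed in outDir (which must exist). */
    static List<String> pdflatexCommand(File texFile, File outDir) {
        return List.of(
            "pdflatex",
            "-interaction=nonstopmode",                 // do not stop and prompt on errors
            "-output-directory=" + outDir.getPath(),
            texFile.getPath());
    }

    /** Command that writes one PNG per PDF page, named from outPrefix plus the page number. */
    static List<String> pdftoppmCommand(File pdf, File outPrefix, int dpi) {
        return List.of(
            "pdftoppm", "-png",
            "-r", Integer.toString(dpi),                // resolution in dots per inch
            pdf.getPath(),
            outPrefix.getPath());
    }

    /** Runs a command, forwarding its output, and fails on a non-zero exit code. */
    static void run(List<String> command) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(command).inheritIO().start();
        int exit = p.waitFor();
        if (exit != 0) {
            throw new IOException(command.get(0) + " exited with code " + exit);
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: LatexToPng <file.tex> <existing-output-dir>");
            return;
        }
        File tex = new File(args[0]);
        File outDir = new File(args[1]);
        String base = tex.getName().replaceFirst("\\.tex$", "");
        run(pdflatexCommand(tex, outDir));
        // 96 dpi is an arbitrary choice: a page about 5.2in tall then comes out near 500px
        run(pdftoppmCommand(new File(outDir, base + ".pdf"), new File(outDir, base), 96));
    }
}
```

Keeping the command construction separate from `run` makes it easy to swap in a different LaTeX engine or rasterizer without touching the process handling.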

Or do you really need the entire thing to be one program that runs under the JVM?
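
If it does have to stay inside the JVM, the JDK's own Java2D classes can wrap text and write PNGs without any PDF step. What follows is only a sketch under stated assumptions: paragraphs arrive as plain strings (pulling them out of the XML is left out), every page has the same fixed size, and headings, paragraph spacing, and per-element typefaces are ignored. `PngPager` and its method names are hypothetical; `LineBreakMeasurer`, `TextLayout`, and `ImageIO` are standard JDK classes.

```java
import java.awt.Color;
import java.awt.Font;
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.font.FontRenderContext;
import java.awt.font.LineBreakMeasurer;
import java.awt.font.TextAttribute;
import java.awt.font.TextLayout;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.text.AttributedString;
import java.util.ArrayList;
import java.util.List;
import javax.imageio.ImageIO;

/** Sketch: wrap paragraphs with Java2D and write them out as fixed-height PNG pages. */
final class PngPager {

    /** Greedy pagination: returns the index of the first line on each page. */
    static List<Integer> pageStarts(List<Float> lineHeights, float pageHeight) {
        List<Integer> starts = new ArrayList<>();
        float used = 0f;
        for (int i = 0; i < lineHeights.size(); i++) {
            float h = lineHeights.get(i);
            if (i == 0 || used + h > pageHeight) {   // line does not fit: open a new page
                starts.add(i);
                used = 0f;
            }
            used += h;
        }
        return starts;
    }

    /** Breaks each paragraph into lines no wider than the given width. */
    static List<TextLayout> wrap(List<String> paragraphs, Font font, float width, FontRenderContext frc) {
        List<TextLayout> lines = new ArrayList<>();
        for (String p : paragraphs) {
            if (p.isEmpty()) continue;               // addAttribute rejects zero-length text
            AttributedString text = new AttributedString(p);
            text.addAttribute(TextAttribute.FONT, font);
            LineBreakMeasurer m = new LineBreakMeasurer(text.getIterator(), frc);
            while (m.getPosition() < p.length()) {
                lines.add(m.nextLayout(width));
            }
        }
        return lines;
    }

    /** Renders the paragraphs into page-001.png, page-002.png, ... and returns the page count. */
    static int render(List<String> paragraphs, Font font, int width, int height, int margin, File outDir)
            throws IOException {
        FontRenderContext frc = new FontRenderContext(null, true, true);
        List<TextLayout> lines = wrap(paragraphs, font, width - 2 * margin, frc);
        List<Float> heights = new ArrayList<>();
        for (TextLayout l : lines) {
            heights.add(l.getAscent() + l.getDescent() + l.getLeading());
        }
        List<Integer> starts = pageStarts(heights, height - 2 * margin);
        for (int page = 0; page < starts.size(); page++) {
            int from = starts.get(page);
            int to = page + 1 < starts.size() ? starts.get(page + 1) : lines.size();
            BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
            Graphics2D g = img.createGraphics();
            // match the hints to the FontRenderContext used for measuring
            g.setRenderingHint(RenderingHints.KEY_TEXT_ANTIALIASING, RenderingHints.VALUE_TEXT_ANTIALIAS_ON);
            g.setRenderingHint(RenderingHints.KEY_FRACTIONALMETRICS, RenderingHints.VALUE_FRACTIONALMETRICS_ON);
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, width, height);
            g.setColor(Color.BLACK);
            float y = margin;
            for (TextLayout l : lines.subList(from, to)) {
                y += l.getAscent();                  // move down to this line's baseline
                l.draw(g, margin, y);
                y += l.getDescent() + l.getLeading();
            }
            g.dispose();
            ImageIO.write(img, "png", new File(outDir, String.format("page-%03d.png", page + 1)));
        }
        return starts.size();
    }

    public static void main(String[] args) throws IOException {
        List<String> paragraphs = List.of("Paragraph one goes here", "Paragraph two goes here");
        Font font = new Font(Font.SERIF, Font.PLAIN, 14);   // assumed typeface and size
        File outDir = new File(args.length > 0 ? args[0] : ".");
        int pages = render(paragraphs, font, 400, 500, 20, outDir);
        System.out.println("Wrote " + pages + " page(s)");
    }
}
```

Pagination is kept as a pure function over line heights, so the 500px limit (or any other page height) can be changed and checked without touching the drawing code.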

Sam Stainsby On

I would suggest PDF export instead. Others have mentioned iText: I've started using iText for a client (called from Scala). It seems to sit nicely between the low-level tedium of PDFBox and the higher-level JasperReports.