I am trying to create a table in a DOCX file and then convert it to a PDF using Apache POI (version 5.2.3) and the XWPF Converter (version 2.0.4) library. I have successfully created the table and merged cells in the DOCX file. However, when I convert the DOCX file to PDF using the XWPF Converter, the resulting PDF does not have the proper formatting.
ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
PdfOptions options = PdfOptions.create();
PdfConverter.getInstance().convert(document, byteArrayOutputStream, options);
byte[] pdfBytes = byteArrayOutputStream.toByteArray();
Expected result: I expect the converted PDF to maintain the table formatting and cell merging as it appears in the original DOCX file.
Actual result: The converted PDF does not accurately reflect the formatting of the table and merged cells.
The programmers of XDocReport have done a great job handling the really complex file structure of a Microsoft Word *.docx document in Office Open XML format. But, of course, there are always unsolved problems.
When it comes to tables in Word, the following problems are known to me:
A Word table might have row heights that are not set explicitly and so are determined only by the content. In that case XDocReport does not take the font descenders into account when calculating the height.
A Word table might have table cells hidden using gridBefore and wBefore for cells before the first cell in a row and/or gridAfter and wAfter for cells after the last cell in a row. Such cells are then not part of the row, and they are not created via cell merging either. XDocReport does not take this into account, and because of the missing cells the whole table structure gets damaged.
A Word table might have an alternating row background set through a table style. XDocReport does not take this into account either.
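To illustrate why ignoring gridBefore damages the table structure, here is a minimal sketch in plain Java. It is only a toy model of a single row, not POI or XDocReport code: the class GridRowSketch, its methods and the 4-column example table are my own assumptions for illustration.

```java
import java.util.Arrays;

/** Toy model of one Word table row; hypothetical, not POI or XDocReport code. */
final class GridRowSketch {

    /**
     * Returns the grid column at which each cell of a row starts.
     *
     * @param gridBefore      grid columns skipped before the first cell (w:gridBefore)
     * @param cellSpans       grid columns covered by each cell (w:gridSpan, 1 if absent)
     * @param honorGridBefore false simulates a renderer that ignores w:gridBefore
     */
    static int[] startColumns(int gridBefore, int[] cellSpans, boolean honorGridBefore) {
        int[] starts = new int[cellSpans.length];
        int column = honorGridBefore ? gridBefore : 0;
        for (int i = 0; i < cellSpans.length; i++) {
            starts[i] = column;
            column += cellSpans[i];
        }
        return starts;
    }

    /** Total grid columns a row accounts for, including the skipped ones. */
    static int coveredColumns(int gridBefore, int[] cellSpans, int gridAfter) {
        return gridBefore + Arrays.stream(cellSpans).sum() + gridAfter;
    }

    public static void main(String[] args) {
        // Assumed example: 4-column table, the row skips one column on each side.
        int gridBefore = 1;
        int gridAfter = 1;
        int[] spans = {1, 1};

        // Word's view: the two real cells sit in grid columns 1 and 2.
        System.out.println(Arrays.toString(startColumns(gridBefore, spans, true)));  // [1, 2]
        // A renderer ignoring gridBefore shifts them to columns 0 and 1.
        System.out.println(Arrays.toString(startColumns(gridBefore, spans, false))); // [0, 1]
        // Only with the skipped columns counted does the row fill the 4-column grid.
        System.out.println(coveredColumns(gridBefore, spans, gridAfter));            // 4
    }
}
```

In this model the row only lines up with the rest of the table when the skipped columns are counted. A converter that drops them places every cell of that row one column too far to the left, which is the kind of damage described above.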
There might be more. But I doubt there is any free software out there that really handles all of the complex possibilities of a Microsoft Word document. Even commercial software, except Microsoft Word itself, will have issues there.
The following short, complete Java program can be used to test:
The XWPFDocument.docx looks like so:
The resulting XWPFDocument.pdf looks like so: