PDF loses formatting when converted from .docx with Docx4j

I have a problem when converting a file from .docx to PDF.

I use docx4j 3.2.2. This is the conversion code:

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;

import org.docx4j.Docx4J;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;

public class PDFConverter {

    public static void main(String[] args) {
        createPDF();
    }

    private static void createPDF() {
        // "docxFile" and "pdfFile" stand in for the real paths.
        // try-with-resources closes both streams, so the PDF is fully written to disk.
        try (InputStream is = new FileInputStream(new File("docxFile"));
             OutputStream out = new FileOutputStream(new File("pdfFile"))) {
            WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(is);
            Docx4J.toPDF(wordMLPackage, out);
        } catch (Throwable e) {
            e.printStackTrace();
        }
    }
}

The conversion completes, but the PDF loses the formatting. Here are the original .docx and the converted PDF:

Original

Conversion

Does the conversion need some additional configuration?

Regards.

There is 1 answer

Kirill Induchnyj
package com.xxx.ecm.converter

import com.xxx.ecm.api.object.model.BaseContent
import org.docx4j.Docx4J
import org.docx4j.openpackaging.packages.WordprocessingMLPackage

class Docx2PdfConverter extends Converter
{
    // Converts .docx content to PDF entirely in memory and returns the PDF as a stream.
    InputStream convert(BaseContent content) {
        // Read the whole source document so the output buffer can be pre-sized.
        byte[] bytes = content.inputStream.bytes
        InputStream is = new ByteArrayInputStream(bytes)
        ByteArrayOutputStream outputStream = new ByteArrayOutputStream(bytes.size())
        WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(is)
        Docx4J.toPDF(wordMLPackage, outputStream)
        // Last expression is the return value; exceptions propagate to the caller.
        new ByteArrayInputStream(outputStream.toByteArray())
    }
}
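
Neither snippet above sets any conversion options, so they may not change the layout on their own. One possible cause of lost formatting, which is an assumption and not something confirmed in this thread, is that the fonts the document uses are not available on the machine running the conversion. A minimal sketch for checking which fonts the .docx declares, using only the JDK: it reads word/fontTable.xml from the .docx zip. The class name DocxFontLister is hypothetical, and the regex assumes the usual "w:" namespace prefix.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public class DocxFontLister {

    // Matches <w:font ... w:name="..."> elements; assumes the conventional "w:" prefix.
    private static final Pattern FONT_NAME =
            Pattern.compile("<w:font\\b[^>]*\\bw:name=\"([^\"]*)\"");

    /** Returns the font names declared in word/fontTable.xml, in file order. */
    public static Set<String> listDeclaredFonts(InputStream docx) throws IOException {
        Set<String> fonts = new LinkedHashSet<>();
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!"word/fontTable.xml".equals(entry.getName())) {
                    continue;
                }
                // Read the current zip entry fully into memory.
                ByteArrayOutputStream buffer = new ByteArrayOutputStream();
                byte[] chunk = new byte[8192];
                int read;
                while ((read = zip.read(chunk)) != -1) {
                    buffer.write(chunk, 0, read);
                }
                String xml = new String(buffer.toByteArray(), StandardCharsets.UTF_8);
                Matcher matcher = FONT_NAME.matcher(xml);
                while (matcher.find()) {
                    fonts.add(matcher.group(1));
                }
            }
        }
        return fonts;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java DocxFontLister <file.docx>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            for (String font : listDeclaredFonts(in)) {
                System.out.println(font);
            }
        }
    }
}
```

If a listed font is not installed where the conversion runs, a substitute font may be used, which can shift the layout. Comparing this list with the installed fonts is a cheap first check before looking into docx4j settings.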