JAVA Merge 2 PDF byte arrays


I receive two PDFs, each as a byte array, so I have two arrays, a[] and b[]. I concatenate them into a third array, c[]. When I write c[] out as a PDF, only the second file shows up, even though the length of c[] is len(a[]) + len(b[]).

I found questions about this for other programming languages, and the answers say that I can't simply concatenate the arrays like this and need to use a PDF authoring library instead. Since I receive byte arrays to begin with, is there anything else that could work in my situation?

There are 2 answers

Nenad On

You can't just concatenate the byte arrays.
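The reason is structural: a PDF ends with a trailer whose startxref entry points to the cross-reference table, and readers locate it by scanning from the end of the file. After concatenation, the last trailer in c[] is the one from b[], which is consistent with only the second file showing up. The sketch below illustrates this with plain Java; the class and method names (PdfTrailerCheck, concat, lastStartXref, countEofMarkers) are made up for this example, and the two byte arrays are fake stand-ins that only carry the markers, not valid PDFs.

```java
import java.nio.charset.StandardCharsets;

final class PdfTrailerCheck {

    // Joins two byte arrays the naive way described in the question.
    static byte[] concat(byte[] a, byte[] b) {
        byte[] c = new byte[a.length + b.length];
        System.arraycopy(a, 0, c, 0, a.length);
        System.arraycopy(b, 0, c, a.length, b.length);
        return c;
    }

    // Returns the byte offset of the last "startxref" keyword, or -1 if absent.
    static int lastStartXref(byte[] data) {
        // ISO-8859-1 maps every byte to one char, so string indexes equal byte offsets.
        String text = new String(data, StandardCharsets.ISO_8859_1);
        return text.lastIndexOf("startxref");
    }

    // Counts how many end-of-file markers the data contains.
    static int countEofMarkers(byte[] data) {
        String text = new String(data, StandardCharsets.ISO_8859_1);
        int count = 0;
        for (int i = text.indexOf("%%EOF"); i >= 0; i = text.indexOf("%%EOF", i + 1)) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Fake stand-ins for the two PDFs (assumption: only the markers matter here).
        byte[] a = "%PDF-1.4\n(first)\nstartxref\n9\n%%EOF\n".getBytes(StandardCharsets.ISO_8859_1);
        byte[] b = "%PDF-1.4\n(second)\nstartxref\n9\n%%EOF\n".getBytes(StandardCharsets.ISO_8859_1);
        byte[] c = concat(a, b);

        System.out.println("%%EOF markers in c: " + countEofMarkers(c));
        System.out.println("last startxref lies inside b's bytes: " + (lastStartXref(c) >= a.length));
    }
}
```

A merge therefore has to rebuild one page tree and one cross-reference table covering both documents, which is the job a PDF library does.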

You can find a couple of solutions for merging PDF files in the question "How to merge two PDF files into one in Java?".

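If you have the PDFs as files, you can use PDFMergerUtility from PDFBox. (In PDFBox 1.8 and 2.x, addSource also accepts an InputStream, so a byte array wrapped in a ByteArrayInputStream can be passed as well.)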

PDFMergerUtility ut = new PDFMergerUtility();
ut.addSource(...);
ut.addSource(...);
ut.addSource(...);
ut.setDestinationFileName(...);
ut.mergeDocuments();

If the PDFs are not available as files, you can use the other solution from that question, which uses iText and reads each PDF from an InputStream:

import java.io.InputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

import com.itextpdf.text.Document;
import com.itextpdf.text.pdf.PdfContentByte;
import com.itextpdf.text.pdf.PdfImportedPage;
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.PdfWriter;

/**
 * Merges two or more existing PDF files
 * using the iText library.
 */
public class PDFMerger {

   static void mergePdfFiles(List<InputStream> inputPdfList,
                             OutputStream outputStream) throws Exception{
      //Create document and pdfReader objects.
      Document document = new Document();
      List<PdfReader> readers = 
              new ArrayList<PdfReader>();
      int totalPages = 0;

      //Create pdf Iterator object using inputPdfList.
      Iterator<InputStream> pdfIterator = 
          inputPdfList.iterator();

      // Create reader list for the input pdf files.
      while (pdfIterator.hasNext()) {
          InputStream pdf = pdfIterator.next();
          PdfReader pdfReader = new PdfReader(pdf);
          readers.add(pdfReader);
          totalPages = totalPages + pdfReader.getNumberOfPages();
      }

      // Create writer for the outputStream
      PdfWriter writer = PdfWriter.getInstance(document, outputStream);

      //Open document.
      document.open();

      // Direct content layer that the imported pages are drawn onto.
      PdfContentByte pageContentByte = writer.getDirectContent();

      PdfImportedPage pdfImportedPage;
      int currentPdfReaderPage = 1;
      Iterator<PdfReader> iteratorPDFReader = readers.iterator();

      // Iterate and process the reader list.
      while (iteratorPDFReader.hasNext()) {
        PdfReader pdfReader = iteratorPDFReader.next();
        //Create page and add content.
        while (currentPdfReaderPage <= pdfReader.getNumberOfPages()) {
              document.newPage();
              pdfImportedPage = 
              writer.getImportedPage(pdfReader,currentPdfReaderPage);
              pageContentByte.addTemplate(pdfImportedPage, 0, 0);
              currentPdfReaderPage++;
        }
        currentPdfReaderPage = 1;
     }

     // Closing the document writes the PDF trailer; then release the readers and the stream.
     document.close();
     for (PdfReader reader : readers) {
         reader.close();
     }
     outputStream.close();

     System.out.println("Pdf files merged successfully.");
   }

}
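
Since the question starts from byte arrays rather than streams, a small adapter can bridge the gap to a method that takes a List<InputStream>, such as mergePdfFiles above. This is only a sketch: the class name ByteArrayPdfInputs and the method asStreams are invented for illustration and are not part of iText or PDFBox.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

final class ByteArrayPdfInputs {

    // Wraps each byte array in an in-memory stream, keeping the argument order,
    // so the merged output follows the order in which the arrays are passed.
    static List<InputStream> asStreams(byte[]... pdfs) {
        List<InputStream> streams = new ArrayList<>();
        for (byte[] pdf : pdfs) {
            streams.add(new ByteArrayInputStream(pdf));
        }
        return streams;
    }

    public static void main(String[] args) {
        // Placeholder bytes only; real input would be the question's a[] and b[].
        List<InputStream> streams = asStreams(new byte[] {1, 2}, new byte[] {3});
        System.out.println("streams created: " + streams.size());
    }

    // Hypothetical usage with mergePdfFiles from the answer above
    // (same package assumed, a and b are the question's byte arrays):
    //   ByteArrayOutputStream out = new ByteArrayOutputStream();
    //   PDFMerger.mergePdfFiles(ByteArrayPdfInputs.asStreams(a, b), out);
    //   byte[] c = out.toByteArray();
}
```

Collecting the result in a ByteArrayOutputStream keeps everything in memory, so the merged PDF comes back as a byte array, just like the inputs.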
Samit On

If anyone is still looking for a solution, try this:

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.InputStream;
import org.apache.pdfbox.multipdf.PDFMergerUtility;
import org.apache.pdfbox.pdmodel.PDDocument;

// Suppose we want to append one PDF to another, main PDF.
InputStream is1 = null;
if (file1 != null) {
    byte[] file1Data = new byte[(int) file1.length()];
    // readFully fills the whole buffer; a single read() call may return fewer bytes.
    try (DataInputStream in = new DataInputStream(new FileInputStream(file1))) {
        in.readFully(file1Data);
    }
    is1 = new ByteArrayInputStream(file1Data);
}

InputStream mainContent = <your main content>

PDDocument mergedPDF = new PDDocument();
PDDocument mainDoc = PDDocument.load(mainContent);
PDFMergerUtility merger = new PDFMergerUtility();

// Append the main PDF first, then the additional file.
merger.appendDocument(mergedPDF, mainDoc);

PDDocument doc1 = null;
if (is1 != null) {
    doc1 = PDDocument.load(is1);
    merger.appendDocument(mergedPDF, doc1);  // first file appended to the main PDF
}

ByteArrayOutputStream baos = new ByteArrayOutputStream();
mergedPDF.save(baos);

// Close the documents only after saving, because the merged document still uses the sources.
mergedPDF.close();
mainDoc.close();
if (doc1 != null) {
    doc1.close();
}

// Now either save the bytes here or wrap them in an InputStream if you need one.
ByteArrayInputStream mergedInputStream = new ByteArrayInputStream(baos.toByteArray());