Merge two PDF files page by page. Python, PyPDF2, Alteryx


My goal is to combine two PDF files into one, interleaving their pages: the first page of the first PDF should be followed by the first page of the second PDF, then the second page of each, and so on for the remaining pages.

For this project I am using PyPDF2 version 1.26.0 (a limitation of my company). I already have a script, but all the pages in the merged file are blank. Does anyone know how I can rewrite my script so that the merged file contains the data from both PDF files?

The Python script I am using, which produces blank pages:


from ayx import Alteryx
from PyPDF2 import PdfFileReader, PdfFileWriter
import os
import PyPDF2
 
 
# This project runs in Alteryx; the path points to a folder containing the two PDF files.
directory_path = Alteryx.read('#1').iloc[0, 0]
pdf_files = [file for file in os.listdir(directory_path)]
output_pdf = PdfFileWriter()
 
 
 
for i in range(0, len(pdf_files) - len(pdf_files) % 2, 2):
    path1 = os.path.join(directory_path, pdf_files[i])
    path2 = os.path.join(directory_path, pdf_files[i + 1])
    with open(path1, 'rb') as file1, open(path2, 'rb') as file2:
        reader1 = PdfFileReader(file1)
        reader2 = PdfFileReader(file2)
        
        for page_num in range(max(reader1.getNumPages(), reader2.getNumPages())):
            if page_num < reader1.getNumPages():
                output_pdf.addPage(reader1.getPage(page_num))
            if page_num < reader2.getNumPages():
                output_pdf.addPage(reader2.getPage(page_num))
                
output_file_path = os.path.join(directory_path, 'merged.pdf')
with open(output_file_path, 'wb') as output_file:
    output_pdf.write(output_file)

There is 1 answer

Ayşe Nur Aslan

You may be seeing this problem because one of the files you want to merge has fewer pages than the other. Your inner loop iterates up to the page count of the longer of the two files, so check how the shorter file is handled once its pages run out.
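
The interleaving order, including the case where one file runs out of pages first, can be sketched as below. This is a minimal sketch, not a confirmed fix: the helper names `interleave_order` and `merge_interleaved` are hypothetical, and it uses only the PyPDF2 1.26.0 calls that already appear in the question. It also keeps both input files open until after `writer.write(...)`, on the assumption that PyPDF2 reads page data from the source files at write time. In the question's script the `with` block has already closed both files by the time `output_pdf.write` runs, which may be worth ruling out as the cause of the blank pages.

```python
def interleave_order(count1, count2):
    """Return (source, page_index) pairs in interleaved order.

    source is 0 for the first file and 1 for the second. When one file
    is shorter, the remaining pages of the longer file follow in order.
    """
    order = []
    for page_num in range(max(count1, count2)):
        if page_num < count1:
            order.append((0, page_num))
        if page_num < count2:
            order.append((1, page_num))
    return order


def merge_interleaved(path1, path2, output_path):
    # Imported here so interleave_order can be tried without PyPDF2 installed.
    from PyPDF2 import PdfFileReader, PdfFileWriter

    writer = PdfFileWriter()
    # Assumption: the readers fetch page data lazily, so both inputs
    # stay open until the merged file has been written.
    with open(path1, 'rb') as file1, open(path2, 'rb') as file2:
        readers = [PdfFileReader(file1), PdfFileReader(file2)]
        counts = [reader.getNumPages() for reader in readers]
        for source, page_num in interleave_order(counts[0], counts[1]):
            writer.addPage(readers[source].getPage(page_num))
        # Write while the inputs are still open.
        with open(output_path, 'wb') as output_file:
            writer.write(output_file)
```

For a 3-page and a 2-page file, `interleave_order(3, 2)` yields `[(0, 0), (1, 0), (0, 1), (1, 1), (0, 2)]`, so the extra page of the longer file is appended at the end.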