How to extract multiple tables from multiple pages of a PDF and put them all in one DataFrame?


I want to combine all the tables in a PDF into a single DataFrame, with every table sharing the same columns. This is what I tried:

ka1 = camelot.read_pdf(r"example.pdf",'all')

for i,table in enumerate(ka1):
 v = table.df
 w = pd.concat(v)

print(w)

There is 1 answer

Sam

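pandas.concat() expects a sequence of DataFrames (for example, a list), not a single DataFrame. Your loop passes it one DataFrame at a time, which raises a TypeError, and w would be overwritten on every iteration anyway. Instead, collect the DataFrames in a list inside the for loop and concatenate them once afterwards. For example: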

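import camelot
import pandas as pd

# pages="all" makes camelot parse every page, not just the first
ka1 = camelot.read_pdf(r"example.pdf", pages="all")

# Collect each table's DataFrame, then concatenate once
v = []
for table in ka1:
    v.append(table.df)
w = pd.concat(v)

print(w)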
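
If the tables repeat the same header row on every page, you may also want to promote that row to column names and renumber the rows. Below is a minimal sketch of that idea using pandas only. It assumes each table.df has default integer column labels with the header text in its first row; the helper name combine_tables and the sample data are hypothetical stand-ins, not part of camelot.

```python
import pandas as pd


def combine_tables(frames, first_row_is_header=True):
    """Stack per-table DataFrames into one with a fresh 0..n-1 index.

    Hypothetical helper: assumes every frame has the same header row.
    """
    cleaned = []
    for df in frames:
        if first_row_is_header:
            # Promote the first row to column names and drop it from the data.
            body = df.iloc[1:].copy()
            body.columns = df.iloc[0].tolist()
            df = body
        cleaned.append(df)
    # ignore_index=True renumbers rows instead of repeating each table's index.
    return pd.concat(cleaned, ignore_index=True)


# Stand-ins for table.df from two pages (made-up data).
page1 = pd.DataFrame([["name", "qty"], ["apple", "3"], ["pear", "5"]])
page2 = pd.DataFrame([["name", "qty"], ["plum", "7"]])

combined = combine_tables([page1, page2])
print(combined)
```

With real data you would pass [table.df for table in ka1] instead of the sample frames. If the tables do not share a header row, pass first_row_is_header=False and pd.concat will align them on their existing column labels.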