I'm hoping to read the following PDF into a tidy data frame within R: PDF Table. The table stretches across 70+ pages.
I am adept at reading in tables where each cell has one line, but I'm not sure how to extend that knowledge to cases where rows have a varying number of lines.
Any help would be much appreciated!
I would suggest using `tabulizer`. It is better suited to extracting tables from PDF files. Here is the code for the file you shared:
Output (some rows):
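For the rows that span a varying number of lines, one possible post-processing step is sketched below. It is not the code from this answer: `pdf_path` and the helper `collapse_rows` are hypothetical, and the sketch assumes that a new record starts wherever the first column is non-empty, with every other line continuing the record above it. That assumption may not hold for the PDF in the question. The only `tabulizer` call used is `extract_tables()`, which by default returns one character matrix per detected table.

```r
# Hypothetical path; replace with the PDF you actually want to read.
pdf_path <- "table.pdf"

# Collapse continuation lines into the record they belong to.
# Assumption: a non-empty first column marks the start of a new record.
collapse_rows <- function(m) {
  m <- as.matrix(m)
  m[is.na(m)] <- ""
  m[] <- trimws(m)                      # keep matrix dimensions while trimming

  group <- cumsum(m[, 1] != "")         # record id for every physical line
  keep <- group > 0                     # drop stray lines before the first record
  m <- m[keep, , drop = FALSE]
  group <- group[keep]

  # Join the non-empty pieces of one cell with a single space.
  join <- function(x) paste(x[x != ""], collapse = " ")

  cols <- lapply(seq_len(ncol(m)), function(j) {
    unname(vapply(split(m[, j], group), join, character(1)))
  })
  names(cols) <- paste0("V", seq_len(ncol(m)))
  as.data.frame(cols, stringsAsFactors = FALSE)
}

# Toy input standing in for one extracted page (made-up values).
raw <- rbind(
  c("1", "Alpha", "first line"),
  c("",  "",      "second line"),
  c("2", "Beta",  "only line"),
  c("3", "Gamma", "line a"),
  c("",  "Delta", "line b")
)
tidy <- collapse_rows(raw)
print(tidy)

# Applying the same step to a real PDF, only if tabulizer and the file exist.
# Assumes every page yields the same number of columns; rbind fails otherwise.
if (requireNamespace("tabulizer", quietly = TRUE) && file.exists(pdf_path)) {
  pages <- tabulizer::extract_tables(pdf_path, method = "stream")
  tidy_pdf <- do.call(rbind, lapply(pages, collapse_rows))
}
```

If the first column is not a reliable marker of a new record in your table, swap the `m[, 1] != ""` test for whatever column or pattern identifies the start of a row.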