I want to edit a few lines in an uncompressed PDF. I found a similar problem, but since I need to scan the file several times to find the exact positions of the lines I want to change, that solution doesn't really suit (and the sheer number of regex matches is more than desired). The PDF contains UTF-8-decodable lines (a few of which I want to edit, bookmark target ids in particular) and a lot of binary blobs (images and so on, I guess). When I edit the file with Notepad it works fine, but when I do it programmatically (reading it in, changing a few lines, writing it back), images and some formatting are missing, since with the ignore option those bytes are never read in the first place:
with codecs.open("merged-uncompressed.pdf", "r", encoding='ascii', errors='ignore') as f:
I can read the file in with errors="surrogateescape" and wanted to map the edited lines from the import above back onto it, but I don't know whether this approach can work.
Does anyone know how to deal with this?
Best, Lukas
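For readers with the same problem, here is a minimal sketch of how the surrogateescape idea mentioned above could work. It is not the solution referred to below. The names edit_lines and retarget and the "/Dest (old-id)" pattern are made up for illustration. The sketch assumes the lines to edit are valid UTF-8: bytes that cannot be decoded are kept as placeholder code points and restored to the identical bytes on writing, and newline="" stops Python from translating line endings inside binary blobs.

```python
def edit_lines(src_path, dst_path, edit):
    """Apply edit() to every line of a mixed text/binary file without losing bytes."""
    # errors="surrogateescape" maps each undecodable byte to a lone surrogate
    # (U+DC80..U+DCFF) instead of dropping it, as errors="ignore" would.
    # newline="" keeps "\n", "\r" and "\r\n" exactly as they appear in the file.
    with open(src_path, "r", encoding="utf-8",
              errors="surrogateescape", newline="") as f:
        lines = f.readlines()

    lines = [edit(line) for line in lines]

    # Writing with the same error handler turns the surrogates back into
    # the original bytes.
    with open(dst_path, "w", encoding="utf-8",
              errors="surrogateescape", newline="") as f:
        f.writelines(lines)


def retarget(line):
    # Hypothetical edit: rename one bookmark target id.
    return line.replace("/Dest (old-id)", "/Dest (new-id)")
```

It could be called as edit_lines("merged-uncompressed.pdf", "edited.pdf", retarget), where the output file name is arbitrary. One caveat: if an edit changes the length of a line, byte offsets stored in the PDF's cross-reference table may no longer match, which this sketch does not address.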
I was able to solve this:
The code is very messy at the moment, so I don't want to publish it right now, but I plan to put it on GitHub within the next few weeks. If anyone needs it, just comment and I will give it higher priority.
Thanks to everyone who wanted to help :) Lukas