[Steganography] Hiding data in PDF files

I'm trying to hide a file inside the code of a PDF file. I've already searched for information to help me. I uncompressed the PDF using pdftk ( pdftk pdf.pdf output uncompress.pdf uncompress ), then tried different things, such as:

  • Insert a comment: I put " %TEXT_TO_HIDE " in the uncompressed PDF file code.
  • Add a new object: I put " 0 0 obj << TEXT_TO_HIDE << endobj " in the uncompressed PDF file code.
  • Modify an existing object.

Then I compressed it again using pdftk.

In each case I get a new PDF that looks different from the original. It's not corrupted, but images have different colors and some of the original text is missing.

So, do you know any rules for changing PDF code without anyone noticing?

(PS: Sorry if my English is bad ^^ )

Answer by David van Driessche (best answer)

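In general, you cannot modify a PDF file in a text editor and expect it to remain valid. PDF is a binary format: the cross-reference table and stream lengths store byte counts, so inserting even a few characters can invalidate them. You need to read the PDF specification to figure out how to modify it safely.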
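To make the byte-offset problem concrete, here is a minimal Python sketch. It builds a tiny PDF by hand, then checks whether the cross-reference table still points at the right objects after a comment is inserted the way the question describes. The helper names (`build_minimal_pdf`, `xref_is_consistent`) are my own, and the checker assumes a single classic xref table with one subsection, which is not what every real file uses. When the table is wrong, tools may try to rebuild it, which might explain the altered output from pdftk, though I can't say for sure.

```python
import re


def build_minimal_pdf() -> bytes:
    """Assemble a tiny one-page PDF, recording each object's byte offset."""
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 200 200] >>",
    ]
    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for num, body in enumerate(objects, start=1):
        offsets.append(len(out))  # where "N 0 obj" starts
        out += b"%d 0 obj\n" % num + body + b"\nendobj\n"
    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"  # entry for the free object 0
    for off in offsets:
        out += b"%010d 00000 n \n" % off  # each entry is exactly 20 bytes
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)


def xref_is_consistent(pdf: bytes) -> bool:
    """Check that startxref and every in-use xref entry point where they claim.

    Assumption: one classic xref table with a single subsection.
    """
    found = re.search(rb"startxref\s+(\d+)", pdf)
    if found is None:
        return False
    start = int(found.group(1))
    header = re.compile(rb"xref\s+(\d+)\s+(\d+)\s+").match(pdf, start)
    if header is None:
        return False  # startxref no longer points at the xref keyword
    first, count = int(header.group(1)), int(header.group(2))
    pos = header.end()
    for i in range(count):
        entry = pdf[pos + 20 * i : pos + 20 * (i + 1)]
        offset, gen, kind = entry.split()
        if kind == b"n":
            expected = b"%d %d obj" % (first + i, int(gen))
            if not pdf.startswith(expected, int(offset)):
                return False
    return True


original = build_minimal_pdf()
# Simulate the text-editor edit from the question: a comment near the top.
edited = original.replace(b"%PDF-1.4\n", b"%PDF-1.4\n%TEXT_TO_HIDE\n", 1)

print(xref_is_consistent(original))
print(xref_is_consistent(edited))
```

The first check passes and the second fails, because every byte after the inserted comment has moved while the offsets stored in the file have not. Any edit that changes the file's length has to rewrite those offsets (and any affected stream /Length values) as well.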

That said, there are heaps of places where you can "hide" information in a PDF document. The real question is how much data you want to hide, and for what purpose. The purpose typically determines how secure this needs to be.

As some examples:

1) PDF allows embedding complete files in the actual PDF file. This is not really secure, as anyone with decent software can extract these files (but the embedded file itself could still be secured, of course).

2) PDF allows adding arbitrary objects anywhere (or almost anywhere) in the file. This is a great way to hide information, but someone with the right tools can browse the object tree (even if the file is compressed) and see what you did.

3) PDF allows adding, for example, white text on a white background, or text behind other objects. Again, there are ways around this for people with the right software.

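4) Adobe's PDF Reference notes that Acrobat only requires the %%EOF marker to appear somewhere within the last 1024 bytes of the file, so roughly 1K of fluff can follow it (although ISO 32000 does not allow this). Keep in mind that this is visible to anyone opening the file with a decent text or binary editor. (Thanks Jongware).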
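As an illustration of option 4, here is a rough Python sketch that appends a payload after the final %%EOF marker and reads it back. The function names and the `MAX_TRAILING` limit are my own assumptions (the limit is taken from the 1024-byte rule mentioned above), and the sample bytes are only a stand-in, not a valid PDF. Because nothing before the marker changes, existing byte offsets in the file are left alone.

```python
EOF_MARKER = b"%%EOF"
MAX_TRAILING = 1024  # assumption: marker must stay within the last 1024 bytes


def append_after_eof(pdf: bytes, payload: bytes) -> bytes:
    """Return a copy of pdf with payload placed after the last %%EOF marker."""
    end = pdf.rfind(EOF_MARKER)
    if end == -1:
        raise ValueError("no %%EOF marker found")
    if EOF_MARKER in payload:
        raise ValueError("payload must not contain the %%EOF marker")
    # Marker + newline + payload must fit in the tolerated tail.
    if len(EOF_MARKER) + 1 + len(payload) > MAX_TRAILING:
        raise ValueError("payload too large for the tolerated tail")
    # Keep everything up to and including the marker, then one newline.
    return pdf[: end + len(EOF_MARKER)] + b"\n" + payload


def read_after_eof(pdf: bytes) -> bytes:
    """Return whatever follows the last %%EOF marker and its newline."""
    end = pdf.rfind(EOF_MARKER)
    if end == -1:
        raise ValueError("no %%EOF marker found")
    return pdf[end + len(EOF_MARKER) + 1 :]


# Stand-in bytes for a real file; in practice, read the PDF in binary mode.
sample = b"%PDF-1.4\n...objects, xref and trailer...\n%%EOF\n"
hidden = append_after_eof(sample, b"TEXT_TO_HIDE")
print(read_after_eof(hidden))
```

This is about as fragile as hiding gets: the payload sits in plain sight in a hex editor, and a tool that rewrites the file on save will most likely drop it.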

In short, you need to define much more precisely what you want to accomplish and how "secure" secure is in your use case.

You should also consider how "robust" the method must be. For example, should someone be able to re-save your PDF file with Acrobat and still have the hidden data intact? Some of the above methods may not be robust enough to guarantee that.