Grab all printfs with ptrace


I want to attach myself to a process and intercept all printf calls from that process.

main.c

#include <stdio.h>
#include <unistd.h>   // For sleep

int main(void)
{
    int i;
    for(i = 0; i < 10; i++)
    {
        printf("HelloWorld\n");
        sleep(5);
    }
    return 0;
}

To attach, I have the code below. I want it to loop forever, or until main.c finishes. An infinite loop is fine; this is only a Hello World test with ptrace, nothing fancy.

#include <stdio.h>      // For printf
#include <stdlib.h>     // For atoi
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <sys/user.h>   // For user_regs_struct

int main(int argc, char *argv[])
{  
   struct user_regs_struct regs;

   pid_t traced_process = atoi(argv[1]);

   long t = ptrace(PTRACE_ATTACH, traced_process, NULL, NULL);

   wait(NULL);

   ptrace(PTRACE_GETREGS, traced_process, NULL, &regs);
   long ins = ptrace(PTRACE_PEEKTEXT, traced_process, regs.eip, NULL);

   printf("EIP: %lx Instruction executed: %lx\n", regs.eip, ins);

   /* Print the lowest-addressed byte of the fetched word as a character. */
   char *c = (char *)&ins;
   printf("%c\n", *c);

   ptrace(PTRACE_DETACH, traced_process, NULL, NULL);

   return 0;
}

I tried to put while(1) after attaching but that will actually just loop on the first printf executed in main.c.

I am really struggling with this. Every example I run into is a copy-paste of the others, with huge amounts of code unrelated to what I'm trying to do. I do know that printf ends up as a write() call into the kernel, so that's what I should be looking for.

So again I want to get a reference to the string that printf is trying to print to the screen in the other terminal. How do I do this?

There is 1 answer

Art (best answer)

You will need to reproduce half of a debugger to do this.

In short:

  1. Resolve the address of printf in the running process.

  2. Set a breakpoint at printf.

  3. Figure out the ABI of the architecture you're on and read the first argument to printf from the stack or relevant register.

Let's go through the steps in more detail.

Step 1 - finding the address of printf in the running process.

For step one, your tracing program needs to know which binary is being traced, so that it can open that file and resolve its symbols. You'll probably want to read the ELF specification for this, or read some code that does something similar.

For a first attempt, I'd strongly recommend that you link your traced program statically. Otherwise, the next thing you need to do is figure out the hooks that the dynamic linker provides specifically for debuggers, and operating systems very often leave these undocumented. You'll probably need to read the code of a debugger for the operating system you're using and do the same thing it does: hook into the dynamic linker to find out which libraries are loaded where, then use that information to extract symbols from those libraries and locate printf.

Step one can be bypassed by having your traced program print the address of printf. This might make more sense, because getting all of these steps working at the same time is a bit of a pain. I would recommend leaving step 1 for last: it's the hardest one to get right, and doing steps 2 and 3 will teach you about some of the tools you'll need to implement reading symbols from the traced process.
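The shortcut of having the traced program print the address could look roughly like this. It is only a sketch: `report_printf_address` and `parse_address` are hypothetical helper names, and casting a function pointer to `void *` for `%p` is assumed to behave as it does on POSIX systems. The first helper would be called near the top of the traced program's main; the second belongs in the tracer, which would receive the printed value, for example as a second command-line argument.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper for the traced program: print where printf lives.
   Pass stderr as `out` to keep the address apart from normal output. */
void report_printf_address(FILE *out)
{
    /* Assumption: function pointers fit in void *, as on POSIX systems. */
    fprintf(out, "printf is at %p\n", (void *)printf);
}

/* Hypothetical helper for the tracer: turn "0x401136" (or "401136")
   into a number. Returns 0 if the text is not a valid hex address. */
uintptr_t parse_address(const char *text)
{
    assert(text != NULL);
    char *end = NULL;
    unsigned long long value = strtoull(text, &end, 16);
    if (end == text || *end != '\0')
        return 0;
    return (uintptr_t)value;
}
```

The printed address is only meaningful for the process that printed it, since address randomization can move things between runs. Also note that compilers such as GCC may turn a call like printf("HelloWorld\n") into a call to puts, so for the test program in the question, puts may be the function whose address you actually want.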

Step 2 - setting a breakpoint.

Now that you have the address of printf, let's get to step two: the breakpoint. If you're lucky, your operating system provides a ptrace operation to insert a breakpoint into a running process. Just use that and you're golden. Read the ptrace documentation for how the breakpoint is signalled to you (you usually just wait).
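On Linux, "you usually just wait" means calling waitpid on the traced process and checking that it stopped with SIGTRAP. A minimal sketch, assuming a Linux tracer that has already attached; `is_breakpoint_stop` and `continue_until_stop` are hypothetical names, not library functions:

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

/* True if a waitpid() status says the tracee stopped on SIGTRAP,
   which is how a breakpoint or a single step is reported. */
int is_breakpoint_stop(int status)
{
    return WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP;
}

/* Resume the tracee and block until its next stop or exit.
   Stores the waitpid() status in *status. Returns 0 on success, -1 on error. */
int continue_until_stop(pid_t pid, int *status)
{
    assert(status != NULL);
    /* A NULL data argument means "resume without delivering a signal". */
    if (ptrace(PTRACE_CONT, pid, NULL, NULL) == -1)
        return -1;
    if (waitpid(pid, status, 0) == -1)
        return -1;
    return 0;
}
```

A tracer loop would call `continue_until_stop` repeatedly, handle the stop when `is_breakpoint_stop` is true, and leave the loop once WIFEXITED or WIFSIGNALED reports that the process is gone. Stops caused by other signals need their own handling, which this sketch leaves out.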

If your ptrace implementation doesn't have a breakpoint feature, figure out the breakpoint instruction for your architecture and use whatever mechanism ptrace provides to overwrite the beginning of printf in the traced process with it. Then, each time you receive the breakpoint:

  1. Write back the original contents of the code that you overwrote with the breakpoint instruction.

  2. Overwrite the next instruction with a breakpoint instruction. Be careful with variable-length instruction architectures like the x86; you probably need a good instruction parser here.

  3. Adjust the instruction pointer register as necessary. Some architectures will not restart breakpoint instructions, so you have to do it yourself.

  4. Restart the program until it hits the next instruction.

  5. Restore the previous contents of the instruction you overwrote this time. On a variable-length instruction architecture you might need to repeat this until you can fit the breakpoint instruction into the first instruction of the printf function.

  6. Put the breakpoint back into the first instruction again.

Some systems have breakpoint functionality in ptrace, so check before you do all of this by hand, because it is a real pain in the ass. (Linux's ptrace, as far as I know, has no request for inserting a breakpoint, so there you are stuck with the manual route.)
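For Linux on x86 or x86-64, the manual route can be sketched as below. This is an illustration under assumptions, not a complete debugger: the function names are hypothetical, the breakpoint instruction is assumed to be the one-byte int3 (0xCC), and tracer and tracee are assumed to share the same architecture. Note that the step-over here swaps in a different technique from the one described above: instead of planting a second breakpoint on the following instruction, it uses PTRACE_SINGLESTEP, which avoids having to parse instruction lengths.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>

/* Return `word` with its lowest-addressed byte replaced by int3 (0xCC).
   x86 is little-endian, so that is the least significant byte. */
long with_int3(long word)
{
    return (word & ~0xFFL) | 0xCCL;
}

/* Plant a breakpoint at addr in a stopped tracee and save the original
   word in *saved. Returns 0 on success, -1 on error. */
int insert_breakpoint(pid_t pid, uintptr_t addr, long *saved)
{
    assert(saved != NULL);
    errno = 0;  /* PEEKTEXT returns the data itself, so -1 may be valid */
    long word = ptrace(PTRACE_PEEKTEXT, pid, (void *)addr, NULL);
    if (word == -1 && errno != 0)
        return -1;
    *saved = word;
    if (ptrace(PTRACE_POKETEXT, pid, (void *)addr, (void *)with_int3(word)) == -1)
        return -1;
    return 0;
}

/* Put the original word back. Returns 0 on success, -1 on error. */
int remove_breakpoint(pid_t pid, uintptr_t addr, long saved)
{
    return ptrace(PTRACE_POKETEXT, pid, (void *)addr, (void *)saved) == -1 ? -1 : 0;
}

/* After a breakpoint hit at addr: restore the code, rewind the instruction
   pointer, execute one instruction, then plant the breakpoint again. */
int step_over_breakpoint(pid_t pid, uintptr_t addr, long saved)
{
#if defined(__x86_64__) || defined(__i386__)
    struct user_regs_struct regs;
    int status = 0;
    long ignored = 0;

    if (ptrace(PTRACE_GETREGS, pid, NULL, &regs) == -1)
        return -1;
#if defined(__x86_64__)
    regs.rip = addr;        /* int3 left rip one byte past addr */
#else
    regs.eip = (long)addr;  /* same idea on 32-bit x86 */
#endif
    if (remove_breakpoint(pid, addr, saved) == -1 ||
        ptrace(PTRACE_SETREGS, pid, NULL, &regs) == -1 ||
        ptrace(PTRACE_SINGLESTEP, pid, NULL, NULL) == -1 ||
        waitpid(pid, &status, 0) == -1)
        return -1;
    return insert_breakpoint(pid, addr, &ignored);
#else
    (void)pid; (void)addr; (void)saved;
    return -1;  /* other architectures need their own register handling */
#endif
}
```

The tracer would call `insert_breakpoint` once after attaching, read the argument each time the breakpoint is hit, and then call `step_over_breakpoint` before continuing the process.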

Step 3 - reading the string.

This is simple. Find the ABI document, or read code someone else wrote, or compile a simple function to assembly and look at how the generated code accesses the first argument of a function. Use that information to pull out the first argument to printf, which is a pointer to the format string. You'll need the GETREGS (or equivalent) functionality in ptrace to get the registers, and then PEEKDATA (or equivalent) to read the string data from the traced process.
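As a rough illustration for Linux on x86, the sketch below fetches the first argument and then copies the string out one word at a time. The function names are hypothetical. It assumes the tracee is stopped exactly at the function's first instruction, that tracer and tracee share the same architecture, and that the usual conventions apply: System V AMD64 passes the first integer argument in rdi, while 32-bit cdecl keeps it on the stack just above the return address.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>

/* Fetch the first argument of the function the tracee is about to enter.
   Returns 0 on success, -1 on error or unsupported architecture. */
int first_argument(pid_t pid, uintptr_t *arg)
{
#if defined(__x86_64__)
    struct user_regs_struct regs;
    if (ptrace(PTRACE_GETREGS, pid, NULL, &regs) == -1)
        return -1;
    *arg = (uintptr_t)regs.rdi;  /* first integer argument lives in rdi */
    return 0;
#elif defined(__i386__)
    struct user_regs_struct regs;
    if (ptrace(PTRACE_GETREGS, pid, NULL, &regs) == -1)
        return -1;
    /* At entry: [esp] is the return address, [esp + 4] the first argument. */
    errno = 0;
    long word = ptrace(PTRACE_PEEKDATA, pid, (void *)(regs.esp + 4), NULL);
    if (word == -1 && errno != 0)
        return -1;
    *arg = (uintptr_t)word;
    return 0;
#else
    (void)pid; (void)arg;
    return -1;
#endif
}

/* Copy the bytes of one peeked word into dst until a NUL byte is seen or
   `room` bytes are used. Returns the count copied; sets *done on NUL. */
size_t copy_word_bytes(long word, char *dst, size_t room, int *done)
{
    unsigned char bytes[sizeof word];
    size_t n = 0;
    memcpy(bytes, &word, sizeof word);
    *done = 0;
    while (n < sizeof word && n < room) {
        if (bytes[n] == '\0') {
            *done = 1;
            break;
        }
        dst[n] = (char)bytes[n];
        n++;
    }
    return n;
}

/* Read a NUL-terminated string from the tracee into buf (at most size - 1
   characters). Returns the length read, or -1 on error. A read that runs
   past the end of the tracee's mapped memory shows up as an error. */
ssize_t read_tracee_string(pid_t pid, uintptr_t addr, char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    size_t len = 0;
    int done = 0;
    while (!done && len < size - 1) {
        errno = 0;
        long word = ptrace(PTRACE_PEEKDATA, pid, (void *)(addr + len), NULL);
        if (word == -1 && errno != 0)
            return -1;
        len += copy_word_bytes(word, buf + len, size - 1 - len, &done);
    }
    buf[len] = '\0';
    return (ssize_t)len;
}
```

On a breakpoint hit, the tracer would call `first_argument` and pass the resulting address to `read_tracee_string`. What comes back is the format string, not the formatted output; for the Hello World program in the question the two are the same.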