How to find illegal instructions in a program?


I have a benchmark that is meant to run on a specific simulator. Some instructions were added to the benchmark to communicate with the simulator (not to perform CPU operations), such as dumping stats or resetting stats.

Now I need to run the same benchmark on another simulator, and there is really no workaround: I have to use the same binaries. Of course it doesn't work; it fails with SIGILL (an Illegal Instruction error).

What I want now is to remove the bad instructions directly from the executable binary (no source code, can't recompile, can't set it up from somewhere else) and replace them with NOPs. So I ran the benchmark in gdb and used the layout asm command to find the addresses of the bad instructions. Here is the output: [screenshot of the gdb layout asm view]

My question might seem a bit stupid, but I opened the binary in a text editor and tried to use the addresses I got from gdb to find the illegal instructions in the binary, with no luck. The size of the binary file is about 1MB, while the addresses start from about 4MB. How can I find the illegal instructions in the binary using the addresses I got from gdb? Here is a snippet from the binary showing its format:

616c 6967 6e00 5f5f 7265 6769 7374 6572
5f66 7261 6d65 5f69 6e66 6f00 5f49 4f5f
7664 7072 696e 7466 005f 5f70 7468 7265
6164 5f73 6574 7370 6563 6966 6963 5f69
6e74 6572 6e61 6c00 7763 7274 6f6d 6200
5f64 6c5f 636f 7272 6563 745f 6361 6368
655f 6964 005f 646c 5f73 6f72 745f 6669
6e69 005f 5f6e 6577 5f66 6f70 656e 0063
6c6f 7365 005f 5f73 7472 6e63 7079 5f73
7365 3200 5f5f 6c69 6263 5f63 6f6e 6e65
6374 005f 5f77 6d65 6d63 7079 005f 494f
5f69 7465 725f 6e65 7874 006d 355f 7061
6e69 6300 5f64 6c5f 636c 6f73 655f 776f
726b 6572 005f 646c 5f70 6167 6573 697a
6500 5f5f 7661 6c6c 6f63 005f 5f6d 656d
616c 6967 6e5f 686f 6f6b 005f 5f70 7468
7265 6164 5f69 6e69 745f 7374 6174 6963
5f74 6c
2

There are 2 answers

1
AudioBubble On BEST ANSWER

Your hex dump is just a bunch of function names so it doesn't tell us much. And you didn't mention an operating system either...
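To see why, here is a minimal Python sketch that decodes the first two lines of the dump in the question. The helper name `strings_in_hexdump` is made up for illustration; it just splits the bytes on NUL terminators.

```python
def strings_in_hexdump(dump):
    """Decode a whitespace-separated hex dump and split it on NUL bytes."""
    raw = bytes.fromhex("".join(dump.split()))
    # Symbol names are stored as NUL-terminated ASCII strings.
    return [chunk.decode("ascii") for chunk in raw.split(b"\x00") if chunk]

# First two lines of the dump posted in the question.
dump = """
616c 6967 6e00 5f5f 7265 6769 7374 6572
5f66 7261 6d65 5f69 6e66 6f00 5f49 4f5f
"""

print(strings_in_hexdump(dump))
# ['align', '__register_frame_info', '_IO_']
```

The first and last entries are fragments of names cut off by the edges of the snippet. Either way this is a table of names, not machine code.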

I'll assume if you can run gdb on it, you can use GNU binutils too.

For a start, try objdump -h myprog. It lists the sections with their sizes, load addresses (VMA), and file offsets. If it tells you that there's a section starting at 401000 with file offset 400 and a size greater than af4, then the runtime location 401af4 is at file offset 401af4-401000+400 = ef4 (all values hexadecimal).
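The same arithmetic as a small Python sketch, in case you have many addresses to translate. The section table is typed in by hand from the objdump -h output; the function name `file_offset` and the section size 0x1000 are assumptions for illustration, not values from your binary.

```python
def file_offset(vaddr, sections):
    """Map a runtime address to a file offset using (name, vma, size, file_off) rows."""
    for name, vma, size, file_off in sections:
        if vma <= vaddr < vma + size:
            # Distance into the section, added to where the section starts in the file.
            return vaddr - vma + file_off
    raise ValueError(f"address {vaddr:#x} is not inside any listed section")

# One row per section, copied from `objdump -h myprog`.
# The size 0x1000 is a placeholder; use the real value from your output.
sections = [(".text", 0x401000, 0x1000, 0x400)]

print(hex(file_offset(0x401AF4, sections)))  # 0xef4
```

An address that falls outside every listed section raises an error instead of returning a wrong offset, which is what you would see for an address inside a shared library.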

If the offending address is in a shared library or if the program has done any remapping of its address space, the task will be harder.

1
old_timer On

You didn't specify the processor or the operating system. It looks like x86, which makes this much harder because it is a variable-length instruction set: after the first problem instruction, a disassembler (if you use one) can get confused, depending on the disassembler.

Try to find out what the specific illegal instructions are that the simulator responds to, and find those bit patterns in the binary. Iterate: disassemble around each match, replace the pattern with NOPs if it is found, and repeat. This is not foolproof, and depending on how the binary was made there may be some attempts to prevent such hacking.

Certainly a non-trivial task...