I am implementing an Android application whose native code reads its own code as loaded in memory. I am not interested in the copy on disk, only in the image that is actually running.
extern char __ehdr_start;
ElfW(Ehdr)* elfheader = (ElfW(Ehdr)*)&__ehdr_start;
android_log("Self_textParser: ELF header [%1$p]", elfheader);
ElfW(Phdr) *programHeader = (ElfW(Phdr)*)((uintptr_t)elfheader + elfheader->e_phoff);
for (ElfW(Phdr)* current = programHeader; current < (programHeader + elfheader->e_phnum); ++current) {
    if (current->p_type != PT_LOAD) {
        android_log("Self_textParser: Discarding no loadable segment: p_offset (%1$p), p_vaddr (%2$p), p_paddr (%3$p), p_filesz (%4$p), p_memsz (%5$p), p_flags (%6$p)",
                    current->p_offset, current->p_vaddr, current->p_paddr, current->p_filesz, current->p_memsz, current->p_flags & 0xF);
        continue;
    }
    android_log("Self_textParser: Found a loadable segment: p_offset (%1$p), p_vaddr (%2$p), p_paddr (%3$p), p_filesz (%4$p), p_memsz (%5$p), p_flags (%6$p)",
                current->p_offset, current->p_vaddr, current->p_paddr, current->p_filesz, current->p_memsz, current->p_flags & 0xF);
    if ((current->p_flags & PF_X) != 0) {
        // Start address of this executable segment in memory
        ElfW(Addr) loadBias = (uintptr_t)elfheader + current->p_vaddr;
        android_log("Self_textParser: loadBias: %1$p, len: %2$p", loadBias, current->p_memsz);
        unsigned char* p = (unsigned char*)loadBias;
        for (int j = 0; j + ALIGNMENT < current->p_memsz; j += ALIGNMENT, p += ALIGNMENT) {
            //android_log("Instruction: %02X%02X%02X%02X",*p,*(p+1),*(p+2),*(p+3));
            if (*(p + 3) == 0x04 && *(p + 2) == 0x03 && *(p + 1) == 0x02 && *p == 0x01) {
                android_log("Instruction found");
            }
        }
    }
}
The problem, as the logs show, is that a global-buffer-overflow is reported long before the scan reaches the end of the segment:
2024-01-30 17:55:53.737 26800-26800 SampleApp com.test.app.testsuite D Self_textParser: ELF header [0x6fad64a000]
2024-01-30 17:55:53.737 26800-26800 SampleApp com.test.app.testsuite D Self_textParser: Discarding no loadable segment: p_offset (0x40), p_vaddr (0x40), p_paddr (0x40), p_filesz (0x1f8), p_memsz (0x1f8), p_flags (0x4)
2024-01-30 17:55:53.737 26800-26800 SampleApp com.test.app.testsuite D Self_textParser: Found a loadable segment: p_offset (0x0), p_vaddr (0x0), p_paddr (0x0), p_filesz (0x25926d0), p_memsz (0x25926d0), p_flags (0x5)
2024-01-30 17:55:53.737 26800-26800 SampleApp com.test.app.testsuite D Self_textParser: loadBias: 0x6fad64a000, len: 0x25926d0
2024-01-30 17:55:53.742 26794-26794 wrap.sh logwrapper I =================================================================
2024-01-30 17:55:53.743 26794-26794 wrap.sh logwrapper I ==26800==ERROR: AddressSanitizer: global-buffer-overflow on address 0x006fad8a2e73 at pc 0x006faf9110e8 bp 0x007fddf81fc0 sp 0x007fddf81fb8
2024-01-30 17:55:53.743 26794-26794 wrap.sh logwrapper I READ of size 1 at 0x006fad8a2e73 thread T0
I expect to scan the whole .text code in memory through an unsigned char* and find the 01020304 instruction without triggering a buffer overflow.
The code reading the "raw" LOAD text segment is incompatible with AddressSanitizer. If you want to run it under AddressSanitizer, you should add __attribute__((no_sanitize_address)) or some such, so that the entire function containing your code is not instrumented (and any errors from it are never detected).
It should be obvious why you can't read the RW LOAD segment -- the data in that segment contains various global buffers, and the AddressSanitizer runtime knows the boundaries of these and has red-zones between them.
But you are not doing that -- you are only reading the contents of the RX segment, so why is that a problem?
It's a problem because linkers often merge the read-only .rodata and .text sections into the RX LOAD segment (you can confirm the section-to-segment mapping using readelf -Wl ...).
So when you have a read-only global such as a 17-byte constant array, that data could well end up in the RX segment you are about to scan, and the AddressSanitizer instrumentation would place red-zones on either side of this 17-byte array. Any attempt to read the 18th byte (wherever it happens in the RX LOAD segment) would trigger global-buffer-overflow, which is exactly what happened with your program.
P.S. Your question talks about analysing the full .text code, but you are not parsing the .text section (if you were, there wouldn't have been a problem). You are parsing a segment that contains the .text section along with other sections.
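To make the no_sanitize_address suggestion concrete, here is a minimal sketch of a scanning routine that is excluded from AddressSanitizer instrumentation. The names NO_ASAN and count_pattern are hypothetical, chosen only for this illustration, and the sketch assumes a GCC or Clang toolchain (as the Android NDK uses). It is not meant as the definitive way to structure the scan.

```c
#include <stddef.h>

/* Assumption: GCC/Clang attribute syntax; expands to nothing elsewhere. */
#if defined(__clang__) || defined(__GNUC__)
#define NO_ASAN __attribute__((no_sanitize_address))
#else
#define NO_ASAN
#endif

/*
 * Hypothetical helper: counts occurrences of the byte sequence
 * 01 02 03 04 at offsets that are multiples of `step` inside
 * [base, base + len). Because the function is not instrumented,
 * reads that cross AddressSanitizer red-zones are not reported,
 * and neither are genuine out-of-bounds reads made here.
 */
NO_ASAN
static size_t count_pattern(const unsigned char *base, size_t len, size_t step)
{
    size_t hits = 0;
    /* Stop while 4 bytes still fit, so p[3] never reads past the range. */
    for (size_t off = 0; off + 4 <= len; off += step) {
        const unsigned char *p = base + off;
        if (p[0] == 0x01 && p[1] == 0x02 && p[2] == 0x03 && p[3] == 0x04) {
            ++hits;
        }
    }
    return hits;
}
```

With the code from the question, the call would presumably look like count_pattern((const unsigned char *)loadBias, current->p_memsz, ALIGNMENT). Keeping the scan in its own small function limits how much code loses sanitizer coverage.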