Why is my Rust executable mapped to such high addresses (near the stack) instead of 0x400000?


I am learning about the Linux user space memory layout on x86_64 systems and wanted to print some addresses from some of the sections. I used this Rust code:

fn main() {
    let x = 3;        // should be stored on stack
    let s = "hello";  // string literal: static read-only data (.rodata rather than .data)

    println!("stack ≈ {:p}", &x);
    println!(".text ≈ {:p}", main as *const ());
    println!(".data ≈ {:p}", s);

    use std::io;

    let mut f = std::fs::File::open("/proc/self/maps").unwrap();
    let out = io::stdout();
    io::copy(&mut f, &mut out.lock()).unwrap();
}

This code also prints the file /proc/self/maps to stdout. I compiled this file mem.rs simply with rustc mem.rs. It printed:

stack ≈ 0x7ffffbf82f2c
.text ≈ 0x7f45b7c0a2b0
.data ≈ 0x7f45b7c4d35b

7f45b6800000-7f45b6c00000 rw-- 00000000 00:00 0
7f45b6de0000-7f45b6f9a000 r-x- 00000000 00:00 664435             /lib/x86_64-linux-gnu/libc-2.19.so
7f45b6f9a000-7f45b6fa2000 ---- 001ba000 00:00 664435             /lib/x86_64-linux-gnu/libc-2.19.so
[ ... more .so files]
7f45b7a22000-7f45b7a23000 r--- 00022000 00:00 663920             /lib/x86_64-linux-gnu/ld-2.19.so
7f45b7a23000-7f45b7a24000 rw-- 00023000 00:00 663920             /lib/x86_64-linux-gnu/ld-2.19.so
7f45b7a24000-7f45b7a25000 rw-- 00000000 00:00 0
7f45b7aa0000-7f45b7aa2000 rw-- 00000000 00:00 0
7f45b7ab0000-7f45b7ab2000 rw-- 00000000 00:00 0
7f45b7ac0000-7f45b7ac1000 rw-- 00000000 00:00 0
7f45b7ad0000-7f45b7ad1000 rw-- 00000000 00:00 0
7f45b7ae0000-7f45b7ae2000 rw-- 00000000 00:00 0
7f45b7c00000-7f45b7c5f000 r-x- 00000000 00:00 1134580            /home/lukas/tmp/mem
7f45b7e5e000-7f45b7e62000 r--- 0005e000 00:00 1134580            /home/lukas/tmp/mem
7f45b7e62000-7f45b7e63000 rw-- 00062000 00:00 1134580            /home/lukas/tmp/mem
7f45b7e63000-7f45b7e64000 rw-- 00000000 00:00 0
7ffffb784000-7ffffb785000 ---- 00000000 00:00 0                  [stack]
7ffffb785000-7ffffbf84000 rw-- 00000000 00:00 0
7ffffc263000-7ffffc264000 r-x- 00000000 00:00 0                  [vdso]

At least the addresses I printed on my own seem to match what maps says. But when I execute cat /proc/self/maps in the terminal, I get this output:

00400000-0040b000 r-x- 00000000 00:00 107117                     /bin/cat
0060a000-0060b000 r--- 0000a000 00:00 107117                     /bin/cat
0060b000-0060c000 rw-- 0000b000 00:00 107117                     /bin/cat
0071c000-0073d000 rw-- 00000000 00:00 0                          [heap]
7f7deb933000-7f7debc30000 r--- 00000000 00:00 758714             /usr/lib/locale/locale-archive
7f7debc30000-7f7debdea000 r-x- 00000000 00:00 664435             /lib/x86_64-linux-gnu/libc-2.19.so
7f7debdea000-7f7debdf2000 ---- 001ba000 00:00 664435             /lib/x86_64-linux-gnu/libc-2.19.so
[ ... more .so files ...]
7f7dec222000-7f7dec223000 r--- 00022000 00:00 663920             /lib/x86_64-linux-gnu/ld-2.19.so
7f7dec223000-7f7dec224000 rw-- 00023000 00:00 663920             /lib/x86_64-linux-gnu/ld-2.19.so
7f7dec224000-7f7dec225000 rw-- 00000000 00:00 0
7f7dec250000-7f7dec252000 rw-- 00000000 00:00 0
7f7dec260000-7f7dec261000 rw-- 00000000 00:00 0
7f7dec270000-7f7dec272000 rw-- 00000000 00:00 0
7ffff09e8000-7ffff11e8000 rw-- 00000000 00:00 0                  [stack]
7ffff1689000-7ffff168a000 r-x- 00000000 00:00 0                  [vdso]

The latter result matches everything I read about this topic: the sections from the executable are mapped in the lower end of the virtual address space (beginning around 0x400000).

I compiled and ran everything in the Windows Subsystem for Linux (basically Ubuntu 14.04). I know it's new and stuff, but I'm fairly sure this is not an issue with the subsystem (please tell me if it is, though!). I'm using Rust 1.14, if that matters (I doubt it).

I also tried the same with a C program (excuse my probably bad C):

#include <stdio.h>
#include <errno.h>
#include <string.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    FILE *test_file;
    char buf[4096];
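    if ((test_file = fopen("/proc/self/maps", "r")) != NULL) {
        /* Loop on fgets itself: testing feof() first repeats the last line. */
        while (fgets(buf, sizeof(buf), test_file) != NULL) {
            /* fputs, unlike puts, does not append an extra newline. */
            fputs(buf, stdout);
        }
        fclose(test_file);
    }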

    return 0;
}

It outputs something similar to cat:

00400000-00401000 r-x- 00000000 00:00 1325490                    /home/lukas/tmp/a.out
00600000-00601000 r--- 00000000 00:00 1325490                    /home/lukas/tmp/a.out
00601000-00602000 rw-- 00001000 00:00 1325490                    /home/lukas/tmp/a.out

Why is the Rust executable mapped to large addresses near the stack?

There is 1 answer

Shepmaster On BEST ANSWER

Using rustc -Z print-link-args addr.rs, you can see what linker invocation the Rust compiler will use. Since the current linker happens to be cc, we can directly reuse these options for the C program. Ignoring unimportant arguments and removing others one-by-one, I was left with this compiler invocation:

gcc -fPIC -pie addr.c -o addr-c

Compiling the C code like this produces addresses similar to those of the Rust-compiled executable, indicating that one or both of those options is the likely culprit. This changes the question to "why do -fPIC and/or -pie map to such high addresses?"

I found another question and answer that seems to shed light on that:

The PIE binary is linked just as a shared library, and so its default load address (the .p_vaddr of the first LOAD segment) is zero. The expectation is that something will relocate this binary away from zero page, and load it at some random address.

Using readelf -e on the Rust executable, we can see that the first LOAD segment does have a virtual address of zero:

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x000000000005e6b4 0x000000000005e6b4  R E    200000
  LOAD           0x000000000005ead0 0x000000000025ead0 0x000000000025ead0
                 0x00000000000039d1 0x00000000000049e8  RW     200000
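
The same property can be checked from inside the program by reading the ELF header of the running executable. This is only a minimal sketch, assuming a Linux system with /proc and a little-endian ELF file (as on x86_64); elf_type and describe are illustrative names, not part of any library. It reads e_type, the 16-bit field at byte offset 16 of the header. A PIE executable should report ET_DYN (3), the same type a shared library has, while a traditional fixed-address executable reports ET_EXEC (2).

```rust
use std::fs::File;
use std::io::Read;

/// Extracts `e_type` from the first bytes of an ELF file.
/// Returns None if the slice is too short or lacks the ELF magic.
/// Assumption: the file is little-endian (true on x86_64).
fn elf_type(header: &[u8]) -> Option<u16> {
    if header.len() < 18 || !header.starts_with(b"\x7fELF") {
        return None;
    }
    // e_type sits right after the 16-byte e_ident array.
    Some(u16::from_le_bytes([header[16], header[17]]))
}

/// Human-readable name for the two ELF types relevant here.
fn describe(e_type: u16) -> &'static str {
    match e_type {
        2 => "ET_EXEC (fixed load address)",
        3 => "ET_DYN (position-independent, like a shared library)",
        _ => "other",
    }
}

fn main() -> std::io::Result<()> {
    // /proc/self/exe refers to the executable of the current process.
    let mut header = [0u8; 18];
    File::open("/proc/self/exe")?.read_exact(&mut header)?;
    match elf_type(&header) {
        Some(t) => println!("e_type = {} -> {}", t, describe(t)),
        None => println!("not an ELF file"),
    }
    Ok(())
}
```

readelf -h on the same file shows this field as "Type:", so the two can be compared directly.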

I guess that this then changes the question to "why are these random addresses chosen", but I'm not sure of that answer. ^_^ A hunch tells me that ASLR comes into play. This other answer seems to bear that out:

PIE is to support ASLR in executable files.

ASLR is a security technique to help harden programs against certain types of attacks, so it makes sense that Rust, with its safety-minded approach, would attempt to enable something like this by default. Indeed, the load address changes on each invocation (the output of several runs is shown together):

root@97bcff9a925c:/# ./addr | grep 'r-xp' | grep 'addr'
5587cea9d000-5587ceafc000 r-xp 00000000 00:21 206                        /addr
561d8aae2000-561d8ab41000 r-xp 00000000 00:21 206                        /addr
555c30ffd000-555c3105c000 r-xp 00000000 00:21 206                        /addr
55db249d5000-55db24a34000 r-xp 00000000 00:21 206                        /addr
55e988572000-55e9885d1000 r-xp 00000000 00:21 206                        /addr
560400e1b000-560400e7a000 r-xp 00000000 00:21 206                        /addr
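
To watch this without grep, the program can look up its own load address. The following is a rough sketch, assuming Linux with /proc available; parse_range and mapping_base are hypothetical helper names introduced for illustration. It finds the /proc/self/maps entry that contains the address of main and prints where that mapping starts. When the binary is built as a PIE and ASLR is enabled, the printed base is expected to differ between runs.

```rust
use std::fs;

/// Parses the "start-end" hex range at the beginning of a maps line.
fn parse_range(line: &str) -> Option<(usize, usize)> {
    let range = line.split_whitespace().next()?;
    let (start, end) = range.split_once('-')?;
    Some((
        usize::from_str_radix(start, 16).ok()?,
        usize::from_str_radix(end, 16).ok()?,
    ))
}

/// Returns the start address of the mapping that contains `addr`, if any.
fn mapping_base(maps: &str, addr: usize) -> Option<usize> {
    maps.lines()
        .filter_map(parse_range)
        .find(|&(start, end)| start <= addr && addr < end)
        .map(|(start, _)| start)
}

fn main() {
    // Linux-only: /proc/self/maps lists the mappings of this process.
    let maps = fs::read_to_string("/proc/self/maps")
        .expect("could not read /proc/self/maps");
    let addr = main as *const () as usize;
    match mapping_base(&maps, addr) {
        Some(base) => println!("main at {:#x}, mapping starts at {:#x}", addr, base),
        None => println!("main at {:#x}, no containing mapping found", addr),
    }
}
```

Running the resulting binary a few times in a row gives the same kind of comparison as the grep output above.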