I am learning about the Linux user space memory layout on x86_64 systems and wanted to print some addresses from some of the sections. I used this Rust code:
fn main() {
    let x = 3; // should be stored on the stack
    let s = "hello"; // string literal: lives in read-only data (.rodata); labeled ".data" below
    println!("stack ≈ {:p}", &x);
    println!(".text ≈ {:p}", main as *const ());
    println!(".data ≈ {:p}", s);

    // Dump this process's memory map to stdout.
    use std::io;
    let mut f = std::fs::File::open("/proc/self/maps").unwrap();
    let out = io::stdout();
    io::copy(&mut f, &mut out.lock()).unwrap();
}
This code also prints the file /proc/self/maps to stdout. I compiled this file, mem.rs, simply with rustc mem.rs. It printed:
stack ≈ 0x7ffffbf82f2c
.text ≈ 0x7f45b7c0a2b0
.data ≈ 0x7f45b7c4d35b
7f45b6800000-7f45b6c00000 rw-- 00000000 00:00 0
7f45b6de0000-7f45b6f9a000 r-x- 00000000 00:00 664435 /lib/x86_64-linux-gnu/libc-2.19.so
7f45b6f9a000-7f45b6fa2000 ---- 001ba000 00:00 664435 /lib/x86_64-linux-gnu/libc-2.19.so
[ ... more .so files]
7f45b7a22000-7f45b7a23000 r--- 00022000 00:00 663920 /lib/x86_64-linux-gnu/ld-2.19.so
7f45b7a23000-7f45b7a24000 rw-- 00023000 00:00 663920 /lib/x86_64-linux-gnu/ld-2.19.so
7f45b7a24000-7f45b7a25000 rw-- 00000000 00:00 0
7f45b7aa0000-7f45b7aa2000 rw-- 00000000 00:00 0
7f45b7ab0000-7f45b7ab2000 rw-- 00000000 00:00 0
7f45b7ac0000-7f45b7ac1000 rw-- 00000000 00:00 0
7f45b7ad0000-7f45b7ad1000 rw-- 00000000 00:00 0
7f45b7ae0000-7f45b7ae2000 rw-- 00000000 00:00 0
7f45b7c00000-7f45b7c5f000 r-x- 00000000 00:00 1134580 /home/lukas/tmp/mem
7f45b7e5e000-7f45b7e62000 r--- 0005e000 00:00 1134580 /home/lukas/tmp/mem
7f45b7e62000-7f45b7e63000 rw-- 00062000 00:00 1134580 /home/lukas/tmp/mem
7f45b7e63000-7f45b7e64000 rw-- 00000000 00:00 0
7ffffb784000-7ffffb785000 ---- 00000000 00:00 0 [stack]
7ffffb785000-7ffffbf84000 rw-- 00000000 00:00 0
7ffffc263000-7ffffc264000 r-x- 00000000 00:00 0 [vdso]
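To check in code rather than by eye which mapping an address falls into, the maps lines can be parsed and searched. This is only a minimal sketch: `Mapping`, `parse_maps_line`, and `find_mapping` are hypothetical names introduced for illustration, and it assumes the usual whitespace-separated layout of /proc/self/maps shown above.

```rust
/// One line of /proc/<pid>/maps, reduced to the fields needed here.
/// (Hypothetical helper type, not part of the program above.)
#[derive(Debug, PartialEq)]
struct Mapping {
    start: u64,
    end: u64, // exclusive upper bound
    perms: String,
    path: String, // empty for anonymous mappings
}

/// Parse a line such as
/// "7f45b7c00000-7f45b7c5f000 r-x- 00000000 00:00 1134580 /home/lukas/tmp/mem".
/// Returns None for lines that do not look like a mapping.
fn parse_maps_line(line: &str) -> Option<Mapping> {
    let mut fields = line.split_whitespace();
    let range = fields.next()?;
    let perms = fields.next()?.to_string();
    // Skip offset, device and inode; the rest is a path or a tag like [stack].
    let path = fields.skip(3).collect::<Vec<_>>().join(" ");
    let (start, end) = range.split_once('-')?;
    Some(Mapping {
        start: u64::from_str_radix(start, 16).ok()?,
        end: u64::from_str_radix(end, 16).ok()?,
        perms,
        path,
    })
}

/// Return the first mapping whose [start, end) range contains `addr`.
fn find_mapping(maps: &str, addr: u64) -> Option<Mapping> {
    maps.lines()
        .filter_map(parse_maps_line)
        .find(|m| m.start <= addr && addr < m.end)
}

fn main() {
    let maps = std::fs::read_to_string("/proc/self/maps").unwrap_or_default();
    let x = 3;
    let s = "hello";
    let probes = [
        ("local variable", &x as *const i32 as usize as u64),
        ("main", main as *const () as usize as u64),
        ("string literal", s.as_ptr() as usize as u64),
    ];
    for &(name, addr) in probes.iter() {
        match find_mapping(&maps, addr) {
            Some(m) => println!("{} at {:#x}: {} {}", name, addr, m.perms, m.path),
            None => println!("{} at {:#x}: no mapping found", name, addr),
        }
    }
}
```

Applied to the output above, the .text and .data addresses both fall into the r-x- mapping of /home/lukas/tmp/mem, while the stack address falls into the unlabeled rw-- mapping directly after the line tagged [stack].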
At least the addresses I printed on my own seem to match what maps says. But when I execute cat /proc/self/maps in the terminal, I get this output:
00400000-0040b000 r-x- 00000000 00:00 107117 /bin/cat
0060a000-0060b000 r--- 0000a000 00:00 107117 /bin/cat
0060b000-0060c000 rw-- 0000b000 00:00 107117 /bin/cat
0071c000-0073d000 rw-- 00000000 00:00 0 [heap]
7f7deb933000-7f7debc30000 r--- 00000000 00:00 758714 /usr/lib/locale/locale-archive
7f7debc30000-7f7debdea000 r-x- 00000000 00:00 664435 /lib/x86_64-linux-gnu/libc-2.19.so
7f7debdea000-7f7debdf2000 ---- 001ba000 00:00 664435 /lib/x86_64-linux-gnu/libc-2.19.so
[ ... more .so files ...]
7f7dec222000-7f7dec223000 r--- 00022000 00:00 663920 /lib/x86_64-linux-gnu/ld-2.19.so
7f7dec223000-7f7dec224000 rw-- 00023000 00:00 663920 /lib/x86_64-linux-gnu/ld-2.19.so
7f7dec224000-7f7dec225000 rw-- 00000000 00:00 0
7f7dec250000-7f7dec252000 rw-- 00000000 00:00 0
7f7dec260000-7f7dec261000 rw-- 00000000 00:00 0
7f7dec270000-7f7dec272000 rw-- 00000000 00:00 0
7ffff09e8000-7ffff11e8000 rw-- 00000000 00:00 0 [stack]
7ffff1689000-7ffff168a000 r-x- 00000000 00:00 0 [vdso]
The latter result matches everything I read about this topic: the sections from the executable are mapped in the lower end of the virtual address space (beginning around 0x400000).
I compiled and executed everything in the Windows Subsystem for Linux (basically Ubuntu 14.04). I know it's new, but I'm fairly sure this is not an issue with the subsystem (please tell me if it is, though!). I'm using Rust 1.14, if that matters (I doubt it).
I also tried the same with a C program (excuse my probably bad C):
#include <stdio.h>

int main(void) {
    FILE *test_file;
    char buf[4096];

    if ((test_file = fopen("/proc/self/maps", "r")) == NULL) {
        perror("fopen");
        return 1;
    }
    /* Loop on fgets itself: testing feof() first would print the last line twice. */
    while (fgets(buf, sizeof(buf), test_file) != NULL) {
        fputs(buf, stdout); /* unlike puts, fputs does not append an extra newline */
    }
    fclose(test_file);
    return 0;
}
It outputs something similar to cat:
00400000-00401000 r-x- 00000000 00:00 1325490 /home/lukas/tmp/a.out
00600000-00601000 r--- 00000000 00:00 1325490 /home/lukas/tmp/a.out
00601000-00602000 rw-- 00001000 00:00 1325490 /home/lukas/tmp/a.out
Why is the Rust executable mapped to large addresses near the stack?
Using rustc -Z print-link-args addr.rs, you can see what linker invocation the Rust compiler will use. Since the current linker happens to be cc, we can directly reuse these options for the C program. Ignoring unimportant arguments and removing others one by one, I was left with a compiler invocation whose relevant options were -fPIC and -pie.
Compiling the C code like this produces addresses similar to those of the Rust-compiled executable, indicating that one or both of those options is a likely culprit. This changes the question to "why do -fPIC and/or -pie map to such high addresses?"
I found another question and answer that seems to shed light on that:
Using readelf -e on the Rust executable, we can see that the first LOAD segment does have a virtual address of zero:
I guess that this then changes the question to "why are these random addresses chosen", but I'm not sure of that answer. ^_^ A hunch tells me that ASLR comes into play. This other answer seems to bear that out:
ASLR is a security technique to help harden programs against certain types of attacks, so it makes sense that Rust, with its safety-minded approach, would attempt to enable something like this by default. Indeed, the addresses change slightly on each invocation.
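The readelf check mentioned earlier can also be approximated in code by reading the ELF header of the running executable. This is a rough sketch that assumes a little-endian ELF64 file. `ElfSummary` and `summarize_elf64` are hypothetical names, and the field offsets follow the standard ELF64 layout. Note that ET_DYN is also the type of ordinary shared libraries, so it indicates position-independent code, not specifically a PIE executable.

```rust
use std::fs;

const ET_EXEC: u16 = 2; // executable linked for a fixed address
const ET_DYN: u16 = 3; // shared object or position-independent executable
const PT_LOAD: u32 = 1; // loadable program segment

/// Hypothetical summary type: the ELF file type and the virtual address
/// of the first PT_LOAD segment, if there is one.
#[derive(Debug, PartialEq)]
struct ElfSummary {
    e_type: u16,
    first_load_vaddr: Option<u64>,
}

// Bounds-checked little-endian readers.
fn read_u16(b: &[u8], off: usize) -> Option<u16> {
    let s = b.get(off..off.checked_add(2)?)?;
    Some(u16::from_le_bytes([s[0], s[1]]))
}

fn read_u32(b: &[u8], off: usize) -> Option<u32> {
    let s = b.get(off..off.checked_add(4)?)?;
    Some(u32::from_le_bytes([s[0], s[1], s[2], s[3]]))
}

fn read_u64(b: &[u8], off: usize) -> Option<u64> {
    let s = b.get(off..off.checked_add(8)?)?;
    let mut a = [0u8; 8];
    a.copy_from_slice(s);
    Some(u64::from_le_bytes(a))
}

/// Parse just enough of an ELF64 little-endian file to report its type
/// and the p_vaddr of its first PT_LOAD program header.
fn summarize_elf64(bytes: &[u8]) -> Option<ElfSummary> {
    // Magic, then EI_CLASS == 2 (64-bit) and EI_DATA == 1 (little-endian).
    if !bytes.starts_with(b"\x7fELF") || *bytes.get(4)? != 2 || *bytes.get(5)? != 1 {
        return None;
    }
    let e_type = read_u16(bytes, 16)?;
    let phoff = read_u64(bytes, 32)? as usize; // e_phoff
    let phentsize = read_u16(bytes, 54)? as usize; // e_phentsize
    let phnum = read_u16(bytes, 56)? as usize; // e_phnum

    let mut first_load_vaddr = None;
    for i in 0..phnum {
        let ph = phoff.checked_add(i.checked_mul(phentsize)?)?;
        if read_u32(bytes, ph)? == PT_LOAD {
            // p_vaddr sits 16 bytes into a 64-bit program header.
            first_load_vaddr = Some(read_u64(bytes, ph.checked_add(16)?)?);
            break;
        }
    }
    Some(ElfSummary { e_type, first_load_vaddr })
}

fn main() {
    let bytes = fs::read("/proc/self/exe").expect("could not read /proc/self/exe");
    match summarize_elf64(&bytes) {
        Some(s) => {
            let kind = match s.e_type {
                ET_EXEC => "ET_EXEC (fixed load address)",
                ET_DYN => "ET_DYN (position-independent)",
                _ => "other",
            };
            println!("type: {}", kind);
            match s.first_load_vaddr {
                Some(v) => println!("first LOAD vaddr: {:#x}", v),
                None => println!("no LOAD segment found"),
            }
        }
        None => println!("not a little-endian ELF64 file"),
    }
}
```

A first LOAD address of zero together with ET_DYN means the file itself carries no fixed load address, so the actual placement is decided when the program is loaded. That is consistent with the addresses differing between runs.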