I have the following simple program, which basically just mmaps a file and sums every byte in it:
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>

/* Volatile sink so the compiler cannot optimize the summing loop away. */
volatile uint64_t sink;

int main(int argc, char** argv) {
    if (argc < 3) {
        puts("Usage: mmap_test FILE populate|nopopulate");
        return EXIT_FAILURE;
    }

    const char *filename = argv[1];
    int populate = !strcmp(argv[2], "populate");

    uint8_t *memblock;
    int fd;
    struct stat sb;

    fd = open(filename, O_RDONLY);
    if (fd == -1) {
        perror("open failed");
        return EXIT_FAILURE;
    }
    if (fstat(fd, &sb) == -1) {
        perror("fstat failed");
        return EXIT_FAILURE;
    }
    uint64_t size = sb.st_size;

    memblock = mmap(NULL, size, PROT_READ, MAP_SHARED | (populate ? MAP_POPULATE : 0), fd, 0);
    if (memblock == MAP_FAILED) {
        perror("mmap failed");
        return EXIT_FAILURE;
    }

    //printf("Opened %s of size %lu bytes\n", filename, size);

    /* Touch every byte of the mapping. */
    uint64_t i;
    uint8_t result = 0;
    for (i = 0; i < size; i++) {
        result += memblock[i];
    }
    sink = result;

    puts("Press enter to exit...");
    getchar();

    return EXIT_SUCCESS;
}
I compile it like this:
gcc -O2 -std=gnu99 mmap_test.c -o mmap_test
You pass it a file name and either populate or nopopulate [1], which controls whether MAP_POPULATE is passed to mmap or not. It waits for you to press enter before exiting (giving you time to check out stuff in /proc/<pid> or whatever).
I use a 1GB test file of random data, but you can really use anything:
dd bs=1MB count=1000 if=/dev/urandom of=/dev/shm/rand1g
When MAP_POPULATE is used, I expect zero major faults and a small number of page faults for a file in the page cache. With perf stat I get the expected result:
perf stat -e major-faults,minor-faults ./mmap_test /dev/shm/rand1g populate
Press enter to exit...
Performance counter stats for './mmap_test /dev/shm/rand1g populate':
0 major-faults
45 minor-faults
1.323418217 seconds time elapsed
The 45 faults just come from the runtime and process overhead (and don't depend on the size of the file mapped).
However, /usr/bin/time reports ~15,300 minor faults:
/usr/bin/time ./mmap_test /dev/shm/rand1g populate
Press enter to exit...
0.05user 0.05system 0:00.54elapsed 20%CPU (0avgtext+0avgdata 977744maxresident)k
0inputs+0outputs (0major+15318minor)pagefaults 0swaps
The same ~15,300 minor faults are reported by top and show up in /proc/<pid>/stat.
Now if you don't use MAP_POPULATE, all the methods, including perf stat, agree there are ~15,300 page faults. For what it's worth, this number comes from 1,000,000,000 / 4096 / 16 ≈ 15,259 - that is, 1GB divided into 4K pages, with an additional factor of 16 reduction coming from a kernel feature ("faultaround") which faults in up to 15 nearby pages that are already present in the page cache when a fault is taken.
Who is right here? Based on the documented behavior of MAP_POPULATE, the figure returned by perf stat is the correct one - the single mmap call has already populated the page tables for the entire mapping, so there should be no more minor faults when touching it.
[1] Actually, any string other than populate works as nopopulate.