How to resolve Segmentation Fault in RISC-V Program

// factorial.c

#include <stdio.h>

// Function to calculate factorial
unsigned long long factorial(unsigned int n) {
    if (n == 0)
        return 1;
    else
        return n * factorial(n - 1);
}

int main() {
    // Calculate factorial of 10
    unsigned long long result = factorial(10);

    // Print the result
    printf("Factorial of 10 = %llu\n", result);

    return 0;
}

I compiled the program with the RISC-V GCC compiler using optimization and profiling flags (-O2 -pg) and ran it on the Spike RISC-V simulator. It hit a segmentation fault during execution:

vindhya@latitude-7390:~$ nano factorial.c
vindhya@latitude-7390:~$ /home/latitude-7390/rvv64/bin/riscv64-unknown-elf-gcc -O2 -pg factorial.c -o factorial
vindhya@latitude-7390:~$ /home/vindhya/riscv-vector/riscv-isa-sim/build/spike /home/latitude-7390/rvv64/bin/riscv64-unknown-elf/bin/pk ./factorial
z  0000000000000000 ra 00000000000101d4 sp 0000003fff7ffff0 gp 000000000001b018
tp 0000000000000000 t0 0000000000010818 t1 000000000000000f t2 0000000000000000
s0 0000000000000000 s1 0000000000000000 a0 00000000000101d4 a1 0000003ffffffb58
a2 0000000000000000 a3 0000000000000010 a4 000000000001a680 a5 0000000000000000
a6 000000000000001f a7 0000000000000000 s2 0000000000000000 s3 0000000000000000
s4 0000000000000000 s5 0000000000000000 s6 0000000000000000 s7 0000000000000000
s8 0000000000000000 s9 0000000000000000 sA 0000000000000000 sB 0000000000000000
t3 0000000000000000 t4 0000000000000000 t5 0000000000000000 t6 0000000000000000
pc 00000000000101cc va/inst 0000003fff7ffff8 sr 8000000200006020
User store segfault @ 0x0000003fff7ffff8

There is 1 answer


There could be a number of problems at play in your setup/intent:

  1. Spike (through pk) only runs statically linked application binaries, but you aren't passing GCC the -static option. The command should look like this:

    $ riscv64-unknown-elf-gcc -O2 -pg -static -o factorial factorial.c
    

    This is probably why your code is segfaulting: it is wandering into some unsupported dynamic link-load path when running on spike.

    BTW, trying to run a program that wasn't statically linked on a recent build of spike and pk produces:

    ~/Work/repos/spike/test
    ❯ spike pk factorial
    not a statically linked ELF program
    

    Searching the riscv-pk git history for that message turns up:

    ~/Work/repos/spike/repos/riscv-pk master
    ❯ git log -S 'not a statically linked ELF program'   
    commit 099c99482f7ac032bf04caad13a9ca1da7ce58ed
    Date:   Tue Oct 1 12:13:28 2019 +0100
    
    Only accept statically linked binaries (#176)
    

    That commit dates from October 2019. Since you did not get that message, I assume you are running a fairly old build of the simulator and pk that predates it. It would be better to run the latest versions.

  2. It is odd that your toolchain is permitting your program to link at all.

    The thing is, RISC-V toolchains for bare-metal development, such as the riscv64-unknown-elf toolchain you appear to be using, are built to link against 'simpler' C libraries like newlib or picolibc.

    GCC's '-pg' profiling option generates code that references the '_mcount' symbol. That symbol is not supplied by newlib or picolibc for RISC-V. It is provided by glibc for RISC-V, but glibc is for Linux user-space programs, not bare-metal ones.

    So for example, on my setup, trying to compile your program gives me:

    ❯ riscv64-unknown-elf-gcc -O2 -pg -static -o factorial factorial.c
    ~/Work/repos/spike/toolchain/riscv64-unknown-elf/lib/gcc/riscv64-unknown-elf/13.2.0/../../../../riscv64-unknown-elf/bin/ld: /tmp/ccbz7QsX.o: in function `factorial':
    factorial.c:(.text+0xa): undefined reference to `_mcount'
    ~/Work/repos/spike/toolchain/riscv64-unknown-elf/lib/gcc/riscv64-unknown-elf/13.2.0/../../../../riscv64-unknown-elf/bin/ld: /tmp/ccbz7QsX.o: in function `main':
    factorial.c:(.text.startup+0x6): undefined reference to `_mcount'
    collect2: error: ld returned 1 exit status
    

    However, the link succeeds and the program runs correctly if I use a glibc-based toolchain, the kind typically used to build Linux user-space programs:

    ~/Work/repos/spike/test
    ❯ riscv64-linux-gnu-gcc -O2 -pg -static -o factorial factorial.c
    
    ~/Work/repos/spike/test
    ❯ spike pk factorial
    Factorial of 10 = 3628800
    

    I can only conclude therefore that your toolchain has either not been built correctly, or is outdated and perhaps buggy.

  3. Given the above, the stack pointer value in sp and the related va/inst reference value appear odd. For this program to work with -pg you would need a Linux RISC-V toolchain and factorial.c to be built with -static.

    If you did that, spike would end up using a stack with a virtual address somewhere in the region of 0xffffffc000xxxxxx.

  4. Ultimately, profiling with GCC's '-pg' isn't intended for bare-metal environments, nor for functional ISA simulators like spike, which do not model micro-architectural behaviour. Perhaps you intend to build and run programs under Linux on some cycle-accurate target platform?
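
If all you are after from '-pg' is how often factorial gets called, a hand-rolled counter sidesteps '_mcount' entirely and so does not care which C library the toolchain links against. This is only a rough sketch: the names factorial_counted and count_factorial_calls are made up for illustration, and it gives call counts only, not the timing data gprof would report.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hand-rolled counter. It stands in for the call counts that
 * -pg/gprof would otherwise collect through _mcount. */
static unsigned long factorial_calls;

/* Same recursion as factorial() in the question, plus one tick per entry. */
unsigned long long factorial_counted(unsigned int n)
{
    factorial_calls++;              /* counts every entry, recursive ones too */
    if (n == 0)
        return 1;
    return n * factorial_counted(n - 1);
}

/* Resets the counter, computes n!, stores it in *result, and returns the
 * number of calls that were needed. */
unsigned long count_factorial_calls(unsigned int n, unsigned long long *result)
{
    assert(result != NULL);
    factorial_calls = 0;
    *result = factorial_counted(n);
    return factorial_calls;
}
```

Calling count_factorial_calls(10, &result) from main and printing both values with printf would show the result alongside the number of calls it took.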

Anyway, my suggestion would be to:

  1. Start from a recent upstream build of a RISC-V Linux toolchain, perhaps built with the excellent crosstool-ng framework.
  2. Pair it with a recent upstream build of spike and pk.
  3. Take care to build your program statically.
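
To make step 3 checkable: a dynamically linked ELF normally carries a PT_INTERP program header naming its loader, and a fully static one does not. Here is a minimal sketch in C of that test. It assumes a Linux host with <elf.h>, a little-endian host (so header fields can be read without byte swapping), and a 64-bit ELF. The function name elf64_requests_interpreter is my own invention, and I am not claiming this is exactly what pk does internally.

```c
#include <assert.h>
#include <elf.h>      /* Elf64_Ehdr, Elf64_Phdr, PT_INTERP (Linux hosts) */
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of spike or pk.
 * Returns 1 if the 64-bit ELF image in buf has a PT_INTERP segment
 * (it asks for a dynamic loader), 0 if it has none, and -1 if the buffer
 * does not look like a well-formed 64-bit ELF image.
 * Assumes host byte order matches the ELF file (little-endian). */
int elf64_requests_interpreter(const unsigned char *buf, size_t len)
{
    Elf64_Ehdr eh;

    assert(buf != NULL);
    if (len < sizeof eh)
        return -1;
    memcpy(&eh, buf, sizeof eh);        /* copy out to avoid unaligned reads */

    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64)
        return -1;
    if (eh.e_phnum > 0 && eh.e_phentsize != sizeof(Elf64_Phdr))
        return -1;

    /* Walk the program header table looking for PT_INTERP. */
    for (size_t i = 0; i < eh.e_phnum; i++) {
        Elf64_Phdr ph;
        size_t off = (size_t)eh.e_phoff + i * sizeof ph;

        if (off > len || len - off < sizeof ph)
            return -1;                  /* program header table is truncated */
        memcpy(&ph, buf + off, sizeof ph);
        if (ph.p_type == PT_INTERP)
            return 1;
    }
    return 0;
}
```

Read the compiled factorial binary into a buffer and pass it in; a return value of 1 would mean the binary asks for a dynamic loader. Running file or readelf -l on the binary tells you the same thing without writing any code.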

All the best.