write a byte at address in .data segment in RISC-V assembly

I am writing a RISC-V assembly program that needs to store a word (held in a register) into the .data segment:

.section .rodata
msg:
    .string "Hello World\n"

.section .data
num:
    .word 97 

.section .text
.global _start

_start:

    li a1, 100
    sw a1, num

    loop:
        j loop

But when the assembler reaches sw a1, num I get the error "illegal operands `sw a1,num'". How can I store data into a memory location inside the .data segment? Could you give me some hints?

There are 2 answers

Palmer Dabbelt On

As a general rule of thumb, the assembly that you feed to the assembler is its own programming language that just happens to look quite a bit like what's in the RISC-V ISA manual. RISC-V is a pretty simple ISA, so for most instructions there's actually no difference between the syntax of the ISA manual and the syntax accepted by the assembler. Where this starts to break down is when you reference symbols: you can write the immediate directly in your assembly code, but you probably want the linker to fill it in, because you won't know the actual symbol address until link time (and it's likely to change as you modify your program).

For the linker to fill in symbol addresses in your code, the assembler has to emit relocations that the linker resolves later. I have a whole blog post on how this works at SiFive's blog, but we just refreshed the website and I can't figure out how to find it :).

In this case you're essentially trying to write assembly that implements the following C code

int num = 97;
void func(int val) { num = val; }

You've got all the data stuff correct in your original question, so the only thing to worry about here is how to emit the correct instructions. You have a few options as to how to emit these. One option is to explicitly write each instruction and relocation into your source code, which would look like this

func:
    lui t0, %hi(num)        # t0 = upper 20 bits of num's address
    sw  a0, %lo(num)(t0)    # store val (a0) at t0 + low 12 bits of the address
    ret

You can generate this assembly from my C code above by compiling with -mcmodel=medlow -mexplicit-relocs -O3. The GCC manual defines the other RISC-V backend specific options that control code generation.

If you're interested in more details, we have an assembly programmer's manual available on GitHub: https://github.com/riscv/riscv-asm-manual/blob/master/riscv-asm.md . It's far from complete, but we'd love to have help either pointing out issues or providing more content.

maxschlepzig On

The syntax of the store-word (sw) instruction is

sw rs2, offset(rs1)

where offset is a signed 12-bit immediate operand and rs1 is the register holding the base address.

When you write sw a1, num the operands don't match this form, so the assembler fails with:

foo.s: Assembler messages:
foo.s:13: Error: illegal operands `sw a1,num'

Perhaps the simplest way to solve this is to use the load-address (la) pseudo-instruction:

li a1, 100
la t0, num
sw a1, 0(t0)

Since the la pseudo-instruction loads the complete address into the register, we have to use 0 as the offset.

The la pseudo-instruction expands to program-counter (PC) relative addressing, which you can check with objdump:

00000000000100b0 <_start>:
   100b0:   06400593            addi    a1,zero,100
   100b4:   00001297            auipc   t0,0x1
   100b8:   01028293            addi    t0,t0,16 # 110c4 <__DATA_BEGIN__>
   100bc:   00b2a023            sw  a1,0(t0)
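
The address arithmetic in that disassembly can be checked by hand. Below is a minimal Python sketch of it, not part of the original answer: the helper names sign_extend and auipc_addi_target are made up for illustration, and the 64-bit wraparound is an assumption based on the 16-digit addresses in the listing.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def auipc_addi_target(pc: int, imm20: int, imm12: int) -> int:
    """Register value after `auipc rd, imm20` at address pc, then `addi rd, rd, imm12`.

    Hypothetical helper; assumes a 64-bit register (RV64).
    """
    upper = sign_extend(imm20 << 12, 32)      # auipc: 20-bit immediate shifted left by 12
    lower = sign_extend(imm12, 12)            # addi: signed 12-bit immediate
    return (pc + upper + lower) & 0xFFFFFFFFFFFFFFFF


# Values taken from the objdump listing: auipc sits at 0x100b4.
print(hex(auipc_addi_target(0x100b4, 0x1, 16)))  # 0x110c4
```

The result matches the address objdump prints in the comment next to the addi instruction.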

Alternatively, you can use absolute addressing:

li a1, 100
lui t0, %hi(num)
sw a1, %lo(num)(t0)

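Note that the %hi() and %lo() assembler macros split a 32-bit address into its high 20 bits and low 12 bits, such that (%hi(num) << 12) + sign_ext(%lo(num)) = num. Because the low part is sign-extended when it is added, %hi() is rounded up by one whenever bit 11 of the address is set.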
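
To make the split concrete, here is a small Python sketch of the arithmetic. It is an illustration under assumptions rather than the assembler's actual implementation: hi20, lo12 and rebuild are hypothetical names, and addresses are assumed to be 32 bits wide.

```python
def lo12(addr: int) -> int:
    """Low 12 bits of addr as a signed value, the way sw sign-extends its offset."""
    low = addr & 0xFFF
    return low - 0x1000 if low & 0x800 else low


def hi20(addr: int) -> int:
    """Upper 20 bits of addr, with 0x800 added first so that a negative
    low part is compensated by a high part that is one larger."""
    return ((addr + 0x800) >> 12) & 0xFFFFF


def rebuild(addr: int) -> int:
    """What `lui t0, hi` followed by `sw ..., lo(t0)` computes as the address
    (modeled with 32-bit wraparound)."""
    return ((hi20(addr) << 12) + lo12(addr)) & 0xFFFFFFFF


# 0x110c4 is the address of num in the objdump listing above;
# the other two have bit 11 set, so the low part is negative.
for addr in (0x110c4, 0x12345FFF, 0xFFFFF800):
    assert rebuild(addr) == addr
    print(hex(addr), hex(hi20(addr)), lo12(addr))
```

For 0x12345FFF the low part is -1, so the high part becomes 0x12346 rather than 0x12345.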