Definition question in Assembly (RiscV), help me understand


I'm learning RISC-V, and my lecturer defined a new instruction called Load Upper Immediate (lui) like this:

lui rt, imm

loads the lower halfword of the immediate imm into the upper halfword of register rt. The lower bits of the register are set to 0.

with the following image:

[image: diagram of the 32-bit lui instruction word with the fields 0xf, 0, rt and imm]

But there are a few things I can't understand:

  1. "loads the lower halfword of the immediate": what does this mean? I think that if imm has 32 bits, then it loads the first 16. Am I right?

  2. Is the image correct at all? Shouldn't the first half be all zeroes, as the definition mentions? Why do we have those 0xf, 0, rt fields, and where did rt come from?

  3. Given the following command: lui S0, 0x1234
    What will it do? I don't know the value of location 1234 in memory...

3

There are 3 answers

0
old_timer On

Before even attempting to read or write assembly language, you need to get the documentation for the instruction set you are using. Is this a MIPS question (the image you posted) or a RISC-V question, as the title and text say?

Assuming RISC-V, go to risc-v.org and follow the links to the documentation; they have made it extremely easy to find.

LUI in RISC-V is defined as follows:

33222222222211111111 11
10987654321098765432 10987 6543210
     imm[31:12]        rd   opcode

bits 31:12 of the instruction are the immediate
bits 11:7  of the instruction are the destination register
bits 6:0   of the instruction are the opcode

Obviously every instruction needs some bits for the processor to decode so that it knows which instruction it is. Only one opcode can be the all-zeros pattern, so most opcodes contain non-zero bits. Likewise, the destination register, where the value gets stored, has to be encoded in the instruction as well.

LUI (load upper immediate) is used to build 32-bit constants and uses the U-type format. LUI places the U-immediate value in the top 20 bits of the destination register rd, filling in the lowest 12 bits with zeros.

Painfully obvious how this instruction works.

I will admit the RISC-V documentation could have made the opcode easier to find: it is 0b0110111, or 0x37.

Not sure what the confusion is about how humans read numbers

0b0110111 = 0x37 = 067 (octal) = 55 (decimal)

these all describe the same value, which all describe the same bit pattern in the instruction.
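
As a quick sanity check (a Python sketch, not something from the RISC-V documentation), the four spellings can be compared directly:

```python
# All four literals denote the same integer, hence the same 7-bit opcode field.
opcode = 0b0110111
same = (opcode == 0x37 == 0o67 == 55)

print(same)                   # True
print(format(opcode, "07b"))  # 0110111
```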

So they could have just put that in the instruction definition, like everyone else does:

33222222222211111111 11
10987654321098765432 10987 6543210
     imm[31:12]        rd  0110111

So knowing that we can for example construct

.word 0x12345137

assemble then disassemble

Disassembly of section .text:

00000000 <.text>:
   0:   12345137            lui x2,0x12345

Okay, so let's try that forward:

.word 0x12345137
lui x2,0x12345

assemble and disassemble

Disassembly of section .text:

00000000 <.text>:
   0:   12345137            lui x2,0x12345
   4:   12345137            lui x2,0x12345

So there we go, instruction encoding solved.
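
The same packing can be reproduced by hand. This is only a sketch in Python; encode_lui and decode_u_type are made-up helper names for illustration, not part of any assembler or toolchain:

```python
LUI_OPCODE = 0b0110111  # opcode field, bits 6:0

def encode_lui(rd, imm20):
    """Pack a RISC-V U-type LUI word: imm[31:12] | rd | opcode."""
    assert 0 <= rd < 32 and 0 <= imm20 < (1 << 20)
    return (imm20 << 12) | (rd << 7) | LUI_OPCODE

def decode_u_type(word):
    """Split a 32-bit instruction word into (imm20, rd, opcode)."""
    return (word >> 12) & 0xFFFFF, (word >> 7) & 0x1F, word & 0x7F

print(hex(encode_lui(2, 0x12345)))                  # 0x12345137
print([hex(f) for f in decode_u_type(0x12345137)])  # ['0x12345', '0x2', '0x37']
```

The result matches the .word value and the disassembly above: immediate 0x12345, register x2, opcode 0x37.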

LUI (load upper immediate) is used to build 32-bit constants and uses the U-type format. LUI places the U-immediate value in the top 20 bits of the destination register rd, filling in the lowest 12 bits with zeros.

So this is quite clear: the 32-bit constant, in this case

0x12345000

gets stored in register x2.
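
A minimal model of that operation in Python, assuming a 32-bit register (RV32); lui_result is a hypothetical name used only for this sketch. On RV64 the 32-bit result is additionally sign-extended to 64 bits, which this sketch does not model:

```python
def lui_result(imm20):
    """Value written to rd by RV32 LUI: imm20 in bits 31:12, zeros in bits 11:0."""
    return (imm20 << 12) & 0xFFFFFFFF

print(hex(lui_result(0x12345)))  # 0x12345000
```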

Both the encoding and the operation of every instruction are defined, and most are easy to understand. The encoding is very straightforward.

Now if this was a MIPS question and not a RISC-V question, then it is equally easy to understand. The 16-bit immediate goes into bits 31:16 of the constant being constructed, bits 15:0 are all zeros, and that constant is stored in the register encoded in the instruction. The instruction also carries an opcode so the processor knows what instruction this is.

13
Erik Eidt On

Is the image correct at all? Shouldn't the first half be all zeroes, as the definition mentions? Why do we have those 0xf, 0, rt fields, and where did rt come from?

Yes, it is correct; however, that image shows us it's a MIPS instruction, not RISC V (see footnote 1).

The 0xf is the MIPS opcode for lui.  There is an unused field of 5 bits (the zeros) and a register field along with the 16-bit immediate.
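
To make the field layout concrete, here is a rough Python sketch that packs those four fields. encode_mips_lui is a made-up helper, and the example assumes the usual MIPS register numbering in which $s0 is register 16:

```python
def encode_mips_lui(rt, imm16):
    """Pack MIPS lui: opcode 0xf (6 bits) | 5 zero bits | rt (5 bits) | imm (16 bits)."""
    assert 0 <= rt < 32
    return (0xF << 26) | (0 << 21) | (rt << 16) | (imm16 & 0xFFFF)

S0 = 16  # assumption: $s0 is register number 16 in the standard convention

print(hex(encode_mips_lui(S0, 0x1234)))  # 0x3c101234
```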

lui is not a memory reference instruction — it merely loads a constant stored in the instruction into the register.

Given the following command: lui S0, 0x1234 What will it do?

lui s0, 0x1234     ; s0 becomes 0x12340000

You can look it up in the MIPS green sheet.

Suffice it to say that using 2 instructions we can form a 32-bit constant value (data value or memory address). The first instruction, lui, forms the upper 16 bits of the 32-bit value, and the second instruction supplies the lower 16 bits using an ordinary ori or addi.

Load Upper Imm.        lui rt,imm      I-Type        R[rt] = {imm, 16'b0}

This one instruction is equivalent to the 2 instruction sequence:

ori rt, r0, imm         ; load 16-bit constant
sll rt, rt, 16          ; shift left by 16 bits
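
Both claims can be checked with a small Python model of the three instructions. The function names simply mirror the mnemonics and only model the 32-bit result, nothing else about the machine:

```python
MASK32 = 0xFFFFFFFF

def lui(imm16):
    """R[rt] = {imm, 16'b0}"""
    return ((imm16 & 0xFFFF) << 16) & MASK32

def ori(reg, imm16):
    """Bitwise OR with a zero-extended 16-bit immediate."""
    return reg | (imm16 & 0xFFFF)

def sll(reg, shamt):
    """Logical left shift, truncated to 32 bits."""
    return (reg << shamt) & MASK32

# lui alone gives the same result as ori from zero followed by a 16-bit shift
assert lui(0x1234) == sll(ori(0, 0x1234), 16)

# lui followed by ori builds a full 32-bit constant
value = ori(lui(0x1234), 0x5678)
print(hex(value))  # 0x12345678
```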

Footnote 1: Here's the RISC V version of LUI:

 31    12 11   7 6       0
+-------------------------+
|  imm   |  rd  |  opcode |
+-------------------------+
    20       5        7

Quite different, as you can see: the immediate is 20 bits long (not MIPS' 16 bits), and, of course, the opcode is on the right (and is 7 bits, not MIPS' 6 bits).  (There are also no wasted zeros.)

5
Peter Cordes On

First of all, this is MIPS's version of lui, not RISC-V. They're similar ISAs, but definitely have different machine code, and different sizes of immediates.

See @Erik's answer for parts of your question and some necessary background, but yes there is a problem with the wording in the image.

"loads the lower halfword of the immediate": what does this mean? I think that if imm has 32 bits, then it loads the first 16. Am I right?

This part of the image explains it badly; arguably even wrong. The diagram in the image shows the entire 32-bit instruction word, with its various fields. The immediate is the low half-word of the instruction word. The "lower halfword of the immediate" is the whole immediate, and there is no upper half-word. Phrasing it that way makes zero sense and is highly misleading.

The value placed in the destination register is the immediate left-shifted by 16, so yes the immediate is placed in the upper half-word of the 32-bit register.
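
In Python terms, a sketch of that distinction looks like this. The instruction word 0x3C101234 is an assumed example encoding of lui $s0, 0x1234 and is not taken from the image:

```python
word = 0x3C101234               # assumed encoding of: lui $s0, 0x1234
imm = word & 0xFFFF             # low halfword of the instruction word = the whole immediate
reg = (imm << 16) & 0xFFFFFFFF  # value written to the destination register

print(hex(imm))           # 0x1234
print(hex(reg))           # 0x12340000
print(hex(reg >> 16))     # 0x1234  (upper halfword of the register)
print(hex(reg & 0xFFFF))  # 0x0     (lower halfword of the register)
```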


Given the following command: lui S0, 0x1234

Single-step it in MARS and see what happens. Look at the register value in the debugger. It's 0x1234 << 16.