Adding a byte from memory to the AX register

I'm currently trying to figure out how to add the first byte in memory pointed to by the pointer register SI to the current contents of the AX register.

So if SI holds some address, and the values in memory at that address are: 00 and 01, I'm looking to add just 00 to the AX register.

The first instruction my assembly-noobish self tried was add ax, byte ptr [SI] but of course, no dice, as I'm trying to add operands of different sizes.

My current workaround is

mov dx,0000h             ;empty the contents of dx
mov dl,byte ptr [si]     ;get the value of the first byte in a register
add ax,dx                ;perform the originally desired addition

But this is incredibly wasteful and really hurts my executed instructions count (this is part of a subroutine that runs many times).

I'm limited to the 8086 instruction set so this question/answer by Peter Cordes which suggests movzx to condense my first two lines is unfortunately not viable.

Answer by Peter Cordes (best answer)

As you say, if you can assume a 386-compatible CPU, a good option (especially for modern CPUs) is movzx dx, byte ptr [mem] / add ax, dx. If not, I guess we can pretend we're tuning for a real 8086, where code size in bytes is often more important than instruction count. (Especially on 8088, with its 8-bit bus.) So you definitely want to use xor dx, dx to zero DX (2 bytes instead of 3 for mov reg, imm16), if you can't avoid a zeroing instruction altogether.

Hoist the zeroing of DX (or DH) out of any loop, so you just mov dl, [mem] / add ax, dx. If the function only does it once, you may need to (manually) inline the function in call sites that call it in a loop, if it's small enough for that to make sense. Or pick a register where callers are responsible for having the upper half zero.
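A minimal sketch of that hoisting, assuming the routine adds a run of bytes into AX. The byte count in CX and the label name are illustrative assumptions, not something stated in the question:

```asm
; Assumptions (not from the question): CX = number of bytes to add (non-zero),
; SI -> first byte, AX = running total on entry.
        xor     dx, dx          ; zero DX once, outside the loop
sum_loop:
        mov     dl, [si]        ; only DL is written, so DH stays 0
        add     ax, dx          ; DX now holds the zero-extended byte
        inc     si              ; advance to the next byte
        loop    sum_loop        ; decrement CX, repeat while CX != 0
```

Inside the loop this leaves just the 2-byte load and the 2-byte add per element, plus whatever pointer and counter updates the loop needs anyway.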

As Raymond says, you can pick any other register whose high half you know to be zero at that point in your function. Perhaps you could mov cx, 4 instead of mov cl, 4 if you happened to need CL=4 for something else earlier, but you're done with CX by the time you need to add into AX. mov cx, 4 is only 1 byte longer, so you get CH zeroed with only 1 extra byte of code-size. (vs. xor cx, cx costs 2 bytes)
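As a rough illustration of reusing a register that is already known to have a zero high half (the shift of BX is a hypothetical stand-in for whatever needed CL = 4 earlier):

```asm
; Hypothetical earlier work: BX shifted left by 4 using CL as the count.
        mov     cx, 4           ; 3 bytes: sets CL = 4 and CH = 0 together
        shl     bx, cl          ; the earlier use of CL
        ; ... CX is assumed dead from here on, and CH is not modified ...
        mov     cl, [si]        ; CH is still 0, so CX = zero-extended byte
        add     ax, cx          ; add the byte into AX
```

This only works if nothing between the `mov cx, 4` and the byte load writes CH.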


Another option is byte add/adc, but that isn't ideal for code size. (Or performance on later CPUs.)

  add al, [mem]      ; 2 bytes + extra depending on addr mode
  adc ah, 0          ; 3 bytes

So that's 1 byte more than if you already had a spare upper-zeroed register:

  mov  dl, [mem]     ; 2 bytes (+ optional displacement)
  add  ax, dx        ; 2 bytes

But on the plus side, add/adc doesn't need any extra register at all.


With the pointer in SI, it's worth looking for ways to take advantage of lodsb if you're really optimizing for code-size. That does mov al, [si] / inc si (or instead dec si if DF=1), but without affecting FLAGS. Since it overwrites AL, you'd want to keep the running total in a different register and add into that.

xchg ax, reg is only 1 byte, but if you need two swaps it may not pay for itself if you actually have to return in AX, not some other register.
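A sketch of how the lodsb approach might look, under the assumptions that the total starts at zero and is accumulated in DX, that CX holds a non-zero byte count, and that the caller expects the result in AX (all illustrative choices, not from the question):

```asm
; Assumptions: CX = byte count (non-zero), SI -> data, result wanted in AX.
        cld                     ; DF = 0 so lodsb increments SI
        xor     dx, dx          ; running total lives in DX
        xor     ax, ax          ; AH = 0; lodsb only ever writes AL
lods_loop:
        lodsb                   ; 1 byte: AL = [si], then SI = SI + 1
        add     dx, ax          ; AX is the zero-extended byte
        loop    lods_loop       ; decrement CX, repeat while CX != 0
        xchg    ax, dx          ; 1 byte: move the total into AX
```

If the running total arrives in AX rather than starting from zero, it first has to be moved out of AX and AH zeroed, which is the two-swap cost mentioned above.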