Checking for overflow and/or carry flags, getting an integer code of which happened


First, I want to point out that this isn't really x86, it's msx88 which is a sort of simplified version of x86 for learning purposes.

I need to make a function that checks for arithmetic errors (carry, overflow) and I know that I can use jo and jc for checking, but the problem is returning back to the point after the check (I don't want to use call, and I am not sure if jumps store IP, so I don't know if I can use ret).

How can I modify my code so that JO executes and, if the jump is taken, execution returns to the next instruction after JO (the JC)?

ORG 3000H 
ArithmeticError: MOV AX, 0 
JO overflow
JC carry
RET ;Return 
overflow: ADD AX, 1
carry: ADD AX, 2


;If overflow AX=1, if carry AX=2, if overflow and carry AX=3, else AX=0
ORG 2000H
CALL ArithmeticError

END

There are 4 answers

Maxim Razin (BEST ANSWER)

You should save the flags before any arithmetic. Something like

  MOV AX,0 ; NB not XOR to keep flags intact!
  JNO no_overflow
  PUSHF ; save flags
  INC AX
  POPF ; restore them back for the second check
no_overflow:
  JNC no_carry
  ADD AX,2
no_carry:
  ; if AX is zero, we have no error
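  CMP AX,0 ; x86 has no TST instruction; compare against zero instead
  JZ done  ; "out" is an instruction mnemonic, so the label is renamed
  CALL ArithmeticError
done: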
  RET

Jonathan Wood

A JMP does not preserve the calling address.

If you don't want to do a CALL, I would store the result somewhere, do your other processing, and then make the call (or do the same function within the current procedure).

It's not clear why you don't want to use a call, but a call is the best operation to use if you want to return to where you called it from.

Madhur Ahuja

If you don't want to use CALL, you can push the return address onto the stack yourself, use the conditional jump, and then use RET in the handler. Below is pseudocode:

PUSH [IP] + x  ; pseudocode: x is chosen so that the pushed address is that of the instruction just after ADD SP, y
JO OVERFLOW
ADD SP, y      ; jump not taken: discard the pushed address (y is the size of an address); note that ADD changes FLAGS
XOR EAX,EAX    ; execution resumes here in both cases; the address of this instruction minus the original IP is x


OVERFLOW:
; program instructions
RET

Peter Cordes

You actually just want to call ArithmeticError with AX holding one or two bits, right? Returning from JO is just an X-Y problem in your implementation.

There are a few strategies you can use. One simple one is to get FLAGS into AX and test the two bits there. (https://en.wikipedia.org/wiki/FLAGS_register has the layout.) We can leave it to ArithmeticError to sort out the bit positions in the rare case we actually have an error, instead of doing extra work every time.

   pushf
   pop    ax           ; OF = 0x0800  CF=0x0001
   and    ax, 0x0801   ; zero the other bits, leaving only OF and CF
   jnz    ArithmeticError      ; tail-call / jump  if either were set
back:
   ret

ArithmeticError:
; at the start of ArithmeticError, if you want to shuffle 0x0801 to 0x03
   shr    ah, 2        ; 0x8 -> 0x2  (immediate count needs 186+; on 8086 use shr ah,1 twice)
   or     al, ah       ; AL = 0 0 0 0 ' 0 0 OF CF.    (AH can be non-zero)
   ; ... do your printing or whatever
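To sanity-check the bit shuffle, here is a small Python model of the pushf / pop ax / and / shr / or sequence. It is not msx88 code, just a sketch that assumes the standard x86 FLAGS layout (CF in bit 0, OF in bit 11); the name `error_code_from_flags` is made up for illustration.

```python
CF_MASK = 0x0001  # carry flag, bit 0 of FLAGS
OF_MASK = 0x0800  # overflow flag, bit 11 of FLAGS


def error_code_from_flags(flags: int) -> int:
    """Model pushf / pop ax / and ax,0x0801 / shr ah,2 / or al,ah."""
    ax = flags & (CF_MASK | OF_MASK)   # and ax, 0x0801
    ah, al = ax >> 8, ax & 0xFF        # split AX into its two halves
    ah >>= 2                           # shr ah, 2   (0x08 -> 0x02)
    al |= ah                           # or  al, ah
    return al                          # 0 = none, 1 = CF, 2 = OF, 3 = both


if __name__ == "__main__":
    # 0xF7FE has every bit set except OF and CF, so it should map to 0.
    for flags in (0x0000, 0x0001, 0x0800, 0x0801, 0xF7FE):
        print(f"FLAGS={flags:#06x} -> AL={error_code_from_flags(flags)}")
```

Note that this encoding is CF=1, OF=2, which is the reverse of the question's (overflow=1, carry=2); swap the roles if the exact numbering matters.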

Jumping to ArithmeticError lets it use our return address, not coming back to us. (This is an optimized tailcall.) Otherwise in the common case, jnz falls through so we reach our own ret. It comes out the same as doing this, because call foo/ret is equivalent to jmp foo.

   ...
   jz    no_error        ; equivalent without optimized conditional tailcall
   call ArithmeticError
no_error:
   ret

inc leaves CF unmodified in x86. I'm assuming that's also true in msx88. That could be useful if we ever wanted to use inc ax; inc ax to increment by two.
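A minimal Python sketch of that difference, assuming x86 semantics for 16-bit `add` and `inc` (the helper names `add16` and `inc16` are hypothetical, and whether msx88 behaves the same way is still an assumption):

```python
def add16(a: int, b: int):
    """Model a 16-bit ADD: returns (result, CF, OF)."""
    total = a + b
    result = total & 0xFFFF
    cf = total > 0xFFFF                                # unsigned carry out of bit 15
    of = ((a ^ result) & (b ^ result) & 0x8000) != 0   # signed overflow
    return result, cf, of


def inc16(a: int, cf_in: bool):
    """Model a 16-bit INC: like ADD 1, except CF keeps its old value."""
    result, _, of = add16(a, 1)
    return result, cf_in, of


if __name__ == "__main__":
    print(add16(0xFFFF, 1))       # ADD wraps to 0 and sets CF
    print(inc16(0xFFFF, False))   # INC wraps to 0 but CF stays clear
    print(inc16(0x7FFF, True))    # INC can still set OF; CF stays set
```

This is why an `inc ax` placed between a JO check and a JC check does not disturb the carry test, although it does overwrite OF.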

But mov doesn't touch FLAGS at all, so we can just mov ax,2 if we're more interested in simplicity than code-size (see Footnote 1).

   mov  al, 0
   jno  no_overflow
   mov  al, 2         ; AL = 2 if OF was set
no_overflow:
   adc  al, 0         ; AL += 0 + CF, setting the low bit or leaving unmodified
   jnz  ArithmeticError   ; use FLAGS set according to AL, by ADC
   ret
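The control flow of this version can be traced with a short Python model. This is only a sketch: OF and CF are passed in as booleans, and `error_code_branch_adc` is a made-up name, not anything from msx88.

```python
from typing import Tuple


def error_code_branch_adc(of: bool, cf: bool) -> Tuple[int, bool]:
    """Model mov al,0 / jno / mov al,2 / adc al,0 / jnz.

    Returns (AL, whether the jnz to ArithmeticError is taken).
    """
    al = 0                           # mov al, 0   (mov leaves FLAGS alone)
    if of:                           # jno skips the next instruction when OF is clear
        al = 2                       # mov al, 2
    al = (al + 0 + int(cf)) & 0xFF   # adc al, 0   -> AL += CF
    zf = (al == 0)                   # adc sets ZF from its result
    return al, not zf                # jnz is taken when ZF is clear


if __name__ == "__main__":
    for of in (False, True):
        for cf in (False, True):
            print(f"OF={of} CF={cf} ->", error_code_branch_adc(of, cf))
```

The point of the model is that `adc` does double duty: it folds CF into AL and also sets ZF for the final `jnz`, so no separate compare is needed.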

Footnote 1: inc ax is a 1-byte instruction in 16-bit mode, vs. 3 for mov ax, 0 or mov ax,2. When optimizing for ancient 8088 CPUs, you would use inc ax twice (2 bytes total) instead of add ax,2 (3 bytes) because code-fetch was the primary bottleneck.

But to save bytes, we used al instead of ax. You can zero-extend into AX if you want in the ArithmeticError case, or change the instructions to use AX directly.


If we have setcc (386 and later), we can turn a FLAGS condition into an integer 0 / 1 in a register. Without it, you might start with 8086 lahf (Load AH from FLAGS) to get CF into the bottom of AH, before doing something with OF (which is unfortunately outside the low 8 bits of FLAGS). lahf/and ah,1 emulates setc ah, except it also writes FLAGS so you'd have to wait until after branching on OF.

; kinda clunky
   setc   ah     ; AH = CF      ; optionally use CL or DL to avoid partial-register stalls or false dependencies
   seto   al     ; AL = OF
   add    al, ah    ; AL = 1 or 2 if either or both bits were set
                    ; we'll have to decode using AH to recover which one
   jnz   ArithmeticError   
   ret

; assuming that no-error is vastly more common than either carry or overflow
; we made the branching as cheap as possible, leaving more decode work for this
 ArithmeticError:
   sub  al, ah      ; undo the add, back to AH=CF, AL=OF
   shl  ah, 1
   add  al, ah      ; OF + 2*CF   (AL held OF and AH held CF, so CF is the one doubled)
 ; if I'd used separate regs like dl and al, could have used  lea eax, [eax + 2*edx]
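Tracing the register values through the setcc version and its decode step, as a rough Python model (the function names `setcc_encode` and `setcc_decode` are hypothetical). With AH = CF and AL = OF as in the code, the final AL works out to OF + 2*CF, i.e. overflow=1 and carry=2, which happens to be the encoding the question asks for.

```python
def setcc_encode(of: bool, cf: bool):
    """Fast path: setc ah / seto al / add al,ah. Returns (AL, AH)."""
    ah = int(cf)              # setc ah
    al = int(of)              # seto al
    al = (al + ah) & 0xFF     # add al, ah   -> non-zero if either flag was set
    return al, ah


def setcc_decode(al: int, ah: int) -> int:
    """Error path: sub al,ah / shl ah,1 / add al,ah. Returns the final AL."""
    al = (al - ah) & 0xFF     # sub al, ah   -> AL = OF again
    ah = (ah << 1) & 0xFF     # shl ah, 1    -> AH = 2*CF
    al = (al + ah) & 0xFF     # add al, ah   -> AL = OF + 2*CF
    return al


if __name__ == "__main__":
    for of in (False, True):
        for cf in (False, True):
            al, ah = setcc_encode(of, cf)
            print(f"OF={of} CF={cf}: fast-path AL={al}, decoded AL={setcc_decode(al, ah)}")
```

To get CF + 2*OF instead (matching the other versions in this answer), one option is to swap which flag goes into AH and which into AL.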

Somewhat efficient assuming errors are rare: two not-taken branches

   jo   overflow
   mov  al, 0         ; only zero AL if mov al, 2 isn't going to run
   jc   carry
   ret

overflow:
   mov  al, 2
   jnc  nocarry      ; alternative:  ADC al,0  as in the earlier version
carry:
   inc  ax           ; like OR al,1   since AL=0 or 2 before this
nocarry:
   jmp   ArithmeticError       ; could be a call, if you want to put stuff after

This ends up checking CF in two places, but that's because we want to keep the fast path as minimal as possible, just two not-taken branches. (And a mov al,0).

You could take this even further and do more work sorting things out in the overflow: and carry: blocks, so the fast path is just jc / jo. In that case the carry block can't assume any register state was already set up. You might just use both halves of AH:AL separately. If errors are rare, the extra code required will still run a negligible number of times.

But if you're replicating this for every operation that you want error checking for, it makes sense to tighten it up.

Putting the mov al,0 between the two branches spaces them out, which may help branch predictors in some CPUs. That's maybe not important if they're usually both not-taken, but it may help on in-order pipelines, especially P5 Pentium, which can pair jcc in the V pipe. (See Agner Fog's microarch guide and instruction tables; ret is not pairable, but near call is.) Of course modern x86 does out-of-order exec, but producing correct predictions for dense branches can still be a problem for the front-end.

We're not depending on a conditional tailcall, so we can use call ArithmeticError if we want to inline this block into functions instead of having a ret after the jc. Also, even before the 386 added jcc rel16, we aren't limited to the [-128,+127] branch displacement of jcc rel8 to reach ArithmeticError: JMP does have a 16-bit displacement version even in 8086.