How to move the zero flag into a register in x86-64?

I want to move the zero flag set as the result of a comparison, e.g. "cmp rax,rbx", into a register. I know I can use one of PUSHF/PUSHFD/PUSHFQ to push the flags onto the stack, but now I want to move just the zero flag from the stack to a register.

According to https://www.felixcloutier.com/x86/pushf:pushfd:pushfq, it "Decrements the stack pointer by 4 (if the current operand-size attribute is 32) and pushes the entire contents of the EFLAGS register onto the stack." But that doesn't tell me the order they are pushed and how to access the zero flag.

According to the Intel Software Developers Manual (June 2023), section 7.3.13.2, the zero flag is #6 from the base at zero. (Also Wikipedia). So from that I would guess that mov al,[rsp+6] would do it. But I need to be sure because they are all just booleans -- I may grab the wrong boolean.

I'm in 64-bit mode, so I would be addressing the RFLAGS register. Section 7.3.1.4 of the manual says, "PUSHF and POPF behave the same in 64-bit mode as in non-64-bit mode. PUSHFD always pushes 64-bit RFLAGS onto the stack (with the RF and VM flags read as clear). POPFD always pops a 64-bit value from the top of the stack and loads the lower 32 bits into RFLAGS. It then zero extends the upper bits of RFLAGS."

Finally, what's the difference between PUSHFD and PUSHFQ? Double word and quadword, but which should I use in 64-bit mode?

There are 2 answers

Answer by vengy

This MASM example extracts the ZF (bit #6) from the RFLAGS register using the very inefficient PUSHFQ/POP method and stores it in the rcx register. As noted in the comments below, there are much more efficient alternatives.

option casemap:none

.code

main proc
    mov rax,0               ; Init to 0
    mov rbx,0               ; Init to 0
    cmp rax,rbx             ; Comparison equal, sets ZF=1
    pushfq                  ; Push RFLAGS onto the stack
    pop rcx                 ; Load the RFLAGS from the stack into rcx
    shr rcx, 6              ; Shift right by 6, so the ZF is now the LSB of rcx
    and rcx, 1              ; Zero all other bits, except the LSB
                            ; rcx contains the ZF moved from the stack to a register
    ret
main endp

end

Here's a brief overview of the low 16 bits of the flags register:

[Figure: bit layout of the 16-bit FLAGS register]

Answer by Nate Eldredge

Don't use pushf for this. As has already been mentioned in comments, the best way to materialize ZF into a register is the conditional set instruction setz. This is exactly what it's for. setcc reg8 sets an 8-bit register to 0 or 1 according to whether the condition code cc is false or true.

The fact that it sets an 8-bit partial register, and leaves the other 56 bits of the 64-bit register unchanged, is awkward. It's rarely useful behavior, and is actually bad for performance because it can introduce a read dependency on the previous 64-bit register value. To mitigate this, you can zero the 64-bit register first. For instance:

xor ecx, ecx
cmp rax, rbx
setz cl

Note that xor ecx, ecx actually zeros all of rcx, and is more efficient than the obvious xor rcx, rcx; see "What is the best way to set a register to zero in x86 assembly: xor, mov or and?" and "Why do x86-64 instructions on 32-bit registers zero the upper part of the full 64-bit register?". Make sure the xor goes before the cmp, not after, since xor itself overwrites the flags.
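To make the partial-register behavior concrete, here is a minimal C sketch that models the register effects of the sequence above. It is only a model, not real register access: the function names (setz_low8, xor_self32, materialize_zf) are made up for illustration, and the flag computation covers ZF only.

```c
#include <stdint.h>

/* Model of `setz cl`: only the low 8 bits of the 64-bit register are
   replaced; the upper 56 bits keep whatever value they held before. */
uint64_t setz_low8(uint64_t reg, int zf)
{
    return (reg & ~UINT64_C(0xFF)) | (uint64_t)(zf != 0);
}

/* Model of `xor ecx, ecx`: a write to a 32-bit register zero-extends
   into the full 64-bit register, so all of rcx becomes 0. */
uint64_t xor_self32(uint64_t reg)
{
    (void)reg;          /* the old value does not matter */
    return 0;
}

/* Model of: xor ecx, ecx / cmp rax, rbx / setz cl */
uint64_t materialize_zf(uint64_t rcx, uint64_t rax, uint64_t rbx)
{
    rcx = xor_self32(rcx);
    int zf = (rax - rbx) == 0;   /* cmp sets ZF when the difference is 0 */
    return setz_low8(rcx, zf);
}
```

Without the zeroing step, setz_low8(0x1122334455667788, 1) yields 0x1122334455667701, which is why the stale upper bits have to be cleared before the result can be used as a clean 0 or 1.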


The pushf family is not a good choice here; it is a fairly slow instruction in itself (about 5-10 clock cycles latency), involves memory access, and then requires more work to manipulate the bits of the result. setz on the other hand is typically just one clock cycle. If for some reason you couldn't use setz, then your next best option would be a conditional jump:

    xor ecx, ecx
    cmp rax, rbx
    jnz onward
    mov ecx, 1
onward:
    ;; more code

Now, to address your actual questions about pushf.

As for "order", pushfq in 64-bit mode pushes the entire RFLAGS register as a single 64-bit value. The layout of RFLAGS is found in the Intel manuals and also in the Wikipedia page you already linked. The bit numbers start with 0 as the least significant bit. So ZF is bit 6, and a working but inefficient way to materialize ZF would be

pushfq
pop rax
shr eax, 6
and eax, 1

Your proposed mov al, [rsp+6] would load al with byte 6, i.e. bits 48-55. All bits from 32 upward are reserved and should read as 0, so this will just give you 0 in al. Again, the flags are the individual bits of a single 64-bit quadword; they are not pushed as 64 separate bytes. Since x86 is little-endian, ZF (bit 6) actually lives in the byte at [rsp+0].
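The byte-versus-bit distinction can be sketched in portable C. This is a model under stated assumptions, not real flag access: the helper names are hypothetical, and 0x246 is just a plausible sample RFLAGS value after an equal comparison (ZF, PF and IF set, plus the always-1 reserved bit 1).

```c
#include <assert.h>
#include <stdint.h>

enum { ZF_BIT = 6 };   /* ZF is bit 6 of RFLAGS, counting from the LSB */

/* Lay out a 64-bit value the way x86 stores it in memory:
   little-endian, least significant byte at the lowest address. */
void store_le64(uint8_t mem[8], uint64_t value)
{
    for (int i = 0; i < 8; i++)
        mem[i] = (uint8_t)(value >> (8 * i));
}

/* Model of `mov al, [rsp+offset]` on the pushed quadword. */
uint8_t load_byte(const uint8_t mem[8], int offset)
{
    assert(offset >= 0 && offset < 8);
    return mem[offset];
}

/* Model of `shr eax, 6` followed by `and eax, 1`. */
unsigned zero_flag(uint64_t rflags)
{
    return (unsigned)((rflags >> ZF_BIT) & 1u);
}
```

With the sample value 0x246 stored by store_le64, load_byte(mem, 6) returns 0 even though ZF is set, while zero_flag(0x246) returns 1. The bit itself sits in byte 0, which holds 0x46.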


pushfd and pushfq are the same opcode; the question of whether to push 32 or 64 bits is determined by the current CPU mode. If you have correctly told your assembler that you are writing code for 64-bit mode (e.g. with bits 64 in nasm) then it will refuse to assemble pushfd.

nasm will actually treat the pushf mnemonic in 64-bit mode as pushfq since that is normally what you want. It is however possible to execute a 16-bit flag push in 64-bit mode with an operand-size override prefix (66h); Intel's manuals call this pushf but nasm would require you to write it as pushfw. This is not useful, though, since it misaligns the stack. In 64-bit mode, you always want to push and pop full 64-bit (8-byte) units, except in very unusual situations.