How to use ins instruction with GNU assembler

How do I use the x86 ins instruction with the GNU assembler? The instruction reference suggests the syntax INS m8/16/32, DX, where e.g. m16 is (I assume) any 16-bit general-purpose register whose only purpose is to signify whether a byte, word, or doubleword should be read. Is that right?

Unfortunately, as rejects ins %ax,%dx with Error: operand type mismatch for 'ins'. Why is that?

For the record, I know I could simply use insb etc. but I'm calling this instruction via inline assembly inside a C++ program and the size of the input to be read depends on a template parameter (and string handling at compile time is not very practical).

EDIT: here is what I have now, for reference (I don't really like the macro)

#define INS(T) \
  __asm__ __volatile__("repne \n\t" \
                       "ins" T \
                       : "=D" (dest), "=c" (count) \
                       : "d" (port), "0" (dest), "1" (count) \
                       : "memory", "cc")

template<typename T>
void ins(uint16_t port, uint32_t dest, uint32_t count);

template<>
void ins<uint8_t>(uint16_t port, uint32_t dest, uint32_t count)
{ INS("b"); }

template<>
void ins<uint16_t>(uint16_t port, uint32_t dest, uint32_t count)
{ INS("w"); }

template<>
void ins<uint32_t>(uint16_t port, uint32_t dest, uint32_t count)
{ INS("l"); }

There are 2 answers

Nate Eldredge (best answer)

The operand is supposed to be a memory reference, not a register. The idea in Intel syntax is that you could write ins dword ptr [rdi], dx for a 32-bit read (aka insd), ins word ptr [rdi], dx for a 16-bit read (insw), etc. You could maybe even write ins dword ptr [foo], dx and get a 32-bit read, but the data would be written to [rdi] anyway. AT&T assembler syntax, however, doesn't support selecting the size this way at all; the only way to specify the size is with the operand-size suffix.

In GCC inline assembly, you can get an operand-size suffix (b, w, l, or q) matching the size of an operand, based on its type, with the z operand modifier. See the GCC manual, section "Extended Asm" under "x86 operand modifiers". So if you use types appropriately and add an operand which refers to the actual destination (not the pointer to it), you can do the following:

template<typename T>
void ins(uint16_t port, T *dest, uint32_t count) {
    asm volatile("rep ins%z2"
        : "+D" (dest), "+c" (count), "=m" (*dest)
        : "d" (port)
        : "memory");
}

It's important here that the destination be a T * instead of a generic uint32_t since the size is inferred from the type T.

I've also replaced your duplicated input and output operands with the + read-write constraint. And to nitpick, the "cc" clobber is unnecessary because rep ins doesn't affect any flags; it would be redundant on x86 in any case, because every inline asm is assumed to clobber the flags anyway.
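
If you would rather not rely on the z modifier, the same compile-time selection can be sketched with an ordinary branch on sizeof(T). This is only an illustration, not something from the question or the GCC manual: ins_suffix and ins_checked are hypothetical names, and count is widened to std::size_t on the assumption that you want the whole count register defined in 64-bit builds too. The static_assert rejects 8-byte element types, for which no INS form exists.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: the AT&T size suffix matching an element size,
// or '\0' when INS has no form for that size (there is no 8-byte INS).
constexpr char ins_suffix(std::size_t size) {
    return size == 1 ? 'b' : size == 2 ? 'w' : size == 4 ? 'l' : '\0';
}

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
// Reads `count` elements of type T from `port` into `dest`.
// Executing it requires I/O privilege (kernel code or a suitable IOPL).
template <typename T>
void ins_checked(std::uint16_t port, T *dest, std::size_t count) {
    static_assert(ins_suffix(sizeof(T)) != '\0',
                  "INS only moves 1-, 2- or 4-byte elements");
    // sizeof(T) is a compile-time constant, so an optimising compiler
    // keeps only the branch that matches T.
    if (sizeof(T) == 1) {
        asm volatile("rep insb" : "+D"(dest), "+c"(count) : "d"(port) : "memory");
    } else if (sizeof(T) == 2) {
        asm volatile("rep insw" : "+D"(dest), "+c"(count) : "d"(port) : "memory");
    } else {
        asm volatile("rep insl" : "+D"(dest), "+c"(count) : "d"(port) : "memory");
    }
}
#endif
```

The z-modifier version is shorter; the branch version trades that for an explicit error message when an unsupported type is used.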

Michael Petch

This answer is in response to the first version of the question that was asked prior to the major edit by the OP. This addresses the AT&T syntax of GNU assembler for the INS instruction.

The instruction set reference for INS/INSB/INSW/INSD (Input from Port to String) shows that there are really only 3 forms of the INS instruction: one each for a byte (B), word (W), or doubleword (D). Ultimately a BYTE (b), WORD (w), or DWORD (l) is read from the port in DX and written to ES:[RDI], ES:[EDI], or ES:[DI]. There is no form of the INS instruction that takes a register as a destination, unlike IN, which can write to AL/AX/EAX.

Note: with the IN instruction the port is considered the source and is the first parameter in AT&T syntax where the format is instruction src, dest.
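
To make the data movement concrete, here is a small software model of a repeated INS with the direction flag clear. It is only a sketch: PortReader and model_rep_ins are made-up names, the port is simulated by a callback, and nothing here touches real hardware. It shows that the destination is always memory, that the pointer advances by the element size after each store, and that the count runs down to zero.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>

// Hypothetical stand-in for the I/O port: returns the next value the
// device would deliver for an access of `size` bytes.
using PortReader = std::function<std::uint32_t(std::uint16_t port, std::size_t size)>;

// Software model of "rep ins{b,w,l}" with the direction flag clear.
// `size` is 1, 2 or 4; `dest` plays the role of the DI/EDI/RDI pointer
// and `count` the role of CX/ECX/RCX.
// Returns the pointer just past the last element stored, which is where
// the real instruction leaves the destination register.
inline unsigned char *model_rep_ins(const PortReader &read_port,
                                    std::uint16_t port, unsigned char *dest,
                                    std::size_t count, std::size_t size) {
    while (count != 0) {
        const std::uint32_t value = read_port(port, size);
        // x86 stores little-endian: least significant byte first.
        for (std::size_t i = 0; i < size; ++i) {
            dest[i] = static_cast<unsigned char>((value >> (8 * i)) & 0xFF);
        }
        dest += size;  // DF = 0: the pointer moves up by the element size
        --count;
    }
    return dest;
}
```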

In GNU assembler it is easiest to simply use one of these three forms:

insb      # Read BYTE from port in DX to [RDI] or ES:[EDI] or ES:[DI]
insw      # Read WORD from port in DX to [RDI] or ES:[EDI] or ES:[DI]
insl      # Read DWORD from port in DX to [RDI] or ES:[EDI] or ES:[DI]

In 16-bit code these instructions do the following:

insb      # Read BYTE from port in DX to ES:[DI]
insw      # Read WORD from port in DX to ES:[DI]
insl      # Read DWORD from port in DX to ES:[DI]

In 32-bit code these instructions do the following:

insb      # Read BYTE from port in DX to ES:[EDI]
insw      # Read WORD from port in DX to ES:[EDI]
insl      # Read DWORD from port in DX to ES:[EDI]

In 64-bit code these instructions do the following:

insb      # Read BYTE from port in DX to [RDI]
insw      # Read WORD from port in DX to [RDI]
insl      # Read DWORD from port in DX to [RDI]

Why do assemblers support the long form as well? Mainly for documentation purposes, but the long form can also express one subtle change: the size of the memory address (not the size of the data to move). In GNU assembler this is supported in 16-bit code:

insb (%dx),%es:(%di)      # also applies to INSW and INSL
insb (%dx),%es:(%edi)     # also applies to INSW and INSL

16-bit code can use either the 16-bit or the 32-bit registers to form the memory operand, and this is how you can override the address size (there is another way, using addr overrides, described below). In 32-bit code it is possible to do this:

insb (%dx),%es:(%di)      # also applies to INSW and INSL
insb (%dx),%es:(%edi)     # also applies to INSW and INSL

It is possible to use the 16-bit registers in a memory operand in 32-bit code. There are very few use cases where this is particularly useful but the processor supports it. In 64-bit code you are allowed to use 32-bit or 64-bit registers in a memory operand, so in 64-bit code this is possible:

insb (%dx),%es:(%rdi)      # also applies to INSW and INSL
insb (%dx),%es:(%edi)      # also applies to INSW and INSL

There is a shorter way in GNU assembler to change the memory address size and that is using INSB/INSW/INSL with the addr16, addr32, and addr64 overrides. As an example in 16-bit code these are equivalent:

addr32 insb               # Memory address is %es:(%edi). also applies to INSW and INSL
insb (%dx),%es:(%edi)     # Same as above

In 32-bit code these are equivalent:

addr16 insb               # Memory address is %es:(%di). also applies to INSW and INSL
insb (%dx),%es:(%di)      # Same as above

In 64-bit code these are equivalent:

addr32 insb               # Memory address is %es:(%edi). also applies to INSW and INSL
insb (%dx),%es:(%edi)     # Same as above
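
At the machine-code level, the choices above come down to two optional prefix bytes in front of a one-byte opcode. The sketch below is a hypothetical helper, not part of GNU assembler, that builds those bytes from the code mode, the data size and the address size. The opcodes (0x6C for INSB, 0x6D for INSW/INSD) and prefixes (0x66 operand-size, 0x67 address-size) are the standard x86 ones; the order in which the two prefixes are emitted is simply a choice made here, and an assembler may order them differently.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: returns the bytes for a single INS, or an empty
// vector if the combination cannot be encoded in that mode.
//   mode_bits:    16, 32 or 64 (the mode the code runs in)
//   operand_bits: 8, 16 or 32  (insb / insw / insl)
//   address_bits: 16, 32 or 64 (DI / EDI / RDI as destination pointer)
inline std::vector<std::uint8_t> encode_ins(int mode_bits, int operand_bits,
                                            int address_bits) {
    if (mode_bits != 16 && mode_bits != 32 && mode_bits != 64) return {};
    if (operand_bits != 8 && operand_bits != 16 && operand_bits != 32) return {};

    // The default address size equals the mode; 0x67 selects the alternative:
    // 16-bit mode -> 32, 32-bit mode -> 16, 64-bit mode -> 32.
    const int alt_address = (mode_bits == 32) ? 16 : 32;
    if (address_bits != mode_bits && address_bits != alt_address) return {};

    std::vector<std::uint8_t> bytes;
    if (address_bits != mode_bits) bytes.push_back(0x67);  // addr16 / addr32

    // The default operand size is 16 bits in 16-bit mode and 32 bits
    // otherwise; 0x66 selects the other one. Byte operations need no prefix.
    const int default_operand = (mode_bits == 16) ? 16 : 32;
    if (operand_bits != 8 && operand_bits != default_operand) bytes.push_back(0x66);

    bytes.push_back(operand_bits == 8 ? 0x6C : 0x6D);  // INSB : INSW/INSD
    return bytes;
}
```

For example, encode_ins(64, 8, 32) corresponds to addr32 insb in 64-bit code, and asking for a 16-bit address in 64-bit code yields an empty result because that combination has no encoding.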