How do I use the x86 ins instruction with the GNU assembler? The instruction reference suggests the syntax INS m8/16/32, DX where e.g. m16 (I assume) is any 16 bit general purpose register whose only purpose it is to signify whether a byte/word/doubleword should be read, right?
Now, unfortunately, as rejects ins %ax,%dx with Error: operand type mismatch for 'ins', why is that?
For the record, I know I could simply use insb etc. but I'm calling this instruction via inline assembly inside a C++ program and the size of the input to be read depends on a template parameter (and string handling at compile time is not very practical).
EDIT: here is what I have now, for reference (I don't really like the macro)
#define INS(T) \
__asm__ __volatile__("repne \n\t" \
"ins" T \
: "=D" (dest), "=c" (count) \
: "d" (port), "0" (dest), "1" (count) \
: "memory", "cc")
template<typename T>
void ins(uint16_t port, uint32_t dest, uint32_t count);
template<>
void ins<uint8_t>(uint16_t port, uint32_t dest, uint32_t count)
{ INS("b"); }
template<>
void ins<uint16_t>(uint16_t port, uint32_t dest, uint32_t count)
{ INS("w"); }
template<>
void ins<uint32_t>(uint16_t port, uint32_t dest, uint32_t count)
{ INS("l"); }
It's supposed to be a memory reference, not a register. The idea in Intel syntax is that you could write ins dword ptr [rdi], dx for a 32-bit read (aka insd), ins word ptr [rdi], dx for a 16-bit insw read, etc. You could maybe even write ins dword ptr [foo], dx and get a 32-bit read but the data would be written to [rdi] anyway. Nonetheless AT&T assembler syntax doesn't support this at all and the only way to specify the size is with the operand size suffix.
In GCC inline assembly, you can get an operand size suffix b,w,l,q matching the size of an operand (based on its type) with the z operand modifier. See the GCC manual, section "Extended Asm" under "x86 operand modifiers". So if you use types appropriately and add an operand which refers to the actual destination (not the pointer to it), you can do the following:
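A minimal sketch of such a template, assuming GCC targeting x86. The operand numbering (the memory destination is operand 2, hence %z2) and the use of std::size_t for the count, so that the whole count register is defined on 64-bit targets as well, are choices made for this sketch:

```cpp
#include <cstddef>
#include <cstdint>

// Sketch only: needs I/O privilege to actually execute.
// %z2 expands to b, w or l according to the type of operand 2 (*dest),
// so the same template emits rep insb / rep insw / rep insl.
template <typename T>
void ins(std::uint16_t port, T *dest, std::size_t count)
{
    __asm__ __volatile__("rep ins%z2"
                         : "+D"(dest),   // destination pointer, read and updated
                           "+c"(count),  // repeat count, read and updated
                           "=m"(*dest)   // only here to carry the element type
                         : "d"(port)     // port number in dx
                         : "memory");    // the whole buffer is written, not just *dest
}
```

A call such as ins<std::uint16_t>(port, buffer, n) then needs no explicit specializations and no macro.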
It's important here that the destination be a T * instead of a generic uint32_t since the size is inferred from the type T.
I've also replaced your duplicated input and output operands with the + read-write constraint. And to nitpick, the "cc" clobber is unnecessary because rep ins doesn't affect any flags, and it would be redundant on x86 in any case because every inline asm is assumed to clobber the flags anyway.
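Since ins will normally fault in an ordinary user-space process (it needs I/O privilege), one way to watch the z modifier and the + constraint at work is with a harmless instruction. The following is only an illustration, assuming GCC on x86-64; the function name increment is made up for this example:

```cpp
#include <cstdint>

// Hypothetical helper: %z0 picks the suffix from the type of operand 0,
// giving incb / incw / incl / incq for 1-, 2-, 4- and 8-byte integers.
template <typename T>
T increment(T value)
{
    // "+r": value is both read and written, in any general-purpose register.
    // inc changes the flags, but no "cc" clobber is listed because GCC
    // already treats the flags as clobbered by every inline asm on x86.
    __asm__("inc%z0 %0" : "+r"(value));
    return value;
}
```

Compiling with -S and looking at the generated assembly shows the suffix chosen for each instantiation.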