As far as I know, SSE/AVX has no instruction for loading an immediate into a vector register. One workaround is to load the value into a general-purpose register and then movd it into the vector register, but compilers seem to consider this more costly than loading from memory, even for a single scalar value.
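To make the two routes concrete, here is a minimal sketch using C intrinsics. It assumes an x86 target with SSE2; the function names are made up for illustration. Note that an optimizing compiler is free to turn either form into whatever instruction sequence it prefers, so the generated assembly is what needs inspecting, not the source.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics; also pulls in the SSE header */
#include <stdint.h>
#include <string.h>

/* Route A: integer constant -> general-purpose register -> movd.
 * _mm_cvtsi32_si128 is the intrinsic that corresponds to movd xmm, r32. */
static inline __m128 one_via_gpr(void) {
    return _mm_castsi128_ps(_mm_cvtsi32_si128(0x3f800000));
}

/* Route B: ask for the float constant and let the compiler decide,
 * which in practice is often a load from a read-only data section. */
static inline __m128 one_via_set(void) {
    return _mm_set_ss(1.0f);
}

/* Helper (hypothetical name): raw bits of the lowest float lane. */
static inline uint32_t low_lane_bits(__m128 v) {
    float f = _mm_cvtss_f32(v);
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Both routes must leave the bit pattern of 1.0f in the low lane. */
static inline void self_check(void) {
    assert(low_lane_bits(one_via_gpr()) == 0x3f800000u);
    assert(low_lane_bits(one_via_set()) == 0x3f800000u);
}
```

Compiling this with optimization and looking at the output for each function is a quick way to see which route a given compiler actually picks.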
This means a memory access is needed every time an operation uses a common constant such as 1, 0x80000000, 0x7fffffff, 0x3f800000, 0x3f000000, etc. Encoding these values in the machine code would occupy 4 bytes each, but so does a 32-bit absolute or rip-relative address, and I believe an immediate load is cheaper than any sort of memory load.
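For readers wondering why these particular values come up so often, the sketch below shows what they are in scalar C. It assumes float is IEEE 754 binary32 (true on x86); the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a float as its 32-bit pattern and back (assumes binary32). */
static inline uint32_t bits_of(float f) {
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

static inline float float_of(uint32_t u) {
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* 0x7fffffff clears the sign bit: absolute value via AND. */
static inline float abs_by_mask(float x) {
    return float_of(bits_of(x) & 0x7fffffffu);
}

/* 0x80000000 is the sign bit alone: negation via XOR. */
static inline float neg_by_mask(float x) {
    return float_of(bits_of(x) ^ 0x80000000u);
}

static inline void check_constants(void) {
    assert(bits_of(1.0f) == 0x3f800000u);  /* 1.0f */
    assert(bits_of(0.5f) == 0x3f000000u);  /* 0.5f */
    assert(abs_by_mask(-2.5f) == 2.5f);
    assert(neg_by_mask(3.0f) == -3.0f);
}
```

In vector code the same masks are applied with andps and xorps, which is why each one normally ends up as a constant that has to be loaded from memory.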
I always thought something like movss xmm, imm32 or broadcastss xmm, imm32 would be nice to have, but there must be a reason why no such instructions exist. Why was it designed this way?