Can I move 256 bits from a memory location directly into a YMM register? To fill an XMM register, I currently use the following in GCC inline asm:
"movlpd mytest_1(%rip),%xmm1 \n\t"
"movhpd mytest_1+8(%rip),%xmm1 \n\t"
Can this be done more simply?
Furthermore, is there an equivalent way to move 4 quadwords, aligned or not, into ymm0 in one step? I am looking for the reverse of vmovdqa ymm1, mem256 (source -> destination).
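These two instructions can be combined into a single movdqu (unaligned) or movdqa (aligned) load. This works because x86 is a little-endian architecture: the quadword at the lower address ends up in the low half of the register, exactly as with movlpd followed by movhpd.

The AVX counterparts, vmovdqu and vmovdqa, do the same for 256-bit (32-byte) memory transfers into a YMM register.

Regarding the second part of your question: this works in both directions. For example, vmovdqa has a load form (vmovdqa ymm1, ymm2/m256) and a store form (vmovdqa ymm2/m256, ymm1), written here in Intel operand order with the destination first.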
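To make the load and store directions concrete, here is a minimal sketch in GCC inline asm (AT&T syntax, source operand first). The helper names copy128 and copy256 are hypothetical and only serve the illustration. The addresses are passed in general-purpose registers instead of the RIP-relative mytest_1 symbol from the question, which is an assumption made to keep the snippet self-contained. The unaligned forms are used so that any address works; movdqa and vmovdqa could be substituted when the data is known to be 16-byte or 32-byte aligned, and they fault on misaligned addresses.

```c
/* Hypothetical helper: one 128-bit load replaces the movlpd/movhpd pair.
   movdqu is SSE2, which every x86-64 CPU supports. */
void copy128(const void *src, void *dst)
{
    __asm__ volatile(
        "movdqu (%0), %%xmm1 \n\t"   /* memory -> xmm1 (load)  */
        "movdqu %%xmm1, (%1) \n\t"   /* xmm1 -> memory (store) */
        :
        : "r"(src), "r"(dst)
        : "xmm1", "memory");
}

/* Hypothetical helper: the same pattern for 256 bits (4 quadwords).
   Requires a CPU and OS with AVX support; call it only after checking,
   for example with __builtin_cpu_supports("avx"). */
void copy256(const void *src, void *dst)
{
    __asm__ volatile(
        "vmovdqu (%0), %%ymm1 \n\t"  /* memory -> ymm1 (load)  */
        "vmovdqu %%ymm1, (%1) \n\t"  /* ymm1 -> memory (store) */
        :
        : "r"(src), "r"(dst)
        : "xmm1", "memory");         /* xmm1 is the low half of ymm1 */
}
```

Calling copy128(src, dst) copies 16 bytes through xmm1, and copy256(src, dst) copies 32 bytes through ymm1. In both cases the first instruction is the memory-to-register direction asked about, and the second is the register-to-memory direction.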