Intel xmm registers do not load and multiply correctly


I have written the following assembly programme to test how the xmm registers multiply data:

# file: xmm.a
.global _start
_start:
movsd (%rip), %xmm4
movsd -8(%rip), %xmm7
mulsd %xmm4, %xmm7
jmp _start

I assembled it with "as xmm.a -o xmm.o" and linked it with "ld xmm.o -o xmm". After that I debugged the _start function with "gdb --args ./xmm" and reached the screen output shown further down.

The programme should read the 16 bytes of code data directly into xmm4 and xmm7 and multiply them; however, it seems that it is not doing so.

My system is Ubuntu 20.04.6 LTS on Windows Subsystem for Linux, on an Intel Core i7 machine.

(gdb) disassemble /r  _start

  Dump of assembler code for function _start:
       0x00401000 <+0>:     f2 0f 10 25 00 00 00 00 movsd  0x0(%rip),%xmm4
       0x00401008 <+8>:     f2 0f 10 3d f8 ff ff ff movsd  -0x8(%rip),%xmm7   <_start+8>
       0x00401010 <+16>:    f2 0f 59 fc     mulsd  %xmm4,%xmm7
    => 0x00401014 <+20>:    eb ea   jmp    0x401000 <_start>
  End of assembler dump.

(gdb) x/8i _start-8

      0x400ff8:    add      %al,(%rax)
      0x400ffa:    add      %al,(%rax)
      0x400ffc:    add      %al,(%rax)
      0x400ffe:    add      %al,(%rax)
      0x401000 <_start>:    movsd  0x0(%rip),%xmm4
      0x401008 <_start+8>:  movsd  -0x8(%rip),%xmm7
      0x401010 <_start+16>: mulsd  %xmm4,%xmm7
   => 0x401014 <_start+20>: jmp    0x401000 <_start>

(gdb) print $xmm4

    $3 = {v4_float = {2.38630424e-29, -nan(0x783d10), 0, 0}, v2_double = {-nan(0x83d100ff20000), 0}, v16_int8 = {
    0, 0, -14, 15, 16, 61, -8, -1, 0, 0, 0, 0, 0, 0, 0, 0}, v8_int16 = {0, 4082, 15632, -8, 0, 0, 0, 0},
      v4_int32 = {267517952, -508656, 0, 0}, v2_int64 = {-2184660617396224, 0}, uint128 = 18444559413092155392}

(gdb) print $xmm7

    $4 = {v4_float = {0.0351714566, -nan(0x7ffff8), 0, 0}, v2_double = {-nan(0xffff83d100ff2), 0}, v16_int8 = {
    -14, 15, 16, 61, -8, -1, -1, -1, 0, 0, 0, 0, 0, 0, 0, 0}, v8_int16 = {4082, 15632, -8, -1, 0, 0, 0, 0},
      v4_int32 = {1024462834, -8, 0, 0}, v2_int64 = {-33335275534, 0}, uint128 = 18446744040374276082}

(gdb)

According to the IEEE 754 floating-point standard, neither the loaded data nor the multiplied result looks like a proper value. I have no idea what I am missing here.
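As a cross-check on the IEEE 754 reading, here is a minimal sketch that decodes the low 8 bytes of each register by hand. It is written in Python only because that makes the bit fields easy to inspect; the byte lists are copied from the v16_int8 fields in the gdb output above, and the names XMM4_LOW, XMM7_LOW and decode are made up for this illustration.

```python
import math
import struct

# Low 8 bytes of each register, copied from the v16_int8 fields printed by gdb.
# (Hypothetical names; gdb shows the bytes as signed 8-bit integers.)
XMM4_LOW = [0, 0, -14, 15, 16, 61, -8, -1]
XMM7_LOW = [-14, 15, 16, 61, -8, -1, -1, -1]


def decode(signed_bytes):
    """Reinterpret 8 signed bytes (little-endian) as an IEEE 754 double."""
    raw = bytes(b & 0xFF for b in signed_bytes)   # back to unsigned bytes
    bits = int.from_bytes(raw, "little")          # 64-bit pattern
    sign = bits >> 63                             # 1 sign bit
    exponent = (bits >> 52) & 0x7FF               # 11 exponent bits
    fraction = bits & ((1 << 52) - 1)             # 52 fraction bits
    value = struct.unpack("<d", raw)[0]           # the double Python sees
    return raw.hex(), sign, exponent, fraction, value


for name, data in (("xmm4", XMM4_LOW), ("xmm7", XMM7_LOW)):
    raw_hex, sign, exponent, fraction, value = decode(data)
    print(f"{name}: bytes={raw_hex} sign={sign} exponent={exponent:#x} "
          f"fraction={fraction:#x} nan={math.isnan(value)}")
```

Both bit patterns have an all-ones exponent (0x7ff) and a non-zero fraction, which IEEE 754 defines as a NaN, and the fractions match the payloads gdb prints (0x83d100ff20000 and 0xffff83d100ff2). So, at least for these bytes, the NaNs in the output appear to be a faithful decoding of instruction bytes treated as doubles rather than evidence of a failed load.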
