I am porting a function from inline assembly to MASM in Visual Studio 2013 and am having trouble getting a return value out of it.
Here is the C caller and the assembly function prototype:
extern "C" void AbsMax(__m128d* samples, int len, __m128d* pResult);
__m128d AbsMax(__m128d* samples, int len)
{
    __m128d absMax = { 0, 0 };
    AbsMax(samples, len, &absMax);
    return absMax;
}
And the assembly function:
.686 ;Target processor. Use instructions for Pentium class machines
.xmm
.model flat, c ;Use the flat memory model. Use C calling conventions
.code ;Indicates the start of a code segment.
AbsMax proc samples:PTR DWORD, len:DWORD, result:PTR XMMWORD
    ;; Load up registers. xmm0 is min, xmm1 is max. L is Ch0, H is Ch1.
    mov ecx, [len]
    shl ecx, 4
    mov esi, [samples]
    lea esi, [esi+ecx]
    neg ecx
    pxor xmm0, xmm0
    pxor xmm1, xmm1
    ALIGN 16
_loop:
    movaps xmm2, [esi+ecx]
    add ecx, 16
    minpd xmm0, xmm2
    maxpd xmm1, xmm2
    jne _loop
    ;; Store larger of -min and max for each channel. xmm2 is -min.
    pxor xmm2, xmm2
    subpd xmm2, xmm0
    maxpd xmm1, xmm2
    movaps [result], xmm1 ; <=== access violation here
    xor eax, eax
    xor ebx, ebx
    ret
AbsMax ENDP
END
As I understand the convention for MASM, return values are normally returned through the EAX register. However, since I'm trying to return a 128-bit value, I'm assuming an out parameter is the way to go. As you can see in the assembly listing, assigning the out parameter (movaps [result]) is causing an access violation (Access violation reading location 0x00000000). I've validated the address of result in the debugger and it looks fine.
What am I doing wrong?
For educational purposes, I wrote up a version of your function that uses intrinsics:
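A minimal sketch of what such an intrinsics version might look like, assuming SSE2 and 16-byte-aligned buffers. The name AbsMaxIntrinsics is hypothetical, and this is not necessarily the exact code that was compiled; it only mirrors the min/max logic and the out-parameter signature from the question:

```cpp
#include <emmintrin.h>  // SSE2 intrinsics: __m128d, _mm_min_pd, _mm_max_pd

// Hypothetical name; same parameter order as the AbsMax prototype in the question.
// Assumes samples and pResult are 16-byte aligned and len >= 0.
extern "C" void AbsMaxIntrinsics(const __m128d* samples, int len, __m128d* pResult)
{
    __m128d vmin = _mm_setzero_pd();  // running per-channel minimum (xmm0 in the asm)
    __m128d vmax = _mm_setzero_pd();  // running per-channel maximum (xmm1 in the asm)

    for (int i = 0; i < len; ++i) {
        vmin = _mm_min_pd(vmin, samples[i]);
        vmax = _mm_max_pd(vmax, samples[i]);
    }

    // Store the larger of -min and max for each channel.
    __m128d negMin = _mm_sub_pd(_mm_setzero_pd(), vmin);
    *pResult = _mm_max_pd(vmax, negMin);
}
```

Unlike the assembly loop, this sketch also tolerates len == 0, in which case the result is { 0, 0 }.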
Then I compiled it with the VC++ x64 compiler using
cl /c /O2 /FA absmax.cpp
to generate an assembly listing (edited to remove line comments):

Noting that x64 uses a __fastcall convention by default, and shadows the parameters on the stack, I see that the out parameter is in fact being written indirectly through r8, which is the third integer parameter for x64 code, per MSDN. I think if your assembly code adopts this parameter convention, it will work.

The shadowed stack space is not initialized with the actual parameter values; it's intended for callees, if they need a place to stash the values while using the registers. That's why you're getting a zero value dereference error in your code. There's a calling convention mismatch. The debugger knows about the calling convention, so it can show you the enregistered value for the parameter.