SSE/SSE2 double matrix float vector multiplication


I have to implement matrix-vector multiplication using SSE/SSE2. The vector and the matrix are large. The matrix is double, the vector is float.

The point is that I have to do all calculations on floats: when I get data from the matrix I promote it to float, do the calculations, and get a float vector. Later, after some additional calculations on floats, I have to add some float values (a float matrix) to double values (a double matrix).

My question is how I can do it using SSE/SSE2. The problem is with the doubles: I have a double* pointer and I have to somehow convert 4 doubles into 4 floats to fit in an __m128. Are there any instructions to do that?


There are 2 answers

2
Paul R On BEST ANSWER

You need to call __m128 _mm_cvtpd_ps (__m128d a) (CVTPD2PS) twice to get two single precision float vectors, each containing two of your original double precision values in its low two elements (the upper two elements are zeroed), then merge these two float vectors into a single vector, using e.g. __m128 _mm_shuffle_ps(__m128 a, __m128 b, unsigned int imm8) (SHUFPS).
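A minimal sketch of that approach in C, in case it helps. The function names load4_pd_as_ps, hsum_ps and matvec_ps are made up for illustration, the matrix is assumed to be row-major, and cols is assumed to be a multiple of 4 to keep the loop short. Unaligned loads are used because nothing is known about the alignment of your data.

```c
#include <assert.h>
#include <stddef.h>
#include <emmintrin.h>  /* SSE2 intrinsics (also pulls in the SSE ones) */

/* Hypothetical helper: load 4 doubles and return them as one vector of 4 floats. */
static __m128 load4_pd_as_ps(const double *src)
{
    __m128 lo = _mm_cvtpd_ps(_mm_loadu_pd(src));      /* { s0, s1, 0, 0 } */
    __m128 hi = _mm_cvtpd_ps(_mm_loadu_pd(src + 2));  /* { s2, s3, 0, 0 } */
    /* Take the low two elements of lo, then the low two elements of hi. */
    return _mm_shuffle_ps(lo, hi, _MM_SHUFFLE(1, 0, 1, 0));
}

/* Hypothetical helper: sum the 4 elements of v using SSE-only instructions. */
static float hsum_ps(__m128 v)
{
    __m128 s = _mm_add_ps(v, _mm_movehl_ps(v, v));             /* { v0+v2, v1+v3, .. } */
    __m128 t = _mm_shuffle_ps(s, s, _MM_SHUFFLE(1, 1, 1, 1));  /* broadcast element 1 */
    return _mm_cvtss_f32(_mm_add_ss(s, t));
}

/* y = A * x, with A a row-major double matrix converted to float on the fly.
   Assumption for brevity: cols is a multiple of 4. */
static void matvec_ps(const double *A, const float *x, float *y,
                      size_t rows, size_t cols)
{
    assert(cols % 4 == 0);
    for (size_t i = 0; i < rows; ++i) {
        const double *row = A + i * cols;
        __m128 acc = _mm_setzero_ps();
        for (size_t j = 0; j < cols; j += 4) {
            __m128 a = load4_pd_as_ps(row + j);
            acc = _mm_add_ps(acc, _mm_mul_ps(a, _mm_loadu_ps(x + j)));
        }
        y[i] = hsum_ps(acc);
    }
}
```

_mm_movelh_ps(lo, hi) should give the same merged vector as the shuffle here, so either can be used for the merge step.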

4
Jeremiah Willcock On

Changing from double to float is reducing the level of precision, not increasing it. For more accuracy, you should do the computations on doubles (promoting the vector to that type), then possibly cast the result back down to float afterwards. The instructions you need for conversion are cvtps2pd (float to double) and/or cvtpd2ps (double to float). Those only convert two values at a time (since only two doubles fit into an SSE register), so you will need to do your conversion in two parts.
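A rough sketch of the double-precision variant for one matrix row, assuming SSE2 only. The name dot_row_pd is hypothetical, and n is assumed to be a multiple of 4 to keep the loop short. Each group of 4 floats from the vector is widened in two halves, as described above, and the products are accumulated in double.

```c
#include <assert.h>
#include <stddef.h>
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Hypothetical helper: dot product of a double row with a float vector,
   computed entirely in double precision.
   Assumption for brevity: n is a multiple of 4. */
static double dot_row_pd(const double *row, const float *x, size_t n)
{
    assert(n % 4 == 0);
    __m128d acc0 = _mm_setzero_pd();
    __m128d acc1 = _mm_setzero_pd();
    for (size_t j = 0; j < n; j += 4) {
        __m128 xf = _mm_loadu_ps(x + j);
        /* cvtps2pd converts only the low two floats, so do it in two parts. */
        __m128d xlo = _mm_cvtps_pd(xf);                     /* x[j],   x[j+1] */
        __m128d xhi = _mm_cvtps_pd(_mm_movehl_ps(xf, xf));  /* x[j+2], x[j+3] */
        acc0 = _mm_add_pd(acc0, _mm_mul_pd(_mm_loadu_pd(row + j), xlo));
        acc1 = _mm_add_pd(acc1, _mm_mul_pd(_mm_loadu_pd(row + j + 2), xhi));
    }
    /* Horizontal sum of the two accumulators. */
    __m128d acc = _mm_add_pd(acc0, acc1);
    __m128d high = _mm_unpackhi_pd(acc, acc);
    return _mm_cvtsd_f64(_mm_add_sd(acc, high));
}
```

If the result has to end up as a float, casting the returned double once per row keeps the rounding to a single step at the end.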