I have a union that looks like this:
#include <immintrin.h>

union bareVec8f {
    __m256 m256; // AVX vector of 8 floats
    float floats[8];
    int ints[8];
    inline bareVec8f() {
    }
    inline bareVec8f(__m256 vec) {
        this->m256 = vec;
    }
    inline bareVec8f &operator=(__m256 m256) {
        this->m256 = m256;
        return *this;
    }
    inline operator __m256 &() {
        return m256;
    }
};
The __m256 member needs to be aligned on a 32-byte boundary to be used with AVX functions, and it should be aligned automatically, even within the union.
When I do this:
    bareVec8f test = _mm256_set1_ps(1.0f);
I get a segmentation fault. This code should work because of the constructor I made. However, when I do this:
    bareVec8f test;
    test.m256 = _mm256_set1_ps(8.f);
I do not get a segmentation fault.
Since that works fine, the union is probably aligned properly; the segmentation fault seems to be caused by the constructor.
I'm using the 64-bit GCC compiler on Windows.
EDIT: Matt managed to produce the simplest example of the error that seems to be happening here:
    #include <immintrin.h>
    void foo(__m256 x) {}
    int main()
    {
        __m256 r = _mm256_set1_ps(0.0f);
        foo(r);
    }
I'm compiling with -std=c++11 -mavx.
This is a bug in g++ for Windows. It does not perform 32-byte stack alignment when it should. See Bug 49001 and Bug 54412.
On this SO thread someone made a Python script to process the assembly output by g++ to fix the problem, so that would be one option.
Otherwise, to avoid this in your union, you could make the functions which take __m256 by value take it by reference instead. This shouldn't have any performance penalty unless optimization is low/off.
In case you are unaware: union aliasing causes undefined behaviour in C++. It's not permitted to write m256 and then read floats or ints, for example. So perhaps there is a different solution to your problem.
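As a rough sketch of both suggestions (not a drop-in replacement for your union), the snippet below reads the bits of a float through std::memcpy instead of a union, and takes the vector by const reference rather than by value. The names floatBits and storeLanes are made up for illustration, and the AVX part is only compiled when the compiler defines __AVX__ (for example with -mavx). It assumes a 32-bit IEEE 754 float.

```cpp
#include <cstdint>
#include <cstring>
#ifdef __AVX__
#include <immintrin.h>
#endif

static_assert(sizeof(float) == sizeof(std::uint32_t), "expects a 32-bit float");

// Hypothetical helper: returns the raw bit pattern of a float.
// std::memcpy is the well-defined way to reinterpret bytes in C++,
// unlike writing one union member and reading another.
inline std::uint32_t floatBits(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

#ifdef __AVX__
// Hypothetical helper: copies the 8 lanes of vec into out.
// vec is taken by const reference, so the callee does not receive
// a by-value copy of the vector.
// _mm256_storeu_ps is the unaligned store, so out does not need
// to be 32-byte aligned.
inline void storeLanes(const __m256 &vec, float (&out)[8]) {
    _mm256_storeu_ps(out, vec);
}
#endif
```

With this, you would call storeLanes(vec, lanes) to get at the individual floats, and floatBits(lanes[i]) if you need the integer view of a lane. Whether the by-reference version sidesteps the crash still depends on the compiler inlining the call, so treat it as a workaround to try rather than a guarantee.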