Segfault in a simple MPI code during Remote Memory Access (RMA) window creation


I am working on an MPI code in which I am trying to use one-sided communication (RMA). I create a window with MPI_Win_create. The code runs and produces correct results, but one process still exits with a segmentation fault, and I am unable to figure out why.

The code I am working on is quite large, but I can reproduce the same error with the following minimal example.

#include <stdio.h>
#include <mpi.h>
#include <boost/mpi.hpp>
#include <boost/mpi/collectives.hpp>
#include <boost/serialization/serialization.hpp>
#include <boost/serialization/vector.hpp>

int main(int argc, char *argv[])
{
    // boost::mpi::environment calls MPI_Init here and MPI_Finalize on destruction
    boost::mpi::environment env(argc, argv);
    boost::mpi::communicator world;
    printf("init\n");

    // Local buffer to expose through the RMA window
    int *arr = new int[100];
    for (int i = 0; i < 100; i++)
    {
        arr[i] = i + world.rank() * 100;
    }

    MPI_Win win;
    printf("create window\n");
    // Expose arr as a window of 100 ints, with a displacement unit of sizeof(int)
    MPI_Win_create(arr, MPI_Aint(100 * sizeof(int)), sizeof(int),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);
    MPI_Win_free(&win);
    printf("done\n");

    delete[] arr;
    return 0;
}

All the print statements are printed correctly by every process, but one process still exits with a segfault. This is the error output I got:

[gpu023:32302:0:32318] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x2b194396e760)
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 0 on node gpu023 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------

Can anyone please help me out with this? I have been struggling to find the cause. I am running this code with 64 processes across 2 nodes. The code runs correctly, without any errors, if I remove the three window-related statements (the MPI_Win declaration, MPI_Win_create, and MPI_Win_free).
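
In case it helps narrow things down, below is a variant of the reproducer that uses MPI_Win_allocate so that MPI allocates and owns the window memory, instead of exposing my own new[] buffer through MPI_Win_create. I have not verified whether this variant avoids the crash; it is only a sketch of the experiment I have in mind.

#include <stdio.h>
#include <mpi.h>
#include <boost/mpi.hpp>

int main(int argc, char *argv[])
{
    boost::mpi::environment env(argc, argv);
    boost::mpi::communicator world;

    int *arr = nullptr;   // will point to memory allocated by MPI
    MPI_Win win;
    // Let MPI allocate the window memory instead of attaching a new[] buffer
    MPI_Win_allocate(MPI_Aint(100 * sizeof(int)), sizeof(int),
                     MPI_INFO_NULL, MPI_COMM_WORLD, &arr, &win);

    for (int i = 0; i < 100; i++)
    {
        arr[i] = i + world.rank() * 100;
    }

    MPI_Win_free(&win);   // also releases the buffer allocated by MPI_Win_allocate
    printf("done\n");
    return 0;
}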
