Hybrid loop parallelization with MPI_THREAD_MULTIPLE

I am trying to parallelize the classic MPI_Issend/MPI_Irecv halo swap with OpenMP threads and MPI_THREAD_MULTIPLE. That is, each thread sends its section of the main buffer to the left and right neighbors, and each thread is responsible for receiving its section of the buffer from the left and right neighbors.

#pragma omp parallel private(i,tid)
{
    tid = omp_get_thread_num();
    nthreads = omp_get_num_threads();

    // starting position for each thread
    int sizeid = SIZE/nthreads;
    int startid = sizeid*tid;

    int tstep;
    for (tstep = 0; tstep < 5; tstep++) {
        MPI_Irecv(&recvright[startid], sizeid, MPI_INT, right, tid+101, comm, request + tid);
        MPI_Irecv(&recvleft[startid], sizeid, MPI_INT, left, tid+201, comm, request + nthreads + 1 + tid);

        MPI_Issend(&sendleft[startid], sizeid, MPI_INT, left, tid+101, comm, request + nthreads + 2 + tid);
        MPI_Issend(&sendright[startid], sizeid, MPI_INT, right, tid+201, comm, request + nthreads + 3 + tid);

        MPI_Waitall(4*nthreads, request, status);
    }
}

However, I am getting errors at MPI_Waitall. Does anyone know why? What am I doing wrong?

There is 1 answer

Zulan (best answer)

You are calling MPI_Waitall on all requests, from all threads. That includes requests that have not been started yet and requests that have already been completed by other threads. Make sure to wait for each request only once; in your case, in the thread that initiates the non-blocking communication.

By the way, your request indexing is also wrong: the offsets of different threads overlap. Instead of request + nthreads + 2 + tid you probably want request + nthreads * 2 + tid (and likewise nthreads * 1 and nthreads * 3 for the other two offsets). However, it would be much cleaner to give each thread its own local MPI_Request[4] array and wait on that, which also fixes the initial issue.
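
To see the overlap concretely, here is a small sketch in plain C (no MPI needed) that counts how many of the 4 * nthreads requests land on a slot that another request already uses. The helper names original_slot, strided_slot and count_collisions are made up for this illustration; original_slot reproduces the offsets from the question, and strided_slot is the nthreads * k + tid layout.

```c
#include <stdlib.h>

typedef int (*slot_fn)(int k, int tid, int nthreads);

/* Offsets as written in the question: call k = 0 uses tid,
   calls k = 1..3 use nthreads + k + tid. */
static int original_slot(int k, int tid, int nthreads)
{
    return (k == 0) ? tid : nthreads + k + tid;
}

/* Non-overlapping layout: call k owns slots [k*nthreads, (k+1)*nthreads). */
static int strided_slot(int k, int tid, int nthreads)
{
    return k * nthreads + tid;
}

/* Returns how many (call, thread) pairs hit a slot that is already taken,
   or -1 if the scratch array cannot be allocated. */
static int count_collisions(slot_fn slot, int nthreads)
{
    int nslots = 4 * nthreads + 4;  /* large enough for both layouts */
    int *used = calloc((size_t)nslots, sizeof *used);
    int collisions = 0;

    if (used == NULL)
        return -1;

    for (int k = 0; k < 4; k++) {
        for (int tid = 0; tid < nthreads; tid++) {
            int s = slot(k, tid, nthreads);
            if (used[s]++ > 0)
                collisions++;
        }
    }
    free(used);
    return collisions;
}
```

With 4 threads, count_collisions(original_slot, 4) returns 6 and slot 4 is never written at all, while count_collisions(strided_slot, 4) returns 0.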

See also https://stackoverflow.com/a/17591795/620382