MPI reverse probe


Is there a way to check if some processes are waiting on MPI_Recv?

I have a root proc, and some slave processes.

Slave pseudo-code:

while (1) {
    do_some_stuff; // calls MPI_Test and clears unused buffers
    MPI_Recv(buf, ...);
    do_something_with_buf;
    MPI_Isend(buf2, ...); // possibly many sends depending on what was in buf
}

If all slave processes hang on MPI_Recv, then the job is done and I need to break the loop, so I need some way to notify the slave processes that the job is done. Is there any way to do this? I thought there might be something like a reverse probe that checks whether anyone is waiting for a message, instead of checking whether there is a message to receive. I haven't found anything useful, though.

Edit: some more explanation.

I have one root process, which reads a huge file and sends the data it reads to the workers (the rest of the processes). Each worker receives a portion of the data, so it is well distributed (each worker stores roughly the same amount of data). Then the workers start to communicate with each other, sending partial computations. When a worker receives a partial computation, it may produce a lot of new partial results, some of which need to be sent to other workers. The work is done when all workers have nothing to do and there are no more partial results waiting to be received.

There is 1 answer

Ed Smith On

You should be able to avoid the situation where a receive is expected but nothing is sent. In a master-slave type situation, the sending process should always keep track of how much work there is to send. Typically this strategy works with the master keeping track and killing off the slaves once the total is reached...

In terms of functions, the closest equivalent to a probe on the send side may be a non-blocking send, MPI_Isend, which returns a request handle that can be passed to something like MPI_Test. MPI_Test is non-blocking and sets its flag argument to true once the send has completed. You can also call MPI_Wait on the request if you want to block the sending code until the send completes. Note that completion of a standard-mode MPI_Isend only guarantees that the send buffer can be reused, not that the message has been received; the synchronous variant MPI_Issend completes only after the matching receive has started. Using test/wait with unique tags for each send to each process would be a way to do what you want.