Communication in MPMD MPI executions


This post is related to a previous post, "binding threads to certain MPI processes", which asked how MPI ranks could be assigned different numbers of OpenMP threads. One possibility is as follows:

$ mpiexec <global parameters>
          -n n1 <local parameters> executable_1 <args1> :
          -n n2 <local parameters> executable_2 <args2> :
          ...
          -n nk <local parameters> executable_k <argsk>

What I don't know is how the independent instances executable_1, executable_2, ..., executable_k communicate with each other. If at some point during execution they need to exchange data, do they use an inter-communicator (between instances) and an intra-communicator (within the same instance, for example executable_1)?

Thanks.

Answer by Hristo Iliev:

All processes launched as a result of that command form a single MIMD/MPMD MPI job, i.e. they share the same world communicator. The first n1 ranks are running executable_1, the following n2 ranks are running executable_2, etc.

                   rank                 |  executable
----------------------------------------+---------------
                  0 .. n1-1             |  executable_1
                 n1 .. n1+n2-1          |  executable_2
              n1+n2 .. n1+n2+n3-1       |  executable_3
                   ....                 |      ....
 n1+n2+n3+..+n(k-1) .. n1+n2+n3+..+nk-1 |  executable_k
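
The table above amounts to a running sum over the -n values. As a minimal sketch in plain C (no MPI calls), the hypothetical helper executable_index_for_rank below maps a world rank to the 0-based position of its executable on the mpiexec line. It assumes the counts n1, ..., nk are already known to the caller; the function name and the counts array are illustrative and not part of MPI.

```c
#include <stddef.h>

/* Hypothetical helper, not an MPI function.
   counts[i] is the -n value given for the (i+1)-th executable.
   Returns the 0-based index of the executable that runs the given
   world rank, or -1 if the rank lies outside 0 .. sum(counts)-1. */
int executable_index_for_rank(const int *counts, size_t k, int rank)
{
    int first = 0;                      /* first world rank of block i */

    if (rank < 0)
        return -1;

    for (size_t i = 0; i < k; ++i) {
        /* block i covers ranks first .. first + counts[i] - 1 */
        if (rank < first + counts[i])
            return (int)i;
        first += counts[i];
    }
    return -1;                          /* rank >= n1 + n2 + ... + nk */
}
```

For example, with counts {2, 3, 4}, ranks 0 and 1 map to index 0, ranks 2 to 4 map to index 1, and ranks 5 to 8 map to index 2.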

Communication happens simply by sending messages in MPI_COMM_WORLD. The separate executables do not automatically form communicator groups of their own. This is what distinguishes MPMD from starting child jobs with MPI_Comm_spawn: child jobs have their own world communicators and one uses intercommunicators to talk to them, whereas the separate sub-jobs in an MIMD/MPMD job share a single world communicator and need no intercommunicator.

A rank can still find out which application context it belongs to by querying the MPI_APPNUM attribute of MPI_COMM_WORLD (the contexts are the commands separated by : on the mpiexec line). This makes it possible to create a separate sub-communicator for each context by simply performing a split with the appnum value as colour:

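int *appnum, present;

/* The attribute value comes back as a pointer to an int. */
MPI_Comm_get_attr(MPI_COMM_WORLD, MPI_APPNUM, &appnum, &present);
if (!present)
{
   printf("MPI_APPNUM is not provided!\n");
   MPI_Abort(MPI_COMM_WORLD, 1);   /* non-zero error code signals failure */
}

/* Ranks with the same appnum (the same application context) end up in
   the same appcomm; the key 0 keeps them ordered by their world rank. */
MPI_Comm appcomm;
MPI_Comm_split(MPI_COMM_WORLD, *appnum, 0, &appcomm);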