The latest (v1.8.3) OpenMPI documentation specifies that rankfiles must now use the logical cpu IDs reported by hwloc rather than the physical IDs; see the last sentence in the Rankfiles section of the mpirun documentation:
Starting with Open MPI v1.7, all socket/core slot locations are be specified as logical indexes (the Open MPI v1.6 series used physical indexes). You can use tools such as HWLOC’s "lstopo" to find the logical indexes of socket and cores.
I've noticed a few questions on this site (notably this question and the answer to this question) that indicate that one can specify physical cpu ids in an OpenMPI rankfile by prefixing the id with a p. For example:
rank 0=localhost slot=p0
rank 1=localhost slot=p8
rank 2=localhost slot=p1
rank 3=localhost slot=p9
This is meant to request physical cpu id 0 for rank 0, physical cpu id 8 for rank 1, etc.
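For anyone who wants to test this against several OpenMPI versions, here is a minimal sketch of a script that writes out rankfile text in both forms. The helper name `make_rankfile` and its `physical` flag are made up for illustration. The `p` prefix is exactly the syntax in question, so a given OpenMPI version may well reject the output.

```python
def make_rankfile(host, cpu_ids, physical=False):
    """Build rankfile text that pins rank i to cpu_ids[i] on `host`.

    `physical=True` prepends the "p" prefix discussed above; whether mpirun
    accepts that prefix depends on the OpenMPI version (an open question here).
    """
    prefix = "p" if physical else ""
    lines = [
        f"rank {rank}={host} slot={prefix}{cpu}"
        for rank, cpu in enumerate(cpu_ids)
    ]
    return "\n".join(lines) + "\n"


# Reproduce the example above: ranks 0..3 on cpu ids 0, 8, 1, 9.
rankfile_text = make_rankfile("localhost", [0, 8, 1, 9], physical=True)
print(rankfile_text, end="")
```

Calling it with `physical=False` gives the plain form without the prefix, which makes it easy to compare how a particular version treats the two variants.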
I've tried searching for this syntax in the OpenMPI docs to no avail. I also had someone try constructing a rankfile this way in OpenMPI 1.6.4, and he reported that it doesn't work either.
What version(s) of OpenMPI does this syntax work with? Is it documented anywhere? What is the formal syntax?
Thanks to Hristo Iliev for pointing me in the direction of the appropriate code. It seems the function hwloc_base_slot_list_parse appeared in the open-mpi code from version 1.8. Tracing back through the code I arrived at the orte_rmaps_rankfile_parse function, which seems to go back as far as version 1.3. Looking into the history of this function, we find that the following code snippet appears from the version 1.5 branch on in the section parsing the slot list:
So from this I conclude that the answer to my question is that the p notation is supported in OpenMPI versions below 1.5.
Edit: I also found this message in the Open MPI Users mailing list which seems to support my findings.