OpenMPI / mpirun or mpiexec with sudo permission

I'm working on code that runs on the Epiphany processor (http://www.parallella.org/), and to run Epiphany code I need sudo privileges for the host-side program. There is no escape from sudo!

Now I need to run this code across several nodes. To do that I'm using MPI, but MPI won't function properly with sudo:

sudo mpirun -n 12 --hostfile hosts -x LD_LIBRARY_PATH=${ELIBS} -x EPIPHANY_HDF=${EHDF} ./hello-mpi.elf

Even a simple program that only does inter-node communication does not work. Every rank comes out as 0 if I use sudo. Communication between threads works, but not across nodes. This is important because I want to divide the workload properly across the cards.

Here is the simple code:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
   int numprocs, rank, namelen;
   char processor_name[MPI_MAX_PROCESSOR_NAME];

   MPI_Init(&argc, &argv);
   MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
   MPI_Comm_rank(MPI_COMM_WORLD, &rank);
   MPI_Get_processor_name(processor_name, &namelen);

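   /* Print host name, rank and job size, matching the output shown below. */
   printf("Hello world from processor %s, rank %d out of %d processors\n", processor_name, rank, numprocs);

   MPI_Finalize();
   return 0;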
}

This code should print a different rank on each node, but it does not do so when run with sudo.

Any help on this would be great.

Here is the output from running the above code without sudo.

mpirun -n 3 --hostfile $MPI_HOSTS ./mpitest

output:

Hello world from processor work1, rank 1 out of 3 processors
Hello world from processor command, rank 0 out of 3 processors
Hello world from processor work2, rank 2 out of 3 processors

This is as expected.

Here is the output from running the above code with sudo.

sudo mpirun -n 3 --hostfile $MPI_HOSTS ./mpitest

output:

Hello world from processor command, rank 0 out of 1 processors
Hello world from processor work1, rank 0 out of 1 processors
Hello world from processor work2, rank 0 out of 1 processors

This is not.

Edit:

I think @Hristo Iliev got the right answer, but I'm not going to be able to test it.

There are 2 answers

Hristo Iliev (best answer)

Short answer: instead of sudo mpirun -n 12 ... ./hello-mpi.elf, the command should be:

mpirun -n 12 ... sudo -E ./hello-mpi.elf

For that to work properly, you have to modify the sudo configuration (via visudo) on all hosts and enable passwordless operation for your user:

username ALL = NOPASSWD:SETENV: /path/to/hello-mpi.elf

This entry allows your user to run sudo /path/to/hello-mpi.elf without authenticating first, which is important since only the standard input of rank 0 is redirected, so the other ranks have no way to receive a password. It also allows you to execute sudo with the -E option so that the special Open MPI variables (OMPI_...) are passed on to the executable. Without those variables in the environment, the executables cannot connect to each other and instead run as singletons.

Long answer: Running mpirun with sudo results in the former being executed with effective user root. The way mpirun creates an MPI job is by first launching the requested number of executables and then waiting for them to get to know each other during the MPI_Init call. Depending on the content of the host list file, mpirun either spawns a child process (for host entries that match the host mpirun is executed on) or starts a process remotely using rsh, ssh or some other mechanism (e.g. many cluster resource management systems have their own mechanisms for that). When the rsh/ssh mechanism is used, since the program runs as root, mpirun attempts to log into the other host(s) as root. This usually fails for one or both of two reasons:

  • the root user cannot log in to the specified host(s) without providing a password, e.g. because public key authentication has not been set up;
  • the root user is not allowed to log in remotely, which has been the default SSH configuration on many Unix systems for many years.

That's why you see rank 0 coming up (it's a local fork()-based spawn) and the other ranks missing. Since enabling remote root login is considered a security risk by many, I would rather go the way described in the short answer.

Another option would be to make hello-mpi.elf owned by root and set the Set UID bit via chmod u+s hello-mpi.elf. Then you won't need sudo at all. This will not work if the filesystem is mounted with the nosuid option or if some other security mechanism is active. Also root-owned suid binaries pose security risks since they always execute with root permissions, no matter what user runs them.

I wonder why you need root permissions in order to talk to the Epiphany board. Is the SDK doing some fancy privileged operations, or is it simply accessing a device file in /dev that is only writable by root? If it's the latter, perhaps the device node could be created with different permissions.

svink

I struggled with this same issue for a while and had to read the whole documentation to find the solution (I'm also working with a Parallella cluster). It's pretty simple: during the installation of Open MPI, you have to add the option --enable-orterun-prefix-by-default when configuring the build:

$ ./configure --prefix=/usr/local --enable-orterun-prefix-by-default