I have a Raspberry Pi cluster with three nodes. I installed MPI on it and tried to execute the example program cpi, but it fails with the error shown below.
The command executed on the master node:
mpiexec -f machinefile -n 2 mpi-build/examples/cpi
The result:
Process 0 of 2 is on Pi01
Fatal error in PMPI_Reduce: A process has failed,
error stack:PMPI_Reduce(1259)...............:MPI_Reduce(sbuf=0xbebc6630,rbuf=0xbebc6638,count=1, MPI_DOUBLE, MPI_SUM, root=0, MPI_COMM_WORLD) failed
MPIR_Reduce_impl(1071)..........:
MPIR_Reduce_intra(877)..........:
MPIR_Reduce_binomial(184).......:
MPIDI_CH3U_Recvq_FDU_or_AEP(630): Communication error with rank 1
Process 1 of 2 is on Pi02
I've set up SSH keys between the master and each slave, so no password is needed to log in from the master to any slave. However, I did not share SSH keys between the slaves themselves, so a slave that connects to another slave still has to log in with a password.
Programs that only print "hello world" together with the process rank and the name of the host that executed it work properly, but as soon as one process needs to communicate with another, I get the error shown above. What should I do?