I'm working on a school project using MPI. I use MPICH2 and write my code in Fortran. I run it on my school's server, which has multiple computing slots, each consisting of several cores. To speed up the code, I distribute sub-jobs across the cores and gather the results using MPI_Gather. However, some cores never return a value, as if they were trapped in an infinite loop, and some cores never reach the first MPI_Barrier. There is no infinite loop in the sub-jobs, and a serial version of the same code works fine. My code is below.
call MPI_COMM_RANK(MP_LIBRARY_WORLD, rank, ierr)
call MPI_COMM_SIZE(MP_LIBRARY_WORLD, numtasks, ierr)
loop_min=int(rank*ceiling(float(point_num)/float(numtasks)))+1
loop_max=int((rank+1)*ceiling(float(point_num)/float(numtasks)))
do ind=loop_min,loop_max,1
    if (ind>point_num) then
        exit
    end if
    current_wealth_dist=total_grid(:,ind)
    if (current_wealth_dist(1)==0.) then
        call X_init_aiyagarizero(sendbuf(2:))
    else
        call X_init_aiyagari(sendbuf(2:))
    end if
    sendbuf(1)=ind
    call MPI_GATHER(sendbuf,number_plc_function+1,MPI_REAL,recvbuf,number_plc_function+1,MPI_REAL,0,MP_LIBRARY_WORLD,ierr)
    !print *, "Point", ind, "Finished"
end do
print *,rank, "work finished"
call MPI_BARRIER(MP_LIBRARY_WORLD,ierr)
print *, "After the First Barrier"
call MPI_Bcast(recvbuf,(number_plc_function+1)*point_num,MPI_REAL,0,MP_LIBRARY_WORLD,ierr)
print *, rank, "Finish Broadcast"
call MPI_BARRIER(MP_LIBRARY_WORLD,ierr)
do iter=1,point_num
    do jter=1,4
        init_policy_functions(int(recvbuf(1,iter)),:,jter)=recvbuf(2:,iter)
    end do
end do
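
The loop bounds above decide how many times each rank calls MPI_GATHER, so it may help to check that split on its own. The following is a minimal standalone sketch (no MPI required) that reuses the same loop_min/loop_max formula and counts the iterations each rank would run before the `ind>point_num` exit. The program name and the values point_num = 10 and numtasks = 4 are assumptions chosen for illustration, not the values from the real run.

```fortran
program gather_call_count
  implicit none
  ! Hypothetical values for illustration only; not taken from the real run.
  integer, parameter :: point_num = 10
  integer, parameter :: numtasks = 4
  integer :: rank, ind, loop_min, loop_max, calls

  ! Simulate each rank in turn instead of querying MPI_COMM_RANK.
  do rank = 0, numtasks - 1
    ! Same bounds formula as in the posted code.
    loop_min = int(rank*ceiling(float(point_num)/float(numtasks))) + 1
    loop_max = int((rank + 1)*ceiling(float(point_num)/float(numtasks)))
    calls = 0
    do ind = loop_min, loop_max
      if (ind > point_num) exit
      calls = calls + 1   ! one MPI_GATHER call would be made here
    end do
    print '(4(a,i0))', 'rank ', rank, ': ind ', loop_min, '..', loop_max, &
          ', gather calls = ', calls
  end do
end program gather_call_count
```

With these assumed values, ranks 0 to 2 each count three calls while rank 3 counts only one. Since MPI_Gather is a collective operation that every rank in the communicator has to take part in, unequal counts like these seem worth comparing against the actual point_num and number of processes.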