I'm working with ScaLAPACK and trying to get used to the BLACS routines, which are essential for using ScaLAPACK.
I've had an elementary course on MPI, so I have a rough idea of MPI_COMM_WORLD and the like, but no deep understanding of how it works internally.
Anyway, I'm trying the following code to say hello using BLACS routines.
program hello_from_BLACS
   use MPI
   implicit none

   integer :: info, nproc, nprow, npcol, &
              myid, myrow, mycol, &
              ctxt, ctxt_sys, ctxt_all

   call BLACS_PINFO(myid, nproc)
   ! get the internal default context
   call BLACS_GET(0, 0, ctxt_sys)
   ! set up a process grid for the process set
   ctxt_all = ctxt_sys
   call BLACS_GRIDINIT(ctxt_all, 'c', nproc, 1)
   call BLACS_BARRIER(ctxt_all, 'A')
   ! set up a process grid of size 3*2
   ctxt = ctxt_sys
   call BLACS_GRIDINIT(ctxt, 'c', 3, 2)
   if (myid .eq. 0) then
      write(6,*) ' myid myrow mycol nprow npcol'
   endif
   call BLACS_BARRIER(ctxt_sys, 'A')   ! (**)
   ! all processes not belonging to 'ctxt' jump to the end of the program
   if (ctxt .lt. 0) goto 1000
   ! get the process coordinates in the grid
   call BLACS_GRIDINFO(ctxt, nprow, npcol, myrow, mycol)
   write(6,*) 'hello from process', myid, myrow, mycol, nprow, npcol
1000 continue
   ! return all BLACS contexts
   call BLACS_EXIT(0)
   stop
end program
The output with 'mpirun -np 10 ./exe' looks like this:
hello from process 0 0 0 3 2
hello from process 4 1 1 3 2
hello from process 1 1 0 3 2
myid myrow mycol nprow npcol
hello from process 5 2 1 3 2
hello from process 2 2 0 3 2
hello from process 3 0 1 3 2
Everything seems to work fine except for the BLACS_BARRIER line, which I marked with (**) in the code. I put that line in so that the output would look like the following, with the title line always printed at the top:
myid myrow mycol nprow npcol
hello from process 0 0 0 3 2
hello from process 4 1 1 3 2
hello from process 1 1 0 3 2
hello from process 5 2 1 3 2
hello from process 2 2 0 3 2
hello from process 3 0 1 3 2
So the questions are:
1) I've tried calling BLACS_BARRIER on 'ctxt_sys', 'ctxt_all', and 'ctxt', but none of them produces output in which the title line is printed first. I've also tried MPI_Barrier(MPI_COMM_WORLD, info), but it didn't work either. Am I using the barriers in the wrong way?
2) In addition, I get SIGSEGV when I call BLACS_BARRIER on 'ctxt' and run mpirun with more than 6 processes. Why does SIGSEGV occur in this case?
Thank you for reading this question.
To answer your two questions (in future it is best to give them separate posts):
1) MPI_Barrier, BLACS_BARRIER and any barrier in any parallel programming methodology I have come across only synchronise the set of processes that actually calls them. However, I/O is not dealt with just by the calling process, but by at least one and quite possibly more processes within the OS which actually handle the I/O request. These are NOT synchronised by your barrier, so the ordering of I/O is not ensured by a simple barrier. The only standard-conforming ways that I can think of to ensure the ordering of the I/O are to have one process do all of the I/O, or to use MPI I/O.
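A minimal sketch of the first of these, with everything funnelled through rank 0 using plain MPI (the program and variable names here are purely illustrative, not from the question's code), might look like this:

   program ordered_output
      use mpi
      implicit none
      integer, parameter :: llen = 80
      integer :: ierr, myid, nproc, i
      character(len=llen) :: line
      character(len=llen), allocatable :: lines(:)

      call MPI_Init(ierr)
      call MPI_Comm_rank(MPI_COMM_WORLD, myid, ierr)
      call MPI_Comm_size(MPI_COMM_WORLD, nproc, ierr)

      ! each process formats its message into a string instead of printing it;
      ! in the real program the BLACS grid coordinates would go in here too
      write(line, '(a,i4)') 'hello from process', myid

      ! collect every message on rank 0, which is the only process that prints
      allocate(lines(nproc))
      call MPI_Gather(line, llen, MPI_CHARACTER, lines, llen, MPI_CHARACTER, &
                      0, MPI_COMM_WORLD, ierr)

      if (myid == 0) then
         write(6,*) 'messages from all processes, in rank order:'
         do i = 1, nproc
            write(6,*) trim(lines(i))
         end do
      end if

      call MPI_Finalize(ierr)
   end program ordered_output

Because only one process ever writes, the header and the per-process lines come out in whatever order rank 0 chooses.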
2) Your second call to BLACS_GRIDINIT creates a context for a 3 by 2 process grid, so holding 6 processes. If you call it with more than 6 processes, only 6 will be returned a valid context; for the others, ctxt should be treated as an uninitialised value. So, for instance, if you call it with 8 processes, 6 will return with a valid ctxt and 2 will return with ctxt having no valid value. If these 2 now try to use ctxt, anything is possible, and in your case you are getting a segfault. You do seem to see that this is an issue, as later you have if (ctxt .lt. 0) goto 1000, but I see nothing in the description of BLACS_GRIDINIT at https://www.netlib.org/blacs/BLACS/QRef.html#BLACS_GRIDINIT that ensures ctxt will be less than zero for non-participating processes: there is no mention of what ctxt will be if the process is not part of the resulting grid. This is the kind of problem I find regularly with the BLACS documentation.
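One possible defensive pattern, assuming (as the netlib description suggests) that BLACS_GRIDINIT builds the grid from the first nprow*npcol processes reported by BLACS_PINFO, is to decide grid membership from the process id rather than from the value of ctxt. A rough sketch:

   ! every process still calls BLACS_GRIDINIT, but membership in the 3x2
   ! grid is decided from myid, never from the possibly undefined ctxt
   call BLACS_PINFO(myid, nproc)
   call BLACS_GET(0, 0, ctxt)
   call BLACS_GRIDINIT(ctxt, 'c', 3, 2)
   if (myid .lt. 3*2) then
      call BLACS_GRIDINFO(ctxt, nprow, npcol, myrow, mycol)
      write(6,*) 'hello from process', myid, myrow, mycol, nprow, npcol
      call BLACS_GRIDEXIT(ctxt)   ! only grid members own a grid context
   end if
   call BLACS_EXIT(0)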
Also, please don't use goto, for your own sake. You WILL regret it later. Use if ... end if instead. I can't remember when I last used goto in Fortran; it may well be over 10 years ago.
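For instance, the tail of the program in the question could be restructured along these lines (in_grid is just an illustrative logical; set it with whatever membership test you settle on, such as the one sketched above):

   logical :: in_grid

   in_grid = (myid .lt. 3*2)   ! assumes the first 6 processes form the grid
   if (in_grid) then
      call BLACS_GRIDINFO(ctxt, nprow, npcol, myrow, mycol)
      write(6,*) 'hello from process', myid, myrow, mycol, nprow, npcol
   end if
   call BLACS_EXIT(0)          ! no goto and no numbered continue label needed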
Finally, good luck in using BLACS! In my experience the documentation is often incomplete, and I would suggest only using those calls that are absolutely necessary for ScaLAPACK, and using MPI, which is much, much better defined, for the rest. It would be so much nicer if ScaLAPACK just worked with MPI nowadays.