Calling BLACS with more processes than used

I want to create a parallel program that makes heavy use of SCALAPACK. SCALAPACK is built on BLACS, which in turn relies on MPI for interprocess communication.

I want to start the program with a defined number of processes (e.g. the number of cores on the machine) and let the algorithm decide how to use these processes for the calculations.

As a test case I wanted to use 10 processes. Nine of these processes should be arranged in a grid (BLACS_GRIDINIT), and the 10th process should wait until the other processes are finished.

Unfortunately, OpenMPI crashes because the last process never obtains an MPI context from BLACS, while the others do.

Question: What is the correct way to use BLACS with more processes than needed?

I did some experiments with additional MPI_INIT and MPI_FINALIZE calls, but none of my attempts were successful.


I started with the sample code from Intel MKL (shortened a little bit):

      PROGRAM HELLO 
*     -- BLACS example code --
*     Written by Clint Whaley 7/26/94 
*     Performs a simple check-in type hello world 
*     .. 
*     .. External Functions ..
      INTEGER BLACS_PNUM
      EXTERNAL BLACS_PNUM 
*     .. 
*     .. Variable Declaration ..
      INTEGER CONTXT, IAM, NPROCS, NPROW, NPCOL, MYPROW, MYPCOL
      INTEGER ICALLER, I, J, HISROW, HISCOL 

*     Determine my process number and the number of processes in 
*     machine 
      CALL BLACS_PINFO(IAM, NPROCS) 

*     Set up process grid that is as close to square as possible 
      NPROW = INT( SQRT( REAL(NPROCS) ) )
      NPCOL = NPROCS / NPROW 

*     Get default system context, and define grid
      CALL BLACS_GET(0, 0, CONTXT)
      CALL BLACS_GRIDINIT(CONTXT, 'Row', NPROW, NPCOL)
      CALL BLACS_GRIDINFO(CONTXT, NPROW, NPCOL, MYPROW, MYPCOL) 

*     If I'm not in grid, go to end of program 
      IF ( (MYPROW.LT.0) .OR. (MYPROW.GE.NPROW) .OR.
     $     (MYPCOL.LT.0) .OR. (MYPCOL.GE.NPCOL) ) GOTO 30

*     Get my process ID from my grid coordinates 
      ICALLER = BLACS_PNUM(CONTXT, MYPROW, MYPCOL) 

*     If I am process {0,0}, receive check-in messages from 
*     all nodes 
      IF ( (MYPROW.EQ.0) .AND. (MYPCOL.EQ.0) ) THEN

         WRITE(*,*) ' '

         DO 20 I = 0, NPROW-1
            DO 10 J = 0, NPCOL-1

               IF ( (I.NE.0) .OR. (J.NE.0) ) THEN
                  CALL IGERV2D(CONTXT, 1, 1, ICALLER, 1, I, J)
               END IF
*              Make sure ICALLER is where we think in process grid
               CALL BLACS_PCOORD(CONTXT, ICALLER, HISROW, HISCOL)
               IF ( (HISROW.NE.I) .OR. (HISCOL.NE.J) ) THEN
                  WRITE(*,*) 'Grid error!  Halting . . .'
                  STOP
               END IF
               WRITE(*, 3000) I, J, ICALLER
   10       CONTINUE
   20    CONTINUE
         WRITE(*,*) ' '
         WRITE(*,*) 'All processes checked in.  Run finished.'

*     All processes but {0,0} send process ID as a check-in
      ELSE
         CALL IGESD2D(CONTXT, 1, 1, ICALLER, 1, 0, 0)
      END IF

30    CONTINUE

      CALL BLACS_EXIT(0)

1000  FORMAT('How many processes in machine?')
2000  FORMAT(I6)
3000  FORMAT('Process {',I2,',',I2,'} (node number =',I6,
     $       ') has checked in.')

      STOP
      END

Update: I investigated the BLACS source code to see what happens there.

The call to BLACS_PINFO initializes the MPI context via MPI_INIT if that hasn't already happened. This means that at this point everything works as expected.

At the end, the call to BLACS_EXIT(0) should free all BLACS resources, and because the argument is 0, it should also call MPI_FINALIZE. Unfortunately, this doesn't work as expected: my last process never calls MPI_FINALIZE.

As a workaround, one could query MPI_FINALIZED and call MPI_FINALIZE if necessary.
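
A minimal sketch of that workaround (assuming, as observed above, that BLACS_EXIT(0) returns on the affected process without finalizing MPI):

      INCLUDE 'mpif.h'
      LOGICAL FINI
      INTEGER IERR
*     ...
      CALL BLACS_EXIT(0)
*     BLACS_EXIT(0) should already have finalized MPI; check
*     whether it actually did and finish the job ourselves if not
      CALL MPI_FINALIZED(FINI, IERR)
      IF (.NOT. FINI) CALL MPI_FINALIZE(IERR)

MPI_FINALIZED is one of the few MPI routines that may be called after MPI_FINALIZE, so this check is also safe on the processes where BLACS_EXIT(0) did work. Alternatively, one could call BLACS_EXIT(1), which by design leaves MPI running, and then call MPI_FINALIZE unconditionally.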

Update 2: My previous attempts were made with Intel Studio 2013.0.079 and OpenMPI 1.6.2 on SUSE Linux Enterprise Server 11.

After reading ctheo's answer, I tried to compile this example with the tools shipped with Ubuntu 12.04 (gfortran 4.6.3, OpenMPI 1.4.3, BLACS 1.1) and was successful.

My conclusion is that Intel's implementation appears to be buggy. I will retry this example in the near future with the newest service release of Intel Studio, but I don't expect any changes.

However, I would appreciate any other (and maybe better) solution.

There are 2 answers

Wesley Bland

I don't know the answer, and I would hazard a guess that the overlap between the set of people who participate in SO and the set of people who know the answer to your question is < 1. However, I'd suggest that you might have slightly better luck asking on scicomp or by contacting the ScaLAPACK team at the University of Tennessee directly through their support page. Good luck!

ztik

I don't think that you need to do much to use fewer processes in SCALAPACK. The BLACS_PINFO subroutine returns the total number of processes. If you want to use one fewer, just do NPROCS = NPROCS - 1. I used your sample code (fixing some typos in the FORMAT statements), added the subtraction, and got the following output:

$ mpirun -n 4 ./a.out 

Process { 0, 0} (node number = 0) has checked in.
Process { 0, 1} (node number = 1) has checked in.
Process { 0, 2} (node number = 2) has checked in.

 All processes checked in.  Run finished.

BLACS_GRIDINIT creates a grid with the reduced NPROCS. When BLACS_GRIDINFO is called afterwards, the one leftover process gets MYPROW = MYPCOL = -1, marking it as outside the grid.
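
A minimal sketch of the relevant modification to the question's code (label 30 is the skip-to-end label from the original listing):

      CALL BLACS_PINFO(IAM, NPROCS)
*     Hold one process back from the grid
      NPROCS = NPROCS - 1
      NPROW = INT( SQRT( REAL(NPROCS) ) )
      NPCOL = NPROCS / NPROW
      CALL BLACS_GET(0, 0, CONTXT)
      CALL BLACS_GRIDINIT(CONTXT, 'Row', NPROW, NPCOL)
      CALL BLACS_GRIDINFO(CONTXT, NPROW, NPCOL, MYPROW, MYPCOL)
*     The leftover process reports coordinates of -1 and can
*     skip straight to the end of the program
      IF ( (MYPROW.LT.0) .OR. (MYPCOL.LT.0) ) GOTO 30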

On the other hand, if you want to create multiple grids that use different processes, then you should probably use the BLACS_GRIDMAP subroutine. The sample code below creates two equal grids, each containing half of the total processes.

      PROGRAM HELLO 
*     .. 
      INTEGER CONTXT(2), IAM, NPROCS, NPROW, NPCOL, MYPROW, MYPCOL
      INTEGER ICALLER, I, J, IP
      INTEGER UMAP(2,10,10)
*     
      CALL BLACS_PINFO(IAM, NPROCS) 
      NPROCS = NPROCS/2
*     
      NPROW = INT( SQRT( REAL(NPROCS) ) )
      NPCOL = NPROCS / NPROW 
*     
      DO IP = 1, 2
        DO I = 1, NPROW
          DO J = 1, NPCOL
            UMAP(IP,I,J) = (IP-1)*NPROCS+(I-1)*NPCOL+(J-1)
          ENDDO
        ENDDO
        CALL BLACS_GET(0, 0, CONTXT(IP))
        CALL BLACS_GRIDMAP(CONTXT(IP), UMAP(IP,:,:), 10, NPROW, NPCOL )
      ENDDO
*
      DO IP = 1, 2
        CALL BLACS_GRIDINFO(CONTXT(IP), NPROW, NPCOL, MYPROW, MYPCOL) 
        IF(MYPROW.GE.0 .AND. MYPCOL.GE.0 ) THEN
          WRITE(*,1000) IAM, MYPROW, MYPCOL, IP
        END IF
      ENDDO

      CALL BLACS_EXIT(0)
 1000 FORMAT('Process ',I2,' is (',I2,',',I2,') of grid ',I2)
*
      STOP
      END

I got the following output:

$ mpirun -n 8 ./a.out 
Process  0 is ( 0, 0) of grid  1
Process  1 is ( 0, 1) of grid  1
Process  2 is ( 1, 0) of grid  1
Process  3 is ( 1, 1) of grid  1
Process  4 is ( 0, 0) of grid  2
Process  5 is ( 0, 1) of grid  2
Process  6 is ( 1, 0) of grid  2
Process  7 is ( 1, 1) of grid  2

Note that I did not collect the output at process zero, so the ordering of the lines is not deterministic; the nicely sorted output above is what you can get when all processes run on the same machine.
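
If deterministic output is wanted, the check-in pattern from the question's listing can be reused inside each grid. A sketch, assuming it replaces the WRITE inside the DO IP loop above (ICALLER receives the sent process number, as in the first listing):

*     Let {0,0} of each grid gather and print the check-ins
      IF ( MYPROW.EQ.0 .AND. MYPCOL.EQ.0 ) THEN
        WRITE(*,1000) IAM, MYPROW, MYPCOL, IP
        DO I = 0, NPROW-1
          DO J = 0, NPCOL-1
            IF ( I.NE.0 .OR. J.NE.0 ) THEN
              CALL IGERV2D(CONTXT(IP), 1, 1, ICALLER, 1, I, J)
              WRITE(*,1000) ICALLER, I, J, IP
            END IF
          ENDDO
        ENDDO
      ELSE IF ( MYPROW.GE.0 .AND. MYPCOL.GE.0 ) THEN
*       Every other grid member sends its process number to {0,0}
        CALL IGESD2D(CONTXT(IP), 1, 1, IAM, 1, 0, 0)
      END IF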