Conflict between IMSL and MPI


I am trying to divide my Fortran code into several parts, and I want to parallelize each part using MPI. In each part I use the IMSL library to solve an optimization problem (with BCONF). However, I find that the IMSL library has its own MPI-related subroutines and does not let me call the standard MPI startup subroutine "CALL MPI_INIT(ierror)": the call just raises a fatal error and ends the program.

I give two examples to illustrate the issue.

Example 1: print "Hello, world!" from each process:

program main
  use mpi
  implicit none
  integer ( kind = 4 ) error
  integer ( kind = 4 ) id
  integer ( kind = 4 ) p

  call MPI_Init ( error )
  call MPI_Comm_size ( MPI_COMM_WORLD, p, error )
  call MPI_Comm_rank ( MPI_COMM_WORLD, id, error )
  write ( *, * ) ' Process ', id, ' says "Hello, world!"'
  call MPI_Finalize ( error )
end program

When I compile and run it without the IMSL library, it gives the correct output:

mpif90 -o a.out hello_mpi.f90
mpiexec -n 4 ./a.out

Process 3 says "Hello, world!"
Process 0 says "Hello, world!"
Process 2 says "Hello, world!"
Process 1 says "Hello, world!"

Now, if I change nothing in the code but simply link the IMSL library as well, I get this error:

mpif90 -o a.out hello_mpi.f90 $LINK_FNL_STATIC_IMSL $F90FLAGS
mpiexec -n 4 ./a.out

*** FATAL ERROR 1 from MPI_INIT. A CALL was executed using the IMSL
*** FATAL ERROR 1 from MPI_INIT. A CALL was executed using the IMSL
*** FATAL ERROR 1 from MPI_INIT. A CALL was executed using the IMSL
*** dummy routine. Parallel performance needs a functioning MPI
*** library.
*** dummy routine. Parallel performance needs a functioning MPI
*** library.
*** dummy routine. Parallel performance needs a functioning MPI
*** library.
*** FATAL ERROR 1 from MPI_INIT. A CALL was executed using the IMSL
*** dummy routine. Parallel performance needs a functioning MPI
*** library.

In the first example, changing "$LINK_FNL_STATIC_IMSL" to "$LINK_MPI" cures the problem (see the link line sketched below), but that fix does not work in the more realistic example that follows.
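For reference, the working build for example 1 would presumably look like this (a sketch assuming the same source file and the environment variables used above; $LINK_MPI should select the MPI-capable IMSL link line rather than the serial one):

mpif90 -o a.out hello_mpi.f90 $LINK_MPI $F90FLAGS
mpiexec -n 4 ./a.out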

Example 2: use MPI, and each process uses the IMSL library to calculate quadrature nodes:

program main
  USE GQRUL_INT
  use mpi
  implicit none
  integer ( kind = 4 ) error
  integer ( kind = 4 ) id
  integer ( kind = 4 ) p
  real ( kind = 8 ) QW(10), QX(10)

  call MPI_Init ( error )
  call MPI_Comm_size ( MPI_COMM_WORLD, p, error )
  call MPI_Comm_rank ( MPI_COMM_WORLD, id, error )
  write ( *, * ) ' Process ', id, ' says "Hello, world!"'
  CALL GQRUL ( 10, QX, QW )
  call MPI_Finalize ( error )
end program

When I compile and run it, the program stops at "MPI_INIT":

mpif90 -o a.out hello_mpi.f90 $LINK_FNL_STATIC_IMSL $F90FLAGS
mpiexec -n 4 ./a.out 

*** FATAL ERROR 1 from MPI_INIT. A CALL was executed using the IMSL
*** dummy routine. Parallel performance needs a functioning MPI
*** library.
*** FATAL ERROR 1 from MPI_INIT. A CALL was executed using the IMSL
*** dummy routine. Parallel performance needs a functioning MPI
*** library.
*** FATAL ERROR 1 from MPI_INIT. A CALL was executed using the IMSL
*** FATAL ERROR 1 from MPI_INIT. A CALL was executed using the IMSL
*** dummy routine. Parallel performance needs a functioning MPI
*** library.
*** dummy routine. Parallel performance needs a functioning MPI
*** library.

If I change the link option to $LINK_MPI, the program crashes inside the IMSL subroutine:

mpif90 -o a.out hello_mpi.f90 $LINK_MPI $F90FLAGS
mpiexec -n 4 ./a.out

Process 1 says "Hello, world!"
Process 0 says "Hello, world!"
Process 3 says "Hello, world!"
Process 2 says "Hello, world!"
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source 
a.out 00000000018D5C75 Unknown Unknown Unknown
a.out 00000000018D3A37 Unknown Unknown Unknown
a.out 000000000188ADC4 Unknown Unknown Unknown
a.out 000000000188ABD6 Unknown Unknown Unknown
a.out 000000000184BCB9 Unknown Unknown Unknown
a.out 000000000184F410 Unknown Unknown Unknown
libpthread.so.0 00007EFC178C67E0 Unknown Unknown Unknown
a.out 000000000178E634 Unknown Unknown Unknown
a.out 000000000178A423 Unknown Unknown Unknown
a.out 0000000000430491 Unknown Unknown Unknown
a.out 000000000042AACD Unknown Unknown Unknown
a.out 00000000004233D2 Unknown Unknown Unknown
a.out 0000000000422FEA Unknown Unknown Unknown
a.out 0000000000422DD0 Unknown Unknown Unknown
a.out 0000000000422C9E Unknown Unknown Unknown
libc.so.6 00007EFC16F7BD1D Unknown Unknown Unknown
a.out 0000000000422B29 Unknown Unknown Unknown
[... the same SIGSEGV traceback is printed by each of the other three processes ...]

=================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= EXIT CODE: 174
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
=================================================================================

I am running this code on a UNIX system on my school's supercomputer, using the Intel compiler and MPICH version 3.0.1. My actual code is very similar to the second example: it calls several IMSL subroutines on each process. Can you please help me make it work? Thank you!

Answer (by shanmu .S):

Finally, d_1999 provided a link that gave me enough information to solve this problem: I just needed to change the link flag to $LINK_MPIS, and then the second sample code ran without any problem (see the build sketch below).
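For reference, the corrected build would presumably look like this (a sketch assuming the same source file and environment variables as in the question; $LINK_MPIS should be the IMSL link variable that pulls in the statically linked, MPI-capable IMSL libraries):

mpif90 -o a.out hello_mpi.f90 $LINK_MPIS $F90FLAGS
mpiexec -n 4 ./a.out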