[UPDATE] the code and a few sentences were changed to reflect a realization explained in my second comment. The code should compile with the line below, however, I have an older gfortran and may not be seeing some errors that you might.
gfortran BLU_implementation_copy.f90 -o BLU_implementation_copy.x
I'm getting an incredibly inconsistent seg fault error at runtime with Fortran 90.
The overall objective of my code is to factor a randomly generated, complex, symmetric, diagonally-dominant matrix in blocks that can be easily parallelized. A stripped down version (except for the relevant functions) can be found at the end.
I'm running into the error when breaking up the original matrix into blocks (tiles) in the function mat2tiles. Specifically this line of code keeps failing:
o(i,j)=cell(x(a:min(i*Nblock,M), b:min(j*Nblock, N)))
This is assigning a block of the input matrix X to matrix o at index i,j. It accomplishes this because each index of o is a matrix NblockxNblock through the derived type:
type cell
complex*16 :: block(Nblock,Nblock)
end type cell
type(cell) :: o(M/Nblock+rem,N/Nblock+rem)
For certain matrix sizes and tile sizes, everything works perfectly. For larger sizes, the error starts to appear in tile sizes close to the matrix size and eventually reach those at half the size of the matrix. For example, the smallest instance of this happening I have found is a 20x20 matrix with a tile size of 18. But this doesn't happen consistently. If I try out smaller matrices that all run normally and build up to that size, it runs. If I do a larger size that seg faults and then run the 20x20 with 18, it seg faults. The smallest consistent parameters I've found have been 25x25 with a tile size of 23. Yet even with this size, if I print out the block where it fails after the o(i,j) assignment line:
print*, o(2,1)%block
It suddenly runs through the whole thing without a problem. Taking out the print statement makes it seg fault again. Finally, the most annoying one [fixed - refer to second comment by me], if I do a 2000x2000 matrix with a tile size of 1000 or more, I get a seg fault just after getting into the function (which means it's occurring during the variable allocations). This leads me to believe the issue may stem from how the matrix is allocated, especially since I'm using a derived type. But when I try to diagnose the size and contents of the matrix, everything seems normal.
program BLU_implementation
implicit none
integer :: rem
integer, parameter :: M=20, N=20, Nblock=19
real*8 :: start, finish
complex*16 :: A(M,N)
type cell
complex*16 :: block(Nblock,Nblock)
end type cell
!determines if cell matrix doesn't need to be bigger
rem=1
if (modulo(M,Nblock)==0) then
rem=0
endif
call cpu_time(start)
call functionCalling(A, M, N, Nblock, rem)
call cpu_time(finish)
print*, 'overall time:', finish-start, 'seconds'
contains
!==================================================================================================================
subroutine functionCalling(A, M, N, Nblock, rem)
implicit none
integer :: IPIV, INFO, M, N, Nblock, rem
real*8 :: start, finish
complex*16 :: A(M,N)
type(cell) :: C(M/Nblock+rem,N/Nblock+rem)
call cpu_time(start)
A= CSPDmatrixFill(A,M,N)
call cpu_time(finish)
print*, 'matrix fill time:', finish-start, 'seconds'
call cpu_time(start)
C= mat2tiles(A,M,N,Nblock,rem)
call cpu_time(finish)
print*, 'tiling time:', finish-start, 'seconds'
end subroutine
!===================================================================================================================
! generates a complex, symmetric, positive-definite matrix (based off of another's code)
function CSPDmatrixFill(A, M, N) result (Matrix)
! initialization
implicit none
integer :: i, j
integer :: M, N
real*8 :: x, xi
complex*16 :: A(M, N), Matrix(M, N), EYE(M, N), MT(N, M)
EYE=0
forall(j=1:M) EYE(j,j)=1
! execution
call random_seed
do i=1, M
do j=1, N
call random_number (x)
call random_number(xi)
Matrix(i,j) = cmplx(x,xi)
end do
end do
! construct a symmetric matrix ( O(n^2) )
call Mtranspose(Matrix, M, N)
Matrix = Matrix+MT
! make positive definite (diagonally dominant)
Matrix = Matrix + N*EYE
end function CSPDmatrixFill
!======================================================================================================
subroutine Mtranspose(A, i, j)
! takes a matrix and the two parameters used to make the matrix: A(i,j)
! returns a matrix with switched indices: A(j,i)
implicit none
integer :: i, j
complex*16 :: A(i,j), MT(j,i)
MT=A(j,i)
return
end subroutine Mtranspose
!=======================================================================================================
!MAT2TILES - breaks up an array into a cell array of adjacent sub-arrays of equal sizes
!
! O=mat2tiles(X,M,N,Nblock)
!
!will produce a cell array o containing adjacent chunks of the array X(M,N)
!with each chunk of dimensions NblockxNblock. If Nblock does
!not divide evenly into size(X,i), then the chunks at the upper boundary of
!X along dimension i will have bogus values that do not affect the factorization
!in the places where the matrix doesn't occupy. (according to older versions. Might have changed with some edits)
!
function mat2tiles(X,M,N,Nblock,rem) result(o)
! initialization
implicit none
integer :: a, b, i, j, M, N, Nblock, rem
complex*16 :: X(M,N)
type(cell) :: o(M/Nblock+rem,N/Nblock+rem)
! diagnostic print statements
print*,size(o(1,1)%block), size(o(1,2)%block), size(o(2,1)%block), size(o(2,2)%block)
print*, 'got to start'
! turn matrix x into cell matrix o
do j=1, N/Nblock+rem
if (j==1) then
b=j
else
b=b+Nblock
endif
do i=1, M/Nblock+rem
if (i==1) then
a=i
else
a=a+Nblock
endif
! diagnostic print statement
print*, 'writing to o: i:', i, 'j:', j, 'i*Nblock:', i*Nblock, 'j*Nblock:', j*Nblock, 'min of i:', min(i*Nblock, M), &
'min of j:', min(j*Nblock, N), 'a:', a, 'b:', b
o(i,j)=cell(x(a:min(i*Nblock,M), b:min(j*Nblock, N)))
enddo
enddo
! diagnostic print statement
print*, 'got to end'
return
end function mat2tiles
!==================================================================================================
end program
After narrowing down the issue to using numbers vs. variables of those same numbers, it was discovered that Fortran doesn't like assigning a matrix to a matrix of different dimensions even if it fit inside. This was weird because smaller values of
M
,N
, andNblock
got around the issue perfectly.The solution was simply defining
o(i,j)%block(1:Nblock,1:Nblock)=x(dim1:dim2,etc)
instead ofo(i,j)=cell(x(dim1:dim2,etc)
fori=1
andj=1
and having a slightly modified instance of each for all cases ofi
andj
.Correct code (for all cases of matrix size and tile size) looks like:
With this correction, my code runs on the old and new versions of gfortran. Interesting to note, however, is how the Mac OS X version of gfortran never ran into this segfault, only the Linux version.