I am having troubles using cufftPlanMany. After creating the plans and taking the forward and inverse FFTs, I could not get the original data back. Please, find attached a minimum version of the code.
program test_cufft
use cudafor
use cufft
integer :: plan_r2c
integer :: plan_c2r
real,allocatable,dimension(:,:,:,:), device :: eta_d
complex,allocatable,dimension(:,:,:,:), device :: etak_d
nv = 4
nx = 256
ny = 512
nz = 512
nx21 = nx/2+1
allocate( eta_d(nv,nx,ny,nz) )
allocate( etak_d(nv,nx21,ny,nz) )
batch = nv;
rank = 3;
n = (/ nx, ny, nz /);
idist = nx*ny*nz;
odist = nx21*ny*nz;
inembed = (/ nx, ny, nz /);
onembed = (/ nx21, ny, nz /);
istride = 1;
ostride = 1;
istat = cufftPlanMany( plan_r2c, rank, n, inembed, istride, idist, &
onembed, ostride, odist, CUFFT_R2C, batch )
istat = cufftPlanMany( plan_c2r, rank, n, onembed, ostride, odist, &
inembed, istride, idist, CUFFT_C2R, batch )
! Initialize eta_d
istat = cufftExecR2C( plan_r2c, eta_d, etak_d )
istat = cufftExecC2R( plan_c2r, etak_d, eta_d )
eta_d = eta_d/idist
end program test_cufft
The problem is after I did the forward and inverse FFTs, I could not get the original data back.
Please, what am I doing wrong?
Should the ordering of data be eta_d(batch,nx,ny,nz) or eta_d(nx,ny,nz,batch)
?
I would say the correct ordering is
(nz, ny, nx, batch)
But it's important to relate these to your array indexing and storage order as well.
In CUFFT terminology, for a 3D transform(*) the
nz
direction is the fastest changing index, with typical usage (stride=1) being adjacent data in memory, corresponding to adjacent elements in a transform.This direction (I think of it as the elements along a row, i.e. the "z" index being the column index) is also the direction of a multidimensional transform that gets "reduced" in the complex domain, for the R2C/C2R transform types.
With that in mind, I would rewrite your code this way:
(NVIDIA HPC SDK 20.9, Tesla V100 GPU)
and it seems to give expected results for my simple test case.
(*) For a 2D transform, the
ny
dimension would be the most rapidly varying, and for a 1D transform thenx
dimension (of course) is the most rapidly varying.The multi-dimensional transforms and advanced data layout sections of the CUFFT manual may also be useful reading.