In a CUDA kernel function, why does tex3D not return any value?


Here I have two different versions of my code.

The first one is a complete CUDA program taken from the CUDA SDK. In its kernel, tex3D works well.

The second version is more complex, with many OpenGL functions and OpenGL textures. Its .cu file is the same as the first one. However, when I use a cudaMalloc'ed variable to read back the value from tex3D in the same kernel function, I find that tex3D returns nothing. Both programs create the 3D texture in the same way, as shown in the code below:

#define  SIZE_X 128 // number of elements in each dimension
#define  SIZE_Y 128
#define  SIZE_Z 128
typedef float  VolumeType;
cudaExtent volumeSize = make_cudaExtent(SIZE_X, SIZE_Y, SIZE_Z); 

cudaArray *d_volumeArray = 0; //for tex
cudaArray *d_transferFuncArray; //for transferTex

texture<VolumeType, 3, cudaReadModeElementType> tex;         // 3D texture
texture<float4, 1, cudaReadModeElementType> transferTex; // 1D transfer function texture

//initialize the 3d texture "tex" with a 3D array "d_volumeArray"
cudaChannelFormatDesc channelDesc = cudaCreateChannelDesc<VolumeType>();
cutilSafeCall( cudaMalloc3DArray(&d_volumeArray, &channelDesc, volumeSize) ); 

// set texture parameters
tex.normalized = true;                      // access with normalized texture coordinates
tex.filterMode = cudaFilterModeLinear;      // linear interpolation
tex.addressMode[0] = cudaAddressModeClamp;  // clamp texture coordinates
tex.addressMode[1] = cudaAddressModeClamp;
CUDA_SAFE_CALL(cudaBindTextureToArray(tex, d_volumeArray, channelDesc));// bind array to 3D texture

//fill in the actual data for the 3D texture "tex"
float *d_volumeMem;
cutilSafeCall(cudaMalloc((void**)&d_volumeMem, SIZE_X*SIZE_Y*SIZE_Z*sizeof(float)));

.....//assign values to d_volumeMem on the GPU; I've already checked that d_volumeMem is valid

//copy d_volumeMem to 3DArray
cudaMemcpy3DParms copyParams = {0};
copyParams.srcPtr = make_cudaPitchedPtr((void*)d_volumeMem, SIZE_X*sizeof(VolumeType), SIZE_X, SIZE_Y);
copyParams.dstArray = d_volumeArray;
copyParams.extent = volumeSize;
copyParams.kind = cudaMemcpyDeviceToDevice;
cutilSafeCall( cudaMemcpy3D(&copyParams) ); 
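
(For completeness, a minimal hypothetical sketch of the elided "assign value to d_volumeMem" step above, assuming the volume is simply filled by a small kernel; the fillVolume name and the ramp values are purely illustrative, not part of the original code.)

// Hypothetical fill kernel: writes a simple ramp into the flat volume buffer.
__global__ void fillVolume(float *vol, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        vol[i] = (float)(i % 256) / 255.0f;    // placeholder data
}

// launch over all SIZE_X*SIZE_Y*SIZE_Z elements
int n = SIZE_X * SIZE_Y * SIZE_Z;
fillVolume<<<(n + 255) / 256, 256>>>(d_volumeMem, n);
cutilSafeCall(cudaThreadSynchronize());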

The code below is the kernel function that calls tex3D. It is essentially the same as the CUDA SDK volumeRender kernel, which is an implementation of ray casting.

__global__ void d_render(....)
{
 ......// ray-casting code, same as the SDK volumeRender kernel

 float temp = tex3D(tex, pos1, pos2, pos3);
 // pos1, pos2 and pos3 are valid
 // I actually store temp into a cudaMalloc'ed buffer "s" so I can inspect it
 // In the first version, s holds different values at different positions
 // In the second version, temp is 0 all the time

......// ray-casting code continues
}
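
As an aside, here is a minimal hypothetical sketch of the debug path described in the comments above; the kernel name d_probeTex, the buffer s and the imageW/imageH parameters are my assumptions, not the original kernel signature.

// Hypothetical probe kernel: sample the bound 3D texture "tex" into a plain
// cudaMalloc'ed buffer "s" so the values can be copied back and inspected.
__global__ void d_probeTex(float *s, int imageW, int imageH)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= imageW || y >= imageH) return;

    // normalized coordinates, since tex.normalized == true
    float u = (x + 0.5f) / (float)imageW;
    float v = (y + 0.5f) / (float)imageH;
    float w = 0.5f;                            // probe the middle slice

    s[y * imageW + x] = tex3D(tex, u, v, w);   // same fetch as in d_render
}

// on the host, after the launch:
//   cudaMemcpy(h_s, s, imageW*imageH*sizeof(float), cudaMemcpyDeviceToHost);
//   then check whether h_s contains non-zero values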

I think this code is correct, because most of it comes from the CUDA SDK volumeRender sample and it works well in my first version.

But I have no idea why tex3D suddenly stops working in the second version. Could some of the other OpenGL textures have a negative effect?


1 Answer

TonyLic (accepted answer):

Actually, the problem was that I did the texture-binding work before choosing the CUDA device. Calling cudaChooseDevice and cudaGLSetGLDevice first makes everything work!
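
For anyone who hits the same problem, here is a rough sketch of the initialization order this fix implies. It uses the old CUDA/OpenGL interop call cudaGLSetGLDevice from cuda_gl_interop.h (newer toolkits just use cudaSetDevice); initGL and initCuda3DTexture are placeholders for the OpenGL setup and for the texture code shown in the question, not functions from the original program.

#include <cstring>
#include <cuda_runtime.h>
#include <cuda_gl_interop.h>
#include <cutil_inline.h>   // old CUDA SDK helper for cutilSafeCall

int main(int argc, char **argv)
{
    // 1. Create the OpenGL context first (GLUT window, glewInit, ...).
    initGL(&argc, argv);                      // placeholder for the GL setup

    // 2. Choose the CUDA device and attach it to the GL context *before*
    //    any cudaMalloc / cudaMalloc3DArray / cudaBindTextureToArray call.
    cudaDeviceProp prop;
    memset(&prop, 0, sizeof(prop));
    prop.major = 1;                           // accept any CUDA-capable device
    prop.minor = 0;
    int dev = 0;
    cutilSafeCall(cudaChooseDevice(&dev, &prop));
    cutilSafeCall(cudaGLSetGLDevice(dev));

    // 3. Only now run the allocation / copy / bind code from the question.
    initCuda3DTexture();                      // placeholder for the texture setup

    // ... rest of the program (render loop, etc.) ...
    return 0;
}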