cudaMemcpy2D error (cudaErrorInvalidValue) when running "Debug" configuration


This is driving me crazy. I can't figure out for the life of me why this is happening. Basically, I have code that was working totally fine under Linux (Nsight Eclipse Edition). I tried making it compatible with Windows by creating a Visual Studio 2013 project and configuring it.

At this point everything seems to be fine: the code compiles without any problems, and it even runs fine when I use the "Release" configuration. However, as soon as I try the "Debug" configuration, the portion below fails with a cudaErrorInvalidValue error.
I've tracked the problem down to the optimization flag. With optimization disabled, the call fails. With /O2 or /O1, the code runs fine!

Again, this works just fine under Linux with or without optimization. I wonder what gives in Windows optimization. If it's of any help, I'm using Visual Studio 2013 (Update 4) with CUDA 6.5 and static library linking. (On Linux it was CUDA 6.5 but dynamic library linking).

The whole code is available here.

size_t hostPitch = (size_t)getHostPitch();
size_t devicePitch = (size_t)getDevicePitch();
size_t cal = (size_t)(width * numChannels * sizeof(T));
size_t h = (size_t)height;
cudaError_t eCUDAResult = cudaMemcpy2D((void*)this->hostData, hostPitch, (const void*)this->deviceData, devicePitch, cal, h, cudaMemcpyDeviceToHost);

There is 1 answer

Maghoumi On BEST ANSWER

The comment by @Park Young-Bae solved my problem (though it took more effort than setting a simple breakpoint!)
The undefined behavior was caused by my own carelessness. In one of the classes, I had forgotten to define the copy constructor and copy-assignment operator. As a result, when an object was returned by value, the destructor of the original ran and freed all the CUDA memory that the compiler-generated shallow copy still pointed to! Subsequent CUDA API calls on that copy were then working with dangling pointers.
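For anyone hitting the same thing, here is a minimal sketch of one way to make such a class safe to return by value. It is not the code from my project: `DeviceBuffer`, `makeBuffer`, `fakeDeviceAlloc` and `fakeDeviceFree` are hypothetical names, and the two `fake...` helpers are stand-ins built on `malloc`/`free` so the example compiles without the CUDA toolkit. In real code they would be replaced by `cudaMalloc` and `cudaFree` with error checking. The idea is to forbid copying outright and allow only moves, so ownership of the allocation is transferred instead of duplicated.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <utility>

// Counts live allocations so the ownership behavior can be observed.
static int g_liveAllocations = 0;

// Hypothetical stand-in for cudaMalloc (assumption: plain host malloc).
inline void* fakeDeviceAlloc(std::size_t bytes) {
    void* p = std::malloc(bytes);
    if (p) ++g_liveAllocations;
    return p;
}

// Hypothetical stand-in for cudaFree; ignores null pointers.
inline void fakeDeviceFree(void* p) {
    if (p) {
        std::free(p);
        --g_liveAllocations;
    }
}

class DeviceBuffer {
public:
    explicit DeviceBuffer(std::size_t bytes) : data_(nullptr), bytes_(bytes) {
        assert(bytes > 0);  // precondition of this sketch
        data_ = fakeDeviceAlloc(bytes);
    }

    ~DeviceBuffer() { fakeDeviceFree(data_); }

    // A compiler-generated copy would duplicate the raw pointer, and both
    // objects would free it. Deleting the copy operations turns that
    // mistake into a compile error.
    DeviceBuffer(const DeviceBuffer&) = delete;
    DeviceBuffer& operator=(const DeviceBuffer&) = delete;

    // Moving transfers ownership and leaves the source empty.
    DeviceBuffer(DeviceBuffer&& other)
        : data_(other.data_), bytes_(other.bytes_) {
        other.data_ = nullptr;
        other.bytes_ = 0;
    }

    DeviceBuffer& operator=(DeviceBuffer&& other) {
        if (this != &other) {
            fakeDeviceFree(data_);  // release what this object owned
            data_ = other.data_;
            bytes_ = other.bytes_;
            other.data_ = nullptr;
            other.bytes_ = 0;
        }
        return *this;
    }

    void* data() const { return data_; }
    std::size_t size() const { return bytes_; }

private:
    void* data_;
    std::size_t bytes_;
};

// Returning by value is safe here: the buffer is moved (or the copy is
// elided), never shallow-copied.
inline DeviceBuffer makeBuffer(std::size_t bytes) {
    DeviceBuffer buf(bytes);
    return buf;
}
```

With this in place, `DeviceBuffer b = makeBuffer(64);` leaves exactly one owner of the allocation, and any accidental copy such as `DeviceBuffer c = b;` fails to compile rather than producing a pointer that is freed twice. If copies are genuinely needed, the alternative is to implement a deep copy in the copy constructor and copy-assignment operator.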

I can't believe how easy it is to miss something tiny in C++ and then spend hours debugging it.