Missing symbol: cuDevicePrimaryCtxRelease vs cuDevicePrimaryCtxRelease_v2

I'm trying to build the following program:

#include <iostream>
#include <cuda.h>

int main() {
    const char* str;

    auto status = cuInit(0);
    cuGetErrorString(status, &str);
    std::cout << "status = " << str << std::endl;

    int device_id = 0;
    CUcontext primary_context_id;
    status = cuDevicePrimaryCtxRetain(&primary_context_id, device_id);
    cuGetErrorString(status, &str);
    std::cout << "status = " << str << std::endl;

    status = cuDevicePrimaryCtxRelease(device_id);
    cuGetErrorString(status, &str);
    std::cout << "status = " << str << std::endl;
}

Compilation always succeeds. With CUDA 10.2, linking succeeds as well, but with CUDA 11.2, I get:

/usr/bin/ld: a.o: in function `main':
a.cpp:(.text+0xcc): undefined reference to `cuDevicePrimaryCtxRelease_v2'
collect2: error: ld returned 1 exit status

Why is this happening and how can I fix it?

Note: I'm using Devuan Beowulf with driver version 440.82 (have not installed a new driver for CUDA 11.2).

Answer by einpoklum:

Well, I think I have an idea of why this happens.

This is about how cuDevicePrimaryCtxRelease() is defined. Let's run:

grep PrimaryCtxRelease /usr/local/cuda/include/cuda.h | grep -v "^ "

In CUDA 10.2, we get:

CUresult CUDAAPI cuDevicePrimaryCtxRelease(CUdevice dev);

while in CUDA 11.2, we get:

#define cuDevicePrimaryCtxRelease           cuDevicePrimaryCtxRelease_v2
CUresult CUDAAPI cuDevicePrimaryCtxRelease(CUdevice dev);

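That is, the API function has been renamed, and the header keeps the old name only as a preprocessor macro for the new one. Because the macro is defined before the declaration, the declaration on the following line actually declares cuDevicePrimaryCtxRelease_v2, and every call to cuDevicePrimaryCtxRelease() in code that includes the header becomes a call to the _v2 function. (And that's a confusing piece of code, I would say.)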

Now, let's peer into the object files I get with the two different versions of CUDA, using objdump -t a.o | c++filt | grep cu. With CUDA 10.2, it's:

0000000000000000         *UND*  0000000000000000 cuInit
0000000000000000         *UND*  0000000000000000 cuGetErrorString
0000000000000000         *UND*  0000000000000000 cuDevicePrimaryCtxRetain
0000000000000000         *UND*  0000000000000000 cuDevicePrimaryCtxRelease

while with CUDA 11.2, it's:

0000000000000000         *UND*  0000000000000000 cuInit
0000000000000000         *UND*  0000000000000000 cuGetErrorString
0000000000000000         *UND*  0000000000000000 cuDevicePrimaryCtxRetain
0000000000000000         *UND*  0000000000000000 cuDevicePrimaryCtxRelease_v2

(Note the _v2.)

So it's probably the case that the driver library (libcuda.so) that comes with the installed 440.82 driver only exports the non-_v2 symbol, hence the undefined reference at link time.

What I would still appreciate help with is how to work around this issue other than by updating the driver.