The CUDA Runtime API has the functions cudaGetSymbolAddress() and cudaGetSymbolSize() for working with device-side globals from host-side code, using their names (source-code identifiers) as handles.
In the Driver API, we have cuModuleGetGlobal(), which lets us do the same thing, except that it takes the CUmodule in which the global symbol is situated. If you're working with code that you dynamically compiled and loaded into a module, then you're all set. But what if those globals are part of your program, compiled statically using NVCC rather than loaded dynamically?
I would assume that there's some sort of "primary module" or "default module" for each compiled program, with its baked-in globals and functions. Can I get a handle for it?
There is. If you pull apart the host boilerplate code that the runtime API emits to make this work, and look at some runtime traces, you will see that it relies on a lot of statically defined symbols and a couple of undocumented runtime API functions, which internally maintain the module the runtime API uses.
Using the driver API, no. If you need to interact with the runtime API, then use the runtime API.
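If the rest of your code is written against the driver API, one workaround is to do only the symbol lookup through the runtime API and hand the resulting address to driver API calls. The following is a minimal sketch, not a definitive implementation. It assumes a hypothetical global named my_global, assumes that the runtime's implicitly initialized primary context is the one current on the calling thread, and assumes you link against both libraries (for example with nvcc example.cu -lcuda).

```cuda
#include <cstdio>
#include <cuda.h>          // driver API
#include <cuda_runtime.h>  // runtime API

// Hypothetical statically compiled device global, for illustration only.
__device__ int my_global[4] = {1, 2, 3, 4};

int main() {
    void*  addr = nullptr;
    size_t size = 0;

    // Look the symbol up through the runtime API. The symbol itself is
    // passed, not a string. These calls also initialize the runtime, which
    // makes the device's primary context current on this thread.
    if (cudaGetSymbolAddress(&addr, my_global) != cudaSuccess ||
        cudaGetSymbolSize(&size, my_global) != cudaSuccess) {
        std::fprintf(stderr, "symbol lookup failed\n");
        return 1;
    }

    // Reinterpret the runtime device pointer as a driver API device pointer.
    // This relies on runtime/driver interoperability within one context.
    CUdeviceptr dptr = reinterpret_cast<CUdeviceptr>(addr);

    // Use the address with a driver API call: copy the global to the host.
    int host[4] = {0, 0, 0, 0};
    if (size > sizeof(host) || cuMemcpyDtoH(host, dptr, size) != CUDA_SUCCESS) {
        std::fprintf(stderr, "driver API copy failed\n");
        return 1;
    }

    std::printf("size = %zu bytes, first element = %d\n", size, host[0]);
    return 0;
}
```

This does not give you a CUmodule handle, so cuModuleGetGlobal() is still out of reach for statically compiled globals. It only sidesteps the need for one by letting the runtime API resolve the address.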