I have a kernel that uses a small amount of __constant__ memory. It is launched multiple times, and different values have to be copied to __constant__ memory before each launch. Recently, I needed to run this kernel concurrently in multiple streams.
How can I give each stream its own copy of that __constant__ memory?
You can't. A __constant__ variable has context/device-level scope. If your code only uses a "little" amount of constant memory, just pass it as a kernel argument. Kernel arguments are stored in a dedicated constant memory bank on all supported architectures.
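As a rough illustration of the kernel-argument approach, here is a minimal sketch. The `Params` struct, the `scaleKernel` kernel, and the sizes are made up for the example and are not from the question; the assumption is that the per-launch values fit in a small struct that can be passed by value. Each launch then gets its own copy of the values, so launches in different streams cannot overwrite each other's data.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical per-launch values that would otherwise live in __constant__ memory.
struct Params {
    float scale;
    float offset;
};

// `p` is passed by value as a kernel argument, so every launch
// receives its own private copy, independent of other streams.
__global__ void scaleKernel(const float* in, float* out, int n, Params p)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = in[i] * p.scale + p.offset;
    }
}

int main()
{
    const int kStreams = 2;   // assumed number of streams for the demo
    const int n = 1024;       // assumed problem size

    static float hIn[n];
    static float hOut[kStreams][n];
    for (int i = 0; i < n; ++i) hIn[i] = 1.0f;

    cudaStream_t streams[kStreams];
    float* dIn[kStreams];
    float* dOut[kStreams];

    for (int s = 0; s < kStreams; ++s) {
        cudaStreamCreate(&streams[s]);
        cudaMalloc(&dIn[s], n * sizeof(float));
        cudaMalloc(&dOut[s], n * sizeof(float));

        cudaMemcpyAsync(dIn[s], hIn, n * sizeof(float),
                        cudaMemcpyHostToDevice, streams[s]);

        // Different values for each stream; no cudaMemcpyToSymbol needed.
        Params p = { static_cast<float>(s + 1), 0.5f };
        scaleKernel<<<(n + 255) / 256, 256, 0, streams[s]>>>(dIn[s], dOut[s], n, p);

        cudaMemcpyAsync(hOut[s], dOut[s], n * sizeof(float),
                        cudaMemcpyDeviceToHost, streams[s]);
    }

    // Wait for all streams, then report any error from the launches or copies.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        std::printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    for (int s = 0; s < kStreams; ++s) {
        std::printf("stream %d: out[0] = %f\n", s, hOut[s][0]);
        cudaFree(dIn[s]);
        cudaFree(dOut[s]);
        cudaStreamDestroy(streams[s]);
    }
    return 0;
}
```

The host arrays here are ordinary pageable memory, which keeps the sketch short; for copies to actually overlap with kernels in other streams, pinned host memory (for example from `cudaMallocHost`) would normally be used instead.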