Background:
I have written a CUDA program that performs processing on a sequence of symbols. The program processes all sequences of symbols in parallel with the stipulation that all sequences are of the same length. I'm sorting my data into groups with each group consisting entirely of sequences of the same length. The program processes 1 group at a time.
Question:
I am running my code on a Linux machine with 4 GPUs and would like to utilize all 4 GPUs by running 4 instances of my program (1 per GPU). Is it possible to have the program select a GPU that isn't in use by another CUDA application to run on? I don't want to hardcode anything that would cause problems down the road when the program is run on different hardware with a greater or fewer number of GPUs.
The environment variable CUDA_VISIBLE_DEVICES is your friend. I assume you have as many terminals open as you have GPUs. Let's say your application is called myexe.
Then in one terminal, you could do:
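For example (assuming the program uses the default device, i.e. device 0):

```shell
# Restrict this instance to the first GPU enumerated by CUDA.
CUDA_VISIBLE_DEVICES="0" ./myexe
```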
In the next terminal:
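```shell
# Restrict this instance to the second enumerated GPU.
CUDA_VISIBLE_DEVICES="1" ./myexe
```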
and so on.
Then the first instance will run on the first GPU enumerated by CUDA. The second instance will run on the second GPU (only), and so on.
Assuming bash, and for a given terminal session, you can make this "permanent" by exporting the variable:
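For example, to pin the session to the third enumerated GPU:

```shell
# Every CUDA application launched from this shell afterwards
# will see only GPU 2, and will see it as device 0.
export CUDA_VISIBLE_DEVICES="2"
```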
Thereafter, all CUDA applications run in that session will observe only the third enumerated GPU (enumeration starts at 0), and they will observe that GPU as if it were device 0.
This means you don't have to make any changes to your application for this method, assuming your app uses the default GPU or GPU 0.
You can also extend this to make multiple GPUs available, for example:
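```shell
# GPUs 2 and 4 become the only visible devices,
# enumerated as 0 and 1 respectively.
export CUDA_VISIBLE_DEVICES="2,4"
```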
means the GPUs that would ordinarily enumerate as 2 and 4 would now be the only GPUs "visible" in that session and they would enumerate as 0 and 1.
In my opinion the above approach is the easiest. Selecting a GPU that "isn't in use" is problematic because "in use" is hard to define (a GPU that is idle at the moment you check may become busy a moment later), and because checking for a free GPU and then attaching to it is inherently a race condition: two instances launched at the same time could both select the same "free" GPU.
So the best advice (IMO) is to manage the GPUs explicitly. Otherwise you need some form of job scheduler (outside the scope of this question, IMO) to be able to query unused GPUs and "reserve" one before another app tries to do so, in an orderly fashion.