I have a struct that looks like this:
    struct LstmLayer {
        int deviceId;
        thrust::device_vector<real_t> W;
        thrust::device_vector<real_t> gradW;
        LstmLayer() : deviceId(0) {}
        LstmLayer(int __deviceId__) : deviceId(__deviceId__) {}
        void setDevice(int __deviceId__) { deviceId = __deviceId__; }
        void init(bool initParams) {
            W.resize(4*lstmSize * 2*lstmSize);
            gradW.resize(4*lstmSize * 2*lstmSize);
            if (initParams) GPU_Random_Vector(W);
        }
    };
Now I want to initialize an array of LstmLayer, with each element on a different GPU device. I do it as follows:
    struct LstmLayer lstmLayers[MAX_NUM_LSTM_LAYERS];
    for (int i = 0; i < numLstmLayers; ++i) {
        CUDA_SAFE_CALL(cudaSetDevice(i));
        lstmLayers[i].setDevice(i);
        lstmLayers[i].init(true);
    }
Running this program gives the following error:
    terminate called after throwing an instance of 'thrust::system::system_error'
    what(): driver shutting down
Please tell me what is wrong with my code and how to do this correctly. Thank you in advance.
The problem is that you are instantiating all the vectors in the same CUDA GPU context, then trying to use them in a different context. This happens because the default constructor of each device_vector gets called when you define the array of structures. In your code, the definition of lstmLayers[MAX_NUM_LSTM_LAYERS] default-constructs every W and gradW before any cudaSetDevice call has run, and only afterwards does the loop call cudaSetDevice(i) and have init() resize those same vectors.
The solution is probably to redefine the device vectors as pointers, and explicitly call their constructors in your init method. There are a lot of different ways this could be done, for example: [disclaimer: written in browser, never compiled, use at own risk]
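A minimal, host-only sketch of the pointer approach described above. It does not use CUDA or Thrust at all: `FakeDeviceVector`, `fakeSetDevice`, `g_currentDevice`, `g_liveVectors` and `buildLayers` are hypothetical stand-ins invented for illustration, so the example only shows when construction happens relative to device selection. In real code the stand-ins would presumably be replaced by `thrust::device_vector<real_t>` and `cudaSetDevice`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-only stand-ins (hypothetical, not part of CUDA or Thrust).
static int g_currentDevice = 0;  // plays the role of the device chosen by cudaSetDevice
static int g_liveVectors = 0;    // counts stand-in vectors that are still alive

void fakeSetDevice(int id) { g_currentDevice = id; }

// Plays the role of thrust::device_vector<real_t>: it records which device
// was current at the moment it was constructed.
struct FakeDeviceVector {
    int ownerDevice;
    std::vector<float> data;
    explicit FakeDeviceVector(std::size_t n)
        : ownerDevice(g_currentDevice), data(n) { ++g_liveVectors; }
    ~FakeDeviceVector() { --g_liveVectors; }
};

struct LstmLayer {
    int deviceId;
    FakeDeviceVector* W;
    FakeDeviceVector* gradW;

    // Defining an array of layers now constructs no vectors at all.
    LstmLayer() : deviceId(0), W(nullptr), gradW(nullptr) {}

    // The struct owns raw pointers, so copying is disabled.
    LstmLayer(const LstmLayer&) = delete;
    LstmLayer& operator=(const LstmLayer&) = delete;

    void init(int id, std::size_t n) {
        assert(W == nullptr && gradW == nullptr);  // init() is meant to run once
        deviceId = id;
        fakeSetDevice(deviceId);          // select the device first...
        W = new FakeDeviceVector(n);      // ...then construct the vectors on it
        gradW = new FakeDeviceVector(n);
    }

    ~LstmLayer() {
        delete W;      // deleting a null pointer is a no-op
        delete gradW;
    }
};

// Builds `count` layers, one per stand-in device, and returns the device
// each layer's W was constructed on.
std::vector<int> buildLayers(int count) {
    std::vector<LstmLayer> layers(count);  // no stand-in vectors exist yet
    std::vector<int> owners;
    for (int i = 0; i < count; ++i) {
        layers[i].init(i, 16);
        owners.push_back(layers[i].W->ownerDevice);
    }
    return owners;  // layers are destroyed here, releasing every vector
}
```

Calling `buildLayers(3)` yields the owner devices 0, 1 and 2, one per layer, and `g_liveVectors` is back to zero afterwards. The size 16 is arbitrary.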
Obviously you would need to define a destructor to prevent memory leaks. There are other possibilities; I'll leave those as an exercise for the reader.
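One such possibility, sketched here as an assumption rather than as the approach the answer intended, is to hold the vectors in `std::unique_ptr` so that no hand-written destructor is needed. As before, this is a host-only illustration: `FakeDeviceVector`, `fakeSetDevice`, `g_currentDevice`, `g_liveVectors` and `buildLayers` are hypothetical stand-ins, not CUDA or Thrust APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Host-only stand-ins (hypothetical, not part of CUDA or Thrust).
static int g_currentDevice = 0;  // plays the role of the device chosen by cudaSetDevice
static int g_liveVectors = 0;    // counts stand-in vectors that are still alive

void fakeSetDevice(int id) { g_currentDevice = id; }

// Plays the role of thrust::device_vector<real_t>.
struct FakeDeviceVector {
    int ownerDevice;
    std::vector<float> data;
    explicit FakeDeviceVector(std::size_t n)
        : ownerDevice(g_currentDevice), data(n) { ++g_liveVectors; }
    ~FakeDeviceVector() { --g_liveVectors; }
};

struct LstmLayer {
    int deviceId = 0;
    // Empty until init() runs; unique_ptr deletes the vectors automatically
    // and makes the struct move-only, so accidental copies do not compile.
    std::unique_ptr<FakeDeviceVector> W;
    std::unique_ptr<FakeDeviceVector> gradW;

    void init(int id, std::size_t n) {
        assert(!W && !gradW);             // init() is meant to run once
        deviceId = id;
        fakeSetDevice(deviceId);          // select the device first...
        W.reset(new FakeDeviceVector(n)); // ...then construct the vectors on it
        gradW.reset(new FakeDeviceVector(n));
    }
};

// Builds `count` layers, one per stand-in device, and returns the device
// each layer's W was constructed on.
std::vector<int> buildLayers(int count) {
    std::vector<LstmLayer> layers(count);  // only null pointers so far
    std::vector<int> owners;
    for (int i = 0; i < count; ++i) {
        layers[i].init(i, 16);
        owners.push_back(layers[i].W->ownerDevice);
    }
    return owners;  // unique_ptr releases every vector when layers goes away
}
```

The trade-off is the usual one: the raw-pointer version makes ownership explicit but needs a destructor and copy control written by hand, while `std::unique_ptr` gets both from the standard library.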