Suppose I have a struct:
typedef enum { ON_CPU, ON_GPU, ON_BOTH } memLocation;
typedef struct foo *foo;
struct foo {
    cudaEvent_t event;
    float *deviceArray;
    float *hostArray;
    memLocation arrayLocation;
};
and two functions:
void copyArrayFromCPUToGPUAsync(foo bar, cudaStream_t stream)
{
    cudaStreamWaitEvent(stream, bar->event);
    if (bar->arrayLocation == ON_CPU) {
        // ON_CPU means !ON_GPU and !ON_BOTH
        // numBytes: size of the array in bytes (not shown in this snippet)
        cudaMemcpyAsync(bar->deviceArray, bar->hostArray, numBytes,
                        cudaMemcpyHostToDevice, stream);
        bar->arrayLocation = ON_BOTH;
    }
    cudaEventRecord(bar->event, stream);
}
void doWorkOnGPUAsync(foo bar, cudaStream_t stream)
{
    cudaStreamWaitEvent(stream, bar->event);
    // do async work
    cudaEventRecord(bar->event, stream);
}
And the following scenario (with a lion, witch, and wardrobe fitting in somewhere as well):
// stream1, stream2, and stream3 have no prior work
// assume bar->arrayLocation = ON_GPU
doWorkOnGPUAsync(bar, stream1);
copyArrayFromCPUToGPUAsync(bar, stream2); // A no-op
doWorkOnGPUAsync(bar, stream3);
Is the above safe? I.e., will stream2 still wait on stream1 to finish its "work" if it itself does no work? And will the resulting recorded cudaEvent reflect this, such that stream3 will not start until stream1 finishes?
This should be safe.
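If you want to check this empirically, here is a minimal sketch of the same three-stream pattern. The names slowWrite, readFlag, and stream3SawStream1Work are hypothetical and not from the question, and the spin length is an arbitrary assumption. A result of 1 is consistent with the documented behaviour but is not by itself a proof of the guarantee.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper: bail out of the test function on any CUDA error.
#define CHECK(call) do { if ((call) != cudaSuccess) return -1; } while (0)

// Stands in for the "work" on stream1: spin for a while, then publish a value.
__global__ void slowWrite(int *flag, long long cycles)
{
    long long start = clock64();
    while (clock64() - start < cycles) { /* busy-wait */ }
    *flag = 1;
}

// Stands in for the work on stream3: observe what stream1 wrote.
__global__ void readFlag(const int *flag, int *seen)
{
    *seen = *flag;
}

// Returns 1 if stream3 saw stream1's write, 0 if it ran too early,
// and -1 on a CUDA error.
int stream3SawStream1Work(void)
{
    cudaStream_t s1, s2, s3;
    cudaEvent_t event;
    int *flag = NULL, *seen = NULL;
    int hostSeen = -1;

    CHECK(cudaStreamCreate(&s1));
    CHECK(cudaStreamCreate(&s2));
    CHECK(cudaStreamCreate(&s3));
    CHECK(cudaEventCreate(&event));
    CHECK(cudaMalloc(&flag, sizeof(int)));
    CHECK(cudaMalloc(&seen, sizeof(int)));
    CHECK(cudaMemset(flag, 0, sizeof(int)));
    CHECK(cudaMemset(seen, 0, sizeof(int)));
    CHECK(cudaDeviceSynchronize());

    // stream1: wait (a no-op, the event has never been recorded), work, record.
    CHECK(cudaStreamWaitEvent(s1, event, 0));
    slowWrite<<<1, 1, 0, s1>>>(flag, 200000000LL);
    CHECK(cudaEventRecord(event, s1));

    // stream2: wait, do no work at all, re-record the same event.
    CHECK(cudaStreamWaitEvent(s2, event, 0));
    CHECK(cudaEventRecord(event, s2));

    // stream3: wait on the event as last recorded by stream2, then read.
    CHECK(cudaStreamWaitEvent(s3, event, 0));
    readFlag<<<1, 1, 0, s3>>>(flag, seen);
    CHECK(cudaEventRecord(event, s3));

    CHECK(cudaGetLastError());
    CHECK(cudaDeviceSynchronize());
    CHECK(cudaMemcpy(&hostSeen, seen, sizeof(int), cudaMemcpyDeviceToHost));

    cudaFree(flag);
    cudaFree(seen);
    cudaEventDestroy(event);
    cudaStreamDestroy(s1);
    cudaStreamDestroy(s2);
    cudaStreamDestroy(s3);
    return hostSeen;
}
```

Call stream3SawStream1Work() from a main() and compile with nvcc. The event recorded in stream2 sits behind stream2's pending wait on stream1, so it cannot complete before stream1's work does, which is why stream3 should read 1.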
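There is no mention anywhere (that I know of) of some kind of "event cancellation" due to a lack of other work between a wait-on-event and the recording of another event. And it doesn't matter that you're re-using the same event object in the cudaEventRecord() call, since, as the Runtime API docs say, cudaStreamWaitEvent() uses the most recently captured state of the event at the time of the call and is not affected by later calls to cudaEventRecord().
Additional notes: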