Does cudnnCreate() call create multiple streams internally?

Question

Asked by sandeep.ganage on 19 October 2020 at 17:46

I am writing a simple multi-stream CUDA application. The following is the part of the code where I create the CUDA streams, the cuBLAS handles and the cuDNN handles:

cudaSetDevice(0);

// const makes the array sizes below compile-time constants (no variable-length arrays)
const int num_streams = 1;

cudaStream_t streams[num_streams];
cudnnHandle_t mCudnnHandle[num_streams];
cublasHandle_t mCublasHandle[num_streams];

// Each iteration creates one stream, plus one cuBLAS handle and one cuDNN handle bound to it
for (int ii = 0; ii < num_streams; ii++) {
    cudaStreamCreateWithFlags(&streams[ii], cudaStreamNonBlocking);
    cublasCreate(&mCublasHandle[ii]);
    cublasSetStream(mCublasHandle[ii], streams[ii]);
    cudnnCreate(&mCudnnHandle[ii]);
    cudnnSetStream(mCudnnHandle[ii], streams[ii]);
}

My stream count is 1, but when I profile the executable with the NVIDIA Visual Profiler, it shows more streams than the one I created.

For every stream I create it creates additional 4 more streams. I tested it with num_streams = 8, and the profiler showed 40 streams. This raised the following questions:

Does cudnn internally create streams? If yes, then why?
If it implicitly creates streams then what is the way to utilize it?
In such case does explicitly creating streams make any sense?

1 answer

**Robert Crovella** · Accepted Answer · 2020-10-19T18:08:15+00:00

"Does cudnn internally create streams?"

Yes.

"If yes, then why?"

Because it is a library, and it may need to organize CUDA concurrency, which is what streams are for. A detailed explanation of exactly what these streams are used for is not available, because the library internals are not documented.

"If it implicitly creates streams then what is the way to utilize it?"

Those streams are not intended for you to use separately or independently. They are internal to the library and are used by its routines.

"In such case does explicitly creating streams make any sense?"

You would still need to explicitly create any streams you need to manage CUDA concurrency outside of the library.

I would like to point out that this statement is a bit misleading:

"For every stream I create it creates additional 4 more streams."

What you are doing is going through a loop, and at each loop iteration you are creating a new handle. Your observation is tied to the number of handles you create, not the number of streams you create.
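The arithmetic behind that observation can be sketched as follows. This is only an illustrative model, not anything provided by CUDA or cuDNN: the helper name `expected_profiler_streams` is hypothetical, and the figure of 4 internal streams per cuDNN handle is simply what the question reported. It is undocumented and may differ between cuDNN versions.

```cpp
#include <cassert>

// Hypothetical helper, not part of CUDA or cuDNN. It predicts how many streams
// a profiler would list, ASSUMING every cuDNN handle adds a fixed number of
// internal streams on top of the streams the application creates itself.
// The per-handle count is an observed value (4 in the question), not a
// documented guarantee.
int expected_profiler_streams(int user_streams, int cudnn_handles,
                              int internal_streams_per_handle) {
    assert(user_streams >= 0 && cudnn_handles >= 0 &&
           internal_streams_per_handle >= 0);
    // User streams are counted once; internal streams scale with handles.
    return user_streams + cudnn_handles * internal_streams_per_handle;
}
```

With the question's loop, 8 iterations create 8 streams and 8 handles, so `expected_profiler_streams(8, 8, 4)` matches the 40 streams seen in the profiler. Under the same assumption, 8 user streams with a single cuDNN handle would be predicted to show 12 streams, which is one way to check whether the extra streams follow the handle count rather than the stream count.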
