Thread synchronization in CUDA

Question

Asked by Max on 09 June 2015 at 09:10

I have a 3D grid of 3D blocks, and within each block I need to process the z layers sequentially. In other words, I want all (x,y,0) threads to execute first, then all (x,y,1) threads, and so on, layer by layer along the z axis. I know about the function __syncthreads(), but I don't know how to use it to synchronize the threads this way.

UPD:

 __global__ void Kernel(/* some params */)
 {
      // some code
      __syncthreads();
 }

This synchronizes all the threads in the block. But I need to execute all the threads where z = 0, then all the threads where z = 1, etc.

There are 2 answers

Avi Ginsburg On 09 June 2015 at 09:49

__syncthreads() is a barrier for all the threads within a block. If you insist on using only __syncthreads(), you would have to make each block hold exactly one layer (x*y threads), which is not always possible, and even then the order in which the blocks execute isn't guaranteed. An alternative way of reaching your goal would be to launch one kernel per layer and synchronize between kernels (i.e. prevent the kernels from running concurrently). That of course depends on what exactly your kernel does and whether breaking it up like that is feasible.
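The kernel-per-layer idea can be sketched roughly as follows. This is only an illustration under stated assumptions: the names `layerKernel` and `runPerLayer`, the volume layout (x fastest), and the placeholder computation (each layer doubles the value below it) are all hypothetical and not taken from the question. It relies on the fact that kernels launched into the same stream run in launch order, so layer z only starts after layer z-1 has finished across the whole grid. Error checking is omitted for brevity.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Processes a single z layer of an nx * ny * nz volume (x is the fastest index).
// The computation is a placeholder: layer 0 is set to 1, every other layer
// doubles the value directly below it, so it depends on the previous layer.
__global__ void layerKernel(int *vol, int nx, int ny, int z)
{
    const int x = blockIdx.x * blockDim.x + threadIdx.x;
    const int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= nx || y >= ny) return;  // guard threads outside the layer

    const int idx = x + nx * (y + ny * z);
    vol[idx] = (z == 0) ? 1 : vol[idx - nx * ny] * 2;
}

// Hypothetical host helper: launches one kernel per layer and returns the volume.
std::vector<int> runPerLayer(int nx, int ny, int nz)
{
    const size_t n = static_cast<size_t>(nx) * ny * nz;
    int *d_vol = nullptr;
    cudaMalloc(&d_vol, n * sizeof(int));

    const dim3 block(16, 16);
    const dim3 grid((nx + block.x - 1) / block.x, (ny + block.y - 1) / block.y);

    // All launches go to the default stream, so they execute one after another.
    for (int z = 0; z < nz; ++z)
        layerKernel<<<grid, block>>>(d_vol, nx, ny, z);

    std::vector<int> h_vol(n);
    // This blocking copy on the default stream waits for the kernels to finish.
    cudaMemcpy(h_vol.data(), d_vol, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_vol);
    return h_vol;
}
```

With this placeholder computation, every element of layer z should end up as 2^z. The cost is one launch per layer, so this tends to pay off only when each layer has enough work to hide the launch overhead.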

Noel On 09 June 2015 at 14:44 (accepted answer)

You can use a simple loop and select which threads do the work in each iteration. Something like:

for (int z = 0; z < zmax; z++) {
    if (threadIdx.z == z) { 
        //do whatever with x and y
    }
    __syncthreads();
}

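In each iteration, only the threads with the matching z-index execute the instructions while the others idle; at the end of each iteration all threads in the block synchronize. Note that __syncthreads() sits outside the if, so every thread in the block reaches the barrier.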
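A fuller version of this loop might look like the sketch below. The block shape (`BX`, `BY`, `BZ`), the names `layeredKernel` and `runLayered`, and the per-layer computation are assumptions made up for the illustration; the loop bound uses `blockDim.z` in place of `zmax`. Each layer reads what the previous layer wrote to shared memory, which is why the ordering matters. Error checking is omitted for brevity.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Hypothetical block shape, chosen only for this sketch (4 * 4 * 8 = 128 threads).
constexpr int BX = 4, BY = 4, BZ = 8;

// Must be launched with a block of exactly (BX, BY, BZ) threads.
__global__ void layeredKernel(int *out)
{
    __shared__ int layer[BZ][BY][BX];

    const int x = threadIdx.x, y = threadIdx.y, z = threadIdx.z;

    for (int step = 0; step < blockDim.z; ++step) {
        if (z == step) {
            // Layer 0 seeds the value; each later layer adds 1 to the layer below.
            layer[z][y][x] = (z == 0) ? x + y : layer[z - 1][y][x] + 1;
        }
        // The barrier is outside the if, so all threads of the block reach it.
        __syncthreads();
    }

    // Flatten block and thread indices to write one result per thread.
    const int blockId = blockIdx.x + gridDim.x * (blockIdx.y + gridDim.y * blockIdx.z);
    const int threadId = x + BX * (y + BY * z);
    out[blockId * (BX * BY * BZ) + threadId] = layer[z][y][x];
}

// Hypothetical host helper: runs the kernel on the given grid and returns the results.
std::vector<int> runLayered(dim3 grid)
{
    const size_t n = static_cast<size_t>(grid.x) * grid.y * grid.z * BX * BY * BZ;
    int *d_out = nullptr;
    cudaMalloc(&d_out, n * sizeof(int));

    layeredKernel<<<grid, dim3(BX, BY, BZ)>>>(d_out);

    std::vector<int> h_out(n);
    // This blocking copy on the default stream waits for the kernel to finish.
    cudaMemcpy(h_out.data(), d_out, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return h_out;
}
```

With this placeholder computation, the thread at (x, y, z) in every block should end up with x + y + z. The ordering holds only within a block: __syncthreads() does not synchronize different blocks of the grid with each other.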
