I know how to create a global device array from the host using np.array, np.zeros, or np.empty(shape, dtype) and then copying it over with cuda.to_device.
Also, one can declare a shared array as cuda.shared.array(shape, dtype).
But how do you create an array of constant size in the registers of a particular thread, inside a GPU function? I tried cuda.device_array and np.array, but neither worked.
I simply want to do this inside a thread:

x = array(CONSTANT, int32)  # should create x for each thread
Numbapro supports
numba.cuda.local.array(shape, type)
for defining thread-local arrays. As with CUDA C, whether that array ends up in local memory or registers is a compiler decision based on the array's usage patterns. If the indexing pattern of the local array is statically determined and there is sufficient register space, the compiler will store the array in registers; otherwise it will be stored in local memory. See this question and answer pair for more information.