clrobj(<class name>) does not have llvm when passing array of struct to GPU Kernel (ALEA Library)

I am getting the build error "Fody/Alea.CUDA: clrobj(cGPU) does not have llvm" for code in which I try to pass an array of structs to an NVIDIA GPU kernel using the Alea library. Here is a simplified version of my code. I removed the output-gathering functionality to keep the code simple. For the moment, I just need to be able to send the array of structs to the GPU.

using System;
using Alea.CUDA;
using Alea.CUDA.Utilities;
using Alea.CUDA.IL;

namespace GPUProgramming
{
  // Declared as a class (a reference type); this is the type named in the build error.
  public class cGPU
  {
    public int Slice;
    public float FloatValue;
  }

  [AOTCompile(AOTOnly = true)]
  public class TestModule : ILGPUModule
  {
    public TestModule(GPUModuleTarget target) : base(target)
    {
    }

    const int blockSize = 64;

    // Copies each element's fields into per-block shared memory.
    public void Kernel2(deviceptr<cGPU> Data, int n)
    {
      var start = blockIdx.x * blockDim.x + threadIdx.x;
      int ind = threadIdx.x;

      var sharedSlice = __shared__.Array<int>(64);
      var sharedFloatValue = __shared__.Array<float>(64);

      if (ind < n && start < n)
      {
        sharedSlice[ind] = Data[start].Slice;
        sharedFloatValue[ind] = Data[start].FloatValue;
      }
    }

    // Launches Kernel2 on data that is already in device memory.
    public void Test2(deviceptr<cGPU> Data, int n, int NumOfBlocks)
    {
      var GridDim = new dim3(NumOfBlocks, 1);
      var BlockDim = new dim3(64, 1);

      try
      {
        var lp = new LaunchParam(GridDim, BlockDim);
        GPULaunch(Kernel2, lp, Data, n);
      }
      catch (CUDAInterop.CUDAException x)
      {
        var code = x.Data0;
        Console.WriteLine("ErrorCode = {0}", code);
      }
    }

    // Copies the host array to the device, then launches the kernel.
    public void Test2(cGPU[] Data)
    {
      int NumOfBlocks = Common.divup(Data.Length, blockSize);
      using (var d_Slice = GPUWorker.Malloc(Data))
      {
        try
        {
          // The original listing called Test_Kernel2 here, which is not defined above;
          // the deviceptr overload of Test2 is assumed to be the intended target.
          Test2(d_Slice.Ptr, Data.Length, NumOfBlocks);
        }
        catch (CUDAInterop.CUDAException x)
        {
          var code = x.Data0;
          Console.WriteLine("ErrorCode = {0}", code);
        }
      }
    }
  }
}

There is 1 answer

Xiang Zhang (BEST ANSWER)

Your data type is a class, which is a reference type. Try using a struct instead. Reference types don't fit the GPU well, since they require allocating small blocks of memory on the heap.
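
To illustrate the suggestion, here is a minimal sketch of the same two fields declared as a struct. It uses only standard .NET calls and does not touch Alea, so it shows the value-type layout rather than confirming that the build error goes away. The `LayoutCheck` class is a hypothetical helper added for illustration, and the explicit `StructLayout` attribute is an assumption made for clarity (sequential layout is already the C# default for structs).

```csharp
using System;
using System.Runtime.InteropServices;

namespace GPUProgramming
{
    // Same fields as the question's cGPU, but declared as a struct (value type).
    // The StructLayout attribute only makes the default sequential layout explicit.
    [StructLayout(LayoutKind.Sequential)]
    public struct cGPU
    {
        public int Slice;
        public float FloatValue;
    }

    // Hypothetical helper, not part of the original code.
    public static class LayoutCheck
    {
        public static void Main()
        {
            // An array of structs stores its elements inline, one after another,
            // instead of storing references to separately allocated heap objects.
            var data = new cGPU[4];
            for (int i = 0; i < data.Length; i++)
            {
                data[i].Slice = i;
                data[i].FloatValue = i * 0.5f;
            }

            Console.WriteLine(typeof(cGPU).IsValueType);       // True
            Console.WriteLine(Marshal.SizeOf(typeof(cGPU)));   // 8 (4-byte int + 4-byte float)
        }
    }
}
```

With this declaration, the `cGPU[]` array passed to `Test2` is a single contiguous block of 8-byte elements. The rest of the question's code (`deviceptr<cGPU>`, `GPUWorker.Malloc(Data)`) should not need to change in shape, though that part is untested here.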