Why can't I print inside a kernel in PyCUDA?

Question

49 views Asked by KansaiRobot At 19 July 2023 at 03:21

My code starts as follows:

import numpy as np
from pycuda import driver, gpuarray
from pycuda.compiler import SourceModule
import pycuda.autoinit
MATRIX_SIZE = 3  
matrix_mul_kernel = """
__global__ void Matrix_Mul_Kernel(float *d_a, float *d_b, float *d_c)
{
      int tx = threadIdx.x;
      int ty = threadIdx.y;
      float value = 0;

      int s=5;

      printf("X %d Y \\n",s);
  
      for (int i = 0; i < %(MATRIX_SIZE)s; ++i) {
          float d_a_element = d_a[ty * %(MATRIX_SIZE)s + i];
          float d_b_element = d_b[i * %(MATRIX_SIZE)s + tx];
           value += d_a_element * d_b_element;
       }
 
       d_c[ty * %(MATRIX_SIZE)s + tx] = value;
   } """
  
matrix_mul = matrix_mul_kernel % {'MATRIX_SIZE': MATRIX_SIZE}
  
mod = SourceModule(matrix_mul)

The problem is the printf call inside the kernel. With printf("hello"); everything works, but as soon as I try to print an integer (I wanted to print tx and ty, but any integer would do), this error appears:

Traceback (most recent call last):
  File "/media/cbe421fe-1303-4821-9392-a849bfdd00e2/MyStudy/PyCuda/9_matrix_mul.py", line 26, in <module>
    matrix_mul = matrix_mul_kernel % {'MATRIX_SIZE': MATRIX_SIZE}
TypeError: %d format: a number is required, not dict

Why is this code failing?

Previously, when no constant was used, I could print the thread's x and y indices.

EDIT: Even stranger, when I do this:

printf("X %s Y \\n",5);

it does not fail, but prints this:

X {'MATRIX_SIZE': 3} Y 
X {'MATRIX_SIZE': 3} Y 
X {'MATRIX_SIZE': 3} Y 
X {'MATRIX_SIZE': 3} Y 
X {'MATRIX_SIZE': 3} Y 
X {'MATRIX_SIZE': 3} Y 
X {'MATRIX_SIZE': 3} Y 
X {'MATRIX_SIZE': 3} Y 
X {'MATRIX_SIZE': 3} Y

So apparently, no matter which variable I pass, it is always interpreted as the dictionary {'MATRIX_SIZE': 3}, hence the error. The question is why?

What is happening here?

**eten** · Answer 1 · 2023-07-19T07:00:50+00:00

The issue is that your printf call uses the same format specifier syntax (%d) as Python's %-style string interpolation, so Python tries to process it when you apply % to the kernel string. From Python's documentation:

When the right argument is a dictionary (or other mapping type), then the formats in the string must include a parenthesised mapping key into that dictionary inserted immediately after the '%' character.
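
A minimal sketch of that rule in plain Python, with no GPU involved. The one-line format strings are stand-ins for the kernel source, and the exact wording of the TypeError varies between Python versions, so it is captured rather than hard-coded:

```python
params = {'MATRIX_SIZE': 3}

# A keyed specifier looks up its key in the mapping.
keyed = "for (int i = 0; i < %(MATRIX_SIZE)s; ++i)" % params
print(keyed)

# An unkeyed %s receives the whole mapping as its argument,
# which is why the dictionary shows up in the printf output.
unkeyed_s = 'printf("X %s Y");' % params
print(unkeyed_s)

# An unkeyed %d also receives the whole mapping, and a dict is not a number.
try:
    'printf("X %d Y");' % params
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)
```

In other words, the substitution happens in Python before the CUDA compiler ever sees the source, so the 5 passed to printf never reaches the %s or %d.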

To avoid mixing Python's and CUDA's format specifiers, you can use Python's newer string formatting:

matrix_mul = matrix_mul_kernel.format(MATRIX_SIZE=3)

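Wherever you need MATRIX_SIZE, write {MATRIX_SIZE}. Because str.format treats every brace as part of a replacement field, the literal braces of the C code must then be doubled ({{ and }}). Alternatively, keep the % interpolation and write the printf specifier as %%d, which Python emits as a literal %d.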
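
Two ways to build the kernel source so that printf keeps its %d could look like the sketch below. Print_Kernel is a hypothetical, shortened kernel used only for illustration, and the snippet only builds the source strings; it is assumed that either string would then be passed to SourceModule as in the question.

```python
MATRIX_SIZE = 3

# Option A: keep %-interpolation and escape the printf specifier as %%d.
# Python turns %%d into a literal %d for the CUDA compiler.
kernel_percent = """
__global__ void Print_Kernel()
{
    int tx = threadIdx.x;
    printf("tx %%d of %(MATRIX_SIZE)s\\n", tx);
}
"""
source_a = kernel_percent % {'MATRIX_SIZE': MATRIX_SIZE}

# Option B: use str.format. The %d is left alone, but every literal
# C brace must be doubled so that format() does not treat it as a field.
kernel_format = """
__global__ void Print_Kernel()
{{
    int tx = threadIdx.x;
    printf("tx %d of {MATRIX_SIZE}\\n", tx);
}}
"""
source_b = kernel_format.format(MATRIX_SIZE=MATRIX_SIZE)

# Both options produce the same CUDA source text.
print(source_a)
print(source_a == source_b)
```

Option A needs fewer changes when the kernel has many braces and few printf calls; option B is the reverse.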
