I tried to implement vector addition using CUDA Unified Memory. Here is my code:
#include<stdio.h>
#include<cuda.h>
#define n 10
__global__ void vec_add(float *c, float *a, float *b, int n){
    int i;
    //Get global thread ID
    i = blockDim.x*blockIdx.x+threadIdx.x;
    if(i<n){
        c[i] = a[i] + b[i];
    }
}
int main(int argc, char* argv[]){
    int thread_count;
    float *a, *b, *c;
    thread_count = strtol(argv[1], NULL, 10);
    cudaMallocManaged(&c, n*sizeof(float));
    cudaMallocManaged(&a, n*sizeof(float));
    cudaMallocManaged(&b, n*sizeof(float));
    for(int i=0; i<n; i++){
        a[i]=1.0;
        b[i]=2.0;
    }
    //Launch Kernel
    vec_add<<<1,thread_count>>>(c, a, b, n);
    //Synchronize threads
    cudaDeviceSynchronize();
    for(int i=0; i<n; i++){
        printf("%f + %f =%f\n", a[i], b[i], c[i]);
    }
    cudaFree(c);
    cudaFree(a);
    cudaFree(b);
    return 0;
}
I got an error when I tried to run the code: expected a ")". I could not find any problem with the parentheses. How can I fix this error? I would also like a brief description of how to structure a CUDA program that uses unified memory.
Here is the brief description.
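As a rough sketch of that structure, a unified memory program usually follows six steps: allocate with cudaMallocManaged, initialize the data on the host, launch the kernel, synchronize, read the results on the host through the same pointers, and free the memory. The example below is one possible layout, not a definitive one. The names len, ok, and run_vec_add are illustrative and do not come from the question, and the default of 32 threads per block is an assumption.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Kernel: one thread per element. The length parameter is called len
// (an illustrative name) instead of n.
__global__ void vec_add(float *c, const float *a, const float *b, int len) {
    int i = blockDim.x * blockIdx.x + threadIdx.x;  // global thread ID
    if (i < len) {
        c[i] = a[i] + b[i];
    }
}

// Hypothetical helper: print a CUDA error and report whether the call succeeded.
static bool ok(cudaError_t err, const char *what) {
    if (err != cudaSuccess) {
        fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        return false;
    }
    return true;
}

// Hypothetical driver: returns 0 on success, 1 on a CUDA error or a wrong sum.
int run_vec_add(int len, int threads_per_block) {
    float *a = NULL, *b = NULL, *c = NULL;
    int status = 1;

    // Step 1: allocate managed memory, accessible from both host and device.
    if (ok(cudaMallocManaged(&a, len * sizeof(float)), "cudaMallocManaged(a)") &&
        ok(cudaMallocManaged(&b, len * sizeof(float)), "cudaMallocManaged(b)") &&
        ok(cudaMallocManaged(&c, len * sizeof(float)), "cudaMallocManaged(c)")) {

        // Step 2: initialize the inputs directly on the host (no cudaMemcpy).
        for (int i = 0; i < len; i++) {
            a[i] = 1.0f;
            b[i] = 2.0f;
        }

        // Step 3: launch enough blocks to cover all len elements.
        int blocks = (len + threads_per_block - 1) / threads_per_block;
        vec_add<<<blocks, threads_per_block>>>(c, a, b, len);

        // Step 4: check the launch, then wait for the kernel to finish
        // before the host touches the managed data again.
        if (ok(cudaGetLastError(), "kernel launch") &&
            ok(cudaDeviceSynchronize(), "cudaDeviceSynchronize")) {
            // Step 5: read the results on the host through the same pointers.
            status = 0;
            for (int i = 0; i < len; i++) {
                printf("%f + %f = %f\n", a[i], b[i], c[i]);
                if (c[i] != 3.0f) status = 1;
            }
        }
    }

    // Step 6: release the managed allocations (cudaFree accepts NULL).
    cudaFree(a);
    cudaFree(b);
    cudaFree(c);
    return status;
}

int main(int argc, char *argv[]) {
    // Threads per block from the command line; 32 is an assumed default.
    int threads = (argc > 1) ? (int)strtol(argv[1], NULL, 10) : 32;
    if (threads < 1) threads = 32;
    return run_vec_add(10, threads);
}
```

Computing the block count from the array length means every element is processed even when the requested thread count is smaller than the array, which a fixed single-block launch would not guarantee.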
The problem you have is the combination of the macro definition (#define n 10) and the parameter list of your kernel, which ends in int n.
You may not know how a C++ preprocessor macro (#define) works. It creates a text substitution that the preprocessor performs before the code is compiled. So what you are telling the preprocessor to do is to replace every later occurrence of n with 10, which changes the last parameter in your kernel definition line from int n to int 10. And of course that is not valid C++ syntax for a function definition, which is why the compiler reports that it expected a ")". One possible way to fix this would be to change your variable name in the (kernel) function definition to be something other than n, and to use the new name in the kernel body as well.
Even though this happens to be a CUDA kernel definition, the problem here would be exactly the same if you wrote an ordinary function definition and used n as one of the function parameters. This is related to C++ understanding, not anything specific or unique to CUDA.
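To illustrate that last point, here is a minimal sketch in plain C++ with no CUDA involved (it should also compile unchanged as host code under nvcc). It keeps the macro from the question and shows, in a comment, the ordinary function that would fail in the same way. The names sum_first, sum_of_ones, and count are hypothetical and chosen only for this example.

```cpp
#include <cstdio>

#define n 10  // same macro as in the question: every later token n becomes 10

// This ordinary function would NOT compile while the macro is active,
// because the preprocessor rewrites the parameter "int n" into "int 10":
//
//     float sum_first(const float *v, int n) { ... }
//     // after preprocessing:
//     float sum_first(const float *v, int 10) { ... }
//
// Renaming the parameter (count is an illustrative name) avoids the clash.
float sum_first(const float *v, int count) {
    float total = 0.0f;
    for (int i = 0; i < count; i++) {
        total += v[i];
    }
    return total;
}

// Sums an array of ones. Here n is still the macro, so the array has
// 10 elements and sum_first is called with the literal 10.
float sum_of_ones() {
    float ones[n];
    for (int i = 0; i < n; i++) {
        ones[i] = 1.0f;
    }
    return sum_first(ones, n);
}

#undef n  // keep the macro from leaking into any code that follows
```

Calling sum_of_ones() returns 10.0f, since the macro expands to 10 wherever it is used as a value. The macro only causes trouble where the language requires an identifier, such as a parameter name.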