I don't have an HD5850, but how can I find out its maximum work-group size for OpenCL? What is the preferred floating-point vector width for the HD5850? I suspected it was 5, but that did not work on a friend's computer that has a 5850. I tried a width of 4, but it was not fast enough, so now I suspect the work-group size is not optimal. I am doing an N-body simulation for 25k, 50k and 100k particles, using float8 variables for x, y, z, vx, vy, vz.
Thanks.
Use clGetDeviceInfo to poll for CL_DEVICE_MAX_WORK_GROUP_SIZE. I think the 5850 will have this at 256, but that may not be optimal for your kernel.
Use the same call to query CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT, which I think is 4 on that card.