I am insterested in using PGI OpenACC runtime API directly from code compiled by GCC.
I've noticed that the PGI OpenACC installation provides two openacc.h
headers. One for PGI (located in include/openacc.h
) and another that seems to be compatible with GCC (etc/include_acc/openacc.h
). It is safe to use the second header with GCC?
So far I've been able to compile & run a small test:
#include <openacc.h>
#include <cuda_runtime_api.h>
#include <stdio.h>
int main()
{
acc_init( acc_device_nvidia );
int ndev = acc_get_num_devices( acc_device_nvidia );
printf("Num OpenACC devices: %d\n", ndev);
cudaGetDeviceCount(&ndev);
printf("Num CUDA devices: %d\n", ndev);
return 0;
}
Using PGI:
pgcc -acc -ta=tesla,cuda8.0 -Mcuda ./test.c -o oacc_test.pgi
Using GCC + PGI OpenACC:
gcc -isystem /usr/local/cuda-8.0/include -isystem /usr/local/pgi/linux86-64/17.4/etc/include_acc -o oacc_test.both test.c -L/usr/local/cuda-8.0/lib64 -Wl,-rpath,/usr/local/cuda-8.0/lib64 -lcudart -lcuda -L/usr/local/pgi/linux86-64/17.4/lib -Wl,-rpath,/usr/local/pgi/linux86-64/17.4/lib -laccapi -laccg -laccnc -laccn -laccg2 -ldl -lpgc -lm
Using GCC + GCC OpenACC: (for comparison)
gcc -fopenacc -isystem /usr/local/cuda-8.0/include -o oacc_test.gnu test.c -L/usr/local/cuda-8.0/lib64 -Wl,-rpath,/usr/local/cuda-8.0/lib64 -lcudart -lcuda
And execute:
$ ./oacc_test.pgi
Num OpenACC devices: 4
Num CUDA devices: 4
$ ./oacc_test.both
Num OpenACC devices: 4
Num CUDA devices: 4
$ ./oacc_test.gnu
libgomp: device type nvidia not supported
More info:
$ ldd oacc_test.pgi
linux-vdso.so.1 (0x00007ffd843f8000)
libaccapi.so => /usr/local/pgi/linux86-64/17.4/lib/libaccapi.so (0x00007fa5a2b9f000)
libaccg.so => /usr/local/pgi/linux86-64/17.4/lib/libaccg.so (0x00007fa5a2981000)
libaccnc.so => /usr/local/pgi/linux86-64/17.4/lib/libaccnc.so (0x00007fa5a2777000)
libaccn.so => /usr/local/pgi/linux86-64/17.4/lib/libaccn.so (0x00007fa5a2552000)
libaccg2.so => /usr/local/pgi/linux86-64/17.4/lib/libaccg2.so (0x00007fa5a233c000)
libcudapgi.so => /usr/local/pgi/linux86-64/17.4/lib/libcudapgi.so (0x00007fa5a213b000)
libcudart.so.8.0 => /usr/local/cuda/lib64/libcudart.so.8.0 (0x00007fa5a1ed5000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fa5a1b49000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fa5a1945000)
libcudadevice.so => /usr/local/pgi/linux86-64/17.4/lib/libcudadevice.so (0x00007fa5a1731000)
libpgmp.so => /usr/local/pgi/linux86-64/17.4/lib/libpgmp.so (0x00007fa5a14af000)
libnuma.so => /usr/local/pgi/linux86-64/17.4/lib/libnuma.so (0x00007fa5a12ae000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fa5a1091000)
libpgc.so => /usr/local/pgi/linux86-64/17.4/lib/libpgc.so (0x00007fa5a0dae000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fa5a0aaa000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fa5a070b000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fa5a04f2000)
/lib64/ld-linux-x86-64.so.2 (0x000055767be3b000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fa5a02ea000)
$ ldd oacc_test.both
linux-vdso.so.1 (0x00007ffe55753000)
libcudart.so.8.0 => /usr/local/cuda/lib64/libcudart.so.8.0 (0x00007f7ddfe3c000)
libcuda.so.1 => /usr/lib/x86_64-linux-gnu/libcuda.so.1 (0x00007f7ddf3d8000)
libaccapi.so => /usr/local/pgi/linux86-64/17.4/lib/libaccapi.so (0x00007f7ddf1b8000)
libaccg.so => /usr/local/pgi/linux86-64/17.4/lib/libaccg.so (0x00007f7ddef9a000)
libaccnc.so => /usr/local/pgi/linux86-64/17.4/lib/libaccnc.so (0x00007f7dded90000)
libaccn.so => /usr/local/pgi/linux86-64/17.4/lib/libaccn.so (0x00007f7ddeb69000)
libaccg2.so => /usr/local/pgi/linux86-64/17.4/lib/libaccg2.so (0x00007f7dde955000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f7dde751000)
libpgc.so => /usr/local/pgi/linux86-64/17.4/lib/libpgc.so (0x00007f7dde46e000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f7dde16a000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f7ddddcb000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f7dddbac000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f7ddd9a4000)
libnvidia-fatbinaryloader.so.378.13 => /usr/lib/x86_64-linux-gnu/libnvidia-fatbinaryloader.so.378.13 (0x00007f7ddd753000)
/lib64/ld-linux-x86-64.so.2 (0x00005593f06f5000)
$ ldd oacc_test.gnu
linux-vdso.so.1 (0x00007ffd967d7000)
libcudart.so.8.0 => /usr/local/cuda/lib64/libcudart.so.8.0 (0x00007f9002679000)
libcuda.so.1 => /usr/lib/x86_64-linux-gnu/libcuda.so.1 (0x00007f9001c15000)
libgomp.so.1 => /usr/lib/x86_64-linux-gnu/libgomp.so.1 (0x00007f90019e8000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f90017cb000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f900142c000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f9001226000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f900101e000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f9000d1a000)
libnvidia-fatbinaryloader.so.378.13 => /usr/lib/x86_64-linux-gnu/libnvidia-fatbinaryloader.so.378.13 (0x00007f9000ac9000)
/lib64/ld-linux-x86-64.so.2 (0x0000563eee684000)
Is is safe to use the PGI OpenACC Runtime API that way?
Also threre is any difference between the CUDA runtime provided by Nvidia (usually in /usr/local/cuda
) and the one provided by PGI (in my case in /usr/local/pgi/linux86-64/2017/cuda
)?
I've noticed that pgcc
uses the CUDA 7.5 from it's own install path but when -ta=cuda8.0
is provided it uses the one in /usr/local/cuda
. Any special reason?
PGI compiled objects are interoperable with GNU and it's fine to mix in PGI OpenACC compiled code with GNU compiled objects. Though, the OpenACC runtimes libraries aren't compatible so I'd recommend not mixing the OpenACC code. Note that GNU support for OpenACC has gotten a lot better in their 7.0 release, so while I work for PGI, I'd encourage you to try both compilers. The one caveat is the they (GNU) don't support the "kernels" construct, so you'll want to stick to using "parallel" regions.
As for the CUDA libraries, PGI ships all the libraries that we need to compile your OpenACC code. Though, there's no difference in the CUDA libraries themselves. We didn't want users to have to co-install the CUDA SDK and it allows us to add convenience flags such as "-Mcudalib[=cublas|cufft|curand|cusolver|cusparse]" since we know where these libraries are located as well as include our own Fortran interface modules to these libraries.
Unless you have the flag "CUDAROOT=" set on your compilation line, "-ta=tesla:cuda8.0" should be using the PGI supplied CUDA 8.0 directory located in "$PGI/linux86-64/2017/cuda/8.0". Are you sure it's using the /usr/local/cuda install? You can double check by adding the verbose flag (-v) to see what the compiler driver is executing or "-dryrun" to see the commands without having the driver execute them.
Another possibility is that you're using "-L" or "-Wl" flags to point to the CUDA install (like you do with GNU) in which case the linker will pick-up the CUDA libraries from these directories. Though since they're the same libraries as we ship, it shouldn't be a problem.