Linking PGI OpenACC Runtime Library directly with gcc


I am interested in using the PGI OpenACC runtime API directly from code compiled with GCC.

I've noticed that the PGI OpenACC installation provides two openacc.h headers: one for PGI (located in include/openacc.h) and another that seems to be compatible with GCC (etc/include_acc/openacc.h). Is it safe to use the second header with GCC?

So far I've been able to compile & run a small test:

#include <openacc.h>
#include <cuda_runtime_api.h>
#include <stdio.h>

int main()
{
   acc_init( acc_device_nvidia );

   int ndev = acc_get_num_devices( acc_device_nvidia );

   printf("Num OpenACC devices: %d\n", ndev);

   cudaGetDeviceCount(&ndev);

   printf("Num CUDA devices: %d\n", ndev);

   return 0;
}

Using PGI:

pgcc -acc -ta=tesla,cuda8.0 -Mcuda ./test.c -o oacc_test.pgi

Using GCC + PGI OpenACC:

gcc -isystem /usr/local/cuda-8.0/include -isystem /usr/local/pgi/linux86-64/17.4/etc/include_acc -o oacc_test.both test.c -L/usr/local/cuda-8.0/lib64 -Wl,-rpath,/usr/local/cuda-8.0/lib64 -lcudart -lcuda -L/usr/local/pgi/linux86-64/17.4/lib -Wl,-rpath,/usr/local/pgi/linux86-64/17.4/lib -laccapi -laccg -laccnc -laccn -laccg2 -ldl -lpgc -lm

Using GCC + GCC OpenACC: (for comparison)

gcc -fopenacc -isystem /usr/local/cuda-8.0/include -o oacc_test.gnu test.c -L/usr/local/cuda-8.0/lib64 -Wl,-rpath,/usr/local/cuda-8.0/lib64 -lcudart -lcuda

And execute:

$ ./oacc_test.pgi 
Num OpenACC devices: 4
Num CUDA devices: 4
$ ./oacc_test.both 
Num OpenACC devices: 4
Num CUDA devices: 4
$ ./oacc_test.gnu 

libgomp: device type nvidia not supported

More info:

$ ldd oacc_test.pgi 
    linux-vdso.so.1 (0x00007ffd843f8000)
    libaccapi.so => /usr/local/pgi/linux86-64/17.4/lib/libaccapi.so (0x00007fa5a2b9f000)
    libaccg.so => /usr/local/pgi/linux86-64/17.4/lib/libaccg.so (0x00007fa5a2981000)
    libaccnc.so => /usr/local/pgi/linux86-64/17.4/lib/libaccnc.so (0x00007fa5a2777000)
    libaccn.so => /usr/local/pgi/linux86-64/17.4/lib/libaccn.so (0x00007fa5a2552000)
    libaccg2.so => /usr/local/pgi/linux86-64/17.4/lib/libaccg2.so (0x00007fa5a233c000)
    libcudapgi.so => /usr/local/pgi/linux86-64/17.4/lib/libcudapgi.so (0x00007fa5a213b000)
    libcudart.so.8.0 => /usr/local/cuda/lib64/libcudart.so.8.0 (0x00007fa5a1ed5000)
    libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fa5a1b49000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fa5a1945000)
    libcudadevice.so => /usr/local/pgi/linux86-64/17.4/lib/libcudadevice.so (0x00007fa5a1731000)
    libpgmp.so => /usr/local/pgi/linux86-64/17.4/lib/libpgmp.so (0x00007fa5a14af000)
    libnuma.so => /usr/local/pgi/linux86-64/17.4/lib/libnuma.so (0x00007fa5a12ae000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fa5a1091000)
    libpgc.so => /usr/local/pgi/linux86-64/17.4/lib/libpgc.so (0x00007fa5a0dae000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fa5a0aaa000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fa5a070b000)
    libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fa5a04f2000)
    /lib64/ld-linux-x86-64.so.2 (0x000055767be3b000)
    librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fa5a02ea000)

$ ldd oacc_test.both 
    linux-vdso.so.1 (0x00007ffe55753000)
    libcudart.so.8.0 => /usr/local/cuda/lib64/libcudart.so.8.0 (0x00007f7ddfe3c000)
    libcuda.so.1 => /usr/lib/x86_64-linux-gnu/libcuda.so.1 (0x00007f7ddf3d8000)
    libaccapi.so => /usr/local/pgi/linux86-64/17.4/lib/libaccapi.so (0x00007f7ddf1b8000)
    libaccg.so => /usr/local/pgi/linux86-64/17.4/lib/libaccg.so (0x00007f7ddef9a000)
    libaccnc.so => /usr/local/pgi/linux86-64/17.4/lib/libaccnc.so (0x00007f7dded90000)
    libaccn.so => /usr/local/pgi/linux86-64/17.4/lib/libaccn.so (0x00007f7ddeb69000)
    libaccg2.so => /usr/local/pgi/linux86-64/17.4/lib/libaccg2.so (0x00007f7dde955000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f7dde751000)
    libpgc.so => /usr/local/pgi/linux86-64/17.4/lib/libpgc.so (0x00007f7dde46e000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f7dde16a000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f7ddddcb000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f7dddbac000)
    librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f7ddd9a4000)
    libnvidia-fatbinaryloader.so.378.13 => /usr/lib/x86_64-linux-gnu/libnvidia-fatbinaryloader.so.378.13 (0x00007f7ddd753000)
    /lib64/ld-linux-x86-64.so.2 (0x00005593f06f5000)

$ ldd oacc_test.gnu 
    linux-vdso.so.1 (0x00007ffd967d7000)
    libcudart.so.8.0 => /usr/local/cuda/lib64/libcudart.so.8.0 (0x00007f9002679000)
    libcuda.so.1 => /usr/lib/x86_64-linux-gnu/libcuda.so.1 (0x00007f9001c15000)
    libgomp.so.1 => /usr/lib/x86_64-linux-gnu/libgomp.so.1 (0x00007f90019e8000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f90017cb000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f900142c000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f9001226000)
    librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f900101e000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f9000d1a000)
    libnvidia-fatbinaryloader.so.378.13 => /usr/lib/x86_64-linux-gnu/libnvidia-fatbinaryloader.so.378.13 (0x00007f9000ac9000)
    /lib64/ld-linux-x86-64.so.2 (0x0000563eee684000)

Is it safe to use the PGI OpenACC runtime API that way?

Also, is there any difference between the CUDA runtime provided by NVIDIA (usually in /usr/local/cuda) and the one provided by PGI (in my case in /usr/local/pgi/linux86-64/2017/cuda)? I've noticed that pgcc uses the CUDA 7.5 from its own install path, but when -ta=cuda8.0 is provided it uses the one in /usr/local/cuda. Is there any special reason for that?

Answer by Mat Colgrove (accepted):

PGI-compiled objects are interoperable with GNU, so it's fine to mix PGI OpenACC-compiled code with GNU-compiled objects. However, the OpenACC runtime libraries themselves aren't compatible, so I'd recommend not mixing OpenACC code compiled by both compilers. Note that GNU support for OpenACC has gotten a lot better in their 7.0 release, so while I work for PGI, I'd encourage you to try both compilers. The one caveat is that they (GNU) don't support the "kernels" construct, so you'll want to stick to using "parallel" regions.
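
For example, a region written like the sketch below (illustrative only, not code from the question) works with both compilers because it uses "parallel loop" rather than "kernels":

#include <openacc.h>

/* Illustrative sketch: a vector add written with "parallel loop",
   which both PGI and GNU OpenACC handle, rather than the "kernels"
   construct that GNU does not support. */
void vadd(int n, const float *a, const float *b, float *c)
{
    #pragma acc parallel loop copyin(a[0:n], b[0:n]) copyout(c[0:n])
    for (int i = 0; i < n; ++i)
        c[i] = a[i] + b[i];
}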

As for the CUDA libraries, PGI ships all the libraries that we need to compile your OpenACC code; there's no difference in the CUDA libraries themselves. We didn't want users to have to co-install the CUDA SDK, and shipping them lets us add convenience flags such as "-Mcudalib[=cublas|cufft|curand|cusolver|cusparse]" (since we know where these libraries are located) as well as include our own Fortran interface modules for these libraries.
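
For instance, a command along these lines (file names hypothetical) links against the PGI-shipped cuBLAS without pointing at a separate CUDA install:

pgcc -acc -ta=tesla:cuda8.0 -Mcudalib=cublas my_blas_app.c -o my_blas_app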

Unless you have the flag "CUDAROOT=" set on your compilation line, "-ta=tesla:cuda8.0" should be using the PGI supplied CUDA 8.0 directory located in "$PGI/linux86-64/2017/cuda/8.0". Are you sure it's using the /usr/local/cuda install? You can double check by adding the verbose flag (-v) to see what the compiler driver is executing or "-dryrun" to see the commands without having the driver execute them.
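
For example, something like the following (flags from the answer, file name from the question) prints the commands the driver would execute, including which CUDA directory is handed to the linker:

pgcc -acc -ta=tesla:cuda8.0 -dryrun test.c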

Another possibility is that you're using "-L" or "-Wl" flags to point at the CUDA install (as you do with GNU), in which case the linker will pick up the CUDA libraries from those directories. Though since they're the same libraries as the ones we ship, it shouldn't be a problem.