TensorFlow is unable to find the GPU


The following steps were performed on an HPC cloud system.

NAME="CentOS Linux"
VERSION="7 (Core)"

CUDA Info

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.47.03    Driver Version: 510.47.03    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A100-PCI...  Off  | 00000000:41:00.0 Off |                    0 |
| N/A   33C    P0    39W / 250W |      0MiB / 40960MiB |      0%      Default |
|                               |                      |             Disabled |


----
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Mar_21_19:15:46_PDT_2021
Cuda compilation tools, release 11.3, V11.3.58
Build cuda_11.3.r11.3/compiler.29745058_0

  1. If I install tensorflow-gpu==1.15.0, the GPU is found.
  2. If I install tensorflow==2.12, the GPU is NOT found:

>>> import tensorflow as tf
>>> tf.config.list_physical_devices(device_type=None)
[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU')]

or

from tensorflow.python.client import device_lib

def get_available_devices():
    local_device_protos = device_lib.list_local_devices()
    return [x.name for x in local_device_protos]

print(get_available_devices()) # ['/device:CPU:0']

Although tensorflow-gpu==1.15.0 detects the GPU, I need to use TensorFlow 2.

Please help me figure this out. If any additional information is required, let me know.

I tried the same steps on another HPC system with CUDA 12.2 and ran into the same issue.
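
In case it helps narrow things down, here is a minimal diagnostic sketch that could be run in the TF 2 environment. It is not a confirmed fix. It only collects facts that commonly explain a CPU-only device list: whether the installed wheel was built with CUDA, which CUDA/cuDNN versions it was built against, and which library path and device mask are set in the environment. The helper name `collect_gpu_diagnostics` is hypothetical, and the sketch assumes TensorFlow 2.3 or later, where `tf.sysconfig.get_build_info()` is available.

```python
import importlib.util
import os


def collect_gpu_diagnostics():
    """Collect facts that may explain why TensorFlow 2 lists only the CPU.

    Hypothetical helper for illustration; assumes TensorFlow >= 2.3.
    """
    report = {
        # True if a tensorflow package can be located, without importing it.
        "tensorflow_installed": importlib.util.find_spec("tensorflow") is not None,
        # Directories the dynamic loader searches for libcudart, libcudnn, etc.
        "LD_LIBRARY_PATH": os.environ.get("LD_LIBRARY_PATH", ""),
        # An empty string or "-1" here hides every GPU from the process.
        "CUDA_VISIBLE_DEVICES": os.environ.get("CUDA_VISIBLE_DEVICES"),
    }
    if not report["tensorflow_installed"]:
        return report

    import tensorflow as tf

    report["tf_version"] = tf.__version__
    report["built_with_cuda"] = tf.test.is_built_with_cuda()

    # Build info records the CUDA/cuDNN versions the wheel was compiled against;
    # CPU-only builds may not have these keys, hence .get().
    build = tf.sysconfig.get_build_info()
    report["expected_cuda"] = build.get("cuda_version")
    report["expected_cudnn"] = build.get("cudnn_version")

    report["gpus"] = [d.name for d in tf.config.list_physical_devices("GPU")]
    return report


if __name__ == "__main__":
    for key, value in collect_gpu_diagnostics().items():
        print(f"{key}: {value}")
```

If `expected_cuda` differs from the toolkit shown by `nvcc` (11.3 here), or `LD_LIBRARY_PATH` does not include the directories holding the matching CUDA and cuDNN libraries, that mismatch would be one plausible cause to investigate. TensorFlow's own import-time log messages about libraries it could not load would also be worth adding to the question.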
