The following operations were performed on an HPC cloud system.
NAME="CentOS Linux"
VERSION="7 (Core)"
CUDA Info
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.47.03    Driver Version: 510.47.03    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A100-PCI...  Off  | 00000000:41:00.0 Off |                    0 |
| N/A   33C    P0    39W / 250W |      0MiB / 40960MiB |      0%      Default |
|                               |                      |             Disabled |
----
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Mar_21_19:15:46_PDT_2021
Cuda compilation tools, release 11.3, V11.3.58
Build cuda_11.3.r11.3/compiler.29745058_0
- With tensorflow-gpu==1.15.0 installed, the GPU is detected.
- With tensorflow==2.12 installed, the GPU is NOT detected:
>>> import tensorflow as tf
>>> tf.config.list_physical_devices(device_type=None)
[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU')]
or
from tensorflow.python.client import device_lib

def get_available_devices():
    local_device_protos = device_lib.list_local_devices()
    return [x.name for x in local_device_protos]

print(get_available_devices())  # ['/device:CPU:0']
Although tensorflow-gpu==1.15.0 detects the GPU, I need to use TF 2.
Please help me figure this out, and let me know if any additional information is required.
I tried the same steps on another HPC system with CUDA 12.2, and it has the same issue.
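
In case it helps narrow things down, here is a minimal diagnostic sketch that could be run inside the TF 2.12 environment. It checks whether the dynamic loader can open the CUDA libraries and prints which CUDA/cuDNN versions the installed TensorFlow wheel reports it was built against. The library names in CANDIDATE_LIBS are assumptions (typical sonames for a CUDA 11 build), and probe_libraries and report are hypothetical helper names, so they may need adjusting to match the versions the build info prints.

```python
import ctypes

# Assumed sonames: what a CUDA 11 build of TensorFlow 2.x typically tries to load.
# Adjust them to match the versions printed by the build info in report().
CANDIDATE_LIBS = [
    "libcuda.so.1",       # NVIDIA driver library
    "libcudart.so.11.0",  # CUDA runtime
    "libcublas.so.11",    # cuBLAS
    "libcudnn.so.8",      # cuDNN
]


def probe_libraries(names):
    """Return {name: bool}: whether the dynamic loader can open each library."""
    results = {}
    for name in names:
        try:
            ctypes.CDLL(name)
            results[name] = True
        except OSError:
            # Raised when the library is not on the loader search path.
            results[name] = False
    return results


def report():
    """Print library visibility and, if TensorFlow imports, its build info."""
    for name, found in probe_libraries(CANDIDATE_LIBS).items():
        print(f"{name}: {'found' if found else 'NOT found'}")

    try:
        import tensorflow as tf
    except ImportError:
        print("tensorflow is not importable in this environment")
        return

    info = tf.sysconfig.get_build_info()
    print("TF version:", tf.__version__)
    print("built with CUDA:", tf.test.is_built_with_cuda())
    print("expected CUDA:", info.get("cuda_version"))
    print("expected cuDNN:", info.get("cudnn_version"))
    print("GPUs:", tf.config.list_physical_devices("GPU"))


if __name__ == "__main__":
    report()
```

If a library is reported as NOT found, the loader search path (for example LD_LIBRARY_PATH or the loaded environment modules) probably does not include it. The "expected CUDA" value can also be compared against the toolkit version that nvcc reports above (11.3).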