ONNXRuntimeError : 1 : FAIL : Failed to load library libonnxruntime_providers_cuda.so with error: libcurand.so.10

ONNX Failed to create CUDAExecutionProvider.

Environment: venv on Ubuntu 22.04 with Python 3.10

GPU: Quadro P600, driver 535

CUDA 11.8 and cuDNN (both 8.6 and 8.9) are installed on the system.

Packages installed:

absl-py==2.0.0
apturl==0.5.2
astunparse==1.6.3
bcrypt==3.2.0
blinker==1.4
Brlapi==0.8.3
cachetools==5.3.2
certifi==2020.6.20
chardet==4.0.0
click==8.0.3
colorama==0.4.4
coloredlogs==15.0.1
command-not-found==0.3
cryptography==3.4.8
cuda-python==11.8.0
cupshelpers==1.0
Cython==3.0.5
dbus-python==1.2.18
defer==1.0.6
distro==1.7.0
distro-info==1.1+ubuntu0.1
duplicity==0.8.21
fasteners==0.14.1
filelock==3.9.0
flatbuffers==23.5.26
fsspec==2023.4.0
future==0.18.2
gast==0.4.0
google-auth==2.23.4
google-auth-oauthlib==1.0.0
google-pasta==0.2.0
grpcio==1.59.3
h5py==3.10.0
httplib2==0.20.2
humanfriendly==10.0
idna==3.3
importlib-metadata==4.6.4
jeepney==0.7.1
Jinja2==3.1.2
keras==2.14.0
keyring==23.5.0
language-selector==0.1
launchpadlib==1.10.16
lazr.restfulclient==0.14.4
lazr.uri==1.0.6
libclang==16.0.6
lockfile==0.12.2
louis==3.20.0
macaroonbakery==1.3.1
Mako==1.1.3
Markdown==3.5.1
MarkupSafe==2.1.3
ml-dtypes==0.2.0
monotonic==1.6
more-itertools==8.10.0
mpmath==1.3.0
netifaces==0.11.0
networkx==3.0
numpy==1.24.3
nvidia-cublas-cu11==11.11.3.6
nvidia-cuda-cupti-cu11==11.8.87
nvidia-cuda-nvcc-cu11==11.8.89
nvidia-cuda-runtime-cu11==11.8.89
nvidia-cudnn-cu11==8.7.0.84
nvidia-cufft-cu11==10.9.0.58
nvidia-curand-cu11==10.3.0.86
nvidia-cusolver-cu11==11.4.1.48
nvidia-cusparse-cu11==11.7.5.86
nvidia-nccl-cu11==2.16.5
oauthlib==3.2.0
olefile==0.46
onnxruntime-gpu==1.16.3
opencv-python==4.8.1.78
opt-einsum==3.3.0
packaging==23.2
paramiko==2.9.3
pexpect==4.8.0
Pillow==9.0.1
protobuf==4.25.1
ptyprocess==0.7.0
pyasn1==0.5.1
pyasn1-modules==0.3.0
pycairo==1.20.1
pycups==2.0.1
PyGObject==3.42.1
PyJWT==2.3.0
pymacaroons==0.13.0
PyNaCl==1.5.0
pyparsing==2.4.7
pyRFC3339==1.1
python-apt==2.4.0+ubuntu2
python-dateutil==2.8.1
python-debian==0.1.43+ubuntu1.1
pytz==2022.1
pyxdg==0.27
PyYAML==5.4.1
reportlab==3.6.8
requests==2.25.1
requests-oauthlib==1.3.1
rsa==4.9
screen-resolution-extra==0.0.0
SecretStorage==3.3.1
six==1.16.0
sympy==1.12
systemd-python==234
tensorboard==2.14.1
tensorboard-data-server==0.7.2
tensorflow[and-cuda]==2.14.1
tensorflow-estimator==2.14.0
tensorflow-io-gcs-filesystem==0.34.0
tensorrt==8.5.3.1
termcolor==2.3.0
torch==2.1.1+cu118
torchaudio==2.1.1+cu118
torchvision==0.16.1+cu118
triton==2.1.0
typing_extensions==4.5.0
ubuntu-advantage-tools==8001
ubuntu-drivers-common==0.0.0
ufw==0.36.1
unattended-upgrades==0.1
urllib3==1.26.5
usb-creator==0.3.7
wadllib==1.3.6
Werkzeug==3.0.1
wrapt==1.14.1

Error message I get:

2023-11-22 18:34:08.959317292 [E:onnxruntime:Default, provider_bridge_ort.cc:1480 TryGetProviderInfo_CUDA] /onnxruntime_src/onnxruntime/core/session/provider_bridge_ort.cc:1193 onnxruntime::Provider& onnxruntime::ProviderLibrary::Get() [ONNXRuntimeError] : 1 : FAIL : Failed to load library libonnxruntime_providers_cuda.so with error: libcurand.so.10: cannot open shared object file: No such file or directory

2023-11-22 18:34:08.959328586 [W:onnxruntime:Default, onnxruntime_pybind_state.cc:747 CreateExecutionProviderInstance] Failed to create CUDAExecutionProvider. Please reference https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirements to ensure all dependencies are met.

To reproduce:

import time
import cv2
import numpy as np
import torch
import tensorflow as tf
import tensorrt
import onnxruntime as ort
print(tf.config.list_physical_devices('GPU'))
print(ort.get_device())
print(torch.cuda.is_available())
cuda = True
providers = ['CUDAExecutionProvider', 'CPUExecutionProvider'] if cuda else ['CPUExecutionProvider']
w = "yolov7.onnx"
session = ort.InferenceSession(w, providers=providers)
outputs_labels = session.get_outputs()[0].name
inputs_names = session.get_inputs()[0].name

The three print statements output, respectively:

[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
GPU
True
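These three checks pass, yet the CUDA provider still fails to load. As far as I understand, `ort.get_device()` reports what the installed package was built for, not whether the provider library actually loaded, so it does not rule out a fallback to CPU. A minimal sketch for checking that, assuming the same `providers` list and `session` as in the script above (the helper name `fallen_back` is hypothetical, for illustration only):

```python
def fallen_back(requested, active):
    """Return the requested providers that are missing from the active list."""
    return [p for p in requested if p not in active]


try:
    import onnxruntime as ort
except ImportError:  # lets the helper be tried without onnxruntime installed
    ort = None

if ort is not None:
    # Providers compiled into this onnxruntime build.
    print("available providers:", ort.get_available_providers())

# With a real session from the script above, the comparison would be:
#   print(fallen_back(providers, session.get_providers()))
# Illustrative call with hard-coded lists instead of a real session:
print(fallen_back(["CUDAExecutionProvider", "CPUExecutionProvider"],
                  ["CPUExecutionProvider"]))
```

If `CUDAExecutionProvider` shows up in the returned list, the session was created but is running on the CPU only.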

libcurand.so.10 exists at its expected path inside the CUDA installation directory.
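A file existing on disk does not by itself mean the dynamic loader can resolve it by name from inside the Python process. A minimal sketch to check this from the same interpreter, using only the standard library (the helper names `can_load` and `dirs_containing` are hypothetical, for illustration only):

```python
import ctypes
import os


def can_load(lib_name):
    """Return True if the dynamic loader can resolve and open lib_name."""
    try:
        ctypes.CDLL(lib_name)
        return True
    except OSError:
        return False


def dirs_containing(lib_name, search_path):
    """Return the directories in a path-list string that contain lib_name."""
    return [d for d in search_path.split(os.pathsep)
            if d and os.path.isfile(os.path.join(d, lib_name))]


lib = "libcurand.so.10"
ld_path = os.environ.get("LD_LIBRARY_PATH", "")

print("loader can open", lib, ":", can_load(lib))
print("LD_LIBRARY_PATH entries containing it:", dirs_containing(lib, ld_path))
```

If the first line prints `False` while the file is known to exist, the directory holding it is probably not visible to the loader for this process. Note that `LD_LIBRARY_PATH` is read when the process starts, so it has to be exported in the shell before launching Python rather than set from inside the script.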

I don't know why I get this error. I tried:

  1. Reinstalling cuda and cudnn
  2. Setting LD_LIBRARY_PATH for CUDA
  3. Reinstalling onnxruntime-gpu
  4. Trying the code with and without importing torch
  5. Trying the code with and without importing tensorflow
  6. Making a new clean venv, installing only onnxruntime-gpu and opencv-python, and trying the code.

None of this worked for me.
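One thing not covered by the attempts above is the copy of libcurand shipped by the `nvidia-curand-cu11` pip package from the list of installed packages. The sketch below looks for it and preloads it before `onnxruntime` is imported. It rests on assumptions I have not verified here: that the NVIDIA pip wheels place their libraries under `site-packages/nvidia/<component>/lib/`, and that preloading a library makes it visible to libraries loaded later. The helper names `find_shared_libs` and `preload` are hypothetical.

```python
import ctypes
import glob
import os
import sysconfig


def find_shared_libs(roots, soname):
    """Search <root>/nvidia/*/lib/ for soname (assumed pip wheel layout)."""
    hits = []
    for root in roots:
        pattern = os.path.join(root, "nvidia", "*", "lib", soname)
        hits.extend(sorted(glob.glob(pattern)))
    return hits


def preload(paths):
    """Try to load each library globally; return the paths that loaded."""
    loaded = []
    for path in paths:
        try:
            ctypes.CDLL(path, mode=ctypes.RTLD_GLOBAL)
            loaded.append(path)
        except OSError:
            pass  # not a loadable library, or its own dependencies are missing
    return loaded


# site-packages directories of the currently active interpreter / venv
paths = sysconfig.get_paths()
roots = sorted({paths["purelib"], paths["platlib"]})

found = find_shared_libs(roots, "libcurand.so.10")
print("found:", found)
print("preloaded:", preload(found))
# `import onnxruntime` would go after this point.
```

If `found` is empty, the pip package does not provide the library in the assumed location, and the system CUDA installation is the only candidate.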

I suspect the problem is a compatibility issue between some of the packages, but I can't figure out which. I made sure to install versions that are supposed to be compatible with each other.
