I am trying to install trt-llm-rag-windows by following the guide in its GitHub repo, and I ran into an error while building the TRT engine with the following command:
python build.py --model_dir <path to llama13_chat model> --quant_ckpt_path <path to model.pt> --dtype float16 --use_gpt_attention_plugin float16 --use_gemm_plugin float16 --use_weight_only --weight_only_precision int4_awq --per_group --enable_context_fmha --max_batch_size 1 --max_input_len 3000 --max_output_len 1024 --output_dir <TRT engine folder>
Stacktrace:
Traceback (most recent call last):
File "C:\Users\user\inference\TensorRT-LLM\examples\llama\build.py", line 22, in <module>
import torch
File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\__init__.py", line 129, in <module>
raise err
OSError: [WinError 127] The specified procedure could not be found. Error loading "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\lib\cudnn_adv_train64_8.dll" or one of its dependencies.
Environment:
- gpu: NVIDIA GeForce RTX 4070 Ti
- python: 3.10
- pip: 24.0
- tensorrt: 9.2.0.post12.dev5
- torch: 2.1.0+cu121
- fsspec: 2023.5.0
- TensorRT: TensorRT-9.1.0.4.Windows10.x86_64.cuda-12.2.llm.beta
- CUDA: cuda_12.2.2_537.13_windows
- cuDNN: cudnn-windows-x86_64-8.9.7.29_cuda12-archive
- LLM: Llama-2-13b-chat-hf
- TensorRT-LLM: release/0.5.0
I checked:
- package folder for the missing dll -> it exists
- import torch directly via python3.10 console -> works (including the "missing" dll)
- import tensorrt_llm; print(tensorrt_llm._utils.trt_version()) -> works and returns "[TensorRT-LLM] TensorRT-LLM version: 0.8.09.2.0.post12.dev5"
- force reinstall torch via "python3.10 -m pip install --force-reinstall torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu121" -> the reinstall succeeds, but the error persists
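One further check that might narrow this down is sketched below. It assumes (this is a guess, not something I have confirmed) that a second copy of the cuDNN 8 DLLs sits somewhere on PATH, for example from the separate cuDNN archive, and gets loaded instead of the copy bundled in torch\lib. WinError 127 is typical of a DLL binding against a different build of one of its dependencies. The helper name `find_dll_copies` is made up for illustration, and the torch\lib path is the one from the traceback.

```python
import os
from pathlib import Path


def find_dll_copies(dll_name, extra_dirs=()):
    """Return every copy of dll_name found in the PATH directories
    (in PATH order), followed by any copies found in extra_dirs."""
    search_dirs = [d for d in os.environ.get("PATH", "").split(os.pathsep) if d]
    search_dirs.extend(str(d) for d in extra_dirs)

    hits = []
    seen = set()
    for directory in search_dirs:
        candidate = Path(directory) / dll_name
        # Normalize so the same file reached via two PATH entries is listed once.
        key = os.path.normcase(os.path.abspath(candidate))
        if key in seen:
            continue
        seen.add(key)
        if candidate.is_file():
            hits.append(candidate)
    return hits


if __name__ == "__main__":
    # Assumption: torch's bundled DLL folder, taken from the traceback.
    torch_lib = r"C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\lib"
    for name in ("cudnn64_8.dll", "cudnn_adv_train64_8.dll"):
        for path in find_dll_copies(name, extra_dirs=[torch_lib]):
            print(name, "->", path)
```

If this prints more than one location for the same DLL name, a mismatch between those copies would be a plausible cause. A single hit would not rule out other causes.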
Additional context: I ran into multiple version conflicts while following the guide, including having downloaded a newer version of cuDNN (cudnn64_9), but I was able to resolve all of them. Curiously, the torch package itself does not seem to be broken, because it works when imported directly.
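Since torch imports fine on its own but fails inside build.py, one thing that could be tested is whether the failure depends on what is imported before torch. The sketch below runs each import order in a fresh interpreter so that already-loaded DLLs from one attempt cannot affect the next. The helper name `check_import_order` and the specific orders tried are assumptions for illustration. I have not verified which modules build.py actually imports before line 22.

```python
import subprocess
import sys


def check_import_order(modules):
    """Import the given modules, in order, in a fresh interpreter.
    Returns (ok, stderr_text)."""
    code = "; ".join(f"import {m}" for m in modules)
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0, result.stderr


if __name__ == "__main__":
    # Assumption: illustrative orders only; substitute the imports that
    # build.py really performs before it reaches "import torch".
    for order in (["torch"], ["tensorrt", "torch"], ["tensorrt_llm", "torch"]):
        ok, err = check_import_order(order)
        last_line = err.strip().splitlines()[-1] if err.strip() else ""
        print(order, "OK" if ok else "FAILED: " + last_line)
```

If only the orders that load tensorrt first fail, that would suggest a DLL loaded by the earlier import conflicts with the one torch expects.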
Has anyone encountered a similar issue and can shed some light on it?
Edit: Fixed formatting