Creating an LLM instance from pretrained weights using ctransformers


I'm running the Python 3 code below in a Jupyter notebook. The code tries to create an instance of the llama-2-7b-chat model by loading weights that have been quantized in the GGUF format, using the ctransformers module. When I create the LLM instance from the pretrained weights, I get the error message:

"OSError: libcudart.so.12: cannot open shared object file: No such file or directory"

The full code and error message are below. I'm running the code on an Ubuntu Server 18.04 LTS machine. The relevant Python modules in the conda virtual environment I'm using are also listed below.

Can anyone see what the issue is and suggest how to solve it?
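For context, the error means the dynamic loader cannot find the CUDA 12 runtime library. A quick diagnostic sketch (not part of the original code) that checks whether a CUDA runtime is visible on the loader's search path at all:

```python
from ctypes.util import find_library

# find_library searches the same paths the dynamic loader uses
# (via ldconfig on Linux); it returns None when nothing matches.
cudart = find_library("cudart")
print(cudart)  # e.g. 'libcudart.so.11.0' or 'libcudart.so.12', or None if missing
```

If this prints `None` or a CUDA 11 name, the loader simply has no `libcudart.so.12` to offer, which matches the error above.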

python modules:

torchaudio                2.0.0               py310_cu117    pytorch
torchtriton               2.0.0                     py310    pytorch
torchvision               0.15.0              py310_cu117    pytorch
pytorch                   2.0.0           py3.10_cuda11.7_cudnn8.5.0_0    pytorch
pytorch-cuda              11.7                 h778d358_5    pytorch
pytorch-mutex             1.0                        cuda    pytorch
ctransformers             0.2.27                   pypi_0    pypi
cuda-cudart               11.7.99                       0    nvidia
cuda-cupti                11.7.101                      0    nvidia
cuda-libraries            11.7.1                        0    nvidia
cuda-nvrtc                11.7.99                       0    nvidia
cuda-nvtx                 11.7.91                       0    nvidia
cuda-runtime              11.7.1                        0    nvidia
cudatoolkit               10.1.243             h6bb024c_0    nvidia
cudnn                     7.6.5                cuda10.1_0    anaconda

code:

import os
import ctransformers

# Set the path to the model file
download_path = '/home/username/stuff/username_storage/LLM/llama/gguf/llama-2-7b-chat.Q4_K_M.gguf'
model_path = os.path.join(os.getcwd(), download_path)

# Create the AutoModelForCausalLM instance
llm = ctransformers.AutoModelForCausalLM.from_pretrained(
    model_path,
    model_type="gguf",
    gpu_layers=5,
    threads=24,
    reset=False,
    context_length=10000,
    stream=True,
    max_new_tokens=256,
    temperature=0.8,
    repetition_penalty=1.1,
)

# Start a conversation loop

error:

---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
Cell In[2], line 11
      8 model_path = os.path.join(os.getcwd(), download_path)
      9 #Create the AutoModelForCausalLM class
---> 11 llm = ctransformers.AutoModelForCausalLM.from_pretrained(model_path, model_type="gguf", gpu_layers=5, threads=24, reset=False, context_length=10000, stream=True,max_new_tokens=256, temperature=0.8, repetition_penalty=1.1)

File ~/anaconda3/envs/llm_gguf/lib/python3.10/site-packages/ctransformers/hub.py:175, in AutoModelForCausalLM.from_pretrained(cls, model_path_or_repo_id, model_type, model_file, config, lib, local_files_only, revision, hf, **kwargs)
    167 elif path_type == "repo":
    168     model_path = cls._find_model_path_from_repo(
    169         model_path_or_repo_id,
    170         model_file,
    171         local_files_only=local_files_only,
    172         revision=revision,
    173     )
--> 175 llm = LLM(
    176     model_path=model_path,
    177     model_type=model_type,
    178     config=config.config,
    179     lib=lib,
    180 )
    181 if not hf:
    182     return llm

File ~/anaconda3/envs/llm_gguf/lib/python3.10/site-packages/ctransformers/llm.py:246, in LLM.__init__(self, model_path, model_type, config, lib)
    240         raise ValueError(
    241             "Unable to detect model type. Please specify a model type using:\n\n"
    242             "  AutoModelForCausalLM.from_pretrained(..., model_type='...')\n\n"
    243         )
    244     model_type = "gguf"
--> 246 self._lib = load_library(lib, gpu=config.gpu_layers > 0)
    247 self._llm = self._lib.ctransformers_llm_create(
    248     model_path.encode(),
    249     model_type.encode(),
    250     config.to_struct(),
    251 )
    252 if self._llm is None:

File ~/anaconda3/envs/llm_gguf/lib/python3.10/site-packages/ctransformers/llm.py:126, in load_library(path, gpu)
    124 if "cuda" in path:
    125     load_cuda()
--> 126 lib = CDLL(path)
    128 lib.ctransformers_llm_create.argtypes = [
    129     c_char_p,  # model_path
    130     c_char_p,  # model_type
    131     ConfigStruct,  # config
    132 ]
    133 lib.ctransformers_llm_create.restype = llm_p

File ~/anaconda3/envs/llm_gguf/lib/python3.10/ctypes/__init__.py:374, in CDLL.__init__(self, name, mode, handle, use_errno, use_last_error, winmode)
    371 self._FuncPtr = _FuncPtr
    373 if handle is None:
--> 374     self._handle = _dlopen(self._name, mode)
    375 else:
    376     self._handle = handle

OSError: libcudart.so.12: cannot open shared object file: No such file or directory

Update: I ran the command below on my Ubuntu 18.04 LTS server:

nvidia-smi

and got the output below:

Tue Nov  7 17:55:41 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.141.03   Driver Version: 470.141.03   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:42:00.0 Off |                  N/A |
|  0%   33C    P8    10W / 260W |   1908MiB /  7974MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A     42819      C   ...3/envs/new_llm/bin/python     1905MiB |
+-----------------------------------------------------------------------------+
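The nvidia-smi output reports driver CUDA version 11.4, while the missing file is `libcudart.so.12`, so it may also help to inspect which shared libraries the installed ctransformers wheel actually bundles. A generic, hypothetical helper for that (the exact ctransformers wheel layout is an assumption on my part):

```python
import importlib.util
import pathlib

def bundled_shared_libs(package: str) -> list[str]:
    """Return the names of shared-object files shipped inside an installed package."""
    spec = importlib.util.find_spec(package)
    if spec is None or spec.origin is None:
        return []  # package not installed, or has no file on disk
    pkg_dir = pathlib.Path(spec.origin).parent
    # match both 'foo.so' and versioned names like 'foo.so.12'
    return sorted(p.name for p in pkg_dir.rglob("*.so*"))

# e.g. bundled_shared_libs("ctransformers") should list the prebuilt
# libctransformers binaries the wheel ships (CPU / AVX / CUDA variants).
```

If the CUDA variant of the bundled library was built against CUDA 12, it would explain why loading it dlopens `libcudart.so.12` even though the environment only provides CUDA 11 runtimes.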
