I can install llama-cpp-python with cuBLAS support using pip as follows:
CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python
However, I don't know how to install it with cuBLAS when using Poetry. The installation succeeds, but cuBLAS acceleration is not available.
I confirmed that cuBLAS works in my environment when I install the package with pip.
I added the llama-cpp-python dependency to the pyproject.toml file as follows:
[tool.poetry.dependencies]
python = ">=3.10, <3.13"
...
llama-cpp-python = "^0.2.13"
...
I tried:
CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 poetry install
and also:
export CMAKE_ARGS="-DLLAMA_CUBLAS=on"
export FORCE_CMAKE=1
poetry install
I encountered a similar issue and found a workaround. Poetry doesn't seem to pass these environment variables through to the build the way pip does, so I used
poetry run pip install
as a temporary solution: set the necessary environment variables, then run pip inside the Poetry environment. This allowed me to install
llama-cpp-python
with cuBLAS support, which I couldn't achieve with Poetry alone. Note that this bypasses Poetry's dependency resolution, so use it cautiously and document it in your project.

I tried using https://github.com/volopivoshenko/poetry-plugin-dotenv/, but still could not get it to work.
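
For reference, the poetry run pip install workaround described above might look like the following sketch. The CMAKE_ARGS and FORCE_CMAKE values are copied from the question. Everything else is an assumption made for illustration: the helper names run and rebuild_llama_cublas, the DRY_RUN switch, the version range (which mirrors the ^0.2.13 constraint in pyproject.toml), and the pip flags --force-reinstall, --no-deps and --no-cache-dir, which are added on the guess that a previously built CPU-only wheel would otherwise be reused.

```shell
# Build flags taken from the question; exported so pip's build step sees them.
export CMAKE_ARGS="-DLLAMA_CUBLAS=on"
export FORCE_CMAKE=1

# Assumption: same range that Poetry's ^0.2.13 constraint allows.
PKG_SPEC="llama-cpp-python>=0.2.13,<0.3.0"

# Hypothetical helper: with DRY_RUN=1, print the command instead of running it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

rebuild_llama_cublas() {
    # Let Poetry install the project's dependencies as usual.
    run poetry install
    # Rebuild only llama-cpp-python from source inside the Poetry virtualenv.
    # --no-deps leaves the versions Poetry resolved for other packages alone.
    run poetry run pip install --force-reinstall --no-deps --no-cache-dir "$PKG_SPEC"
}
```

Calling rebuild_llama_cublas from the project directory runs both steps; setting DRY_RUN=1 beforehand only prints them. Since pip installs the package behind Poetry's back, a later poetry install or poetry update may replace the cuBLAS build, so the step may need repeating.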