LangChain HuggingFaceEmbeddings(): loading the model in 8-bit

739 views Asked by Nik At 14 July 2023 at 15:38

I'm trying to use the Databricks Dolly model from the Hugging Face Hub to create embeddings. My 16 GB GPU runs out of memory even with the 3B version of the model, so I'm trying to load it in 8-bit:

embeddings = HuggingFaceEmbeddings(model_name="databricks/dolly-v2-3b", model_kwargs={'load_in_8bit':True})

It looks like the load_in_8bit kwarg is not permitted here, but I know it's possible to load a model this way when instantiating a pipeline. Is there a way to do the same for embeddings? I couldn't find anything helpful in the LangChain docs about this.
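One possible workaround, sketched here under several assumptions, is to bypass HuggingFaceEmbeddings and load the model directly with transformers in 8-bit, then expose it through the two methods LangChain expects from an embeddings object (embed_documents and embed_query). The names CallableEmbeddings and build_8bit_encoder are hypothetical helpers, not part of LangChain or transformers. The sketch assumes bitsandbytes and accelerate are installed, that a CUDA GPU is available, and that mean pooling over the last hidden state is an acceptable way to turn a causal LM such as Dolly into sentence vectors, which is a design choice and not something the model was trained for.

```python
from typing import Callable, List, Sequence

# Type of a function that maps a batch of texts to one vector per text.
Encoder = Callable[[Sequence[str]], List[List[float]]]


class CallableEmbeddings:
    """Hypothetical wrapper exposing the embed_documents / embed_query
    methods that LangChain's Embeddings interface defines."""

    def __init__(self, encode: Encoder, batch_size: int = 8) -> None:
        self._encode = encode
        self._batch_size = batch_size

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        # Encode in small batches to keep peak GPU memory low.
        vectors: List[List[float]] = []
        for start in range(0, len(texts), self._batch_size):
            vectors.extend(self._encode(texts[start:start + self._batch_size]))
        return vectors

    def embed_query(self, text: str) -> List[float]:
        return self._encode([text])[0]


def build_8bit_encoder(model_name: str) -> Encoder:
    """Load model_name in 8-bit and return a mean-pooling encoder.

    Assumption: needs torch, transformers, accelerate and bitsandbytes;
    the heavy imports are local so the wrapper above stays importable.
    """
    import torch
    from transformers import AutoModel, AutoTokenizer, BitsAndBytesConfig

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    if tokenizer.pad_token is None:
        # Padding is required for batching; reuse EOS if no pad token is set.
        tokenizer.pad_token = tokenizer.eos_token

    model = AutoModel.from_pretrained(
        model_name,
        quantization_config=BitsAndBytesConfig(load_in_8bit=True),
        device_map="auto",
    )
    model.eval()

    def encode(texts: Sequence[str]) -> List[List[float]]:
        batch = tokenizer(
            list(texts), padding=True, truncation=True, return_tensors="pt"
        ).to(model.device)
        with torch.no_grad():
            hidden = model(**batch).last_hidden_state.float()
        # Mean-pool over real tokens only, ignoring padding positions.
        mask = batch["attention_mask"].unsqueeze(-1).to(hidden.dtype)
        pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1)
        return pooled.cpu().tolist()

    return encode


# Example usage (downloads the model; requires a GPU):
# embeddings = CallableEmbeddings(build_8bit_encoder("databricks/dolly-v2-3b"))
# vector = embeddings.embed_query("What is 8-bit quantization?")
```

If your LangChain version checks the embeddings type, the same class could subclass LangChain's Embeddings base class instead of relying on duck typing. Older transformers releases accepted load_in_8bit=True directly in from_pretrained; the BitsAndBytesConfig form is used here because it is the one newer releases document.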


There are 0 answers
