I have embedded a PDF using OpenAI embeddings and saved the result in a local file. I am trying to take a text input, for example "cat", and perform a similarity search against those stored embeddings. I have tried the following implementation:
const embedder = new OpenAIEmbeddings();
const inputEmbedding = await embedder.embedQuery(input_prompt);
// Load the previously saved PDF embeddings from disk
const jsonString = fs.readFileSync('./embedded.json', 'utf-8');
const book_embeddings = JSON.parse(jsonString);
const ind = new HNSWLib(275);
for (let i = 0; i < book_embeddings.length; i++) {
  ind.add(book_embeddings[i], i);
}
const k = 2;
const result = ind.similaritySearch(inputEmbedding, k);
console.log(result);
275 is the length of the embedded list representing the PDF. When I run this, I get the following error, which I don't understand: "Cannot read properties of undefined (reading 'index')". The same error occurs if I just instantiate the HNSWLib object on its own, which suggests that something may be wrong with the way I imported the library. To import it I used: import { HNSWLib } from "langchain/vectorstores/hnswlib";
I managed to get it working by creating a vector store first, starting from the raw PDF text split into paragraphs, with something like const vectorStore = await HNSWLib.fromTexts(), passing in the texts to embed and the embeddings object. However, this isn't what I'm looking for, because I already have the embeddings of the PDF document. If there were a function like HNSWLib.fromEmbeddings(), that would work, but unfortunately that doesn't exist. Any suggestions?
Thanks
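For illustration, the lookup described above (rank stored embeddings against a query embedding and keep the top k) can be sketched as a brute-force cosine-similarity search with no vector store at all. This is only a minimal sketch under assumptions: it assumes the saved file parses to an array of number arrays, the helper names cosineSimilarity and topK are hypothetical and not part of LangChain or hnswlib, and the toy 3-dimensional vectors stand in for real embeddings.

```typescript
// Hypothetical helpers for illustration only; not part of LangChain or hnswlib.

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  // Treat a zero-length vector as having no similarity to anything.
  return denom === 0 ? 0 : dot / denom;
}

// Return the k stored vectors most similar to the query, best match first.
function topK(
  query: number[],
  vectors: number[][],
  k: number
): { index: number; score: number }[] {
  return vectors
    .map((vector, index) => ({ index, score: cosineSimilarity(query, vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Toy vectors standing in for the embeddings loaded from the saved file
// (assumed shape: one number[] per text chunk).
const storedEmbeddings: number[][] = [
  [1, 0, 0],
  [0, 1, 0],
  [0.9, 0.1, 0],
];
const queryEmbedding: number[] = [1, 0, 0];

const matches = topK(queryEmbedding, storedEmbeddings, 2);
console.log(matches.map((m) => m.index));
```

A linear scan like this is exact and is usually fast enough for a few hundred vectors such as the 275 mentioned above. An approximate index like HNSW mainly pays off on much larger collections.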