PostgreSQL & Langchain query - No operator matches the given name and argument types

20 views Asked by At

I'm storing simple dataset in a Postgresql database using langchain and openai embedding.

This seems to be working fine.

But when querying the database I get the below error message.

connection_string = "postgresql://[email protected]/aitest"
collection_name = "items"

def datastore():
    text = 'My long text is blue'
    text_splitter = CharacterTextSplitter(
        chunk_size=1000,
        chunk_overlap=200,
        length_function=len
    )
    texts = text_splitter.split_text(text)

    embeddings = OpenAIEmbeddings()
    
    db = PGEmbedding.from_texts(
        embedding=embeddings,
        texts=texts,
        collection_name=collection_name,
        connection_string=connection_string,
    )
   

def retrieve():
    embeddings = OpenAIEmbeddings()


    db1 = PGEmbedding.from_existing_index(
        embedding=embeddings,
        collection_name=collection_name,
        pre_delete_collection=False,
        connection_string=connection_string,
    )

    query = "What did the president say about Ketanji Brown Jackson"
    db1.similarity_search_with_score(query)

Running datastore(), creates a langchain_pg_collection and lagchain_pg_embedding with data. Note that lagchain_pg_embedding.embedding column is of type "_float4" (not sure if this is expected).

Running retrieve() crashes and returns below error:

ProgrammingError: (psycopg2.errors.UndefinedFunction) operator does not exist: real[] <-> real[] LINE 1: ...edding_uuid, abs(langchain_pg_embedding.embedding <-> ARRAY[... ^ HINT: No operator matches the given name and argument types. You might need to add explicit type casts. [SQL: SELECT langchain_pg_embedding.collection_id AS langchain_pg_embedding_collection_id, langchain_pg_embedding.embedding AS langchain_pg_embedding_embedding, langchain_pg_embedding.document AS langchain_pg_embedding_document, langchain_pg_embedding.cmetadata AS langchain_pg_embedding_cmetadata, langchain_pg_embedding.custom_id AS langchain_pg_embedding_custom_id, langchain_pg_embedding.uuid AS langchain_pg_embedding_uuid, abs(langchain_pg_embedding.embedding <-> %(embedding_1)s::REAL[]) AS distance FROM langchain_pg_embedding WHERE langchain_pg_embedding.collection_id = %(collection_id_1)s::UUID ORDER BY abs(langchain_pg_embedding.embedding <-> %(embedding_2)s::REAL[]) ASC LIMIT %(param_1)s] [parameters: {'embedding_1': [-0.052560476281678786, -0.008834546998953006, -0.008705921112153828, 0.007737844863369729, 0.0037538363447478127, 0.0042243351349390045, -0.002059701 ... (34103 characters truncated) ... 785052453730274, -0.029570345478911762, 0.01835961005386705, 0.025603261651909844, -0.020891503500820918, -0.02010621047563943, -0.026970754196800433], 'collection_id_1': UUID('9defca44-19a2-453d-a3ee-5c1293542873'), 'embedding_2': [-0.052560476281678786, -0.008834546998953006, -0.008705921112153828, 0.007737844863369729, 0.0037538363447478127, 0.0042243351349390045, -0.002059701 ... (34103 characters truncated) ... 785052453730274, -0.029570345478911762, 0.01835961005386705, 0.025603261651909844, -0.020891503500820918, -0.02010621047563943, -0.026970754196800433], 'param_1': 4}] (Background on this error at: https://sqlalche.me/e/20/f405)

I'm new to langchain and followed this tutorial https://python.langchain.com/docs/integrations/vectorstores/pgembedding but haven't been able to figure out what is causing the problem.

Thanks

0

There are 0 answers