Supabase and pgvector

124 views Asked by At

hi how can I add a custom filter to a sql function in a supabase database that uses pgvector to store embeddings.

    create table documents (
  id bigserial primary key,
  content text,
  embedding vector(1536),
custom_id text // Custom id for each company
);

Something like this:

    create
or replace function match_documents (
  query_embedding vector (1536),
  match_threshold float,
  match_count int,
  custom_id text
) returns table (id bigint, content text, similarity float) language sql stable as $$
  select
    documents.id,
    documents.content,
    1 - (documents.embedding <=> query_embedding) as similarity
  from documents
  where documents.custom_id = match_documents.custom_id AND 1 - (documents.embedding <=> query_embedding) > match_threshold
  order by similarity desc
  limit match_count;
$$;

Call from project:

const { data: documents, error } = await supabaseClient.rpc('match_documents', {
    query_embedding: embedding,
    match_threshold: .73, // Choose an appropriate threshold for your data
    match_count: 10, // Choose the number of matches
    custom_id: "2",
  })

Within the database I have different embeddings, and if I ask a question by passing the custom_id, it does not respect the where filter, and it responds with information about embeddings that do not belong to that cutom_id.

Database from supabase

Is there any way of filtering the documents and then... doing all the embeddings query logic?

Thanks!!!

1

There are 1 answers

6
dshukertjr On BEST ANSWER

I would add wrap the threshold filter with parenthesis just in case, but otherwise everything looks good. I would double check that your function definition is up to date. Try dropping the function first and then recreating the function to make sure the function definition is up to date.

create or replace function match_documents (
  query_embedding vector (1536),
  match_threshold float,
  match_count int,
  custom_id text
) returns table (id bigint, content text, similarity float, custom_id text) language sql stable as $$
  select
    documents.id,
    documents.content,
    1 - (documents.embedding <=> query_embedding) as similarity,
    documents.custom_id
  from documents
  where 
    documents.custom_id = match_documents.custom_id AND
    (1 - (documents.embedding <=> query_embedding) > match_threshold)
  order by similarity desc
  limit match_count;
$$;