Error while loading documents in Weaviate


I get the following error message when loading documents into Weaviate. How can I resolve it?

{
    'error': [
        {
            'message': "'id' is a reserved property name, no such prop with name 'id' found in class 'LangChain_14ee2519c3154dcb9ea3f47a8967d0e8' in the schema. Check your schema files for which properties in this class are available"
        }
    ]
}

My code:

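import sys

# Import paths as used by LangChain 0.0.x releases.
from langchain.chains import RetrievalQA
from langchain.document_loaders import ConfluenceLoader
from langchain.prompts import PromptTemplate
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Weaviate

# URL, USER_NAME, embeddings, client, llm and template are defined elsewhere.
loader = ConfluenceLoader(url=URL, username=USER_NAME, api_key="API_KEY")
documents = loader.load(space_key="SPACE_KEY", include_attachments=True, limit=150)

# Split the loaded pages into chunks of up to 1000 characters, without overlap.
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
documents = text_splitter.split_documents(documents)

# Embed the chunks and store them in Weaviate.
db = Weaviate.from_documents(documents, embeddings, client=client, by_text=False)

chatbot = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=db.as_retriever(search_type="similarity", search_kwargs={"k": 1}),
)
prompt = PromptTemplate(template=template, input_variables=["query"])

while True:
    message = input("prompt: ")
    if message == "exit":
        print("Exiting")
        sys.exit()
    if message == "":
        continue
    response = chatbot.run(prompt.format(query=message))
    print(response)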

1 answer

Answered by Duda Nogueira

Duda from Weaviate here :)

The problem here is that the LangChain ConfluenceLoader (used v0.0.329) brings in a metadata field called id, and that property name is reserved in Weaviate.
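If you want to confirm this on your own data before changing anything, a small check along these lines may help. It is only a sketch: the set of reserved names is my assumption (verify it against the Weaviate docs for your version), count_reserved_keys is a hypothetical helper, and the SimpleNamespace objects merely stand in for LangChain Document objects, which expose the same .metadata dict.

```python
from collections import Counter
from types import SimpleNamespace

# Assumption: property names Weaviate treats as reserved; verify for your version.
RESERVED_KEYS = {"id", "_id", "_additional"}


def count_reserved_keys(documents):
    """Count how many documents carry each reserved metadata key."""
    hits = Counter()
    for doc in documents:
        # Intersecting with a dict compares against its keys.
        hits.update(RESERVED_KEYS.intersection(doc.metadata))
    return hits


# Stand-ins for LangChain Document objects (anything with a .metadata dict works).
sample_docs = [
    SimpleNamespace(metadata={"id": "123", "title": "Page A"}),
    SimpleNamespace(metadata={"title": "Page B"}),
]
print(count_reserved_keys(sample_docs))
```

Running the same function on the documents list from the question should show whether id is present, and on how many chunks.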

As a workaround, you can change this metadata key before ingesting the data:

# Run this before calling Weaviate.from_documents.
for d in documents:
    # Rename the reserved "id" key; skip documents that do not have it.
    if "id" in d.metadata:
        d.metadata["confluence_id"] = d.metadata.pop("id")

Thanks for this report. I believe we can add a check here that renames the id key (and other reserved fields) whenever a reserved field is given: https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/vectorstores/weaviate.py
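Until such a check exists, the same idea can be applied on your side to any reserved key, not just id. The following is a rough sketch rather than what the LangChain integration does: rename_reserved_keys is a hypothetical helper, the list of reserved names is an assumption to verify against the Weaviate docs, and the confluence_ prefix is just one possible naming choice.

```python
from types import SimpleNamespace

# Assumption: property names Weaviate treats as reserved; verify for your version.
RESERVED_KEYS = ("id", "_id", "_additional")


def rename_reserved_keys(documents, prefix="confluence_"):
    """Rename reserved metadata keys in place by prepending a prefix."""
    for doc in documents:
        for key in RESERVED_KEYS:
            if key in doc.metadata:
                # pop() removes the reserved key and returns its value.
                doc.metadata[prefix + key] = doc.metadata.pop(key)
    return documents


# Stand-in for a LangChain Document (anything with a .metadata dict works).
docs = [SimpleNamespace(metadata={"id": "123", "title": "Page A"})]
rename_reserved_keys(docs)
print(docs[0].metadata)  # {'title': 'Page A', 'confluence_id': '123'}
```

Call it on the split documents right before Weaviate.from_documents. Prepending the prefix to the full key (instead of stripping underscores) keeps id and _id from mapping to the same new name.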

PS: to test the ConfluenceLoader you don't need to set up a Confluence account; you can use:

loader = ConfluenceLoader(url="https://templates.atlassian.net/wiki/")
docs = loader.load(space_key="RD", limit=3, max_pages=5)

Let me know if this helps :)