How do I prevent a version conflict when reindexing/adding the same document back into the Solr core?

I have a Solr core containing 60k documents. I have updated the field types in schema.xml and I do not want to delete the core in order to reindex. Instead, I am trying to retrieve each document with a Solr search and then add that same document, with the same id, back into Solr. When I do this, I get a version conflict.

Example: I retrieve one document using a pysolr search request. The document looks like this:

doc = {
    "type": "person",
    "lastname": "Johnson",
    "firstname": "Bobby",
    "id": "person_abcd",
    "_version_": 1691404871556661248,
}

The above document still exists in Solr and I do not want to change its content. I only want to add it back into Solr so that it is reindexed, because the field types in schema.xml have changed.

When I do:

import pysolr

core = pysolr.Solr('http://localhost:10000/solr/core', always_commit=True)
core.add(doc)

I get the following error:

pysolr.SolrError: Solr responded with an error (HTTP 409): [Reason: version conflict for person_abcd expected=1691404871556661248 actual=1691426574942863360]

Why does the 'actual' version change instead of staying equal to the 'expected' version?

How can I solve this? (Examples are appreciated.)

There is 1 answer

EricLavault On BEST ANSWER

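The _version_ field is used internally by Solr for partial updates and the update log. Solr assigns a new _version_ each time a document is indexed. When an incoming document carries a _version_ greater than 1, Solr applies optimistic concurrency and rejects the update with HTTP 409 unless that value exactly matches the version currently in the index. That is why 'actual' differs from 'expected': the copy in the index has been rewritten, and given a new version, since the copy you are holding was retrieved. You should not include _version_ in your documents when reindexing. Just remove it.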
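
As a minimal sketch of that advice, the snippet below drops _version_ from each retrieved document before adding it back. The helper names strip_version and reindex are hypothetical, not part of pysolr, and the sketch assumes every field you need comes back from the search (that is, it is stored).

```python
def strip_version(doc):
    """Return a copy of a Solr document without the internal _version_ field."""
    return {key: value for key, value in doc.items() if key != "_version_"}


def reindex(core, docs):
    """Re-add documents through any client exposing add(list_of_docs).

    `core` is assumed to behave like a pysolr.Solr instance.
    """
    cleaned = [strip_version(doc) for doc in docs]
    core.add(cleaned)
    return cleaned


# The document from the question, as returned by the search.
doc = {
    "type": "person",
    "lastname": "Johnson",
    "firstname": "Bobby",
    "id": "person_abcd",
    "_version_": 1691404871556661248,
}

clean_doc = strip_version(doc)
print(clean_doc)  # same fields, but no "_version_" key

# Against a live core this would look roughly like:
#   import pysolr
#   core = pysolr.Solr('http://localhost:10000/solr/core', always_commit=True)
#   reindex(core, core.search('id:person_abcd'))
```

Because the cleaned document has no _version_, Solr performs no version check and simply overwrites the document with the same id.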

If you do need Solr's Optimistic Concurrency feature, specify the _version_ deliberately as part of the update command in the request (for example as a request parameter), rather than carrying it over in the documents you got back from a search.
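
For illustration only, here is a rough sketch of passing the expected version as a request parameter. It uses Python's standard urllib rather than pysolr, since pysolr's add() is not assumed here to forward arbitrary request parameters. The helper name build_versioned_update is hypothetical, and the core URL is the one from the question.

```python
import json
import urllib.parse
import urllib.request


def build_versioned_update(core_url, doc, expected_version):
    """Build a POST to the update handler with _version_ as a request parameter.

    The document itself is sent without a _version_ field; the expected
    version travels in the query string instead.
    """
    params = urllib.parse.urlencode(
        {"_version_": expected_version, "commit": "true"}
    )
    body = json.dumps([doc]).encode("utf-8")
    return urllib.request.Request(
        f"{core_url}/update?{params}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_versioned_update(
    "http://localhost:10000/solr/core",
    {"id": "person_abcd", "type": "person",
     "lastname": "Johnson", "firstname": "Bobby"},
    1691404871556661248,
)
print(request.full_url)

# To actually send it (requires a running Solr):
#   urllib.request.urlopen(request)
```

With this form, a 409 response means the document changed since the version you passed was read, which is the situation optimistic concurrency is designed to detect.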