Best solution for changing settings in Elasticsearch

I have two questions indexed in Elasticsearch: FIRE_DETECTED and SMOKE:DETECTED

Goal

Searching with query = 'fire' should return FIRE_DETECTED.

Searching with query = 'detected' should return both FIRE_DETECTED and SMOKE:DETECTED.

Possible solutions

  1. Add a token filter to the analyzer
  • Create a new index with the new settings (add the word_delimiter_graph token filter)
  • Reindex
  • Problem: How do we apply the new settings in production without affecting customers?
  2. Add one more field, filterd_question, to the Elasticsearch documents
  • Split the data on : and _
  • Save the split data in the filterd_question field
  • Problem: It requires an extra field
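
For the second option, the splitting could be done in application code before the document is indexed. Here is a minimal Python sketch of that idea. It assumes the original field is called question (a hypothetical name; only filterd_question comes from the post) and that the parts should be lowercased so that 'fire' lines up with FIRE.

```python
import re


def split_question(value):
    """Split a value such as FIRE_DETECTED on ':' and '_' and lowercase the parts."""
    return [part.lower() for part in re.split(r"[:_]", value) if part]


def build_document(question):
    """Build the document to index, with the extra field holding the split parts."""
    return {
        # "question" is an assumed name for the existing field.
        "question": question,
        # Space-separated parts, so an analyzed text field can tokenize them.
        "filterd_question": " ".join(split_question(question)),
    }


for raw in ("FIRE_DETECTED", "SMOKE:DETECTED"):
    print(build_document(raw))
```

Searches would then target filterd_question instead of the original field.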

What is the best solution for this? (Please suggest other solutions if needed.)

There are 2 answers

Amit (BEST ANSWER)

This is a very common scenario when working with Elasticsearch: requirements keep changing, and supporting them means changing the way the data is indexed.

Both of the approaches you mention are used in practice, and each has its trade-offs. Choose the one that best fits your requirements.

Changing or adding the analyzer requires the following steps:

  1. Close the index.
  2. Add or edit the analyzer definition.
  3. Open the index.
  4. Reindex all the documents (use an index alias to do this with zero downtime and minimize the impact on end users).
  5. Once step 4 completes, new searches will use the updated analyzer.
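
The steps above can be sketched as a sequence of REST calls. The Python below only builds the requests and prints them, so nothing is sent to a cluster. The index names, the alias, the field name question, and the analyzer and filter names are all assumptions for illustration; the whitespace tokenizer is also my choice, not something stated in the question.

```python
import json

# Hypothetical names, not taken from the question.
OLD_INDEX = "questions_v1"
NEW_INDEX = "questions_v2"
ALIAS = "questions"

# Custom analyzer: split on non-alphanumeric characters such as '_' and ':',
# then lowercase the resulting tokens.
ANALYSIS = {
    "filter": {"question_delimiter": {"type": "word_delimiter_graph"}},
    "analyzer": {
        "question_analyzer": {
            "type": "custom",
            "tokenizer": "whitespace",
            "filter": ["question_delimiter", "lowercase"],
        }
    },
}


def in_place_plan(index):
    """Steps 1-3: close the index, update its analysis settings, reopen it."""
    return [
        ("POST", f"/{index}/_close", None),
        ("PUT", f"/{index}/_settings", {"analysis": ANALYSIS}),
        ("POST", f"/{index}/_open", None),
    ]


def alias_reindex_plan(old, new, alias):
    """Step 4 via an alias: create a new index, copy documents, swap the alias."""
    return [
        ("PUT", f"/{new}", {
            "settings": {"analysis": ANALYSIS},
            "mappings": {"properties": {
                "question": {"type": "text", "analyzer": "question_analyzer"},
            }},
        }),
        ("POST", "/_reindex", {"source": {"index": old}, "dest": {"index": new}}),
        # Both actions are applied atomically, so searches on the alias never fail.
        ("POST", "/_aliases", {"actions": [
            {"remove": {"index": old, "alias": alias}},
            {"add": {"index": new, "alias": alias}},
        ]}),
    ]


for method, path, body in in_place_plan(OLD_INDEX) + alias_reindex_plan(OLD_INDEX, NEW_INDEX, ALIAS):
    print(method, path, json.dumps(body) if body else "")
```

If the application always searches through the alias, the old index keeps serving traffic until the swap, which is what keeps the impact on end users small.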

Pros: no new field is created, so it saves space and is the cleaner, more efficient way to make this change.

Cons: reindexing can take a long time, depending on the number of documents, and it is a comparatively complex process.

Add a custom analyzer, then add a new field that uses it

This also requires closing and reopening the index, unless you use a built-in analyzer. Only new documents, and existing documents that get updated, will have the new field, so searches that rely on the new analyzer will return partial results until the older documents are updated. Depending on your use case, that may be acceptable.

Pros: a relatively simple approach that does not require a full reindex in every case.

Cons: extra space, particularly if the old field is no longer used, and the complexity varies by use case.
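
One way to sketch this second approach is with a multi-field on the existing field, followed by an update-by-query so that existing documents pick up the new sub-field. This is an assumption on my part rather than the only way to do it. The names question, split and question_analyzer are hypothetical, the existing field is assumed to be of type text, and the analyzer is assumed to have been added to the index settings already. As before, the code only builds the requests.

```python
def new_field_plan(index, field="question", subfield="split",
                   analyzer="question_analyzer"):
    """Add an analyzed sub-field, then re-process existing documents in place."""
    mapping = {
        "properties": {
            field: {
                # Must match the existing field's type (assumed to be text here).
                "type": "text",
                "fields": {subfield: {"type": "text", "analyzer": analyzer}},
            }
        }
    }
    return [
        ("PUT", f"/{index}/_mapping", mapping),
        # With no script, this re-indexes each document as-is, so the
        # new sub-field gets populated for documents indexed earlier.
        ("POST", f"/{index}/_update_by_query?conflicts=proceed", None),
    ]


def search_body(text, field="question", subfield="split"):
    """Match query against the new sub-field."""
    return {"query": {"match": {f"{field}.{subfield}": text}}}


for step in new_field_plan("questions_v1"):
    print(step)
print(search_body("detected"))
```

Until the update-by-query finishes, searches on the sub-field only see the documents processed so far, which is the partial-results window described above.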

apurbojha

If you don't want to change or add an analyzer, you can try a wildcard query, although the downside is performance.
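
As a rough sketch, the request body for such a wildcard query could be built like this. The field name question is an assumption, and the case_insensitive flag needs Elasticsearch 7.10 or later. The leading * is what makes this kind of query slow on large indices.

```python
import json


def wildcard_query(term, field="question"):
    """Build a wildcard query body that matches term anywhere in the indexed value."""
    return {
        "query": {
            "wildcard": {
                field: {
                    "value": f"*{term}*",
                    # Lets 'fire' match FIRE_DETECTED on a keyword field.
                    "case_insensitive": True,
                }
            }
        }
    }


print(json.dumps(wildcard_query("detected"), indent=2))
```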