Ngram Tokenizer on field, not on query

Question

Asked by Christophe Schutz on 09 January 2017 at 16:12 · 357 views

I'm having trouble finding a solution for a use case here. It's pretty simple: I need to perform a "contains" query, like a SQL `LIKE '%...%'`.

I've seen there is a regexp query, which I actually managed to get working perfectly, but since it seems to scale badly, I'm trying out n-grams. I've played around with them before and know how they work, but the behaviour isn't what I expect.

I've configured my analyzer with min_gram = 2 and max_gram = 20. Say I index a user called "Christophe". I want the query "Chris" to match, which it does, since "Chris" is a 5-gram of "Christophe". The problem is that "Risotto" matches as well: the query itself gets broken down into n-grams, and "is" is a 2-gram of both "Risotto" and "Christophe".

What I need is for the analyzer to break the indexed field down into n-grams at index time, and compare those to the FULL query text. Risotto should match Risotto, XXXRisottoXXX and so on, but not Risolo or anything else where only some of the n-grams match.

Is there a solution?

There is 1 answer

**NikoNyrh** · Answer 1 · 2017-01-09T16:34:10+00:00

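You need to use the `search_analyzer` setting so that the field has distinct index-time and search-time analyzers: n-grams are produced when the document is indexed, while the query is analyzed without n-grams and has to match one of those grams as a whole.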
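To see why this helps, here is a minimal Python sketch that imitates the two analysis paths outside Elasticsearch. The helper names (`ngrams`, `matches_ngram_query`, `matches_whole_query`) are hypothetical, and lowercasing stands in for whatever token filters the real analyzers apply. It only illustrates the idea, not Elasticsearch's actual matching or scoring.

```python
def ngrams(text, min_gram=2, max_gram=20):
    """Return the set of all character n-grams of `text`, lowercased."""
    text = text.lower()
    return {
        text[i:i + n]
        for n in range(min_gram, max_gram + 1)
        for i in range(len(text) - n + 1)
    }


def matches_ngram_query(indexed, query):
    """Same n-gram analyzer on both sides: any shared gram counts as a hit."""
    return bool(ngrams(indexed) & ngrams(query))


def matches_whole_query(indexed, query):
    """N-grams at index time only: the whole query must equal one gram."""
    return query.lower() in ngrams(indexed)


# The behaviour described in the question: both texts share grams such as "is".
print(matches_ngram_query("Christophe", "Risotto"))     # True

# The behaviour the question asks for.
print(matches_whole_query("Christophe", "Risotto"))     # False
print(matches_whole_query("Christophe", "Chris"))       # True
print(matches_whole_query("XXXRisottoXXX", "Risotto"))  # True
print(matches_whole_query("Risolo", "Risotto"))         # False
```

One consequence visible in the sketch: a query shorter than min_gram or longer than max_gram can never match, because no gram of that length is ever produced at index time.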

Sample from docs:

"mappings": {
  "my_type": {
    "properties": {
      "text": {
        "type": "text",
        "analyzer": "autocomplete", 
        "search_analyzer": "standard" 
      }
    }
  }
}
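The sample only shows the mapping; the index-time analyzer it names still has to be defined in the index settings. Below is a sketch of what a complete create-index body could look like for the "contains" use case, written as a Python dict and printed as JSON. The analyzer name `contains_analyzer`, the tokenizer name `contains_tokenizer` and the field name `name` are assumptions made for illustration; the min_gram and max_gram values come from the question, and `my_type` follows the sample above.

```python
import json

# Hypothetical names: "contains_analyzer", "contains_tokenizer", field "name".
index_body = {
    "settings": {
        "analysis": {
            "tokenizer": {
                # Character n-grams of length 2..20, as in the question.
                "contains_tokenizer": {
                    "type": "ngram",
                    "min_gram": 2,
                    "max_gram": 20,
                }
            },
            "analyzer": {
                # Used at index time only.
                "contains_analyzer": {
                    "type": "custom",
                    "tokenizer": "contains_tokenizer",
                    "filter": ["lowercase"],
                }
            },
        }
    },
    "mappings": {
        "my_type": {
            "properties": {
                "name": {
                    "type": "text",
                    "analyzer": "contains_analyzer",
                    # Queries are not split into n-grams.
                    "search_analyzer": "standard",
                }
            }
        }
    },
}

# Print the body as JSON, ready to send when creating the index.
print(json.dumps(index_body, indent=2))
```

This matches the Elasticsearch version current when the question was asked. As far as I know, later versions dropped mapping types such as `my_type` and limit the gap between min_gram and max_gram through the `index.max_ngram_diff` setting, so the body would need adapting there.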
