opensearchserver tokenizer for permutation of all words in query

Question

422 views Asked by Ankit Agarwal At 03 May 2016 at 06:49

I need to configure OpenSearchServer to analyse the query so that, if any permutation of the words in the query matches, it returns the document.

For example, an indexed field contains the phrase "knee pain". If my query is something like "how to remove pain in human knee", I want it to return the document that has "knee pain" in the indexed field.

So my requirement is to break the query string into terms such as "remove", "pain", "human", "knee", "remove pain", "remove knee", "remove human", "pain knee", "human knee", "knee pain", "human pain", etc.

That way, the query would match "knee pain". Is there any tokenizer or filter that can help me achieve this?

Original Q&A

There is 1 answer

**Fix It Scotty** · Answer 1 · 2016-05-03T12:44:49+00:00

Select your index, click on the Schema tab, and then click the Analyzers tab.

I normally edit the TextAnalyzer and add filters to it. I usually start with the lowercase and stop filters, which make searches case-insensitive and remove stop words like "a", "an", "the".

Then the Shingle filter will give you the word n-grams needed for phrase matches. A shingle size of 3 or 4 words usually works. Shingling creates overlapping phrases of adjacent words from the analyzed text. For example, "The brown fox jumps high" with a shingle size of 3 would create analyzed n-grams of 1, 2, and 3 words, i.e., 1-word: "the", "brown", "fox", "jumps", "high"; 2-word: "the brown", "brown fox", "fox jumps", "jumps high"; 3-word: "the brown fox", "brown fox jumps", "fox jumps high".
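To make the shingling idea concrete, here is a minimal Python sketch of what a lowercase + stop + shingle chain does conceptually. It is not OpenSearchServer code: the function name `shingles`, its parameters, and the stop-word set are all hypothetical, chosen only for illustration, and the real filters may differ in details such as tokenization and output order.

```python
def shingles(text, max_size=3, stop_words=frozenset()):
    """Return word n-grams (shingles) of 1..max_size adjacent tokens.

    Hypothetical helper for illustration only; it mimics the idea of a
    lowercase filter, a stop filter, and a shingle filter applied in order.
    """
    # Lowercase, split on whitespace, then drop stop words.
    tokens = [t for t in text.lower().split() if t not in stop_words]
    result = []
    for size in range(1, max_size + 1):
        # Slide a window of `size` adjacent tokens across the token list.
        for start in range(len(tokens) - size + 1):
            result.append(" ".join(tokens[start:start + size]))
    return result


# The example sentence from the answer, without stop-word removal.
fox = shingles("The brown fox jumps high", max_size=3)

# The query from the question, with an assumed stop-word list.
query = shingles(
    "how to remove pain in human knee",
    max_size=3,
    stop_words=frozenset({"how", "to", "in"}),
)
```

In this sketch, shingles keep the original word order and adjacency, so the query from the question yields "pain" and "knee" as single-word shingles and "human knee" as a two-word shingle, but not the reordered phrase "knee pain". A match against a field containing "knee pain" would therefore come from the single-word terms.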
