Lucene documents scoring/ranking with regex query

Question

Asked by Vasyl Senko on 29 December 2016 at 11:46 · 242 views

I am using Azure Search, but I suppose my question is more relevant to Lucene. I can't find any information on how document ranks (scores) are calculated when a query fully or partly consists of a regex. For example:

Searching for "microsoft" returns normally calculated scores:

{ score: 6.088776, name: "Microsoft Research" }
{ score: 5.9090853, name: "Microsoft Corporation" }
{ score: 5.0747375, name: "Microsoft Philippines, Inc." }
{ score: 4.93202, name: "Microsoft Dynamics, Inc." }

Searching for "/.micro./" returns results whose scores are all equal to 1:

{ score: 1, name: "Microsoft Dynamics, Inc." }
{ score: 1, name: "Microsoft Philippines, Inc." }
{ score: 1, name: "Microsoft Startup Alley" }

And searching for "microsoft /.micro./" returns what I suppose is the sum of the "microsoft" term score and the /.micro./ term score (which always equals 1):

{ score: 5.2132897, name: "Microsoft Research" }
{ score: 5.198583, name: "Microsoft Corporation" }
{ score: 4.973414, name: "Microsoft Philippines, Inc." }

What I need is to run a query that consists only of a regex and still get calculated scores.

There is 1 answer

**Nate Ko** · Accepted Answer · 2017-01-05T00:23:12+00:00

In Azure Search, wildcard-style queries such as prefix, regex, and fuzzy search go through an internal query rewriting process and return constant scores. This is mainly for performance reasons, and also to prevent our default term-frequency-based scoring (TF-IDF) from being biased toward matches on less frequent, unique terms. The behavior is documented at https://learn.microsoft.com/en-us/rest/api/searchservice/lucene-query-syntax-in-azure-search#bkmk_searchscoreforwildcardandregexqueries. There currently isn't a way to change this default behavior. If you feel that the feature is important, please create an entry in our UserVoice (https://feedback.azure.com/forums/263029-azure-search) to help us prioritize. Thank you.

Nate
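To make the answer's two points concrete, here is a minimal Python sketch of a toy inverted index. It is not Azure Search's or Lucene's actual implementation. It only illustrates the idea that a regex is first expanded into the index terms it matches, and that those matches can then be scored either with a constant or with a TF-IDF-style formula. The pattern `micro.*`, the extra document "Micron Technology", the helper names, and the simplified formula `sqrt(tf) * idf` are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Toy corpus: names from the question plus one hypothetical document
# ("Micron Technology") added so that the regex expands to a rare term.
DOCS = [
    "Microsoft Research",
    "Microsoft Corporation",
    "Microsoft Philippines, Inc.",
    "Microsoft Dynamics, Inc.",
    "Microsoft Startup Alley",
    "Micron Technology",
]


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Return an inverted index: {term: {doc_id: term_frequency}}."""
    index = {}
    for doc_id, text in enumerate(docs):
        for term, tf in Counter(tokenize(text)).items():
            index.setdefault(term, {})[doc_id] = tf
    return index


def expand(index, pattern):
    """Rewrite step: list the index terms that the regex matches.

    fullmatch is used because Lucene regexes must match the whole term.
    """
    rx = re.compile(pattern)
    return [term for term in index if rx.fullmatch(term)]


def constant_score(index, pattern):
    """Every document containing any matching term gets a score of 1.0."""
    scores = {}
    for term in expand(index, pattern):
        for doc_id in index[term]:
            scores[doc_id] = 1.0
    return scores


def tfidf_score(index, pattern, n_docs):
    """Simplified TF-IDF over the expanded terms (illustrative only)."""
    scores = {}
    for term in expand(index, pattern):
        postings = index[term]
        idf = 1.0 + math.log(n_docs / (len(postings) + 1))
        for doc_id, tf in postings.items():
            scores[doc_id] = scores.get(doc_id, 0.0) + math.sqrt(tf) * idf
    return scores


index = build_index(DOCS)
constant = constant_score(index, "micro.*")
scored = tfidf_score(index, "micro.*", len(DOCS))

for doc_id, name in enumerate(DOCS):
    print(f"{name!r}: constant={constant[doc_id]}, tfidf={scored[doc_id]:.3f}")
```

In this toy corpus the pattern expands to two terms, `microsoft` and `micron`. Under constant scoring all six documents tie at 1. Under the TF-IDF-style scoring, the single `micron` document outranks every `microsoft` document only because its term is rarer, which is the kind of bias the answer describes. If calculated scores are required, one option under these assumptions is to fetch the regex matches and rank them client-side with a function of your own.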
