NEST partial matching for multiple values: Wildcards in Terms or any other way?


I want to find users based on a list of usernames. The list comes from a web application and may contain partial usernames.

var userNames = new List<string> (...); // not sure how many!

LINQ:

var userEntity = allUsers.Where(p => userNames.Any(x => p.UserName.Contains(x)));

NEST: ???

must.Terms(t => t.Field(f => f.UserName).Terms<string>(userNames))

But this returns only exact matches, not partial ones.

How would you translate the above LINQ query into NEST (Elasticsearch)?

Answer by Russ Cam

The inefficient way is a bool query holding one prefix, wildcard, or regexp query per unique username you are looking for. Put those queries in should clauses if you need relevancy scores; if you don't, wrap them in the should clauses of a nested bool query inside a filter clause. Depending on how much data you have and how many usernames you are searching for, this query can perform very badly.
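To make the shape of that query concrete, here is a minimal sketch of the request body that such a bool query would serialize to. It is written as plain Python dicts rather than NEST calls so that it runs without a client library. The helper name `build_partial_match_query` is hypothetical, and the field name `userName` is an assumption (the default camel-cased form of `UserName`). The sketch also assumes the field is mapped as a keyword and that the fragments contain no wildcard characters of their own.

```python
import json

def build_partial_match_query(user_names, field="userName"):
    """Build a bool query matching documents that contain any fragment.

    Hypothetical helper: `field` is an assumed field name, and the
    fragments are assumed to be free of '*' and '?' characters.
    """
    # Drop empty strings and duplicates, keeping the original order,
    # so each unique fragment produces exactly one clause.
    fragments = list(dict.fromkeys(n for n in user_names if n))

    # "*frag*" mirrors LINQ's Contains. The leading wildcard is what
    # makes this approach expensive on large indices.
    should = [
        {"wildcard": {field: {"value": f"*{frag}*"}}}
        for frag in fragments
    ]

    # At least one of the should clauses has to match.
    inner = {"bool": {"should": should, "minimum_should_match": 1}}

    # Wrapping the inner bool in a filter clause skips relevancy scoring.
    return {"query": {"bool": {"filter": [inner]}}}

body = build_partial_match_query(["rus", "cam", "rus"])
print(json.dumps(body, indent=2))
```

In NEST the same structure would be built with the fluent `Bool`, `Filter`, `Should` and `Wildcard` descriptors. If relevancy scores are needed, the inner bool query can be used directly as the top-level query instead of being wrapped in `filter`.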

The much more efficient way is to index usernames with an analyzer that produces the tokens you expect to search with. For example, to match the username Russ by searching for Ru, build an analyzer that includes an edge_ngram tokenizer or token filter, then use a match query, or any other query that undergoes analysis at query time.
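The analyzer-based approach could look roughly like the sketch below, again as plain Python dicts holding the request bodies rather than NEST calls. Everything named here is an assumption made for illustration: the analyzer and filter names (`username_index`, `username_search`, `username_edge_ngram`), the `userName` field, the gram sizes, and the typeless mapping format used by Elasticsearch 7 and later. The `edge_ngrams` function only imitates the tokens the filter would emit, to show why a search for Ru finds Russ.

```python
def edge_ngrams(text, min_gram=2, max_gram=20):
    """Imitate a lowercase + edge_ngram filter chain on one token."""
    token = text.lower()
    longest = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, longest + 1)]

def build_index_body(field="userName", min_gram=2, max_gram=20):
    """Index settings and mapping using an edge_ngram token filter.

    All analyzer, filter and field names are hypothetical.
    """
    return {
        "settings": {"analysis": {
            "filter": {"username_edge_ngram": {
                "type": "edge_ngram",
                "min_gram": min_gram,
                "max_gram": max_gram,
            }},
            "analyzer": {
                # Index time: whole username, lowercased, then prefixes.
                "username_index": {
                    "type": "custom",
                    "tokenizer": "keyword",
                    "filter": ["lowercase", "username_edge_ngram"],
                },
                # Search time: lowercase only, so the typed fragment
                # is compared against the stored prefixes as-is.
                "username_search": {
                    "type": "custom",
                    "tokenizer": "keyword",
                    "filter": ["lowercase"],
                },
            },
        }},
        "mappings": {"properties": {field: {
            "type": "text",
            "analyzer": "username_index",
            "search_analyzer": "username_search",
        }}},
    }

def build_match_query(user_names, field="userName"):
    """One match clause per unique fragment; any one may match."""
    fragments = list(dict.fromkeys(n for n in user_names if n))
    should = [{"match": {field: {"query": frag}}} for frag in fragments]
    return {"query": {"bool": {"should": should, "minimum_should_match": 1}}}

print(edge_ngrams("Russ"))
print(build_match_query(["Ru", "ca"]))
```

Edge n-grams only cover fragments that start at the beginning of the username. To match fragments in the middle of a name, as LINQ's `Contains` does, the analogous choice would be an `ngram` filter, at the cost of a larger index.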