CloudSearch query to boost exact matches within a range


In a CloudSearch structured query, I have a couple of fields I am searching on.

On field one, the user selects "2"; on field two, the user selects "1".

I want to run this as a range query, so that the results returned fall within -1 to +1 of each selected value,

e.g. on field one the range would be 1,3 and on field two it would be 0,2.

What I want to do is sort the results so that those matching both field one and field two exactly are at the top, with the rest below them,

e.g. results where field one = 2 and field two = 1 would be at the top, and the rest would be in no specific order.

Note: I do need to end up sorting the results by distance, so that all the exact matches are in distance order, followed by all the rest in distance order.

I am sure I can do this with 2 queries, just trying to make it work in one query if at all possible to lighten the load.

Answer by alexroussos (best answer)

Say your fields are 'a' and 'b', and the specified values are a=2 and b=1 (as in your example, except I've named the fields 'a' and 'b' instead of 'one' and 'two'). Here are the various terms of your query.

Range Query

This is the query for the range a±1 and b±1 where a=2 and b=1:

q=(and (range field=a [1,3]) (range field=b [0,2]))

Rank Expression

For your rank expression, compute a distance-based score using absolute values so that the deviations in 'a' and 'b' can't cancel each other out (as they would for a=3,b=0 with a plain sum, since (3-2)+(0-1)=0):

expr.rank1=abs(a-2)+abs(b-1)

Sort by Rank

That defines an expression named rank1, which we now want to sort by, starting with the lowest values (a score of 0 means a=2,b=1):

sort=rank1 asc
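
To see what this ordering does, here is a small Python sketch that mimics the expression and the ascending sort locally. It is only an illustration: the documents d1 to d3 and the function name rank1_score are made up, and nothing here talks to CloudSearch.

```python
def rank1_score(a, b, target_a=2, target_b=1):
    # Mirrors expr.rank1=abs(a-2)+abs(b-1): lower is better, 0 is an exact match.
    return abs(a - target_a) + abs(b - target_b)

# Hypothetical documents that already passed the range filter
# (a in [1,3], b in [0,2]).
docs = [
    {"id": "d1", "a": 3, "b": 0},
    {"id": "d2", "a": 2, "b": 1},
    {"id": "d3", "a": 1, "b": 1},
]

# Local equivalent of sort=rank1 asc.
ranked = sorted(docs, key=lambda d: rank1_score(d["a"], d["b"]))
ranked_ids = [d["id"] for d in ranked]
```

The exact match d2 scores 0 and comes first, d3 scores 1, and d1 scores 2 rather than 0 because the absolute values keep the two deviations from cancelling.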

Return the Rank

For debugging purposes, you may want to return the ranking score:

return=rank1

Put all those terms together and you've got your query.
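
As a rough sketch of assembling those terms, the Python below builds the parameter set and URL-encodes it. The helper name build_search_params and the spread parameter are made up for illustration, the values are assumed to be non-negative integers, and q.parser=structured is included on the assumption that the request has to select the structured parser explicitly. The range is written with a space before the bracket, as in the documented range syntax.

```python
from urllib.parse import urlencode

def build_search_params(a, b, spread=1):
    # Hypothetical helper: a and b are the user's selections,
    # spread is the allowed distance on either side.
    return {
        "q": (
            f"(and (range field=a [{a - spread},{a + spread}]) "
            f"(range field=b [{b - spread},{b + spread}]))"
        ),
        "q.parser": "structured",
        "expr.rank1": f"abs(a-{a})+abs(b-{b})",
        "sort": "rank1 asc",
        "return": "rank1",
    }

params = build_search_params(2, 1)
# URL-encoded query string, to be appended to your domain's search URL.
query_string = urlencode(params)
```

Note that return=rank1 on its own returns only the score, so you would add whichever document fields you need to that list.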

Further Potentially-Useful Things

If you want to get fancy and penalize deviations in a non-linear way, you can use exp. For example, this differentiates between 'a' and 'b' both being off by 1 and 'a' being an exact match while 'b' is off by 2 (e.g. a=3,b=2 will rank ahead of a=2,b=3, even though the previous expression would give them both a score of 2):

expr.rank1=exp(abs(a-2))+exp(abs(b-1))
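
A quick local check of that claim, as a Python sketch (the function names linear_rank and exp_rank are made up, and math.exp stands in for the expression's exp):

```python
import math

def linear_rank(a, b, target_a=2, target_b=1):
    # abs(a-2)+abs(b-1)
    return abs(a - target_a) + abs(b - target_b)

def exp_rank(a, b, target_a=2, target_b=1):
    # exp(abs(a-2))+exp(abs(b-1))
    return math.exp(abs(a - target_a)) + math.exp(abs(b - target_b))

both_off_by_one = (3, 2)   # a off by 1, b off by 1
one_off_by_two = (2, 3)    # a exact, b off by 2

linear_tie = linear_rank(*both_off_by_one) == linear_rank(*one_off_by_two)
exp_prefers_small_misses = exp_rank(*both_off_by_one) < exp_rank(*one_off_by_two)
```

Both pairs score 2 under the linear expression, while the exponential version gives roughly 5.44 to (3, 2) and 8.39 to (2, 3). One side effect: an exact match now scores exp(0)+exp(0)=2 rather than 0, which does not matter for sorting but does if you read the returned score.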

You can also use boolean logic and the ternary operator to detect and prefer results that meet certain criteria, e.g. to give a big boost when 'a' and 'b' are both on target, a smaller boost when only one of them is on target, and so on (since we're sorting low-to-high, a boost in rank is actually achieved by adding less to the result):

expr.rank1=((a==2&&b==1)?0:100)+((a==2||b==1)?0:1000)+abs(a-2)+abs(b-1)
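
The tiers this produces can be checked locally with a Python sketch of the same logic. The function tiered_rank is made up for illustration and takes the target values as parameters, so it works for whichever targets you plug into the expression:

```python
def tiered_rank(a, b, target_a, target_b):
    both_on_target = a == target_a and b == target_b
    either_on_target = a == target_a or b == target_b
    return (
        (0 if both_on_target else 100)       # penalty unless both match
        + (0 if either_on_target else 1000)  # larger penalty if neither matches
        + abs(a - target_a)
        + abs(b - target_b)
    )

# Using the targets from the question (a=2, b=1):
exact = tiered_rank(2, 1, 2, 1)      # both on target
partial = tiered_rank(2, 2, 2, 1)    # only 'a' on target
neither = tiered_rank(3, 2, 2, 1)    # neither on target
```

Exact matches score 0, results with one field on target land in the 100s, and results with neither on target land above 1100, so the three groups never interleave while the absolute-value terms still order results within each group.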

See http://docs.aws.amazon.com/cloudsearch/latest/developerguide/configuring-expressions.html