Recall returns nothing when querying rank-profile

Question

96 views Asked by kaega At 12 October 2020 at 17:55

I have a sample Vespa instance and I want to train a LightGBM model on the rank features produced by a rank-profile, following the learning-to-rank guide: https://docs.vespa.ai/documentation/learning-to-rank.html

However, whenever I set the recall parameter to a document ID, I get 0 hits. I'm using the example code from here: https://github.com/vespa-engine/sample-apps/blob/master/text-search/src/python/collect_training_data.py

body = create_request_top_hits("test", "training", hits=2)
get_features(url, body)

And this correctly returns:

[{'id': 'index:domains/0/944f3a850511f388fe97ac85',
  'relevance': 1.2427330381582673,
  'source': 'domains',
  'fields': {'uri': '6202597992',
   'rankfeatures': {'bm25(body)': 2.8145480372957787,
    'nativeFieldMatch(categories)': 0.0,
    'nativeFieldMatch(concepts)': 0.8591903630989031,
    'nativeFieldMatch(links)': 0.0,
    'nativeFieldMatch(title)': 0.0,
    'nativeProximity(categories)': 0.0,
    'nativeProximity(concepts)': 0.0,
    'nativeProximity(links)': 0.0,
    'nativeProximity(title)': 0.0,
    'rankingExpression(time_ranking)': 1.0}}},
 {'id': 'index:domains/0/93f92aae1d6a010c2111e9b7',
  'relevance': 1.2010786365413106,
  'source': 'domains',
  'fields': {'uri': '6206270866',
   'rankfeatures': {'bm25(body)': 2.0397289658724347,
    'nativeFieldMatch(categories)': 0.0,
    'nativeFieldMatch(concepts)': 0.8591903630989031,
    'nativeFieldMatch(links)': 0.0,
    'nativeFieldMatch(title)': 0.0,
    'nativeProximity(categories)': 0.0,
    'nativeProximity(concepts)': 0.0,
    'nativeProximity(links)': 0.0,
    'nativeProximity(title)': 0.0,
    'rankingExpression(time_ranking)': 1.0}}}]

To see if recall works, we'll use the top result:

'id': 'index:domains/0/944f3a850511f388fe97ac85'
'uri': '6202597992'  # docIDs are derived from the uri field

And set the recall to the docid:

doc_id = [6202597992, "6202597992", "944f3a850511f388fe97ac85"]  # multiple representations...
body = create_request_specific_ids("test", "training", doc_id)
get_features(url, body)

I would expect this to return the same rank features as before, but instead I get 0 hits. This is the full response:

{'root': {'id': 'toplevel', 'relevance': 1.0, 'fields': {'totalCount': 0}, 'coverage': {'coverage': 100, 'documents': 798, 'full': True, 'nodes': 5, 'results': 5, 'resultsFull': 5}}}

I've checked the docs and examples but haven't been able to find anything that explains this. Any insights would be greatly appreciated.

There is 1 answer

**Jo Kristian Bergum** · Accepted Answer · 2020-10-12T18:14:32+00:00

The collect script/function expects your document schema to contain a field called id. Your schema uses uri instead, so the recall terms match nothing. If you alter the script to use the uri field, you should be able to retrieve the documents.
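
As a rough illustration of that change, here is a minimal sketch of a request builder that recalls on uri rather than id. The body layout is an assumption modeled loosely on the sample script and may not match the current version of collect_training_data.py; the `id_field` parameter is hypothetical and added only for this example. It also assumes that uri is searchable in your schema (indexed or an attribute), since the recall terms can only match searchable fields.

```python
def create_request_specific_ids(query, rank_profile, doc_ids, id_field="uri"):
    """Build a query body that restricts ranking to the given documents.

    id_field is a hypothetical parameter for this sketch: it names the
    schema field that holds the document identifier (uri in the question).
    """
    # One "field:value" term per document, e.g. "uri:6202597992".
    recall_terms = " ".join(f"{id_field}:{doc_id}" for doc_id in doc_ids)
    return {
        # Assumed YQL; adjust the selected fields to your own schema.
        "yql": f"select {id_field}, rankfeatures from sources * where userInput(@userQuery);",
        "userQuery": query,
        "hits": len(doc_ids),
        # The recall parameter filters the result set without affecting ranking.
        "recall": f"+({recall_terms})",
        "timeout": "15s",
        "presentation.format": "json",
        "ranking": {"profile": rank_profile, "listFeatures": "true"},
    }


body = create_request_specific_ids("test", "training", ["6202597992"])
print(body["recall"])  # +(uri:6202597992)
```

Passing only the uri value avoids mixing several representations in one recall list, and the resulting body can be sent with get_features(url, body) as in the question.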
