I am trying to work out what passage_score and results[].score mean (in percentile terms) in Watson Discovery query results, so that we can filter out passages and results that do not meet a minimum confidence threshold.
For example, in this result set:
{
...
"passages": [
{
"document_id": "AA",
"passage_score": 14.303232050575723,
...
},
{
"document_id": "BB",
"passage_score": 14.089714658115533,
...
}
],
"results": [
{
"id": "AA",
"score": 1.5188946,
...
},
{
"id": "BB",
"score": 1.5188946,
...
}
]
}
How would I convert the scores into a percentile equivalent for comparison? In RnR (Retrieve and Rank), I used to do this using the ranker.confidence field.
According to the official Watson Discovery documentation, passages are generated by sophisticated Watson algorithms that determine the best passages of text from all of the documents returned by the query.

I think you could use the highlight parameter. highlight: a boolean that specifies whether the returned output includes a highlight object in which the keys are field names and the values are arrays that contain segments of query-matching text highlighted by the HTML <em> tag.

Or the top_hits parameter: returns the documents ranked by the score of the query or enrichment. It can be used with any query parameter or aggregation. The documentation's example returns the 10 top hits for a term aggregation.
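
If what you need is a rough cut-off, one option is to rank the scores against each other within a single response. The values in the question (14.30, 1.52) are clearly not on a 0 to 1 scale, so the sketch below does not treat them as confidences. It only computes each passage's percentile rank relative to the other passages returned by the same query. The helper names percentile_ranks and filter_passages and the 75 threshold are my own illustrative choices, not part of the Discovery API, and the response dict just mirrors the fields shown in the question.

```python
from bisect import bisect_right


def percentile_ranks(scores):
    """For each score, return the percentage of scores in the list that are <= it."""
    if not scores:
        return []
    ordered = sorted(scores)
    total = len(ordered)
    return [100.0 * bisect_right(ordered, s) / total for s in scores]


def filter_passages(response, min_percentile):
    """Keep passages whose within-response percentile rank meets the threshold.

    Assumes `response` is the parsed JSON of one query result, with a
    "passages" list whose items carry a numeric "passage_score".
    """
    passages = response.get("passages", [])
    ranks = percentile_ranks([p["passage_score"] for p in passages])
    return [p for p, rank in zip(passages, ranks) if rank >= min_percentile]


# Data copied from the question; other fields are omitted.
response = {
    "passages": [
        {"document_id": "AA", "passage_score": 14.303232050575723},
        {"document_id": "BB", "passage_score": 14.089714658115533},
    ],
}

kept = filter_passages(response, min_percentile=75)
print([p["document_id"] for p in kept])
```

Note that this ranking is relative to one result set only: the top passage always lands at 100 even when every passage is a poor match, so it is not a substitute for the absolute confidence that ranker.confidence gave you in RnR.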