I have the following mapping in my Elasticsearch index (simplified, as the other fields are irrelevant):
{
  "test": {
    "mappings": {
      "properties": {
        "name": {
          "type": "keyword"
        },
        "entities": {
          "type": "nested",
          "properties": {
            "text_property": {
              "type": "text"
            },
            "float_property": {
              "type": "float"
            }
          }
        }
      }
    }
  }
}
The data looks like this (again simplified):
[
  {
    "name": "a",
    "entities": [
      {
        "text_property": "foo",
        "float_property": 0.2
      },
      {
        "text_property": "bar",
        "float_property": 0.4
      },
      {
        "text_property": "baz",
        "float_property": 0.6
      }
    ]
  },
  {
    "name": "b",
    "entities": [
      {
        "text_property": "foo",
        "float_property": 0.9
      }
    ]
  },
  {
    "name": "c",
    "entities": [
      {
        "text_property": "foo",
        "float_property": 0.2
      },
      {
        "text_property": "bar",
        "float_property": 0.9
      }
    ]
  }
]
I'm trying to perform a bucket aggregation on the maximum value of float_property within each document. So for the example above, the following would be the desired response:
...
{
  "buckets": [
    {
      "key": "0.9",
      "doc_count": 2
    },
    {
      "key": "0.6",
      "doc_count": 1
    }
  ]
}
as doc a's highest nested value for float_property is 0.6, b's is 0.9 and c's is 0.9.
I've tried using a mixture of nested and aggs, along with runtime_mappings, but I'm not sure in which order to use these, or if this is even possible.
I've managed to figure this out in the end.
The two things I hadn't realised were:
1. You can supply a script instead of a field key to bucket aggregations.
2. Instead of using nested queries, you can access nested values directly using params._source.
The combination of these two things allowed me to write the correct query:
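A minimal sketch of a query along these lines, for illustration rather than as the exact original. It assumes Painless as the script language, a hypothetical aggregation name (max_float_property), and that entities is always stored as an array whose float_property values are non-negative, since the running maximum starts at 0. The script walks params._source.entities, keeps the largest float_property it sees, and returns that value as the bucket key for the terms aggregation:

```json
{
  "size": 0,
  "aggs": {
    "max_float_property": {
      "terms": {
        "script": {
          "lang": "painless",
          "source": "double max = 0; if (params._source.entities != null) { for (def entity : params._source.entities) { if (entity.float_property != null && entity.float_property > max) { max = entity.float_property; } } } return max;"
        }
      }
    }
  }
}
```

Note that the script runs once per matching document and reads _source rather than doc values, so it is likely to be noticeably slower than a plain field-based aggregation on a large index.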
Response:
I'm confused though, because I thought the correct way to access nested fields was by using the nested query type. Unfortunately there's very little documentation for this, so I'm still unsure if this is the intended/correct way to aggregate on scripted nested fields.
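For comparison, here is a sketch of the same params._source idea expressed with runtime_mappings instead of an aggregation script. It is only an illustration under assumptions: the cluster must be on a version that supports runtime fields, the field name max_float_property and the aggregation name by_max_float_property are hypothetical, and entities is assumed to be an array with non-negative float_property values. The runtime field emits one double per document, and the terms aggregation then buckets on it like any other field:

```json
{
  "size": 0,
  "runtime_mappings": {
    "max_float_property": {
      "type": "double",
      "script": {
        "source": "double max = 0; if (params._source.entities != null) { for (def entity : params._source.entities) { if (entity.float_property != null && entity.float_property > max) { max = entity.float_property; } } } emit(max);"
      }
    }
  },
  "aggs": {
    "by_max_float_property": {
      "terms": {
        "field": "max_float_property"
      }
    }
  }
}
```

Defining the value as a runtime field keeps the script out of the aggregation itself, so the same computed value could also be reused in queries or sorting within that request.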