I'm guessing this is most likely going to be an aggregation because I don't know the value of one particular field - the field that needs to be compared with other results. I'm trying to create a list of results where: fieldA in Result1 has the exact same value as fieldA in Result2 (I don't care what it is) AND where 1 Result1 has "blue" in fieldB and Result2 has "green" in fieldB (I do care that "blue is present in one and "green" is present in the other. Am I correct that this is only achievable through aggregations? If so, what would that aggregation script look like (I'm assuming it's a script too). Thanks.
So, here's some sample data:
[
{
"record_id": "1",
"record_type": "typeA",
"field_a": "1111111111",
"field_b": "blue"
},
{
"record_id": "2",
"record_type": "typeA",
"field_a": "1111111111",
"field_b": "green"
},
{
"record_id": "3",
"record_type": "typeA",
"field_a": "2222222222",
"field_b": "blue"
},
{
"record_id": "4",
"record_type": "typeA",
"field_a": "2222222222",
"field_b": "yellow"
}
]
My query only knows the value of "record_type". What I need is a list of "field_a" values that match where there are at least 2 results where 1 has "blue" in "field_b" and 1 has "green" in "field_b".
So, in this example, I would like to know that "1111111111" matches that criteria - there is 1 with "blue" and 1 with "green", while "2222222222" does NOT match since there is 1 with "blue" and 1 with "yellow" (i.e., not "green").
I know I can do an aggregation that, for instance, counts the values in "field_a" for my results.
However, there can be 1 - N with the same value in "field_a". I specifically need to know if there is at least 1 with "blue" in "field_b" and at least 1 with "green" in "field_b" for a given "field_a" value.
Hopefully that illustrates the problem, but if not here's what I "think" I should be doing - an aggregation of an aggregation:
{
"size": 0,
"query": {
"bool": {
"must": [
{
"query_string": {
"query": "*",
"analyze_wildcard": true,
"default_field": "*"
}
}
],
"filter" : [
{
"terms" : {
"record_type.keyword" : [
"typeA"
],
"boost" : 1.0
}
}
]
}
},
"aggs": {
"uniq_field_a_values": {
"terms": {
"field": "field_a.keyword",
"size" : 10000
}
},
"aggs": {
[ what should this look like???? ]
}
}
}
I was able to get what I wanted by doing the following where I have "[ what should this look like???? ] written above: