I'm migrating from SQL to Elasticsearch, but I've run into issues with aggregations, especially GROUP BY.

My SQL query looks like this:

SELECT    count(*) as total,country_code 
FROM      orders 
WHERE     product_id = ? 
GROUP BY  country_code 
ORDER BY  total desc LIMIT 3 

SQL RESULT

I tried the following Elasticsearch query, but it does not work:

{
    "query": {
        "bool": {
            "must": [
                {
                    "match": {
                        "line_items.product_id": {
                            "query": "0001112223333"
                        }
                    }
                }
            ]
        }
    },
    "from": 0,
    "size": 3,
    "aggregations": {
        "country_code": {
            "aggregations": {
                "COUNT(*)": {
                    "value_count": {
                        "field": "_index"
                    }
                }
            },
            "terms": {
                "field": "country_code",
                "size": 200
            }
        }
    }
}

ES RESULT

2 Answers

Kamal On Best Solutions

Based on your images, use the keyword datatype rather than text for country_code.

According to the documentation for keyword:

They are typically used for filtering (Find me all blog posts where status is published), for sorting, and for aggregations. Keyword fields are only searchable by their exact value.

You see those errors because you are running an aggregation query on a field of the text datatype. A text field goes through the analysis phase, where ES takes the value, breaks it into tokens, and stores them in the inverted index. Aggregating on such a field needs fielddata, which is disabled on text fields by default.

I'd suggest using multi-fields, where the mapping for country_code would be as follows:
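To illustrate why grouping on analyzed text differs from grouping on exact values, here is a minimal Python sketch. The `rough_analyze` function is only a crude stand-in for a text analyzer (lowercase, split on non-alphanumeric characters), not Elasticsearch's actual implementation, and the sample values are made up.

```python
import re
from collections import Counter

def rough_analyze(value):
    """Crude stand-in for a text analyzer: lowercase, then split on
    anything that is not a letter or digit. An assumption for illustration."""
    return [tok for tok in re.split(r"[^0-9a-z]+", value.lower()) if tok]

def keyword_terms(value):
    """A keyword field keeps the whole value as a single exact term."""
    return [value]

# Hypothetical field values, one per document.
values = ["US", "US", "GB", "New Zealand"]

# Group by the terms a text-like field would index.
text_counts = Counter(tok for v in values for tok in rough_analyze(v))

# Group by the terms a keyword field would index.
keyword_counts = Counter(tok for v in values for tok in keyword_terms(v))

print("text-like terms:", dict(text_counts))
print("keyword terms:  ", dict(keyword_counts))
```

In the text-like case the groups are tokens such as `us`, `new`, and `zealand` rather than the original values, which is not what a GROUP BY on country_code is after.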

Mapping:

{  
   "properties":{  
      "country_code":{  
         "type":"text",
         "fields":{  
            "keyword":{  
               "type":"keyword"
            }
         }
      }
   }
}

Aggregation Query:

{
    "query": {
        "bool": {
            "must": [
                {
                    "match": {
                        "line_items.product_id": {
                            "query": "0001112223333"
                        }
                    }
                }
            ]
        }
    },
    "from": 0,
    "size": 3,
    "aggregations": {
        "country_code": {
            "aggregations": {
                "COUNT(*)": {
                    "value_count": {
                        "field": "_index"
                    }
                }
            },
            "terms": { 
                "field": "country_code.keyword",
                "size": 200
            }
        }
    }
}

Note that the terms aggregation above uses country_code.keyword instead of country_code.

Hope this helps!

Chr0nicl3 On

Consider mapping the product ID as keyword rather than text, and then use a term query on it instead of a match query, which is much more efficient. Also, since you do not need any data from the documents themselves, you can set size to 0 in the query.

You should also use the keyword type in the mapping for the country_code field.

This simple query should do the job:

{
  "size": 0,
  "query": {
    "term": {
      "line_items.product_id": 1116463
    }
  },
  "aggregations": {
    "country_code": {
      "terms": {
        "field": "country_code",
        "size": 200
      }
    }
  }
}
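As a rough sketch of how the pieces of the SQL statement map onto the request body, the Python below builds the same kind of query and reads the groups back. The helper names `build_top_countries_query` and `rows_from_response` are hypothetical, the sample response is made up, and the sketch assumes country_code is mapped as keyword. A terms aggregation returns a doc_count for every bucket and sorts buckets by doc_count in descending order by default, so setting its size to 3 plays the role of ORDER BY total DESC LIMIT 3 without a separate COUNT(*) sub-aggregation.

```python
import json

def build_top_countries_query(product_id, limit=3):
    """Build a request body roughly equivalent to:
    SELECT count(*), country_code FROM orders
    WHERE product_id = ? GROUP BY country_code
    ORDER BY count(*) DESC LIMIT <limit>
    """
    return {
        "size": 0,  # no hits needed, only the aggregation result
        "query": {"term": {"line_items.product_id": product_id}},  # WHERE
        "aggregations": {
            "country_code": {
                # GROUP BY country_code; size caps the number of buckets
                "terms": {"field": "country_code", "size": limit}
            }
        },
    }

def rows_from_response(response, agg_name="country_code"):
    """Turn the terms buckets into (country_code, total) rows."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return [(b["key"], b["doc_count"]) for b in buckets]

body = build_top_countries_query("0001112223333")
print(json.dumps(body, indent=2))

# Made-up response fragment, only to show the shape of the buckets.
sample_response = {
    "aggregations": {
        "country_code": {
            "buckets": [
                {"key": "US", "doc_count": 12},
                {"key": "GB", "doc_count": 7},
                {"key": "FR", "doc_count": 3},
            ]
        }
    }
}
print(rows_from_response(sample_response))
```

The printed JSON can be sent as the body of a _search request against the index; the top-level size and the size inside terms are independent settings, the first for hits and the second for buckets.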

P.S. Please share your index mapping as well, as it would make the picture clearer.