sql query translation Query DSL kibana

1.6k views Asked by At

Please explain the logic of the query translation from sql in kibana console. The most confusing is "order" : "asc", while i request desc.

The numbers "10985", and "11030" also looks very strange. If I re-run translation these numbers are changing.

I make a query translation:

    POST _sql/translate
{
  "query": "SELECT day_of_week, avg(taxful_total_price) FROM kibana_sample_data_ecommerce WHERE customer_id = 52 GROUP BY day_of_week ORDER BY avg(taxful_total_price) DESC LIMIT 2"
  }

Translation:

    {
  "size" : 0,
  "query" : {
    "term" : {
      "customer_id" : {
        "value" : 52,
        "boost" : 1.0
      }
    }
  },
  "_source" : false,
  "stored_fields" : "_none_",
  "aggregations" : {
    "groupby" : {
      "composite" : {
        "size" : 1000,
        "sources" : [
          {
            "10985" : {
              "terms" : {
                "field" : "day_of_week",
                "missing_bucket" : true,
                "order" : "asc"
              }
            }
          }
        ]
      },
      "aggregations" : {
        "11030" : {
          "avg" : {
            "field" : "taxful_total_price"
          }
        }
      }
    }
  }
}
2

There are 2 answers

2
littledaxter On

Regarding the changing number, they are only the names of the aggregation; as if you would use the "AS 11030" in sql. It allows you to use the aggregation in other aggregation by refeering to its name. It is mandatory to have a name for an aggregation in the structure of the ES query. Perhaps it is normal behavior of the translation to randomly name the aggregations with numbers. It should not have incidence over the results of the query or it's behavior.

0
Andrei Stefan On

The aggregation names don't have a meaning, they are generated randomly, but ES-SQL keeps track of which is which (obviously) and whenever one aggregation needs the result of another one, it knows which one to use.

There is an improvement request - https://github.com/elastic/elasticsearch/issues/43531 - to have consistent naming, so that caching can be used. Not yet implemented, and currently not on the near future roadmap.

Regarding the asc DESC discrepancy, if you pay close attention, your DESC ordering is on AVG while the asc sorting is for the composite's terms aggregations, which are two different things. ES-SQL performs the sorting on aggregations "client" side (as opposed to server/ES side), because there is no ES query that would perform that sorting while using composite aggregation.

Bottom line is that the query you see there is complete, but not quite, because that's the query ES-SQL runs on Elasticsearch, but there is also a client-side task that performs the actual sorting by AVG. There is an improvement issue already opened to enhance the translate API to indicate that, at least, the generated query may not be enough to achieve the same results one user would if he/she would run that query on ES itself.