Date histogram aggregation Elasticsearch


I want to filter and aggregate data from Elasticsearch. I have tried the date histogram aggregation, but it does not do what I need. I have data like this:

[
   {
      "id":1,
      "title":"Sample news",
      "date":"2020-09-17",
      "regulation":[
         {
            "id":1,
            "name":"sample name",
            "date":"2020-09-17"
         },
         {
            "id":2,
            "name":"sample name 1",
            "date":"2020-09-18"
         }
      ]
   },
   {
      "id":2,
      "title":"Sample news 1",
      "date":"2020-09-17",
      "regulation":[
         {
            "id":1,
            "name":"sample name",
            "date":"2020-09-18"
         },
         {
            "id":2,
            "name":"sample name 1",
            "date":"2020-09-17"
         }
      ]
   }
]

I want to filter and get data like:

year: {
  month: {
   day: {
    news: int,
    regulations: int,
   }
 }
}

That is, per-day news and regulation counts arranged as a date hierarchy. So far I can get data like this:

        "2020-09-17" : {
          "key_as_string" : "2020-09-17",
          "key" : 1600300800000,
          "doc_count" : 1
        },
        "2020-09-18" : {
          "key_as_string" : "2020-09-18",
          "key" : 1600387200000,
          "doc_count" : 0
        },
        "2020-09-19" : {
          "key_as_string" : "2020-09-19",
          "key" : 1600473600000,
          "doc_count" : 0
        },

using

GET /news/_search?size=0
{
  "aggs": {
    "news_over_time": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "day",
        "keyed": true,
        "format": "yyyy-MM-dd"
      }
    }
  }
}

But this does not give me what I need. How can I do this using Elasticsearch and Elasticsearch DSL?

Expected response:

2020: {
  09: {
   17: {
    news: 2,
    regulation: 2
   },
   18: {
    news: 0,
    regulation: 2
   }
 }
}

There are 2 answers

0
Lupidon On

I'm not sure what your expected response is, but if you want the number of news items for every day, this is the request you are looking for:

GET /news/_search?size=0
{
  "aggs": {
    "news_over_time": {
      "date_histogram": {
        "field": "regulation.date",
        "calendar_interval": "day",
        "format": "yyyy-MM-dd"
      }
    }
  }
}
4
Sahil Gupta On

The news date and the regulation date are two different fields: one belongs to the parent document and the other to a nested document. I am not completely sure that what you are asking for can be done directly in a single aggregation (I am still exploring this myself). However, the query below should work for you.

GET news/_search
{
  "size": 0,
  "aggs": {
    "news_over_time": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "day",
        "keyed": true,
        "format": "yyyy-MM-dd"
      }
    },
    "regulations_over_time": {
      "nested": {
        "path": "regulation"
      },
      "aggs": {
        "regulation": {
          "date_histogram": {
            "field": "regulation.date",
            "calendar_interval": "day",
            "keyed": true,
            "format": "yyyy-MM-dd"
          }
        }
      }
    }
  }
}

It returns results in the following form:

"aggregations" : {
  "regulations_over_time" : { //<=== Regulations over time based on regulationDate
    "doc_count" : 9,
    "regulation" : {
      "buckets" : {
        "2020-09-17" : {
          "key_as_string" : "2020-09-17",
          "key" : 1600300800000,
          "doc_count" : 5
        },
        "2020-09-18" : {
          "key_as_string" : "2020-09-18",
          "key" : 1600387200000,
          "doc_count" : 4
        }
      }
    }
  },
  "news_over_time" : { //<======= news over time based on news date
    "buckets" : {
      "2020-09-17" : {
        "key_as_string" : "2020-09-17",
        "key" : 1600300800000,
        "doc_count" : 2
      },
      "2020-09-18" : {
        "key_as_string" : "2020-09-18",
        "key" : 1600387200000,
        "doc_count" : 2
      }
    }
  }
}
}

You can then merge these 2 stats together if required.
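
The merge itself can be done on the client side. Below is a minimal Python sketch of one way to do it, assuming both histograms were requested with "keyed": true and a year-month-day format such as "yyyy-MM-dd", so that every bucket key looks like "2020-09-17". The function name merge_daily_counts and the output keys "news" and "regulations" are hypothetical names chosen for illustration; the input is the "aggregations" object of the response shown above.

```python
import json
from collections import defaultdict


def merge_daily_counts(aggregations):
    """Merge the two keyed date histograms into {year: {month: {day: counts}}}.

    Assumes bucket keys are strings of the form "YYYY-MM-DD".
    """
    news_buckets = aggregations["news_over_time"]["buckets"]
    regulation_buckets = aggregations["regulations_over_time"]["regulation"]["buckets"]

    merged = defaultdict(lambda: defaultdict(dict))
    # Take the union of dates so a day present in only one histogram still appears.
    for date_key in sorted(set(news_buckets) | set(regulation_buckets)):
        year, month, day = date_key.split("-")
        merged[year][month][day] = {
            "news": news_buckets.get(date_key, {}).get("doc_count", 0),
            "regulations": regulation_buckets.get(date_key, {}).get("doc_count", 0),
        }
    # Convert the nested defaultdicts to plain dicts for clean serialization.
    return {year: dict(months) for year, months in merged.items()}


# Sample input: the "aggregations" object copied from the output shown above.
sample_aggregations = {
    "regulations_over_time": {
        "doc_count": 9,
        "regulation": {
            "buckets": {
                "2020-09-17": {"key_as_string": "2020-09-17", "key": 1600300800000, "doc_count": 5},
                "2020-09-18": {"key_as_string": "2020-09-18", "key": 1600387200000, "doc_count": 4},
            }
        },
    },
    "news_over_time": {
        "buckets": {
            "2020-09-17": {"key_as_string": "2020-09-17", "key": 1600300800000, "doc_count": 2},
            "2020-09-18": {"key_as_string": "2020-09-18", "key": 1600387200000, "doc_count": 2},
        }
    },
}

result = merge_daily_counts(sample_aggregations)
print(json.dumps(result, indent=2))
```

Days that appear in only one of the two histograms get a count of 0 for the other, which matches the shape asked for in the question. If the client library returns a response object rather than a plain dict, it would need to be converted to a dict first.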