Elasticsearch 5.x: query for documents where a field does not exist or, if it exists, has a given value


In the query below, stop is a timestamp field.

I want to filter the data with the following condition:

  1. the stop field does not exist
  2. or the stop field value is >= now

I know I should use must_not, but I cannot figure out how.

I want to score the child type, use that score to sort the parents, and then filter the parents using the stop field.

GET indexName/parentType/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "has_child": {
            "type": "child-type",
            "score_mode": "max",
            "query": {
              "function_score": {
                "functions": [
                  {
                    "script_score": {
                      "script": {
                        "file": "score-analytics",
                        "lang": "expression"
                      }
                    }
                  }
                ]
              }
            }
          }
        },
        {
          "bool": {
            "should": [
              {
                "range": {
                  "stop": {
                    "gte": "now"
                  }
                }
              }
            ]
          }
        }
      ]
    }
  }
}

There are 2 answers

Lax On

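You need to combine the range query with an exists query negated by must_not, so that a document matches if stop is in the future or is missing altogether:

"bool": {
  "should": [
    {
      "range": {
        "stop": { "gte": "now" }
      }
    },
    {
      "bool": {
        "must_not": [
          { "exists": { "field": "stop" } }
        ]
      }
    }
  ]
}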
Kulasangar On

What if you use a must_not bool condition to match only documents where the field does not exist? Your query could look something like this:

"query": {
    "bool": {
      "filter": {
        "bool": {
          "must_not": [
            {
              "exists": {
                "field": "stop" <-- give the field which shouldn't exist
              }
            }
          ]
        }
      }
    }
  }

The above is a sample that you can adapt. For the second condition, the range query you already have should do; I don't know of a better way to filter on a timestamp range. Hope it helps!
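
Putting the two conditions together with the original has_child query could look roughly like the request body below, sent to GET indexName/parentType/_search as in the question. This is only a sketch, not something taken from either answer: it assumes the stop condition belongs in the filter clause of the outer bool (so that it restricts which parents match without contributing to the score), and it sets minimum_should_match explicitly to make the "either one" intent visible. The has_child part is copied unchanged from the question.

```json
{
  "query": {
    "bool": {
      "must": [
        {
          "has_child": {
            "type": "child-type",
            "score_mode": "max",
            "query": {
              "function_score": {
                "functions": [
                  {
                    "script_score": {
                      "script": {
                        "file": "score-analytics",
                        "lang": "expression"
                      }
                    }
                  }
                ]
              }
            }
          }
        }
      ],
      "filter": [
        {
          "bool": {
            "should": [
              { "range": { "stop": { "gte": "now" } } },
              {
                "bool": {
                  "must_not": [
                    { "exists": { "field": "stop" } }
                  ]
                }
              }
            ],
            "minimum_should_match": 1
          }
        }
      ]
    }
  }
}
```

Because the stop condition sits in filter context, the parent's score should still come only from the child scoring, which is what the sort relies on. If the condition were left under must instead, the range clause would add to the score of parents that have a stop value.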