Putting filters with parameters directly in an analyzer causes an error


I have been trying to add a unique token filter with parameters directly in my custom analyzer, without defining a separate token filter.

According to the documentation, a token filter with parameters can be used without defining a separate token filter, as in this example: https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-length-tokenfilter.html#analysis-length-tokenfilter-analyze-ex

The documentation for the unique token filter: https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-unique-tokenfilter.html

PUT pokemon
{
  "settings": 
  {
    "analysis": 
    {
      "analyzer": 
      {
        "deba_analyzer": 
        {
          "char_filter": ["html_strip"],
          "tokenizer": "uax_url_email",
          "filter": ["lowercase","stop","kstem",{"type": "unique","only_on_same_position": true}]
        }
      }
    }
  },
  "mappings": 
  {
    "properties": 
    {
      "name": 
      {
        "type":"text"
      }
    }
  }
}

However, it gives me this error:

"reason" : "Failed to load settings from [{\"analysis\":{\"analyzer\":{\"deba_analyzer\":{\"filter\":[{\"only_on_same_position\":true,\"type\":\"unique\"}],\"char_filter\":[\"html_strip\"],\"tokenizer\":\"uax_url_email\"}}}}]",
"caused_by" : {
  "type" : "illegal_state_exception",
  "reason" : "only value lists are allowed in serialized settings"
}

There is 1 answer

Joe

Even the unique token filter page you linked defines the filter under a separate filter section of the index settings, and you'll need to do the same. In other words, the analyzer's filter array accepts only:

  1. names of built-in filters with default settings
  2. or names of custom filters which need to be defined separately.

So use this, with the filter referenced by name in the analyzer, defined under analysis.filter, and the analyzer assigned to the name field:

PUT pokemon
{
  "settings": {
    "analysis": {
      "analyzer": {
        "deba_analyzer": {
          "char_filter": [
            "html_strip"
          ],
          "tokenizer": "uax_url_email",
          "filter": [
            "lowercase",
            "stop",
            "kstem",
            "my_unique_filter"
          ]
        }
      },
      "filter": {
        "my_unique_filter": {
          "type": "unique",
          "only_on_same_position": true
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": {
        "analyzer": "deba_analyzer",
        "type": "text"
      }
    }
  }
}

And don't forget to assign your analyzer to the field in the mappings. If you don't, it will be registered but not applied to any of your fields.
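
One way to check that the analyzer is wired up, assuming the index was created with the request above, is to run some text through the index-scoped _analyze API against the name field. The sample text here is hypothetical:

GET pokemon/_analyze
{
  "field": "name",
  "text": "<b>Pikachu</b> is an Electric Pokemon"
}

If the name field picked up deba_analyzer, the returned tokens should come back lowercased and without the HTML tags.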


EDIT

The _analyze API from your example does allow specifying filters inline with their parameters, but the create index API (PUT pokemon) does not.
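
To illustrate the difference, here is a minimal sketch of an _analyze request that passes the unique filter inline as an object. The whitespace tokenizer and the sample text are assumptions chosen for illustration, not taken from the question:

GET _analyze
{
  "tokenizer": "whitespace",
  "filter": [
    "lowercase",
    {
      "type": "unique",
      "only_on_same_position": true
    }
  ],
  "text": "the quick fox jumps over the lazy fox"
}

The same inline object inside an analyzer definition in the index settings is what triggers the "only value lists are allowed in serialized settings" error.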