Migrate NEST 6.6 to Elasticsearch client 8. Unclear on how to define analyzers

I'm trying to migrate my Elasticsearch client code from NEST 6.6 to Elastic.Clients.Elasticsearch 8, and I am having difficulty understanding how to properly migrate my analyzer definitions.

This is the code I previously had (as simplified as possible) using NEST 6.6:

Client.CreateIndex(
  index => index.Settings(
    settings => settings.Analysis(
      analysis => analysis
        .TokenFilters(tokenFilter => tokenFilter
          .Synonym("synonym", syn => syn.SynonymsPath("analysis/synonym.txt")))
        .Analyzers(analyzers => analyzers
          .Custom("mycustom", cust => cust
            .Filters("stop", "synonym")
            .Tokenizer("standard")))
    )
  )
);

This creates the index with the following:

{
  "settings": {
    "index": {
      "analysis": {
        "filter": {
          "synonym": {
            "type": "synonym",
            "synonyms_path": "analysis/synonym.txt"
          }
        },
        "analyzer": {
          "mycustom": {
            "filter": [
              "stop",
              "synonym"
            ],
            "type": "custom",
            "tokenizer": "standard"
          }
        }
      }
    }
  }
}

Here is my attempt at migrating to Elastic.Clients.Elasticsearch 8:

Client.Indices.Create(index => index.Settings(
 settings => settings.Analysis(
   analysis => analysis
     .Filter(tokenFilter => tokenFilter.Add(
        "synonym", new TokenFilter(new TokenFilterDefinitions(
            // This is where I start getting lost
            new Dictionary<string, ITokenFilterDefinition> {
            { "synonym", new SynonymTokenFilter() { // What are the keys meant to be?
                  SynonymsPath = "analysis/synonym.txt"
             } } }))))
     .Analyzer(analyzers =>
        analyzers.Custom("mycustom", cust => cust.Filter(new[] {"stop", "synonym"})
                 .Tokenizer("standard"))
        )) 
));

This is clearly the wrong syntax because the generated JSON request looks like:

"filter": {
    "synonym": {
        "synonym": {
            "synonyms_path": "analysis/synonym.txt",
            "type": "synonym"
         }
    }
...

I have also tried:

tokenFilter.Add("synonym", new SynonymTokenFilter() { 
   SynonymsPath = "analysis/synonym.txt"
})

in an attempt to move it up in the JSON hierarchy, but then it does not compile because SynonymTokenFilter is not compatible with TokenFilter, which is the type tokenFilter.Add requires.

I still don't understand how to recreate the same kind of index I had before in code.

There are 2 answers

Answered by Pynt (best answer)

I imagine there've been a lot of additions to the new library.

Here is an example of how to implement it with Elastic.Clients.Elasticsearch 8.1.2:

var response = await _client.Indices
    .CreateAsync("myindex", config => config
        .Settings(settings => settings
            .Analysis(a => a
                .Analyzers(analyzers => analyzers
                    .Custom("mycustom", cust => cust
                        // A custom analyzer needs a tokenizer; this matches the original index.
                        .Tokenizer("standard")
                        .Filter(new string[] { "stop", "synonym" })
                    )
                )
                .TokenFilters(tokenFilters => tokenFilters
                    .Synonym("synonym", syn => syn
                        .SynonymsPath("analysis/synonym.txt")
                    )
                )
            )
        )
    );

Answered by apokryfos

I also asked this question on the Elastic discussion forums. The response there was that this is a bug in the new client's code generation, and an issue was raised for it.

In the meantime, I've worked around this issue by sending the request through the low-level transport, using code like the following:

var response = Client.Transport.Request<CreateIndexResponse>(
    HttpMethod.PUT,
    $"/{indexName}",
    PostData.String("<actual index definition as a JSON string>")
);

This works, but it is less than ideal because we have to maintain the index definition as a JSON string instead of defining it in code.
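One way to soften that drawback might be to build the JSON from an anonymous object and serialize it with System.Text.Json, so the definition at least lives in code. This is only a sketch and not something the Elastic client provides: the variable names `indexDefinition` and `json` are made up for illustration, and the object simply mirrors the settings JSON shown in the question.

```csharp
using System;
using System.Text.Json;

// Anonymous object mirroring the index settings JSON from the question.
// Property names are written exactly as Elasticsearch expects them,
// because System.Text.Json keeps them as-is by default.
var indexDefinition = new
{
    settings = new
    {
        index = new
        {
            analysis = new
            {
                filter = new
                {
                    synonym = new
                    {
                        type = "synonym",
                        synonyms_path = "analysis/synonym.txt"
                    }
                },
                analyzer = new
                {
                    mycustom = new
                    {
                        type = "custom",
                        tokenizer = "standard",
                        filter = new[] { "stop", "synonym" }
                    }
                }
            }
        }
    }
};

// Serialize to a compact JSON string.
string json = JsonSerializer.Serialize(indexDefinition);
Console.WriteLine(json);

// Assumed usage with the workaround above (Client and indexName come from that snippet):
// Client.Transport.Request<CreateIndexResponse>(
//     HttpMethod.PUT, $"/{indexName}", PostData.String(json));
```

The resulting string can be passed to PostData.String(...) in place of the hand-written JSON. You still lose the typed descriptors, but typos in the structure are a bit easier to spot than inside a string literal.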