I am running into errors when I try to disable dynamic mapping in my Elasticsearch index settings. I am using Elasticsearch 1.7.
Stack trace:
8151 [main] WARN org.apache.hadoop.mapred.YarnChild - Exception running child : org.elasticsearch.hadoop.rest.EsHadoopInvalidRequest: Found unrecoverable error [10.74.51.71:9200] returned Not Found(404) - [TypeMissingException[[test_2017051222] type[[vehicle, trying to auto create mapping, but dynamic mapping is disabled]] missing]]; Bailing out..
at org.elasticsearch.hadoop.rest.RestClient.retryFailedEntries(RestClient.java:207)
at org.elasticsearch.hadoop.rest.RestClient.bulk(RestClient.java:170)
at org.elasticsearch.hadoop.rest.RestRepository.tryFlush(RestRepository.java:225)
at org.elasticsearch.hadoop.rest.RestRepository.flush(RestRepository.java:248)
at org.elasticsearch.hadoop.rest.RestRepository.doWriteToIndex(RestRepository.java:187)
at org.elasticsearch.hadoop.rest.RestRepository.writeToIndex(RestRepository.java:163)
at org.elasticsearch.hadoop.mr.EsOutputFormat$EsRecordWriter.write(EsOutputFormat.java:151)
at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:566)
at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
at org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer$Context.write(WrappedReducer.java:105)
at org.apache.hadoop.mapreduce.Reducer.reduce(Reducer.java:150)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:635)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:390)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Settings snippet:
"settings": {
  "number_of_shards": 5,
  "number_of_replicas": 1,
  "index.query.default_field": "test",
  "index.refresh_interval": "5s",
  "index.mapper.dynamic": false,
  "analysis": {
    "filter": {
      "ngram_filter": {
        "type": "ngram",
        "min_gram": 2,
        "max_gram": 18,
        "token_chars": [
          "letter",
          "digit"
        ]
      }
    },
    "analyzer": {
      "ngram_analyzer": {
        "type": "custom",
        "tokenizer": "standard",
        "filter": [
          "lowercase",
          "ngram_filter"
        ]
      }
    }
  }
}
The settings returned by the ES endpoint show that dynamic mapping is disabled, but the job fails. I have an Avro JSON mapping file and an ES JSON mapping file; the Avro file is the superset and the ES file is a subset of it. I do not want every field from the superset mapping file to be reflected in the ES index, only the fields listed in the subset mapping file. Am I doing this wrong, or is there another way to do it?
Thanks.
That's because you have set
"index.mapper.dynamic": false
which means new types will not be created automatically; a type has to be declared in the index mappings before documents can be written to it. Your job writes to the type vehicle, which is not declared in the index, hence the TypeMissingException. What you want to do instead is set
"dynamic": "false"
in the mapping of your type:
PUT /test_index
{
  "mappings": {
    "test_type": {
      "dynamic": "false"
    }
  }
}
For more info: https://www.elastic.co/guide/en/elasticsearch/guide/1.x/dynamic-mapping.html
Example:
Create the index with the mapping:
PUT /my_index
{
  "mappings": {
    "testing": {
      "dynamic": "false",
      "properties": {
        "field1": { "type": "string", "index": "analyzed" }
      }
    }
  }
}
Index a document into the testing type:
POST /my_index/testing/1
{ "field1": "demo", "field99": "anotherDemo" }
Response of GET /my_index/testing/_mapping:
{
  "my_index": {
    "mappings": {
      "testing": {
        "dynamic": "false",
        "properties": {
          "field1": { "type": "string" }
        }
      }
    }
  }
}
As you can see, there is no mapping for field99.
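One caveat: with "dynamic": "false" the unmapped field is not indexed, but it is still kept in the stored _source of the document. If the extra fields should not reach Elasticsearch at all, they can be dropped before the write. The following is a minimal sketch of that idea in Python, assuming flat (non-nested) records; the helper names mapped_fields and keep_mapped_fields are hypothetical and not part of Elasticsearch or es-hadoop.

```python
# Sketch only: these helpers are hypothetical, not an Elasticsearch or es-hadoop API.
# They handle top-level fields only; nested objects would need a recursive version.

def mapped_fields(mapping_response, index, doc_type):
    """Return the top-level field names declared for doc_type in a GET _mapping response."""
    type_mapping = mapping_response[index]["mappings"][doc_type]
    return set(type_mapping.get("properties", {}))


def keep_mapped_fields(record, allowed):
    """Return a copy of record containing only the keys listed in allowed."""
    return {key: value for key, value in record.items() if key in allowed}


# The mapping response and the document from the example above.
mapping_response = {
    "my_index": {
        "mappings": {
            "testing": {
                "dynamic": "false",
                "properties": {"field1": {"type": "string"}},
            }
        }
    }
}
document = {"field1": "demo", "field99": "anotherDemo"}

allowed = mapped_fields(mapping_response, "my_index", "testing")
filtered = keep_mapped_fields(document, allowed)
print(filtered)  # field99 is dropped before the document would be sent
```

In a MapReduce job the same filtering would happen in the reducer before the record is written. The es-hadoop configuration also documents an es.mapping.include setting for restricting which fields are sent; whether it is available depends on the connector version, so check the documentation for the version in use.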