I have installed two Elasticsearch podman containers on two different servers (primary and secondary).
Primary elasticsearch.yml config:
network.host: 0.0.0.0
cluster.name: xyz-es
node.name: node-primary # On the primary, use 'node-primary', and 'node-secondary' on the secondary
path.data: /usr/share/elasticsearch/data
discovery.seed_hosts: ["172.16.211.99", "172.18.205.99"] # ["ip_of_primary", "ip_of_secondary"]
cluster.initial_master_nodes: ["node-primary", "node-secondary"]
xpack.security.enabled: false
Secondary elasticsearch.yml config:
network.host: 0.0.0.0
cluster.name: xyz-es
node.name: node-secondary # On the primary, use 'node-primary', and 'node-secondary' on the secondary
path.data: /usr/share/elasticsearch/data
discovery.seed_hosts: ["172.16.211.99", "172.18.205.99"] # ["ip_of_primary", "ip_of_secondary"]
cluster.initial_master_nodes: ["node-primary", "node-secondary"]
xpack.security.enabled: false
However, the secondary is unable to copy the data from the primary server. The logs show that the secondary Elasticsearch node completes the initial handshake with the primary node, but the follow-up connection then fails with a NoRouteToHostException error:
{"@timestamp":"2023-10-26T08:36:24.295Z", "log.level": "WARN", "message":"address [172.16.211.99:9300], node [null], requesting [false] discovery result: [node-primary][10.89.0.9:9300] connect_exception: Failed execution: io.netty.channel.AbstractChannel$AnnotatedNoRouteToHostException: No route to host: 10.89.0.9/10.89.0.9:9300: No route to host: 10.89.0.9/10.89.0.9:9300: No route to host", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[node-secondary][generic][T#3]","log.logger":"org.elasticsearch.discovery.PeerFinder","elasticsearch.node.name":"node-secondary","elasticsearch.cluster.name":"xyz-es"}
{"@timestamp":"2023-10-26T09:19:32.071Z", "log.level": "WARN", "message":"completed handshake with [{node-primary}{BGp7fx9kTR6Fh6m1pCan2g}{JfFpk5WNSgG0ayvMP8uB8g}{node-primary}{10.89.0.9}{10.89.0.9:9300}{cdfhilmrstw}{8.10.2}{7000099-8100299}] at [172.16.211.99:9300] but followup connection to [10.89.0.9:9300] failed", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[node-secondary][generic][T#3]","log.logger":"org.elasticsearch.discovery.HandshakingTransportAddressConnector","elasticsearch.node.name":"node-secondary","elasticsearch.cluster.name":"xyz-es","error.type":"org.elasticsearch.transport.ConnectTransportException","error.message":"[node-primary][10.89.0.9:9300] connect_exception","error.stack_trace":"org.elasticsearch.transport.ConnectTransportException: [node-primary][10.89.0.9:9300] connect_exception\n\tat [email protected]/org.elasticsearch.transport.TcpTransport$ChannelsConnectedListener.onFailure(TcpTransport.java:1154)\n\tat [email protected]/org.elasticsearch.action.support.SubscribableListener$FailureResult.complete(SubscribableListener.java:285)\n\tat [email protected]/org.elasticsearch.action.support.SubscribableListener.tryComplete(SubscribableListener.java:197)\n\tat [email protected]/org.elasticsearch.action.support.SubscribableListener.setResult(SubscribableListener.java:222)\n\tat [email protected]/org.elasticsearch.action.support.SubscribableListener.onFailure(SubscribableListener.java:141)\n\tat [email protected]/org.elasticsearch.transport.netty4.Netty4TcpChannel.lambda$addListener$0(Netty4TcpChannel.java:62)\n\tat [email protected]/io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:590)\n\tat [email protected]/io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:583)\n\tat [email protected]/io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:559)\n\tat [email 
protected]/io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:492)\n\tat [email protected]/io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:636)\n\tat [email protected]/io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:629)\n\tat [email protected]/io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:118)\n\tat [email protected]/io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:321)\n\tat [email protected]/io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:337)\n\tat [email protected]/io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:776)\n\tat [email protected]/io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:689)\n\tat [email protected]/io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:652)\n\tat [email protected]/io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)\n\tat [email protected]/io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)\n\tat [email protected]/io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)\n\tat java.base/java.lang.Thread.run(Thread.java:1623)\nCaused by: org.elasticsearch.common.util.concurrent.UncategorizedExecutionException: Failed execution\n\tat [email protected]/org.elasticsearch.action.support.SubscribableListener.wrapAsExecutionException(SubscribableListener.java:178)\n\tat [email protected]/org.elasticsearch.common.util.concurrent.ListenableFuture.wrapException(ListenableFuture.java:38)\n\tat [email protected]/org.elasticsearch.common.util.concurrent.ListenableFuture.wrapException(ListenableFuture.java:27)\n\t... 18 more\nCaused by: java.util.concurrent.ExecutionException: io.netty.channel.AbstractChannel$AnnotatedNoRouteToHostException: No route to host: 10.89.0.9/10.89.0.9:9300\n\t... 
21 more\nCaused by: io.netty.channel.AbstractChannel$AnnotatedNoRouteToHostException: No route to host: 10.89.0.9/10.89.0.9:9300\nCaused by: java.net.NoRouteToHostException: No route to host\n\tat java.base/sun.nio.ch.Net.pollConnect(Native Method)\n\tat java.base/sun.nio.ch.Net.pollConnectNow(Net.java:673)\n\tat java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:973)\n\tat [email protected]/io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:337)\n\tat [email protected]/io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)\n\tat [email protected]/io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:776)\n\tat [email protected]/io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:689)\n\tat [email protected]/io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:652)\n\tat [email protected]/io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)\n\tat [email protected]/io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)\n\tat [email protected]/io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)\n\tat java.base/java.lang.Thread.run(Thread.java:1623)\n"}
As the logs show, the secondary server contacts the primary at 172.16.211.99:9300, and the primary responds with its container-internal IP (10.89.0.9), which is not reachable over the network.
The Kube.yml:
# Save the output of this file and use kubectl create -f to import
# it into Kubernetes.
#
# Created with podman-4.4.1
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: "2023-10-09T06:58:43Z"
  labels:
    app: xyz-elascitsearch
  name: xyz-elascitsearch
spec:
  containers:
  - args:
    - eswrapper
    image: docker.elastic.co/elasticsearch/elasticsearch:8.10.2
    name: engine
    ports:
    - containerPort: 9200
      hostPort: 9200
    - containerPort: 9300
      hostPort: 9300
    resources: {}
    securityContext: {}
    volumeMounts:
    - mountPath: /usr/share/elasticsearch/config/elasticsearch.yml
      name: home-xyz-podman-projects-xyz-elascitsearch-config-elasticsearch.yml-host-0
    - mountPath: /usr/share/elasticsearch/data
      name: home-xyz-podman-projects-xyz-elascitsearch-data-host-1
  hostname: xyz-elascitsearch
  restartPolicy: Never
  volumes:
  - hostPath:
      path: /home/xyz/podman/projects/xyz-elascitsearch/config/elasticsearch.yml
      type: File
    name: home-xyz-podman-projects-xyz-elascitsearch-config-elasticsearch.yml-host-0
  - hostPath:
      path: /home/xyz/podman/projects/xyz-elascitsearch/data
      type: Directory
    name: home-xyz-podman-projects-xyz-elascitsearch-data-host-1
status: {}
How can I force the secondary node to use the seed_hosts IP instead of the IP returned in the response? Or is there another configuration I am missing? I'd appreciate any help.
Found the solution:
Add publish_host to the config so that other nodes know which IP to connect to. Without it, the node advertises its default host address, which in this case was the container's internal IP.
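For reference, here is a minimal sketch of what that change might look like. It assumes the full setting name is network.publish_host (the post above only says "publish_host") and that each node should advertise its own host IP from the seed_hosts list; all other settings stay as in the configs above.

```yaml
# Primary elasticsearch.yml (only the changed lines shown)
network.host: 0.0.0.0                  # keep binding on all interfaces inside the container
network.publish_host: 172.16.211.99    # assumption: the primary's host IP, as listed first in seed_hosts

# Secondary elasticsearch.yml (only the changed lines shown)
# network.host: 0.0.0.0
# network.publish_host: 172.18.205.99  # assumption: the secondary's host IP, as listed second in seed_hosts
```

network.publish_host affects both the HTTP and the transport publish address. If only the node-to-node address needs to change, transport.publish_host is the narrower setting. Either way, port 9300 has to stay mapped to the host (hostPort: 9300 in the Kube.yml) so that the published address is actually reachable.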