Nodetool describecluster lists all nodes as unreachable


I am deploying Cassandra across two public networks. When the nodes are started, I can see that all of them have joined the ring, and nodetool describecluster shows all nodes as reachable.

After some time, the nodes are no longer able to connect to each other, and nodetool describecluster shows all nodes in the unreachable list.

FYI, I have used public_ip as BROADCAST_ADDRESS and RPC_ADDRESS. The listen address is the private_ip.


There is 1 answer

Aaron

One reason this can happen is that firewalls are sometimes configured to find and kill idle connections. The Linux kernel has TCP "keepalive" settings that it can use to refresh long-running connections. The default values for these settings can be seen using sysctl:

$ sudo sysctl -a | grep keepalive
net.ipv4.tcp_keepalive_intvl = 75
net.ipv4.tcp_keepalive_probes = 9
net.ipv4.tcp_keepalive_time = 7200

In an effort to combat this problem, DataStax recommends adjusting these values in production deployments:

$ sudo sysctl -w \
net.ipv4.tcp_keepalive_time=60 \
net.ipv4.tcp_keepalive_probes=3 \
net.ipv4.tcp_keepalive_intvl=10
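
To see why the smaller values matter, it can help to work out the timing. The kernel sends the first keepalive probe only after a connection has been idle for tcp_keepalive_time seconds, so a firewall with a shorter idle timeout may drop the connection before any probe is sent. The sketch below uses a hypothetical helper, keepalive_detect_seconds (not part of any tool), to compute roughly how long an idle connection to a dead peer survives before the kernel gives up on it, using time + probes * intvl:

```shell
# Approximate seconds before the kernel declares an idle, unresponsive peer dead:
# idle time before the first probe + (number of probes * interval between probes).
# keepalive_detect_seconds is an illustrative helper, not a standard command.
keepalive_detect_seconds() {
  echo $(( $1 + $2 * $3 ))
}

keepalive_detect_seconds 7200 9 75   # kernel defaults shown earlier -> 7875
keepalive_detect_seconds 60 3 10     # recommended values above      -> 90
```

With the defaults, an idle connection sees no keepalive traffic for two hours. With the recommended values, a probe goes out after 60 seconds of idleness.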

To make the change persist across reboots, you can also add each of those values to your system's equivalent of the /etc/sysctl.conf file (minus the backslashes) and apply them with sysctl as well:

sudo sysctl -p /etc/sysctl.conf
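
As a rough sketch of what the entries might look like in the file, the helper below (keepalive_conf, a name made up for this example) prints the three settings in sysctl.conf syntax, one key = value pair per line. It assumes your distribution reads /etc/sysctl.conf; adjust the path if yours uses a different location.

```shell
# Print the three keepalive settings in sysctl.conf syntax.
# keepalive_conf is an illustrative helper, not a standard command.
keepalive_conf() {
  cat <<'EOF'
net.ipv4.tcp_keepalive_time = 60
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_keepalive_intvl = 10
EOF
}

# Example usage (commented out because it modifies system configuration):
# keepalive_conf | sudo tee -a /etc/sysctl.conf
# sudo sysctl -p /etc/sysctl.conf
```

Afterwards, rerunning the sysctl -a | grep keepalive command from earlier should show the new values.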