Can't start second node in MySQL Galera cluster


All actions are performed on Debian 7 virtual machines. Two nodes have the following installed: Galera replicator, MySQL Galera from Codership, percona-xtrabackup, and netcat-openbsd (required by percona-xtrabackup). The third node has only the Galera replicator and acts as an arbitrator, running garbd.

Config on node #1 (192.168.0.102)

wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_provider_options="gcache.size=2G"
wsrep_cluster_name="clusterTest"
wsrep_cluster_address="gcomm://"
wsrep_node_name="node-1"
wsrep_node_address=192.168.0.102
wsrep_node_incoming_address=192.168.0.102
wsrep_slave_threads=16
wsrep_sst_method=xtrabackup
wsrep_sst_receive_address=192.168.0.102
wsrep_sst_auth=root:somepass

Config on node #2 (192.168.0.103)

wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_provider_options="gcache.size=2G"
wsrep_cluster_name="clusterTest"
wsrep_cluster_address="gcomm://192.168.0.102"
wsrep_node_name="node-2"
wsrep_node_address=192.168.0.103
wsrep_node_incoming_address=192.168.0.103
wsrep_slave_threads=16
wsrep_sst_method=xtrabackup
wsrep_sst_receive_address=192.168.0.103
wsrep_sst_auth=root:somepass
wsrep_sst_donor="node-1"

On the first run, only node-1 has a test database; let's call it testDB.

What I do:

1. node-1> service mysql start
Result: node is working, testDB is accessible from any host and the node itself.
2. node-3> garbd --address gcomm://192.168.0.102,192.168.0.103 --group "clusterTest"
Result: the cluster size is 2.
3. node-2> service mysql start
Result: the cluster size is 3, but the init script reports that the service failed to start. The processes are running, however, and the SST is performed.

Also I can't access mysql running on node-2:

ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (111)

And from remote host:

PHP Warning:  mysqli::mysqli(): (HY000/2003): Can't connect to MySQL server on '192.168.0.103' (111)

Cluster state from node-1:

wsrep_local_state_comment    | Donor/Desynced
wsrep_incoming_addresses     | 192.168.0.102:3306,,192.168.0.103:3306
wsrep_cluster_conf_id        | 3                                     
wsrep_cluster_size           | 3   

If I start MySQL on node-2 with wsrep_provider set to "none", the database is fully accessible from both the local and remote hosts and is identical to the database on node-1. If I start the cluster again, the situation repeats: node-2 is visible only to the other nodes, the cluster becomes desynced, and node-2 is accessible neither from the console nor from remote hosts.


There are 2 answers

When1ConsumeMem On

Your most helpful tool when troubleshooting Galera issues will be the MySQL error logs. In Debian, they are located in /var/log/syslog by default.
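
As a rough sketch of how to pull the relevant lines out of that log: the helper below filters a log file for Galera and mysqld messages. The function name wsrep_log_lines is made up for this example, and the pattern assumes Galera messages carry the usual "WSREP" prefix; adjust the path if your logs go elsewhere.

```shell
# Hypothetical helper: print only the Galera/MySQL lines from a log file.
# Defaults to /var/log/syslog, the location mentioned above.
wsrep_log_lines() {
  log_file="${1:-/var/log/syslog}"
  # Galera messages are normally tagged "WSREP"; mysqld lines carry the daemon name.
  grep -E 'WSREP|mysqld' -- "$log_file"
}

# Example (run on node-2 right after the failed start):
#   wsrep_log_lines | tail -n 50
```

Running it on the joining node right after the init script reports failure should show whether the SST script itself logged an error.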

It appears you're using Node 1 to bootstrap your cluster. It's critical to get your wsrep_cluster_address settings correct. The settings for both nodes should be as follows:

Node 1

wsrep_cluster_address=gcomm://


Node 2

wsrep_cluster_address=gcomm://192.168.0.102,192.168.0.103
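
If you generate these values from a script, a small helper like the one below may help avoid typos. This is only a sketch: gcomm_address is a name invented for this example, and it does nothing more than join the IP addresses it is given into a gcomm:// URL. With no arguments it prints the empty bootstrap address used on Node 1.

```shell
# Hypothetical helper: build a wsrep_cluster_address value from a list of node IPs.
gcomm_address() {
  addr=""
  for ip in "$@"; do
    # Append with a comma only when addr already holds an entry.
    addr="${addr:+$addr,}$ip"
  done
  printf 'gcomm://%s\n' "$addr"
}

# Example:
#   gcomm_address 192.168.0.102 192.168.0.103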
Ho.Farahmand On

In my case, I had not installed lsof, iproute2, rsync, dnsutils, and procps.

Please make sure they are installed.

apt install -y lsof iproute2 rsync dnsutils procps
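
To check what is missing before installing anything, something like the following could be used. It is a minimal sketch: missing_tools is a made-up name, and the example call assumes the usual binaries shipped by those packages (ss from iproute2, dig from dnsutils, ps from procps).

```shell
# Hypothetical helper: print every command from the argument list
# that cannot be found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Example:
#   missing_tools lsof ss rsync dig ps
```

Empty output means all of the listed commands were found.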