This howto will show you how to build a basic multi-master MySQL cluster (actually, a Percona XtraDB cluster) with a load balancer in front of it.
Software:
- OS: Ubuntu 12.04 LTS (Precise Pangolin).
- Database: Percona XtraDB Cluster. Percona XtraDB is a higher-performance, backwards-compatible, drop-in replacement for the InnoDB storage engine in MySQL. Percona XtraDB Cluster is Percona Server with the Galera multi-master clustering libraries built in.
- Load balancer: HAProxy, a very high-performance TCP/HTTP load balancer.
We will be building three database nodes (192.168.10.11, 192.168.10.12 and 192.168.10.13), connected to one load balancer (192.168.10.10).
The steps:
- Set up the machines
- Install Percona XtraDB Cluster on the database machines
- Configure the bootstrap node
- Configure the rest of the cluster
- Test the cluster
- Daemonise the clustercheck script on the database machines with xinetd
- Set the HAProxy load balancer up
- Win
Create your servers
I used DigitalOcean for this tutorial, but use whichever provider you like.
Install Percona XtraDB Cluster on the database servers
Log in to each of your database servers and set up Percona XtraDB Cluster.
First, add the key to apt:
$ apt-key adv --keyserver keys.gnupg.net \
  --recv-keys 1C4CBDCDCD2EFD2A
(Create and) Edit the /etc/apt/sources.list.d/percona.list file, adding the following:
deb http://repo.percona.com/apt precise main
deb-src http://repo.percona.com/apt precise main
Update apt:
$ apt-get update
Install Percona XtraDB Cluster Server:
$ apt-get install percona-xtradb-cluster-server-5.5
as well as the client:
$ apt-get install percona-xtradb-cluster-client-5.5
It will ask for a MySQL root password during each install. I kept them the same across the nodes.
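If you want a quick sanity check that the packages landed, a plain dpkg query will do (purely optional):

$ dpkg -l | grep percona-xtradb-cluster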
Configure the bootstrap node
Choose a machine that will create the initial cluster. Forming an initial cluster is called bootstrapping the cluster.
Stop MySQL on that machine:
$ service mysql stop
(Create &) Edit /etc/mysql/my.cnf on the bootstrap node. Add the following:
[mysqld]

datadir=/var/lib/mysql
user=mysql

# Path to Galera library
wsrep_provider=/usr/lib/libgalera_smm.so

# Empty gcomm address is being used when cluster is getting bootstrapped
wsrep_cluster_address=gcomm://

# Cluster connection URL contains the IPs of node#1, node#2 and node#3
#wsrep_cluster_address=gcomm://192.168.10.11,192.168.10.12,192.168.10.13

# In order for Galera to work correctly binlog format should be ROW
binlog_format=ROW

# MyISAM storage engine has only experimental support
default_storage_engine=InnoDB

# This is a recommended tuning variable for performance
innodb_locks_unsafe_for_binlog=1

# This changes how InnoDB autoincrement locks are
# managed and is a requirement for Galera
innodb_autoinc_lock_mode=2

# Node #1 address
wsrep_node_address=192.168.10.11

# SST method
wsrep_sst_method=xtrabackup

# Cluster name
wsrep_cluster_name=my_ubuntu_cluster

# Authentication for SST method
wsrep_sst_auth="sstuser:s3cretPass"
wsrep stands for WriteSet REPlication, BTW. wsrep is a project that aims to develop a generic database replication plugin interface, and Galera is a specific wsrep provider.
Notice line 10, where the wsrep_cluster_address (gcomm://) address is empty. This tells the node to bootstrap itself (create a new cluster) on startup. We will comment out line 10 and uncomment line 13 later, when the rest of the cluster is ready. You’ll notice that line 13 specifies all the other node addresses in the cluster – strictly speaking it is only necessary to specify one other node, but listing all of them is noted as a best practice in the Percona documentation.
Some notes about some of the settings
wsrep_cluster_name is the unique name of your cluster. It should be identical on all nodes that form part of the same cluster.
wsrep_sst_method specifies the way a State Snapshot Transfer (SST) is taken. Cluster nodes are kept in sync through SSTs and ISTs (Incremental State Transfers). An SST is a full data copy from a donor node to a joining node; an IST is a way for a joining node to catch up through a transfer of partial state snapshots (writesets) from the donor node’s writeset cache.
The following SST options exist:
- xtrabackup – uses Percona XtraBackup to take an SST. It’s useful because it doesn’t lock the donor database during an SST. We need to set up an SST user for xtrabackup, hence the values in wsrep_sst_auth.
- rsync – uses rsync to transfer the relevant data. The donor node is locked READ-ONLY during transfers.
- mysqldump – uses mysqldump to take an SST. Also READ-ONLY locks the database during transfers. Needs a wsrep_sst_auth user, and that user needs MySQL root privileges.
You can also use a custom script to take an SST, but for our purposes we will be using xtrabackup.
Save the my.cnf file.
You can now start the MySQL service on the bootstrap node:
$ service mysql start
If all went well it should start up without a hitch.
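If it doesn’t, the MySQL error log is the first place to look – depending on your configuration it ends up in /var/log/mysql/error.log or in a <hostname>.err file in the data directory:

$ tail -n 50 /var/log/mysql/error.log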
We now need to add the SST user identified in the wsrep_sst_auth directive.
Log into mysql on the bootstrap node:
$ mysql -uroot -p
Let’s check the node status first:
mysql> show status like 'wsrep%';
+----------------------------+--------------------------------------+
| Variable_name              | Value                                |
+----------------------------+--------------------------------------+
| wsrep_local_state_uuid     | 844b3ad2-29ca-11e3-8586-e72851d735f1 |
...
| wsrep_local_state          | 4                                    |
| wsrep_local_state_comment  | Synced                               |
...
| wsrep_cluster_size         | 1                                    |
| wsrep_cluster_status       | Primary                              |
| wsrep_connected            | ON                                   |
...
| wsrep_ready                | ON                                   |
+----------------------------+--------------------------------------+
40 rows in set (0.01 sec)
Looking good – this output shows a successfully bootstrapped node.
Add the wsrep_sst_auth user:
mysql> CREATE USER 'sstuser'@'localhost' IDENTIFIED BY 's3cretPass';
mysql> GRANT RELOAD, LOCK TABLES, REPLICATION CLIENT ON *.* TO 'sstuser'@'localhost';
mysql> FLUSH PRIVILEGES;
You can use the MySQL root user for SSTs, but it’s not recommended.
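If you want to make sure the SST credentials actually work before a node needs them, a throwaway query as that user is enough (just a sanity check, nothing the setup depends on):

$ mysql -usstuser -p's3cretPass' -e "SELECT 1;"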
Time to configure the other nodes and get them to join the cluster.
Configure the rest of the cluster
Log into the other database cluster nodes, stop their MySQL services, and edit their /etc/mysql/my.cnf files:
[mysqld]

datadir=/var/lib/mysql
user=mysql

# Path to Galera library
wsrep_provider=/usr/lib/libgalera_smm.so

# Cluster connection URL contains IPs of node#1, node#2 and node#3
wsrep_cluster_address=gcomm://192.168.10.11,192.168.10.12,192.168.10.13

# In order for Galera to work correctly binlog format should be ROW
binlog_format=ROW

# MyISAM storage engine has only experimental support
default_storage_engine=InnoDB

# This is a recommended tuning variable for performance
innodb_locks_unsafe_for_binlog=1

# This changes how InnoDB autoincrement locks are
# managed and is a requirement for Galera
innodb_autoinc_lock_mode=2

# Node #2 address
wsrep_node_address=192.168.10.12

# Cluster name
wsrep_cluster_name=my_ubuntu_cluster

# SST method
wsrep_sst_method=xtrabackup

#Authentication for SST method
wsrep_sst_auth="sstuser:s3cretPass"
This cnf file differs from the bootstrap node’s cnf file in that line 10 (wsrep_cluster_address) specifies the cluster addresses. Line 26 has to be set to the node’s own IP address.
Repeat for the last node, changing lines 25 and 26 to reflect the correct address.
Start MySQL:
$ service mysql start
You should see something like:
 * Starting MySQL (Percona XtraDB Cluster) database server mysqld
 * SST in progress, setting sleep higher mysqld
 * Checking for corrupt, not cleanly closed, and upgrade needing tables.
That SST in progress line tells you that the node has connected to the cluster and is receiving an SST from a donor node.
Once all the nodes are connected you can stop the bootstrap node’s MySQL service, comment out line 10 in its my.cnf, and uncomment line 13. Restart the MySQL service on it after saving the cnf file. Not fixing the wsrep_cluster_address directive on the bootstrap node will make it re-bootstrap itself (form a new, separate cluster) on restart.
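After that edit, the relevant lines (10 and 13) of the bootstrap node’s my.cnf should end up looking like this:

#wsrep_cluster_address=gcomm://
wsrep_cluster_address=gcomm://192.168.10.11,192.168.10.12,192.168.10.13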
Test the cluster
All three nodes should now be running happily connected to each other. Let’s test some things.
Look at the cluster size:
mysql> show status like 'wsrep%';
+----------------------------+--------------------------------------+
| Variable_name              | Value                                |
+----------------------------+--------------------------------------+
| wsrep_local_state_uuid     | 844b3ad2-29ca-11e3-8586-e72851d735f1 |
...
| wsrep_local_state          | 4                                    |
| wsrep_local_state_comment  | Synced                               |
...
| wsrep_cluster_size         | 3                                    |
| wsrep_cluster_status       | Primary                              |
| wsrep_connected            | ON                                   |
...
| wsrep_ready                | ON                                   |
+----------------------------+--------------------------------------+
40 rows in set (0.01 sec)
Nice.
Try creating a database on one node, and running SHOW DATABASES; on another node. Should feel pretty cool to see the live replication 😉
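A minimal round trip, using a throwaway database name (percona_test here is just an example; any unused name will do):

On node 1 (192.168.10.11):
mysql> CREATE DATABASE percona_test;

On node 2 (192.168.10.12):
mysql> SHOW DATABASES LIKE 'percona_test';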
Percona installs a script called clustercheck in /usr/bin/clustercheck. It’s a simple script that checks a node’s MySQL wsrep_local_state variable value and outputs HTTP 200 if the node is in sync, and HTTP 503 if it isn’t (or if it couldn’t connect).
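You can look at the value it checks yourself on any node; as in the status output earlier, a wsrep_local_state of 4 means Synced (wsrep_local_state_comment gives the same information in words):

mysql> SHOW STATUS LIKE 'wsrep_local_state';
+-------------------+-------+
| Variable_name     | Value |
+-------------------+-------+
| wsrep_local_state | 4     |
+-------------------+-------+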
It needs a MySQL user called clustercheckuser with password clustercheckpassword!. You can change this user, BTW – look for the following lines in /usr/bin/clustercheck:
MYSQL_USERNAME="${1-clustercheckuser}"
MYSQL_PASSWORD="${2-clustercheckpassword!}"
Just change the default values there if you want to change the clustercheck user.
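Those are bash default-value expansions, so the script will also accept the username and password as its first and second arguments – handy for a one-off manual test (xinetd will still run it without arguments, using the defaults in the script):

$ clustercheck clustercheckuser 'clustercheckpassword!'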
Add the clustercheck user to MySQL:
mysql> grant process on *.* to 'clustercheckuser'@'localhost' identified by 'clustercheckpassword!';
mysql> flush privileges;
You don’t have to repeat the user addition on the other nodes, as it will propagate through the cluster by itself. Just remember to edit the clustercheck scripts on the other nodes if you are not using the default username and password.
Now run clustercheck:
$ clustercheck
HTTP/1.1 200 OK
Content-Type: text/plain
Connection: close
Content-Length: 40

Percona XtraDB Cluster Node is synced.
All our tests check out.
Daemonise the clustercheck script
We are going to set up the load balancer soon. But we need to find a way to make the load balancer aware that a node has gone out of sync. That’s why the clustercheck script exists.
To expose the clustercheck script we will be using xinetd. This is a daemon that listens on a port you specify, runs a specific command when something connects to that port, and returns the command’s output to the caller.
Percona, nice people that they are, install an xinetd config file for clustercheck as part of the server install. It should be located at /etc/xinetd.d/mysqlchk, and the contents probably look something like:
service mysqlchk
{
# this is a config for xinetd, place it in /etc/xinetd.d/
        disable         = no
        flags           = REUSE
        socket_type     = stream
        port            = 9200
        wait            = no
        user            = nobody
        server          = /usr/bin/clustercheck
        log_on_failure  += USERID
        only_from       = 0.0.0.0/0
        # recommended to put the IPs that need
        # to connect exclusively (security purposes)
        per_source      = UNLIMITED
}
As you can see, it specifies that xinetd should listen on port 9200, and return the output of /usr/bin/clustercheck when you connect to that port.
Edit /etc/services to allow xinetd to listen for this; I added the following right at the end:
# Local services
mysqlchk        9200/tcp                        # mysqlchk
Install xinetd if it hasn’t been installed already:
$ apt-get install xinetd
If it has been installed already, just restart xinetd:
$ service xinetd restart
You should see it listening on port 9200 now:
$ netstat -ntlp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address ... State  PID/Program name
tcp        0      0 0.0.0.0:3306  ... LISTEN 2877/mysqld
tcp        0      0 0.0.0.0:9200  ... LISTEN 3545/xinetd
tcp        0      0 0.0.0.0:22    ... LISTEN 786/sshd
tcp        0      0 0.0.0.0:4567  ... LISTEN 2877/mysqld
tcp6       0      0 :::22         ... LISTEN 786/sshd
If you telnet to the address you should see the clustercheck output:
$ telnet 127.0.0.1 9200
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
HTTP/1.1 200 OK
Content-Type: text/plain
Connection: close
Content-Length: 40

Percona XtraDB Cluster Node is synced.
Connection closed by foreign host.
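If you have curl installed, it works just as well for a quick check (purely an alternative to telnet, nothing depends on it):

$ curl http://127.0.0.1:9200/
Percona XtraDB Cluster Node is synced.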
Lather, rinse, repeat for the other database nodes.
Set the load balancer up
Log into the server that you want to use for load balancing, 192.168.10.10 in this example.
Install HAProxy
$ apt-get install haproxy
Now, edit /etc/haproxy/haproxy.cfg:
global
        log 127.0.0.1   local0
        log 127.0.0.1   local1 notice
        maxconn 4096
        uid 99
        gid 99
        #daemon
        debug
        #quiet

defaults
        log     global
        mode    http
        option  tcplog
        option  dontlognull
        retries 3
        redispatch
        maxconn 2000
        contimeout      5000
        clitimeout      50000
        srvtimeout      50000

listen mysql-cluster 0.0.0.0:3306
        mode tcp
        balance roundrobin
        option tcpka
        option httpchk
        server db01 192.168.10.11:3306 check port 9200 inter 12000 rise 3 fall 3
        server db02 192.168.10.12:3306 check port 9200 inter 12000 rise 3 fall 3
        server db03 192.168.10.13:3306 check port 9200 inter 12000 rise 3 fall 3
The interesting detail to look at is lines 28-30, where HAProxy’s httpchk functionality is used to check for an HTTP 200 status on port 9200 of the cluster nodes – thus running the clustercheck script via xinetd, and taking a node out of circulation while it is not synced. It’s a better way to manage nodes than just checking whether port 3306 is open.
The HAProxy service won’t start by default – as it has no sane default configuration – but now that we have provided a sane configuration we can set it to start. Edit /etc/default/haproxy, setting ENABLED to 1:
# Set ENABLED to 1 if you want the init script to start haproxy.
ENABLED=1

# Add extra flags here.
#EXTRAOPTS="-de -m 16"
Start HAProxy:
$ service haproxy start
You will probably get a warning about <debug> mode incompatible with <quiet> and <daemon>. Keeping <debug> only. You can remove this by editing /etc/haproxy/haproxy.cfg, removing the debug option and enabling the quiet and daemon options (lines 7 – 9). I found it useful to mess around in debug mode at first, though.
You can check that it’s running by either looking for the process:
$ ps aux | grep -i haproxy
... /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -D -p /var/run/haproxy.pid
or looking for the process listening on port 3306:
$ netstat -ntlp | grep 3306
tcp        0      0 0.0.0.0:3306  ... LISTEN 2783/haproxy
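If you want to see the health check from lines 28-30 doing its job, stop MySQL on one of the nodes and run clustercheck there again – a quick experiment, not a required step. It should now report an HTTP 503 instead of a 200, and with inter 12000 / fall 3 HAProxy will take that node out of rotation after three failed checks (and put it back after rise 3 successful ones):

On db03 (192.168.10.13):
$ service mysql stop
$ clustercheck
HTTP/1.1 503 Service Unavailable
...
$ service mysql start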
Unexpected results – Host Blocked Errors
I had a problem the first time I set this up. After a while, when connected to my cluster, I got Error Code: 1129 Host ‘192.168.10.10’ is blocked because of many connection errors; unblock with ‘mysqladmin flush-hosts’. I’m still not sure why it happened, or that my actions cured it; I do know it hasn’t happened again after adding the option tcpka (line 26: enabling TCP keep-alive) to the haproxy.cfg file. Not understanding it feels a bit like magic. Hopefully someone more clued up can weigh in with some ideas. The MySQL manual notes that it happens after a certain number of interrupted connection requests, hence my trying to keep the connections alive. Since the setting does nothing much for security, it may make sense to set the max_connect_errors value in your MySQL configuration to some insanely high value.
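If you go that route, it is a single setting in the [mysqld] section of each node’s my.cnf (the exact number is arbitrary, it just needs to be large):

[mysqld]
max_connect_errors = 999999999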
Now that HAProxy is running (and you are hopefully not dealing with unexpected results) you should be able to connect through MySQL to your load balancer, and it should be round-robining the requests to your cluster nodes.
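As a final sanity check you can watch the round-robining happen. HAProxy proxies at the TCP level, so from MySQL’s point of view the connections come from the load balancer’s address (192.168.10.10); the lbtest user below is purely an example, create whatever user suits you:

mysql> CREATE USER 'lbtest'@'192.168.10.10' IDENTIFIED BY 'lbtestPass';

(run on any node – it will replicate to the others)

$ for i in 1 2 3; do \
    mysql -h 192.168.10.10 -ulbtest -plbtestPass -N -B \
          -e "SHOW VARIABLES LIKE 'wsrep_node_address';"; \
  done

Each connection should report a different wsrep_node_address as HAProxy cycles through the nodes.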
Congrats! Have fun.
PS: Possible Tweaks
Security can be improved a bit – I would suggest at least locking down xinetd to only allow connections from your load balancer (look at line 12, only_from, in /etc/xinetd.d/mysqlchk – the directive takes += as an operator, so you can list multiple IPs below each other).
You can also close ports on the cluster nodes to only allow MySQL and SSH access. The MySQL cluster needs 3306, 4444, 4567, and 4568 open, and SSH is typically running on port 22.
Using Uncomplicated Firewall:
$ ufw allow 22
$ ufw allow 3306
$ ufw allow 4444
...etc
$ ufw enable
You can fiddle with iptables or similar if ufw isn’t flexible enough for you; something like:
$ iptables -A INPUT -i eth0 -p tcp -m tcp \
  --source 192.168.10.11/24 --dport 3306 -j ACCEPT
should do to open specific ports.
HAProxy also has quite a pretty, colourful and useful stats module that you can enable by adding something like this:
listen stats :8080
        mode http
        stats enable
        stats hide-version
        stats realm Haproxy\ Statistics
        stats uri /
        stats auth Username:Password
to the /etc/haproxy/haproxy.cfg file. Restart HAProxy, and browse to port 8080 on your load balancer’s address. Remember to change Username and Password to values that you like for HTTP authentication.

Round-robining between the cluster nodes is an OK, but not great, strategy for the load balancer. If two nodes receive conflicting writes, rollbacks can occur. It would be better to write to only one node at a time, failing over to another node only if the active node goes offline.
You can achieve this by changing the haproxy.cfg file, lines 23-30:
listen mysql-cluster 0.0.0.0:3306
        mode tcp
        balance leastconn
        option tcpka
        option httpchk
        server db01 192.168.10.11:3306 check port 9200 inter 12000 rise 3 fall 3
        server db02 192.168.10.12:3306 check port 9200 inter 12000 rise 3 fall 3 backup
        server db03 192.168.10.13:3306 check port 9200 inter 12000 rise 3 fall 3 backup
PPS: Further reading
Here are some of the articles I used to write this howto:
- My howto follows the Percona Ubuntu article – I filled in gotchas and gaps that tripped me up, and added some extra contextual info: http://www.percona.com/doc/percona-xtradb-cluster/howtos/ubuntu_howto.html
- Using xinetd: http://www.linuxjournal.com/article/4490?page=0,0
- Percona docs: Bootstrapping the cluster: http://www.percona.com/doc/percona-xtradb-cluster/manual/bootstrap.html
- HAProxy configuration manual: http://cbonte.github.io/haproxy-dconv/configuration-1.4.html#2.1
- The Percona XtraDB Cluster glossary: http://www.percona.com/doc/percona-xtradb-cluster/glossary.html