Saturday, April 18, 2009

This tutorial shows how to configure a MySQL 5 cluster with three nodes: two storage nodes and one management node. This cluster is load-balanced by a high-availability load balancer that in fact has two nodes that use the Ultra Monkey package which provides heartbeat (for checking if the other node is still alive) and ldirectord (to split up the requests to the nodes of the MySQL cluster).

In this document I use Debian Sarge for all nodes. Therefore the setup might differ a bit for other distributions. The MySQL version I use in this setup is 5.0.19. If you do not want to use MySQL 5, you can use MySQL 4.1 as well, although I haven't tested it.

This howto is meant as a practical guide; it does not cover the theoretical backgrounds. They are treated in a lot of other documents in the web.

This document comes without warranty of any kind! I want to say that this is not the only way of setting up such a system. There are many ways of achieving this goal but this is the way I take. I do not issue any guarantee that this will work for you!

1 My Servers I use the following Debian servers that are all in the same network (192.168.0.x in this example):

sql1.example.com: 192.168.0.101 MySQL cluster node 1 sql2.example.com: 192.168.0.102 MySQL cluster node 2 loadb1.example.com: 192.168.0.103 Load Balancer 1 / MySQL cluster management server loadb2.example.com: 192.168.0.104 Load Balancer 2 In addition to that we need a virtual IP address : 192.168.0.105. It will be assigned to the MySQL cluster by the load balancer so that applications have a single IP address to access the cluster.

Although we want to have two MySQL cluster nodes in our MySQL cluster, we still need a third node, the MySQL cluster management server, for mainly one reason: if one of the two MySQL cluster nodes fails, and the management server is not running, then the data on the two cluster nodes will become inconsistent ("split brain"). We also need it for configuring the MySQL cluster.

As the MySQL cluster management server does not use many resources, and the system would just sit there doing nothing, we can put our first load balancer on the same machine, which saves us one machine, so we end up with four machines.

2 Set Up The MySQL Cluster Management ServerFirst we have to download MySQL 5.0.19 (the max version!) and install the cluster management server (ndb_mgmd) and the cluster management client (ndb_mgm - it can be used to monitor what's going on in the cluster). The following steps are carried out on loadb1.example.com (192.168.0.103):

4 Test The MySQL ClusterOur MySQL cluster configuration is already finished, now it's time to test it. On the cluster management server (loadb1.example.com), run the cluster management client ndb_mgm to check if the cluster nodes are connected:

Now we create the same database on sql2.example.com (yes, we still have to create it, but afterwards testtable and its data should be replicated to sql2.example.com because testtable uses ENGINE=NDBCLUSTER):

5 How To Restart The ClusterNow let's asume you want to restart the MySQL cluster, for example because you have changed /var/lib/mysql-cluster/config.ini on loadb1.example.com or for some other reason. To do this, you use the ndb_mgm cluster management client on loadb1.example.com:

This means that the cluster nodes sql1.example.com and sql2.example.com and also the cluster management server have shut down.

Run

quit;

to leave the ndb_mgm console.

To start the cluster management server, do this on loadb1.example.com:

loadb1.example.com:

ndb_mgmd -f /var/lib/mysql-cluster/config.ini

and on sql1.example.com and sql2.example.com you run

sql1.example.com / sql2.example.com:

ndbd

or, if you have changed /var/lib/mysql-cluster/config.ini on loadb1.example.com:

ndbd --initial

Afterwards, you can check on loadb1.example.com if the cluster has restarted:

loadb1.example.com:

ndb_mgm

On the ndb_mgm console, type

show;

to see the current status of the cluster. It might take a few seconds after a restart until all nodes are reported as connected.

Type

quit;

to leave the ndb_mgm console.

6 Configure The Load BalancersOur MySQL cluster is finished now, and you could start using it now. However, we don't have a single IP address that we can use to access the cluster, which means you must configure your applications in a way that a part of it uses the MySQL cluster node 1 (sql1.example.com), and the rest uses the other node (sql2.example.com). Of course, all your applications could just use one node, but what's the point then in having a cluster if you do not split up the load between the cluster nodes? Another problem is, what happens if one of the cluster nodes fails? Then the applications that use this cluster node cannot work anymore.

The solution is to have a load balancer in front of the MySQL cluster which (as its name suggests) balances the load between the MySQL cluster nodes. The load blanacer configures a virtual IP address that is shared between the cluster nodes, and all your applications use this virtual IP address to access the cluster. If one of the nodes fails, then your applications will still work, because the load balancer redirects the requests to the working node.

Now in this scenario the load balancer becomes the bottleneck. What happens if the load balancer fails? Therefore we will configure two load balancers (loadb1.example.com and loadb2.example.com) in an active/passive setup, which means we have one active load balancer, and the other one is a hot-standby and becomes active if the active one fails. Both load balancers use heartbeat to check if the other load balancer is still alive, and both load balancers also use ldirectord, the actual load balancer the splits up the load onto the cluster nodes. heartbeat and ldirectord are provided by the Ultra Monkey package that we will install.

It is important that loadb1.example.com and loadb2.example.com have support for IPVS (IP Virtual Server) in their kernels. IPVS implements transport-layer load balancing inside the Linux kernel.

¦ libsensors3 not functional ¦ ¦ ¦ ¦ It appears that your kernel is not compiled with sensors support. As a ¦ ¦ result, libsensors3 will not be functional on your system. ¦ ¦ ¦ ¦ If you want to enable it, have a look at "I2C Hardware Sensors Chip ¦ ¦ support" in your kernel configuration. ¦

you can ignore it.

Answer the following questions:

Do you want to automatically load IPVS rules on boot?

Select a daemon method.

The libdbd-mysql-perl package we've just installed does not work with MySQL 5 (we use MySQL 5 on our MySQL cluster...), so we install the newest DBD::mysql Perl package:

You must list one of the load balancer node names (here: loadb1) and list the virtual IP address (192.168.0.105) together with the correct netmask (24) and broadcast address (192.168.0.255). If you are unsure about the correct settings, http://www.subnetmask.info/ might help you.

vi /etc/ha.d/authkeys

auth 33 md5 somerandomstring

somerandomstring is a password which the two heartbeat daemons on loadb1 and loadb2 use to authenticate against each other. Use your own string here. You have the choice between three authentication mechanisms. I use md5 as it is the most secure one.

/etc/ha.d/authkeys should be readable by root only, therefore we do this:

loadb1.example.com / loadb2.example.com:

chmod 600 /etc/ha.d/authkeys

6.3 Configure ldirectord Now we create the configuration file for ldirectord, the load balancer:

Please fill in the correct virtual IP address (192.168.0.105) and the correct IP addresses of your MySQL cluster nodes (192.168.0.101 and 192.168.0.102). 3306 is the port that MySQL runs on by default. We also specify a MySQL user (ldirector) and password (ldirectorpassword), a database (ldirectordb) and an SQL query. ldirectord uses this information to make test requests to the MySQL cluster nodes to check if they are still available. We are going to create the ldirector database with the ldirector user in the next step.

Now we create the necessary system startup links for heartbeat and remove those of ldirectord (bacause ldirectord will be started by heartbeat):

6.4 Create A Database Called ldirectorNext we create the ldirector database on our MySQL cluster nodes sql1.example.com and sql2.example.com. This database will be used by our load balancers to check the availability of the MySQL cluster nodes.

# Enable configuration of arp_ignore optionnet.ipv4.conf.all.arp_ignore = 1# When an arp request is received on eth0, only respond if that address is# configured on eth0. In particular, do not respond if the address is# configured on lonet.ipv4.conf.eth0.arp_ignore = 1# Ditto for eth1, add for all ARPing interfaces#net.ipv4.conf.eth1.arp_ignore = 1# Enable configuration of arp_announce optionnet.ipv4.conf.all.arp_announce = 2# When making an ARP request sent through eth0 Always use an address that# is configured on eth0 as the source address of the ARP request. If this# is not set, and packets are being sent out eth0 for an address that is on# lo, and an arp request is required, then the address on lo will be used.# As the source IP address of arp requests is entered into the ARP cache on# the destination, it has the effect of announcing this address. This is# not desirable in this case as adresses on lo on the real-servers should# be announced only by the linux-director.net.ipv4.conf.eth0.arp_announce = 2# Ditto for eth1, add for all ARPing interfaces#net.ipv4.conf.eth1.arp_announce = 2

sysctl -p

Add this section for the virtual IP address to /etc/network/interfaces:

If your tests went fine, you can now try to access the MySQL database from a totally different server in the same network (192.168.0.x) using the virtual IP address 192.168.0.105:

mysql -h 192.168.0.105 -u ldirector -p

(Please note: your MySQL client must at least be of version 4.1; older versions do not work with MySQL 5.)

You can now switch off one of the MySQL cluster nodes for test purposes; you should then still be able to connect to the MySQL database.

8 AnnotationsThere are some important things to keep in mind when running a MySQL cluster:

- All data is stored in RAM! Therefore you need lots of RAM on your cluster nodes. The formula how much RAM you need on ech node goes like this:

(SizeofDatabase Ã— NumberOfReplicas Ã— 1.1 ) / NumberOfDataNodes

So if you have a database that is 1 GB of size, you would need 1.1 GB RAM on each node!

- The cluster management node listens on port 1186, and anyone can connect. So that's definitely not secure, and therefore you should run your cluster in an isolated private network!

It's a good idea to have a look at the MySQL Cluster FAQ: http://dev.mysql.com/doc/refman/5.0/en/mysql-cluster-faq.html and also at the MySQL Cluster documentation: http://dev.mysql.com/doc/refman/5.0/en/ndbcluster.html

A very helpful articleSubmitted by lionel (not registered) on Mon, 2009-04-13 15:24.Found this article very helpful. Followed it step by step and found no problems at all. Used to think mysql replication to be a big deal but this article made it look so simple. Thanks Guys

Cheers

Lionel

Developer at Shopnics

reply | view as pdfA few more detailsSubmitted by d60eba (registered user) on Thu, 2008-01-10 12:11.This is a great tutorial - I used it to set everything up myself. However, I found a few details lacking about what was going on behind the scenes, and also how to recover after a server crash. Anyway, I've written everything up here: http://aciddrop.com/2008/01/10/step-by-step-how-to-setup-mysql-database-replication/ It's for MySQL 5.0 on Centos 4, but is good for other distros. (I've credited you with a link at the bottom).

Cheers,

Leon

reply | view as pdflocking/unlocking tables warningSubmitted by taikonautzero (registered user) on Mon, 2006-12-18 13:57.When locking the tables with FLUSH TABLES WITH READ LOCK; do not quit the mysql shell else you will lose the lock, use another shell to do the db dump instead.

FreeBSD version 4.0.26

MySQL version 4.10

reply | view as pdfmaster details in slave my.cnf not explicitly neededSubmitted by Anonymous (not registered) on Tue, 2006-01-17 00:29.Nice and concice howto (too bad I only found it now while I figured this stuff out a month ago)! I'd like to remark that when editing the slave my.cnf config file, the master details do not necessarilly have to be filled in there, since the slave mysql server doesn't read this config when restarting. When issuing the CHANGE MASTER TO command, the slave mysql server creates a master.info file in the mysql data directory where it stores the master details, along with it's current synchronization position. The master.info file contains thus everything that the slave needs when restarting. Reason for not having the master details (including slave_user password) in the my.cnf config file is by default world readable (at least in debian GNU/Linux it is) and the master.info can only be read by mysql (and root, obviously). Furthermore I'd like to point out another (fast) way of getting the master data to the slave when setting up replication: with the read lock still on (don't close the mysql client in which you issued the FLUSH TABLES WITH READ LOCK command, otherwise the lock will be gone); create a tarball of the entire mysql data directory (or only the desired databases, the filenames to include are obvious), release the lock, copy the tarball to the slave machine using ftp, scp or whatever, change to the data directory overthere and extract the tarball. Take special care when copying the mysql database to the slave this way, because you will override any existing account on the slave. In debian, dpkg-reconfigure mysql-server might be needed to resolve problems with the debian-sys-maint-user that arise when replacing the entire mysql database. Good luck, Thomasreply | view as pdfdebian-sys-maint userSubmitted by Anonymous (not registered) on Fri, 2006-02-03 23:39.When copying everything, including the mysql database on Debian, you will screw up the password for the debian-sys-main account. Just grab the password on your master server from /etc/mysql/debian.cnf and put this password in the same file on your new slave.

in the mysql command line does the same. so only setting the my.cnf and restarting mysql does the trick.

reply | view as pdfWon't work on newer MYSQL versionsSubmitted by xabin (registered user) on Wed, 2008-03-12 19:25.The syntax 'LOAD DATA FROM MASTER' will not work on the newer versions of MySQL-server, see this page; http://dev.mysql.com/doc/refman/5.0/en/load-data-from-master.html

reply | view as pdfanother thing that will help hereSubmitted by nephish (registered user) on Wed, 2007-05-02 00:00. if you start mysql with /etc/init.d/mysql restart, you cannot just put the lines above for the my.cnf file anywhere in the file for the slave.

they must be in the [mysqld] block.

took me two hours to figgure that one out.

reply | view as pdfkeep in mind this note on rSubmitted by Anonymous (not registered) on Tue, 2006-06-06 11:00.keep in mind this note on replication (found on mysql doc site), I lost 2 days trying to understand why my DBs where not replicating!

Note that if you client does not do a "USE dbname", binlog-do-db=dbname will not binlog a query like: "update in dbname.foobar set foo=1"

You explicitly have to do a USE before a query in order to have your query binlogged, it looks like. Replication on the slave side can do wildcard matches .. but the master cannot (a la binlog-wild-do-table=dbname.%). So make sure your clients do a use, if you plan to replicate those tables it updates.

reply | view as pdfMissing a few options, more detailed commands for slaveSubmitted by Anonymous (not registered) on Mon, 2006-01-16 19:02.http://dev.mysql.com/doc/refman/4.1/en/replication-howto.html

You need to read the database replication documents a little more in depth; depending on your version of MySQL and wether or not you use InnoDB, you want to also include something like the following on the master server:

Now go ahead and set your root password, create your databases, etc. Everything done on the master will replicate faithfully over to the slave.

reply | view as pdfLittle tipSubmitted by Anonymous (not registered) on Mon, 2006-01-16 17:26.I've done the exact same thing a while ago. I wrote a little perlscript to keep the databases in sync (when run it makes sure they're synced). Some people might find it helpful. http://files.printf.dk/software/clustersync.txt