MySQL Cluster Management : Part 1

Configuring multiple management nodes

Every MySQL Cluster must have a management node to start and also to carry out critical tasks such as allowing other nodes to restart, running online backups, and monitoring the status of the cluster. However, it is strongly recommended for a production cluster to ensure that a management node is always available, and this requires more than one node. In this recipe, we will discuss the minor complications that more than one management node will bring before showing the configuration of a new cluster with two management nodes. Finally, the modification of an existing cluster to add a second management node will be shown.

Getting ready

In a single management node cluster, everything is simple. Nodes connect to the management node, get a node ID, and join the cluster. When the management node starts, it reads the config.ini file, starts and prepares to give the cluster information contained within the config.ini file out to the cluster nodes as and when they join.

This process can become slightly more complicated when there are multiple management nodes, and it is important that each management node takes a different ID. Therefore, the first additional complication is that it is an extremely good idea to specify node IDs and ensure that the HostName parameter is set for each management node in the config.ini file.

It is technically possible to start two management nodes with different cluster configuration files in a cluster with multiple management nodes. It is not difficult to see that this can cause all sorts of bizarre behavior including a likely cluster shutdown in the case of the primary management node failing. Ensure that every time the config.ini file is changed, the change is correctly replicated to all management nodes. You should also ensure that all management nodes are always using the same version of the config.ini file.

It is possible to hold the config.ini file on a shared location such as a NFS share, although to avoid introducing complexity and a single point of failure, the best practice would be to store the configuration file in a configuration management system such as Puppet (http://www.puppetlabs.com/) or Cfengine (http://www.cfengine.org/).

How to do it...

The following process should be followed to configure a cluster for multiple management nodes. In this recipe, High Availability with MySQL Cluster. Initially, this recipe will cover the procedure to be followed in order to configure a new cluster with two management nodes. Thereafter, the procedure for adding a second management node to an already running single management node cluster will be covered.

The first step is to define two management nodes in the global configuration file config.ini on both management nodes.

In this example, we are using IP addresses 10.0.0.5 and 10.0.0.6 for the two management nodes that require the following two entries of [ndb_mgmd] in the config.ini file:

Update the [mysqld] section of each SQL node's /etc/my.cnf to point to both management nodes:

[mysqld]ndb-connectstring=10.0.0.5,10.0.0.6

Update the [mysqld] section of each SQL node's /etc/my.cnf to point to both management nodes:

[mysqld]ndb-connectstring=10.0.0.5,10.0.0.6

Now, prepare to start both the management nodes. Install the management node on both nodes, if it does not already exist.

Before proceeding, ensure that you have copied the updated config.ini file to both management nodes.

Start the first management node by changing to the correct directory and running the management node binary (ndb_mgmd) with the following flags:

--initial: Deletes the local cache of the config.ini file and updates it (you must do this every time the config.ini file is changed).

--ndb-nodeid=X: Tells the node to connect as this nodeid, as we specified in the config.ini file. This is technically unnecessary if there is no ambiguity as to which nodeid this particular node may connect to (in this case, both nodes have a HostName defined). However, defining it reduces the possibility of confusion.

--config-file=config.ini: This is used to specify the configuration file. In theory, passing a value of the config.ini file in the local directory is unnecessary because it is the default value. But in certain situations, it seems that passing this in any case avoids issues, and again this reduces the possibility of confusion.

Finally, restart all SQL nodes (mysqld processes). On RedHat-based systems, this can be achieved using the service command:

[root@node1 ~]# service mysqld restart

Congratulations! Your cluster is now configured with multiple management nodes. Test that failover works by killing a management node, in turn, the remaining management nodes should continue to work.

There's more...

It is sometimes necessary to add a management node to an existing cluster if for example, due to a lack of hardware or time, an initial cluster only has a single management node.

Adding a management node is simple. Firstly, install the management client on the new node . Secondly, modify the config.ini file, as shown earlier in this recipe for adding the new management node, and copy this new config.ini file to both management nodes. Finally, stop the existing management node and start the new one using the following commands:

[root@node5 mysql-cluster]# ndb_mgmd --config-file=config.ini --initial --ndb-nodeid=12009-08-15 21:29:53 [MgmSrvr] INFO -- NDB Cluster Management Server. mysql-5.1.34 ndb-7.0.62009-08-15 21:29:53 [MgmSrvr] INFO -- Reading cluster configuration from 'config.ini'torage node one at a time. Ensure that you only stop one node per nodegroup at a time and wait for it to fully restart before taking another node in the nodegroup, when offline, in order to avoid any downtime.

See also

Look at the section for the online addition of storage nodes (discussed later in this article) for further details on restarting storage nodes one at a time.

Obtaining usage information

This recipe explains how to monitor the usage of a MySQL Cluster, looking at the memory, CPU, IO, and network utilization on storage nodes.

Getting ready

MySQL Cluster is extremely memory-intensive. When a MySQL Cluster starts, the storage nodes will start using the entire DataMemory and IndexMemory allocated to them. In a production cluster with a large amount of RAM, it is likely that this will include a large proportion of the physical memory on the server.

How to do it...

An essential part of managing a MySQL Cluster is looking into what is happening inside each storage node. In this section, we will cover the vital commands used to monitor a cluster.

To monitor the memory (RAM) usage of the nodes within the cluster, execute the &ltnodeid> REPORT MemoryUsage command within the management client as follows:

This information shows that these nodes are actually using 0% of their DataMemory and IndexMemory.

Memory allocation is important and unfortunately a little more complicated than a percentage used on each node. There is more detail about this in the How it works... section of this recipe, but the vital points to remember are:

It is a good idea never to go over 80 percent of memory usage (particularly not for DataMemory)

In the case of a cluster with a very high memory usage, it is possible that a cluster will not restart correctly

MySQL Cluster storage nodes make extensive use of disk storage unless specifically configured not to, regardless of whether a cluster is using disk-based tables. It is important to ensure the following:

There is sufficient storage available

There is sufficient IO bandwidth for the storage node and the latency is not too high

The bi and bo columns represent the blocks read from a disk and blocks written to a disk, respectively. The first line can be ignored (it's the average since boot), and the number passed to the command, in this case, the refresh rate in seconds. By using a tool such as bonnie (refer to the See also section at the end of this recipe) to establish the potential of each block device, you can then check to see the maximum proportion of each block device is currently being used.

At times of high stress, like during a hot backup, if the disk utilization is too high it is potentially possible that the storage node will start spending a lot of time in the iowait state—this will reduce performance and should be avoided. One way to avoid this is by using a separate block device (that is, disk or raid controller) for the backups mount point.

How it works...

Data within the MySQL Cluster is stored in two parts. In broader terms, the fixed part of a row (fields with a fixed width, such as INT, CHAR, and so on) is stored separately from variable fields (for example, VARCHAR).

As data is stored in 32 KB pages, it is possible for variable-length data to become quite fragmented in cases where a cluster only has free space in existing pages that are available because data has been deleted.

Fragmentation is clearly bad. To reduce it, run the SQL command optimize table as follows:

There's more...

It is also essential to monitor network utilization because latency will dramatically increase as utilization gets close to 100 percent of either an individual network card or a network device like a switch. If network latency increases by a very small amount, then its effect on performance will be significant. This article will not discuss the many techniques for monitoring the overall network health. However, we will see a tool called iptraf that is very useful inside clusters for working out which node is interacting with which node and what proportion of network resources it is using.

A command such as iptraf –i eth0 will show the network utilization broken down by connection, which can be extremely useful when trying to identify connections on a node that are causing problems. The screenshot for the iptraf command is as follows:

The previous screenshot shows the connections on the second interface (dedicated to cluster traffic) for the first node in a four-storage node cluster. The connection that each node makes with the others (10.0.0.2, 10.0.0.3, and 10.0.0.4 are other storage nodes) is obvious as well as the not entirely obvious ports selected for each connection. There is also a connection to the management node. The Bytes column gives a clear indication of which connections are most utilized.

See also

Adding storage nodes online

The ability to add a new node without any downtime is a relatively new feature of MySQL Cluster which dramatically improves long-term uptime in cases where the regular addition of nodes is required, for example, where data volume or query load is continually increasing.

Getting ready

In this recipe, we will show an example of how to add two nodes to an existing two-node cluster (while maintaining NoOfReplicas=2 or two copies of each fragment of data).

The start point for this recipe is a cluster with two storage nodes and one management node running successfully with some data imported . Ensure that the world database has been imported as an NDB table.

How to do it...

Firstly, ensure that your cluster is fully running (that is, all management and storage nodes are running). The command to do this is as follows:

Now, restart the active current nodes—in this case, the nodes with id 2 and 3 (10.0.0.1 and 10.0.0.2). This can be done with the management client command &ltnodeid> RESTART or by killing the ndbd process and restarting (there is no need for --initial):

If you check the status of the show command in the management client, shortly after starting the new storage nodes, you will notice that the newly-started storage nodes move to a started state very rapidly (when compared to other nodes in the cluster). However, but they are shown as belonging to "no nodegroup" as shown in the following output:

Now, we need to create a new nodegroup for these nodes. We have set NoOfReplicas=2 in the config.ini file, so each nodegroup must contain two nodes. We use the CREATE NODEGROUP&ltnodeID>,&ltnodeID> command to add a nodegroup.

If we had NoOfReplicas=4, we would pass four comma-separated nodeIDs to this command.

ndb_mgm> CREATE NODEGROUP 4,5Nodegroup 1 created<

Nodegroup 1 now exists. To see the information, use the show command as follows:

Congratulations! You have now added two new nodes to your cluster, which will be used by the cluster for new fragments of data. Look at the There's more… section of this recipe to see how you can get these nodes used right away and the How it works… section for a brief explanation of what is going on behind the scenes.

How it works...

After you have added the new nodes, it is possible to take a look at how a table is being stored within the cluster. If you used the world sample database imported in , then you will have a City table inside the world database. Running the ndb_desc binary as follows on a storage or management node shows you where the data is stored.

The first parameter, after –d, is the database name and the second is the table name. If a [mysql_cluster] section is not defined in /etc/my.cnf, the management node IP address may be passed with -c.

Therefore, when we add a new nodegroup, it is important to reorganize the data in he existing nodes to ensure that it is spread out across the whole cluster and this does not happen automatically. New data, however, is automatically spread out across the whole cluster.

The process to reorganize data in the cluster to use all storage nodes is outlined in the next section.

There's more...

To reorganize the data within a cluster to use all new storage nodes, run the ALTER TABLE x REORGANIZE PARTITION query in a SQL node, substituting x for a table name. This command must be run once per table in the cluster.

In NDB 7.0, the redistribution does not include unique indexes (only ordered indexes are redistributed) or BLOB table data. This is a limitation that is likely to be removed in later releases. If you have a large amount of these two forms of data, then it is likely to that you will notice unequal loadings on your new nodes even after this process. Newly inserted data will, however, be distributed across all nodes correctly.

This query can be executed on any storage node and should not affect the execution of other queries—although it will, of course, increase the load on the storage nodes involved:

Replicating between MySQL Clusters

Replication is commonly used for single MySQL servers. In this recipe, we will explain how to use this technique with MySQL Cluster—replicating from one MySQL Cluster to another and replicating from a MySQL Cluster to a standalone server.

Getting ready

Replication is often used to provide a Disaster Recovery site, some distance away from a primary location, which is asynchronous (in contrast with the synchronous nature of the information flows within a MySQL Cluster). The asynchronous nature of replication means that the main cluster does not experience any performance degradation at the expense of a potential loss of a small amount of data in the event of the master cluster failing.

Replication involving a MySQL Cluster introduces the concept of replication channels. A replication channel is made up of two replication nodes. One of these nodes is in the source machine or cluster, and the other in the destination machine or cluster. It is good practice to have more than one replication channel for redundancy.

The following diagram illustrates the replication channel:

Note that this diagram shows two replication channels. Currently, with Cluster Replication, only one channel can be active at any one time. It is good practice to have another channel set up almost ready to go, so that in the event one of the nodes involved in the primary channel fails, it is very quick to bring up a new channel.

In general, all replication nodes should be of the same, or very similar, MySQL version.

How to do it...

Firstly, prepare the two parts of the replication channel. In this example, we will replicate from one cluster to another. The source end of the channel is referred to as the master and the destination as the slave.

All mysqld processes (SQL nodes or standalone MySQL servers) involved as a replication agent (either as master or slave) must be configured to have a unique server-ID. Additionally, the master must also have some additional configuration in the [mysqld] section of /etc/my.cnf. Start by adding this to the master SQL node's /etc/my.cnf file as follows:

# Enable cluster replicationlog-binbinlog-format=ROWserver-id=3

Add the server-id parameter only to all MySQL servers that are acting as slave nodes, and restart all SQL nodes that have had my.cnf modified:

On the slave, issue a CHANGE MASTER TO command as follows to tell the slave where the master is, what user and password to use to log in, what logfile it is currently at, and what logfile position to start from:

You can see that the new database has been correctly replicated to the slave cluster. It is good practice to test this replication channel with some real data, perhaps by importing the world dataset into the new database on the master cluster.

How it works...

MySQL Cluster replication is implemented by a dedicated thread—the NDB binlog injector thread that runs on each SQL node and produces a standard binary log (binlog), which a slave can connect to normally. This binlog injector thread ensures that all changes within the cluster that the SQL node is a member of are inserted into the binary log and not just the queries that were executed on that specific SQL node. This thread additionally ensures that these transactions are inserted in the correct serialization order. Therefore, the vast majority of the process is identical to the standard MySQL Replication.

If you only had a single SQL node in a cluster, then there would be no need for this thread, and standard MySQL Replication would work perfectly. Unfortunately, there is very little use in a MySQL Cluster with one SQL node.

There's more...

Cluster replication is an extremely powerful tool. In the following section, we will cover a couple of the most useful and more advanced techniques of cluster replication.

Alerts & Offers

Series & Level

We understand your time is important. Uniquely amongst the major publishers, we seek to develop and publish the broadest range of learning and information products on each technology. Every Packt product delivers a specific learning pathway, broadly defined by the Series type. This structured approach enables you to select the pathway which best suits your knowledge level, learning style and task objectives.

Learning

As a new user, these step-by-step tutorial guides will give you all the practical skills necessary to become competent and efficient.

Beginner's Guide

Friendly, informal tutorials that provide a practical introduction using examples, activities, and challenges.

Essentials

Fast paced, concentrated introductions showing the quickest way to put the tool to work in the real world.

Cookbook

A collection of practical self-contained recipes that all users of the technology will find useful for building more powerful and reliable systems.

Blueprints

Guides you through the most common types of project you'll encounter, giving you end-to-end guidance on how to build your specific solution quickly and reliably.

Mastering

Take your skills to the next level with advanced tutorials that will give you confidence to master the tool's most powerful features.

Starting

Accessible to readers adopting the topic, these titles get you into the tool or technology so that you can become an effective user.

Progressing

Building on core skills you already have, these titles share solutions and expertise so you become a highly productive power user.