Executive Summary

This document illustrates the deployment process of MySQL on Solaris Cluster (SC), and describes the regression and failover tests performed on HA MySQL. Solaris 10 fully supports MySQL and the HA cluster application agent (data service) for MySQL.

A cluster provides a single view of services for applications such as databases, web services, and file services. Services can scale to additional processors with the addition of nodes. A data service is an application designed to run under the control of a cluster.

The MySQL Open Source database is multi-threaded and consists of an SQL server, client programs and libraries, administrative tools, and APIs. Sun acquired MySQL AB in January 2008.

The objective is to facilitate adoption of the Sun stack by the Open Source community, and to reinforce Sun's strong commitment to open source products, including Open Source databases.

Document Scope

The scope of the deployment environment comprises MySQL Community Server on a Solaris 10 global zone. While the platform of choice was SPARC, the results apply equally to deployments on an x86-64 platform.

The scope of testing includes upgrade testing, regression testing, and failover testing. MySQL Cluster testing and performance benchmarking are outside the scope.

Benefits of Solaris Cluster

Reduced system downtime, increased availability of systems and applications, and increased application throughput via scalable services are some of the benefits realized when using Solaris Cluster.

The Solaris Cluster framework includes Sun Cluster, Sun Cluster Geographic Edition, developer tools, and support for commercial and open-source applications through agents. It provides high availability and disaster recovery for local and geographic clusters, and offers a wide choice of storage replication and networking configurations.

A data service consists of an application, cluster configuration files, and management methods that control the starting, stopping, and monitoring of the application. The public network interface is a logical hostname or a shared address.

Solaris Cluster provides High Availability (HA) for enterprise applications. HA components can survive a single s/w or h/w failure. The I/O fencing feature keeps shared storage inaccessible to a faulty node, and ensures integrity of data. For example, in a 2-node cluster that has a disabled interconnect, each node may assume that it is the surviving member of the cluster, and may try to form one on its own. This potential split-brain scenario is avoided by I/O fencing.
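Once a cluster is formed, quorum and node membership can be confirmed from any node; a minimal check using the Sun Cluster 3.2 command set (output varies per configuration):

# clquorum status    // quorum devices and vote counts
# clnode status      // current cluster membership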

A logical network interface is presented to applications, so there is no single point of failure in the NICs. Traffic is striped over the private interconnects with transparent failover, resulting in better network utilization.

Device path names for shared LUNs are uniform across nodes, making the cluster easier to manage. Node times can be synchronized during configuration.

A comprehensive list of HA applications (data services) is available, and all configurations are certified and tested. Tight integration with the Solaris kernel results in better fault management and heartbeat mechanisms.

Setup Environment

A traditional cluster includes two or more nodes cabled together via private interconnects, and connected simultaneously to shared storage devices (multihost devices). Network adapters provide client access to the cluster. Different architectures and topologies can be used for deployment. The core cluster software, data service software, and disk management software are used collectively to monitor, access, and administer the various resources.

The setup comprises two SPARC V1280 systems connected by two private interconnects. The MySQL Community Server version used is 5.0.45.

SPARC platform:

Nodes: v1280-137-03 and v1280-137-04

CPU: 12 x UltraSPARC-III+ at 1200 MHz

Memory: 49,152 MB and 98,304 MB

Storage: SE 6120 (14 x 73 GB)

Operating System: Solaris 10 SPARC, Update 4

After a default OS installation, it is recommended to install the patches for the OS version being used, available from http://sunsolve.sun.com.
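For example, after downloading patches into a staging directory, each can be applied with patchadd (the patch ID below is purely illustrative):

# patchadd /var/tmp/patches/120011-14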

Ensure that the SC 3.x version being used has support for the Solaris 10 update being deployed.

Installation of Solaris Cluster and HA MySQL Service s/w

This section describes the installation and configuration of the Sun Cluster and the HA MySQL service s/w components.

On the first node, perform these steps as root user:

Remove the previous product registry, if any:

# /var/sadm/prod/SUNWentsys5/uninstall

Download and uncompress the SC s/w .zip file: navigate to sun.com, click on 'downloads', and follow the links (suncluster-3_2-ga-solaris-sparc.zip).

Set the PATH variable to include the directories of the Sun Cluster binaries:

# PATH=$PATH:/usr/cluster/bin:/usr/ccs/bin:/usr/sfw/bin; export PATH

Navigate to the sc/Solaris_sparc sub-directory, and type:

# ./installer

1. Choose 'No' for: Install the full set of Sun Java(TM) Enterprise System Products and Services?

2. Choose options 4,6 to select these products, and then 'A' on each to install all its components:

[X] 4. Sun Cluster 3.2
[X] 6. Sun Cluster Agents 3.2

3. Choose 1 for: Upgrade the shared components that were installed in the previous step?

On the second node, perform all the above steps (1-15 as applicable), as done for the first node, except for these differences:

16. Choose 1: Create a new cluster or add a cluster node

17. Choose: Add this machine as a node in an existing cluster

18. Choose 'Typical' mode of operation, and supply the cluster name chosen earlier

19. Choose 'yes' for: Do you want to use autodiscovery (yes/no) [yes]?

20. Answer the remaining prompts as for the first node.

Once the nodes have booted into the cluster, the configuration of the Sun Cluster and HA MySQL s/w is complete.
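At this point, cluster health can be verified from either node; a minimal sanity check using the Sun Cluster 3.2 command set (output varies with the configuration):

# clnode status      // both nodes should be listed as cluster members
# clrg status        // resource groups and their current primaries
# clrs status        // individual resources and their states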

Configuring a failover file system using a ZFS pool

This section describes the creation and configuration of a failover file system using a ZFS pool.

The database will reside on the failover file system, whereas the MySQL installation will be on local file systems on each node. A failover file system performs better than a cluster file system because all nodes do not have to commit writes. However, as only one node sees the file system at a time, there is a slight increase in failover time.

On the first node, perform these steps as root user:

1. Choose a suitable shared disk among those in your setup. Shared disks appear with two entries (one per node) in the device list, which can be displayed by typing:

# scdidadm -L

2. Create the ZFS pool (the pool name mysql3 and the shared disk name shown are for illustration):

# zpool create mysql3 c1t20030003BA13E6A1d0
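To place the pool under cluster control, it is typically registered with an HAStoragePlus resource. A minimal sketch, assuming the resource group mysql-rg and logical hostname v1280-logical used elsewhere in this document; the resource names mysql-lh-rs and mysql-hasp-rs are illustrative:

# clresourcegroup create mysql-rg
# clreslogicalhostname create -g mysql-rg -h v1280-logical mysql-lh-rs
# clresourcetype register SUNW.HAStoragePlus
# clresource create -g mysql-rg -t SUNW.HAStoragePlus -p Zpools=mysql3 mysql-hasp-rs
# clresourcegroup online -M mysql-rg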

Modify and configure the default file mysql_config, in order to create a fault monitor user and a test database for the MySQL instance. The fault monitor attempts to restart the server in case of a shutdown.
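The exact variables in mysql_config vary between agent versions; the following is a minimal sketch using the hostnames, socket, and credentials from this setup (the fault monitor user fmuser and its password are illustrative):

MYSQL_BASE=/opt/mysql/mysql
MYSQL_USER=root
MYSQL_PASSWD=admin123
MYSQL_HOST=v1280-logical
FMUSER=fmuser
FMPASS=fmuser123
MYSQL_SOCK=/tmp/v1280-logical.sock
MYSQL_NIC_HOSTNAME="v1280-137-03 v1280-137-04"
MYSQL_DATADIR=/mysql3/data

Next, connect to the instance and set up the administrator's grants on both nodes and the logical host: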

# /opt/mysql/mysql/bin/mysql -S /tmp/v1280-logical.sock -uroot -padmin123
> use mysql
> grant all on *.* to 'root'@'v1280-137-03' identified by 'admin123';
> grant all on *.* to 'root'@'v1280-137-04' identified by 'admin123';
> grant all on *.* to 'root'@'v1280-logical' identified by 'admin123';
> UPDATE user SET Grant_priv='Y' WHERE User='root' AND Host='v1280-logical';
> UPDATE user SET Grant_priv='Y' WHERE User='root' AND Host='v1280-137-03';
> UPDATE user SET Grant_priv='Y' WHERE User='root' AND Host='v1280-137-04';
> exit

Refresh the grant privileges for the administrator:

# /opt/mysql/mysql/bin/mysql -S /tmp/v1280-logical.sock -uroot -padmin123
> use mysql
> flush privileges;
> exit

Internal test suite run

The MySQL benchmark suite (currently single-threaded) is part of a server installation. It can be used to determine which database operations in an implementation perform well or poorly.

The following run illustrates the setup and completion of the test suite. The actual performance numbers are not the focus, since this is a default, untuned implementation.

Run the internal test suite on a MySQL 5.0.45 32-bit implementation:

On both nodes, download and install the MySQL DBD driver and the Perl modules needed to access the database servers. As root user, download and uncompress the latest DBI/DBD files from:

http://www.cpan.org/modules/by-category/07_Database_Interfaces/DBD/

(E.g., currently DBI-1.602.tar.gz and DBD-mysql-4.006.tar.gz.)
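The modules follow the usual Perl build sequence; a sketch, assuming the archives above sit in the current directory and MySQL is installed under /opt/mysql/mysql:

# gzcat DBI-1.602.tar.gz | tar xf - && cd DBI-1.602
# perl Makefile.PL && make && make install
# cd .. && gzcat DBD-mysql-4.006.tar.gz | tar xf - && cd DBD-mysql-4.006
# perl Makefile.PL --mysql_config=/opt/mysql/mysql/bin/mysql_config
# make && make install

The benchmark suite itself can then be launched from the sql-bench directory of the installation; the invocation below is a sketch using the credentials from this setup (--small-test gives a quick, reduced run):

# cd /opt/mysql/mysql/sql-bench
# ./run-all-tests --server=mysql --host=v1280-logical --user=root --password=admin123 --small-test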

S/W fault test run

A. Perform a manual failover of an executing MySQL client transaction:

On the primary node (resources and db instance up), execute multiple transactions that insert records in a table. Check that transactions are either committed or rolled back, and that the data is consistent. Measure the switchover (controlled failover) time:
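A sketch of one way to drive and time the switchover; the table test.t1 and the loop below are illustrative, and the resource group name mysql-rg matches the one used later in this document:

Window 1, client load against the logical host:
# while :; do /opt/mysql/mysql/bin/mysql -h v1280-logical -uroot -padmin123 -e "insert into test.t1 values (null, now())"; done

Window 2, timed switchover to the second node:
# date; clrg switch -n v1280-137-04 mysql-rg; date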

Node 1, third window: The script aborts soon after 'clrg switch' begins executing. Those transactions that completed before the abort are committed to the database successfully. The in-flight transaction rolls back completely, and pending transactions are not executed. All the threads are stopped, tables and logs are flushed, and a clean shutdown is performed.

Failover time is the difference between the moment the fault is injected on Node 2 (MySQL DB instance no longer available) and the moment gds_svc_start completes on the same node (MySQL DB instance available again) = 67 sec. Here, the nearest equivalent fault injection message reported on Node 2 is that of Node 1 going down.

Failover time is the difference between the moment the fault is injected on Node 1 (MySQL DB instance no longer available) and the moment gds_svc_start completes on Node 2 (MySQL DB instance available again) = 56 sec. Here, the nearest equivalent fault injection message reported on Node 1 is that of the pnmd daemon exiting.

Failover time is the difference between the moment the fault is injected on Node 2 (MySQL DB instance no longer available) and the moment gds_svc_start completes on the same node (MySQL DB instance available again) = 50 sec. Here, the nearest equivalent fault injection message reported on Node 1 is that of the resource status on Node 1 being faulted.
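The timestamps behind these measurements can be pulled from the system log on each node; a sketch of how the relevant messages can be located (the exact message text varies by release):

# grep gds_svc_start /var/adm/messages
# grep -i mysql /var/adm/messages | tail -20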

S/W fault regression test run

An internal regression test suite (developed by Sun's cluster team) was used to execute cluster regression tests. The suite comprises a set of s/w fault tests, and is automated. The kit s/w is installed onto a client machine. The client is configured to access the cluster, and invokes the tests.

E.g., the client m/c is v490-240-01. On the cluster machines and on the client, add the 'dats' user, and install the test suite package (SUNWdats).
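A sketch of that preparation on each machine, assuming the SUNWdats package directory is in the current directory (the user's home directory shown is illustrative):

# useradd -m -d /export/home/dats dats
# pkgadd -d . SUNWdats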

On the client, supply input to generate the data services configuration file:

> cd /opt/SUNWdats/tset_dataservice/bin
> ./get_dsinfo
Enter the name of the output file : mysql-fault
Enter the name of one of the cluster nodes : v1280-137-03
Obtaining cluster configuration information. Please wait...
Do you want to run tests using New Command Set ? [y/n] : y

Select the Data Service Type
1) Failover dataservice with one resource group
2) Scalable dataservice with one Shared Address Resource group and one Scalable Resource Group
3) Pre-created Resource Group Configuration
4) Other
Enter your selection : 3

Obtaining the registered Resource Group Names on the cluster. Please wait...
These are the registered Resource Groups on the cluster : mysql-rg
Which Resource Groups are needed for this dataservice ?
Enter the Resource Group Names.
End the list with a blank line.
Resource Group Name ? mysql-rg
Resource Group Name ?
You entered the following Resource Group Names : mysql-rg
Is this correct ? [y/n] : y
Obtaining the Resource Groups/Resources Properties. Please wait...
Processing Resource Group mysql-rg (failover)
Do you have a client program for Resource Group mysql-rg ? [y/n] : n
Resource sql-stor of SUNW.HAStoragePlus:4

10: Kill the application daemon process repeatedly to exceed the Retry_count within the Retry_interval. This should result in the restarting of the data service on the same node until Retry_count is reached, and failover of the data service after the subsequent kill attempt.

# scrgadm -c -j mysql -y Retry_interval=1450
# scrgadm -c -g mysql-rg -y Pingpong_interval=360
# kill -9 <pid of mysqld>    // repeat 2-3 times after mysqld restarts; failover should occur eventually
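The effective values can be confirmed before running the test; a quick check using the Sun Cluster 3.2 equivalents of the scrgadm commands above (resource and group names as above):

# clresource show -p Retry_count -p Retry_interval mysql
# clresourcegroup show -p Pingpong_interval mysql-rg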

11: If the fault monitor daemon processes associated with a resource are killed, they should automatically be restarted.

# kill -9 <pid of gds_probe>
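A quick way to verify the restart, before and after the kill:

# pgrep -fl gds_probe    // note the pid
# kill -9 <pid of gds_probe>
# pgrep -fl gds_probe    // a new pid shows the fault monitor was restarted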