This chapter is from the book

This chapter is from the book

With an ever-growing market share and a continual stream of new software and
functionality, MySQL AB's products have made a dramatic impact on the
technology industry. New versions, features, and functions are arriving at
an increasing pace, but database designers, developers, and administrators
might feel overwhelmed by the sheer number of products, along with how frequently
they're updated.

Although the primary purpose of this book is to help you coax the most performance
out of your MySQL installation, you should first get the lay of the land of
MySQL's entire product suite. To help make things clear, each of the major
products can be classified into a small group of categories. In addition, this
chapter describes how they are covered within this book, and also mentions
some helpful performance-related tools that now ship with the products.

After reviewing the product line, this chapter briefly calls out the versions
that are covered in this book, along with some of the platforms that we tested
when making our recommendations.

Finally, this chapter lists all of MySQL's major performance enhancements
beginning with version 4.0. This list might help you determine when it's
time to upgrade (if you're an existing user) or which version to choose
(if you're new to MySQL).

MySQL Products

To help make the most sense of MySQL AB’s broad, rapidly growing
product suite, these products are classified into the following categories:

MySQL Core Technologies

Database Storage Engines and Table Types

Distributed Computing Technologies

Graphical Tools and Assistants

Connectors

APIs

Utilities

Each of these categories are explained in the following sections. In addition
to these products, MySQL’s website features hundreds of partner solutions
(commercial, shareware, and freeware) that add value throughout the database
design, development, deployment, and management cycles; there are also many
applications built using MySQL technology.

MySQL Core Technologies

As the foundation of the entire product line, these technologies span a wide
range of functionality, from MySQL’s implementation of SQL to its query
optimizer to memory management and communication. This book continually points
out ways to improve these components’ performance. Specifically, chapters
are dedicated to making the most of your SQL statements, MySQL’s query
optimizer, general database server engine settings, and other core
technology-related features.

Database Storage Engines and Table Types

Responsible for accumulating and retrieving information, the database storage
engine lies at the heart of your MySQL installation. When it comes to picking a
specialized storage engine or table type, MySQL offers database designers and
administrators a surfeit of choices. This book spends considerable time
discussing the following:

MyISAM—Fast, compressible, and
FULLTEXT-searchable, this is the default MySQL engine.

InnoDB—Robust, transaction-ready, with strong
referential integrity, this storage engine is often used to support complex,
high-volume applications, in which transactional guarantees are
essential.

MERGE—By creating a single view of multiple
identical MyISAM tables, this storage engine is essential to feed reporting or
Decision Support System (DSS)/Online Analytical Processing (OLAP)
tools.

MEMORY—Previously known as HEAP, its tables are
memory-based, extremely fast and easy to configure, letting developers leverage
the benefits of in-memory processing via a standard SQL interface.

ARCHIVE—As its name indicates, this storage engine
is aimed at applications with very large volumes of infrequently-or-never
updated information. Its tables are parsimonious in their consumption of disk
resources.

CSV—By creating comma-separated files (.csv), this
storage engine makes it very easy for developers to feed other applications that
consume these kinds of files with MySQL-based data.

FEDERATED—Define and access remote tables as if
they were hosted locally.

NDB Cluster—As the underlying storage engine
technology of MySQL Cluster, NDB Cluster makes it possible for multiple
computers to keep their in-memory data in sync, leading to dramatic scalability
and performance improvements.

Of the preceding list, the MyISAM and InnoDB storage engines see the most
usage, which is one reason why this book has chapters dedicated to each of them,
along with a chapter exploring MySQL Cluster (Chapter 17, "Clustering and
Performance").

MySQL offers several additional storage engines that are not covered in this
book. These include the following:

ISAM—Although this is the original MySQL storage
engine, the MyISAM engine has superseded this product; in fact, it will no
longer be distributed from version 5.0. Nevertheless, many of the suggestions
for improving MyISAM response might also apply for legacy ISAM tables.

Berkeley Database (BDB)—This was the first MySQL
storage engine to offer transactional support, among many other advanced
features. However, the InnoDB storage engine has garnered, by far, the higher
market share for this kind of storage engine, so this book primarily focuses on
InnoDB.

EXAMPLE—This is not a storage engine per se;
instead, it can best be thought of as a template that shows MySQL’s
worldwide development community how to write a storage engine.

MaxDB—This is not a storage engine, but a separate
product, originally developed by Adabas, and then overseen by SAP. It’s
used by thousands of SAP customers today. Given the different lineages of the
main MySQL product line and MaxDB, it is not covered in this book. However, many
of the general-purpose recommendations (for example, designing for speed,
indexing, and overhead reduction) made in this book are also applicable to
MaxDB.

Distributed Computing Technologies

Replication and MySQL Cluster are the two foremost MySQL distributed
computing technologies. Replication refers to the act of keeping multiple
"slave" computers in sync with a "master" server. Because this
is such a simple yet powerful way to increase throughput, Chapter 16,
"Optimal Replication," is dedicated to replication best practices.

MySQL Cluster leverages multiple computers into a single team; this yields
impressive performance and reliability gains, and is only limited by the amount
of hardware you have at your disposal. This topic also merits its own chapter.
Chapter 17 explores scenarios in which clustering makes good performance
sense.

Graphical Tools and Assistants

From the beginning, MySQL products have typically been configured, monitored,
and managed from the command line. However, several MySQL offerings now provide
an easy-to-use, graphical interface:

MySQL Administrator—Makes it possible for
administrators to set up, evaluate, and tune their MySQL database server. This
is intended as a replacement for mysqladmin.

MySQL Query Browser—Provides database developers
and others with a graphical database operation interface. It is especially
useful for seeing multiple query plans and result sets in a single user
interface.

Configuration Wizard—Makes it easy for
administrators to pick and choose from a predefined list of optimal settings, or
create their own.

MySQL System Tray—Provides Windows-based
administrators a single view of their MySQL instance, including the ability to
start and stop their database servers. It is similar to tools offered by other
database vendors.

These important capabilities are referred to throughout the book. The
Configuration Wizard is examined later in this chapter.

By using the ODBC connector to MySQL, any ODBC-aware client application (for
example, Microsoft Office, report writers, Visual Basic) can connect to MySQL
without knowing the vagaries of any MySQL-specific keyword restrictions, access
syntax, and so on; it’s the connector’s job to abstract this
complexity into an easily used, standardized interface.

APIs

MySQL AB and several third parties provide application programming interface
(API) libraries to let developers write client applications in a wide variety of
programming languages, including the following:

C (provided automatically with MySQL)

C++

Eiffel

.NET

Perl

PHP

Python

Ruby

Tcl

Currently, C, PHP, and Perl represent the most widely used APIs from the
preceding list, with ODBC connector-using client application development tools
also seeing extensive usage. Although this book is not meant to be a detailed
programming guide for any particular language, it does discuss the interplay
between your chosen API and MySQL performance in Chapter 9.

Where applicable, this book points out how to use these tools to boost
performance. For example, the mysqldump utility is covered in great detail in
Chapter 15, "Improving Import and Export Operations."

Performance-Related Tools

MySQL ships a number of tools that can help database administrators
configure, test, and tune their MySQL installations. Some of these tools are
aimed at people interested in source code, whereas others are aimed at a broader
audience. Each of these tools are briefly examined in the following
sections.

Benchmark Suite

MySQL’s benchmark suite, available for download from their website, is
a useful set of automated tests to help determine overall system performance for
a broad collection of common database-oriented tasks. For example, the following
is a snippet of Perl code that tests inserting new rows into a table:

Although these tests don’t help you determine the optimal database
schema design, query construction, or application logic practices, they are
useful for testing the before-and-after impact of changes to your MySQL server
configuration settings. Just be certain that you take overall system load into
consideration when evaluating the results.

BENCHMARK() Function

The built-in BENCHMARK() function is useful for running raw timing tests on
various computational functions within MySQL. The results of these tests can
help you:

Compare MySQL’s processing capabilities for disparate
operations.

Compare the same operations on different hardware/OS platforms.

For example, you can compare how long it takes MySQL to calculate the MD5 128
bit checksum for a randomly generated number on a modern, multiprocessor Linux
machine versus a five-year-old, single-CPU desktop computer. This actually tests
two MySQL functions: MD5() and RAND().

You could perform this test by hand, time the results, and write them down on
paper:

Notice the difference in how long it took to return the results: This is the
number you should watch.

You can use this function to test the amount of time necessary to complete
any expression. Note that BENCHMARK(), although valuable, does not tell you
whether a particular query is efficient. For that kind of task, use the EXPLAIN
statement, which is reviewed in great detail during Chapter 6,
"Understanding the MySQL Optimizer," study of the MySQL query
optimizer.

Configuration Wizard

Recent versions of MySQL now offer an optional Configuration Wizard,
typically launched upon installation. This section takes a look at the sequence
of steps followed by this wizard, along with how these topics are addressed
throughout the book.

Note that this wizard is quite dynamic, so your experience might be different
from the one presented here (see Figure
3.1).

Your first decision is to choose either a boilerplate ("standard") or
customized ("detailed") installation process. Don’t underestimate
the value of the boilerplate configuration; it has been well thought out, and
represents a good catch-all setup (see Figure
3.2).

Figure 3.2 Choose
between a customized or general-purpose configuration.

If you choose the customized path, the first decision you must make is to
select the type of database server that you are configuring as shown in Figure
3.3.

This is an important decision because the workloads experienced by
transactional and decision support database servers are quite different, meaning
that their respective configurations need to reflect this diversity.

This book keeps this diversity in mind throughout, and makes recommendations
accordingly.

The wizard next provides a choice on how to configure the initial InnoDB
tablespace (see Figure
3.5).

Configuring the correct number of concurrent sessions, network protocols, and
character sets are your next assessments, as shown in Figures
3.6, 3.7, and
3.8.

The impact of connectivity and network settings on performance are examined
as part of several chapters, including those on general engine tuning, optimal
application development, and network configuration. However, character set
issues are not part of the subject matter in this book.

The wizard then gives us a choice on how the database server will be started,
as well as security alternatives (see Figures
3.9 and 3.10).

Because a Windows server is running for this example, MySQL provides
Windows-specific options. The interplay between MySQL and its host operating
system is explored in Chapter 14, "Operating System, Web Server and
Connectivity Tuning"; aside from the performance degradation inherent in
overly complex permission schemes, security is largely a peripheral topic for
this book.