How to Improve Database Performance Using Database Smart Flash Cache on Oracle Linux

Adapted from an Oracle white paper written by Rick Stehno

The information in this article is applicable to all Sun Flash Accelerator Fxx Series adapters. The specific performance data presented in this article was collected using a Sun Flash Accelerator F40 PCIe Card.

You can obtain large performance gains and improved response times by combining Oracle's Sun Flash Accelerator F40 PCIe Card with the Database Smart Flash Cache feature of Oracle Database running on Oracle Linux with the Unbreakable Enterprise Kernel. This article describes how.

Note: The capability described in this article is supported only on the Oracle Linux and Oracle Solaris operating systems beginning with Oracle Database 11g Release 2. The procedures in this article apply to Oracle Linux.

Advantages of Using Flash-Based Storage

Online transaction processing (OLTP), data warehousing (DW), and analytics are typical uses for Oracle Database. OLTP and DW require fast response times and high throughput, making it difficult for database administrators (DBAs) to maintain and scale their infrastructure as the number of users grows and the amount of data increases. While performance bottlenecks might appear in several areas, including the network and the processor, they are most often caused by slow hard disk drives.

Flash-based storage provides performance that falls between the performance levels of hard disk drives and DDR3 memory. For example, the typical response time for a small data read from a hard disk drive is 5 milliseconds. Flash-based devices do this in 50 microseconds to 300 microseconds. Initial implementations of flash-based drives (SSDs, or solid-state drives) were intended to replace a hard disk drive in direct-attach storage or RAID subsystems. Mounting SSDs on a PCIe card is a recent innovation that alleviates throughput constraints that are also caused by the storage interface.

About Sun Flash Accelerator F40 PCIe Card

Sun Flash Accelerator F40 PCIe Card offers 400 GB capacity with over 149,000 random input/output operations per second (IOPS) and 2.1 GB/sec bandwidth performance in a single low-profile PCIe card. Its low latency and high random IOPS performance result in fast response and increased I/O throughput. It uses advanced onboard controllers for enhanced reliability and low CPU overhead. It presents itself to the operating system as a flash card with four SAS drives that can be used for nonpersistent (cache) and persistent (storage) data.

Sun Flash Accelerator F40 PCIe Card is designed for a high level of reliability and compatibility with Oracle hardware and storage systems and Oracle software. It's ideal for use with the Database Smart Flash Cache feature available with Oracle Database 11g Release 2 and later releases.

How Oracle Database Uses Sun Flash Accelerator F40 PCIe Card

Oracle Database 11g Release 2 Enterprise Edition allows you to use flash devices as a Level 2 cache that increases the effective size of the Oracle Database buffer cache without adding more main memory. This capability is referred to as Database Smart Flash Cache.

Figure 1 shows a system with and without Database Smart Flash Cache. As you can see, the system treats Sun Flash Accelerator F40 PCIe Card as a transparent extension of the buffer cache. Because frequently accessed data is cached in the card, the database does not have to wait for data to arrive from slow hard disk drives. I/O service times can be up to 15 times faster.

Figure 1. A system with and without Database Smart Flash Cache.

When the database requests data I/O, the system first looks in the buffer pool. If the data is not found, the system then looks in the Database Smart Flash Cache buffer. If it does not find the data there, only then does it look in disk storage. Not only are performance and response times greatly improved, but also much better IOPS/$, IOPS/GB, IOPS/Watt, and server utilization efficiency are achieved.

Oracle recommends a flash cache size of 4 to 10 times the Oracle Database System Global Area (SGA) size. This amount will allow you to offload most of your disk I/O to flash. A block read from disk is stored in the flash cache once it is evicted from the database buffer cache. All subsequent reads of that block are then served from flash. This is a clean (read) cache, because any dirty blocks (writes) are flushed to disk. Because every change is already written to disk, the flash cache needs no additional data protection, and no RAID or mirroring is required.
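As a worked example of the sizing guideline, the following sketch computes the suggested range for a given SGA size. The function name and the 32 GB SGA are assumptions chosen for illustration; only the 4x and 10x multipliers come from the text.

```shell
# Print the lower and upper bounds (in GB) of the suggested flash cache size,
# using the 4x-to-10x-of-SGA guideline from the text.
# Argument: SGA size in whole gigabytes.
flash_cache_range_gb() {
    sga_gb=$1
    echo "$(( sga_gb * 4 )) $(( sga_gb * 10 ))"
}

# Hypothetical 32 GB SGA: prints "128 320"
flash_cache_range_gb 32
```

With these numbers, the whole suggested range fits on a single 400 GB card.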

How to Configure Sun Flash Accelerator F40 PCIe Card as a File System

Sun Flash Accelerator F40 PCIe Card is a block device optimized for 8K block sizing and alignment (consistent with that of Oracle databases). This section explains actions you can take to tune the card for maximum performance in an Oracle Linux environment with Unbreakable Enterprise Kernel.

The following steps configure the card as one file system that one database can use. Another option is to create multiple aligned partitions on the card and allocate these partitions to other databases residing on the server for their own Database Smart Flash Cache.

Bypass journaling when you create a file system. Instead of using EXT-3 for the file system, use EXT-2, or use EXT-4 with journaling turned off. Either choice eliminates double writes to Sun Flash Accelerator F40 PCIe Card in certain cases. The reduction in writes increases performance and prolongs the life of the card.

Use either of the following commands to create an EXT-2 file system, which has no journal:

mkfs -t ext2 /dev/sda1

or, equivalently:

mkfs.ext2 /dev/sda1

When mounting the new non-journaling device, use the following command (the noatime option is described below):

mount -t ext2 -o noatime /dev/sda1 /mountpoint

Use either of the following commands to create an EXT-4 file system:

mkfs -t ext4 /dev/sda1

or, equivalently:

mkfs.ext4 /dev/sda1

Journaling is on by default in a newly created EXT-4 file system. Verify this by executing the tune4fs command:

tune4fs -l /dev/sda1 | grep 'Filesystem features'

The has_journal feature should be listed. To turn off journaling, execute the following while the file system is unmounted:

tune4fs -O ^has_journal /dev/sda1

To verify that journaling is disabled, execute the following command again and confirm that has_journal is no longer listed:

tune4fs -l /dev/sda1 | grep 'Filesystem features'
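The verification can also be scripted. The following is a minimal sketch, assuming the features line has the usual "Filesystem features: ..." layout; the function name is hypothetical and the sample line is illustrative rather than captured output.

```shell
# Succeed (exit status 0) when the "Filesystem features" line read from
# standard input lists has_journal, and fail otherwise.
journal_enabled() {
    grep 'Filesystem features' | grep -qw 'has_journal'
}

# Intended use on a real device (device name taken from the text):
#   tune4fs -l /dev/sda1 | journal_enabled && echo "journal still on"

# Demonstration on an illustrative features line without has_journal
sample='Filesystem features:      ext_attr resize_inode dir_index filetype extent sparse_super large_file'
if printf '%s\n' "$sample" | journal_enabled; then
    echo "journaling is enabled"
else
    echo "journaling is disabled"
fi
```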

When mounting the new EXT-4 device, use the following command (the noatime option is described below):

mount -t ext4 -o noatime /dev/sda1 /mountpoint

The DEADLINE I/O scheduler is enabled by default in the Unbreakable Enterprise Kernel for Oracle Linux. To verify that it is enabled, run the following command as root. The active scheduler is the one shown in square brackets:

cat /sys/block/sda/queue/scheduler
noop anticipatory [deadline] cfq
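For scripted checks, the bracketed name can be extracted from that output. This is a minimal sketch; the function name is hypothetical, and it assumes the scheduler line has the format shown above.

```shell
# Print the active scheduler, that is, the name shown in square brackets,
# from a scheduler line read on standard input.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Intended use on a live system:
#   active_scheduler < /sys/block/sda/queue/scheduler
# If another scheduler is active, it can be switched at run time as root:
#   echo deadline > /sys/block/sda/queue/scheduler

# Demonstration on the output shown in the text: prints "deadline"
echo 'noop anticipatory [deadline] cfq' | active_scheduler
```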

In addition to using the DEADLINE I/O scheduler, use the noatime file system mount option in the /etc/fstab file. This option stops the system from writing access-time updates to the file system when objects are only being read. It also enables faster access to the files and causes less wear on Sun Flash Accelerator F40 PCIe Card.

This example shows how the /etc/fstab entry invokes the noatime option:

/dev/sda1 /osfc ext2 defaults,noatime 1 2

An alternative to the /etc/fstab entry is to specify the noatime option when executing the mount command (the EXT-2 mount command shown earlier combines noatime with the ext2 file system type):

mount -o noatime /dev/sda1 /osfc
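Whichever method is used, it can be worth confirming that the option is actually in effect. The sketch below reads the kernel's mount table; the function name is hypothetical, and the optional second argument exists only so that a sample file can be substituted for /proc/mounts.

```shell
# Succeed when the file system mounted at the given mount point carries the
# noatime option. The mounts table defaults to /proc/mounts.
mounted_noatime() {
    mnt=$1
    table=${2:-/proc/mounts}
    # Field 2 is the mount point, field 4 the comma-separated option list.
    awk -v mnt="$mnt" '$2 == mnt { print $4 }' "$table" |
        tr ',' '\n' | grep -qx 'noatime'
}

# On a live system, with the /osfc mount point from the fstab example:
#   mounted_noatime /osfc && echo "noatime is active"
```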

The performance of Sun Flash Accelerator F40 PCIe Card can benefit from increasing queue depth (QD) from the default of 128 to 256 or higher, depending on the load. Because the latency of the card is so small, more I/O operations can be run in parallel on the card. To modify queue depth, the nr_requests parameter must also be set to a value that is the same as or larger than the new queue depth value. For /dev/sda, both values are exposed through sysfs as /sys/block/sda/queue/nr_requests and /sys/block/sda/device/queue_depth.
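A minimal sketch of such a change follows, assuming the drive appears as /dev/sda and exposes the standard Linux sysfs attributes. The function name and the optional sysfs-root argument are hypothetical; the latter exists only so the function can be tried against a scratch directory.

```shell
# Set the block-layer request queue size and the device queue depth for one
# disk. nr_requests is written first so that it is never smaller than
# queue_depth.
set_queue_depth() {
    dev=$1
    depth=$2
    sysfs=${3:-/sys}
    echo "$depth" > "$sysfs/block/$dev/queue/nr_requests" &&
    echo "$depth" > "$sysfs/block/$dev/device/queue_depth"
}

# On a live system, as root:
#   set_queue_depth sda 256
```

Values written through sysfs do not survive a reboot, so a call like this would need to be repeated at startup.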

To provide expanded capacity for the Database Smart Flash Cache, multiple Sun Flash Accelerator F40 PCIe Cards can be deployed using volume management software. Multiple cards can be mirrored, but in the configuration described here, two cards were combined using Oracle Automatic Storage Management to expand the cache area capacity.

A benefit of using an Oracle Automatic Storage Management diskgroup for Database Smart Flash Cache is that multiple databases on the server can share the diskgroup, each creating its own Database Smart Flash Cache in it.

See the "How to Configure Sun Flash Accelerator F40 PCIe Card as a File System" section of this article for information on configuring both Sun Flash Accelerator F40 PCIe Cards. When using Oracle Automatic Storage Management, the only steps in that section that need to be performed are aligning the card and changing the Oracle Linux I/O scheduler to deadline.
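A diskgroup of this kind could be created with a statement along the following lines. This is a sketch under stated assumptions: the diskgroup name FLASH and the device names are hypothetical, and the devices are assumed to be visible to Oracle Automatic Storage Management. External redundancy is used because the flash cache holds only clean blocks and needs no mirroring.

```shell
# Print a CREATE DISKGROUP statement for the given diskgroup name and devices.
# EXTERNAL REDUNDANCY tells ASM not to mirror the disks.
asm_diskgroup_sql() {
    name=$1
    shift
    disks=
    for d in "$@"; do
        # Append each device as a quoted, comma-separated list element
        disks="${disks:+$disks, }'$d'"
    done
    echo "CREATE DISKGROUP $name EXTERNAL REDUNDANCY DISK $disks;"
}

# Hypothetical devices, one partition per card. Prints:
# CREATE DISKGROUP FLASH EXTERNAL REDUNDANCY DISK '/dev/sda1', '/dev/sdb1';
asm_diskgroup_sql FLASH /dev/sda1 /dev/sdb1
```

The generated statement would be run in the ASM instance, for example by piping it to sqlplus connected as SYSASM.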

Another option for creating an Oracle Automatic Storage Management diskgroup is to use the Oracle ASM Configuration Assistant, which uses a graphical user interface to create the diskgroup.

How to Configure Oracle Database

This section describes the changes needed to enable and configure an Oracle Database 11g Release 2 database with the Database Smart Flash Cache feature, which is supported by Oracle Linux and by Oracle Solaris.

Setting up the Database Smart Flash Cache feature requires just a few steps: aggregate the Sun Flash Accelerator F40 PCIe Cards into a pool, specify the path to the flash device with the db_flash_cache_file initialization parameter, and specify the size of the flash cache with the db_flash_cache_size initialization parameter.
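The two initialization parameters could be set as in the following sketch. The cache file name and the 300G size are hypothetical values chosen for illustration; /osfc is the mount point from the fstab example earlier in this article. When an Oracle Automatic Storage Management diskgroup is used instead of a file system, the file name would be a diskgroup path beginning with a plus sign.

```shell
# Print the ALTER SYSTEM statements that record the flash cache location and
# size in the server parameter file. With SCOPE=SPFILE the settings take
# effect at the next instance restart.
flash_cache_sql() {
    file=$1
    size=$2
    echo "ALTER SYSTEM SET db_flash_cache_file = '$file' SCOPE=SPFILE;"
    echo "ALTER SYSTEM SET db_flash_cache_size = $size SCOPE=SPFILE;"
}

# Hypothetical file name and size, for illustration only
flash_cache_sql /osfc/flash_cache.dat 300G
```

The output can be reviewed and then passed to sqlplus connected as SYSDBA.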

While the Database Smart Flash Cache is dynamic and efficient because it automatically migrates and evicts data as needed, DBAs can pin objects or hot data to the Database Smart Flash Cache using the FLASH_CACHE KEEP storage attribute. Normally a DBA would pin an object to the KEEP buffer pool, which resides in memory. Pinning an object to the Database Smart Flash Cache instead reduces real memory requirements and increases performance, because more selected objects and hot data can be read directly from the faster flash device rather than from a much slower disk. An object is pinned by setting FLASH_CACHE KEEP in its STORAGE clause.
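The following sketch generates the statement for a table. The function name and the table name orders are hypothetical; the FLASH_CACHE attribute accepts KEEP, NONE, and DEFAULT.

```shell
# Print the ALTER TABLE statement that sets a table's flash cache behavior.
# mode is KEEP (pin in flash), NONE (never cache in flash), or DEFAULT
# (normal aging); it defaults to KEEP.
flash_cache_table_sql() {
    table=$1
    mode=${2:-KEEP}
    case $mode in
        KEEP|NONE|DEFAULT) ;;
        *) echo "unknown FLASH_CACHE mode: $mode" >&2; return 1 ;;
    esac
    echo "ALTER TABLE $table STORAGE (FLASH_CACHE $mode);"
}

# Hypothetical table name. Prints:
# ALTER TABLE orders STORAGE (FLASH_CACHE KEEP);
flash_cache_table_sql orders KEEP
```

Calling the function with DEFAULT produces the statement that releases the pin again.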

Conclusion

Flash accelerates applications, increases productivity, and improves business responsiveness. The benchmarks executed for this article, which used Oracle's Sun Flash Accelerator F40 PCIe Card and the Database Smart Flash Cache feature of Oracle Database running on Oracle Linux with Unbreakable Enterprise Kernel, showed large performance gains. Whether you are running Oracle Database or other I/O-intensive applications, the configuration presented in this article can deliver similar performance gains and improved response times for workloads that:

Are disk-bound

Are I/O-intensive and read-oriented

Are I/O-bound by a large number of disk IOPS

Require low latency and high random I/O throughput

As a side benefit, implementing the Database Smart Flash Cache feature with Sun Flash Accelerator F40 PCIe Card reduces the hard disk IOPS for reads, and that reduction leaves the disks more capacity for physical writes, which then complete with less latency. This not only improves application performance and response times, but also increases server efficiency because less time is spent waiting on storage I/O.

These are significant benefits for customers running large databases. The Database Smart Flash Cache feature, used with Sun Flash Accelerator F40 PCIe Card and Oracle Linux with Unbreakable Enterprise Kernel or Oracle Solaris, provides a platform that can scale and perform to the demanding needs of growing enterprises.