Deploying virtualization into a production data center brings a mix of pros and cons. By consolidating workloads onto fewer servers, physical management is simplified. But what about managing the VMs? While storage solutions can provide much-needed flexibility, it’s still up to datacenter administrators to determine their needs and develop appropriate solutions. In this article, I’ll present storage-related considerations for datacenter administrators.

Estimating Storage Capacity Requirements

Virtual machines generally require a large amount of storage. The good news is that virtualization can, in some cases, improve storage utilization. Direct-attached storage is confined to a per-server basis, which often results in a lot of unused space; centralized storage arrays avoid that waste by pooling capacity across servers. There’s also a countering effect, however: Since the expansion of virtual disk files is difficult to predict, you’ll need to leave some unallocated space for growth. Storage solutions that provide for over-committing space (sometimes referred to as “soft-allocation”) and for dynamically resizing arrays can significantly simplify management.

To add up the storage requirements, you should consider the following:

The sum of the sizes of all “live” virtual disk files

Expansion predictions for virtual disk files

State-related disk files such as those used for suspending virtual machines and maintaining point-in-time snapshots

Space required for backups of virtual machines

All of this can be a tall order, but the overall configuration should be no more complicated than managing multiple physical machines.
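Putting the four items above together, a back-of-the-envelope estimate might look like the following sketch. The VM inventory, growth factor, and backup-copy count are illustrative assumptions, not recommended values:

```python
# Rough capacity estimate for a virtualization storage pool.
# All figures are in gigabytes; tune the factors to your environment.

def estimate_capacity(vms, growth_factor=1.25, backup_copies=2):
    """Sum live virtual disks, projected growth, state files, and backup space."""
    live = sum(vm["disks_gb"] for vm in vms)
    projected = live * growth_factor                   # headroom for expansion
    state = sum(vm.get("state_gb", 0) for vm in vms)   # saved state, snapshots
    backups = live * backup_copies                     # full-copy backups
    return projected + state + backups

vms = [
    {"name": "web01", "disks_gb": 40,  "state_gb": 8},
    {"name": "db01",  "disks_gb": 200, "state_gb": 32},
]
print(estimate_capacity(vms))  # → 820.0
```

The point of a helper like this is less the exact number than forcing each of the four line items to be stated explicitly.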

Placing Virtual Workloads

One of the best ways to reduce disk contention and improve overall performance is to profile virtual workloads to determine their requirements. Performance statistics help determine the number, size, and type of IO operations. Table 1 provides an example.

Table 1: Assigning workloads to storage arrays based on their performance requirements

In the provided example, the VMs are assigned to separate storage arrays to minimize contention. By combining VMs with “compatible” storage requirements on the same array, administrators can better distribute load and increase scalability.
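One simple way to automate this kind of assignment is a greedy heuristic: place the heaviest IOPS consumers first, each onto the currently least-loaded array. The sketch below is a minimal illustration with made-up VM names and IOPS figures, not a substitute for real capacity planning:

```python
def place_vms(vms, arrays):
    """Greedy placement: heaviest IOPS consumers first, each onto the
    array with the lowest accumulated load so far."""
    load = {a: 0 for a in arrays}
    placement = {}
    for vm in sorted(vms, key=lambda v: v["iops"], reverse=True):
        target = min(load, key=load.get)   # least-loaded array
        placement[vm["name"]] = target
        load[target] += vm["iops"]
    return placement

vms = [{"name": "db",   "iops": 900},
       {"name": "file", "iops": 500},
       {"name": "web",  "iops": 300}]
print(place_vms(vms, ["ArrayA", "ArrayB"]))
# db lands alone on ArrayA; file and web share ArrayB
```

A real placement tool would also weigh capacity, read/write mix, and peak-hour overlap, but the heaviest-first principle carries over.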

Selecting Storage Methods

When planning to deploy new virtual machines, datacenter administrators have several different options. The first is to use local server storage. Fault-tolerant disk arrays that are directly-attached to a physical server can be easy to configure. For smaller virtualization deployments, this approach makes sense. However, when capacity and performance requirements grow, adding more physical disks to each server can lead to management problems. For example, each server’s arrays are typically managed independently, leading to wasted disk space and additional administrative effort.

That’s where network-based storage comes in. By using centralized, network-based storage arrays, organizations can support many host servers using the same infrastructure. While support for technologies varies based on the virtualization platform, NAS, iSCSI, and SAN-based storage are the most common. NAS devices use file-level IO and are typically used as file servers. They can be used to store VM configuration and hard disk files. However, latency and competition for physical disk resources can be significant.

SAN and iSCSI storage solutions perform block-level IO operations, providing raw access to storage resources. Through the use of redundant connections and multi-pathing, they can provide the highest levels of performance, lowest latency, and simplified management.

In order to determine the most appropriate option, datacenter managers should consider workload requirements for each host server and its associated guest OS’s. Details include the number and types of applications that will be running, and their storage and performance requirements. The sum of this information can help determine whether local or network-based storage is most appropriate.

Monitoring Storage Resources

CPU and memory-related statistics are often monitored for all physical and virtual workloads. In addition to this information, disk-related performance should be measured. Statistics collected at the host server level will provide an aggregate view of disk activity and whether storage resources are meeting requirements. Guest-level monitoring can help administrators drill down into the details of which workloads are generating the most activity. While the specific statistics that can be collected will vary across operating systems, the types of information that should be monitored include:

IO per Second (IOPS): This statistic refers to the number of disk-related transactions occurring at a given instant. IOPS is often used as the first guideline for determining overall storage requirements.

Storage IO Utilization: This statistic refers to the percentage of total IO bandwidth that is being consumed at a given point in time. High levels of utilization can indicate the need to upgrade or move VMs.

Paging operations: Memory-starved VMs can generate significant IO traffic due to paging to disk. Adding or reconfiguring memory settings can help improve performance.

Disk queue length: The number of IO operations that are pending. A consistently high number indicates that storage resources are creating a performance bottleneck.

Storage Allocation: Ideally, administrators will be able to monitor the current amount of physical storage space that is actually in use for all virtual hard disks. The goal is to proactively rearrange or reconfigure VMs to avoid over-allocation.

VM disk-related statistics will change over time. Therefore, automated monitoring tools that can generate reports and alerts are an important component of any virtualization storage environment.
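As a sketch of how such alerting might work, the function below averages per-host queue-length and utilization samples and flags any host crossing a threshold. The sample format and threshold values are illustrative assumptions, not defaults from any particular monitoring product:

```python
def check_disk_health(samples, queue_limit=2.0, util_limit=0.85):
    """Flag hosts whose average disk queue length or IO utilization
    suggests a storage bottleneck. `samples` maps host name to a list
    of readings like {"queue": 1.2, "util": 0.4}."""
    alerts = []
    for host, readings in samples.items():
        avg_queue = sum(r["queue"] for r in readings) / len(readings)
        avg_util = sum(r["util"] for r in readings) / len(readings)
        if avg_queue > queue_limit or avg_util > util_limit:
            alerts.append(host)
    return alerts

samples = {
    "host1": [{"queue": 4.8, "util": 0.62}, {"queue": 5.1, "util": 0.70}],
    "host2": [{"queue": 0.4, "util": 0.25}, {"queue": 0.6, "util": 0.31}],
}
print(check_disk_health(samples))  # → ['host1']
```

Averaging over a window, rather than alerting on single spikes, matches the “consistently high” criterion described for disk queue length above.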

Summary

Managing storage capacity and performance should be high on the list of responsibilities for datacenter administrators. Virtual machines can easily be constrained by disk-related bottlenecks, causing slow response times or even downtime. By making smart VM placement decisions and monitoring storage resources, many of these potential bottlenecks can be overcome. Above all, it’s important for datacenter administrators to work together with storage managers to ensure that business and technical goals remain aligned over time.

It’s common for new technology to require changes in all areas of an organization’s overall infrastructure. Virtualization is no exception. While many administrators often focus on CPU and memory constraints, storage-related performance is also a very common bottleneck. In some ways, virtual machines can be managed like physical ones. After all, each VM runs its own operating system, applications, and services. But there are also numerous additional considerations that must be taken into account when designing a storage infrastructure. By understanding the unique needs of virtual machines, storage managers can build a reliable and scalable data center infrastructure to support their VMs.

Analyzing Disk Performance Requirements

For many types of applications, the primary consideration around which the storage infrastructure is designed is I/O operations per second (IOPS). IOPS refers to the number of read and write operations performed, but it does not always capture the whole picture. Additional considerations include the type of activity. For example, since virtual disks stored on network-based storage arrays must support guest OS disk activity, the average I/O request size tends to be small. Additionally, I/O requests are frequent and often random in nature. Paging can also create a lot of traffic on memory-constrained host servers. There are also other considerations that will be workload-specific. For example, it’s good to measure the percentage of read vs. write operations when designing the infrastructure.

Now, multiply all of these statistics by the number of VMs that are being supported on a single storage device, and you are faced with the very real potential for large traffic jams. The solution? Optimize the storage solution for supporting many, small, and non-sequential IO operations. And, most importantly, distribute VMs based on their levels and types of disk utilization. Performance monitoring can help generate the information you need.
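Summarizing those statistics from raw monitoring data can be automated. The sketch below assumes you already have an I/O trace as a list of (timestamp in seconds, "read"/"write", size in KB) tuples; how you collect that trace is platform-specific:

```python
def profile_io(ops):
    """Summarize an IO trace: returns (read fraction, average request
    size in KB, IOPS over the trace window)."""
    reads = sum(1 for _, kind, _ in ops if kind == "read")
    avg_size = sum(size for _, _, size in ops) / len(ops)
    window = max(t for t, _, _ in ops) - min(t for t, _, _ in ops) or 1
    return reads / len(ops), avg_size, len(ops) / window

trace = [(0, "read", 4), (1, "write", 8), (2, "read", 12)]
print(profile_io(trace))  # mostly reads, small requests, 1.5 IOPS
```

A profile like (read fraction, request size, IOPS) per VM is exactly the input a placement decision such as the one in Table 1 needs.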

Considering Network-Based Storage Approaches

Many environments already use a combination of NAS, SAN, and iSCSI-based storage to support their physical servers. These methods can still be used for hosting virtual machines, as most virtualization platforms provide support for them. For example, SAN- or iSCSI-based volumes that are attached to a physical host server can be used to store virtual machine configuration files, virtual hard disks, and related data. It is important to note that, by default, the storage is attached to the host and not to the guest VM. Storage managers should keep track of which VMs reside on which physical volumes for backup and management purposes.

In addition to providing storage at the host-level, guest operating systems (depending on their capabilities) can take advantage of NAS and iSCSI-based storage. With this approach, VMs can directly connect to network-based storage. A potential drawback, however, is that guest operating systems can be very sensitive to latency, and even relatively small delays can lead to guest OS crashes or file system corruption.

Evaluating Useful Storage Features

As organizations place multiple mission-critical workloads on the same servers through the use of virtualization, they can use various storage features to improve reliability, availability and performance. Implementing RAID-based striping across arrays of many disks can help significantly improve performance. The array’s block size should be matched to the most common size of I/O operations. However, more disks means more chances for failures. So, features such as multiple parity drives and hot standby drives are a must.

Fault tolerance can be implemented through the use of multi-pathing for storage connections. For NAS and iSCSI solutions, storage managers should look into having multiple physical network connections and implementing fail-over and load-balancing features by using network adapter teaming. Finally, it’s a good idea for host servers to have dedicated network connections to their storage arrays. While you can often get by with shared connections in low-utilization scenarios, the load placed by virtual machines can be significant and can increase latency.

Planning for Backups

Storage administrators will need to back up many of their virtual machines. Apart from allocating the necessary storage space, it is necessary to develop a method for dealing with exclusively-locked virtual disk files. There are two main approaches:

Guest-Level Backups: In this approach, VMs are treated like physical machines. Generally, you would install backup agents within VMs, define backup sources and destinations, and then let them go to work. The benefit of this approach is that only important data is backed up (thereby reducing required storage space). However, your backup solution must be able to support all potential guest OS’s and versions. And, the complete recovery process can involve many steps, including reinstalling and reconfiguring the guest OS.

Host-Level Backups: Virtual machines are conveniently packaged into a few important files. Generally, this includes the VM configuration file and virtual disks. You can simply copy these files to another location. The most compatible approach involves stopping or pausing the VM, copying the necessary files, and then restarting the VM. The issue, however, is that this can require downtime. Numerous first- and third-party solutions are able to back up VMs while they’re “hot”, thereby eliminating service interruptions. Regardless of the method used, replacing a failed or lost VM is easy – simply restore the necessary files to the same or another host server and you should be ready to go. The biggest drawback of host-level backups is in the area of storage requirements: you’ll be allocating a ton of space for the guest OS’s, applications, and data you’ll be storing.
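The pause-copy-restart sequence for a cold host-level backup can be sketched as follows. This is a minimal illustration, not any platform’s actual tooling: `pause` and `resume` are hypothetical callables standing in for whatever control API your virtualization platform provides, and `files` is an assumed inventory of the VM’s configuration and virtual disk files:

```python
import pathlib
import shutil

def backup_vm(vm, pause, resume, dest_dir):
    """Cold host-level backup: pause the VM, copy its configuration and
    virtual disk files, then resume it -- even if the copy fails."""
    dest = pathlib.Path(dest_dir) / vm["name"]
    dest.mkdir(parents=True, exist_ok=True)
    pause(vm["name"])
    try:
        for f in vm["files"]:          # e.g. the .vmc config plus .vhd disks
            shutil.copy2(f, dest)      # copy2 preserves timestamps
    finally:
        resume(vm["name"])             # restart the VM no matter what
    return dest
```

The try/finally is the important part: a failed copy must never leave the VM paused indefinitely.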

Storage solution options such as the ability to perform snapshot-based backups can be useful. However, storage administrators should thoroughly test the solution and should look for explicitly-stated virtualization support from their vendors. Remember, backups must be consistent to a point in time, and non-virtualization-aware solutions might neglect to flush information stored in the guest OS’s cache.

Summary

By understanding and planning for the storage-related needs of virtual machines, storage administrators can help their virtual environments scale and keep pace with demand. While some of the requirements are somewhat new, many involve utilizing the same storage best practices that are used for physical machines. Overall, it’s important to measure performance statistics and to consider storage space and performance when designing a storage infrastructure for VMs.

Much of the power and flexibility of virtualization solutions comes from the features available for virtual hard disks. Unfortunately, with the many different configuration types available, you can end up reducing overall performance if you’re not careful. A key concept is virtual hard disk file placement. Let’s look at some scenarios and recommendations that can have a significant impact on performance.

VHD File Placement

Most production-class servers will have multiple physical hard disks installed, often to improve performance and to provide redundancy. When planning for allocating VHDs on the host’s file system, the rule is simple: Reduce disk contention. The best approach requires an understanding of how VHD files are used.

If each of your VMs has only one VHD, then you can simply spread them across the available physical spindles based on their expected workload. A common configuration is to use one VHD for the OS and to attach another for data storage. If both VHDs will be busy, placing them on different physical volumes can avoid competition for resources. Other configurations can be significantly more complicated, but the general rule still applies: try to spread disk activity across physical spindles whenever possible.
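As a trivial sketch of the “spread across spindles” rule, the helper below round-robins a list of busy VHDs across the available physical volumes. The file and volume names are placeholders; a real placement would also weigh each VHD’s expected activity level:

```python
from itertools import cycle

def spread_vhds(vhds, volumes):
    """Round-robin VHD files across physical volumes to reduce contention.
    Assumes each volume sits on an independent spindle or array."""
    return dict(zip(vhds, cycle(volumes)))

layout = spread_vhds(["os.vhd", "data.vhd", "logs.vhd"], ["D:", "E:"])
print(layout)  # → {'os.vhd': 'D:', 'data.vhd': 'E:', 'logs.vhd': 'D:'}
```

Note that the busiest pair ends up separated only if they are adjacent in the input list, which is why sorting VHDs by expected activity before placement is a sensible refinement.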

Managing Undo and Differencing Disks

If you are using undo disks or differencing disks, you’ll want to arrange them such that concurrent I/O is limited. Figure 1 shows an example in which differencing disks are spread across physical disks. In this configuration, the majority of disk read activity is occurring on the parent VHD file, whereas the differencing disk will experience the majority of write activity. Of course, these are only generalizations as the size of the VHDs and the actual patterns of read and write activity can make a huge difference.

Figure 1: Arranging parent and child VHD files for performance.

In some cases, using undo disks can improve performance (for example, when the undo disks and base VHDs are on separate physical spindles). In other cases, such as when you have a long chain of differencing disks, you can generate a tremendous amount of disk-related overhead. For some read and write operations, Virtual Server might need to access multiple files to find the “latest” version of the data. And, this problem will get worse over time. Committing undo disks and merging differencing disks with their parent VHDs are important operations that can help restore overall performance.
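The lookup cost of a long differencing-disk chain can be illustrated with a short sketch. This is a simplified model, not Virtual Server’s actual implementation: each disk is represented as the set of block numbers it contains, and a read walks the chain from the newest child toward the base parent until it finds the latest copy of the data:

```python
def resolve_block(chain, block):
    """Return the name of the first disk in the chain (newest first) that
    holds `block`. Each extra chain link is another lookup per read."""
    for disk in chain:
        if block in disk["blocks"]:
            return disk["name"]
    return None  # block was never written

chain = [
    {"name": "diff2", "blocks": {5}},           # newest differencing disk
    {"name": "diff1", "blocks": {3, 5}},
    {"name": "base",  "blocks": {1, 3, 5, 7}},  # parent VHD
]
print(resolve_block(chain, 7))  # → base (three lookups for one read)
```

Merging `diff1` and `diff2` back into the parent collapses the chain to a single lookup per read, which is exactly why committing and merging restore performance.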

Fixed-Size vs. Dynamically-Expanding VHDs

The base type for VHDs you create can have a large effect on overall performance. While dynamically-expanding VHDs can make more efficient use of physical disk space on the host, they tend to get fragmented as they grow. Fixed-size VHDs are more efficient since physical disk space is allocated and reserved when they’re created. The general rule is, if you can spare the disk space, go with fixed-size hard disks. Also, keep in mind that you can always convert between fixed-size and dynamically-expanding VHDs, if your needs change.

Host Storage Configuration

The ultimate disk-related performance limits for your VMs will be determined by your choice of host storage hardware. One important decision (especially for lower-end servers) is the type of local storage connection. IDE-based hard disks will offer the poorest performance, whereas SATA, SCSI, and Serial-Attached SCSI (SAS) will offer many improvements. The key to the faster technologies is that they can efficiently carry out multiple concurrent I/O operations (a common scenario when multiple VMs are cranking away on the same server).

When evaluating local storage solutions, there are a couple of key parameters to keep in mind. The first is overall disk throughput (which reflects the total amount of data that can be passed over the connection in a given amount of time). The other important metric is the number of I/O operations per second that can be processed. VM usage patterns often result in a large number of small I/O operations. Just as important is the number of physical hard disks that are available. The more physical disk spindles available, the better your overall performance will be.
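The relationship between the two metrics is simple arithmetic: throughput is IOPS times average request size. A quick back-of-the-envelope helper (illustrative only) shows why the IOPS ceiling usually matters more than raw bandwidth for VM workloads:

```python
def required_throughput_mb_s(iops, avg_io_kb):
    """Bandwidth implied by a given IOPS rate and average request size."""
    return iops * avg_io_kb / 1024

# 5,000 IOPS of small 8 KB requests needs only ~39 MB/s of bandwidth,
# so a disk subsystem hits its IOPS limit long before its throughput limit.
print(required_throughput_mb_s(5000, 8))  # → 39.0625
```

This is why a connection technology’s ability to handle many concurrent small operations matters more here than its headline transfer rate.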

Using RAID

Various implementations of RAID technology can also make the job of placing VHD files easier. Figure 2 provides a high-level overview of commonly-used RAID levels, and their pros and cons. By utilizing multiple physical spindles in each array, performance can be significantly improved. Since multiple disks are working together at the disk level, the importance of manually moving VHD files to independent disks is reduced. And, of course, you’ll have the added benefit of fault-tolerance.

Figure 2: Comparing various RAID levels

Virtual IDE vs. SCSI Controllers

Virtual Server gives you two different methods for connecting virtual hard disks to your VMs: IDE and SCSI. Note that these options are independent of the storage technology you’re using on the host server. The main benefit of IDE is compatibility: Pretty much every x86-compatible operating system supports the IDE standard. You can have up to four IDE connections per VM, and each can have a virtual hard disk or virtual CD/DVD-ROM device attached.

While IDE-based connections work well for many simpler VMs, SCSI connections offer numerous benefits. First, VHDs attached to an IDE channel are limited to 127GB, whereas SCSI-attached VHDs can be up to 2 terabytes in size. Additionally, the virtual SCSI controller can support up to a total of 28 attached VHDs (four SCSI adapters times seven available channels on each)! Figure 3 provides an overview of the number of possible disk configurations.

Figure 3: Hard disk connection interface options for VHDs

If that isn’t enough, there’s one more advantage: SCSI-attached VHDs often perform better than IDE-attached VHDs, especially when the VM is generating a lot of concurrent I/O operations.

Figure 4: Configuring a SCSI-attached VHD for a VM.

One helpful feature is that, in general, the same VHD file can be attached to either IDE or SCSI controllers without making changes. The major exception to the rule is generally the boot hard disk, as BIOS and driver changes will likely be required to make that work. Still, the rule for performance is pretty simple: Use SCSI-attached VHDs whenever you can and use IDE-attached VHDs whenever you must.

Summary

When you’re trying to set up a new Virtual Server installation for success, designing and managing VHD storage options is a great first step. Disk I/O bottlenecks are a common cause of real-world performance limitations, but there are several ways to reduce them. In the next article, I’ll talk about maintaining VHDs to preserve performance over time.
