I recently had a great question on some of the differences in virtual machine disk presentation from one of our amazing clients, and I thought I’d share the answer here because it’s a common question that I receive.

Some hypervisors (including some hyper-converged compute platform vendors who shall remain nameless) do not give you much flexibility in the way storage is presented to a VM. You pick the VM and click to add drives. That’s it. No knobs to turn, no mess, no fuss. It’s meant to be easy, and for almost all situations, that’s just fine.

But that might not be the optimal way to configure a VM that is hungry for I/O (such as a large SQL Server), if your platform gives you the option to configure storage presentation more closely.

On Thursday, March 23rd, at 2pm Eastern, I will be hosting a webinar with Argenis Fernandez from Pure Storage where we will talk about infrastructure challenges as organizations start to adopt SQL Server’s in-memory features, such as In-Memory OLTP and Columnstore Indexes.

Details: The in-memory features that accompany modern versions of SQL Server, such as In-Memory OLTP and columnstore indexes, are some of the most ground-breaking and exciting enhancements to SQL Server in recent memory. However, have you explored these features and found that the performance boosts are not quite as great as advertised? The dependencies on a blazing fast infrastructure underneath SQL Server have never been higher. While these features are lightning fast when used appropriately, the speed of the infrastructure underneath, mostly CPU, memory, and storage, can hold back the performance of these features. Join David Klee, Heraflux Technologies, and Argenis Fernandez, Pure Storage, to learn how to leverage these features to boost your database performance, detect and diagnose any infrastructure performance issues that might exist, and learn about possible long-term improvements to your infrastructure that can safeguard your performance for years!

I’m pleased to announce the general availability of a new free ebook collaboration with James Green from ActualTech Media called “Modern Storage Strategies for SQL Server”. Storage is vitally important to SQL Server performance, but database administrators and storage administrators rarely know the intricacies of each other’s side. My goal for this ebook was to educate SQL Server professionals on how the storage underneath their data actually operates, how to work with the storage administrators on topics specific to SQL Server, and how to make the most of that storage to improve the performance and availability of their databases.

In This Gorilla Guide You’ll Learn:

The basics of SQL Server and database workload characteristics

Key considerations for storage architecture with regard to SQL Server

Useful tips for protecting SQL Server from disasters

Best practices for leveraging flash storage for SQL Server

How to modernize SQL Server by taking advantage of the latest updates to the platform

A few weeks ago Bala Narasimhan from PernixData and I recorded a short conversation where we discussed the tips and tricks that both DBAs and Infrastructure Administrators need to maximize the performance of their systems without significant re-architecting of their environments. Check it out!

My recent post on using the new DiskSpd utility to help you benchmark your storage is a great primer on how to use DiskSpd. But… what if you want to run multiple tests? For example, what if you want to run both read and write tests with a varying number of threads or outstanding operations per thread to see the ramp-up curve? What if you wanted these tests automated? What about putting the results from all of these tests into something that you can quickly review?

Now you can – with a PowerShell script that I’m releasing for free. I call it DiskSpd Batch.

Not only does the script help automate your test cycles, it leverages the great DiskSpd feature of saving the results to an XML file. After the testing cycles complete, it then extracts the relevant information from each test cycle and places it into a CSV output file. You can use this file to perform your own analysis on the results.
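To give a feel for the XML-to-CSV step, here is a minimal Python sketch (not the DiskSpd Batch script itself, which is PowerShell). It assumes the results XML carries a TimeSpan element with TestTimeSeconds and per-thread Target elements holding ReadBytes, ReadCount, WriteBytes, and WriteCount; check those element names against a real results file before relying on them. The sample document and its numbers are made up for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed results document. The element names are an
# assumption about DiskSpd's XML output; verify them against a real file.
SAMPLE_XML = """<Results>
  <TimeSpan>
    <TestTimeSeconds>10.00</TestTimeSeconds>
    <Thread>
      <Id>0</Id>
      <Target>
        <ReadBytes>83886080</ReadBytes>
        <ReadCount>10240</ReadCount>
        <WriteBytes>20971520</WriteBytes>
        <WriteCount>2560</WriteCount>
      </Target>
    </Thread>
    <Thread>
      <Id>1</Id>
      <Target>
        <ReadBytes>83886080</ReadBytes>
        <ReadCount>10240</ReadCount>
        <WriteBytes>20971520</WriteBytes>
        <WriteCount>2560</WriteCount>
      </Target>
    </Thread>
  </TimeSpan>
</Results>"""


def summarize(xml_text):
    """Sum the per-thread counters and convert them to per-second rates."""
    span = ET.fromstring(xml_text).find("TimeSpan")
    seconds = float(span.findtext("TestTimeSeconds"))
    totals = {"ReadBytes": 0, "ReadCount": 0, "WriteBytes": 0, "WriteCount": 0}
    for target in span.iter("Target"):
        for name in totals:
            totals[name] += int(target.findtext(name))
    mib = 1024 * 1024
    return {
        "Read MB/s": round(totals["ReadBytes"] / mib / seconds, 2),
        "Read IOps": round(totals["ReadCount"] / seconds, 2),
        "Write MB/s": round(totals["WriteBytes"] / mib / seconds, 2),
        "Write IOps": round(totals["WriteCount"] / seconds, 2),
    }


def to_csv(rows):
    """Render a list of summary dicts as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # One row per test cycle; here there is only the single sample cycle.
    print(to_csv([summarize(SAMPLE_XML)]))
```

In a real batch run you would call summarize once per results file and hand the whole list of rows to to_csv.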

This PowerShell script is available for free over at my business web site at Heraflux.com.

Usage

First, download DiskSpd from TechNet, and extract it to your hard drive on the server that you wish to test. Read the documentation that comes with it.

Next, find the subdirectory that matches your system architecture (32-bit or 64-bit). This becomes the path to your DiskSpd executable.

From an elevated PowerShell prompt, execute the script with the following parameters that you specify.

Syntax

Parameter: Description

-Time: Duration for each test cycle, measured in seconds

-DataFile: Path and filename for the workload file

-DataFileSize: Workload file size, in the format “500M” for 500MB, or “10G” for 10GB

-OutPath: Results output file location (output file is automatically named)

-SplitIO: “True” tests permutations of read and write tests in the same test cycle, in increments of 10%. “False” only tests 100% read or write test cycles.

-AllowIdle: So as not to overwhelm a storage device’s ability to flush inbound I/O to disk, pause for 20 seconds between test cycles
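The effect of -SplitIO on the number of test cycles is easier to see in code. This is only a rough Python illustration of the idea, not the script’s actual logic: the function names are hypothetical, and pairing each write ratio with both random and sequential access is my assumption based on the IsRandom column in the output.

```python
def write_ratios(split_io):
    """Return the write percentages covered by one batch run."""
    if split_io:
        # 0%, 10%, 20% ... 100% writes, in increments of 10%
        return list(range(0, 101, 10))
    # Only pure read (0% writes) and pure write (100% writes) cycles
    return [0, 100]


def build_matrix(split_io):
    """Pair every write percentage with random and sequential access.

    Covering both access patterns is an assumption for illustration.
    """
    return [
        (ratio, is_random)
        for ratio in write_ratios(split_io)
        for is_random in (True, False)
    ]


if __name__ == "__main__":
    for ratio, is_random in build_matrix(split_io=False):
        pattern = "random" if is_random else "sequential"
        print(f"{ratio}% writes, {pattern}")
```

Under these assumptions, turning SplitIO on grows the run from 4 cycles to 22, so budget your -Time value and any idle pauses accordingly.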

Example

A normal test cycle might resemble the following screenshot.

The script performs numerous tests in the testing cycle, and then extracts the relevant data from the resultant XML file and creates a CSV file with the information that matters.

The output file can be opened with your favorite spreadsheet program. The columns that you will find the most interesting are:

WriteRatio

IsRandom

MB/s

IOps

Read MB/s

Read IOps

Write MB/s

Write IOps

Read and write latencies, broken out by percentile

Download and experiment with it! Remember, storage testing can be dangerous to an IT infrastructure. Not only can you overwhelm one server that’s doing the testing, you can also negatively impact (or even bring offline) the entire storage device and all of the other dependent systems located on it. Do not execute any storage tests in your environment outside of your own workstation until you have the express permission to execute the tests during a pre-specified window of opportunity. Heraflux is not liable for any damage or disruption to your business from you executing these tests in your environment.

As I mentioned in my storage benchmarking post, storage performance is one of the critical infrastructure components underneath a mission-critical SQL Server.

My de facto storage benchmarking utility has recently changed. Last October, Microsoft released a great free utility called diskspd, available at http://aka.ms/DiskSpd. I consider it a very solid modern replacement for the much-loved SQLIO. It is a synthetic I/O subsystem workload generator that runs from the command line. It supports the same kinds of tests as SQLIO, such as read or write, random or sequential, number of threads and thread intensity, and configurable block sizes, but also gives us some significant improvements.

The benefits of diskspd include:

Sub-millisecond granularity on all tests, extremely useful for local SSDs and flash/hybrid storage arrays

Ability to perform read AND write tests in the same test, similar to IOmeter

Latency output metrics per read and write portions of the test, with standard deviation values

CPU consumption analysis by thread during the tests

Latency percentile analysis with percentiles 0-99 and then 99.9 up to 99.99999999 and then 100%, which is very useful for finding inflection points at the extremes which can skew test averages

Can define the workload placement and size in the command line parameters, which is useful to keep the test cycles compact

Ability to set the entropy values used in the workload file generation

Output is in plain text with an option to output to XML, which is extremely useful for a result we can convert and use elsewhere

-r: Random or sequential. If -r is specified, random tests are performed. If this parameter is omitted, sequential tests are performed.

-t: Worker threads. I usually set this to the number of non-hyperthreaded cores on the server.

-w: Read and/or write percentage, based on the percentage of writes. If the test is a pure read test, set this to zero. If the test is a pure write test, set it to 100. You can mix reads and writes. For example, if you want to perform a 20% write / 80% read test, set the parameter as -w20.

-Z: Workload test write source buffers sized to a specific number of bytes. Specify your unit of size (K/M/G). The larger the value, the more write entropy (randomness) your workload data file contains. Experiment with this value based on your system and database workload profiles. For example, 1GB source buffer sizes could use the flag -Z1G.

At the end of the line, specify the workload placement location and file name.

Other parameters exist for more advanced workload simulations, so read the great documentation that accompanies the executable.

What if you want to simulate SQL Server? If we are going to do OLTP-type workloads, use the following sample command as a place to get started.

This test executes an 80%/20% read/write test with an 8KB block size on a 50GB workload file located on the E: drive, using four worker threads with four outstanding I/Os per thread and a write entropy value seed of 1GB. It saves the output text into a results.txt output file for reference.
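As a rough sketch, a command line matching that description could be assembled like this in Python. The flags used are standard diskspd parameters, but the duration, the workload file name, and the inclusion of -L (latency statistics, which the percentile output depends on) are my assumptions and are not taken from the sample command.

```python
def build_diskspd_args(duration_seconds, data_file):
    """Build a diskspd argument list for an OLTP-style test.

    duration_seconds and data_file are illustrative inputs; pick your own.
    """
    return [
        "diskspd",
        "-b8K",                   # 8KB block size
        "-c50G",                  # create a 50GB workload file
        f"-d{duration_seconds}",  # test duration in seconds (assumed value)
        "-o4",                    # four outstanding I/Os per thread
        "-t4",                    # four worker threads
        "-r",                     # random I/O
        "-w20",                   # 20% writes, 80% reads
        "-Z1G",                   # 1GB write source buffer (entropy)
        "-L",                     # collect latency statistics (assumed)
        data_file,                # workload file goes last
    ]


if __name__ == "__main__":
    # Hypothetical path on the E: drive; it must not contain spaces for
    # this simple join to produce a valid command line.
    args = build_diskspd_args(60, r"E:\diskspd\io.dat")
    print(" ".join(args) + " > results.txt")
```

Keeping the parameters in one function like this makes it easy to vary a single value, such as the write percentage, between runs.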

You can also save this into a batch or PowerShell script to make this test easily repeatable.

Execute this with Administrator privileges. Otherwise, you might see an error about needing permissions to write the file, or the test might take longer.

The header simply shows the parameters that you used to run the individual test, including the command line itself.

The next section shows the CPU consumption, broken out by user and kernel time by worker thread, for the test cycle.

The next section gets more interesting. We now have the IOPS and throughput metrics, broken out by worker thread and by read and write, for the test. IOPS by read and write matter the most here, with throughput a close second. The operation counts for each thread should be very similar. Higher IOPS and throughput values are better.

The last section is the most interesting. It presents a percentile analysis of the storage latency from the minimum value up to the maximum value. Look for points of inflection in the data. In this test, you can see that we had a significant inflection point between the 99th and 99.9th percentile. Statistically speaking, less than one percent of our I/Os in this test had a latency greater than 4.123 milliseconds, and less than one tenth of one percent had a latency greater than 11.928 milliseconds.
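If you would rather not eyeball the percentile table, one simple way to flag an inflection point is to look for the largest relative jump between neighboring percentiles. The Python sketch below does that. Only the 99th and 99.9th percentile values come from the test above; the lower percentile values are hypothetical fillers so the example has something to compare against.

```python
def largest_jump(percentiles):
    """Return (lower_pct, upper_pct, ratio) for the biggest relative jump.

    percentiles is a list of (percentile, latency_ms) pairs sorted by
    percentile, with every latency greater than zero.
    """
    best = None
    for (p_lo, ms_lo), (p_hi, ms_hi) in zip(percentiles, percentiles[1:]):
        ratio = ms_hi / ms_lo
        if best is None or ratio > best[2]:
            best = (p_lo, p_hi, ratio)
    return best


SAMPLE = [
    (50.0, 1.5),      # hypothetical
    (90.0, 2.4),      # hypothetical
    (95.0, 3.1),      # hypothetical
    (99.0, 4.123),    # from the test output discussed above
    (99.9, 11.928),   # from the test output discussed above
]

if __name__ == "__main__":
    lo, hi, ratio = largest_jump(SAMPLE)
    print(f"Largest jump: {lo} -> {hi} percentile, {ratio:.2f}x latency")
```

A ratio is used instead of an absolute difference so that a jump from 1ms to 3ms stands out as much as a jump from 10ms to 30ms; whether that is the right choice depends on what you are tuning for.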

In conclusion, storage testing has never been easier than it is with diskspd! This utility is now my de facto storage benchmarking tool. Give it a try!
