Importance of Path Change Settings in VMware

With 1 Gb networks it is important to tune per-path settings for maximum throughput. The default of 1000 IOs per path can cause micro-bursts of saturation and limit throughput. I've done a fair amount of testing and found that the best setting is actually to change paths based on the number of bytes sent per path. The reasoning is that changing paths too often is detrimental for small-block IO, while large-block IO needs to switch paths before a single link saturates; setting the path-change trigger to bytes optimizes for both.

Here is a real-world example from a demo I recently conducted. This was done with four 1G interfaces on both the Nimble array and the ESX host, connected through a Cisco 3750X stack.

SQLIO prior to path optimization:

| Server | Tool  | Test Description                                           | IO/s  | MB/s | Avg. Latency |
|--------|-------|------------------------------------------------------------|-------|------|--------------|
| SQL-06 | SQLIO | Random 8k Writes, 8 threads with 8 qdepth for 120 sec      | 12375 | 97   | 4ms          |
| SQL-06 | SQLIO | Random 8k Reads, 8 threads with 8 qdepth for 120 sec       | 14456 | 113  | 3ms          |
| SQL-06 | SQLIO | Sequential 64k Writes, 8 threads with 8 qdepth for 120 sec | 2130  | 133  | 29ms         |
| SQL-06 | SQLIO | Sequential 64k Reads, 8 threads with 8 qdepth for 120 sec  | 2147  | 134  | 29ms         |

SQLIO after path optimization:

| Server | Tool  | Test Description                                           | IO/s  | MB/s | Avg. Latency |
|--------|-------|------------------------------------------------------------|-------|------|--------------|
| SQL-06 | SQLIO | Random 8k Writes, 8 threads with 8 qdepth for 120 sec      | 26882 | 210  | 1ms          |
| SQL-06 | SQLIO | Random 8k Reads, 8 threads with 8 qdepth for 120 sec       | 28964 | 226  | 1ms          |
| SQL-06 | SQLIO | Sequential 64k Writes, 8 threads with 8 qdepth for 120 sec | 7524  | 470  | 8ms          |
| SQL-06 | SQLIO | Sequential 64k Reads, 8 threads with 8 qdepth for 120 sec  | 7474  | 467  | 8ms          |

Notice the large improvement not only in throughput but also in latency. The high latency in the first set of tests was due to saturation of the 1G links.

The optimization is done with the following command from the ESXi 5.x console:
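The command itself didn't carry over into this copy of the post; a sketch of the usual esxcli form on ESXi 5.x is below. The device identifier eui.xxx is a placeholder and the specific --bytes value is my assumption, not the value from the original demo.

# Sketch only: trigger Round Robin path switching on bytes sent per path.
# Replace eui.xxx with the real device ID; the 262144-byte threshold is an
# assumed example value, not the setting used in the demo above.
esxcli storage nmp psp roundrobin deviceconfig set --device=eui.xxx --type=bytes --bytes=262144

# Confirm the change
esxcli storage nmp psp roundrobin deviceconfig get --device=eui.xxx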

Have you guys tried iops=0? This essentially ignores the number of IOPS per path before switching and relies on queue depth instead. Essentially a poor man's least-queue-depth (LQD) on ESX! We are trying to do some testing in the tech-marketing lab to get some results.

I may be wrong, but when making the change to --iops=0 & --bytes=0, it looks like you have to set '--type' to 'iops'. I tried it using '--type=bytes' as written in the script above, but the iops limit didn't change.

Result when run with --type=bytes:

   Device: eui.xxx
   IOOperation Limit: 1000
   Limit Type: Bytes
   Use Active Unoptimized Paths: false
   Byte Limit: 0

After, when run with --type=iops:

   Device: eui.xxx
   IOOperation Limit: 0
   Limit Type: Bytes
   Use Active Unoptimized Paths: false
   Byte Limit: 0

From the help text:

-t|--type=<str>
   Set the type of the Round Robin path switching that should be enabled for this device.
   Valid values for type are:
      bytes: Set the trigger for path switching based on the number of bytes sent down a path.
      default: Set the trigger for path switching back to default values.
      iops: Set the trigger for path switching based on the number of I/O operations on a path.

I just ran the command twice, once to set bytes and once to set IOPS. Since there is a Limit Type, I'm not sure whether setting bytes to 0 matters when the Limit Type is set to IOPS.

My notes from another post:

In the SSH console on ESXi 5.1, this command will loop through each datastore, setting Bytes to 0 and IOPS to 0, and then display the current settings. For some reason, when listing disks, they show up twice: once with their regular ID and a second time with the ID ending in :1, and the settings can't be applied to those entries.
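The loop itself was not preserved in this copy; a rough equivalent of what is described, assuming the devices are enumerated from /vmfs/devices/disks (which would also explain the duplicate entries ending in :1, since those are partition nodes), might look like:

# Hypothetical reconstruction, not the original command from the post.
for d in $(ls /vmfs/devices/disks/ | grep eui); do
   echo "=== $d ==="
   esxcli storage nmp psp roundrobin deviceconfig set --device=$d --type=bytes --bytes=0
   esxcli storage nmp psp roundrobin deviceconfig set --device=$d --type=iops --iops=0
   esxcli storage nmp psp roundrobin deviceconfig get --device=$d
done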

I would be interested in seeing the results of the tests. When I tried using low IOPS-per-path numbers I saw small-block random performance degrade. I did not try setting IOPS per path to 0; I didn't even know that would be a valid input!

I set both Bytes and IOPS to 0, with IOPS as the active Limit Type.

64 KB 100% reads with 5 workers and a 2 GB test file in IOmeter show an increase from 1649 IOPS and 102 MB/s up to 2644 IOPS and 171 MB/s on a single VMDK over 4x1 Gb links. Writes did not seem to be improved in my case.

(I posted this in another thread but just wanted to cross-post here as well, since this thread comes up first in Google search.) FYI: I tried configuring both iops=0 and bytes=0 for all Nimble volumes using the above CLI commands, which worked until the next reboot. After the reboot, iops stayed at 0, but bytes returned to the default of 10485760. This appears to be by design; the PSP policy can be based on either IOPS or Bytes but not both (pick one or the other). There doesn't seem to be any reason to configure bytes=0; the consensus seems to be that setting iops=0 for Nimble volumes is a best practice.

Also, I was able to find a very simple and elegant command line to set all Nimble datastores (including new ones added after the fact) to default to the Round Robin PSP with the PSP policy iops=0. After issuing the following command, reboot the ESXi host and you will see that all Nimble volumes (including new ones) will default to Round Robin with iops=0:
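That command did not survive in this copy of the thread; the usual shape of such a user-defined claim rule is shown below. The SATP name (VMW_SATP_ALUA here) is an assumption on my part and should match whatever SATP actually claims your Nimble volumes (check with esxcli storage nmp device list).

# Assumed form of the rule: claim Nimble volumes with Round Robin and iops=0
esxcli storage nmp satp rule add --satp=VMW_SATP_ALUA --vendor=Nimble --psp=VMW_PSP_RR --psp-option="iops=0"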

Note that if you previously configured a user-defined SATP rule for Nimble volumes to simply use the Round Robin PSP (per the Nimble VMware best practices guide), you will first need to remove that simpler rule, before you can add the above rule, or else you will get an error message that a duplicate user-defined rule exists. The command to remove the simpler rule is:
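That command is also missing from this copy; removing a user-defined rule requires matching the parameters it was added with, so a plausible form (again assuming VMW_SATP_ALUA and a rule that only set the PSP) is:

# Assumed form of the removal; match the options used when the simpler rule was added
esxcli storage nmp satp rule remove --satp=VMW_SATP_ALUA --vendor=Nimble --psp=VMW_PSP_RR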

Yes you are correct - these path changes get reset by default when a host reboots (good old VMware!).

The great thing is that, now that Nimble OS 2.0 is GA, it's possible to use Nimble Connection Manager, which replaces the need for these scripts and also maintains settings after reboots... (requires VMware Enterprise or Enterprise Plus licensing though).

It has been suggested to use the BYTES setting instead of the IOPS setting, so that a path change does not happen until the amount of data sent is closer to the Ethernet frame size.

We ran various tests (same setup as Adam's) and found that on hosts using the standard frame size our optimal settings were IOPS=0 BYTES=512; this gave the best overall read and write numbers. IOPS=0 BYTES=1400 also gave good numbers (slightly better write times than 512). An illustration of applying those settings is shown below.
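For reference, applying that combination to a single device would look something like this; the device name is a placeholder and this is only an illustration of the settings described above, not our exact test script. The ordering matters, since the last --type set becomes the active Limit Type.

# Illustration only, not the exact commands we used.
esxcli storage nmp psp roundrobin deviceconfig set --device=eui.xxx --type=iops --iops=0
esxcli storage nmp psp roundrobin deviceconfig set --device=eui.xxx --type=bytes --bytes=512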

We also ran the same SQLIO test using jumbo frames and could not get any performance increase using any combination of settings (IOPS=0/1000, BYTES=0/512/1400/8800). The default (IOPS=1000, BYTES=10485760) gave the best overall performance. The jumbo frame issue might be related to network congestion or our need to upgrade to a CS440 controller (hopefully ordering soon).

Is there a preferred BYTES setting or are we on the right track with either of those options?