Get the maximum IOPS

A quick post, triggered by a tweet from @GernotNusshall that I saw passing today. He wanted to know how to find the maximum IOPS values over the last 5 minutes for a number of VMs. The IOPS values are readily available from the vSphere statistics, but there are two complications: the values are returned as summation values over the measuring interval, and there are separate read and write values.

An ideal job for PowerShell to get the values Gernot was after.

Update April 26th 2011: the following changes were made

The output now also shows the datastore name

Following a suggestion from Glenn Sizemore, the script now uses the New-Object cmdlet to produce the output

Update June 29th 2011: A version that retrieves the average read and write IOPS was added.

Update August 6th 2011: Yet another version was added, now one that supports NFS datastores.

The script

PowerShell

    $metrics = "disk.numberwrite.summation","disk.numberread.summation"
    $start = (Get-Date).AddMinutes(-5)
    $report = @()

    $vms = Get-VM | where {$_.PowerState -eq "PoweredOn"}
    $stats = Get-Stat -Realtime -Stat $metrics -Entity $vms -Start $start
    $interval = $stats[0].IntervalSecs

    $lunTab = @{}
    foreach($ds in (Get-Datastore -VM $vms | where {$_.Type -eq "VMFS"})){
      $ds.ExtensionData.Info.Vmfs.Extent | %{
        $lunTab[$_.DiskName] = $ds.Name
      }
    }

    $report = $stats | Group-Object -Property {$_.Entity.Name},Instance | %{
      New-Object PSObject -Property @{
        VM = $_.Values[0]
        Disk = $_.Values[1]
        IOPSMax = ($_.Group | `
          Group-Object -Property Timestamp | `
          %{$_.Group[0].Value + $_.Group[1].Value} | `
          Measure-Object -Maximum).Maximum / $interval
        Datastore = $lunTab[$_.Values[1]]
      }
    }

    $report

Annotations

Line 1: The 2 metrics that measure IOPS. The metrics return the number of read and write operations over the measurement interval.

Line 5: The script stores all powered-on guests in an array. The Get-VM statement can be adapted to return just the guests for which you want the IOPS numbers.

Line 6: The script uses only 1 Get-Stat cmdlet for all VMs and all requested metrics. This will optimise the use of resources on the vCenter Server to retrieve the metrics.

Line 7: To avoid hard-coding the duration of the interval the script retrieves the value from the first returned measurement.

Line 9-14: The script creates a hash table that contains the canonical names of all the LUNs used for the datastores on which the virtual machines are stored.

Line 16: The Group-Object cmdlet does all the hard work. It groups the statistical measurements per virtual machine and per disk instance, which is just the way we want them.

Line 17: The New-Object cmdlet is used to create the output object.

Line 20-23: Calculates the maximum IOPS number. The read and write operations for each timestamp are added together, and the highest of these sums is selected. Since the IOPS value is ‘per second’, the script divides that summation value by the length, in seconds, of the interval.

Line 24: The datastore name is retrieved from the hash table with the canonical name of the LUN as the key.
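The calculation in lines 20-23 can be tried without a vCenter connection. The following is a minimal sketch that uses made-up sample objects in place of real Get-Stat output; the timestamps, the values and the 20-second interval are assumptions chosen only to make the arithmetic easy to follow.

```powershell
# Hypothetical samples for one disk: two 20-second intervals,
# each with a read and a write summation value.
$interval = 20
$t1 = Get-Date '2011-04-26 10:00:00'
$t2 = $t1.AddSeconds($interval)

$sample = @(
    New-Object PSObject -Property @{ Timestamp = $t1; MetricId = 'disk.numberread.summation';  Value = 400 }
    New-Object PSObject -Property @{ Timestamp = $t1; MetricId = 'disk.numberwrite.summation'; Value = 600 }
    New-Object PSObject -Property @{ Timestamp = $t2; MetricId = 'disk.numberread.summation';  Value = 900 }
    New-Object PSObject -Property @{ Timestamp = $t2; MetricId = 'disk.numberwrite.summation'; Value = 1100 }
)

# Same pipeline shape as in the script: add read and write per timestamp,
# take the highest sum, then convert it to a per-second value.
$maxOps = ($sample |
    Group-Object -Property Timestamp |
    ForEach-Object { $_.Group[0].Value + $_.Group[1].Value } |
    Measure-Object -Maximum).Maximum

$iopsMax = $maxOps / $interval
$iopsMax   # 2000 operations / 20 seconds = 100
```

The first interval holds 1000 operations and the second 2000, so the busiest interval translates to 100 IOPS.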

A sample run

The script produces output similar to this

The sample output shows the only disadvantage, in my opinion, of using the New-Object cmdlet with a hash table: you have no control over the order of the properties in the object.
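One possible way around the property order, shown here as a sketch and not as part of the original script, is to pipe the objects through Select-Object and list the properties in the order you want. The VM, disk and datastore names below are made up.

```powershell
# Hypothetical report row, created the same way as in the script.
$row = New-Object PSObject -Property @{
    VM        = 'vm01'
    Disk      = 'naa.600a0b80001111'
    IOPSMax   = 100
    Datastore = 'DS1'
}

# Select-Object emits the properties in the order they are listed.
$ordered = $row | Select-Object -Property VM, Disk, Datastore, IOPSMax

# Collect the property names to show the resulting order.
$names = $ordered.PSObject.Properties | ForEach-Object { $_.Name }
$names -join ','
```

In the script itself this would amount to replacing the final `$report` line with `$report | Select-Object VM, Disk, Datastore, IOPSMax`.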

The Read/Write Average version

A reader asked if it was possible to retrieve the average read and write IOPS as well. The following script should do the trick

PowerShell

    $metrics = "disk.numberwrite.summation","disk.numberread.summation"
    $start = (Get-Date).AddMinutes(-5)
    $report = @()

    $vms = Get-VM | where {$_.PowerState -eq "PoweredOn"}
    $stats = Get-Stat -Realtime -Stat $metrics -Entity $vms -Start $start
    $interval = $stats[0].IntervalSecs

    $lunTab = @{}
    foreach($ds in (Get-Datastore -VM $vms | where {$_.Type -eq "VMFS"})){
      $ds.ExtensionData.Info.Vmfs.Extent | %{
        $lunTab[$_.DiskName] = $ds.Name
      }
    }

    $report = $stats | Group-Object -Property {$_.Entity.Name},Instance | %{
      New-Object PSObject -Property @{
        VM = $_.Values[0]
        Disk = $_.Values[1]
        IOPSWriteAvg = ($_.Group | `
          where {$_.MetricId -eq "disk.numberwrite.summation"} | `
          Measure-Object -Property Value -Average).Average / $interval
        IOPSReadAvg = ($_.Group | `
          where {$_.MetricId -eq "disk.numberread.summation"} | `
          Measure-Object -Property Value -Average).Average / $interval
        Datastore = $lunTab[$_.Values[1]]
      }
    }

    $report
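To see what the two average expressions compute, here is a small sketch on made-up samples for a single disk. The values and the 20-second interval are assumptions; real Get-Stat objects carry more properties than the two used here.

```powershell
# Hypothetical samples: two intervals of 20 seconds for one disk.
$interval = 20
$group = @(
    New-Object PSObject -Property @{ MetricId = 'disk.numberwrite.summation'; Value = 200 }
    New-Object PSObject -Property @{ MetricId = 'disk.numberwrite.summation'; Value = 400 }
    New-Object PSObject -Property @{ MetricId = 'disk.numberread.summation';  Value = 100 }
    New-Object PSObject -Property @{ MetricId = 'disk.numberread.summation';  Value = 300 }
)

# Filter on the metric, average the per-interval counts, then convert to per second.
$iopsWriteAvg = ($group |
    Where-Object { $_.MetricId -eq 'disk.numberwrite.summation' } |
    Measure-Object -Property Value -Average).Average / $interval

$iopsReadAvg = ($group |
    Where-Object { $_.MetricId -eq 'disk.numberread.summation' } |
    Measure-Object -Property Value -Average).Average / $interval

"Write: $iopsWriteAvg  Read: $iopsReadAvg"
```

The writes average 300 operations per interval and the reads 200, which gives 15 and 10 IOPS respectively.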

The NFS datastore version

As one of my readers noticed, the original script didn’t handle NFS-based datastores.
The following version remedies that shortcoming.

Hi Gianfranco,
That truncation is the result of PowerShell trying to get the objects in an optimal way on screen.
The problem is that this screen space (line length) is limited, and not all information might fit on one line. Hence the truncation.
There are a number of options to get more on a line:

-) Use the Format-Table cmdlet with the -AutoSize parameter
-) Use the Format-List cmdlet to get the properties on separate lines
-) Save the results to a file, with for example an Export-Csv
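The three options can be sketched as follows. The report row is made up, and the file name iops-report.csv in the temporary folder is only an example.

```powershell
# Hypothetical one-row report, shaped like the output of the script.
$report = @(
    New-Object PSObject -Property @{ VM = 'vm01'; Disk = 'naa.600a0b80001111'; IOPSMax = 100; Datastore = 'DS1' }
)

# Option 1: a table whose column widths are fitted to the data.
$report | Format-Table -AutoSize

# Option 2: one property per line, so nothing is cut off at the end of a line.
$report | Format-List

# Option 3: write everything to a CSV file for later processing.
$csvPath = Join-Path ([System.IO.Path]::GetTempPath()) 'iops-report.csv'
$report | Export-Csv -Path $csvPath -NoTypeInformation
```

The Format-* cmdlets are meant for display only, so for anything that will be processed further the CSV export is usually the safer choice.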

Matt S.

January 21, 2016 at 18:06

Luc, first off, thank you so much for this, and all, scripts you provide! Virtual high-five to you!

Nearly 5 years after the last comment, I have a new question:
One thing I was thinking of within the NFS version of this script is that the numbers given to us by vCenter are already the average of those reads/writes over that interval, right? So, if you want to actually get the real Max number of averaged reads/writes for that vmdk device, removing the part from Line 23 where it divides by $interval would be required, yes?

That’s correct, the values we get in the “average” counters are already the average over the interval we are looking at.
Depending on the Statistics Level for the interval, you can also ask for the maximum and minimum values, but again these will be the average over maxima/minima of the interval.
The division I do in line 23 is to get an IOPS value, which means “per second”. Hence the division by the duration of the interval.

Matt S.

January 22, 2016 at 18:56

The way I understand that value, you’re already getting IOPS, but averaged out for you over that interval, since it’s not a summation of the total number of write operations executed in that interval.
Ok, let’s say that in a 20 second interval you get a value of 100 writes/sec. That means that over that 20 seconds, you had writes above and below 100 and the average of that is 100. You can assume that your maximum write IOPS over that 20 seconds is going to be well above 100 in order to give you an average of 100. Dividing that number by the interval length will only reduce your average further.

What would be nice is if the virtualdisk counters on NFS were the same as they are for disk counters on vmfs. I’m not finding anything in the virtualdisk counters list that actually shows a single max IO/sec value. It’s all averages over the interval.

Hi Fabio,
Both scripts fetch the metrics from the Realtime interval. In that interval all metrics (except for the ones created through aggregation of course) are available, the vCenter statistic levels do not come into play.
The difference with my script is that I use the actual number of read and writes per interval, not the average (except for the NFS version).
But both scripts would have to produce approximately the same results.
My script should be a bit faster since I limit it to 1 call of Get-Stat, and it provides the name of the datastore on which the vDisk resides.

Sam

Nivendran Nair

January 26, 2015 at 13:57

Hi LucD,

Please can you assist? I need a script that monitors 6 different datastores on VMware 5.5 for a period of 30 days, reporting on 8am-5pm. The report must show Max and Average IOPS, e.g.
Day 1 Monitored Time AVG & Max READ IOPS AVG & MAX Write IOPS
Day 2 Monitored Time AVG & Max READ IOPS AVG & MAX Write IOPS

Hi Daniel, yes I still see the incoming comments.
The easiest way to do that would be to use the Instance parameter on the Get-Stat line. That way you will only get back metrics for that specific instance.
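As a sketch of what that selection means: with a live connection the instance name can be passed to Get-Stat through its Instance parameter, as described above. The example below applies the same selection on the client side to made-up samples, so it can be run without vCenter. The instance names are hypothetical.

```powershell
# With a vCenter connection the filter goes on the Get-Stat line itself:
#   Get-Stat -Realtime -Stat $metrics -Entity $vms -Start $start -Instance $instance

$instance = 'naa.600a0b80001111'

# Hypothetical samples for two different disk instances.
$stats = @(
    New-Object PSObject -Property @{ Instance = 'naa.600a0b80001111'; MetricId = 'disk.numberread.summation';  Value = 400 }
    New-Object PSObject -Property @{ Instance = 'naa.600a0b80001111'; MetricId = 'disk.numberwrite.summation'; Value = 600 }
    New-Object PSObject -Property @{ Instance = 'naa.600a0b80002222'; MetricId = 'disk.numberread.summation';  Value = 50 }
)

# Keep only the samples that belong to the requested instance.
$filtered = @($stats | Where-Object { $_.Instance -eq $instance })
$filtered.Count
```

Filtering in the Get-Stat call is preferable when possible, since fewer samples have to be returned by the vCenter Server.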

sandesh

KennyF

August 7, 2012 at 17:30

Hi LucD,

Thank you so much for the script. It’s exactly what I need to get the necessary information for our DR Site requirements. I’m currently battling to get the same information, but for a specific date/time range. I added the following, but I receive a “Cannot index into a null array” error after I run the script and the csv file is empty. I also removed the “-Realtime” as that would ignore the start/finish options.

BerreB

April 23, 2012 at 16:06

Hi Luc,

Thx a lot for these scripts. I do have a question to ask you: I’m not sure how to interpret the values that are returned by the NFS script. For instance for one of my VM’s I get 11,5 IOPS Max for one of its vdisks whilst in esxtop CMD/s for this vdisk is constantly between 250 and 450?

I’m using the NFS version of the script exactly like you posted on this page.

@Jbuddy, you should now wait till the aggregation jobs have propagated the change. Should take 1 day for the values of the day before to be available.
But remember that increasing the statistics levels will also increase the size of your VC database. Monitor the VC database closely in the coming days, and make sure the aggregation jobs complete successfully.

Jbuddy

Thanks again for assisting me with this. I heard at your talk at VMworld that you were quick to answer questions on your site :). So I tried Finish before and was getting a null result. The actual error message is as follows: “The metric counter "disk.numberread.summation" doesn’t exist for entity”. My start and end times are within 3 hours so

@Jbuddy, I suspect this is because your statistics level is not set to at least level 3.
If you look at the disk metrics, you’ll see that numberRead and numberWrite require level 3.
Check your current settings from the vSphere Client.

Jbuddy

October 28, 2011 at 19:50

Hi LucD,

Hopefully you are still monitoring this thread. I am using both of your disk stats scripts, the one for IO averages and the one for maximums. We have an issue that happens at 1:00am on and off for about an hour. Could we adapt the script to just monitor that time period instead of an entire day? Is this possible with Get-Stat? I am about average with PowerShell and I am still wrapping my head around your script.

@Jbuddy, yes, I still look at comments for this thread 🙂
You can change the script to use statistical data from a specific time range with the Start and Finish parameters. You can’t use the Realtime parameter in this case, when the specified time interval is further back in time than approx 1 hour.
The script could look something like this
$metrics = "disk.numberWrite.summation","disk.numberRead.summation"
$start = Get-Date -Hour 1 -Minute 0 -Second 0
$finish = $start.AddHours(1)
$report = @()
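The rest of that script is not reproduced here. As a hedged sketch of the time window only: the helper function New-StatWindow below is hypothetical (it is not a PowerCLI cmdlet) and simply builds the Start and Finish values for a given hour of a given day. The Get-Stat call in the comment uses the Start and Finish parameters mentioned in the reply and needs a vCenter connection.

```powershell
# Hypothetical helper: returns the start and finish of a window that begins
# at the given hour of the given day (today by default).
function New-StatWindow {
    param(
        [int]$Hour = 1,
        [int]$DurationHours = 1,
        [datetime]$Day = (Get-Date)
    )
    # .Date drops the time part, so minutes, seconds and milliseconds are zero.
    $start = $Day.Date.AddHours($Hour)
    New-Object PSObject -Property @{
        Start  = $start
        Finish = $start.AddHours($DurationHours)
    }
}

$window = New-StatWindow -Hour 1 -DurationHours 1

# With a vCenter connection, and without the Realtime parameter:
#   $stats = Get-Stat -Entity $vms -Stat $metrics -Start $window.Start -Finish $window.Finish
"{0} -> {1}" -f $window.Start, $window.Finish
```

To look at the same hour of an earlier day, pass that day explicitly, for example `New-StatWindow -Day (Get-Date).AddDays(-1)`.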

Sorry I was unclear. Using your max IOPS script, what if I wanted to add together two NoteProperty values such as MaxIOPS and IOPSAve to create a third noteproperty value called IOPSNumber for each VM (lines 18-24). Of course adding those two specific values makes no sense, but that’s the general concept I can’t quite master in PS. Is that clearer?

@Derek, I’m not sure that I understand 100% what you want, but this is my interpretation.
A small test array with a property called Value.
I then group the elements based on the Value being odd or even.
This is added as a new property, called newValue, to each element.
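The test code from that reply is not shown here. As a separate, minimal sketch of the general idea in the question (a third property computed from two existing ones), a calculated property on Select-Object is one possible approach. The property names MaxIOPS, IOPSAve and IOPSNumber are taken from the question; the VM names and values are made up.

```powershell
# Hypothetical report objects with the two existing properties.
$report = @(
    New-Object PSObject -Property @{ VM = 'vm01'; MaxIOPS = 100; IOPSAve = 40 }
    New-Object PSObject -Property @{ VM = 'vm02'; MaxIOPS = 250; IOPSAve = 90 }
)

# Keep all existing properties (*) and add IOPSNumber as a calculated property.
$extended = @($report | Select-Object -Property *, @{
    Name       = 'IOPSNumber'
    Expression = { $_.MaxIOPS + $_.IOPSAve }
})

$extended | Format-Table -AutoSize
```

An alternative with the same result would be to call Add-Member with a NoteProperty on each object inside a ForEach-Object loop.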

Gernot Nusshall

April 26, 2011 at 11:15

As I already said on Twitter, thank you! 🙂
The only thing I don’t get is the conversion from canonical names to datastore names. I “studied” the other post you mentioned but, as I already said, I don’t get it. Can you point out in which section of the LUN report script you are doing the conversion?