Oracle Blog

Clive King's Weblog

Colour Dtrace : my I/O is red

I wrote this D script at the request of a Sales Engineer who wanted colour output to determine if I/O latency was poor, instead of an aggregation. Red for high latency and green for low latency as measured by the IO provider. Give it a go. It is not internationalized yet, but I can provide a sed script to convert colour to color if needed:-)

#!/usr/sbin/dtrace -s
#pragma D option quiet
/\*
Clive.King@sun.com 14/8/2007
The aim of this script is to give a 1st pass at determining the latency of
disk I/O. If its red, it means your storage is slow or very busy indeed.
If its green its OK, if its purple, then it may needs more analysis.
As with any high level tools which try to simplfy an issue, the
world is far more complex than red, purple and green. Is 50 worse than
49 or any other arbitrary threshold? 50ms is in the outer reaches of
what is generally acceptable, you may wish to tune it down to 20ms
for example if disk performance is important.
As with any intrusive monitoring, running this script on a system with a
heavy workload, will had some overhead.
Wikipedia suggests that around 7% of the poplulation are red-green
colour bind, any ideas on how to make this useful (at least not dangerous)
to that section of the population are welcome.
\*/
inline int red = 31;
inline int green = 32;
inline int purple = 35; /\* we have no amber in the standard colour set \*/
inline int reset_colour = 30; /\* black \*/
inline int MS = 1000000;
inline int md_major = 85;
inline int vxvm_major = 272; /\* XXX TODO Check when I get access to a vxvm box XXX \*/
inline int bold = 0; /\* set to 1 if you want bold output \*/
/\*
50ms is really slow. We would expect on fast storage for I/O's to
complete in less than 10ms, so 50ms is setting the bar high
\*/
inline int lat_limit = 50;
/\* size of aggregation elements to keep \*/
inline int summary_size = 5;
BEGIN {
good = 0;
bad = 0;
printf("\\033[0;%d;30m I/O latency by colour for up to the worst %d disk devices\\n\\n", bold, summary_size);
}
io:::start
/args[2]->fi_fs == "ufs" && args[1]->dev_major != md_major && args[1]->dev_major != vxvm_major/
{
start[args[0]->b_edev, args[0]->b_blkno] = timestamp;
}
io:::done
/start[args[0]->b_edev, args[0]->b_blkno]/
{
this->elapsed = timestamp - start[args[0]->b_edev, args[0]->b_blkno];
good += ((this->elapsed <= (lat_limit \* MS)) ? 1 : 0);
bad += ((this->elapsed > (lat_limit \* MS)) ? 1 : 0);
@[args[1]->dev_pathname] = lquantize(this->elapsed / MS, 0, 200, 50);
start[args[0]->b_edev, args[0]->b_blkno] = 0;
}
tick-10s
{
/\* A bit arbitary, but ball park figures \*/
/\* bad > 10% of total I/O's \*/
/\* middling 1% -> 10% of total I/O's \*/
/\* good < 1% of total I/O's \*/
c = (bad \* 10) > good ? red : ( ( bad \* 100) > good) ? purple : green;
printf("\\033[0;%d;%dm%Y\\n", bold, c, walltimestamp);
trunc(@, summary_size);
printa(@);
trunc(@);
good = bad = 0;
}
END
{
printf("\\033[0;0;%dm%s\\n", reset_colour, "END");
}