Open Worksheets

Using Analytics to examine CPU utilization and NFSv3 operation latency

Worksheets

This is the main interface for Analytics. See Concepts for an
overview of Analytics.

A worksheet is a view where multiple statistics may be graphed.
The screenshot at the top of this page shows two statistics:

CPU: percent utilization broken down by CPU identifier - as a graph

Protocol: NFSv3 operations per second broken down by latency - as a quantize plot

The following sections introduce Analytics features based on that
screenshot.

Graph

The CPU utilization statistic in the screenshot is rendered as a graph.
Graphs provide the following features:

The left panel lists components of the graph, if available. Since this graph was "... broken down by CPU identifier", the left panel lists CPU identifiers. Only components which had activity in the visible window (or selected time) will be listed on the left.

Left panel components can be clicked to highlight their data in the main plot window.

Left panel components can be shift-clicked to highlight multiple components at a time (as in this example, where all four CPU identifiers are highlighted).

Left panel components can be right-clicked to show available drilldowns.

Only ten left panel components will be shown to begin with, followed by "...". You can click the "..." to reveal more. Keep clicking to expand the list completely.

The graph window on the right can be clicked to highlight a point in time. In the example screenshot, 15:52:26 was selected. Click the pause button followed by the zoom icon to zoom into the selected time. Click the time text to remove the vertical time bar.

If a point in time is highlighted, the left panel of components will list details for that point in time only. Note that the text above the left box reads "At 15:52:26:", to indicate what the component details are for. If a time wasn't selected, the text would read "Range average:".

The y-axis auto-scales to keep the highest point in the graph visible (except for utilization statistics, which are fixed at 100%).

The line graph button changes this graph to plot just lines, without the flood fill. This may be useful for a couple of reasons: some of the finer detail in line plots can be lost in the flood fill, so selecting line graphs can improve resolution. This feature can also be used to zoom vertically into component graphs: first select one or more components on the left, then switch to the line graph.
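The way the left panel switches between "Range average:" and "At <time>:" can be sketched in a few lines of Python. This is only an illustration of the behavior described above, not the appliance's implementation: the sample data, the `panel_values` name and the "greater than zero" test for activity are all assumptions.

```python
from statistics import mean

# Hypothetical per-second samples: component name -> list of (time, value).
samples = {
    "cpu0": [("15:52:25", 40), ("15:52:26", 60), ("15:52:27", 50)],
    "cpu1": [("15:52:25", 0), ("15:52:26", 0), ("15:52:27", 0)],
}

def panel_values(samples, selected_time=None):
    """Return (heading, {component: value}) as the left panel might show it."""
    if selected_time is None:
        # No time selected: average each component over the visible range.
        heading = "Range average:"
        values = {name: mean(v for _, v in pts) for name, pts in samples.items()}
    else:
        # A time is selected: show each component's value at that time only.
        heading = f"At {selected_time}:"
        values = {name: dict(pts).get(selected_time, 0) for name, pts in samples.items()}
    # Assumption: "activity" means a value above zero; idle components are hidden.
    return heading, {name: v for name, v in values.items() if v > 0}

print(panel_values(samples))
print(panel_values(samples, "15:52:26"))
```

In this sketch `cpu1` never appears in either result, mirroring the rule that only components with activity in the visible window (or selected time) are listed.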

Quantize Plot

The NFS latency statistic in the screenshot is rendered as a quantize
plot. The name refers to how the data is collected and displayed.
For each statistic update, data is quantized into buckets, which are
drawn as blocks on the plot. The more events in that bucket for that
second, the darker the block will be drawn.
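As a rough illustration of that idea, the following Python sketch counts events per second and per latency bucket, then maps each count to a darkness value. The fixed 1 ms bucket width, the linear shading and the function names are assumptions made for the example; the text above does not specify how Analytics chooses its buckets or shades.

```python
from collections import Counter

def quantize(events, bucket_ms=1.0):
    """Count events per (second, latency bucket).

    events: iterable of (timestamp_seconds, latency_ms) pairs.
    bucket_ms: assumed fixed bucket width in milliseconds.
    """
    counts = Counter()
    for ts, latency_ms in events:
        second = int(ts)                       # one column per second
        bucket = int(latency_ms // bucket_ms)  # one block per latency bucket
        counts[(second, bucket)] += 1
    return counts

def shade(count, max_count):
    """Map a count to a darkness from 0.0 (blank) to 1.0 (darkest), linearly."""
    return count / max_count if max_count else 0.0

# Hypothetical events: three in the first second, one in the next.
events = [(0.1, 0.4), (0.5, 0.7), (0.9, 8.2), (1.2, 0.3)]
counts = quantize(events)
darkest = max(counts.values())
for (second, bucket), n in sorted(counts.items()):
    print(second, bucket, n, shade(n, darkest))
```

Two of the sample events fall into the same block (second 0, 0 to 1 ms), so that block gets the darkest shade in this sketch.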

The example screenshot shows NFSv3 operations spread out to 9 ms
and beyond, with latency on the y-axis, until an event kicked in
about halfway through and the latency dropped to less than 1 ms.
Other statistics can be plotted to explain the drop in latency. In
this case, the filesystem cache hit rate showed steady misses go to
zero at this point: a workload had been randomly reading from disk
(0 to 9+ ms latency), and switched to reading files that were cached
in DRAM.

Quantize plots are used for I/O latency, I/O offset and I/O size,
and provide the following features:

Detailed understanding of the data profile (not just the average, maximum or minimum): these plots visualize all events and promote pattern identification.

Vertical outlier elimination. Without this, the y-axis would always be compressed to include the highest event. Click the crop outliers icon to toggle between different percentages of outlier elimination. Mouse over this icon to see the current value.

Vertical zoom: click a low point from the list in the left box, then shift-click a high point. Now click the crop outliers icon to zoom to this range.
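The effect of outlier elimination on the y-axis can be sketched as follows. This is a minimal illustration, assuming that cropping discards a given percentage of the highest values before choosing the axis maximum; the percentages Analytics actually cycles through are not listed here, and `crop_limit` is a hypothetical name.

```python
def crop_limit(values, eliminate_percent):
    """Return the y-axis maximum after dropping the top eliminate_percent of values."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    # Number of highest values to discard (rounded down), always keeping at least one.
    drop = int(len(ordered) * eliminate_percent // 100)
    keep = max(len(ordered) - drop, 1)
    return ordered[keep - 1]

# Hypothetical latencies in ms: nine fast operations and one slow outlier.
latencies_ms = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 250.0]
print(crop_limit(latencies_ms, 0))   # axis must stretch to the outlier
print(crop_limit(latencies_ms, 10))  # outlier cropped, detail below it becomes visible
```

Without cropping, the single slow event compresses everything else into the bottom of the plot, which is the problem vertical outlier elimination addresses.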

Show Hierarchy

Graphs by filename have a special feature - "Show hierarchy" text will
be visible on the left. When clicked, a pie-chart and tree
view for the traced filenames will be made available.

The following screenshot shows the hierarchy view:

As with graphs, the left panel will show components based on the
statistic breakdown, which in this example was by filename. Filenames
can get a little too long for that left panel; try expanding it by
clicking and dragging the divider between it and the graph, or use
the hierarchy view.

The hierarchy view provides the following features:

The filesystem may be browsed by clicking "+" and "-" next to file and directory names.

File and directory names can be clicked, and their component will be shown in the main graph.

Shift-click pathnames to display multiple components at once, as shown in this screenshot.

The pie chart on the left shows the ratio of each component to the total.

Slices of the pie may be clicked to perform highlighting.

If the graph isn't paused, the data will continue to scroll. The hierarchy view can be refreshed to reflect the data visible in the graph by clicking "Refresh hierarchy".

There is a close button on the right to close the hierarchy
view.
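Conceptually, the tree view and pie chart summarize per-filename data by directory. The Python sketch below shows one way such a roll-up could be computed. It is an assumption-laden illustration rather than the appliance's method: the paths, the counts and the `hierarchy_totals` and `ratios` names are all hypothetical.

```python
from collections import defaultdict
import posixpath

def hierarchy_totals(ops_by_file):
    """Sum per-file operation counts into every ancestor directory."""
    totals = defaultdict(int)
    for path, ops in ops_by_file.items():
        totals[path] += ops
        parent = posixpath.dirname(path)
        while True:
            totals[parent] += ops
            if parent in ("/", ""):
                break  # reached the top of the tree
            parent = posixpath.dirname(parent)
    return dict(totals)

def ratios(totals, names):
    """Share of each named component relative to their combined total (pie slices)."""
    total = sum(totals[n] for n in names)
    return {n: totals[n] / total for n in names} if total else {}

# Hypothetical traced filenames and operation counts.
ops_by_file = {
    "/export/fs1/a.txt": 30,
    "/export/fs1/b.txt": 10,
    "/export/fs2/c.txt": 60,
}
totals = hierarchy_totals(ops_by_file)
print(ratios(totals, ["/export/fs1", "/export/fs2"]))
```

Because the graph keeps scrolling while unpaused, a roll-up like this is only a snapshot, which is why the hierarchy view has to be refreshed to match the visible data.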

Common

The following features are common to graphs and quantize plots:

The height may be expanded. Look for a white line beneath the middle of the graph, then click and drag it downwards.

The width will expand to match the size of your browser.

Click and drag the move icon to switch the vertical location of the statistics.

Background Patterns

Normally graphs are displayed with various colors against a white background. If
data is unavailable for any reason the graph will be filled
with a pattern to indicate the specific reason for data unavailability:

The gray pattern indicates that the given statistic was not being recorded for the time period indicated. This is either because the user had not yet specified the statistic or because data gathering had been explicitly suspended.

The red pattern indicates that data gathering was unavailable during that period. This is most commonly seen because the system was down during the time period indicated.

The orange pattern indicates an unexpected failure while gathering the given statistic. This can be caused by a number of aberrant conditions. If it is seen persistently or in critical situations, contact your authorized support resource and/or submit a support bundle.

Saving a Worksheet

Worksheets can be saved for later viewing. As a side effect,
all visible statistics will be archived - meaning that they will continue
to save new data after the saved worksheet has been closed.

To save a worksheet, click the "Untitled worksheet" text to name it
first, then click "Save" from the local navigation bar. Saved worksheets
can be opened and managed from the Saved Worksheets section.

Toolbar Reference

A toolbar of buttons is shown above graphed statistics. The following
is a reference for their function:

Click                                  Shift-Click
move backwards in time (moves left)    move backwards in time (moves left)
move forwards in time (moves right)    move forwards in time (moves right)
forward to now                         forward to now
pause                                  pause
zoom out                               zoom out
zoom in                                zoom in
show one minute                        show two minutes, three, four, ...
show one hour                          show two hours, three, four, ...
show one day                           show two days, three, four, ...
show one week                          show two weeks, three, four, ...
show one month                         show two months, three, four, ...
show minimum                           show next minimum, next next minimum, ...
show maximum                           show next maximum, next next maximum, ...
show line graph                        show line graph
show mountain graph                    show mountain graph
crop outliers                          crop outliers
sync worksheet to this statistic       sync worksheet to this statistic
unsync worksheet statistics            unsync worksheet statistics
drilldown                              rainbow highlight
save statistical data                  save statistical data
export statistical data                export statistical data

Mouse over each button to see a tooltip describing the click
behavior.

CLI

See Saved Worksheets:CLI for how to dump worksheets in CSV, which may be suitable for automated scripting.
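As an example of the kind of scripting a CSV dump makes possible, the Python sketch below finds the peak value in a dumped statistic. The column layout of the real dump is not described in this section, so the two-column `time,value` format and the sample rows here are purely hypothetical; adjust the column names to match the actual output.

```python
import csv
import io

# Hypothetical dump contents; the real CSV layout may differ.
dump = io.StringIO(
    "time,value\n"
    "15:52:25,40\n"
    "15:52:26,60\n"
    "15:52:27,50\n"
)

def peak(rows):
    """Return the (time, value) pair with the largest value."""
    pairs = ((r["time"], float(r["value"])) for r in rows)
    return max(pairs, key=lambda tv: tv[1])

rows = list(csv.DictReader(dump))
peak_time, peak_value = peak(rows)
print(peak_time, peak_value)
```

To run this against a file saved from the CLI, replace the `io.StringIO` object with `open(path, newline="")`.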

Tips

If you'd like to save a worksheet that displays an interesting event, make sure the statistics are paused first (sync all statistics, then hit pause). Otherwise the graphs will continue to scroll, and when you open the worksheet later the event may no longer be on the screen.

If you are analyzing issues after the fact, you will be restricted to the datasets that were already being archived. Visual correlations can be made between them when the time axis is synchronized. If the same pattern is visible in different statistics, there is a good chance that it is related activity.

Be patient when zooming out to the month view and longer. Analytics is clever about managing long-period data, but there can still be delays when zooming out to long periods.

Tasks

BUI

Monitoring NFSv3 or SMB by operation type

Click the add icon.

Click the "NFSv3 operations" or "SMB operations" line.

Click "Broken down by type of operation".

Monitoring NFSv3 or SMB by latency

Click the add icon.

Click the "NFSv3 operations" or "SMB operations" line.

Click "Broken down by latency".

Monitoring NFSv3 or SMB by filename

Click the add icon.

Click the "NFSv3 operations" or "SMB operations" line.

Click "Broken down by filename".

When enough data is visible, click the "Show hierarchy" text on
the left to display a pie-chart and tree-view for the path names
that were traced in the graph.

Click "Refresh hierarchy" when the pie-chart and tree-view become out of
date with the scrolling data in the graph.