Introduction

If you administer Linux/UNIX servers, collect VMSTAT Log files from them, and want a tool that analyzes such a log file by displaying customizable graphs along with statistics for every component in every Field Descriptor, then you are in the right place. Go ahead, read the article, download the application, and use it.

This article demonstrates how to use the VMSTAT Analyzer effectively. The application is flexible enough for user configuration. The user only needs to provide or locate the VMSTAT file which s/he wants to analyze for UNIX / Linux system resource utilization.

Listed below are some of the functions which can be performed very easily using the user-friendly UI of the application:

Once the user loads the VMSTAT Log file in the application, it checks the integrity and validity of the log file and prompts the user in case there are any warnings or errors. So, you can sit back and relax while VMSTAT_Analyzer.exe does the checking.

The user can view Server Resource Utilization for procs, memory, swap, IO, system, and CPU (the Field Descriptors). The individual components of each Field Descriptor are shown in the customizable graph, together with statistics calculated from the data in the VMSTAT Log file.

This data is parsed from the VMSTAT file which the user has loaded in VMSTAT_Analyzer.exe and is displayed dynamically.

It also helps the user to know the meaning of each component for each Field Descriptor.

It helps the user to know the Minimum, Average, Maximum, Percentile, and Standard Deviation value (as statistics) for each component for every Field Descriptor available in the log file.

The user has the option of Granularity / Zoom selection on the graph: dragging the mouse cursor inside the graph with the left button held down zooms in on the selected part of the graph.

The user can un-zoom, or undo all zooming and return the graph to full scale display, by right clicking on the graph and selecting the option from the menu.

The X axis is calculated from the number of data points and converted into the HH:MM:SS format, which helps the user to understand how long VMSTAT ran on the server.

The following settings can be configured by the user:

The user can change the Title, X Axis, and Y Axis text for the graph as desired, which overrides the default values.

The user can change the interval between the data points in the log file. This value must match the interval that the user gave in the VMSTAT command, and it is very important to configure because the X axis depends entirely on this interval and the number of data points available in the VMSTAT log file. E.g., if the user ran vmstat -n 1 > vmstat.log to collect the VMSTAT Log file from the server, set the interval to 1. If the user ran vmstat -n 5 > vmstat.log, set the interval to 5. This value is in seconds.

The user can change the percentile value for the statistics. This value is in %.

The user has the option of changing the line color for the graph which helps in greater visibility.

The user can save the graph in various image formats to a desired location or copy it to the Clipboard. This can be done very easily by right clicking on the graph and selecting the option from the menu.

The user can export statistics values in CSV format in a desired location. This can be done very easily by right clicking on the Statistics section and selecting the menu.
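To illustrate how the interval setting and the number of data points together determine the X axis, here is a minimal sketch in Python (it is not the application's own C# code). The function name elapsed_labels is hypothetical, and the sketch assumes the first data point is plotted at 00:00:00; the application may place it differently.

```python
from typing import List


def elapsed_labels(num_points: int, interval_seconds: int) -> List[str]:
    """Return one HH:MM:SS label per data point, starting at 00:00:00 (assumed)."""
    labels = []
    for index in range(num_points):
        total = index * interval_seconds          # seconds elapsed at this sample
        hours, remainder = divmod(total, 3600)
        minutes, seconds = divmod(remainder, 60)
        labels.append(f"{hours:02d}:{minutes:02d}:{seconds:02d}")
    return labels


if __name__ == "__main__":
    # A log collected with "vmstat -n 5" that holds 4 data points.
    print(elapsed_labels(4, 5))
```

Under this assumption, a log collected with vmstat -n 5 that holds 721 data points spans exactly one hour. Setting the interval to 1 for the same file would make the run look like only 12 minutes, which is why the interval must match the VMSTAT command.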

The most important point that everybody should remember when creating the VMSTAT file in their UNIX / Linux environment: please use vmstat -n 1 > vmstat.log. You can redirect to any file name you want and use any interval between data points. But note the option -n. Describing VMSTAT itself is out of the scope of this article, but remember that -n makes vmstat print the header only once instead of repeating it periodically. This gives the VMSTAT Log file the header layout that VMSTAT_Analyzer.exe can understand and display as customizable information.

Background

For a couple of days, I searched the Internet for something like the application in this article, which would show me a graph for each component and also provide configurable statistics for them. But I couldn't find any.

That's when I decided to create one and distribute it on www.codeproject.com, so that others can use it and benefit from the application.

Screenshot

Displaying customizable graph and statistics after you have loaded a valid VMSTAT log file which was created in the server using the command vmstat -n 1 > vmstat.log:

Displaying the user's choice of which Field Descriptor to view data for:

Displaying the user's choice of configuring the graph and various other factors:

The user can change the line color, which affects the graph plotting, for greater visibility:

Using the code

I have kept the code very simple and nothing in it is complicated. Anybody can get a grip on the code after downloading the source file. You are free to make any modifications to the code, but it would be great if you shared them with others.

Let me briefly explain this portion of the code. It's fairly simple. As you can see, the VMSTAT Log file has a specific header that categorizes the various server utilization metrics, which we get when we use the command vmstat -n 1 > vmstat.log on a UNIX / Linux server to create the log file. So the very first thing that VMSTAT_Analyzer.exe does is check the integrity and validity of the VMSTAT log file which the user has loaded in the application. Below are the steps that this code snippet performs.

Count the number of data points available in the VMSTAT Log file to determine the values for the X Axis in the graph. It subtracts 2 from the line count ([-2]) so that the two header lines are not counted as data points.

It defines two variables for storing the header strings, which are constant when the user uses the command vmstat -n 1 > vmstat.log on the server.

Then it reads each line from the VMSTAT Log file and breaks / splits that line delimited by space, so that each sub string can be read from each line.

If the line read from the log file is the first line determined by the variable intNumberOfDataPoint = 0, compare each sub string with the variable strFirstLineVMSTATArr. This will help to validate the first line in the VMSTAT Log.

Do the same for second line and compare with the variable strSecondLineVMSTATArr, which will validate the second line in the VMSTAT Log.

From the third line to the end of the log file, each line should contain the values for the individual components. So read each line and check that it contains exactly 16 values and that every value is an integer.

If any of the above criteria is not met, then throw an error that tells the user which line in the VMSTAT log file is invalid and what the problem is.
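The checks above can be sketched roughly as follows. This is an illustration in Python, not the C# code from VMSTAT_Analyzer; the names validate_vmstat_log and count_data_points are hypothetical. One simplification is labeled in the code: instead of comparing the first header line against a constant string, the sketch strips the dashes from each token and compares the group names only, because the exact constants used by the application are not shown here.

```python
from typing import List

# The 6 Field Descriptors and 16 components of a "vmstat -n" log, in column order.
EXPECTED_GROUPS = ["procs", "memory", "swap", "io", "system", "cpu"]
EXPECTED_COLUMNS = ["r", "b", "swpd", "free", "buff", "cache", "si", "so",
                    "bi", "bo", "in", "cs", "us", "sy", "id", "wa"]


def _is_int(token: str) -> bool:
    try:
        int(token)
        return True
    except ValueError:
        return False


def count_data_points(lines: List[str]) -> int:
    """Number of data rows: total lines minus the two header lines."""
    return max(len(lines) - 2, 0)


def validate_vmstat_log(lines: List[str]) -> List[str]:
    """Return error messages with line numbers; an empty list means the log is valid."""
    if len(lines) < 2:
        return ["line 1: file is too short to contain the two header lines"]
    errors = []
    # Assumption: compare group names with the dashes stripped, not the raw string.
    groups = [token.strip("-") for token in lines[0].split()]
    if groups != EXPECTED_GROUPS:
        errors.append("line 1: unexpected Field Descriptor header")
    if lines[1].split() != EXPECTED_COLUMNS:
        errors.append("line 2: unexpected component header")
    for line_number, line in enumerate(lines[2:], start=3):
        values = line.split()
        if len(values) != len(EXPECTED_COLUMNS):
            errors.append(f"line {line_number}: expected 16 values, found {len(values)}")
        elif not all(_is_int(value) for value in values):
            errors.append(f"line {line_number}: contains a non-integer value")
    return errors
```

Returning a list of messages instead of stopping at the first problem is a design choice of this sketch; it lets the caller decide whether to show a warning and continue or to reject the file.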

This is one of the major code snippets used to draw the graph by plotting the X and Y axes values along with other information, like description and graph title.

This method determines which Field Descriptor to consider and displays the graph for that Field Descriptor which the user has selected as per their needs. This is determined by the global variable intGlobalGraphFieldDescriptorValue.

After the graph is displayed by plotting the X and Y axes and other information like statistics values, checkboxes will be enabled as per the selected Field Descriptor. E.g., if the user selected PROCS as the Field Descriptor, then two checkboxes will be enabled, one for runnable processes "r" and one for processes in uninterruptible sleep "b". That means the graph will display two lines distinguished by different colors.

The user can filter by selecting one checkbox and unselecting another checkbox of any component, thus enabling them to analyze one component at a time.
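The relationship between Field Descriptors, their components, and the checkbox filter can be sketched as a simple lookup. The following Python fragment is only an illustration under the 16-column layout described above; the names FIELD_DESCRIPTORS and series_for are hypothetical and do not come from the application's source.

```python
from typing import Dict, Iterable, List, Optional

# Field Descriptor -> components, in the column order of a 16-column vmstat log.
FIELD_DESCRIPTORS: Dict[str, List[str]] = {
    "procs": ["r", "b"],
    "memory": ["swpd", "free", "buff", "cache"],
    "swap": ["si", "so"],
    "io": ["bi", "bo"],
    "system": ["in", "cs"],
    "cpu": ["us", "sy", "id", "wa"],
}
# Flattened column order: r, b, swpd, free, ..., id, wa.
COLUMN_ORDER = [name for names in FIELD_DESCRIPTORS.values() for name in names]


def series_for(descriptor: str,
               rows: List[List[int]],
               enabled: Optional[Iterable[str]] = None) -> Dict[str, List[int]]:
    """Return {component: values} for one Field Descriptor.

    `enabled` plays the role of the checkboxes: when given, only the listed
    components are returned, so one component can be analyzed at a time.
    """
    components = FIELD_DESCRIPTORS[descriptor]
    if enabled is not None:
        wanted = set(enabled)
        components = [name for name in components if name in wanted]
    return {
        name: [row[COLUMN_ORDER.index(name)] for row in rows]
        for name in components
    }
```

Each entry in the returned dictionary corresponds to one line on the graph, so selecting "procs" yields two series and unchecking "b" leaves only "r".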

GetPercentileValue(double[] intArrayYValue) - This helps to calculate the percentile value for an individual component.

GetPopulationStdDeviationValue(double[] intArrayYValue) - This helps to calculate the Standard Deviation value for an individual component.

Export_To_CSV(string strFileName) - This helps to export the statistics value in the user desired location in CSV format.
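For readers who want to see what these three helpers compute, here is a rough Python equivalent. It is a sketch, not a translation of the C# methods: the function names are hypothetical, the percentile uses the nearest-rank method (the method used by GetPercentileValue is not stated here, so this is an assumption), and the CSV column layout is also assumed.

```python
import csv
import math
from typing import Dict, List


def percentile_nearest_rank(values: List[float], percent: float) -> float:
    """Nearest-rank percentile (assumed method): smallest value with at least percent% at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(percent / 100 * len(ordered)))
    return ordered[rank - 1]


def population_std_dev(values: List[float]) -> float:
    """Population standard deviation: divide by N, not N - 1."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((value - mean) ** 2 for value in values) / len(values))


def summarize(values: List[float], percent: float = 90.0) -> Dict[str, float]:
    """Minimum, average, maximum, percentile, and standard deviation for one component."""
    return {
        "min": min(values),
        "avg": sum(values) / len(values),
        "max": max(values),
        "percentile": percentile_nearest_rank(values, percent),
        "stddev": population_std_dev(values),
    }


def export_to_csv(file_name: str, stats_by_component: Dict[str, Dict[str, float]]) -> None:
    """Write one row of statistics per component (column layout is an assumption)."""
    fields = ["min", "avg", "max", "percentile", "stddev"]
    with open(file_name, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["component"] + fields)
        for name, stats in stats_by_component.items():
            writer.writerow([name] + [stats[field] for field in fields])
```

Other percentile definitions (for example, linear interpolation between neighboring values) give slightly different results on small samples, so the numbers from this sketch may not match the application exactly.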

If you want to make some changes in the code, then add a reference to ZedGraph.dll (the basic charting control, which is provided in the source code and can also be downloaded from zedgraph.org) by selecting Project -> Add Reference.

Points of interest

The user gets a beautiful chart for every component in the VMSTAT file along with their statistics which s/he can export as a graph in various image formats and the statistics in CSV format to a location of the user's choice.

Thanks

I would like to thank zedgraph.org for their excellent charting class library. Kudos to ZedGraph, which made my life easy.

History and future enhancements

This is the first version of VMSTAT_Analyzer. I am planning to come up with another version for the -a option of VMSTAT which provides active memory distribution.

Please let me know if I can add any other functionality to the existing version which can help VMSTAT_Analyzer to be a better application.

About the Author

I am a Performance Engineer, but I like programming. I don't do it as a specialty, but because I love it. Anyone who supports source code sharing is on the same plane as me. Anyone who wants to learn more about me can feel free to contact me. Meanwhile, I'd appreciate your feedback. Get in touch: sumitsushil@gmail.com

* If this article is helpful, please give reputation points.
* Don't forget to tip the waiter with your appreciation.

What I did was take the first line of what you provided, with the predefined header that VMSTAT Analyzer understands. Though it throws a warning, we can continue with it to display the result. Hope that helps.

Amazing project. I loved using it to visualize vmstat output. However, if the VM is created in a cloud, vmstat -n 1 adds a separate "steal time" column. The application fails to recognize and discard it. Can you please fix that? That would be awesome.

This tool is really great. There are some minor flaws, as Elmue has suggested. But a great effort has been put into this. I wanted to come up with a similar tool, but now it looks like I can modify your code according to my convenience. Thanks for this cool stuff.

2.) The parser is quite primitive. I load a logfile and it tells me that it is invalid, but the logfile is correct. Your code assumes that a vmstat logfile must have 16 columns. But my file has 17 columns. I suppose you used an older Linux version.

3.) Your code searches for the string ----cpu---- but my file has -----cpu------ (three characters longer). An intelligent parser would not fail because of a few more characters.

4.) Why do you exit the program when the parser finds an error?? Maybe I want to load another file instead of exiting the application! This is currently impossible. I cannot abort the parsing without exiting the application!
_____________________

4. Adapt the line if (objStringArr.Length == 17) to the column count of your logfile depending on your Linux version. (Maybe in the next Linux version there will be 18 columns or 19!)

5. Remove the

frmHelp objFrmHelp = new frmHelp();
objFrmHelp.ShowDialog();

in Form_Load()

It is extremely annoying that the Help window opens every time the program starts and the user has to click it away each time.

These are only some urgent fixes to get a usable application. Many more fixes are required to get a stable application that will also work with logfiles of the next Linux version.

Instead of assuming fixed columns, you should write a dynamic parser that allows ANY count of columns and ANY column names. A generic parser would not even fail if the columns had names like procs xx yy abc test si so bi bo hulk. A dynamic parser would split the line at the spaces and would not care how the columns are named.
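A dynamic parser of the kind suggested here could look roughly like the following Python sketch. It is not part of VMSTAT_Analyzer; the name parse_vmstat_dynamic is hypothetical, and the rule it uses to find the column names (the last non-numeric line before the first data row) is an assumption chosen so that neither the dash count in the first header line nor the number of columns matters.

```python
from typing import Dict, List


def _is_int(token: str) -> bool:
    try:
        int(token)
        return True
    except ValueError:
        return False


def parse_vmstat_dynamic(lines: List[str]) -> Dict[str, List[int]]:
    """Parse a vmstat log with any number of columns into {column name: values}."""
    columns: List[str] = []
    data: Dict[str, List[int]] = {}
    last_header_tokens: List[str] = []
    for line_number, line in enumerate(lines, start=1):
        tokens = line.split()
        if not tokens:
            continue                      # ignore blank lines
        if not all(_is_int(token) for token in tokens):
            last_header_tokens = tokens   # some header line; keep the latest one
            continue
        if not columns:
            # Assumption: the header line just before the first data row names
            # the columns; fall back to generic names if the counts differ.
            if len(last_header_tokens) == len(tokens):
                columns = last_header_tokens
            else:
                columns = [f"col{i + 1}" for i in range(len(tokens))]
            data = {name: [] for name in columns}
        if len(tokens) != len(columns):
            raise ValueError(
                f"line {line_number}: expected {len(columns)} values, found {len(tokens)}"
            )
        for name, token in zip(columns, tokens):
            data[name].append(int(token))
    return data
```

With this approach a 17-column log that includes the st (steal time) column parses the same way as a 16-column one, and a malformed row raises an exception that the caller can catch and report without exiting the application.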

It's great that you work on a Linux machine. But if you have any Windows machine available, then you can analyze your VMSTAT Log file on the Windows machine by using VMSTAT_ANALYZER.exe (which is available in this article). In any case, it is very easy to copy log files from Linux to Windows.