ServerUsage – Measuring users’ activity on Linux hosts

ServerUsage is a free and open-source software system that collects and processes usage statistics from multiple computers running a GNU/Linux operating system.

Since CatN is a “cluster” hosting company, one of our challenges is to track and analyse customers’ activity spread across multiple physical hosts, for both monitoring and billing purposes. From each physical host we need to collect, at regular intervals, the total disk I/O, the network traffic and the CPU ticks used by each system user, process and IP address. This “raw” log data must then be aggregated and processed at a central point to extract relevant information.

After spending some time searching for a ready-made solution, I decided to start the ServerUsage project to best fit our needs. This project is now available as free and open-source software, so anyone can freely use it and contribute to it.

NOTE: ServerUsage works as a complete system, but the small programs it is composed of can also be used independently for different purposes.

The Architecture

ServerUsage is composed of two main sections: ServerUsage-Client and ServerUsage-Server.

ServerUsage flowchart schema

ServerUsage-Client

This section contains the software to be installed on the computers to be monitored. It is essentially composed of a SystemTap kernel module that collects the usage information, and a program that transmits the data to a remote server through a TCP connection.
SystemTap is a free software infrastructure that simplifies the gathering of information about a running Linux system; it is somewhat equivalent to DTrace on Solaris-based systems.
Once installed and configured, this system can be easily started and stopped using the provided SysV init script.

ServerUsage-Client-MDB

This section contains the software to be installed on the MariaDB database servers to be monitored. MariaDB is an enhanced, drop-in replacement for MySQL. It includes the Google and Percona patches to get usage statistics. With this module you can measure the CPU time spent by each DB user and the number of bytes sent and received.
Once installed and configured, this system can be easily started and stopped using the provided SysV init script.

ServerUsage-Server

The ServerUsage-Server program listens on a TCP port for incoming log data from multiple ServerUsage-Client clients and stores the logs in a SQLite table. A script (serverusage_dbagg.sh) is executed periodically by a cron job to aggregate the data into another table and delete obsolete data.
Once installed and configured, this system can be easily started and stopped using the provided SysV init script.

The serverusage_api.php script can be used remotely to extract formatted data from the database or to display graphs.
This script accepts several input parameters.

All servers should use standard UTC as their reference time.
The latest time available in the ServerUsage-Server aggregated table always lags behind the current time by the value of the DB_AGGREGATION_DELAY constant (5 minutes by default).
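
To make the effect of this delay concrete, here is a minimal Python sketch that computes the newest timestamp a caller can expect to find in the aggregated table. The function name is hypothetical, and the 300-second value simply restates the default mentioned above; this is not code from ServerUsage itself.

```python
from datetime import datetime, timedelta, timezone

# Assumed to mirror the server's DB_AGGREGATION_DELAY default (5 minutes), in seconds.
DB_AGGREGATION_DELAY = 300


def latest_aggregated_time(now_utc, delay_seconds=DB_AGGREGATION_DELAY):
    """Return the newest UTC time expected to be present in the aggregated table."""
    if now_utc.tzinfo is None:
        # Refuse naive datetimes: all hosts are expected to work in UTC.
        raise ValueError("now_utc must be timezone-aware")
    return now_utc.astimezone(timezone.utc) - timedelta(seconds=delay_seconds)
```

For example, calling `latest_aggregated_time(datetime.now(timezone.utc))` gives the upper bound to use when requesting aggregated data, so that a query never asks for an interval that has not been aggregated yet.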

When the service is started, the serverusage_client.ko SystemTap kernel module is executed via the staprun command and its output is piped to serverusage_tcpsender.bin, which sends it to the Log server over a TCP connection. If the connection is broken or the Log server is not responding, the log data is temporarily stored in the /var/log/serverusage_cache.log file and resent as soon as the TCP connection is restored.
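
The cache-and-resend behaviour described above can be illustrated with a short Python sketch. It is not the actual serverusage_tcpsender.bin implementation: the class name, its methods and the line-oriented framing are assumptions made for the example. The idea is simply to spool lines to a local file while the server is unreachable and to flush that file first once a connection succeeds, so the original order is preserved.

```python
import os
import socket


class CachingSender:
    """Forward log lines over TCP; spool them to a local file while the server is down."""

    def __init__(self, host, port, cache_path, timeout=5.0):
        self.host = host
        self.port = port
        self.cache_path = cache_path  # e.g. /var/log/serverusage_cache.log
        self.timeout = timeout
        self.sock = None

    def _send_raw(self, data):
        """Try to send bytes, (re)connecting if needed. Return True on success."""
        try:
            if self.sock is None:
                self.sock = socket.create_connection(
                    (self.host, self.port), timeout=self.timeout)
            self.sock.sendall(data)
            return True
        except OSError:
            # Drop the broken socket so that the next call reconnects.
            if self.sock is not None:
                self.sock.close()
                self.sock = None
            return False

    def _flush_cache(self):
        """Resend spooled lines before any new data. Return True if nothing is left."""
        if not os.path.exists(self.cache_path):
            return True
        with open(self.cache_path, "rb") as fh:
            cached = fh.read()
        if cached and not self._send_raw(cached):
            return False
        os.remove(self.cache_path)
        return True

    def send_line(self, line):
        """Send one log line, or append it to the cache if the server is unreachable."""
        data = line.rstrip("\n").encode("utf-8") + b"\n"
        if self._flush_cache() and self._send_raw(data):
            return True
        with open(self.cache_path, "ab") as fh:
            fh.write(data)
        return False
```

Note that a successful `sendall` only means the data was handed to the local TCP stack; a production sender may need an application-level acknowledgement to be sure the server actually stored the line.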

To start the service at boot you can use the following command:

# chkconfig serverusage_client on

Install ServerUsage-Client-MDB

The ServerUsage-Client-MDB RPM must be installed on each MariaDB server you wish to monitor.

As root install the ServerUsage-Client-MDB RPM file (please replace the version number with the correct one):

# rpm -i serverusage_client_mdb-6.3.0-1.el6.$(uname -m).rpm

Configure the ServerUsage-Client-MDB

# nano /etc/serverusage_client_mdb.conf

Set the IP address of the Log server where ServerUsage-Server is installed and be sure that the specified TCP port is open on both client and server.

The ServerUsage-Client-MDB includes a SysV init script to start/stop/restart the service.

When the service is started, the logs are collected and piped to serverusage_tcpsender.bin, which sends them to the Log server over a TCP connection. If the connection is broken or the Log server is not responding, the log data is temporarily stored in the /var/log/serverusage_cache.log file and resent as soon as the TCP connection is restored.

To start the service at boot you can use the following command:

# chkconfig serverusage_client_mdb on

Install ServerUsage-Server

The ServerUsage-Server RPM must be installed only on the Log server (the computer receiving the logs from the clients).

As root install the ServerUsage-Server RPM file (please replace the version number with the correct one):

# rpm -i serverusage_server-6.3.0-1.el6.$(uname -m).rpm

Once the RPM is installed you can configure the ServerUsage-Server by editing the following file:

# nano /etc/serverusage_server.conf

The ServerUsage-Server includes a SysV init script to start/stop/restart the service.

The init script starts the serverusage_tcpreceiver.bin program, which listens for incoming TCP connections from the clients, and installs a cron job that aggregates the data every 5 minutes.
The raw data received by serverusage_tcpreceiver.bin is stored in a SQLite 3 database (/var/lib/serverusage/serverusage.db), in a table named log_raw. The table containing the aggregated data is called log_agg_hst. Once aggregated, the data is immediately removed from the log_raw table. Data in log_agg_hst older than DB_GARBAGE_TIME seconds is automatically removed.
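
The aggregate, purge and garbage-collect cycle can be sketched in Python with the standard sqlite3 module. Only the table names log_raw and log_agg_hst and the DB_GARBAGE_TIME idea come from the description above; the column layout, the function name and the bucket logic are assumptions for illustration and do not reflect the real schema or the serverusage_dbagg.sh script.

```python
import sqlite3

# Hypothetical, simplified schema: the real tables hold more fields.
SCHEMA = """
CREATE TABLE IF NOT EXISTS log_raw (
    ts INTEGER NOT NULL, host TEXT NOT NULL, username TEXT NOT NULL,
    cpu_ticks INTEGER NOT NULL, io_bytes INTEGER NOT NULL);
CREATE TABLE IF NOT EXISTS log_agg_hst (
    bucket_start INTEGER NOT NULL, host TEXT NOT NULL, username TEXT NOT NULL,
    cpu_ticks INTEGER NOT NULL, io_bytes INTEGER NOT NULL);
"""


def aggregate(conn, now, bucket=300, garbage_time=86400):
    """Roll completed time buckets from log_raw into log_agg_hst, then clean up.

    now, bucket and garbage_time are in seconds (now is a UTC Unix timestamp).
    """
    cutoff = now - (now % bucket)  # start of the current, still incomplete bucket
    with conn:  # single transaction: committed on success, rolled back on error
        conn.execute(
            """INSERT INTO log_agg_hst
                   (bucket_start, host, username, cpu_ticks, io_bytes)
               SELECT ts - (ts % ?), host, username,
                      SUM(cpu_ticks), SUM(io_bytes)
               FROM log_raw
               WHERE ts < ?
               GROUP BY ts - (ts % ?), host, username""",
            (bucket, cutoff, bucket))
        # Aggregated rows are removed from the raw table straight away.
        conn.execute("DELETE FROM log_raw WHERE ts < ?", (cutoff,))
        # Aggregated rows older than the retention window are dropped.
        conn.execute("DELETE FROM log_agg_hst WHERE bucket_start < ?",
                     (now - garbage_time,))
```

Running the three statements inside one transaction means a failure halfway through cannot leave rows both aggregated and still present in log_raw, which would cause double counting on the next run.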

To start the service at boot you can use the following command:

# chkconfig serverusage_server on

To extract formatted information from the SQLite database you can use the serverusage_api.php script. This file is installed by default in the /var/www/serverusage directory, so you have to configure Apache/PHP accordingly or move the script to another location.

The serverusage_api.php script allows you to extract filtered information in various formats: JSON (JavaScript Object Notation), CSV (tab-separated text values), Base64-encoded serialized array or SVG (Scalable Vector Graphics). In the same directory as serverusage_api.php you can find an example HTML file that displays an auto-updating graph using the PHP API.

Notes on Performance and Limitations

The compilation options of the SystemTap module are set by default to handle a maximum of one million lines per minute. This value is large enough for almost all use cases, but you can change it by setting the MAXACTION and MAXMAPENTRIES parameters of the stap command.

The transmission speed between the client and server is limited by the network bandwidth and quality.

The processing capacity of the Log server is limited by the hardware characteristics of the server.

On a virtual machine running CentOS 6.2 with 2 virtual processors and 4GB of RAM, I successfully sent and processed more than 110,000 lines per second.
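
As a rough back-of-the-envelope check, the two figures quoted in this section can be combined to estimate how many clients, each producing the default maximum of one million lines per minute, such a server could absorb. The function name is hypothetical, and the result is only a lower-bound estimate that ignores network limits and aggregation overhead.

```python
CLIENT_MAX_LINES_PER_MINUTE = 1_000_000  # default SystemTap module limit quoted above
SERVER_LINES_PER_SECOND = 110_000        # throughput measured on the CentOS 6.2 VM


def saturated_clients_supported(server_lines_per_second, client_lines_per_minute):
    """Return how many clients running at their maximum rate the server could handle."""
    server_lines_per_minute = server_lines_per_second * 60
    return server_lines_per_minute / client_lines_per_minute


estimate = saturated_clients_supported(SERVER_LINES_PER_SECOND,
                                       CLIENT_MAX_LINES_PER_MINUTE)
print(f"{estimate:.1f} clients at full rate")
```

With these numbers the server processes 6.6 million lines per minute, so it could keep up with about six clients running at the default limit; real hosts will usually generate far fewer lines than that.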

Since this project is still at an experimental stage, I invite you to try it and leave your comments and suggestions here to help develop the project further.

I have a few different clients hosted on a single machine, and I need a singular way to split the traffic for web, email and ssh/sftp usage. Web and email I can use apache and postfix logs for, but sftp logs I’m not sure how.

What made you code the TCP communications yourself? Why not use a proven software stack like ZeroMQ or similar? With added benefits like telling the server to process all new messages in the queue, I would have thought it would make a better solution, along with the concurrency and asynchronous benefits.