Over the past several years, cloud-based technologies have become very popular. Services now range from cloud infrastructure to entire cloud-based platforms for testing and development. Most cloud-based services are convenient and reliable, and we often see media reports about big companies moving their infrastructure to the cloud. Unfortunately, cloud-based services have certain disadvantages, such as:

No self-management and no full control over servers and services.

Communication problems between the customer and the service's support team.

Subscription-based service.

High cost of cloud-based technologies.

The most important point is the last one. The cost of cloud-based services has been growing geometrically. New monetization models keep being invented, and the companies that invested in cloud development are starting to get their money back.

The problem is particularly acute when you need to process big data, for example, to collect and analyze log files for the whole application stack in a company, such as:

Server logs.

Mobile app logs and mobile crashes.

JavaScript errors.

In this article, we will show you how to install and set up a standalone solution for log collection and analysis and save resources.

LogPacker can collect and analyze any type of log file on any platform. It integrates directly with most of them, which means the service automatically recognizes the log file types and collects, aggregates, and analyzes them.

Let’s consider the main steps of the process for log collection and analysis:

Infrastructure setup.

Cluster setup for log file collection and analysis.

Agents setup on servers.

LogPacker API setup.

Notification setup.

We will go over every point in detail, starting with architecture setup for log storing and analysis.

Infrastructure Setup

A LogPacker Cluster is a set of servers with LogPacker installed, combined in one network. It can consist of several linked LogPacker servers, or it can be a standalone application installed on a single Linux server. Having several nodes in a cluster allows load balancing and saving logs concurrently to several types of storage.

We provide a free license with a limit of five servers; it means that you can build a cluster of five nodes, which can handle any load for free.

First, let’s see how to set up a cluster of two servers. After registration, you will be able to download the console daemon application. It has to be installed on both servers. You can install it from an rpm or deb package, or manually from the tar archive.

Let’s look at each installation method in detail:

RPM installation:

sudo rpm -ihv logpacker-1.0_27_10_2015_03_45_30-1.x86_64.rpm

The daemon will be installed to the /opt/logpacker folder.

DEB installation:

sudo dpkg -i logpacker_1.0-27-10-2015_03-45-30_amd64.deb

Installing from the .tar archive lets you place the LogPacker daemon in any directory and run several daemons on one machine if necessary.
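If you script the installation across several servers, the choice between the two package commands can be made from the file extension. The sketch below only prints the command rather than running it; the function name install_cmd is hypothetical, and the rpm and dpkg invocations are the ones shown above.

```shell
# Print the install command for a LogPacker package, chosen by file extension.
# Nothing is executed here, so the output can be reviewed before running it.
install_cmd() {
    pkg=$1
    case "$pkg" in
        *.rpm) printf 'sudo rpm -ihv %s\n' "$pkg" ;;
        *.deb) printf 'sudo dpkg -i %s\n' "$pkg" ;;
        *)     printf 'unsupported package type: %s\n' "$pkg" >&2; return 1 ;;
    esac
}
```

For example, install_cmd logpacker_1.0-27-10-2015_03-45-30_amd64.deb prints the dpkg command for that file.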

Before running the service, you need to set up log storage. The LogPacker server supports the following storage types: file, Elasticsearch, MySQL, PostgreSQL, MongoDB, HBase, InfluxDB, Kafka, Tarantool, and Memcached. We will set up the servers to write concurrently to two of them: Memcached and PostgreSQL. Memcached can be used as a “hot cache” for storing fresh logs.

The cluster of two servers is ready to go. You can use it not only for log storage and aggregation: at the same time, the service can collect logs from the cluster servers themselves and save them in PostgreSQL and Memcached. For that purpose, add the --agent flag to the command in Supervisord.

command=/opt/logpacker/logpacker_daemon --agent --server

Agent Setup

Now let’s turn to setting up agents on servers. The logpacker_daemon binary can be run in three modes: Agent only, Server only, or both in parallel. This is an example of starting the Agent under Supervisord:
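As a minimal sketch, the helper below prints a Supervisord program section for each of the three modes. It rests on assumptions: that --agent alone starts only the Agent and --server alone starts only the Server (inferred from the combined command shown earlier), and that the daemon lives in /opt/logpacker. The function name supervisor_stanza, the program name logpacker, and the autostart and autorestart settings are illustrative choices, not LogPacker requirements.

```shell
# Print a Supervisord [program:x] section for one of the three run modes.
# Assumption: --agent alone runs only the Agent, --server alone only the Server.
supervisor_stanza() {
    case "$1" in
        agent)  flags="--agent" ;;
        server) flags="--server" ;;
        both)   flags="--agent --server" ;;
        *)      echo "usage: supervisor_stanza agent|server|both" >&2; return 1 ;;
    esac
    # The program name and the restart settings are illustrative.
    cat <<EOF
[program:logpacker]
command=/opt/logpacker/logpacker_daemon $flags
autostart=true
autorestart=true
EOF
}
```

Running supervisor_stanza agent and saving the output into a file in Supervisord's include directory would give an agent-only program definition; the exact directory depends on your Supervisord installation.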

You can run as many agents as your license allows. One Agent on every server connected to the LogPacker cluster is enough. Every Agent has its own configuration file (configs/agent.ini), where you need to specify the cluster address (in our case, server1:9999).

networkapi=server1:9999
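Before starting an Agent, it can help to confirm that the address in its config is the one you expect. The sketch below reads the networkapi value and splits it into host and port. It assumes the file contains a line of the form networkapi=host:port, as in the example above; the function names are hypothetical.

```shell
# Read the cluster address from an agent config and split it into host and port.
# Assumption: the file contains a line of the form networkapi=host:port.
cluster_addr() {
    # Print the value of the first networkapi= line in the file given as $1.
    sed -n 's/^networkapi=//p' "$1" | head -n 1
}

cluster_host() { cluster_addr "$1" | cut -d: -f1; }
cluster_port() { cluster_addr "$1" | cut -d: -f2; }
```

Where a netcat that supports the -z option is available, nc -z "$(cluster_host configs/agent.ini)" "$(cluster_port configs/agent.ini)" gives a quick check that the cluster port is reachable from the agent server.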

Agent fault tolerance is achieved through the Cluster Healthcheck and by keeping data in internal temporary memory if the cluster fails.

The Agent can collect logs of all supported services out of the box. You can add or remove services in configs/agent.ini:

Every service has its own settings in configs/services.ini that specify where its log files are located. Let’s set up the Agent to collect the logs of your application, which are stored in an arbitrary directory.

Let’s add a new service with paths to its logs in configs/services.ini (wildcards are supported).
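Before putting a wildcard path into the config, it is worth checking what it matches on the server. The sketch below counts the files a pattern matches using the shell's own pathname expansion. How LogPacker itself expands wildcards is not described here, so treat this as a rough check only; the function name count_matches is hypothetical, and the pattern must not contain spaces.

```shell
# Count the files that a wildcard pattern currently matches.
# The shell expands the pattern here; LogPacker's own matching may differ.
count_matches() {
    pattern=$1
    n=0
    # $pattern is left unquoted on purpose so that the shell expands it.
    for f in $pattern; do
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}
```

For example, count_matches "/var/www/myapp/logs/*.log" (a hypothetical path) prints the number of matching log files, and prints 0 if the pattern matches nothing.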

LogPacker API Setup

As the data source, you need to specify the storage that the LogPacker Cluster uses for saving logs. The API can read data from only one storage at a time, but you can also specify “backup” storages in case the main one fails.

Let’s read data from Memcached, keeping PostgreSQL as the backup, in configs/api.ini:

Notification Setup

By default, the LogPacker Cluster sends all received and processed fatal errors to your account's email address once an hour. This works only if sendmail is installed locally on your server.
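Since the default notifications depend on a local sendmail, a quick check that the binary exists can save debugging time later. The sketch below looks on the PATH first and then in /usr/sbin and /usr/lib, which are common locations that are often missing from a non-root user's PATH. These directories and the function name find_binary are assumptions, not LogPacker requirements.

```shell
# Print the location of a binary, searching PATH and two common system directories.
# Returns non-zero if the binary is not found.
find_binary() {
    name=$1
    if command -v "$name" >/dev/null 2>&1; then
        command -v "$name"
        return 0
    fi
    # Fallback directories are common defaults, not LogPacker requirements.
    for dir in /usr/sbin /usr/lib; do
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            echo "$dir/$name"
            return 0
        fi
    done
    return 1
}
```

A typical use is find_binary sendmail || echo "sendmail not found" >&2, run on each cluster server.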

The service also supports the following types of notifications:

Sendmail (by default)

Slack

SMTP

Twilio SMS

Message intervals and levels can be set in the configs/notify.ini file:

providers=sendmail
interval=3600
levels=Fatal,Error
tags=*
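To change how often notifications are sent, only the interval line needs to be edited. The sketch below rewrites that line in place. It assumes the interval is given in seconds, which is consistent with 3600 and the hourly default; the function name set_interval is hypothetical.

```shell
# Rewrite the interval= line of a notify config to a new value.
# Assumption: the interval is in seconds (3600 matches the hourly default).
set_interval() {
    file=$1
    seconds=$2
    tmp=$(mktemp) || return 1
    # Replace only the interval line; every other line passes through unchanged.
    sed "s/^interval=.*/interval=$seconds/" "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Write back through cat so the original file keeps its owner and permissions.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

For example, set_interval configs/notify.ini 900 would switch to a notification every 15 minutes, assuming the daemon picks up the changed file.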

As a result, in a short time we have set up a log collection and analysis system that meets all the requirements. With a little more effort, we can extend it with services for analyzing JS errors and log files from mobile apps. Setting up and maintaining the system is quite easy.

The system can be easily scaled by adding new nodes to the cluster. Resource consumption can be minimized both on the agent servers and on the LogPacker Cluster itself.

Processing big data requires a lot of resources, and data loss can become a huge problem for a company. LogPacker saves resources at every system level and provides a fault-tolerant and reliable solution for log collection and analysis.