Tag: Moloch

Usage possibilities of Moloch

Author : Tomáš Mokoš

Moloch offers many distinct usage possibilities, the set of which is not limited to the ones mentioned down below and can be expanded by individual users, provided they can find other applications of this service:

Access Intelligence – Helps with analysis of authorized/non-authorized access to system resources, applications, servers, system operation and different functions. You can also perform depth analysis (with the use of tagging) of a particular system, application or service running in the network

Components

Moloch consists of three components:

Elasticsearch – search engine powering the Moloch system. It is distributed under the terms of Apache license. Requests are handled using HTTP and results are returned in JSON file format. Elasticsearch supports database sharding, making it fast and scalable.

Capture – C language based application for real-time network traffic monitoring. Captured data is written to disk in PCAP format. Alternatively, it can be used to import PCAP files for analysis and archiving manually through command line. The application analyzes protocols of OSI layers three through seven and creates SPI data which it sends to the Elasticsearch cluster for indexing.

Viewer – The viewer uses a number of node.js tools. Node.js is an event-based, server-side Javascrip platform with its own HTTP and JSON communication. Viewer runs on each device with running Capture module and it provides a web UI for searching, displaying and exporting of PCAP files. GUI/API calls are carried out using URIs, enabling integration with security information and event management (SIEM) systems, consoles or command line for PCAP file obtaining.

Architecture

All the components can be located and run on a single node, however this is not recommended for processing of larger data flows. Whether the data flow is too large can be determined by requests taking too long to respond, in that case, transition to multi-node architecture is advised. The individual components have distinct requirements, Capture requires large amounts of disk space to store received PCAP files, by contrast, Elasticsearch requires large amount of RAM for idexing and searching. The viewer has the smallest requirements of the three, allowing it to be located anywhere.

Considering the possibility of packet loss at high traffic flows, it is recommended for the packet capture interface to NOT be the same as the interface connected to the internet, in this case, the interface assigned with static IP address. On the server in our lab there are two interfaces, one for packet capture and one for “outside” communication. To prevent packet loss, it is recommended to increase the Moloch-side interface’s buffer to maximum and turn off most of the NIC’s services by using the following commands:

You can find out the maximum buffer size using the ethool -g command, to check NIC’s services use the ethtool -k command. Disable most of NIC’s services, since you want to capture network traffic instead of what the OS can see they are not going to be used anyway.

Hardware Requirements

The architecture of Moloch enables it to be distributed on multiple devices. For small networks, demonstrations or home deployment it is possible to host all the tools necessary on a single device; however, for capturing large volumes of data at high transfer rates it is recommended not to run Capture and Elasticsearch on the same machine. Moloch allows for software demo version testing directly on the website. In case of storage space shortage, Moloch replaces the oldest data with the new. Moloch can also perform replications, effectively doubling storage space usage. We advise to thoroughly think through the use of this feature.

Elasticsearch and amount of nodes

Considering the fact that the formulas that we used to calculate for how many days can Moloch archive network traffic and what hardware should we use were only approximate, we have decided to measure some statistics to help us clear up these values.

From the Elasticsearch node quantity calculation formula: ¼ * [average network traffic in Gbit/s] * [number of days to be archived], we get that at 2 Mbit/s, one node should suffice.

In our topology, the server running Moloch was connected to a 100Mbps switch, therefore, even though the generated network traffic reached 140Mbps, the flow was subsequently limited on switch.

Single source to single destination test

At first, while generating packets with a generated IP address from cloud to a lab PC, we’ve had a problem with cloud’s security policies. These policies prevented sending of packets with source IP address different from the one assigned to the hosting cloud instance, therefore we have only generated traffic from a single source IP address to a single destination IP address.