Moloch – Components and architecture

Components

Moloch consists of three components:

Elasticsearch – search engine powering the Moloch system. It is distributed under the terms of Apache license. Requests are handled using HTTP and results are returned in JSON file format. Elasticsearch supports database sharding, making it fast and scalable.

Capture – C language based application for real-time network traffic monitoring. Captured data is written to disk in PCAP format. Alternatively, it can be used to import PCAP files for analysis and archiving manually through command line. The application analyzes protocols of OSI layers three through seven and creates SPI data which it sends to the Elasticsearch cluster for indexing.

Viewer – The viewer uses a number of node.js tools. Node.js is an event-based, server-side Javascrip platform with its own HTTP and JSON communication. Viewer runs on each device with running Capture module and it provides a web UI for searching, displaying and exporting of PCAP files. GUI/API calls are carried out using URIs, enabling integration with security information and event management (SIEM) systems, consoles or command line for PCAP file obtaining.

Architecture

All the components can be located and run on a single node, however this is not recommended for processing of larger data flows. Whether the data flow is too large can be determined by requests taking too long to respond, in that case, transition to multi-node architecture is advised. The individual components have distinct requirements, Capture requires large amounts of disk space to store received PCAP files, by contrast, Elasticsearch requires large amount of RAM for idexing and searching. The viewer has the smallest requirements of the three, allowing it to be located anywhere.

Moloch can be easily scaled to multiple nodes for Capture and Elasticsearch components. One or several instances of Capture can run on a single or multiple nodes, while sending data to the Elasticsearch database. Similarly, single one or multiple instances of Elasticsearch can run on either one or several nodes to increase the amount of RAM capacity for indexing. This architecture type is therefore recommended for data flow capture and real-time indexing.