Bulk Loading CSV Threat Intelligence Information

Metron is designed to work with STIX/TAXII threat feeds, but it can also be bulk loaded
with threat data from a CSV file. Comma-separated values (CSV) is a simple file format used to
store tabular data, such as the contents of a spreadsheet or database table. Files in the CSV
format can be imported to and exported from programs that store data in tables.

The shell script $METRON_HOME/bin/flatfile_loader.sh reads data from the
local disk and loads the threat intelligence data into an HBase table. This loader uses the
special configuration parameter inputFormatHandler to specify how the input data is divided
into units for the Extractor. The two implementations are BY_LINE and
org.apache.metron.dataloads.extractor.inputformat.WholeFileFormat. The
default is BY_LINE, which makes sense for a CSV file in which each line
is one unit of information to be imported. However, if you are importing a set of STIX
documents, then you want each whole document to be passed to the Extractor as a single input,
which is the case WholeFileFormat covers.
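
To make the role of inputFormatHandler more concrete, the following sketch writes a small extractor configuration for a CSV source. The file name extractor_config.json, the type value mock_threat_feed, and the column names are hypothetical. The field layout (config, columns, indicator_column, type, separator, extractor) follows the commonly documented Metron CSV extractor format, and placing inputFormatHandler at the top level is an assumption, so verify both against the documentation for your version.

```shell
# Write a hypothetical extractor configuration to the current directory.
# ASSUMPTION: inputFormatHandler is a top-level key; BY_LINE is the default,
# so this line could be omitted for a line-oriented CSV file.
cat > extractor_config.json <<'EOF'
{
  "config": {
    "columns": { "domain": 0, "source": 1 },
    "indicator_column": "domain",
    "type": "mock_threat_feed",
    "separator": ","
  },
  "extractor": "CSV",
  "inputFormatHandler": "BY_LINE"
}
EOF
```

For a set of STIX documents, the same key would instead carry the WholeFileFormat class name given above.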

OPTIONAL: Create a Mock CSV Threat Intel Feed Source

In this example, we create a mock CSV enrichment source for Hortonworks Cybersecurity Platform (HCP). In your production environment, you will want to use a genuine enrichment source.
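
A mock source can be as small as a few lines of text. The sketch below assumes a two-column layout (a domain indicator followed by a source label). The file name mock_threat_feed.csv and the example.com-style domains are placeholders, not real threat data.

```shell
# Create a two-column mock feed: indicator (domain), then a source label.
# All values are placeholders chosen for illustration.
cat > mock_threat_feed.csv <<'EOF'
malicious.example.com,mock_feed
phishing.example.net,mock_feed
botnet.example.org,mock_feed
EOF

# Show how many records the mock feed contains.
wc -l < mock_threat_feed.csv
```

Because each line is one record, a file like this fits the default BY_LINE handling.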

Run the Threat Intelligence Loader

Now that you have defined the threat intel source, threat intel extractor, and threat intel mapping configuration, you must run the loader to move the data from the enrichment source to the HCP enrichment store (HBase) and store the enrichment configuration in ZooKeeper.
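
A loader invocation might look like the sketch below. The file names, the HBase table name threatintel, and the column family t are assumptions for illustration. The flags shown (-i input, -t table, -c column family, -e extractor configuration, -n enrichment configuration) are the ones commonly documented for flatfile_loader.sh, but confirm them with the script's help output for your installation.

```shell
# Path to the loader; requires METRON_HOME to be set in your environment.
LOADER="${METRON_HOME:-}/bin/flatfile_loader.sh"

# Hypothetical inputs; replace with your own files and HBase settings.
INPUT_CSV="mock_threat_feed.csv"
EXTRACTOR_CONFIG="extractor_config.json"
ENRICHMENT_CONFIG="enrichment_config.json"
HBASE_TABLE="threatintel"   # assumed table name
HBASE_CF="t"                # assumed column family

if [ -x "$LOADER" ]; then
  # Load the CSV into HBase and push the enrichment configuration to ZooKeeper.
  "$LOADER" -i "$INPUT_CSV" -t "$HBASE_TABLE" -c "$HBASE_CF" \
            -e "$EXTRACTOR_CONFIG" -n "$ENRICHMENT_CONFIG"
else
  # Fall through with a message instead of failing when the loader is absent.
  echo "Loader not found at $LOADER; set METRON_HOME first." >&2
fi
```

The guard around the call keeps the snippet harmless on a machine without the platform installed.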