Letter Salad, or Salad for short, is an efficient and flexible implementation of the well-known anomaly detection method Anagram by Wang et al. (RAID 2006) and provides various extensions to it.

Salad is based on n-gram models, that is, data is represented as all its substrings of length n. During training, these n-grams are stored in a Bloom filter. This enables the detector to represent a large number of n-grams in little memory while still accessing the data efficiently. Salad extends Anagram by supporting various n-gram types, a 2-class version of the detector for classification, and various model analysis modes.
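The idea of storing n-grams in a Bloom filter can be illustrated with a minimal Python sketch. This is not Salad's actual code or API: the names ngrams, BloomFilter, train and anomaly_score, the filter size, the number of hash functions and the use of salted BLAKE2 hashes are all assumptions made for illustration. The score shown is the fraction of n-grams not seen during training, in the spirit of Anagram.

```python
import hashlib


def ngrams(data: bytes, n: int):
    """Yield every substring of length n (byte n-gram) of data."""
    for i in range(len(data) - n + 1):
        yield data[i:i + n]


class BloomFilter:
    """Tiny Bloom filter; size and hash count are illustrative assumptions."""

    def __init__(self, num_bits: int = 1 << 16, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: bytes):
        # Derive several bit positions by salting a single hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(
                item, digest_size=8, salt=bytes([seed])
            ).digest()
            yield int.from_bytes(digest, "big") % self.num_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: bytes) -> bool:
        # True if all bits are set; false positives are possible,
        # false negatives are not.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def train(samples, n: int = 3) -> BloomFilter:
    """Insert the n-grams of all training samples into a Bloom filter."""
    model = BloomFilter()
    for sample in samples:
        for gram in ngrams(sample, n):
            model.add(gram)
    return model


def anomaly_score(model: BloomFilter, data: bytes, n: int = 3) -> float:
    """Fraction of n-grams in data that the model has not seen."""
    grams = list(ngrams(data, n))
    if not grams:
        return 0.0
    unseen = sum(1 for gram in grams if gram not in model)
    return unseen / len(grams)


if __name__ == "__main__":
    model = train([b"GET /index.html HTTP/1.1"])
    print(anomaly_score(model, b"GET /index.html HTTP/1.1"))
    print(anomaly_score(model, b"\x90\x90\x90\x90"))
```

Because a Bloom filter never reports a stored item as missing, data seen during training always scores 0.0 in this sketch. Unseen data may score slightly too low if some of its n-grams collide with stored ones.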

Changes to previous version:

Lots and lots of cool new features and bugfixes ;)

Refinements to the user interface:
This includes a progress indicator, colors, etc.

Determine the expected error (salad-inspect)

Allow the user to echo the parameters in use:
salad [train|predict|inspect] --echo-params

Allow setting the input batch size as a program argument:
salad [train|predict|inspect] --batch-size

Support for processing network dumps and for capturing packets and streams directly from network interfaces. Furthermore, we integrated unit tests, established a logging infrastructure for more consistent output, and fixed various bugs.