Details

MOA (Massive On-line Analysis) is a framework for data stream mining. It includes tools for evaluation and a collection of machine learning algorithms. MOA is related to the WEKA project and is likewise written in Java, but it is designed to scale to more demanding problems. The goal of MOA is to serve as a benchmark framework for running experiments in the data stream mining context by providing

a set of existing algorithms and measures from the literature for comparison and

an easily extendable framework for new streams, algorithms and evaluation methods.

Using MOA

The workflow in MOA follows the simple schema depicted below: first, a data stream (feed, generator) is chosen and configured; second, an algorithm (e.g. a classifier) is chosen and its parameters are set; third, the evaluation method or measure is chosen; and finally, the results are obtained after running the task.
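The four steps can also be illustrated in code. The following is a minimal, self-contained sketch and does not use MOA's own classes: every name in it (WorkflowSketch, ThresholdStream, NearestMeanLearner, AccuracyMeasure, LabeledExample) is hypothetical, and the test-then-train loop is just one assumed way of combining a stream, a learner and a measure.

```java
import java.util.Random;

// Step 4: the task wires stream, learner and measure together and runs them.
final class WorkflowSketch {
    static double run(long seed, int numInstances) {
        ThresholdStream stream = new ThresholdStream(seed, 0.5); // step 1: configure the stream
        NearestMeanLearner learner = new NearestMeanLearner();   // step 2: choose the learner
        AccuracyMeasure accuracy = new AccuracyMeasure();        // step 3: choose the measure
        for (int i = 0; i < numInstances; i++) {
            LabeledExample example = stream.next();
            accuracy.add(learner.predict(example.x), example.label); // test first ...
            learner.train(example);                                   // ... then train
        }
        return accuracy.value();
    }

    public static void main(String[] args) {
        System.out.println("accuracy = " + run(42L, 10_000));
    }
}

// Hypothetical labelled example: one numeric attribute and a binary class label.
final class LabeledExample {
    final double x;
    final int label;

    LabeledExample(double x, int label) {
        this.x = x;
        this.label = label;
    }
}

// Hypothetical generator: the label is 1 when x exceeds the configured threshold.
final class ThresholdStream {
    private final Random random;
    private final double threshold;

    ThresholdStream(long seed, double threshold) {
        this.random = new Random(seed);
        this.threshold = threshold;
    }

    LabeledExample next() {
        double x = random.nextDouble();
        return new LabeledExample(x, x > threshold ? 1 : 0);
    }
}

// Hypothetical incremental learner: predicts the class whose running mean is nearest.
final class NearestMeanLearner {
    private final double[] sum = new double[2];
    private final int[] count = new int[2];

    void train(LabeledExample example) {
        sum[example.label] += example.x;
        count[example.label]++;
    }

    int predict(double x) {
        if (count[1] == 0) return 0; // class 1 not seen yet (also covers the untrained case)
        if (count[0] == 0) return 1; // class 0 not seen yet
        double distanceTo0 = Math.abs(x - sum[0] / count[0]);
        double distanceTo1 = Math.abs(x - sum[1] / count[1]);
        return distanceTo1 < distanceTo0 ? 1 : 0;
    }
}

// Hypothetical measure: fraction of correct predictions seen so far.
final class AccuracyMeasure {
    private long seen;
    private long correct;

    void add(int predicted, int actual) {
        seen++;
        if (predicted == actual) correct++;
    }

    double value() {
        return seen == 0 ? 0.0 : (double) correct / seen;
    }
}
```

Each of the three components can be swapped independently, which is the property the schema is meant to convey.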

To run an experiment using MOA, the user can choose between a graphical user interface (GUI) and command-line execution. Users should probably start by watching the demo video (see downloads) or by downloading the software and trying it on an example. Developers can easily extend all three parts of the above architecture to include and test new methods.

Stream Clustering

Outlier Detection

Recommender Systems

Extending MOA

Here we give a short example of how to extend MOA with a new learning algorithm. New methods are added to the framework via reflection at start-up.

To add a new stream classifier algorithm, implement the Classifier.java interface with the following three main methods:

void resetLearningImpl(): a method for initializing a classifier learner

void trainOnInstanceImpl(Instance): a method to train on a new instance

double[] getVotesForInstance(Instance): a method to obtain the prediction result
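To show how the three classifier methods fit together, here is a minimal sketch of a majority-class learner. It is deliberately self-contained: StreamClassifierSketch and InstanceSketch are hypothetical stand-ins for MOA's Classifier and Instance types, not the real ones, and a real MOA classifier would be written against MOA's own types and may need further methods beyond these three.

```java
import java.util.Arrays;

final class ClassifierSketchDemo {
    public static void main(String[] args) {
        MajorityClassSketch learner = new MajorityClassSketch(3);
        int[] labels = {0, 1, 1};
        for (int label : labels) {
            learner.trainOnInstanceImpl(new InstanceSketch(new double[] {1.0}, label));
        }
        double[] votes = learner.getVotesForInstance(new InstanceSketch(new double[] {1.0}, 0));
        System.out.println(Arrays.toString(votes)); // prints [1.0, 2.0, 0.0]
    }
}

// Hypothetical stand-in for MOA's Instance: attribute values plus a class label.
final class InstanceSketch {
    final double[] attributes;
    final int classValue;

    InstanceSketch(double[] attributes, int classValue) {
        this.attributes = attributes;
        this.classValue = classValue;
    }
}

// Hypothetical interface mirroring the three method names listed above.
interface StreamClassifierSketch {
    void resetLearningImpl();
    void trainOnInstanceImpl(InstanceSketch instance);
    double[] getVotesForInstance(InstanceSketch instance);
}

// Counts how often each class was seen; the votes are those counts.
final class MajorityClassSketch implements StreamClassifierSketch {
    private final int numClasses;
    private double[] classCounts;

    MajorityClassSketch(int numClasses) {
        this.numClasses = numClasses;
        resetLearningImpl();
    }

    @Override
    public void resetLearningImpl() {
        // Forget everything learned so far.
        classCounts = new double[numClasses];
    }

    @Override
    public void trainOnInstanceImpl(InstanceSketch instance) {
        // One pass, constant memory: only the class counter is updated.
        classCounts[instance.classValue] += 1.0;
    }

    @Override
    public double[] getVotesForInstance(InstanceSketch instance) {
        // The attributes are ignored; return a copy so callers cannot modify the model.
        return Arrays.copyOf(classCounts, classCounts.length);
    }
}
```

The predicted class is the index with the largest vote, here class 1.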

To add a new stream clustering algorithm, implement the Clusterer.java interface with the following three main methods:

void resetLearningImpl(): a method for initializing a clusterer learner

void trainOnInstanceImpl(Instance): a method to train on a new instance

Clustering getClusteringResult(): a method to obtain the current clustering result for evaluation or visualization
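The clusterer methods can be sketched in the same way. The example below uses a simple sequential k-means update, chosen here only for illustration. StreamClustererSketch is a hypothetical interface, instances are simplified to plain double[] points, and the clustering result is simplified to a list of cluster centers rather than MOA's Clustering type.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ClustererSketchDemo {
    public static void main(String[] args) {
        SequentialKMeansSketch clusterer = new SequentialKMeansSketch(2);
        double[][] stream = {{0.0, 0.1}, {10.0, 9.9}, {0.2, -0.1}, {9.8, 10.1}};
        for (double[] point : stream) {
            clusterer.trainOnInstanceImpl(point);
        }
        for (double[] center : clusterer.getClusteringResult()) {
            System.out.println(Arrays.toString(center));
        }
    }
}

// Hypothetical interface mirroring the three method names listed above.
interface StreamClustererSketch {
    void resetLearningImpl();
    void trainOnInstanceImpl(double[] point);
    List<double[]> getClusteringResult();
}

// Sequential k-means: the first k points seed the centers, and every later
// point moves its nearest center towards itself by a step of 1/n.
final class SequentialKMeansSketch implements StreamClustererSketch {
    private final int k;
    private List<double[]> centers;
    private int[] counts;

    SequentialKMeansSketch(int k) {
        this.k = k;
        resetLearningImpl();
    }

    @Override
    public void resetLearningImpl() {
        centers = new ArrayList<>();
        counts = new int[k];
    }

    @Override
    public void trainOnInstanceImpl(double[] point) {
        if (centers.size() < k) {
            // Still seeding: the point itself becomes a new center.
            counts[centers.size()] = 1;
            centers.add(point.clone());
            return;
        }
        int nearest = 0;
        double best = Double.POSITIVE_INFINITY;
        for (int i = 0; i < centers.size(); i++) {
            double distance = squaredDistance(centers.get(i), point);
            if (distance < best) {
                best = distance;
                nearest = i;
            }
        }
        counts[nearest]++;
        double[] center = centers.get(nearest);
        for (int j = 0; j < center.length; j++) {
            // Incremental mean update of the nearest center.
            center[j] += (point[j] - center[j]) / counts[nearest];
        }
    }

    @Override
    public List<double[]> getClusteringResult() {
        // Return copies so evaluation or visualization code cannot alter the model.
        List<double[]> copy = new ArrayList<>();
        for (double[] center : centers) {
            copy.add(center.clone());
        }
        return copy;
    }

    private static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int j = 0; j < a.length; j++) {
            double diff = a[j] - b[j];
            sum += diff * diff;
        }
        return sum;
    }
}
```

Because each point is processed once and only the centers and their counts are stored, memory use does not grow with the length of the stream.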

Bi-directional interaction of MOA with WEKA

WEKA classifiers can easily be used from MOA, and MOA classifiers and streams can be used from WEKA.