Apache Mahout

Apache Mahout is a project of the Apache Software Foundation to produce free implementations of distributed or otherwise scalable machine learning algorithms, focused primarily on collaborative filtering, clustering and classification. Many of the implementations use the Apache Hadoop platform. [2] [3] Mahout also provides Java libraries for common math operations and Java primitive collections. Mahout is a work in progress; the number of implemented algorithms has grown quickly, [4] but various algorithms are still missing.

While Mahout's core algorithms for clustering, classification and batch-based collaborative filtering are implemented on top of Apache Hadoop using the map/reduce paradigm, the project does not restrict contributions to Hadoop-based implementations. Contributions that run on a single node or on a non-Hadoop cluster are also welcome. For example, the "Taste" collaborative-filtering recommender component of Mahout was originally a separate project and can run stand-alone without Hadoop.
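To make the map/reduce paradigm more concrete, the following is a minimal single-node sketch in plain Java. It counts how often two items appear together in users' preference lists, a common building block for item-based collaborative filtering. It uses neither the Mahout nor the Hadoop API; the class name CooccurrenceSketch, its method names and the sample data are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration class; not part of Mahout or Hadoop.
public class CooccurrenceSketch {

    // "Map" phase: one user's item list goes in, one key per ordered pair of
    // distinct items comes out (each key implicitly carries a count of 1).
    static List<String> mapUser(List<String> items) {
        List<String> keys = new ArrayList<>();
        for (String a : items) {
            for (String b : items) {
                if (!a.equals(b)) {
                    keys.add(a + "|" + b);
                }
            }
        }
        return keys;
    }

    // "Reduce" phase: sum the emitted counts for each key.
    static Map<String, Integer> reduce(List<String> keys) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String key : keys) {
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    // Runs both phases in sequence on one machine. A distributed framework
    // would instead spread the map calls over many nodes and group the
    // emitted keys before reducing.
    static Map<String, Integer> cooccurrences(Map<String, List<String>> userItems) {
        List<String> emitted = new ArrayList<>();
        for (List<String> items : userItems.values()) {
            emitted.addAll(mapUser(items));
        }
        return reduce(emitted);
    }

    public static void main(String[] args) {
        // Made-up sample preferences.
        Map<String, List<String>> prefs = new LinkedHashMap<>();
        prefs.put("alice", Arrays.asList("book", "film"));
        prefs.put("bob", Arrays.asList("book", "film", "game"));
        System.out.println(cooccurrences(prefs));
    }
}
```

Because each call to mapUser depends only on a single user's data, the map phase can be parallelised freely, which is what makes this style of computation a natural fit for a Hadoop cluster while still being runnable on a single node.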

Starting with release 0.10.0, the project shifted its focus to building a backend-independent programming environment, code-named "Samsara". [5] [6] [7] The environment consists of an algebraic backend-independent optimizer and an algebraic Scala DSL unifying in-memory and distributed algebraic operators. At the time of this writing, the supported algebraic platforms are Apache Spark, H2O and Apache Flink. Support for MapReduce algorithms is being phased out. [8]