Talent Gap

A big issue with Hadoop, especially early on, has been a shortage of Java developers who know how to write MapReduce code. A number of tools have been created to help with this by allowing people with different skill sets to interact with Hadoop.
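To make the skill barrier concrete, here is a rough sketch of the two phases a hand-written word count involves, in plain Python rather than Java. The function names are illustrative only, not a real Hadoop API; a real job would also implement Hadoop's Mapper and Reducer interfaces and handle distributed shuffling.

```python
from collections import defaultdict

def map_phase(lines):
    # Map: emit a (word, 1) pair for every word in the input.
    for line in lines:
        for word in line.split():
            yield (word, 1)

def reduce_phase(pairs):
    # Shuffle/reduce: group the pairs by key and sum the counts.
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

lines = ["big data big tools", "big data"]
print(reduce_phase(map_phase(lines)))  # {'big': 3, 'data': 2, 'tools': 1}
```

The tools below hide exactly this kind of boilerplate, letting users express the same job in a higher-level language or interface.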

Some popular tools include:

Spark is a fast and general engine for large-scale data processing. You will need to know at least one of these languages: Scala, Python, or Java. This one has been getting a lot of attention in the past year or so.

Hive is friendlier for database people, with an SQL-like interface that generates MapReduce code without any Java programming.

Datameer is a proprietary commercial product that sits on top of Hadoop and provides an Excel-like interface for its core features, which opens Hadoop up to people with a variety of backgrounds, including business analyst types.