Before you start developing applications on MapR’s Converged Data Platform, consider how you will get the data onto the platform, the format it will be stored in, the type of processing or modeling that is required, and how the data will be accessed.

A MapR Ecosystem Pack (MEP) provides a set of ecosystem components that work together on one or more MapR cluster versions. Only one version of each ecosystem component is available in each MEP. For example, only one version of Hive and one version of Spark is supported in a MEP.

After you have a basic understanding of Apache Spark and have it installed and running on your MapR cluster, you can use it to load datasets, apply schemas, and query data from the Spark interactive shell.

MapR provides JDBC and ODBC drivers so you can write SQL queries that access the Apache Spark data processing engine. This section provides instructions on how to download the drivers, and install and configure them.