The MapR Data Science Refinery includes a preconfigured Apache Zeppelin notebook, packaged as a Docker container. Apache
Zeppelin is an open source web-based data science notebook. You can use it with MapR components to conduct data discovery,
ETL, machine learning, and data visualization.

To run the Apache Zeppelin container, you must access the Zeppelin Docker image from MapR’s public repository, run
the Docker image, and access the deployed container from your web browser. From your browser, you can create Zeppelin
notebooks.
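As a sketch, the pull-and-run sequence might look like the following. The image tag and all environment-variable values are illustrative assumptions; substitute the current tag from MapR's public repository and your own cluster settings.

```shell
# Pull the Zeppelin image from MapR's public Docker repository.
# (":latest" is a placeholder tag -- check the repository for current tags.)
docker pull maprtech/data-science-refinery:latest

# Run the container. Every value below is an example, not a default:
#   -p 9995:9995                  maps the Zeppelin UI port (HTTPS) to the host
#   HOST_IP                       IP address of the Docker host
#   MAPR_CLUSTER / MAPR_CLDB_HOSTS  identify the MapR cluster to connect to
#   MAPR_CONTAINER_USER/PASSWORD  the login you will use in the Zeppelin UI
docker run -it -p 9995:9995 \
  -e HOST_IP=10.10.1.21 \
  -e MAPR_CLUSTER=my.cluster.com \
  -e MAPR_CLDB_HOSTS=cldb1.example.com \
  -e MAPR_CONTAINER_USER=mapruser \
  -e MAPR_CONTAINER_PASSWORD=secret \
  maprtech/data-science-refinery:latest
```

Once the container is up, the Zeppelin UI is reachable on the mapped port (9995 in this sketch).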

Before you start developing applications on MapR’s Converged Data Platform, consider how you will get the data onto the
platform, the format it will be stored in, the type of processing or modeling that is required, and how the data will
be accessed.

This topic contains sample snippets from YAML files that you can use to run the MapR Data Science Refinery as a Kubernetes service. The samples demonstrate several Kubernetes features: specifying your MapR ticket as a Kubernetes secret, defining environment variables with a ConfigMap, mapping ports to route external traffic, passing the container password as a Kubernetes secret, exposing the Data Science Refinery as a service, and running it as a deployment for high availability.
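For illustration, a Secret carrying the MapR ticket and a NodePort Service in front of the Zeppelin port might be declared as follows. The object names, port numbers, and selector labels are assumptions for this sketch; the authoritative keys and container spec come from the sample YAML files this topic describes.

```yaml
# Illustrative sketch only -- names and values are assumptions.
apiVersion: v1
kind: Secret
metadata:
  name: mapr-ticket-secret        # mounted into the container at startup
type: Opaque
data:
  CONTAINER_TICKET: <base64-encoded MapR ticket>
---
apiVersion: v1
kind: Service
metadata:
  name: zeppelin-svc              # exposes the notebook outside the cluster
spec:
  type: NodePort                  # routes external traffic to the pod
  ports:
    - port: 9995                  # Zeppelin HTTPS port
      targetPort: 9995
  selector:
    app: zeppelin                 # must match the deployment's pod labels
```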

Out of the box, the Apache Zeppelin interpreters on MapR are preconfigured to run against different backend engines. You may need to perform manual steps to configure the Livy, Spark, and JDBC interpreters; the Pig and Shell interpreters require no additional configuration. You can also configure the idle timeout threshold for interpreters.
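For the idle timeout specifically, stock Apache Zeppelin (0.8 and later) exposes a timeout-based lifecycle manager in zeppelin-site.xml. A sketch of the relevant properties follows; the millisecond values shown are examples, not recommendations.

```xml
<!-- Use the timeout lifecycle manager so idle interpreters shut down. -->
<property>
  <name>zeppelin.interpreter.lifecyclemanager.class</name>
  <value>org.apache.zeppelin.interpreter.lifecycle.TimeoutLifecycleManager</value>
</property>
<!-- How often to check for idle interpreters (example: every 60 s). -->
<property>
  <name>zeppelin.interpreter.lifecyclemanager.timeout.checkinterval</name>
  <value>60000</value>
</property>
<!-- Idle threshold after which an interpreter is shut down (example: 1 h). -->
<property>
  <name>zeppelin.interpreter.lifecyclemanager.timeout.threshold</name>
  <value>3600000</value>
</property>
```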

Apache Zeppelin supports the Helium framework. Using Helium visualization packages, you can view your data as area charts, bar charts, scatter charts, and other displays. To use a visualization package, you must enable it through the Helium repository browser in the Zeppelin UI. Like the Zeppelin interpreters, Helium is automatically installed in your Zeppelin container.

This section contains examples of how to use Apache Zeppelin interpreters to access the different backend engines. This
includes running Apache Pig scripts, Apache Drill queries, Apache Hive queries, and Apache Spark jobs, as well as accessing
MapR Database and MapR Event Store For Apache Kafka.
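As a sketch of what such notebook paragraphs look like: each paragraph starts with an interpreter directive, then the code for that engine. The file paths and column names below are made-up examples, and the JDBC prefix shown assumes a Drill-named JDBC configuration; the exact prefix depends on how your JDBC interpreter is set up.

```
%pig
data = LOAD '/tmp/sample.csv' USING PigStorage(',') AS (name:chararray, qty:int);
DUMP data;

%jdbc(drill)
SELECT name, qty FROM dfs.`/tmp/sample.json` LIMIT 10

%livy.spark
val df = spark.read.option("header", "true").csv("/tmp/sample.csv")
df.count()
```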

MapR supports public APIs for MapR Filesystem, MapR Database, and MapR Event Store For Apache Kafka. These APIs are available for application development purposes.

Accessing and Creating Notebooks in Zeppelin

After the Apache Zeppelin Docker image is running, access the Zeppelin notebook
in your browser by specifying the following URL:

https://localhost:9995

This URL loads the Zeppelin notebook's home page. You must specify a secure
(https) URL.

If the Docker image is running on a remote node, such as a MapR edge node,
replace localhost with the host name or IP address of the
remote node. If you specified a different port number in your docker
run command, replace 9995 with your port number.

Log in to Zeppelin using the user name and password you specified in your
docker run command: