Spark start and data load

Now it's time to fire up a Spark cluster, which gives us all the functionality of Spark while also allowing us to use H2O algorithms and visualize our data. As always, we must first download the Spark 2.1 distribution from http://spark.apache.org/downloads.html and declare the execution environment. For example, if you download spark-2.1.1-bin-hadoop2.6.tgz from the Spark download page, you can prepare the environment in the following way: