Spark - RDD Creation

In this blog we will see how to create a standalone Java application that runs on a Spark cluster. We will also learn how to use the spark-submit tool.

We will load a text file from a given path and count the number of records in it. You can create an RDD by parallelizing a collection or by loading an external file. You can also create an RDD by transforming an existing RDD.
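The three ways of creating an RDD mentioned above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the class name RecordCount, the helper countRecords and the use of args[0] as the input path are assumptions made for this example, and it presumes the Spark core library is on the classpath and that the master URL is supplied when the job is launched.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

// Hypothetical class name, chosen only for this sketch.
public class RecordCount {

    // Loads a text file as an RDD of lines and returns the number of records.
    static long countRecords(JavaSparkContext sc, String path) {
        JavaRDD<String> lines = sc.textFile(path);
        return lines.count();
    }

    public static void main(String[] args) {
        // The master URL is deliberately not set here, on the assumption
        // that spark-submit provides it at launch time.
        SparkConf conf = new SparkConf().setAppName("RecordCount");

        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            // 1. Create an RDD by parallelizing an in-memory collection.
            List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);
            JavaRDD<Integer> numbersRdd = sc.parallelize(numbers);

            // 2. Create an RDD by transforming an existing RDD.
            JavaRDD<Integer> evens = numbersRdd.filter(n -> n % 2 == 0);
            System.out.println("Even numbers: " + evens.count());

            // 3. Create an RDD by loading an external text file.
            //    args[0] is assumed to hold the input path.
            System.out.println("Records: " + countRecords(sc, args[0]));
        }
    }
}
```

Keeping the file-loading step in its own method makes it easy to call with a local master while developing, before the jar is submitted to a real cluster.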

application-jar: Path to a bundled jar including your application and all dependencies. The URL must be globally visible inside of your cluster, for instance, an hdfs:// path or a file:// path that is present on all nodes.

application-arguments: Arguments passed to the main method of your main class, if any.
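Since the application arguments arrive as the ordinary String[] of the main method, it can help to validate them before any Spark work starts. The sketch below assumes the application expects a single argument, the input path; the class name ArgumentCheck and the helper inputPath are hypothetical names used only for illustration.

```java
// Hypothetical class name, chosen only for this sketch.
public class ArgumentCheck {

    // Returns the input path taken from the application arguments, or fails
    // with a usage message when the argument is missing or blank.
    static String inputPath(String[] args) {
        if (args.length < 1 || args[0].trim().isEmpty()) {
            throw new IllegalArgumentException("Usage: ArgumentCheck <input-path>");
        }
        return args[0];
    }

    public static void main(String[] args) {
        // Whatever follows the application jar on the spark-submit
        // command line ends up in args.
        System.out.println("Input path: " + inputPath(args));
    }
}
```

Failing early with a clear usage message is usually cheaper than letting the job start on the cluster and fail later on a missing path.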