1. Overview

Nowadays, with the advancement of technology, millions of devices generate data at massive speed.
Organizations across the globe are digging deeper to extract valuable information from that data, which is why data is often called the “new oil”. Apache Spark is a fast, general-purpose engine for large-scale data processing.
In this blog post, we walk through the program most commonly used to introduce distributed computing frameworks:
the Spark WordCount example.
For a big data developer, the WordCount example is the first step in the Spark development journey.
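
For readers who have not seen WordCount before, the computation itself is small: split every line into words, then count how often each word occurs. The following is a minimal local sketch of that logic in plain Java streams. It is not the Spark API and does not run on a cluster; the class name `WordCountSketch` and the whitespace-based splitting rule are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical class name; a local, single-JVM illustration only (no Spark involved).
final class WordCountSketch {

    // Splits each line on whitespace and counts how often each word occurs.
    static Map<String, Long> countWords(List<String> lines) {
        return lines.stream()
                // One line becomes many words (the "flat map" step).
                .flatMap(line -> Arrays.stream(line.trim().split("\\s+")))
                // Drop the empty token produced by blank lines.
                .filter(word -> !word.isEmpty())
                // Group identical words and count them (the "reduce by key" step).
                .collect(Collectors.groupingBy(Function.identity(), TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        // Illustrative lines only; they are not the tutorial's input files.
        List<String> lines = Arrays.asList("to be or not to be", "to count is to know");
        countWords(lines).forEach((word, count) -> System.out.println(word + "\t" + count));
    }
}
```

A distributed framework performs the same two steps, but spreads the lines across many machines and merges the per-word counts between them.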

2. Development environment

3. Sample Input

To experience the power of Spark, the input data should be massive. In our case, however, we use a small input file for learning purposes.
For this tutorial, we use the text files below (UTF-8 format) as input:

5. Build & Run Spark Wordcount Example

We need to pass two arguments to run the program.
The first argument is the input file path and the second argument is the output path.
The output path (folder) must not already exist at that location; Spark will create it for us.
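
The two-argument contract can be checked before a job is submitted, so that a wrong path fails fast with a clear message. The sketch below is one possible pre-flight check, not part of the original program: the class name `WordCountArgs` and its messages are hypothetical, and it assumes both paths are on the local file system (paths on HDFS or another store would need that store's own API).

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper; assumes both paths live on the local file system.
final class WordCountArgs {
    final Path input;
    final Path output;

    private WordCountArgs(Path input, Path output) {
        this.input = input;
        this.output = output;
    }

    // Enforces the rules above: exactly two arguments, a readable input,
    // and an output path that does not exist yet.
    static WordCountArgs parse(String[] args) {
        if (args.length != 2) {
            throw new IllegalArgumentException("Usage: <input-file-path> <output-path>");
        }
        Path input = Paths.get(args[0]);
        Path output = Paths.get(args[1]);
        if (!Files.isReadable(input)) {
            throw new IllegalArgumentException("Input path is not readable: " + input);
        }
        if (Files.exists(output)) {
            throw new IllegalArgumentException("Output path already exists: " + output);
        }
        return new WordCountArgs(input, output);
    }

    public static void main(String[] args) {
        try {
            WordCountArgs parsed = parse(args);
            System.out.println("Input: " + parsed.input + ", output: " + parsed.output);
        } catch (IllegalArgumentException e) {
            // Report the problem instead of starting a job that would fail later.
            System.err.println(e.getMessage());
        }
    }
}
```

Calling `WordCountArgs.parse(args)` at the top of the driver's `main` method would be one way to use it; deleting a stale output folder by hand before a rerun remains the simplest fix when the check fails.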

6. Output (Portion)

Once the job completes successfully, the output will look like the following:
