Tuesday, November 13, 2012

Storing Apache Hadoop WordCount Example Output to Database

The Apache Hadoop WordCount example is the "Hello World" of Hadoop, which makes it an easy way to understand how to sink Hadoop output into a database. The database I used is MySQL, and the DDL for the table is as follows:

CREATE TABLE word_count(word VARCHAR(254), count INT);

With the table in place, I created an Apache Hadoop job, along with a Mapper and a Reducer, to sink the output to the database. For this I use DBOutputFormat as the OutputFormat and DBConfiguration to specify the database connection parameters.
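As a rough illustration of the record side of this setup, here is a minimal sketch of a class holding one row of word_count. It is not the original job from this post. The class names `WordCountRecord` and `WordCountSinkSketch`, the helper `insertSql`, and the connection details in the comments are all assumptions made for illustration. In a real job the record would implement Hadoop's `Writable` and `DBWritable` interfaces; they are left out here so the sketch compiles with the JDK alone.

```java
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Collections;

// One row of the word_count table (hypothetical class name).
// In a real job this would also implement org.apache.hadoop.io.Writable and
// org.apache.hadoop.mapreduce.lib.db.DBWritable; omitted so this compiles without Hadoop.
class WordCountRecord {
    private String word;
    private int count;

    WordCountRecord() {}

    WordCountRecord(String word, int count) {
        this.word = word;
        this.count = count;
    }

    // Same shape as DBWritable.write: bind the fields in column order (word, count).
    public void write(PreparedStatement statement) throws SQLException {
        statement.setString(1, word);
        statement.setInt(2, count);
    }

    // Same shape as DBWritable.readFields: read the fields back in column order.
    public void readFields(ResultSet resultSet) throws SQLException {
        word = resultSet.getString(1);
        count = resultSet.getInt(2);
    }

    String getWord() { return word; }
    int getCount() { return count; }
}

class WordCountSinkSketch {
    // Illustrative helper (an assumption, not a Hadoop API): builds a parameterized
    // INSERT of the general kind a database output format would execute per record.
    static String insertSql(String table, String... fields) {
        String columns = String.join(", ", fields);
        String placeholders = String.join(", ", Collections.nCopies(fields.length, "?"));
        return "INSERT INTO " + table + " (" + columns + ") VALUES (" + placeholders + ")";
    }

    // Driver wiring in a real job (requires Hadoop; URL, user and password are placeholders):
    //   DBConfiguration.configureDB(conf, "com.mysql.jdbc.Driver",
    //       "jdbc:mysql://localhost:3306/test", "user", "password");
    //   job.setOutputFormatClass(DBOutputFormat.class);
    //   DBOutputFormat.setOutput(job, "word_count", "word", "count");
    public static void main(String[] args) {
        System.out.println(insertSql("word_count", "word", "count"));
    }
}
```

With DBOutputFormat, the Reducer emits the record as its output key, and the value is not written to the database, so NullWritable is a common choice for the value type. The field names passed to `DBOutputFormat.setOutput` need to match the order in which `write` binds its parameters.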

17 comments:

I just followed your blog and was able to put the data into the database, as the job was supposed to do, but I now want to read the data back and I am currently facing a problem with it. It would be of great help if you could post a job to retrieve the same data that you put in the DB.

I am a seasoned software engineer with proven experience in Java-based software development. I provide consultation and freelance development with high quality standards. Feel free to contact me: shazin (dot) sadakath (at) gmail.com