Disclaimer:

These are my personal views and are meant for Informational purpose only. Please verify the Information via Professional help or via Official references before acting upon the information provided in this Blog.

2. Create a Hive Table and load the data uploaded in step 1 to the Hive Table

3. Analyze data in Hive via Excel Add-in

Before we begin, I assume you have access to Hadoop on azure, Have your sample data (don’t have one? learn from a blog post), familiar with Hadoop ecosystem and know your way around the Hadoop on Azure Dashboard.

Now, Here are the steps involved:

STEP 1: Upload Twitter Text Data into Hadoop on Azure cluster

1. Have your data to be uploaded ready! I am just going to Copy Paste the File from my host machine to the RDP’ed machine. In this case, the machine that I am going is the Hadoop on Azure cluster.

For the purpose of this blog post, I have a text file having 1500 tweets:

2. Open web browser > Go to your cluster in Hadoop on Azure

3. RDP into your Hadoop on Azure cluster

4. Copy-Paste the File. It’s a small data file so this approach works for now.

Step 2: Create a Hive Table and load the data uploaded in step 1 to the Hive Table

Note that for the purpose of this blog-post, I’ve chose string as data type for all fields. This is something that depends on the data that you have. If I were building a solution, I would spend some more time choosing the right data type.

Step 3. Analyze data in Hive via Excel Add-in

1. Switch to Hadoop on Azure Dashboard

2. Go to the Hive Console and run the show tables to verify that there is a tweetsampletable.

3. Now if you haven’t, Download and Install the Hive ODBC Driver from the Downloads section of your Hadoop on Azure Dashboard.