June 23, 2016

Apache Spark is becoming more ubiquitous by the day and has been dubbed the next big thing in the Big Data world. Spark has been replacing MapReduce thanks to its speed and scalability. In this Spark series, we will try to solve various problems using Spark and Java.

The word count program is the big data equivalent of the classic Hello World program. The aim of this program is to scan a text file and display the number of times each word occurs in that file. For this word count application, we will be using Apache Spark 1.6 with Java 8.
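To make the goal concrete before bringing Spark into the picture, here is a minimal sketch of the same counting logic in plain Java 8 streams. It is not the Spark program itself, and it counts words in a string rather than a file; the class name `WordCountSketch` and the method `countWords` are made up for illustration.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; plain Java, no Spark involved.
class WordCountSketch {

    // Splits the text on whitespace and counts how often each word occurs.
    // A TreeMap is used so the words come back in alphabetical order.
    static Map<String, Long> countWords(String text) {
        return Arrays.stream(text.trim().split("\\s+"))
                .filter(word -> !word.isEmpty())
                .collect(Collectors.groupingBy(
                        Function.identity(), TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        // prints {be=2, not=1, or=1, to=2}
        System.out.println(countWords("to be or not to be"));
    }
}
```

A Spark version follows the same shape (split lines into words, then count per word), only distributed across a cluster.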

If you had asked me this question a couple of months ago, or even just a month ago, my answer would have been Hillary Clinton, without any ifs or buts attached. But now, it's quite unclear.

I have been following this election from the beginning, and I predicted it would be Hillary Clinton vs. Trump in the final long before others did. I also assumed Hillary would win the final very comfortably. But what a ride it has been! It has become such a close call these days, and rightfully so.

The map and flatMap functions transform one collection into another, just like the map and flatMap functions in several other functional languages. In the context of Apache Spark, they transform one RDD into another RDD.
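The difference between the two is easiest to see side by side. The sketch below uses Java 8 streams rather than Spark RDDs, so treat it as an analogy and not as Spark code; the class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; uses Java streams, not Spark RDDs.
class MapVsFlatMapSketch {

    // map: exactly one output element per input element (here, each line's length).
    static List<Integer> lineLengths(List<String> lines) {
        return lines.stream()
                .map(String::length)
                .collect(Collectors.toList());
    }

    // flatMap: each input element yields zero or more outputs,
    // and all of them are flattened into a single collection.
    static List<String> words(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Arrays.stream(line.split(" ")))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("hello world", "hi");
        System.out.println(lineLengths(lines)); // prints [11, 2]
        System.out.println(words(lines));       // prints [hello, world, hi]
    }
}
```

Two input lines give two lengths with map, but three words with flatMap.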

Vim is one of the most powerful text editors available, so it is not really possible for everyone to know everything about it or to come up with the same ideas for improving their workflow. This article therefore collects a few tips and handy shortcuts that will help your productivity, just like the rest of the Vim series, but that are individually not extensive enough to get their own dedicated article.

A lot of programmers use Vim in one way or another, but a vast majority of them use only a handful of its features. Knowing how to use splits, tabs, macros and marks can really increase your productivity. Through this and the upcoming articles on Vim, I will try to cover the important things that make Vim so awesome.

Splitting your Screen
Vim splits are a very powerful way of keeping your workflow organized. You can use splits (windows or viewports in Vim vernacular) to get a different view into the same file, or to open a different file and see a quick diff.

Xargs is one of the really useful Linux commands that everyone should know about. It simplifies the task at hand and is a great command to have in your arsenal. And it's quite simple to use.

So, what is xargs?

Xargs is a command that helps you run a command on a series of things in order. The easiest way to understand this is by looking at an example. Before that, here is the typical command structure when you use xargs.

We have already seen how to use the dot operator in Regular Expressions in the last tutorial (Link). To see all the articles in this Regular Expressions series, click here.

In this lesson, we will see what to do when some of the characters you want to match are optional, i.e., if they are present, match them, and if they are not, don't bother.
So, imagine a situation where you want to match both foo and foobar; here, bar is optional. What do we do to match and identify such words? Let's see ...
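As a quick preview, one common way to express "bar is optional" is the `?` quantifier applied to a group. The sketch below uses Java's `java.util.regex` package; the class name and the `matchesWholeWord` helper are made up for illustration, and the pattern `foo(bar)?` is one possible answer, not necessarily the one the lesson builds up to.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
class OptionalGroupSketch {

    // "(bar)?" means the group "bar" may appear zero times or exactly once.
    private static final Pattern FOO_OPTIONAL_BAR = Pattern.compile("foo(bar)?");

    // Returns true only if the entire word matches the pattern.
    static boolean matchesWholeWord(String word) {
        return FOO_OPTIONAL_BAR.matcher(word).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesWholeWord("foo"));    // prints true
        System.out.println(matchesWholeWord("foobar")); // prints true
        System.out.println(matchesWholeWord("fooba"));  // prints false
    }
}
```

Note that `matches()` checks the whole input; to search inside longer text, `find()` would be used instead.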

Vim never fails to surprise you with the amazing features it has in its arsenal. Very recently, I found that Vim comes bundled with an encryption mechanism referred to as VimCrypt.
It is always good practice to encrypt your files, especially when they contain personal or sensitive information. I often write my daily journal notes in Vim, and I have always encrypted them with external programs. But Vim itself is capable of doing that.
Let's see how it works.

Let's continue with Regular Expressions. All articles in this series can be found here. I will be using Regexr.com for most of these tutorials. It is a great site where you can write and validate your regular expressions against your desired input text.

ReadLine is an editor everybody uses but no one knows about. If you have used Linux for anything before, you have used ReadLine. Doesn't ring any bells? Well, that is because not a lot of people call it that. It is more frequently referred to as the "Bash prompt" or something along those lines.

dsp@freblogg:$ |

So, this prompt you see every time you open a terminal has got a name. It's called ReadLine. A really lax name, isn't it? No wonder it didn't catch on.

So anyway, here are a few keyboard shortcuts to help you type efficiently in ReadLine and become a Con-fu master!

A Regular Expression, or RegEx, is a sequence of characters that defines a search pattern. Regex is everywhere these days, and you can use it to extract information from text files, log files, dictionaries, spreadsheets and even webpages. Every major programming language has support for Regular Expressions. Most importantly, grep, awk and sed use regex to find and replace matches.

Regular Expressions can help you save a lot of time. Instead of writing complex string-matching code that spans multiple lines, a regex gets the job done easily and quickly.

NERDTree is a real time saver and a pretty cool extension to your Vim setup that makes it more user friendly. Almost every other text editor out there can show a listing of the directory that contains the current file. If you are wondering how you can do the same in Vim, then look no further, because NERDTree is what you want.

So, what does it do?

It simply shows all the files and folders in the current working directory. You can also add and delete files right from the list. That's pretty cool.

Switch Case is certainly one of the most widely used programming constructs. It is just as widely used in Java as in any other language.

Before Java 7, switch in Java did not support the String type in case statements. So, if you wanted to perform multiple comparisons on Strings, the only way to do it was with multiple if-else statements, which is certainly not pretty!

But then Java 7 introduced switch with Strings, and it was welcomed by everyone who had to write those terrible if-else ladders. So, let's see how we can use Strings in a case statement.
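Here is a minimal sketch of what that looks like in Java 7 and later. The class name, the `describe` method and the command strings are made up for illustration.

```java
// Hypothetical example for illustration only.
class StringSwitchSketch {

    // Switches directly on a String (Java 7+). Matching is case-sensitive and
    // uses String.equals semantics; a null argument throws NullPointerException.
    static String describe(String command) {
        switch (command) {
            case "start":
                return "Starting";
            case "stop":
                return "Stopping";
            default:
                return "Unknown command";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("start")); // prints Starting
        System.out.println(describe("pause")); // prints Unknown command
    }
}
```

The equivalent if-else ladder would need one `command.equals(...)` check per branch, which is exactly the clutter the String switch removes.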