Spatial bubble chart with R

Recently I wanted to know how our members are distributed across the country. A table or even a bar graph of the data gives no idea of how far apart the chapters are or where they are located.

Chapter-wise Member distribution

A spatial bubble chart would show the location as well as the size of each chapter. Unfortunately, neither Excel nor OpenOffice has such a feature. I could do it easily with R.

For the uninitiated in data science, R is a free and open source (Mukt) software for data science, built through the collaboration of thousands of developers around the world. You may read more about it here. R has a library ‘ggmap’ that can render maps. When I looked for the package, I found it was not installed. Getting a package in R is easy.

I start R in interactive mode as superuser:

$ sudo -i R

Once in R interactive mode I give:

> install.packages('ggmap')

This installs the ‘ggmap’ package. R prompted me to select a repository; once I selected one, it downloaded and installed the library. In a couple of minutes I got a confirmation:

** building package indices
** testing if installed package can be loaded
* DONE (ggmap)

I then exited R with the q() command.
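The same check-and-install step can also be scripted, so that the package is fetched only when it is missing. This is a minimal sketch: ensure_package() is a hypothetical helper name, and the CRAN mirror URL is an assumption that avoids the interactive repository prompt.

```r
# Install a package only when it is not already available.
# ensure_package() is a hypothetical helper; the mirror URL is an assumption.
ensure_package <- function(pkg, repos = "https://cloud.r-project.org") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, repos = repos)
  }
  # TRUE when the package can be loaded after the optional install
  requireNamespace(pkg, quietly = TRUE)
}

# Usage (needs network access the first time):
# ensure_package("ggmap")
```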

Next, I make a file with the names of the chapters, the number of members and the location of each chapter. There is not much data, so I keep it in a text file with fields separated by commas. R can read such a format easily. Here is the structure of the data file:
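As an illustration only, a file of this kind might look like the sketch below. Only the column name members is confirmed by the code later in this post; the names chapter, lat and lon are assumptions. The rows are placeholders, not real membership figures, and the coordinates are approximate positions of Delhi, Mumbai and Chennai used purely as examples.

```r
# Placeholder rows, NOT the real chapter data.
# Column names other than 'members' are assumptions.
csv_text <- "chapter,members,lat,lon
Chapter A,120,28.61,77.21
Chapter B,80,19.08,72.88
Chapter C,50,13.08,80.27"

# read.csv() accepts inline text as well as a file name
mdata <- read.csv(text = csv_text)
str(mdata)
```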

I do some massaging of the data to find the percentage of members in each chapter. For this I open an interactive R session in R Commander. I select the package ‘ggmap’ under the Tools menu:

Tools > Load package > ggmap

Then I give:

    # Read data from file
    mdata <- read.csv("ch_data.csv")
    # Total values under members col
    netMemb <- sum(mdata$members)
    # Create a temporary array with number of members
    memb <- mdata[, "members"]
    # Find percentage
    percent <- memb / netMemb * 100
    # Add new column 'percent' in my data set
    mdata$percent <- percent
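Because R arithmetic is vectorized, the steps above can also be collapsed into a single expression. The sketch below uses a small data frame with placeholder counts (not the real figures) so that it runs without the CSV file.

```r
# Placeholder member counts, used only to make the example self-contained
mdata <- data.frame(members = c(120, 80, 50))

# Share of each chapter in the total, computed on the whole column at once
mdata$percent <- 100 * mdata$members / sum(mdata$members)

print(mdata)
```

The percent column always sums to 100, which is a quick sanity check on the input.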

As you see, processing arrays with R is as simple as performing arithmetic. Next I get a map of India. ggmap can get data from various sources, such as Google Maps and OpenStreetMap. I used Google Maps with the street map view as my base map. One can get other views, like satellite, hybrid or terrain, as well. I needed a bit of experimentation to get the correct zoom level.
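Fetching the base map could look roughly like the sketch below. It is not the original code: fetch_base_map() is a hypothetical helper, the centre coordinates are an approximate midpoint of India, and zoom = 5 is only an assumed starting value to adjust by trial. Current ggmap releases also need a Google API key registered with register_google() before this call will succeed.

```r
# Sketch only: requires the ggmap package, network access and a Google
# API key registered via ggmap::register_google(key = "...").
# fetch_base_map() is a hypothetical helper, not the author's code.
fetch_base_map <- function(zoom = 5, maptype = "roadmap") {
  ggmap::get_map(
    location = c(lon = 78.96, lat = 20.59),  # approximate centre of India (assumption)
    zoom = zoom,                             # assumed starting value; tune by trial
    maptype = maptype,                       # "satellite", "hybrid" or "terrain" also work
    source = "google"
  )
}

# Usage, once a key is registered:
# india_map <- fetch_base_map(zoom = 5)
```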

Once I got the sizes correct, I kept the map data in a variable and added my bubble plot layer to it. Here I use the square root of the percentage so that bubble area is proportional to chapter size, and a scale factor of 3 to get good visuals:
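A minimal sketch of such a layer follows. It is an illustration under assumptions, not the original code: bubble_size() and plot_bubbles() are hypothetical helper names, the data is assumed to have lon, lat and percent columns, and the colour and transparency are arbitrary choices. Point size in ggplot2 behaves like a diameter, so taking the square root of the percentage makes the bubble area, rather than the width, track the chapter's share.

```r
# Bubble size: square root so that area is proportional to the percentage;
# 3 is the scale factor mentioned in the text.
bubble_size <- function(percent, scale = 3) {
  scale * sqrt(percent)
}

# Hypothetical helper: overlays bubbles on a map object from ggmap::get_map().
# Assumes mdata has columns lon, lat and percent, and that ggmap is installed.
plot_bubbles <- function(base_map, mdata) {
  ggmap::ggmap(base_map) +
    ggplot2::geom_point(
      data = mdata,
      ggplot2::aes(x = lon, y = lat, size = bubble_size(percent)),
      colour = "red", alpha = 0.5
    ) +
    ggplot2::scale_size_identity()  # use the computed sizes as they are
}

# A chapter with four times the share gets twice the diameter,
# which is four times the area.
bubble_size(c(4, 16))
```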

The spatial chart makes the problem clear. It shows absolutely no presence in the NE states, which we knew. It also shows absence in central India and rather low presence in the western parts of the country. This was not so evident from our previous bar chart.