Tag Archives: election data hackathon

I created an interactive for Wionews.com (embedded below) on the assembly elections taking place in five states. This write-up covers how I built the interactive and the motivations behind it.


The interactive looks at three things:

where each party won in the last assembly election in 2012 in each of the five states, visualised with a map.

where each party won in the last Lok Sabha (LS) election in 2014, if the LS seats were broken up into assembly seats. This was also done with a map.

the share of seats won by each major party in previous assembly elections, done with a line chart.

I got all my data from the Election Commission website and the DataMeet repositories, specifically the repositories with the assembly constituency shapefiles and historical assembly election results.

Now these files have a lot of information in them, but since I was making this interactive specifically for mobile screens and there wouldn’t be much space to play with, I made a decision to focus just on which party won where.

As mundane as that may seem, there are still some interesting things you get to see. For example, from the break-up of the 2014 Lok Sabha results, you find out where the Aam Aadmi Party has gained influence in Punjab since the last assembly elections in 2012, when they weren’t around.

The interactive page on the AAP in Punjab, 2014

ANALYSING THE DATA

While I got the 2012 election results directly from the Election Commission’s files, the breakdown of the 2014 Lok Sabha results by assembly seat needed a little more work, with some data analysis in Python (see code below) and manual cross-checking with other Election Commission files.
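As a rough sketch of what such a breakdown involves (this is not the code actually used for the interactive), the snippet below picks the leading party in each assembly segment from segment-wise vote counts. The field names `segment`, `party` and `votes` and the toy rows are assumptions made up for illustration; the real Election Commission files are laid out differently and would need cleaning first.

```python
from collections import Counter

# Toy rows, invented for illustration only: one row per party per assembly segment.
rows = [
    {"segment": "Segment 1", "party": "Party A", "votes": 41000},
    {"segment": "Segment 1", "party": "Party B", "votes": 38500},
    {"segment": "Segment 2", "party": "Party A", "votes": 29000},
    {"segment": "Segment 2", "party": "Party C", "votes": 33250},
]


def leading_party_by_segment(rows):
    """Return {segment: party with the most votes in that segment}."""
    best = {}
    for row in rows:
        seg = row["segment"]
        # Keep the row with the highest vote count seen so far for this segment.
        if seg not in best or row["votes"] > best[seg]["votes"]:
            best[seg] = row
    return {seg: row["party"] for seg, row in best.items()}


winners = leading_party_by_segment(rows)

# Number of assembly segments each party led in.
segments_led = Counter(winners.values())
```

The resulting `winners` dictionary is the "which party won where" table that later gets attached to the map.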

PUTTING IT ALL ONTO A MAP

The next thing to do was put the data of which party won where onto an assembly seat map for each state.

To get the assembly seat maps, I downloaded the assembly constituency shapefile from the DataMeet repository and used the software QGIS to create five separate shapefiles, one for each state. (Shapefiles are what geographers and cartographers use to make maps.)

A screenshot of the QGIS software separating the India shapefile into separate ones for the states.

The next task was to make sure the assembly constituency names in the shapefiles matched the constituency names in the election results. For example, in the shapefile, one constituency in Uttar Pradesh is spelt as Bishwavnathganj while in the election results, it’s spelt as Vishwanathganj. These spellings need to be made consistent for the map to work properly.

I did this with the OpenRefine software which has a lot of inbuilt tools to detect and correct these kinds of inconsistencies.

The purist way would have been to do all this with code, but I’ve been using OpenRefine, a graphical tool, for a while now and it’s just easier for me this way. Please don’t judge me! (Using graphical tools such as OpenRefine and QGIS makes it harder for others to reproduce your exact results and is less transparent, which is why purists look down on a workflow that is not entirely in code.)
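For anyone who does want to try the code route, here is a minimal sketch using Python’s standard `difflib` module. It is not what OpenRefine does internally and not what I ran; the function name `match_names` and the 0.8 similarity cutoff are assumptions, and any fuzzy match should still be checked by eye before it is trusted.

```python
from difflib import get_close_matches


def match_names(shapefile_names, result_names, cutoff=0.8):
    """Map each shapefile spelling to the closest spelling in the results.

    Names with no candidate above the cutoff map to None so they can be
    reviewed manually.
    """
    # Compare in lower case, but return the original spelling from the results.
    lookup = {name.lower(): name for name in result_names}
    mapping = {}
    for name in shapefile_names:
        hits = get_close_matches(name.lower(), list(lookup), n=1, cutoff=cutoff)
        mapping[name] = lookup[hits[0]] if hits else None
    return mapping


mapping = match_names(
    ["Bishwavnathganj", "Xyz"],      # spellings as found in the shapefile
    ["Vishwanathganj", "Rampur"],    # spellings as found in the results
)
```

Here `Bishwavnathganj` is paired with `Vishwanathganj`, while the made-up name `Xyz` finds no match and is left as `None` for manual checking.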

After the data was cleaned, I merged or ‘joined’ the 2012 and 2014 election results with the shapefile in QGIS. I then converted the shapefile into the GeoJSON format, which is easier to visualise with JavaScript libraries such as D3.js.
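I did the join in QGIS, but the same idea can be sketched in Python on a file that is already in GeoJSON form. The property names `AC_NAME` and `winner_2012` below are hypothetical, and the two features are toy placeholders with no real geometry.

```python
import json


def join_results(feature_collection, winners, name_key="AC_NAME", out_key="winner_2012"):
    """Attach the winning party to each GeoJSON feature, matching on seat name.

    Returns the list of seat names that had no entry in `winners`.
    """
    unmatched = []
    for feature in feature_collection["features"]:
        name = feature["properties"].get(name_key)
        if name in winners:
            feature["properties"][out_key] = winners[name]
        else:
            feature["properties"][out_key] = None
            unmatched.append(name)
    return unmatched


# Toy FeatureCollection; a null geometry is allowed by the GeoJSON spec.
seats = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "geometry": None, "properties": {"AC_NAME": "Seat 1"}},
        {"type": "Feature", "geometry": None, "properties": {"AC_NAME": "Seat 2"}},
    ],
}

unmatched = join_results(seats, {"Seat 1": "Party A"})
geojson_text = json.dumps(seats)  # ready to be written to a .geojson file
```

Any names that end up in `unmatched` usually point to a spelling mismatch between the map and the results.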

I then chose the biggest three or four political parties in the 2012 assembly and 2014 LS election results for each state, and created icons for them using the tool Inkscape. This can be done by tracing the party symbols available in various election commission documents.

Some of the party icons designed for the interactive

HOW IT’S ALL VISUALISED

The way the interactive works is that when you click on the icon for a party, it downloads the GeoJSON file which, to put it crudely, has the boundaries of the assembly seats and the name of the party that won each seat.

The interactive map showing the NPF in Manipur in 2014

You then get a map with the seats belonging to that party coloured in yellow. And each time you click on a different party icon, a new map is generated. (If I’ve understood the process wrong, do let me know in the comments!)
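The interactive itself is written with D3.js, so the snippet below is only a sketch of the colouring logic in Python, not the actual code. The property names `name` and `winner` and the grey fallback colour are assumptions; only the yellow highlight comes from the interactive.

```python
def seat_fills(features, selected_party, party_key="winner",
               highlight="yellow", default="lightgrey"):
    """Return {seat name: fill colour} for the party that was clicked."""
    fills = {}
    for feature in features:
        props = feature["properties"]
        # Seats won by the selected party are highlighted; the rest stay neutral.
        won = props.get(party_key) == selected_party
        fills[props["name"]] = highlight if won else default
    return fills


# Toy features with placeholder names.
features = [
    {"properties": {"name": "Seat 1", "winner": "Party A"}},
    {"properties": {"name": "Seat 2", "winner": "Party B"}},
]

fills = seat_fills(features, "Party A")
```

Clicking a different icon amounts to calling the same function again with another `selected_party`.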

I won’t go into the nitty gritty of how the line chart works, but essentially every time you click on one of these icons, it changes the opacity of the line representing the party to 1, making it visible, while the opacity of every other line is reduced to 0, making them invisible.
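The toggle boils down to a tiny rule, sketched here in Python rather than the D3.js used in the interactive. The function name and the party labels are placeholders.

```python
def line_opacities(parties, selected_party):
    """Return {party: opacity}: 1 for the selected party's line, 0 for the rest."""
    return {party: (1 if party == selected_party else 0) for party in parties}


opacities = line_opacities(["Party A", "Party B", "Party C"], "Party B")
```

All the lines are drawn once and only their visibility changes, so nothing has to be redrawn on each tap.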

Now I haven’t gone into the complexity of much of what’s been done. For example, if you see those party symbols and the tiny little shadows under them (they’re called drop shadows), it took me at least two days to make that happen.

It took two days to get these drop shadows!

MOTIVATIONS BEHIND THE INTERACTIVE

As for the design, I wanted something that people would just click/swipe through, that they wouldn’t have to scroll through. I also wanted to limit the data on display, giving only as much as someone can absorb at a glance.

My larger goal was to try and start doing data journalism that’s friendlier and more approachable than the stuff I’ve been doing in the past such as this blogpost on the Jharkhand elections.

I actually read a lot on user interface design, after which I made sure that the icons people tap on their screen were large enough for their thumbs, that icons were placed in the lower half of the screen so that their thumbs wouldn’t have to travel as much to tap on them, and that I adopted flat design with just a few drop shadows and not too many so-called skeuomorphic effects.

Another goal was to allow readers to get to the information they’re most interested in without having to wade through paras of text by just tapping on various options.

The sets of options available to the user while in the interactive

I hacked a lot of D3.js examples on bl.ocks.org and stackoverflow.com to arrive at the final interactive. I’m still some way away from writing D3 code from scratch, but I hope to get there soon.

Because I’m not a designer, web developer, data scientist or a statistician, I may have violated lots of best practices in those fields. So if you happen to come across some noobie mistake, do let me know in the comments. I’m here to learn, thanks! 🙂

Shijith Kunhitty is a data journalist at WION and former deputy editor of IndiaSpend. He is an alumnus of Washington University, St. Louis and Hindu College, Delhi.

Two caveats: (1) we may not get unique and standard identifiers across datasets, and (2) calculations may get difficult in case of by-elections [Lok Sabha Secretariat will have details of all by-elections, which can be accessed through RTI request].

Hack for Change on Women’s Rights

Shobha of Breakthrough.tv led the discussion on the planned Hack for Change event being organised by Breakthrough and Hacks/Hackers, as part of the 16 days of activism against violence against women.

The hackathon is organised around urban safety data from Whypoll, multimedia evidence of early marriage practices in Bihar and Jharkhand gathered by Gramvaani, etc. It will also include a Wikipedia Edit-athon facilitated by Noopur Raval.

There were multi-directional discussions around other datasets of relevance for the hack event, which I have not kept track of very well. Overall, there were discussions around datasets available from , those published by National Crime Records Bureau, FIR and call database of Delhi police (and how to access that), and data on violence against women gathered by Tata Institute of Social Sciences from police stations across seven states.

Presentation on iPython

Konark Modi presented a detailed introduction to using iPython to undertake data cleaning in a very organised manner, as well as the collaboration features and workflow of iPython.

There emerged a demand for a tutorial on OpenRefine (previously Google Refine), which will be organised in a later meeting.

Mapping Indian Election Data

We will start documenting publicly available datasets relevant for studying past general (Lok Sabha) elections in India and the activities of the elected members at present. One can contribute to this mapping exercise in two ways, as mentioned below.

GitHub: We have created a repository for this data mapping exercise under the DataMeet organisation at GitHub. The organisation page can be accessed here, and the (india-election-data) repository can be accessed here. In the repository, I have created a draft format for documenting the identified datasets. This draft format can be accessed here. Please feel free to suggest changes to the draft format by opening an issue.

To document a dataset, use the format given in the repository, fill up the details, and rename the file according to the dataset’s name, such as “election-results-delhi-1995.md”. Then if you notice any requirement of data cleaning/reorganisation or lack of clarity regarding the dataset, open an issue (where the name of the dataset is mentioned) to note that task.

Google Drive spreadsheet: Alternatively, you can access this spreadsheet on Google Drive and add the relevant information about the dataset documented by you.

Please comment here or post to the DataMeet mailing list for any clarifications and suggestions.

After a hiatus, the Delhi DataMeet-Up is back. We are meeting today, Friday, November 22, at Akvo Foundation office, at 5:00 pm.

Here is the tentative agenda of the meeting:

Updates: Sharing news across the network.

Discussion: Discussing how we can support Hack for Change around women’s rights being organised by Breakthrough.tv and Hacks/Hackers New Delhi. Shobha and Anika would begin the discussion by talking about the planned event.

Discussion: Beginning a discussion towards an election data hackathon. It will be led by Satyakam and Surendran.