Start-up builds India’s largest ever repository of electoral data

Hyderabad. The 2014 parliamentary elections in India registered two irreversible trends: a very large young voter base and advances in technology. The elections turned smart, led by digital and social media technologies. Obama’s 2008 presidential campaign ushered in the use of social media, and his 2012 campaign brought big data analytics to the forefront. The world’s largest democracy went one step further by integrating social media and big data analytics for the first time.

Hyderabad-based Modak Analytics, a data analytics start-up, built India’s first big-data-based electoral data repository, covering 81.4 crore voters, for the just-concluded elections. In doing so, it pioneered the application of big data analytics to Indian politics.

“We were confronted with the volume, variety and velocity of data (common for most big data problems). We knew the difficulties and complexities,” said Milind Chitgupakar, Chief Analytics Officer. Aarti Joshi, Co-Founder and Executive Vice President of Modak Analytics, added, “It’s 12 months and 3,240 hours of toil, and the 65 years of cumulative experience of 10 scientists in data analytics, that made it possible.”

The massive exercise involved 814 million voters, the largest electorate ever on the planet. By comparison, the USA has 193.6 million voters, Indonesia 171 million, Brazil 135.8 million and the UK 45.5 million.

The real challenge was extracting voter information from 25 million PDF pages and transliterating it into English so that it could be fused with other sources. Technology was a big hurdle. The infrastructure, built especially for the project, included a 64-node Hadoop cluster, PostgreSQL and servers that process a master file containing over 8 terabytes of data. Testing and validation was another big task. ‘First of a kind’ heuristic (machine learning) algorithms were developed to classify people based on name, geography and other attributes, which help in identifying religion, caste and even ethnicity.
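
To illustrate why fusing transliterated records with other sources is hard, here is a minimal Python sketch of fuzzy name matching. It is not Modak Analytics’ method, which the company has not described; the function names, the sample names and the 0.85 threshold are all hypothetical, and the sketch uses only the Python standard library. The idea is that the same name can be transliterated into English in several spellings, so records are compared after normalization rather than by exact equality.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize_name(name: str) -> str:
    """Lowercase a name, strip accents and punctuation, collapse spaces."""
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop combining marks (accents) left over after decomposition.
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Replace anything that is not a letter, digit or space with a space.
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in no_marks)
    return " ".join(cleaned.lower().split())


def name_similarity(a: str, b: str) -> float:
    """Return a similarity ratio between 0.0 and 1.0 for two names."""
    return SequenceMatcher(None, normalize_name(a), normalize_name(b)).ratio()


def best_match(name: str, candidates: list[str], threshold: float = 0.85):
    """Return the most similar candidate, or None if none reaches the threshold.

    The threshold is an illustrative assumption, not a recommended value.
    """
    scored = [(name_similarity(name, c), c) for c in candidates]
    score, candidate = max(scored, default=(0.0, None))
    return candidate if score >= threshold else None


if __name__ == "__main__":
    # Hypothetical records: one spelling from a voter roll, others from a second source.
    other_source = ["Ramesh Kumaar", "Suresh Gupta"]
    print(best_match("Ramesh Kumar", other_source))
```

At the scale the article describes, pairwise comparison like this would be far too slow on its own; a real pipeline would first narrow the candidates, for example by constituency or polling station, before comparing names.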
