Two caveats: (1) the datasets may not share unique, standardised identifiers, and (2) the calculations may become difficult where by-elections have taken place [the Lok Sabha Secretariat will have details of all by-elections, which can be obtained through an RTI request].

Hack for Change on Women’s Rights

Shobha, Breakthrough.tv, led the discussion on the planned Hack for Change event being organised by Breakthrough and Hacks/Hackers, as part of the 16 days of activism against violence against women.

The hackathon is organised around urban safety data from Whypoll, multimedia evidence of early marriage practices in Bihar and Jharkhand gathered by Gramvaani, etc. It will also include a Wikipedia Edit-athon facilitated by Noopur Raval.

The discussion of other datasets relevant to the hack event went in several directions, and I did not manage to keep track of all of it. Overall, the datasets discussed included those available from , those published by the National Crime Records Bureau, the FIR and call database of the Delhi police (and how to access it), and data on violence against women gathered by the Tata Institute of Social Sciences from police stations across seven states.

Presentation on iPython

Konark Modi presented a detailed introduction to using iPython to carry out data cleaning in a well-organised manner, as well as the collaboration features and workflow of iPython.

Participants asked for a tutorial on OpenRefine (previously Google Refine), which will be organised at a later meeting.

Mapping Indian Election Data

We will start documenting publicly available datasets relevant for studying past general (Lok Sabha) elections in India and the current activities of the elected members. One can contribute to this mapping exercise in the two ways described below.

GitHub: We have created a repository for this data mapping exercise under the DataMeet organisation on GitHub. The organisation page can be accessed here, and the (india-election-data) repository can be accessed here. In the repository, I have created a draft format for documenting the identified datasets. This draft format can be accessed here. Please feel free to suggest changes to the draft format by opening an issue.

To document a dataset, copy the format given in the repository, fill in the details, and rename the file after the dataset, for example “election-results-delhi-1995.md”. If you then notice that the dataset needs cleaning or reorganisation, or that something about it is unclear, open an issue that mentions the dataset's name and describes the task.
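For those who prefer to generate the file name rather than type it, here is a minimal Python sketch of the naming pattern suggested by the example above (lowercase words joined by hyphens, ending in ".md"). The helper name `dataset_filename` is hypothetical and not part of the repository, and the exact rule is an assumption inferred from the single example file name.

```python
import re


def dataset_filename(name: str) -> str:
    """Turn a free-text dataset name into a lowercase, hyphen-separated .md file name.

    Hypothetical helper: the rule is inferred from the example
    "election-results-delhi-1995.md" and is not an official convention.
    """
    # Replace every run of characters other than a-z and 0-9 with one hyphen,
    # then drop any hyphen left at the start or end.
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    if not slug:
        raise ValueError("dataset name contains no letters or digits")
    return slug + ".md"


print(dataset_filename("Election Results Delhi 1995"))
# prints: election-results-delhi-1995.md
```

Note that this sketch drops any character outside a-z and 0-9, so names written in non-Latin scripts would need a different rule.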

Google Drive spreadsheet: Alternatively, you can access this spreadsheet on Google Drive and add the relevant information about the dataset you have documented.

Please comment here or post to the DataMeet mailing list for any clarifications and suggestions.