The European Investigative Journalism and Dataharvest Conference, organised by Journalismfund.eu, is the most relevant networking event for investigative and data journalists in Europe. Dataharvest EIJC17 will take place on Friday 19, Saturday 20 and the morning of Sunday 21 May 2017, with a pre-conference Hack Day on Thursday 18 May.

Making the datasets from an investigation public allows readers to verify the reported facts and scrutinize the conclusions. But it may also release enough facts to identify individuals in the data. Unfortunately, most of us are close to unique, both in the combination of our describing attributes (date of birth, sex, location, profession, etc.) and in our observable behaviour (e.g. the combination of articles bought in an online shop or films watched on Netflix). This uniqueness can make it easy to re-identify people in a dataset, even if names and other obvious identifiers have been removed. Data anonymization is about making us less unique by removing or replacing just enough information that people can no longer be identified, while still retaining enough information to make the dataset useful to others. It can be hard to strike the balance between privacy and utility (in fact, it is more or less an unsolved problem).

In the first session, Katharina Rasch gives an introduction to data anonymization, demonstrates different methods, and looks at examples of data anonymization gone wrong. The second session aims to bring together people with an interest or experience in developing anonymization methods. Both sessions are aimed at journalists and data scientists alike; no previous experience in data analysis is required.
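To make the idea of "less unique" concrete, here is a minimal Python sketch. It is an illustration only, not necessarily one of the methods shown in the sessions. It uses k-anonymity, a common measure that the abstract does not name: the size of the smallest group of records sharing the same attribute values. The toy records, the choice of attributes, and the helper names `smallest_group` and `generalize` are all hypothetical.

```python
from collections import Counter

# Hypothetical toy records: (birth_year, sex, postcode, profession).
# Names are already removed, yet every row is still a unique combination.
records = [
    (1981, "F", "1000", "teacher"),
    (1984, "F", "1000", "teacher"),
    (1987, "F", "1050", "teacher"),
    (1972, "M", "1000", "nurse"),
    (1975, "M", "1050", "nurse"),
    (1979, "M", "1000", "nurse"),
]


def smallest_group(rows):
    """Size of the smallest group of rows with identical values (the k in k-anonymity)."""
    return min(Counter(rows).values())


def generalize(row):
    """Coarsen a record: birth year becomes a decade, postcode keeps its first two digits."""
    birth_year, sex, postcode, profession = row
    decade = birth_year // 10 * 10
    return (decade, sex, postcode[:2], profession)


k_before = smallest_group(records)
generalized = [generalize(row) for row in records]
k_after = smallest_group(generalized)

print("k before generalization:", k_before)
print("k after generalization:", k_after)
```

In this toy data every person is unique before generalization (k = 1), while afterwards each person shares their attribute combination with two others (k = 3). The price is utility: exact birth years and postcodes are no longer available to readers of the dataset, which is the privacy/utility trade-off described above.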