Cyber Summit 2016: Privacy Issues in Big Data Sharing and Reuse

Although there is no well-established definition of big data, its main characteristic is its sheer volume. Large volumes of data are generated by people (e.g., via social media) and by technology, including sensors (e.g., cameras, microphones), trackers (e.g., RFID tags, web-surfing behavior) and other devices (e.g., mobile phones, wearables for self-surveillance/quantified self), whether or not they are connected to the Internet of Things. However, the large volumes of data needed to capitalize on the benefits of big data can, to some extent, also be obtained by reusing existing data, a source that is sometimes overlooked.

Data can be reused for purposes similar to those for which it was initially collected, but also beyond these purposes. Similarly, data can be reused in its original context, but also beyond this context. However, such repurposing and recontextualizing of data may lead to privacy issues. For instance, data reuse may lead to issues regarding informed consent and informational self-determination. When the data is used for profiling and other types of predictive analytics, issues regarding stigmatization and discrimination may also arise. This presentation by Bart Custers, Head of Research, eLaw – Center for Law and Digital Technologies at Leiden University, The Netherlands, focuses on the privacy issues of big data sharing and reuse and how these issues could be addressed.

8.
Discrimination, Stigmatisation, Polarisation

Data may be discriminating:
▪ When police surveillance focuses on black neighborhoods, the people in the database will be black (selective sampling)
Patterns may be discriminating:
▪ A database may show that top managers are male (self-fulfilling prophecy)
▪ People causing car accidents are >16 years old (non-novel pattern)
Discrimination may be concealed/indirect:
▪ Selection on zip code instead of ethnic background (redlining)
▪ Selection on legitimate attributes correlated with discriminating attributes (masking)
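The redlining and masking examples can be sketched with synthetic data: a selection rule that never touches the protected attribute still produces a skewed outcome when a seemingly neutral attribute correlates with it. All names and numbers below are illustrative assumptions, not data from the presentation.

```python
# Synthetic population: in zip "A" 80% belong to group X; in zip "B" only 20% do.
# The correlation between zip code and group is the assumption driving the effect.
population = (
    [{"zip": "A", "group": "X"}] * 80 + [{"zip": "A", "group": "Y"}] * 20 +
    [{"zip": "B", "group": "X"}] * 20 + [{"zip": "B", "group": "Y"}] * 80
)

# A "neutral" selection rule that never looks at the protected attribute:
selected = [p for p in population if p["zip"] == "A"]  # redlining-style rule

def share(people, group):
    """Fraction of `people` belonging to `group`."""
    return sum(p["group"] == group for p in people) / len(people)

print(share(population, "X"))  # 0.5 in the population as a whole
print(share(selected, "X"))    # 0.8 among the selected: a disparate outcome
```

The rule is formally blind to group membership, yet its output over-represents group X, which is exactly why removing the sensitive attribute alone does not prevent indirect discrimination.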

9.
Privacy policies / Terms & Conditions

People do not read policies:
▪ Reading everything would take 244 hours annually
▪ Users are willing to spend 1-5 minutes on this
▪ Facebook: 9,500 words (>1 hour); LinkedIn: 7,500 words (~1 hour)
People do not understand policies:
▪ Policies are often highly legalistic, technical, or both
▪ The devil is in the details
People do not grasp the consequences
The preferred option is not available:
▪ Take-it-or-leave-it decisions: check the box

Informational self-determination (Westin, 1967): people control who gets their data and for which purposes
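The reading-time figures above are easy to sanity-check. Assuming an average reading speed of roughly 150 words per minute (the speed is my assumption, not a figure from the slides), the word counts line up with the stated durations:

```python
# Rough arithmetic behind the reading-time claims; the 150 words-per-minute
# average reading speed is an assumption, not a figure from the presentation.
WORDS_PER_MINUTE = 150

def reading_minutes(word_count, wpm=WORDS_PER_MINUTE):
    """Estimated minutes needed to read `word_count` words."""
    return word_count / wpm

print(round(reading_minutes(9_500)))  # Facebook: 63 minutes, i.e. >1 hour
print(round(reading_minutes(7_500)))  # LinkedIn: 50 minutes, i.e. ~1 hour
```

Even at a brisker pace, both documents far exceed the 1-5 minutes users say they are willing to spend.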

10.
Past / Current / Future?

Big data is used for a lot of decision-making:
▪ Based on what data?
▪ Based on which analyses?
▪ Do you know how many databases you are in?

11.
Limiting Access to Sensitive Data

The basic idea is that if sensitive data are absent from the database/cloud, the resulting decisions/selections cannot be discriminating.
However, restricting access is very difficult. According to information theory, the dissemination of data follows the laws of entropy:
▪ Information can easily be copied and multiplied
▪ Information can easily be distributed
▪ This process is irreversible

12.
Since there is not one problem, there is no single solution; combinations of smart solutions are required.

Analyze the problem:
▪ Privacy Impact Assessments
Customize the solution:
▪ Privacy by Design
▪ Privacy-enhancing tools
▪ Privacy-preserving big data analytics
▪ Discrimination-aware data mining
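As one concrete instance of privacy-preserving analytics (a technique class the slide names, though this specific method is my choice of example), a differentially private count releases an aggregate while adding calibrated noise. The sketch below builds Laplace noise from two exponential draws; the epsilon value and the data are illustrative assumptions.

```python
import random

# Minimal sketch of a differentially private count. A count query has
# sensitivity 1 (one person changes it by at most 1), so adding
# Laplace(0, 1/epsilon) noise gives epsilon-differential privacy.
# The difference of two Exp(epsilon) draws is exactly Laplace(0, 1/epsilon).
def dp_count(records, predicate, epsilon=1.0, rng=random):
    true_count = sum(1 for r in records if predicate(r))
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise

ages = [15, 22, 34, 17, 41, 29]
noisy_adults = dp_count(ages, lambda a: a >= 18)  # true count is 4, plus noise
```

The analyst still gets a useful aggregate, but no individual's presence in the data can be confidently inferred from the released number, which is the kind of "smart solution" the slide points toward.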

13.
New perspectives

Restricting data access and use limits big data opportunities and is difficult to enforce.
Focus less on:
▪ Limiting access to data
▪ Restricting the use of data
Focus more on:
▪ Transparency
▪ Responsibility