The Big Data Research and Development Initiative is a federal funding program to “greatly improve the tools and techniques needed to access, organize and glean discoveries from huge volumes of digital data.”

The National Institutes of Health and the National Science Foundation are offering up to $25 million for devising ways to visualize and extract biological and medical information from large and diverse data sets.

NIH announced it would provide researchers free access to all 200 terabytes of the 1,000 Genomes Project—an attempt to catalog human genetic variation—via Amazon Web Services.

Editorial from PLoS Genetics about their willingness to publish research using 23andMe data.

“The editors of PLoS Genetics decided to proceed after satisfying ourselves on two major points, namely that the participants were not coerced to participate in the study in any way, and they were clearly aware that their samples would be used for genetic research.”

The research was deemed “not human subject research” by an independent human subjects review board because it met neither criterion of (1) investigators obtaining data through interaction with participants or (2) subjects being identifiable by investigators.

Researchers used a commercial IRB after the article had already been submitted, but using a commercial firm is already standard practice in the pharmaceutical and biotech industries.

“The study was not performed under the auspices of their Universities, and we did not feel that review by an academic IRB was necessarily appropriate.”

“For situations in which a study does not meet the aforementioned criteria but obtaining a consent form would still be desirable, there are no guidelines or policy with regard to how such a consent form should be developed.” In other words, even when researchers neither interact with participants nor can identify them, but consent would still be ethically desirable, there are no guidelines for how to proceed.

The editors noted the concern about the lack of open access to the underlying data but felt the insights of the paper were of higher value to the public good, especially since collaborators could follow up with a similar process.

Brief letter by an affiliate of HP Labs, raising concern that big data underlying published studies is not publicly available.

Expresses concern that proprietary data (from Facebook, Google, etc.) used for published studies isn't made publicly available, making verification and replication impossible.

Huberman: this trend could result in a "small group of scientists with access to private data repositories enjoying an unfair amount of attention."

Written in response to a conference he chaired in which three scientists from Google and the University of Cambridge declined to release data they had compiled for a paper on the popularity of YouTube videos in different countries.

Describes two scenarios for the future of computational social science, neither in the public interest: one in the domain of internet companies and government agencies; the other in the domain of academic researchers with private datasets.

Calls for the development of computational social science in an “open academic environment” and examines obstacles to that development.

Individual profiles can be extracted from anonymized data.

The U.S. National Institutes of Health and the Wellcome Trust abruptly removed a number of genetic databases from online access.

"It may be necessary for IRBs to oversee the creation of a secure, centralized data infrastructure."

86% of internet users have taken steps online to remove or mask their digital footprints—ranging from clearing cookies to encrypting their email.

55% of internet users have taken steps to avoid observation by specific people, organizations, or the government.

The representative sample of 792 respondents also finds that notable numbers of internet users say they have experienced problems because others stole their personal information or otherwise took advantage of their visibility online. Specifically:

21% of internet users have had an email or social networking account compromised or taken over by someone else without permission.

12% have been stalked or harassed online.

11% have had important personal information stolen such as their Social Security Number, credit card, or bank account information.

6% have been the victim of an online scam and lost money.

6% have had their reputation damaged because of something that happened online.

4% have been led into physical danger because of something that happened online.

A multitude of criticisms of the dynamic consent model, but no apparent knock-down argument against it.

The difference between the models is “whether consent to ‘unknown’ future activities can be labelled ‘informed consent’ and be viewed as an expression of an autonomous will.”

In dynamic consent, participants are “consented” for every study, i.e., for both meaningful and trivial changes in relation to earlier consents. In broad consent, they are re-consented only for meaningful changes. The authors argue that dynamic consent does not respect autonomy any better than broad consent.

Biobanks are already obliged to keep members continuously informed, so dynamic consent is just a different “information policy.”

Participants may be overwhelmed by the complexity of continuous consent and therefore less likely to participate.

Concern that dynamic consent could encroach on participant governance of major research projects, when these decisions might better be handled by experts.

May lead to relaxed IRB review because of the perception that participants can withdraw at any time anyway.

The focus on returning research results to participants, which dynamic consent could facilitate, pushes researchers in the direction of healthcare.

SECTION 4: Popular Media

Edward Snowden’s revelations about the National Security Agency opened a society-wide conversation about the tensions between national security and privacy.

New attention to the kinds of data we generate through our use of technology.

For marketing and data brokers, there’s no equivalent to the Fair Credit Reporting Act (FCRA), a law that requires entities that collect information for those making employment, credit, insurance, and housing decisions to ensure its accuracy.

"Personal data could be… used by firms making decisions that … affect users' lives profoundly. … too risky to do business with or aren’t right for certain clubs, dating services, schools or other programs."

"Reclaim Your Name"—a four-point standard for industry to adopt voluntarily: basically the typical rights to access, correct, and opt out, plus the right to know how brokers find and use data.