13 October 2011

Data Mining In Sweden

Almost no place in the world outside Scandinavian countries like Sweden has such comprehensive and massively cross-indexed dossiers on its citizens. For a social science or public health researcher, this is heaven.

Neuroskeptic notes a recent example: a study demonstrating a link between both bipolar disorder and schizophrenia and creative occupations, in both the individuals with those conditions and their relatives as distant as first cousins. Creative occupations were defined as visual artists (photographers, designers, etc.), non-visual artists (musicians, actors, authors), and academics (university teachers).

Bipolar people and their relatives are more likely to be in creative professions, confirming the stereotype. Relatives of people treated on an inpatient basis for schizophrenia, but not the individuals themselves, were likewise more likely to be in creative professions. The odds ratio for being in a creative profession associated with one of the conditions is about 1.5. Creative occupations were slightly less common in people who experienced unipolar depression and in their relatives: unipolar depression reduced one's odds of being in a creative profession by 5-10%. IQ was a bit lower in inpatient mental health patients and a bit higher in creative professionals. The link between the mental health conditions and creative occupations was even stronger after adjusting for IQ.
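An odds ratio is easy to misread as a ratio of probabilities, so a small worked sketch may help. The study summary above gives only the odds ratios (about 1.5, and a 5-10% reduction for unipolar depression), not the baseline share of people in creative professions, so the 2% baseline below is purely a hypothetical number chosen for illustration, and the function names are my own.

```python
def odds_from_prob(p):
    """Convert a probability (0 < p < 1) into odds."""
    return p / (1.0 - p)


def prob_from_odds(o):
    """Convert odds back into a probability."""
    return o / (1.0 + o)


def apply_odds_ratio(baseline_prob, odds_ratio):
    """Probability implied by multiplying the baseline odds by an odds ratio."""
    return prob_from_odds(odds_from_prob(baseline_prob) * odds_ratio)


# HYPOTHETICAL baseline: assume 2% of the general population holds a
# creative occupation. This figure is not taken from the study.
baseline = 0.02

# Odds ratios as summarized in the post: about 1.5 for the linked conditions,
# and 0.90 to 0.95 (a 5-10% reduction in odds) for unipolar depression.
for label, ratio in [("odds ratio 1.5", 1.5),
                     ("odds ratio 0.95", 0.95),
                     ("odds ratio 0.90", 0.90)]:
    p = apply_odds_ratio(baseline, ratio)
    print(f"{label}: {p:.4f}")
```

When the baseline is small, as assumed here, the odds ratio and the ratio of probabilities are nearly the same, so an odds ratio of 1.5 means roughly a 50% higher chance. With a more common outcome the two would diverge.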

The study drew on a comprehensive health records database derived from the national health care system, census records showing occupation, age, and family relationships, and military IQ test records for men (who have mandatory military service). Together, these gave the study a sample of 200,000 people with unipolar depression alone, and 100,000 people treated for bipolar disorder or schizophrenia over a 30 year period. That sample includes every single person who received treatment for any of those conditions in the entire country during the study period. The control group was every single adult in Sweden in that time period. It is the ultimate in comprehensive data sets that are very rich in information for every single data point. This study looks at heredity without genotyping, but there is also a considerable amount of genotype information in the health care database that I've seen used in other studies.

Of course, the downside is that the government knows a huge amount about you and has it in a form that is relatively amenable to being used in a coordinated fashion. If any governments are to be trusted with that information, it is probably those of Scandinavia, which have some of the least corrupt and most competent civil services in the world. A relatively homogeneous population, compared to places like the U.S. or India, also helps build trust that allowing this information to be collectivized will produce data that is used for the good of everyone. But it isn't hard to imagine how this information might be abused in a government that operated the way that American governmental bureaucracies do.

To the extent that Americans are like Scandinavians with respect to the matters studied, this is great. They get all the Big Brother privacy costs, and we still get the benefits of stunningly comprehensive public health and social science studies. To the extent that Americans aren't like Scandinavians with respect to the matters studied, this is less good. We avoid the Big Brother privacy costs, but in exchange we have lower quality data and have to rely on relationships established with data that may not hold true in our own populations, which leads to lower quality decision making.

Ultimately, I think technology is going to drive us, sooner or later, towards the Swedish model. We will give up privacy in exchange for the knowledge that comes from wider availability of the data that is collected for useful purposes. Privacy, like an oil-based economy, seems likely to be a temporary luxury and historical outlier. It is sandwiched between the vast stretch of history when neither internal combustion engines powered by oil nor privacy was available (because people lived in small communities where everyone knew everyone else's business), and a future where we will run out of oil and technology will triumph over efforts to maintain privacy.

Sooner or later, rather than hiding our personal shortcomings and flaws through privacy, we will have to learn to acknowledge them and tolerate them in ourselves and others more than we do today.