0:23 CHANUKI SERESINHE: OK, so with regards to the Mappiness study, how meaningful is this study if the people being surveyed might only represent a subset of the population? For example, the Mappiness study is centred around a typical iPhone user, and such users might be young and fairly successful.

0:38 SUZY MOAT: I think that's a really good question, and highlights that the learners are really thinking about the limitations of these data sets. Now, I think the context we have to interpret this in, is the wide array of approaches we have to trying to measure human behaviour and human experience. So, I work in big data, in data science these days, but I used to do more traditional experiments in the psychology style, with a small number of participants. And obviously, there's many people out there, running huge surveys with traditional methodologies. And I think all of these different data sources, they have something to offer for the study of human behaviour. But none of them are perfect.

1:31 I've never seen a perfect data set measuring any sort of human behaviour, I think. So what we've got to try and ask is, OK, well, what does this add to our study of this particular area? And if you think about happiness, you know, we've had some real challenges in trying to measure happiness in the past. In experiments, we've been limited generally to quite a small number of people, only a small number of measurements and a limited set of scenarios. With surveys, it's often been difficult to get measurements for a large number of people over a long period of time. So you might have a large number of people, but often, you might just have one measurement a year, for example.

2:18 So, you know, while these different approaches can allow you to perhaps control something more carefully, so in an experiment you know exactly what conditions participants have been in. In a really carefully set-up survey, you can try and make sure that you've got a cross-section of society. There's obviously some drawbacks to what we've been able to measure. So I think the excitement about a study like Mappiness is that it does give us this possibility of capturing measurements from a really large number of people as they go about their daily lives. And it also opens up this possibility to measure all sorts of different things about what they're doing at the time, who they're with, where they are.

3:03 And we know from our research, like with George MacKerron, that this opens up the range of questions that you can ask. But nevertheless, when you are interpreting the data, you always need to think, what are the disadvantages of this approach, and consider how those drawbacks or disadvantages might affect any conclusions you want to draw from the data set.

3:29 CHANUKI SERESINHE: So another related question is, is self-reported data reliable? I mean, are people more likely to report what they want other people to hear? And how can you deal with this in your analysis?

3:40 TOBIAS PREIS: That's another very good question. And basically, the answer is, I mean, the quick answer is we can't be sure, right? I mean, again, they understand that no data set which is out there is actually perfect. But if we think about this a little bit more, then obviously the question is, what is the incentive for a person in the first place to interact with a certain service, right?

4:05 I mean, obviously, if it would be just for the purpose of fueling a database, filling a database, with lots of measurements for scientists to analyse how happy people are, I mean, obviously then there could be a certain danger that a lot of these responses are not necessarily true responses, so to say, to put it in very casual words. But if you think about it a little bit longer, then obviously there are all sorts of other services out there, and then one obvious choice is our emails, right? I mean, lots of people are using email providers, Gmail and others. And there, the question is, it's very simple, right?

4:46 I mean, if you ask whether email writers are basically filling Google databases with not-true responses, then obviously that's a very obvious answer that this is not the case. Right, I mean, they are using it because they want to send emails to colleagues, friends, to people around the globe. And from that point of view, it's always a question about incentive. Why are people doing something? Not necessarily for the database which is recording something. And this brings us back to Mappiness, then, right? So it's not for the scientists behind the database. It's actually the incentive in the first place to get an idea of what the personal happiness cycle throughout the day, throughout the week, is.

5:34 So the app displays this, so basically you can explore your own characteristic patterns, so to say. And that's something really worth keeping in mind, because for many, many applications, mobile apps, it's a question of finding this right balance between incentivising people to do something and getting data out of it. If it's just about collecting data, nobody would actually do it in the first place, because it's obviously also a huge level of intrusion into personal privacy. So these are answers to your question. And the short answer, again, is we don't know, or we can't be sure that the data set is perfect.

6:28 CHANUKI SERESINHE: So, thanks. So finally, people are really concerned about the ethics regarding some of the research, particularly Facebook. Do you think Facebook should be allowed to experiment with us like they did? I mean, what is your take? Like, where do you draw the line between what is good, an ethical use of big data, or research that is not harmful?

6:46 SUZY MOAT: That's another really important question. I think it highlights that there are so many questions around ethics with big data, that we as a society still need to find the answers to. So, you know, I think all three of us, we've done quite a lot of public events around the question of big data and data-science-related questions in this area. And sometimes you speak to people there and say, well, what would you be happy for your data to be used for? And if you suggest, OK, well, for benefit of society as a whole, you know, how can this address societal problems? People will normally say, yeah, no, that sounds good.

7:28 I'd be happy for my data to be used for that. If you say companies making money from your data, people are normally not very enthusiastic about that at all. And research is normally somewhere in the middle. But if you look at the situations in which people actually give up their data, we see that people are actually giving lots and lots of data to companies. But as soon as we start talking about government, for example, people become a lot more cautious.

8:00 And I think that probably part of the reason for that apparent mismatch between what people have been saying they'd be happy with and what really happens in terms of people giving up their data is that there's a change in focus between people thinking about what they'd like their data to be used for and a sense of who's looking at my data. So if you're interacting with a company like Google or Facebook, you're probably not thinking, I think a lot of people are perhaps not thinking about them taking the data. They're thinking about the service they're getting back, whereas, you know, questions about government, kind of bigger societal questions, it's not something we see the benefits of so quickly.

8:38 And so your focus moves to, oh, you know, somebody's got my data. So I think if we want to answer questions like this, we really have to be aware of the fact that actually, companies are experimenting on us all the time. You know, if you think of Amazon, you think of Netflix, you think of indeed Facebook, they're frequently changing what we see and hoping to get something from us in response. They're trying to see, OK, can they encourage us to spend more money? Or can they encourage us to spend more time on their sites? So maybe the question should be, you know, what sort of price are we...

9:17 or what sort of compensation do we expect for use of our data? Is the benefit that we're getting from services, such as Facebook or Netflix or whoever, is that OK? Is that something where we think, OK, I'm happy to give up my data for that? And are there... if so, are there equivalent questions about progressing our understanding of society or benefiting society as a whole, as many governmental operations are really... the underlying aim is to try and make society work better. So, and if we say, well, actually, no, I don't want people to experiment on us at all, you know, I don't think companies should be doing that, then how do we actually enforce that?

10:03 Is it something where we say, OK, we should prohibit the companies from doing this? Or do we just say, OK, well, if you don't want to be experimented on, then you shouldn't use Amazon, or you shouldn't use Facebook, or you shouldn't use Netflix, or whichever service you want to name? So I think these questions about business, research, government, they're questions we need to answer in parallel. We need to think about how the answers to each of these different kinds of data usage, how they fit together. And I think it's perhaps worth remembering that in some cases, we do become much more aware of our data being used.

10:40 So, you know, for example, while there are questions that need to be addressed about different research studies, something which is nearly always true is that the results of research studies will be made publicly available. Often, these days, there's not a paywall in place. The general public can see the results. And that's not normally the case with companies. Normally that happens when we don't know about it. So I suspect that "it's fine if I don't know about it, but I'm not happy if I do" is not enough. We probably want a more nuanced answer than that. So I'm not sure I know the answer.

11:19 I think the point is that it's really something society needs to reflect on, and we need to work together to try and work out where these lines are, and what we're happy with and what we're not.

11:39 TOBIAS PREIS: There's another exciting week ahead of us. We will dig into, or all of us, and learners will dig into disasters and mobile data. We will look into the digital traces around Hurricane Sandy, and we will work out whether we can find a way to use mobile phone data to measure how large a crowd is. I mean, very exciting applications which we will discuss in that week.

Week 7 round-up

In Week 7, we began to explore how big data might help us measure and improve our happiness.

Here’s a brief summary to help you prepare for Week 8.

You learned how George MacKerron created a smartphone app, Mappiness, to find out where and when people are happy all around the UK. You heard about a Facebook study which investigated whether emotions are contagious, by manipulating what Facebook users saw on their news feed. Thore Graepel also demonstrated that what we “like” on Facebook might give away all sorts of information about our personality, from how intelligent we are, to how satisfied we are with our lives.

These studies again raised important issues about privacy and the ethics of big data. It was great to read all your comments on where you thought the line should be drawn between what is acceptable for government and businesses to do and what you thought may be a step too far.

Finally, you started analysing Google Trends data in R and RStudio. You calculated the Future Orientation Index for the UK in 2012 yourselves. Well done!
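
As a reminder of that calculation, here is a minimal sketch, written in Python rather than the R you used in the course. It assumes the index is defined, as in the research by Preis, Moat and colleagues, as the ratio of Google search volume for the coming year ("2013") to search volume for the previous year ("2011"), both measured during 2012. The function name and the input values are hypothetical and are not real Google Trends figures.

```python
def future_orientation_index(next_year_volume, previous_year_volume):
    """Ratio of search volume for the coming year to that for the previous year.

    Assumption: both volumes were measured over the same period (for example
    searches for "2013" and "2011" made during 2012) and on the same scale.
    """
    if previous_year_volume <= 0:
        raise ValueError("previous_year_volume must be positive")
    return next_year_volume / previous_year_volume


# Hypothetical volumes for illustration only, not real Google Trends data.
example_foi = future_orientation_index(next_year_volume=60.0,
                                       previous_year_volume=50.0)
print(round(example_foi, 2))
```

A value above 1 would suggest that users searched more for the coming year than for the previous one.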

This week, we move on to understanding how big data can help us measure where people are. Have a great week!