We will focus specifically on how survey data can be enhanced using social media data (for example by creating new measures, validating survey estimates or improving non-response adjustments) and on how social media data can be validated using survey data. However, we are aware that such a dataset has greater potential than this, so we are also thinking about the ethics and practicalities of making it more widely available.

This will no doubt be tricky – as far as we are aware it is unprecedented to attempt to link data in this way and make it available to a wider set of researchers, and it is therefore difficult to predict what issues may arise. We therefore aim to be as open as possible about these issues. We will document the choices we make so that others can learn from any mistakes, and we would also like to consult the wider research community at key points in the process:

Consent to data linkage

Social media data collection/linking to survey data

Data archiving

We are currently at the first stage: asking for consent to link participants’ survey data and Twitter data. By asking for consent we are, to an extent, going beyond what many social media researchers do, but linking to survey data and aiming to archive the result changes the dynamic somewhat.

There are constraints on what we can do: the survey will be administered in web, telephone or face-to-face modes, so the process must work in all three. Questionnaire space is also limited, so we cannot add any more questions, and we need to consider the burden on participants, since a large amount of information may overwhelm them and leave them less informed.

Below, I have outlined the template for the three questions we would like to ask. We are proposing to use ‘help links’ in the questionnaire so that participants completing it online can find out more if they wish, and so that interviewers can answer questions in the interviewer-administered modes:

Q1 [Ask All]
Do you have a personal Twitter account?
1. Yes
2. No

Q2 [IF Q1 = Yes]
We are interested in being able to link people’s answers to this survey to the ways in which they use Twitter. We would also like to know who uses Twitter.
We will not use your tweets to identify you in any way and your Twitter information will be treated as confidential and given the same protections as your interview data. Your Twitter name, and any information that would allow you to be identified would not be published.
HELP SCREEN: What data will you collect from my Twitter account?
HELP SCREEN: What will the data be used for?
HELP SCREEN: Who will be able to access the linked data?
HELP SCREEN: What will you do to protect my data?
Are you willing to tell me the name of your personal Twitter account and for your Twitter information to be linked with your answers to this survey?
1. Yes
2. No

We would really appreciate any feedback on what information we might include in these help links, or on how we might change the question wording or administration. If you have ideas that may not be feasible in this context, we would still like to hear them so that we can document them for others who may want to do this in the future.

If you have any suggestions, or would like to discuss this further, please do contact me at curtis.jessop@natcen.ac.uk. As we need to submit our final version of the question text to ISER by mid-September, please do try to get any comments to me by the end of August.

6 comments:

I think your only hurdle here will be how you want to share your Twitter data with other researchers. If you share the data in some aggregate format (descriptives, linguistic analysis, etc.) then I don't see any problems here. But if you share their raw tweets (which could be used to google the account name and link a person to their survey data) then I think you'd have some ethical problems. Also, will they have a mechanism to opt out later and knowledge of your analysis of their tweets each time you are doing it? Just my thoughts. And sounds like a really interesting study!

Ah, it's unfortunate that private businesses do not publish their research and methodologies as often as academic and social research organisations do. Combining Twitter data with survey/panel data has been tested by numerous survey companies over the last 6 years.

The end conclusion is often that people don't talk enough about the desired topic in social media to warrant all the time and money required to collect the social media data. For instance, if the survey is about politics, the Twitter data may reveal one or two political posts, neither of which contains enough data to be useful.

But it would be great to have some published data on the methodology so that other researchers can see hit rates and success rates.
