By the Numbers: 10 online sample integrity tips

Article Abstract

From inspecting IP and e-mail addresses to post-survey telephone interviews, here are strategies for analyzing the quality of online survey samples.

Editor's note: Jerry Savage is the proprietor of Seattle-based ResearchSight.

The widespread availability of online survey technology has opened important doors for product developers and marketing managers who seek to develop customer-centric strategies that will help them succeed in a challenging marketplace. However, the Internet is still a largely uncharted social space that harbors the well-intentioned as well as others who participate in surveys merely to make money or to get inside information on new product and marketing concepts. Indeed, the Internet could be likened to the Wild West of primary market research: a place of great opportunity and innovation, yet one which also poses many unique challenges.

It seems fair to conclude that there are two critical elements of any quality survey: 1) a sample that is representative of the population being studied and 2) survey questions or “items” that garner valid and reliable responses from participants. The quality of the sample is arguably the more important of the two and it is precisely this issue which poses some of the biggest challenges in online survey research. While online surveys using established consumer panels as a sample source are an enticing option for companies that want to survey various segments of the marketplace, cheaters (i.e., individuals who lie in surveys to acquire an incentive or get information) can pose a serious threat to data integrity and undermine the generalizability of results.

As such, questions of sample integrity are at the core of methodological questions about how to use the Internet to acquire accurate and actionable business intelligence. While not all companies are faced with exactly the same questions, there are strategies that can be applied by most analysts and managers when analyzing the integrity of online survey samples:

1. Be diligent. The first step to ensuring sample integrity in any online survey is committing to a diligent review of all responses. While automated checks that catch straight-line and other patterned responses can be useful, it is important that each survey response be manually inspected to help ensure the study includes only respondents who provide honest answers and are representative of the population under study. While diligent analysis at this level can be tedious, the time spent inspecting data is well worth it.
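As a rough illustration of the automated side of this review, the sketch below flags "straight-line" respondents, meaning those who give the identical answer to every item in a rating grid. The respondent IDs, the answers and the `min_items` threshold are hypothetical, and a flag is only a prompt for the manual inspection described above, not a verdict.

```python
def is_straight_line(answers, min_items=5):
    """Return True when a rating grid has at least min_items answers and all are identical.

    min_items is an assumed threshold: very short grids can legitimately
    produce identical answers, so they are not flagged.
    """
    return len(answers) >= min_items and len(set(answers)) == 1


# Hypothetical responses to a six-item rating grid (1-5 scale).
responses = {
    "r001": [3, 3, 3, 3, 3, 3],
    "r002": [4, 2, 5, 3, 4, 1],
}

# Collect respondent IDs that deserve a closer manual look.
flagged = [rid for rid, answers in responses.items() if is_straight_line(answers)]
print(flagged)
```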

2. Predefine sample parameters. While achieving a truly random sample of any large population is often difficult or impossible due to typically low response rates in surveys as well as time and budget constraints, it is possible to gather a sample which matches some known population parameters. When possible, use extant data such as the census or sales data to create sample plans. This helps ensure that the final sample is representative of at least some key parameters of the population being studied, thereby reducing sampling error.

3. Carefully develop screening criteria. Screening questions are measures at the beginning of surveys that are targeted to specific market segments. Such questions are used to determine whether a potential respondent is appropriate for the study and, in some cases, what questions they should be asked. For key screening criteria, it is sometimes useful to include “dummy levels” in survey questions so that it’s not easy for would-be cheaters to guess which response must be selected to proceed through the survey. For instance, if you wanted only people who use a certain model of car in your survey, you could include five models and terminate everyone who did not select the model of interest. This helps ensure that the final sample is representative of the market segment being studied.

4. Inspect IP addresses. Determine whether respondents have common IP addresses or are from a common subnet. Depending on the nature of the panel or list being used to field the survey, respondents who come from a common IP address or subnet should probably be flagged as suspicious in the database. Suspicious IP addresses can be looked up on the Internet to determine the country of origin and that information can be especially important when conducting international surveys.
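One possible way to run this check is sketched below using Python's standard `ipaddress` module. It groups respondents by subnet and reports any subnet shared by more than one respondent. The respondent IDs and addresses are made up (they come from ranges reserved for documentation), and treating a /24 block as "a common subnet" is an assumption; the right prefix length depends on the panel and on how its members connect.

```python
import ipaddress
from collections import defaultdict


def group_by_subnet(respondent_ips, prefix_len=24):
    """Map each subnet to its respondents, keeping only subnets seen more than once.

    prefix_len=24 is an assumed definition of "common subnet" for IPv4 addresses.
    """
    groups = defaultdict(list)
    for rid, ip in respondent_ips.items():
        # strict=False lets us pass a host address and get its enclosing network.
        net = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        groups[str(net)].append(rid)
    return {net: ids for net, ids in groups.items() if len(ids) > 1}


# Hypothetical respondents and IPv4 addresses.
respondent_ips = {
    "r001": "192.0.2.10",
    "r002": "192.0.2.77",
    "r003": "198.51.100.4",
}

shared = group_by_subnet(respondent_ips)
print(shared)
```

Respondents on a shared subnet are not necessarily cheaters (a workplace or a university can put many honest people behind one address block), so the output is best read as a list of cases to review.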

5. Conduct time-based integrity checks. Quality online survey tools allow one to determine the number of seconds it took each respondent to complete a survey. Cheaters tend to complete surveys much faster than authentic respondents. After all, they are often only taking the survey to make money. If a given respondent took the survey in less than the median number of seconds for the sample as a whole, that person should be flagged as suspicious. Note that some respondents also take an unusually long time to complete a survey simply because they are preoccupied with other tasks. These people may be qualified respondents who are simply busy but they should be excluded as outliers from analyses of time-to-completion.
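The time check can be sketched along the following lines. Unusually long completion times are first set aside, then the median of the remaining times is computed and anyone faster than the cutoff is flagged. The completion times, the one-hour `max_seconds` cap and the `fraction` parameter are all assumptions for illustration: `fraction=1.0` reproduces the below-the-median rule described above, while a smaller value gives a stricter cutoff that flags fewer people.

```python
from statistics import median


def flag_fast(times, max_seconds=3600, fraction=1.0):
    """Return sorted IDs of respondents faster than fraction * median completion time.

    Times above max_seconds (an assumed cap) are treated as outliers and left
    out of the median, but those respondents are not flagged as fast.
    """
    typical = [t for t in times.values() if t <= max_seconds]
    cutoff = fraction * median(typical)
    return sorted(rid for rid, t in times.items() if t < cutoff)


# Hypothetical completion times in seconds; r005 left the survey open for hours.
times = {"r001": 95, "r002": 410, "r003": 380, "r004": 450, "r005": 9000}

print(flag_fast(times))                # rule as stated: below the median
print(flag_fast(times, fraction=0.5))  # stricter: below half the median
```

Because a below-the-median rule flags roughly half of any sample by construction, it works best in combination with the other indicators in this list rather than as grounds for removal on its own.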

6. Inspect e-mail addresses. Cheaters sometimes have unusual e-mail addresses from domains like yahoo.com or hotmail.com. (Consider the e-mail address ddfytc@yahoo.com.) Suspicious e-mail addresses should generally be removed from the panel and/or dataset if the e-mail address is associated with a common suspect subnet, odd verbatims or response times that fall below the median time-to-survey-completion measure.

7. Gather qualitative data. Cheaters often give very short responses to open-ended questions and/or responses that would not be given by a qualified respondent. Bear in mind that past studies have shown respondents to online surveys tend to provide longer, more detailed responses than those responding to telephone surveys. Short responses that don’t seem relevant to the study are an important indicator of cheating behavior.

8. Telephone control. While online surveys are a far more efficient and flexible way to do survey research, telephone surveys offer an important advantage in that interviewers are able to speak with respondents and ensure they are indeed part of the target population. When conducting a study of the general population or a large market segment using the Internet, develop a telephone control and use this as a benchmark in analysis. This can be challenging at times but is an important element of an effective sample integrity strategy. For instance, in a study of 500 consumers, 50 interviews could be done by phone to help ensure that all questions are being properly interpreted by respondents and that segments of the population who are under-represented in online panels are properly represented in the final sample. That control – especially when combined with other controls developed with reliable extant data – can be used when developing sample plans and weighting strategies. This multi-method approach to data collection helps ensure that the final sample is generalizable.
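To make the weighting idea concrete, here is a minimal sketch of one common approach, simple post-stratification, in which each group's weight is its benchmark share divided by its share of the online sample. This is an illustration rather than a method prescribed here. The age groups, the sample counts and the benchmark shares (which might come from a telephone control, census figures or other extant data) are entirely hypothetical.

```python
def poststrat_weights(sample_counts, benchmark_shares):
    """Return a weight per group: benchmark share / observed sample share."""
    total = sum(sample_counts.values())
    return {
        group: benchmark_shares[group] / (count / total)
        for group, count in sample_counts.items()
    }


# Hypothetical online sample of 500 respondents, by age group.
sample_counts = {"18-34": 250, "35-54": 200, "55+": 50}

# Hypothetical benchmark shares for the same groups (must sum to 1).
benchmark_shares = {"18-34": 0.30, "35-54": 0.40, "55+": 0.30}

weights = poststrat_weights(sample_counts, benchmark_shares)
for group, weight in weights.items():
    print(group, round(weight, 2))
```

Groups that are under-represented online receive weights above 1 and over-represented groups receive weights below 1. Very large weights are a sign that a group's sample is too thin to carry the estimate, which is an argument for fixing the sample plan rather than leaning on weighting.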

9. Post-survey telephone interviews of online respondents. Conduct brief, semi-structured telephone interviews with a random sample of online respondents after they have taken the survey. This process can yield useful data and also allows one to spot problems that may have occurred during the course of online data collection.

10. Careful interpretation. While market analysts and research firms tend to tout the “latest and greatest” in statistical modeling, customers don’t always think and behave in ways that can be understood with statistical models. A careful analysis of survey data that includes input from researchers, managers and stakeholders – together with a commitment to putting the customer at the center of development efforts – is a key element of developing an accurate and actionable interpretation of survey results.

Ever more important

Without a doubt, online survey research will continue to grow in the future and will provide many more companies with the opportunity to get to know their customers and create better relationships with them. As the use of online survey technology becomes more widespread, it will become ever more important that researchers and managers take steps to maintain the integrity of survey samples. The work can be tedious but in the long run – especially in highly competitive consumer markets – it’s well worth the effort.
