Tag Archives: AAPOR

My biggest takeaway from this year’s conference is that AAPOR is a very healthy organization. AAPOR attendees were genuinely happy to be at the conference, enthusiastic about AAPOR and excited about the conference material. Many participants consider AAPOR their intellectual and professional home base and really relished the opportunity to be around kindred spirits (often socially awkward professionals who are genuinely excited about our niche). All of the presentations I saw firsthand or heard about were solid and dense, and the presenters were excited about their work and their findings. Membership, conference attendance, journal and conference submissions and volunteer participation are all quite strong.

At this point in time, the field of survey research is encountering a set of challenges: nonresponse is growing, and other forms of data and analysis are increasingly in vogue. I was really excited to see that AAPOR members are meeting these and other challenges head on. For this particular write-up, I will focus on these two challenges. I hope that others will address some of the other main conference themes and add their notes and resources to those I’ve gathered below.

As survey nonresponse becomes more of a challenge, survey researchers are moving from traditional measures of response quality (e.g. response rates) to newer measures (e.g. nonresponse bias). Researchers are increasingly anchoring their discussions about survey quality within the Total Survey Error framework, which offers a contextual basis for understanding the problem more deeply. Instead of focusing on an across-the-board rise in response rates, researchers are strategically targeting their resources with the goal of reducing nonresponse bias. This includes understanding response propensity (Who is likely not to respond to the survey? Who is most likely to drop out of a panel study? What are some of the barriers to survey participation?), looking for substantive measures that correlate with response propensity (e.g. Are small, rural private schools less likely to respond to a school survey? Are substance users less likely to respond to a survey about substance abuse?), and continuously monitoring paradata during the collection period (e.g. developing differential strategies by disposition code, assigning the most successful interviewers to the most reluctant cases, or concentrating collection strategies where they are expected to be most effective). This area of strategy emerged in AAPOR circles a few years ago with discussions of nonresponse propensity modeling, a process that is far more accessible than it sounds, and it has evolved into a practical and useful tool that can help a research shop of any size increase survey quality and lower costs.
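To make the propensity idea concrete, here is a minimal sketch (the frame variables, coefficients, and school names are all invented for illustration, not from any presentation): once a logistic model of response propensity has been fit, each sample member gets a predicted probability of responding, and the lowest-propensity cases can be targeted for extra effort.

```python
import math

def response_propensity(case, coefs, intercept):
    """Predicted probability of response from a (pre-fit) logistic model.
    The coefficients here are made up for illustration."""
    z = intercept + sum(coefs[k] * v for k, v in case.items())
    return 1 / (1 + math.exp(-z))

# Hypothetical frame variables: 1 = small school, 1 = rural, 1 = private
coefs = {"small": -0.8, "rural": -0.5, "private": -0.6}
intercept = 1.2

frame = {
    "school_A": {"small": 1, "rural": 1, "private": 1},
    "school_B": {"small": 0, "rural": 0, "private": 0},
    "school_C": {"small": 1, "rural": 0, "private": 0},
}

# Rank sample members by predicted propensity; the lowest-propensity
# cases get the extra follow-up effort (or the best interviewers).
ranked = sorted(frame, key=lambda s: response_propensity(frame[s], coefs, intercept))
print(ranked)  # lowest propensity first
```

The point is not the particular model but the workflow: the prediction happens before (and during) fielding, so effort can be allocated where nonresponse bias is most likely to come from.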

Another big takeaway for me was the volume of discussions and presentations that spoke to the fast-emerging world of data science and big data. Many people spoke of the importance of our voice in the realm of data science, particularly with our professional focus on understanding and mitigating errors in the research process. A few practitioners applied error frameworks to analyses of organic data, and some talks were based on analyses of organic data. This year AAPOR also sponsored a research hack to investigate the potential for Instagram as a research tool for Feed the Hungry. These discussions, presentations and activities made it clear that AAPOR will continue to have a strong voice in the changing research environment, and the task force reports and initiatives from both the membership and education committees reinforced AAPOR’s ability to be right on top of the many changes afoot. I’m eager to see AAPOR’s changing role take shape.

“If you had asked social scientists even 20 years ago what powers they dreamed of acquiring, they might have cited the capacity to track the behaviors, purchases, movements, interactions, and thoughts of whole cities of people, in real time.” – N.A. Christakis. 24 June 2011. New York Times, via Craig Hill (RTI)

AAPOR is a very strong, well-loved organization, and it is building a very strong future from a very solid foundation.

MORE DETAILED NOTES:

This conference is huge, and I could not possibly cover all of it on my own, so I will share my notes along with the notes and resources I can collect from other attendees. If you have any materials to share, please send them to me! The more information I am able to collect here, the better a resource it will be for people interested in AAPOR or the conference.

Patrick Ruffini assembled the tweets from the conference into this Storify.

Annie, the blogger behind LoveStats, had quite a few posts from the conference. I sat on a panel with Annie on the role of blogs in public opinion research (organized by Joe Murphy for the 68th annual AAPOR conference), and Annie blew me away by live-blogging the event from the stage! Clearly, she is the fastest blogger in the West and the East! Her posts from Anaheim included:

My full notes are available here (please excuse any formatting irregularities). Unfortunately, they are not as extensive as I would have liked, because wifi and power were in short supply. I also wish I had settled into a better seat and covered some of the talks in greater detail, including Don Dillman’s talk, which was a real highlight of the conference!

I believe Rob Santos’ professional address will be available for viewing or listening soon, if it is not already available. He is a very eloquent speaker, and he made some really great points, so this will be well worth your time.

Data cleaning has a bad rep. In fact, it has long been considered the grunt work of the data analysis enterprise. I recently came across a piece of writing in the Harvard Business Review that lamented the amount of time data scientists spend cleaning their data. The author feared that data scientists’ skills were being wasted on the cleaning process when they could be using their time for the analyses we so desperately need them to do.

I’ll admit that I haven’t always loved the process of cleaning data. But my view of the process has evolved significantly over the last few years.

As a survey researcher, my cleaning process used to begin with a tall stack of paper forms. Answers that did not make logical sense during the checking process sparked a trip to the file folders to find the form in question. The forms often held physical evidence of indecision on the part of the respondent, such as eraser marks or an explanation in the margin, which could not have been reflected properly by the data entry person. We lost this part of the process when we moved to web surveys. It sometimes felt like a web survey left the respondent no way to communicate with the researcher about their unique situations. Data cleaning lost its personalized feel and detective-story luster and became routine and tedious.

Despite some of the affordances of the move to web surveys, much of the cleaning process stayed rooted in the old techniques. Each form has its own id number, and the programmers would use those id numbers for corrections:

if id=1234567, set var1=5, set var7=62

At this point a “good programmer” would also document the changes for future collaborators:

*this person was not actually a forest ranger, and they were born in 1962
if id=1234567, set var1=5, set var7=62

Making these changes grew tedious very quickly, and the process seemed to drag on for ages. The researcher would check the data for potential errors, scour the records that could hold those errors for any kind of evidence of the respondent’s intentions, and then handle each form one at a time.

My techniques for cleaning data have changed dramatically since those days. My goal now is to use id numbers as rarely as possible, asking instead questions like “how can I tell that these people are not forest rangers?” The answers to these questions evoke a subtly different technique:

* these people are not actually forest rangers
if var7=35 and var1=2 and var10 contains ‘fire fighter’, set var1=5

This technique requires honing and testing (adjusting the precision and recall), but I’ve found it to be far faster, more comprehensive and, most of all, more fun (oh hallelujah!). It makes me wonder whether we have perpetually undercut the quality of our data cleaning simply because we hold the process in such low esteem.
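The same rule reads almost verbatim in a scripting language. Here is a minimal pure-Python sketch (the records, variable names, and codes are hypothetical, mirroring the pseudocode above):

```python
# Hypothetical records: var1 = occupation code (2 = forest ranger,
# 5 = fire fighter), var7 = age, var10 = open-ended occupation text.
records = [
    {"id": 1234567, "var1": 2, "var7": 35, "var10": "fire fighter, city dept"},
    {"id": 7654321, "var1": 2, "var7": 41, "var10": "park ranger"},
]

def fix_misclassified_rangers(rows):
    """These people are not actually forest rangers: recode var1 when
    the open-ended text contradicts the coded answer."""
    for row in rows:
        if row["var7"] == 35 and row["var1"] == 2 and "fire fighter" in row["var10"]:
            row["var1"] = 5
    return rows

fix_misclassified_rangers(records)
```

The rule touches every record that matches the evidence, not just one id at a time, which is exactly where the efficiency gain comes from.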

So far I have not discussed data cleaning for other types of data. I’m currently working on a corpus of Twitter data, and I don’t see much of a difference in the cleaning process. The data types and programming statements I use are different, but the process is very close. It’s an interesting and challenging process that involves detective work, a deepening understanding of the intricacies of the dataset, a growing set of programming skills, and a growing familiarity with the natural language use in your dataset. The process mirrors the analysis to such a degree that I’m not really sure why it would be such a bad thing for analysts to be involved in data cleaning.
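For example, a first cleaning pass over tweets is the same kind of rule writing, just with regular expressions. A minimal sketch (the normalization choices here are mine, not a standard):

```python
import re

def clean_tweet(text):
    """Normalize a tweet for analysis: strip URLs and user mentions,
    keep hashtag words, unify whitespace, lowercase."""
    text = re.sub(r"https?://\S+", "", text)   # drop links
    text = re.sub(r"@\w+", "", text)           # drop mentions
    text = text.replace("#", "")               # keep hashtag words
    return re.sub(r"\s+", " ", text).strip().lower()

print(clean_tweet("RT @aapor: Great #AAPOR13 talk! http://example.com"))
```

Each rule encodes a decision about what counts as signal in the corpus, which is why this step shades so quickly into analysis.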

I’d be interested to hear what my readers have to say about this. Is our notion of the value and challenge of data cleaning antiquated? Is data cleaning a burden that an analyst should bear? And why is there so little talk about data cleaning, when we could all stand to learn so much from each other in the way of data structuring code and more?

It’s always fun for a professional survey researcher to stumble upon a great pop-cultural reference to a survey. Yesterday I heard a great description of a census taken at Kakuma refugee camp in Kenya. The description was in the book I’m currently reading: What Is the What by Dave Eggers (great book, I highly recommend it!). The book itself is fiction, loosely based on a true story, so this account likely stems from a combination of observation and imagination. The account reminds me of some of the field reports and ethnographic findings in other intercultural survey efforts, both national (the US census) and inter- or multinational.

To set the stage, Achak is the main character and narrator of the story. He is one of the “lost boys” of Sudan, and he found his way to Kakuma after a long and storied escape from his war-ravaged hometown. At Kakuma he was taken in by another Dinka man, named Gop, who acts as a kind of father to Achak.

What is the What by Dave Eggers

“The announcement of the census was made while Gop was waiting for the coming of his wife and daughters, and this complicated his peace of mind. To serve us, to feed us, the UNHCR and Kakuma’s many aid groups needed to know how many refugees were at the camp. Thus, in 1994 they announced they would count us. It would only take a few days, they said. To the organizers I am sure it seemed a very simple, necessary, and uncontroversial directive. But for the Sudanese elders, it was anything but.

—What do you think they have planned? Gop Chol wondered aloud.

I didn’t know what he meant by this, but soon I understood what had him, and the majority of Sudanese elders, greatly concerned. Some learned elders were reminded of the colonial era, when Africans were made to bear badges of identification on their necks.

—Could this counting be a pretext of a new colonial period? Gop mused.—It’s very possible.
Probable even!

I said nothing.

At the same time, there were practical, less symbolic, reasons to oppose the census, including the fact that many elders imagined that it would decrease, not increase, our rations. If they discovered there were fewer of us than had been assumed, the food donations from the rest of the world would drop. The more pressing and widespread fear among young and old at Kakuma was that the census would be a way for the UN to kill us all. These fears were only exacerbated when the fences were erected.

The UN workers had begun to assemble barriers, six feet tall and arranged like hallways. The fences would ensure that we would walk single file on our way to be counted, and thus counted only once. Even those among us, the younger Sudanese primarily, who were not so worried until then, became gravely concerned when the fences went up. It was a malevolent-looking thing, that maze of fencing, orange and opaque. Soon even the best educated among us bought into the suspicion that this was a plan to eliminate the Dinka. Most of the Sudanese my age had learned of the Holocaust, and were convinced that this was a plan much like that used to eliminate the Jews in Germany and Poland. I was dubious of the growing paranoia, but Gop was a believer. As rational a man as he was, he had a long memory for injustices visited upon the people of Sudan.

—What isn’t possible, boy? he demanded.—See where we are? You tell me what isn’t possible at this time in Africa!

But I had no reason to distrust the UN. They had been feeding us at Kakuma for years. There was not enough food, but they were the ones providing for everyone, and thus it seemed nonsensical that they would kill us after all this time.

—Yes, he reasoned,—but see, perhaps now the food has run out. The food is gone, there’s no more money, and Khartoum has paid the UN to kill us. So the UN gets two things: they get to save food, and they are paid to get rid of us.

—But how will they get away with it?

—That’s easy, Achak. They say that we caught a disease only the Dinka can get. There are always illnesses unique to certain people, and this is what will happen. They’ll say there was a Dinka plague, and that all the Sudanese are dead. This is how they’ll justify killing every last one of us.
—That’s impossible, I said.

—Is it? he asked.—Was Rwanda impossible?

I still thought that Gop’s theory was unreliable, but I also knew that I should not forget that there were a great number of people who would be happy if the Dinka were dead. So for a few days, I did not make up my mind about the head count. Meanwhile, public sentiment was solidifying against our participation, especially when it was revealed that the fingers of all those counted, after being counted, would be dipped in ink.

—Why the ink? Gop asked. I didn’t know.

—The ink is a fail-safe measure to ensure the Sudanese will be exterminated.

I said nothing, and he elaborated. Surely if the UN did not kill us Dinka while in the lines, he theorized, they would kill us with this ink on the fingers. How could the ink be removed? It would, he thought, enter our bodies when we ate.

—This seems very much like what they did to the Jews, Gop said.

People spoke a lot about the Jews in those days, which was odd, considering that a short time before, most of the boys I knew thought the Jews were an extinct race. Before we learned about the Holocaust in school, in church we had been taught rather crudely that the Jews had aided in the killing of Jesus Christ. In those teachings, it was never intimated that the Jews were a people still inhabiting the earth. We thought of them as mythological creatures who did not exist outside the stories of the Bible. The night before the census, the entire series of fences, almost a mile long, was torn down. No one took responsibility, but many were quietly satisfied.

In the end, after countless meetings with the Kenyan leadership at the camp, the Sudanese elders were convinced that the head count was legitimate and was needed to provide better services to the refugees. The fences were rebuilt, and the census was conducted a few weeks later. But in a way, those who feared the census were correct, in that nothing very good came from it. After the count, there was less food, fewer services, even the departure of a few smaller programs. When they were done counting, the population of Kakuma had decreased by eight thousand people in one day.

How had the UNHCR miscounted our numbers before the census? The answer is called recycling.

Recycling was popular at Kakuma and is favored at most refugee camps, and any refugee anywhere in the world is familiar with the concept, even if they have a different name for it. The essence of the idea is that one can leave the camp and re-enter as a different person, thus keeping his first ration card and getting another when he enters again under a new name. This means that the recycler can eat twice as much as he did before, or, if he chooses to trade the extra rations, he can buy or otherwise obtain anything else he needs and is not being given by the UN—sugar, meat, vegetables. The trading resulting from extra ration cards provided the basis for a vast secondary economy at Kakuma, and kept thousands of refugees from anemia and related illnesses. At any given time, the administrators of Kakuma thought they were feeding eight thousand more people than they actually were. No one felt guilty about this small numerical deception.

The ration-card economy made commerce possible, and the ability of different groups to manipulate and thrive within the system led soon enough to a sort of social hierarchy at Kakuma.”

If relevant data are available, a $20 incentive can be tailored to likely nonrespondents

Evaluating race & Hispanic origin questions- this was a big theme over the course of the conference. The social constructedness of racial/ethnic identity doesn’t map well to survey questions. This Census study found changes in survey answers based on context, location, social position, education, ambiguousness of phenotype, self-perception, question format, census tract, and proxy reports. It also found a high number of missing answers.

Session 4: Adaptive Design in Government Surveys

A potpourri of quotes from this session that caught my eye:

Re: Frauke Kreuter “the mother of all paradata”
Peter Miller: “Response rates is not the goal”
Robert Groves: “The way we do things is unsustainable”

Response rates are declining, costs are rising

Create a dashboard that works for your study. Include the relevant variables you need in order to have a decision-making tool that is tailored, dynamic, and data-based

STOP data collection at a sensible point, before your nonresponse bias starts to grow exponentially and before you waste money on expensive interventions that can actually make your data less representative

Interviewer paradata

Choose facts over inference

Presence or absence of key features (e.g. ease of access, condition of property)

(for a phone survey, these would probably include presence or absence of answer or answering mechanism, etc.)

For a household survey, household factors more helpful than neighborhood factors

Three kinds of adaptive design

Fixed design (ok, this is NOT adaptive)- treat all respondents the same

When choosing a sample for cognitive interviews, focus on the people who tend to have the problems you’re investigating. Otherwise, the likelihood of choosing someone with the right problems is quite low

AIR experiment: cognitive interviews by phone

Need to use more skilled interviewers by phone, because more probing is necessary

Awkward silences more awkward without clues to what respondent is doing

Hard to evaluate graphics and layout by phone

When sharing a screen, interviewer should control mouse (they learned this the hard way)

On the plus side: more convenient for interviewee and interviewer, interviewers have access to more interviewees, and data quality is similar, or at least good enough

Try Skype or something?

Translation issues (much of the cognitive testing centered around translation issues- I’m not going into detail with them here, because these don’t transfer well from one survey to the next)

Education/international/translation: They tried to assign equivalent education groups and reflect those equivalences in the question, but when respondents didn’t agree with the equivalences suggested to them, they didn’t follow the questions as written

Poster session

One poster was laid out like Candy Land. Very cool, but people stopped by more to make jokes than substantive comments

One poster had signals from interviews that the respondent would not cooperate, or 101 signs that your interview will not go smoothly. I could see posting that in an interviewer break room…

Session 8: Identifying and Repairing Measurement and Coverage Errors

Health care reform survey: people believe what they believe in spite of the terms and definitions you supply

Paraphrased Groves (1989:449) “Although survey language can be standardized, there is no guarantee that interpretation will be the same”

Politeness can be a big barrier in interviewer/respondent communication

Reduce interviewer rewording

Be sure to bring interviewers on board with project goals (this was heavily emphasized on AAPORnet while we were at the conference: the importance of interviewer training, valuing the work of the interviewers, making sure the interviewers feel valued, collecting interviewer feedback, restrategizing during the fielding period, and debriefing the interviewers after the fielding period is done)

We are coming to a new point of understanding with some of the more recent developments in survey research. For the first time in recent memory, the specter of limited budgets loomed large. Researchers weren’t just asking “How can I do my work better?” but “How can I target my improvements so that my work can be better, faster, and less expensive?”

Session 1: Understanding and Dealing with Nonresponse

Researchers have been exploring the potential of nonresponse propensity modeling for a while. In the past, nonresponse propensities were used as a way to cut down on bias and draw samples that should yield a more representative response group.

In this session, nonresponse propensity modeling was seen as a way of helping to determine a cutoff point in survey data collection.

Any data on mode propensity for individual respondents (in longitudinal surveys) or groups of respondents can be used to target people in their likely best mode from the beginning, instead of treating all respondents to the same mailing strategy. This can drastically reduce field time and costs.
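As a toy illustration of that kind of mode targeting (the respondent ids and propensities below are invented), each sample member simply starts in whichever mode they have historically been most likely to answer:

```python
# Hypothetical per-respondent response propensities by mode, e.g.
# estimated from earlier waves of a longitudinal survey.
mode_propensity = {
    "resp_01": {"mail": 0.15, "web": 0.60, "phone": 0.30},
    "resp_02": {"mail": 0.50, "web": 0.10, "phone": 0.25},
}

def first_contact_mode(propensities):
    """Assign each respondent the mode they are most likely to answer,
    instead of sending everyone the same first mailing."""
    return {r: max(modes, key=modes.get) for r, modes in propensities.items()}

print(first_contact_mode(mode_propensity))
```

The saving comes from skipping the low-yield first waves: respondents are never cycled through modes they are unlikely to answer.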

Prepaid incentives have become accepted best practice in the world of incentives

Our usual methods of contact are growing steadily less successful. It’s good to think outside the box. (Or inside the box: one group used certified UPS mail to deliver prepaid incentives)

Dramatic increases in incentives dramatically increased response rates and lowered field times significantly

Larger lag times in longitudinal surveys led to a larger dropoff in response rate

Remember Leverage Salience Theory- people with a vested interest in a survey are more likely to respond (something to keep in mind when writing invitations, reminders, and other respondent materials, etc.)

Nonresponse propensity is important to keep in mind in the imputation phase as well as the mailing or fielding phase of a survey

Mobile surveys that weren’t mobile optimized had the same completion rates as mobile surveys that were optimized. (There was some speculation that this will change over time, as web optimization becomes more standard)

iPhones do some mobile optimization of their own (this didn’t yield higher completion rates, though, just a prettier screenshot)

The authors of the Gallup paper (McGeeney & Marlar) developed a best practices matrix- I requested a copy

Smartphone users are more likely to take a break while completing a survey (according to paradata based on OS)

This session boasted a particularly fun presentation by Paul Schroeder (Abt SRBI) about distracted driving (a mobile survey! Hah!) in which he “saw the null hypothesis across a golden field, and they ran toward each other and embraced.” He used substantive responses, demographics, etc. to calculate the ideal number of call attempts for different survey subgroups. (This takes me back to a nonrespondent from a recent survey we fielded with a particularly large number of contact attempts, who replied to an e-mail invitation to ask if we had any self-respect left at that point.)

Have you ever planned a trip online? In January, when I traveled to Amsterdam, I did all of the legwork online and ended up in a surprising place.

Amsterdam City Center is extremely easy to navigate. From the train station (a quick ride from the airport and a quick ride around The Netherlands), the canals extend outward like spokes. Each canal is flanked by streets. Then the city has a number of concentric rings emanating from the train station. Not only is the underlying map easy to navigate, there is a traveler station at the center and maps available periodically. English speaking tourists will see that not only do many people speak English, but Dutch has enough overlap with English to be comprehensible after even a short exposure.

But the city center experience was not as smooth for me. I studied map after map in the city center without finding my hotel. I asked for directions, and no one had heard of the hotel or the street it was on. The traveler center seemed flummoxed as well. Eventually I found someone who could help and found myself on a long commuter tram ride well outside the city center and tourist areas. The hotel had received great reviews and recommendations from many travelers. But clearly, the travelers who boasted about it were not quite the typical travelers, who likely would have ended up in one of the many hotels I saw from the tram window.

Have you ever discovered a restaurant online? I recently went to a nice, local restaurant that I’d been reading about for years. I ordered the truffle fries (fries with truffle salt and some kind of fondue sauce), because people had really raved about them, only to discover once they arrived that they were fundamentally french fries (totally not my bag- I hate fried food).

These review sites are not representative of anything, and yet we (I) repeatedly use them as if they were reliable sources of information. One could easily argue that they may not be representative but are good enough for their intended use (fitness for purpose <– a big, controversial notion from a recent AAPOR task force report on Nonprobability Sampling). I would argue that they are clearly not excellent for their intended use. But does that invalidate them altogether? Often they provide the only window we have into whatever it is we intend them for.

Truffle fries aside, the restaurant was great. And location aside, the hotel was definitely an interesting experience.

The largest, highest-profile, highest-value and most widely practiced side of online research grew out of a high demand to analyze the large amount of consumer data that is constantly being created and is largely publicly available. This tremendous demand led to research methods created in relative haste. Math and programming skills thrived in a realm where social science barely made a whisper, and the notion of atheoretical research grew. The level of programming and mathematical competence required to do this work grows higher every day, as the fields of data science and machine learning become ever more nuanced.

The lower-profile, lower-value and increasingly practiced side of online research is academic research. Turning academia toward online research has been like turning a massive ocean liner. For a while, online research was not well respected. At this point it is increasingly well respected, thriving in a variety of fields in a much-needed interdisciplinary way, and driven by a search for a better understanding of online behavior and better theories to drive analyses.

I see great value in the intersection between these areas. I imagine that the best programmers have a big appetite for any theory they can use to drive their work in useful and productive ways. But I don’t see this value coming to bear on the market. Hiring is almost universally focused on programmers and data scientists, and the microanalytic work that is done seems largely invisible to the larger entities out there.

It is common to consider quantitative and qualitative research methods as two separate languages with few bilinguals. At the AAPOR conference in Boston last week, Paul Lavrakas mentioned a book he is working on with Margaret Roller which expands the Total Survey Error model to both quantitative and qualitative research methodology. I spoke with Margaret Roller about the book, and she emphasized the importance of qualitative researchers being able to talk more fluently and openly about methodology and quality controls. I believe that this is a very important step for qualitative research, albeit a huge challenge in wording and framing, in part because quality frameworks lend credibility to qualitative research in the eyes of the wider research community. I wish this book a great deal of success, and I hope that it finds an audience and a frame outside the realm of survey research (although survey research has a great deal of foundational research, it is not well known outside of the field, and this book merits a wider audience).

But outside of this book, I’m not quite sure where or how the work of bringing these two distinct areas of research can or will be done.

Also at the AAPOR conference last week, I participated in a panel on The Role of Blogs in Public Opinion Research (intro here and summary here). Blogs serve a special purpose in the field of research. Academic research is foundational and important, but the publish rate on papers is low, and the burden of proof is high. Articles that are published are crafted as an argument. But what of the bumps along the road? The meditations on methodology that arise? Blogs provide a way for researchers to work through challenges and to publish their failures. They provide an experimental space where fields and ideas can come together that previously hadn’t mixed. They provide a space for finding, testing, and crossing boundaries.

Beyond this, they are a vehicle for dissemination. They are accessible and informally advertised. The time frame to publish is short, the burden lower (although I’d like to believe that you have to earn your audience with your words). They are a public face to research.

I hope that we will continue to test these boundaries and to cross over barriers, like the divide between quantitative and qualitative research, that are unhelpful and obtrusive. I hope that we will be able to see that we all need each other as researchers, and that the quality research we all want to do will only be achieved through that mutual recognition.