Wednesday, October 19, 2011

I'm not dead! In fact, I have been very much alive over the past half year: moving, finishing my Ph.D. and starting a new job. With all that settled, I am determined to pick up where I left off.

I often get questions about software for doing Structural Equation Modeling (SEM). There are quite a few packages out there, some more user-friendly or sophisticated than others. Here is an overview of existing packages and my opinion on the pros and cons of each.

'Old' software:
I was trained to work with LISREL as a student. LISREL still exists, but updates have become sparse over the last decade. It was developed by Karl Joreskog, one of the great pioneers of SEM, together with Dag Sorbom. Although it hasn't been updated for over five years, it is still one of the most widely used software packages. The great thing about LISREL is its flexibility: it can handle categorical data and multilevel data. The LISREL language has been adopted by other SEM packages. LISREL does require users to know some matrix algebra in order to understand which specific parameters to estimate.

Until the end of the 1990s, Mx and EQS were the main competitors to LISREL. Mx has been revamped as an R package (OpenMx), while EQS is still in use today. I have never worked with EQS myself, but because of the involvement of Peter Bentler in the EQS project, it has always been cutting-edge when it comes to the handling of categorical data. There seem to be few people using the software, and although it is updated now and then, the possibilities in EQS when it comes to mixture and multilevel models seem limited.

AMOS is still one of my favorites, although since it was taken over by the evil empire of SPSS, development has ground to a standstill. The things I love about AMOS are its drag-and-drop interface and its user-friendliness. I use AMOS to teach students the basics of SEM, and it works great. AMOS cannot handle multilevel data or mixture models, however, and although it can deal with categorical data, there are other packages that do a better job.

'New' software:
Since 2000, some new software packages have taken over the role of LISREL, Mx and EQS as the leading packages. They are either more user-friendly, better integrated with standard statistical software, or more sophisticated.

Mplus is currently the package to beat. It is an extremely sophisticated programme, with frequent large updates. When it comes to modeling possibilities, no other package comes close. It can handle all types of mixture models, and now includes a Bayesian module that opens a whole new world for statistical modelers. Bengt Muthen is the driving force behind Mplus, contributing greatly to the expansion of SEM models beyond psychometrics and sociometrics. I work with Mplus on a daily basis, and together with some colleagues at Utrecht University I regularly organize meetings and teach courses in Mplus. See www.fss.uu.nl/mplus if you want to know more.

Stata (gllamm) and SAS (PROC CALIS) have integrated SEM modules into their main statistical packages. I have never worked with either myself, but from the literature it seems they handle the basic SEM analyses (factor models, path models) well. SAS also has good procedures for Latent Class and Latent Transition Analysis.

Latent Gold is another package designed especially for Latent Class analysis and mixture models, although it can handle simpler types of SEM as well. The power of this package lies in the flexibility of specifying link functions between observed and latent variables. It is currently the only package that can deal with ordinal latent variables, and it boasts some intelligent algorithms that make the estimation of mixture models much faster than, for example, Mplus.

Finally, three packages in R can do SEM. The great benefit of these packages is that they are open source, allowing users to program functions themselves or to integrate R scripts with packages programmed by others. OpenMx (mentioned earlier) and lavaan seem to be the two packages that are currently most advanced. I've heard mixed opinions about the sem package in R, so I would not opt for that one.

Tuesday, May 10, 2011

Sorry for the long silence: I have been caught up in work and other things that always seemed more pressing than writing blog posts. Perhaps it is also because I found it hard to write about statistical modeling. Statistical models are usually complex, and it is therefore difficult to write about them in an accessible way.

Statistical models are everywhere; their goal is to summarize our world in such a way as to capture the essence and leave out the irrelevant complexities. I therefore see graphs, figures and visualisation tools as models in themselves, not very different from statistical models.
In the social sciences, I think statistical models are complex in two ways that make them different from models in physics.
First, our social world is generally more complex and nuanced than the laws of physics, and therefore more complex models are necessary.
Second, measurements in the social sciences are more difficult than in the exact sciences and contain more measurement error. Statistical models in the social sciences should therefore, in my view, always incorporate some form of measurement error. One technique that has become dominant in the social sciences is Structural Equation Modeling (SEM). Ken Bollen, a famous SEM scientist, explains what makes SEM such a good and attractive technique in the following video.

Thursday, April 7, 2011

One of the professors at the department where I work (Joop Hox) told me at our very first meeting that good survey methodologists know their way around the world of statistics. I think this saying should also hold in the reverse order, by the way, but I did take his advice seriously, and I am getting more and more interested in statistics, and specifically in statistical modeling.
A good statistical model, in my view, should be able to answer a specific (complicated) research question about our social world in a relatively straightforward way. This implies, first, that answering the question of causality (see previous post) is very important in all statistical models, and second, that the statistical model should summarize our social reality in a simple way.

Proving causality can be hard and depends mainly on a good research design. Summarizing our social reality without oversimplifying it is hard too, but graphics can do wonders. The video below is very old (well, five years), and most of you have perhaps seen it, but it is a good illustration of what I think good researchers should try to achieve (including the overenthusiastic Swedish accent).

Because of the advent of the Internet and the abundance of IT applications, the amount of data available for marketing and research is booming. The great challenge for statisticians (and here is where the survey methodologists come into play) in the coming years is how to handle all these data, make them insightful, and use them to answer questions we weren't able to answer before. A great blog post on this topic was posted last year on www.radar.oreilly.com. Highly recommended.

Saturday, March 26, 2011

A short post in between, so I can share two thoughts:
1. It is now possible to post reactions to my blog posts. Because I'm new to blogging, the settings were not very inviting previously. Now you can react very easily. Please do if you feel like it; I believe in progress by debate.
2. I am not a specialist in the concept of causality myself, but I love Judea Pearl's blog on the topic of causal relationships, counterfactuals and the role of covariates in the social sciences. He recently posted some great links, new ideas and videos. Go and read it if you have a spare hour.

Wednesday, March 23, 2011

Instead of separating mode effects from nonresponse and noncoverage effects through statistical modeling, it is perhaps better to design our mixed-mode surveys in such a way that mode effects do not occur. The key principle in preventing mode effects from occurring is to make sure that questionnaires are cognitively equivalent for respondents: no matter in which survey mode respondents participate, they would give the same answer. In my opinion, there are two ways to achieve this.

1. Choose a mix of modes that leads to a cognitively equivalent survey process. The survey process is very different in a questionnaire administered by telephone vs. over the Internet. Some mode combinations can, however, be combined without great differences in the survey process across the modes:

- combine face-to-face and telephone modes: in both modes communication is aural, with an interviewer asking questions and recording answers. The only difference is that the interviewer is physically present in the face-to-face survey and not in the telephone survey.
- combine mail and Internet modes: differences between these modes are minimal. Whereas in the United States it is difficult (but not impossible) to sample addresses, in Europe this combination can easily be implemented. Don Dillman talked about some experiments with this method at the 2009 AAPOR conference (thanks to www.pollster.com).

2. The second way is to use nonequivalent survey modes (for example telephone and Internet), but design the individual survey questions in such a way that they are still equivalent across modes. This implies that all questions should be simple, short and clear, and that there should be as few answer categories as possible (i.e. yes/no and similar). This means it would be difficult to ask about attitudes or opinions in such a mixed-mode design.

Tuesday, March 15, 2011

Mixed-mode surveys have been shown to attract different types of respondents, which may imply that they are successful. Internet surveys attract the young and telephone surveys the old, so any combination of the two can lead to better population estimates for the variable you're interested in. In other words, mixed-mode surveys can potentially ameliorate the problem that neither telephone nor Internet surveys are able to cover the entire population.

The bad news is that mode effects (see posts below) coincide with selection effects in mixed-mode surveys. For that reason, it is hard to determine how successful mixed-mode surveys are and, more importantly, really hard to combine results when there are large differences in the dependent variable across the survey modes.

I think matching is one of the few methods that can adequately deal with this issue, and the idea is straightforward. In any survey among the general population, there will be 1. people who are able and willing to answer only in a specific survey mode (i.e. the Internet or telephone), 2. respondents who would respond in both, and 3. respondents who would not participate at all. This means that the telephone and Internet samples in a mixed-mode survey will each contain people unique to that mode, and people who can also be found in the other mode (see below - the match part).
With matching, respondents who are similar on a set of covariates are matched across the survey modes, so that pairs of very similar respondents are formed. After matching, any differences that persist between the matched respondents from both samples cannot be due to selection effects on the covariates. Therefore, any remaining differences should exist only because of a mode effect: whether a question is asked by an interviewer or self-administered, whether it is aural or visual, and whether answers are spoken or written down.
Matching can easily be done using the MatchIt package in R (amongst others). More information about matching in mixed-mode surveys can be found in a manuscript I wrote with some colleagues.
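To make the matching idea concrete, here is a toy sketch in Python rather than the R/MatchIt workflow mentioned above. The respondents, covariates and answers are entirely made up; it simply pairs each phone respondent with the nearest web respondent on the covariates and then compares answers within pairs.

```python
import math

# Hypothetical respondents: covariates (age, education) plus an answer on a 1-5 scale.
web = [
    {"age": 25, "edu": 3, "answer": 4},
    {"age": 40, "edu": 2, "answer": 3},
    {"age": 62, "edu": 1, "answer": 2},
]
phone = [
    {"age": 27, "edu": 3, "answer": 5},
    {"age": 61, "edu": 1, "answer": 3},
]

def distance(a, b):
    # Euclidean distance on the covariates (age rescaled so both
    # covariates contribute on a comparable scale).
    return math.hypot((a["age"] - b["age"]) / 10, a["edu"] - b["edu"])

# Nearest-neighbour matching: for each phone respondent, find the
# most similar web respondent.
pairs = [(p, min(web, key=lambda w: distance(p, w))) for p in phone]

# Within matched pairs, the remaining answer difference is attributed
# to the mode effect, not to selection on age or education.
mode_effect = sum(p["answer"] - w["answer"] for p, w in pairs) / len(pairs)
print(mode_effect)  # → 1.0
```

Real applications would of course use many covariates, propensity scores, and checks on match quality, which is what MatchIt automates.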

Monday, February 28, 2011

Mode effects - the fact that respondents respond differently to a survey question solely because of the mode of interviewing - are hard to study. This is because mode effects interact with nonresponse effects: an Internet survey will attract different respondents than a telephone survey. Because of this, any differences found between the surveys could be due either to differences in the type of respondents or to a mode effect.

There are three basic methods to study mode effects. The most common is:

1. to experimentally assign respondents to a survey mode, and then compare the results from the surveys: the response rates, the demographic composition of the samples, and finally differences in the dependent variables. Sometimes, demographic differences between the samples are corrected with a multivariate model or with weighting. For an overview, see the results of this Google Scholar search.

This type of design is popular, but in my view it has a great drawback: we know that Internet samples and telephone surveys only cover part of the population. Landline telephone coverage is rapidly declining, while Internet use remains limited to about 80 per cent of the general population in Western countries. There are two alternative approaches that deal with this issue.

2. One can make respondents switch modes during the interview, for example from the telephone to the Internet, or from face-to-face to paper-and-pencil. Although this approach sounds very simple, relatively few studies have been conducted in this manner. See Heerwegh (2009) for a nice example.
More experimental studies are definitely welcome and necessary if we want to understand how problematic mode effects are.

3. The third way of studying mode effects relies on more sophisticated statistical modeling to separate the different sources of survey error. The most relevant errors in mixed-mode surveys are coverage, nonresponse and response errors (i.e. the mode effect). Separating these can be done using a) validation data, b) repeated measurements using the same or different modes, or c) matching.
I am not aware of any mixed-mode studies that have used validation data to study mode effects, and as the mode effect occurs mainly for attitudinal questions, it is hard to find such data. The other two approaches both offer more practical ways of assessing mode effects. I will discuss both the modeling approach using longitudinal data and matching more extensively in upcoming posts.
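The weighting correction mentioned under the first method can be sketched as simple post-stratification. This is a hypothetical Python example with made-up numbers; real studies would weight on several demographics at once. Each respondent gets weight = population share / sample share for his or her group, so that the weighted sample mirrors the population.

```python
# Known population shares per age group (hypothetical numbers).
population_share = {"young": 0.30, "middle": 0.40, "old": 0.30}

# Age group and a yes(1)/no(0) answer for each respondent in a
# (hypothetical) web sample that over-represents the young.
sample = [("young", 1), ("young", 1), ("young", 1), ("middle", 0), ("old", 0)]

counts = {g: sum(1 for grp, _ in sample if grp == g) for g in population_share}
n = len(sample)

# Post-stratification weight: population share / sample share per group.
weights = [population_share[g] / (counts[g] / n) for g, _ in sample]

# Weighted mean corrects the raw mean (0.6) for the demographic imbalance.
weighted_mean = sum(w * y for w, (_, y) in zip(weights, sample)) / sum(weights)
print(round(weighted_mean, 3))  # → 0.3
```

Note that weighting can only correct for selection on the variables used to build the weights, which is exactly why mode effects and selection effects remain hard to disentangle.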

Monday, February 21, 2011

One of the most interesting issues in survey research is the mode effect. A mode effect can occur in mixed-mode surveys, where different questionnaire administration methods are combined. The reasons for mixing survey modes are manifold, but usually survey researchers mix modes to limit nonresponse, reach particularly hard-to-reach types of respondents, or limit measurement error. It is more common today to mix modes than not to, for some good reasons:

1. nonresponse to survey requests is ever increasing. In the 1970s it was feasible to achieve a 70% response rate without too much effort in the U.S. and the Netherlands; nowadays this is very difficult. In order to limit costs and increase the likelihood of a response, survey organisations use a mix of consecutive modes. For example, a survey may start with a cheap paper questionnaire sent by mail, perhaps with a URL included in the letter. Nonrespondents are then followed up in more expensive modes: they are phoned, and/or later visited at home, to make sure response rates go up.

2. there are few survey modes that are able to reach everyone. In the 1990s almost everyone had a landline phone; now only about 65% do. Internet penetration is at about 85%, but does not seem to be rising further. In order to reach everyone, we have to mix modes. On top of that, certain types of respondents may have mode preferences. Young people are commonly believed to like web surveys (I'm not too sure of that), while older people like telephone or face-to-face surveys.

3. for some questions, we know it is better to ask them in particular modes. Sensitive behaviors and attitudes, like drug use, committing fraud, or attitudes towards relationships, are better measured when the survey is anonymous (i.e. when no interviewer is present). For questions that are difficult and require explanation, the opposite is true: interviewers are necessary, for example, to get a detailed view of someone's income.

Mixing survey modes seems to be a good idea from all these angles. One problematic feature, however, is that people react differently when they answer a question on the web or on the phone. It makes a difference whether a question is read out to you (phone) or whether you can read it yourself. It also matters whether an interviewer is present or not, and whether you speak your answer or write it down. These differences between survey modes lead to all kinds of differences in the data: the mode effect.

Although differences between survey modes are well documented, the problem is that mode effects and other effects are confounded: the different modes attract different people. People on the phone might be less likely to give a negative answer because an interviewer is present, but it could also be that phone surveys attract older people, who are also less likely to answer negatively. The fact that measurement errors and non-measurement errors interact in mixed-mode surveys makes it very difficult to estimate how problematic mode effects are in practice, and whether we should be worried about them. In my next post I will outline some ways in which mode effects could, in my view, be studied and better understood.

Thursday, February 10, 2011

Before people start to believe I'm old-fashioned: I do think that Internet surveys, even panel surveys, are the future of survey research. John Krosnick makes some good points in a video shot by the people from www.pollster.com.

1. Is it clear who ordered and financed the poll?
2. Is there a report documenting the poll's procedures?
3. Is the target population clearly described?
4. Is the questionnaire available and has it been tested?
5. What were the sampling procedures?
* the sample should be drawn from the target population. If it only contains, for example, people with Internet access, be careful
6. What is the number of respondents?
7. Is the response percentage sufficient?
* it is difficult to say what percentage is sufficient. Higher response percentages do not automatically lead to better data quality, but 10 or 20% is too low.
8. Have the data been weighted?
9. Are the margins of error being reported?
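On item 9: for a genuine probability sample, the margin of error that should be reported is easy to check yourself. A small sketch (the usual 95% normal approximation for a proportion; note that this formula does not apply to self-selected online polls, which is part of the point of this checklist):

```python
import math

def margin_of_error(p, n, z=1.96):
    """95% margin of error for a proportion p estimated from n
    respondents, assuming simple random sampling."""
    return z * math.sqrt(p * (1 - p) / n)

# A party polling at 30% in a probability sample of 1000 respondents:
print(round(margin_of_error(0.30, 1000), 3))  # → 0.028, i.e. about ±2.8 points
```

So even a well-conducted poll of a thousand respondents cannot distinguish 28% from 32%, which is worth remembering when single-seat shifts are reported as news.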

Many opinion pollsters do badly when it comes to predicting elections, mainly because they let respondents self-select into their polls. So what, who cares? The polls make for some good entertainment and easily fill the talk shows on television. If everyone knows they cannot be trusted, why care?

We should care. In the Dutch electoral system - with proportional representation - every vote counts. If only a small percentage of voters let their vote depend on the polls, this can result in shifts of several seats in parliament. It is unclear how many voters decide how to vote based on the opinion polls, but it is a fact that many voters consider voting for two or more parties, and many do vote strategically. The Dutch Parliamentary Election Study (DPES) of 2006 found that 18% of voters indicated that they let their vote be influenced by the election polls. That amounts to 27 of the 150 parliamentary seats: almost as many as the largest party holds in the current parliament.

As long as voters choose strategically in different ways, this may not matter. If someone votes strategically to make sure a new government has the greens in it, but someone else votes strategically for labour to make sure his or her favourite candidate becomes prime minister, the net effect of strategic voting might be zero or very small. There is evidence, however, that this is not the case. People like to vote for winners: this is called the bandwagon effect. Whenever labour does well in the (biased) opinion polls, more voters will consider voting for them. In the end, this means political parties (and pollsters) have a strong interest in doing well in the polls. In fact, it may be tempting to publish fraudulent polls on purpose to make public opinion shift in your favor. This seems to be increasingly common in the United States, where such polls are called "push polls".

So, what to do about it? First, I think it would be fair not to publish any opinion polls for some time before election day, as is done in France for example (albeit only for two days). Second, journalists and newsreaders should be very critical of opinion polls, and only publish them when some basic quality criteria have been assessed and met. The Dutch Organisation on Survey Research has taken the initiative to develop a checklist for journalists. I will put it online soon.

One of the difficulties in exit polling is that some people might not want to say whom they voted for, especially if that person is politically controversial. This might be one of the reasons why Geert Wilders, and the PVV in general, always underperform in Dutch exit polls. The second difficulty is selecting a number of polling stations. Good exit polls do this either randomly or (even better) by stratified sampling. Stratified sampling is particularly important when voting behavior has a strong regional component. For example, a random selection of polling stations in the Netherlands might by chance exclude any localities in the 'bible belt', where people often vote for the SGP, leading to an under-representation of SGP voters. Stratifying on past voting behavior in polling stations can also increase statistical power, so that fewer polling stations are needed to achieve the same margin of error.
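The stratified selection of polling stations can be sketched in a few lines. This is a hypothetical Python example (the station list and region names are made up); the point is that allocating draws per stratum guarantees that no region is missed by chance, which a simple random draw cannot do.

```python
import random

# Hypothetical polling stations grouped by region (the strata).
stations = {
    "bible_belt": ["BB-1", "BB-2", "BB-3", "BB-4"],
    "randstad":   ["RS-1", "RS-2", "RS-3", "RS-4", "RS-5", "RS-6", "RS-7", "RS-8"],
    "north":      ["NO-1", "NO-2", "NO-3", "NO-4"],
}

def stratified_sample(strata, total):
    """Draw `total` stations, allocated roughly proportionally to
    stratum size, with at least one station per region."""
    n_all = sum(len(pool) for pool in strata.values())
    sample = []
    for region, pool in strata.items():
        k = max(1, round(total * len(pool) / n_all))
        sample.extend(random.sample(pool, k))
    return sample

random.seed(1)  # for reproducibility of this sketch
picked = stratified_sample(stations, total=4)
print(sorted(picked))
```

With this allocation, the 'bible belt' stratum always contributes at least one station, whereas a purely random draw of 4 out of 16 stations would skip it entirely in a fair share of draws.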

In the past, exit polls were conducted like this. Slowly, market research firms switched first to telephone surveys and later to Internet surveys for their exit polls. Both TNS NIPO and peil.nl relied on their panels to predict the election results. This once again shows that people who voluntarily join access panels cannot be used to produce good statistics for the general population.
Wisely, the Dutch news stations (ANP, NOS, RTL) chose to do a proper, old-school exit poll in 2010. See this post for details (in Dutch).

So what, one might ask? Why worry about the crappy polls? Can't we just ignore them and focus on the polls that do a good job? Alas, people are heavily influenced by polls in the media in the period leading up to elections. More on this, and on strategic voting, next time.

Monday, January 17, 2011

Opinion pollsters do a lousy job of predicting elections. For a good read, see for example the prediction of the New Hampshire primary in 2008, when all polls predicted Obama to win, but it was Clinton who won (albeit by a slim margin).

In the Dutch context there are three main polling firms, which each do equally well (or badly). Out of a hundred and fifty parliamentary seats, peil.nl mispredicted 20, while TNS NIPO and Synovate shared the honor of only missing the target by 16 seats in the 2010 parliamentary election. These polls were conducted the day before the election, and some of the pollsters said that people might have changed their vote at the last minute. That may very well be, but even the exit polls on the night of the election were wrong: peil.nl was 17 seats off and TNS NIPO 15. Only Synovate did a lot better, missing the true result by just 3 seats. I will discuss why in a future post, but in short it is a matter of speed and low cost versus quality.

Monday, January 10, 2011

With a new year come new year's resolutions. I have been working as a survey methodologist for about five years now. I teach and I do research. Teaching gives instant rewards, or at least instant feedback; I like that. Doing research, however, is a different matter. It is a slow and sometimes agonizing process of muddling through (for me).

Studies remain in review forever, sometimes never make it into a publication at all, while some of my ideas or views just never make it onto paper. I hope this blog fills that gap.

I will write in English, but might occasionally do so in Dutch if I feel like it. As far as content goes, I'm not sure where all of this will lead. I might post academic-like things very frequently, but could also publish only every once in a while.

As a survey methodologist, my view is that data matter. Policy makers and academics too often use data without really knowing how the data were gathered or whether they are trustworthy. My experience over the past five years is that data quality is often low, leading to badly informed or even wrong decisions. Data quality is far more important than fancy statistical models or cool graphs. Hopefully you will enjoy my adventures in the jungle of improving survey data quality.