A note on bias in project management research

Project management research relies heavily on empirical studies – that is, studies based on observation of reality. This is necessary because projects are coordinated activities involving real-world entities: people, teams and organisations. A project management researcher can theorise all he or she likes, but the ultimate test of any theory is, “do the hypotheses agree with the data?” In this, project management is no different from physics: to be accepted as valid, any theory must agree with reality. In physics (or any of the natural sciences), however, experiments can be carried out in controlled conditions that ensure objectivity and the elimination of extraneous effects or biases. This isn’t the case in project management (or, for that matter, any of the social sciences). Since people are the primary subjects of study in the social sciences, subjectivity and bias are inevitable. This post delves into the problem of bias, with an emphasis on project management research.

From my reading of several project management research papers, most empirical studies in project management proceed roughly as follows:

Formulate hypotheses based on observation and/or existing research.

Design a survey based on the hypotheses.

Gather survey data.

Accept or reject the hypotheses based on statistical analysis of the data.

Discuss and generalise.

Survey data plays a crucial role in empirical project management studies. This raises the question: do researchers account for bias in survey responses? Before proceeding, I’d like to clarify the question with an example. Assume I’m a project manager who receives a research survey asking questions about my experience and the kinds of projects I have managed. What’s to stop me from inflating my experience and exaggerating the projects I have run? Answer: nothing! Now, assuming that a small (or possibly not so small) percentage of project managers targeted by research surveys stretch the truth for whatever reason, the researcher is going to end up with data that is at least partly garbage. Hence the question I posed at the start of this paragraph.

The tendency of people to describe themselves in a positive light is referred to as social desirability bias. It is difficult to guard against, even if the researcher assures respondents of confidentiality and anonymity in analysis and reporting. Clearly, the problem is worse when a survey is administered within an organisation: respondents may fear reprisals for being truthful. In this connection, William Whyte made the following comment in his book The Organization Man: “When an individual is commanded by an organisation to reveal his innermost feelings, he has a duty to himself to give answers that serve his self-interest rather than that of The Organization.” Notwithstanding this, problems remain even with external surveys. It seems logical that people will be more relaxed with external surveys (in which they have no direct stake), more so if the surveys are anonymous; anonymity lessens the bias, but does not make it disappear completely. One cannot be certain that responses are bias-free.

Of course, researchers are aware of this problem and have devised techniques to deal with it. One commonly used method to reduce social desirability bias is:

The use of forced choice responses – where respondents are required to choose between different scenarios rather than assigning a numerical (or qualitative) rating to a specific statement. In this case, survey design is very important as the choices presented need to be well-balanced and appropriately worded. However, even with due attention to design, there are well-known problems with forced choice response surveys (see this paper abstract, for example).

It appears that social desirability bias is hard to eliminate, though with due care it can be reduced. As far as I can tell (from my limited reading of project management research), most researchers count on guaranteed anonymity of survey responses as being enough to control this bias. Is this good enough? Maybe it is, maybe not: academics and others are invited to comment.

8 Responses

Thankfully research is not that easy.
The effects described exist in every data analysis, but when do they actually affect research?
To counter-argue the blog a little: designing questionnaires is neither an easy nor a trivial task – but it is a basic one. Every science needs to understand its measurement instruments.

Social Desirability?

Yes, it does exist, but if it exists it must exist for all respondents alike. Say you want to identify yet another critical success factor for project management – let’s say WBS use. What do you do? You measure a lot of factors, including WBS use, and you measure self-reported project success. The easy answer to the desirability problem is that, luckily, the bias has the same likelihood of affecting all respondents (it’s a systematic error, not a random error, in the design). You gather your data and then split it into two groups, WBS users vs. non-users. You test for group differences and report a significant result. Yeah! With the little tweak of reporting a structural result rather than a numerical one, you have sneaked your way around the problem. Well done! That is why a serious researcher would never report that use of a WBS in project management increases your average success rate by 15%. That’s something only consultants do.
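The group comparison described above can be sketched as follows. Everything here is invented for illustration – the ratings, the group sizes, the effect size – and the rough critical value of 2.0 is only an approximation to the p < 0.05 threshold for samples of this size:

```python
import math
import random
import statistics

random.seed(42)

# Invented self-reported success ratings (1-10 scale) for two groups:
# projects run with a WBS vs. projects run without one.
wbs_users = [random.gauss(7.0, 1.5) for _ in range(40)]
non_users = [random.gauss(6.2, 1.5) for _ in range(35)]

def welch_t(a, b):
    """Welch's t statistic for a difference in group means."""
    se = math.sqrt(statistics.variance(a) / len(a)
                   + statistics.variance(b) / len(b))
    return (statistics.fmean(a) - statistics.fmean(b)) / se

t = welch_t(wbs_users, non_users)
# For samples of this size, |t| above roughly 2.0 corresponds to p < 0.05.
print(f"t = {t:.2f}")
```

Note that this reports only the structural result (the groups differ), not a headline effect size – exactly the distinction the comment draws.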

Questionnaire design

Furthermore, I claim/hope that a lot of thought goes into questionnaire design. Questionnaires should rely on previously used sets of questions; one factor or influence is never addressed by only one question; to account for other biases (e.g. consistency), questions are asked as both positive and negative statements; to account for question-order effects, questions are randomly mixed; and so on and so forth. Secondly, questionnaires are pre-tested and the pre-tests analysed – statistical methods (e.g. Cronbach’s Alpha) exist to measure the quality of a set of questions.
Other things you usually wonder about are: how many levels do I show on a rating scale – 1-5, 1-7, 1-10? Is the middle of my scale really neutral? A question like “How satisfied are you?” gets different answers depending on whether every point of the scale has a verbal description (‘very satisfied’, ‘satisfied a little’, etc.), a numerical anchor (1, 2, 3, etc.), both, or neither – with only the ends of the scale described verbally.
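The Cronbach’s Alpha mentioned above has a short closed form: for k questions intended to measure the same underlying factor, it compares the sum of per-item variances to the variance of respondents’ total scores. A minimal sketch, with an invented ratings matrix:

```python
import statistics

def cronbach_alpha(responses):
    """Cronbach's Alpha for a list of per-respondent rating lists
    (one column per question)."""
    k = len(responses[0])
    items = list(zip(*responses))  # transpose: one tuple per question
    item_var = sum(statistics.variance(col) for col in items)
    total_var = statistics.variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_var / total_var)

# Five invented respondents answering three questions meant to measure the
# same factor; answers that move together across questions give a high alpha.
ratings = [[4, 5, 4], [2, 2, 3], [5, 4, 5], [1, 2, 1], [3, 3, 3]]
print(f"alpha = {cronbach_alpha(ratings):.2f}")
```

Values near 1 indicate the question set is internally consistent; in practice, values above roughly 0.7 are usually taken as acceptable.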

Quantitative research in general

Garbage in – garbage out.
Good quantitative research relies on theories and has a sound logical explanation before testing something. Bad research gets some data, throws it at the wall (aka correlation analysis) and reports whatever sticks.
Good research re-uses previously published questionnaire scales and questions, and tests carefully for errors. Good research tries to find non-self-reported measurements to validate the findings – for instance, project success against actual cost-overrun figures. Good research validates results by re-testing the effects found in analysis on a different group of participants, or on the same participants a little later.
Good research answers structural questions and not effects, uses advanced methods (e.g. structural equation modelling) to account for moderating influences, complex cause-effect-relationships, as well as errors and residuals (real life is complex and can never be fully explained by a limited set of questions and factors). Bad research uses single correlations and regression analysis, reports the strength of effects, and has small sample sizes.
Good quantitative research knows its limitations and discusses them, there are a million other things to consider I haven’t even touched yet – like interviewer bias (e.g. the gender of the interviewer, how he/she asks the questions), or how the questionnaire is administered (e.g. web-based, pen and paper, interview on the phone, interview in person).

To put it in a nutshell – researchers think a lot about these questions and do not take them lightly. Of course there is bad research and good research, and there is a whole research arm called ‘Philosophy of Science’ where scholars debate only these questions. It’s a damn interesting topic anyway.

Next time in this comment section: What is this Case Study based research all about?

Thanks so much for your comments. This is just the sort of detailed, thoughtful response I was hoping for.

Indeed, I’m sure that good researchers (who one assumes are the majority?!) do worry about bias, and try their best to address it. On the other hand, though, I’ve read way too many papers that have a cavalier approach to bias and statistics. Hence my motivation for this post.

Thanks again for your detailed counter-arguments. I look forward to your future comments.

I am interested in doing research into bias of this nature, especially when collecting data from project managers only (and not their superiors). Would you be able to point me to any literature you have come across that deals with this specifically?

Thanks for your message. Although a lot of work has been done on social desirability bias in other contexts, I couldn’t find anything specific to project management (in my searches via Google Scholar). To dig deeper, one would need access to a good research library – which, unfortunately, I don’t have. I’d be very interested to know if you can find something: please do let me know if you do.

The lack of any work in this area is what prompted my post. I’m pretty sure social desirability bias plays an important role in skewing data collected by PM researchers (why? well, because everyone likes to think they do the right thing). It would be interesting to find out if this is really so, although I think it is hard to come up with a sound research methodology to investigate it.

Sorry I can’t be of any help. Good luck with your work, and do let me know how it goes.
