Lies, Damned Lies, and…

I’m currently waiting to find out whether my research methodology empire will be extended this term, to cover a quantitative research methods course for second-year undergraduates – a teaching stint that would itself be regarded as preparation for assisting with a rethink of our second-year methodology course offerings, which are currently split between one term of “quant” and one term of “qual”. No one likes the split and yet, for various reasons (some programs only want their students to take one or the other course, and some programs are still running their own independent courses, etc.) thinking through whether and how to integrate these courses will be quite complex. While my responsibility for this course is still somewhat hypothetical, the beginning of the term is rapidly approaching, and I’ve begun half-preparing – mainly by soliciting ideas from folks who have taught into the course in the past, or who are interested in teaching into it this coming term.

One of the stories that seems to crop up in relation to past iterations of the course is the difficulty of obtaining an interesting dataset on which students can practice the more statistical concepts covered in the course. Past iterations of the course appear generally to have given students some overarching policy problem – drug use in youth culture is a theme that has been mentioned often – and then set them loose on a dataset to test various hypotheses, and to reflect on the policy implications of their results. Apparently, however, we have struggled to obtain Australian data sufficiently robust for whatever exercises the students have been asked to perform. Instead – at least one year – we used a UK dataset, while still asking students to reflect on Australian policy concerns.

When I heard this, I grimaced a bit, and said, “No – I’d really rather, if the point is to reflect on local problems, we use relevant local datasets. Otherwise it will confuse the students – and convey the wrong message, I think, about the need to look into these problems empirically – we don’t want to give the impression that just any old data will do…”

“Oh, no -” my interlocutor clarified, “the students didn’t know they were using UK data. We went in and edited the dataset – we changed all the names of British counties to the names of Victorian communities. It took forever! So, as far as the students were concerned, they were working with Australian data. They never knew.”

Now let me get this straight: We give students a term-long assessment task, designed to get them to test their assumptions about an Australian policy issue (I’m not clear whether this was on the drug use topic, or on something else) – but we cook the data!!! Oh sure, the data are true for somewhere – and the same sorts of skills and reasoning would apply, regardless of the dataset – I do understand the reasoning behind the assessment task. But still… I have these images of students coming out of this course, getting into debates with friends and family years from now, and saying, “Well, you know, I actually researched this issue at uni, and apparently the trend is…” What will the students do when they run into conflicting empirical data at some later point? How will they make sense of it all?

Why not just tell students you’re using UK data? Or that the data are made up? Surely we don’t think our students are so fragile that this would cause them to disinvest completely from the task?