Questions/revision from students

This is a very readable book on that topic: Shipley, B. (2000). Cause and correlation in biology: a user's guide to path analysis, structural equations, and causal inference. Cambridge, UK; New York, NY, USA: Cambridge University Press.

What is "superficial validity"?

That was a descriptive slide heading about validity in the sense of "seeming right", or agreeing with some observer. I have now changed it to "surface-level".

Is content validity to do with the precise content of the questions, whereas criterion validity is sort of making sure you're looking at the relevant outcome?

Content validity refers to items seeming (to some observer) to have content that matches the construct being measured. Criterion validity refers to predicting an outcome that (some observers agree) measures the construct. So if higher IQ scores predicted better exam grades, that might count as criterion validity. Another researcher might claim this as a new discovery: that IQ is related to school performance. In the end both are true, which is why I called these surface-level validity in my lecture.

Is it the case that face validity is separate from content and criterion validity, or is it subsumed by them? As is the case with predictive and concurrent validity?

Lots of these terms are, in my opinion, a bit like the old subtype labels that med schools taught for schizophrenia ("hebephrenic", "catatonic"). They were descriptive at best, and a substitute for better understanding. That said, face validity requires only that you or some expert says "that looks right to me". Content validity is clearly very closely linked to this idea. Criterion validity is clearly quite different.

You have a slide title mentioning "external and internal manifestations of bias". What was this relating to?

I changed this to "How might test-bias be visible in data?", which is hopefully clearer.

Could you perhaps clarify the difference between culturally biased and culturally loaded?

In measurement theory, the idea of bias implies a problem in measurement: the measured result doesn't accurately reflect the underlying trait we want to measure. So a culturally biased test is one which doesn't measure the same thing in two cultures. A culturally loaded test is one which measures something that is affected by culture. So a test could be culturally loaded but not biased, or not culturally loaded but biased, or both culturally loaded and biased.
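The loaded/biased distinction can be made concrete with a minimal sketch. This is an illustrative toy model, not anything from the lecture: the scoring functions, group labels, and numbers are all assumptions chosen only to show the logic. A loaded item depends on a culture-linked factor, but in the same way for everyone; a biased item gives different scores to people with the same underlying trait, depending on their group.

```python
def loaded_item_score(ability, cultural_exposure):
    # Culturally LOADED: the score depends on a culture-linked factor
    # (e.g. schooling), but the rule is identical for every group.
    # Two people with the same ability and exposure always score the same.
    return ability + cultural_exposure

def biased_item_score(ability, group):
    # Culturally BIASED: group membership itself shifts the score,
    # over and above the trait, so the item does not measure the
    # same thing in both groups. (Hypothetical penalty of 2 points.)
    penalty = 2 if group == "B" else 0
    return ability - penalty

# Same underlying ability (5) and same exposure (1) in both groups:
# the loaded item gives identical scores, so it shows no bias here.
assert loaded_item_score(5, 1) == loaded_item_score(5, 1) == 6

# Same underlying ability, but the biased item scores the groups apart.
assert biased_item_score(5, "A") == 5
assert biased_item_score(5, "B") == 3
```

Note that if every test-taker had the same cultural exposure (as in Binet's strategy of using only universally shared experiences), the loaded item would behave identically across groups, which is why loading alone does not imply bias.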

You give examples of culturally loaded questions (Edinburgh parks and E=mc^2), stating that the Physics question is lower on bias (presumably because E=mc^2 is more universally known, thus it can be assumed that more people across the globe will recognise this equation as opposed to a question detailing knowledge of Edinburgh parks). Thus, is the Edinburgh park question higher on bias?
That was what I intended. I think I mentioned that Binet in his testing sought to build items which were either not dependent on experience, or were dependent only on experiences which all subjects could be expected to have had. In that way, he sought to minimise bias. If you define your population within which the test is to be used (say, people born and living in Edinburgh), then a measure of park knowledge would not be biased.

Is bias dependent, then, on the samples of participants you ask?

Yes, as stated above, test bias will only be manifested when the groups tested differ in some way the test-constructor did not anticipate. For instance, if you discovered that naming Edinburgh parks was a biased item for taxi drivers in Aberdeen, you could modify the test to ask about "local parks".