Readers told us that it was instructive and engaging to take quizzes on using assessments, and we like to listen to you! So here is the first of a new series of quizzes on assessment topics. This week’s quiz is on setting a cut score (pass score). The questions, written for us by Neil Bachelor of Pure Questions, are about what to do when designing a diagnostic test for safety procedures.

We regard resources like this quiz as a way of contributing to the ongoing process of learning about assessment. In that spirit, please enjoy the quiz below and feel free to comment if you have any suggestions to improve the questions.

Now on to the quiz! Be sure to look for your feedback after you have completed it!

14 Responses to “How much do you know about assessment? Quiz 1: Cut Scores”

Just a different opinion. For the question
“It would be better to set multiple cut scores than a single cut score.”

Incorrect; the right answer is TRUE. This would be beneficial because any employees who had very little awareness of the safety procedures or, worse, had potentially dangerous misconceptions, should receive remedial training as a matter of urgency. In order to distinguish these high-priority individuals from those who have partial understanding and are closer to the cut score, a second, lower threshold could usefully be set.

I disagree. If they failed, they are judged unsafe to perform the work. They need to stay clear until trained. When someone can’t do the job we hired them for, that should generate its own sense of urgency.

Thank you, John, for an interesting example. I was interested in how you arrive at the final score when using negative marking. In the example above, the net score of 1, from a possible range of [-5, 5], is mapped to a [0, 100] scale without actually mapping the interval [-5, 5] onto [0, 100]. I appreciate it is a matter of interpreting the shift correctly.

I would be curious to hear of your views (particularly, how receptive the practitioners from industry are to negative marking).

I like the idea of assessing knowledge of assessment topics and look forward to the other quizzes in this series.

I do have a slight bone to pick with the scoring though. The decision to use negative scoring for incorrect answers is fine (it makes me take the quiz a little more seriously), but I think the ‘Score’ at the end is incorrect. For example, I answered 4 out of 5 questions correctly (I would like to argue over the scoring of the one that was marked wrong, but let’s leave that for another day), so my final score was 4 for the correct answers and -1 for the one I got wrong, which made a total of 3.

However, the wording of the results was as follows:

Total Score: 3 out of 5, 60%

I can understand this if I interpret this as:

Total Score: 3 (points) out of (a maximum score of) 5, 60% (of the total possible number of points)

But, it could also be read as 3 out of 5 questions answered correctly (which would correspond to traditional scoring formats). Is it possible to alter the wording for scores in QMP, or would we need to instruct students/lecturers about the meaning behind the statement?

Thank you for having taken the quiz! Some people commented on the first quiz that they were surprised to lose marks for getting questions wrong. This quiz uses True/False questions and it is easy to guess at answers, so we’ve set it to subtract a point for each question you get wrong, to illustrate that this is possible. Negative scoring like this encourages you to answer “Don’t Know” rather than guess; this is particularly helpful in diagnostic tests where you want participants to be as honest as possible about what they do or don’t think they know.

Anthony’s point is a fair one, and I agree that anyone deemed ‘unsafe’ and a significant risk to themselves or others shouldn’t be working the line. If this became apparent from an assessment, remedial training should be provided before they are allowed to return. However, I suppose the grey area is how and where you define a level of understanding that constitutes ‘failure’ or ‘unsafe’. If someone scores 80% on the assessment there is certainly room for improvement; but have they failed? Are they unsafe? Should everyone given a safety test score 100%? This arguably depends on the nature of the procedures being assessed, the environment in which they are applied, and the adverse impact if they are not followed. Each scenario should therefore be judged on its individual merits and, on reflection, the scenario probably doesn’t provide enough information to make this distinction. I’m sorry about that. Since safety procedures rightly leave the least room for misunderstanding, referring more broadly to ‘packing’ or ‘operational’ procedures would perhaps have been a more suitable phrase. In any case, thanks to Anthony for making the counterpoint.

To provide some extra background for Venkat’s query regarding scoring: the % level has a system-configured minimum of zero (or else it would go down to -100% if all 5 statements were answered incorrectly).

To address the broader rationale of the scoring we’ve used, I’d like to refer back to one of the foundations of Classical Test Theory (and indeed all assessment) in this equation:

Observed Score = True Score ± Measurement Error

As many of you will be aware, Measurement Error takes many forms, but with the multiple-choice format, and especially True/False statements, one of the biggest sources is random or even educated guessing. In essence, the negative scoring is designed to minimise this: for every answer that someone guesses correctly and scores a point, they are equally likely to get one wrong and score a negative point that cancels it out. Across the duration of a test, this guessing ‘error’ element will therefore tend toward and vary around zero, which means that when someone scores 70% on a quiz or exam, the score is arguably more meaningful (i.e. it indicates 70% understanding, not 35% understanding plus 35% lucky guessing).

To give a quick example, imagine you randomly guess through ten true/false responses without the negative scoring. Your expected score, on the basis of probability, is 5 out of 10. If you repeat this but with the negative scoring activated, your expected score would be zero.
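The expected-score claim above can be checked with a quick simulation (a hypothetical sketch for illustration only; the quiz itself is scored inside the assessment system, not in code):

```python
import random

def expected_guess_score(n_questions=10, negative_marking=False, trials=100_000):
    """Simulate blind guessing on true/false items and return the mean score."""
    total = 0
    for _ in range(trials):
        score = 0
        for _ in range(n_questions):
            if random.random() < 0.5:   # guessed correctly: gain a point
                score += 1
            elif negative_marking:      # guessed wrongly: lose a point
                score -= 1
        total += score
    return total / trials

# Without negative marking, random guessing averages about 5 out of 10;
# with negative marking switched on, it averages about 0.
print(round(expected_guess_score(negative_marking=False)))  # → 5
print(round(expected_guess_score(negative_marking=True)))   # → 0
```

The simulation simply confirms the probability argument: the wrong guesses cancel the lucky ones, so guessing contributes nothing on average.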

Relating this to Kenji’s and Eric’s score on the Quiz:

A) ‘3 out of 5’ would indeed have been from 4 correct (+4/80%) – 1 incorrect (-1/20%) = 3/60%.

Alternatively, for a lower score:

B) ‘1 out of 5’ would have been from 3 correct (+3/60%) – 2 incorrect (-2/40%) = 1/20%
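The arithmetic in examples A and B, together with the zero floor mentioned earlier, can be sketched as a small function (hypothetical code — the actual calculation happens inside the assessment system):

```python
def quiz_score(correct, incorrect, total=5):
    """Net score with negative marking, floored at zero, plus a percentage."""
    net = correct - incorrect
    net = max(net, 0)               # system-configured minimum of zero
    percent = 100 * net // total    # percentage of the maximum possible score
    return net, percent

# Example A: 4 correct, 1 incorrect -> 3 out of 5, 60%
print(quiz_score(4, 1))  # (3, 60)
# Example B: 3 correct, 2 incorrect -> 1 out of 5, 20%
print(quiz_score(3, 2))  # (1, 20)
```

Note that the floor only matters when the incorrect answers outnumber the correct ones; for instance, 1 correct and 4 incorrect reports as 0 out of 5 rather than -3.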

As mentioned in an earlier comment, this negative marking has the additional benefit of encouraging people to use the ‘I don’t know’ option rather than make a random guess. This is particularly helpful in diagnostic tests when ‘non-gamed’ item response data is of extra benefit to the tutor (for identifying misconceptions).

In terms of client and academic reaction, I’ve been using this scoring system for many years and it’s generally very well received. It does sometimes raise a few questions but, once explained, clients and test-takers seem to appreciate that it has fairness and balance at its core. On the downside, though, I must concede Kenji’s point that ‘score’ is a debatable term for the number it ultimately generates. Alas, the thesaurus doesn’t yield anything more inspiring than ‘mark’, so better suggestions are most welcome!