Thursday, August 11, 2005

Revisiting Kirkpatrick's Level One

Whenever I am involved in an evaluation project, I advocate getting rid of the smile sheet completely, and replacing that tortured questionnaire with one closed question, plus an open follow-up to encourage respondents to reveal what really matters to them: “Would you recommend this course to a friend or colleague? Why or why not?”

The response tells you unambiguously about the level of satisfaction of the learner, and any clarification offered tells you about the issues that really matter to that learner. That’s more than is called for at Level 1, especially if you have done a good job of testing your training intervention before rolling it out live.

It’s not always possible to reduce things to one question, but I see it as a starting point in the negotiation. I tend to be somewhat dismissive of Level 1 evaluations. That is not because they serve no purpose (they are vital), but because they attract way too much attention at the expense of business impact studies, and because they are often poorly designed and inaccurately interpreted.

Every training intervention needs some kind of feedback loop, to make sure that – within the context of the learning objectives – it is relevant, appropriately designed, and competently executed. At Level 1 the intention is not to measure if, or to what extent, learning took place (that’s Level 2); nor is it intended to examine the learner’s ability to transfer the skills or knowledge from the classroom to the workplace (Level 3); nor does it attempt to judge the ultimate impact of the learning on the business (Level 4). Level 1 of Kirkpatrick’s now somewhat dated “four levels” is intended simply to gauge learner satisfaction.

Typically, we measure Level 1 with a smile sheet: a dozen Likert-scaled questions about various aspects of the experience. At the end of the list we’ll put a catch-all question, inviting any other comments. I won’t repeat the reasons why the end-of-course environment in which such questions are answered is not conducive to clear, reasoned responses. But the very design of such questionnaires is ‘leading’ and produces data of questionable validity, even in a calm and unhurried environment.

Far too many of the smile sheets that I see put words or ideas into the mouths of learners. We prompt for feedback on the instructor's style, on the facilities and food, on the clarity of slides. The net effect is to suggest to respondents (and to those interpreting the responses) that these things are all equally important, and that nothing outside the things asked about has much relevance. By not prompting respondents, you are more likely to surface the things that, for them, are the real burning issues. Open questions are not as simple to tabulate, but they give you an awful lot to chew on.

Now the one-question approach does not necessarily give you all the data that you need to continuously fine-tune your training experience – but neither does the typical smile sheet. Trainers need to understand that sound analytical evaluations often require multi-stage studies. Your end-of-course feedback may indicate a problem area, but will not tell you specifically what the problem is. A follow-on survey, by questionnaire, by informal conversation, or by my preferred means of a brief focus group, will tell you a great deal more than you could possibly find out under end-of-course conditions.

The typical smile sheet is a lazy and ineffective approach to evaluating learner satisfaction. It may give you a warm and comfortable feeling about your course or your performance as a trainer, or it may raise a few alarm flags. But the data that it produces is not always actionable, is rarely valid, and often misses the important issues.

In market research, or any statistical field for that matter, there are two important errors that good research tries to mitigate. Known as Type I and Type II errors, they describe, respectively, the risk of seeing an effect that is not there (a false positive) and the risk of missing an important effect that is there (a false negative). I have never heard anyone address these error types in their interpretation of Level 1 results.
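To make the two error types concrete in a smile-sheet setting, here is a minimal Python sketch (mine, not from the post). It assumes a hypothetical naive rule — flag a course as a problem if fewer than 70% of a small sample would recommend it — and simulates how often that rule cries wolf about a healthy course (Type I) and how often it misses a course that has genuinely slipped (Type II):

```python
import random

random.seed(42)

def survey(true_rate, n=20):
    """Simulate n 'would recommend?' responses; return the observed rate."""
    return sum(random.random() < true_rate for _ in range(n)) / n

def flags_problem(observed, threshold=0.7):
    """Naive rule: flag the course if under 70% of respondents would recommend it."""
    return observed < threshold

trials = 10_000

# Type I error: the course is genuinely fine (80% would truly recommend it),
# but a small sample dips below the threshold anyway -- a false alarm.
type1 = sum(flags_problem(survey(0.80)) for _ in range(trials)) / trials

# Type II error: the course has genuinely slipped (only 60% would recommend),
# but the sample happens to clear the threshold -- a missed problem.
type2 = sum(not flags_problem(survey(0.60)) for _ in range(trials)) / trials

print(f"False alarm rate (Type I): {type1:.1%}")
print(f"Missed problem rate (Type II): {type2:.1%}")
```

The point of the sketch is not the particular numbers but that both error rates are substantial with the small samples a single course delivery produces — exactly the kind of uncertainty that interpreters of Level 1 results rarely acknowledge.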

We see in our smile-sheet results what we want to see, and react to those things that we regard as relevant. If we are so smug in our knowledge that we know what is going on anyway, why do we bother with token smile sheets at all?

4 comments:

Godfrey, there's another aspect of smile sheets that is rarely addressed: timing. The feedback you get at the conclusion of an event is bogus. The learner has no clue how much learning is going to stick. Better results would come from asking for the assessment a month after the learning. The wisdom of hindsight....

"Now that you've had a chance to apply what you learned, would you recommend this to a colleague?"

Even though the focus should be on the desired end results (Level 4), smile sheets do provide one aspect not addressed by either gentleman: "Do I like my environment?" — the emotional aspect, hence "smiley." I heard Don Kirkpatrick specifically talk about the importance of having the emotions involved, or what some call "What's In It For Me" (WIIFM).

Also, Jay might be somewhat presumptuous to suggest that all training provides an opportunity for application 30 days later. Would it not be better to have shorter learning engagements that span a month, to provide opportunities for application?

Mention of Type 1 and Type 2 errors always reminds me of an early stats professor's joke: that there is also a Type 3 error, which, in the phrasing of this post, would be looking at the wrong thing.

What's wrong with "smilesheets" is that they invite too many Type 3 errors.

I like the type 3! I had a whacky-but-wonderful professor who had a string of "type" errors that he'd roll out metaphorically. Two that come to mind were Type Red error (putting all your chips on red) and Type 13 error (getting out of bed that morning).