The Psychology of Personnel Selection (1)

“Ideally those in the business of selection want to use reliable and valid measures to accurately assess a person’s abilities, motives, values and traits. There are many techniques available and at least a century of research trying to determine the psychometric properties of these methods. Over the past twenty years there have been excellent meta-analyses of the predictive validity of various techniques.
In this chapter we considered some assessment and selection techniques that have alas ‘stood the test of time’ despite being consistently shown to be both unreliable and invalid. They are perhaps more a testament to the credulity, naivety and desperation of people who should know better.
However, it is important to explain why these methods are still used. One explanation is the Barnum effect [see also this] whereby people accept as valid about themselves and others high base-rate, positive information. The use to a client of personal validation of a test by a consultant or test publisher should thus be questioned.
Part of the problem for selectors is their relative ignorance of the issues which are the subject of this book. Even specialists in human resources remain uninformed about research showing the poor validity and reliability of different methods.”

…

Like the body language book, this book belongs in the category of ‘books that might actually contain information that will be useful for me to know at some point’ – the stuff covered is certainly related to thoughts I posted here on the blog not long ago.

I’ve read the first half of the book by now, which covers methods of personnel selection. The second half deals with the (psychological) constructs used for personnel selection. I’m not particularly impressed at this point, but it’s not a bad book and there are some useful observations in here. The analysis provided in the coverage is not very sophisticated; they’ll talk about the usual correlations between methods and job performance or other outcome variables, and they’ll have a word or two about the ‘variance explained’, perhaps derived from meta-reviews – but detailed analysis is often missing, or at least less detailed than I’d have liked it to be. I should point out that it’s not just at the analytical level that the coverage is not too impressive/deep; it’s also at the more conceptual level. Biographical data – i.e. information about a person’s background and life history (/work history) and stuff like that (they call it ‘biodata’) – may for example not play a large role overall and may have rather limited power in terms of explaining future performance, but some of the specific variables which might enter the hiring analysis when covering such aspects of an applicant’s skill set and background may still be very high-impact; one example would be past prison sentences. They do talk a little about different scoring mechanisms applied by employers, and this at least conceptually relates to the types of decision rules firms might use to integrate data like this into the hiring decision process. They don’t really talk about decision rules or implementation at all, though you sort of know these things are lurking there in the background; they only cover that stuff in the abstract, and I’m not sure I understand why they don’t go into more detail here in terms of identifying variables which might be of particular interest.
Another problematic aspect to me was the almost total absence of cost-benefit stuff in the chapter about interviews. More specifically, I remember reading a study perhaps a year or two ago which found that using two interviewers rather than one seemed not to be cost-efficient (in that sample at least, or in the baseline model setting – something like that..) as the greater accuracy obtained was not sufficient to cover the increased cost associated with having an extra employee interview people, rather than doing other stuff instead. As the job-relevant information you can obtain from an interview is not particularly impressive to start with, compared to what information you may be able to get from other mechanisms, this is not too surprising, but such aspects were nevertheless still not covered. I’m too lazy to find the study, but the main point is that tradeoffs like these do exist and likely inform, or at least should inform, some firms’ interview practices – to me it seems as if it would make a lot of sense to include such information in a book like this as well.
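
To make the tradeoff concrete, here is a minimal back-of-the-envelope sketch. It is not taken from the study I mentioned or from the book; it assumes a simple linear utility framing where the value of a selection method scales with its validity, and every number in it (validities, monetary value of a standard deviation of performance, interviewer cost) is made up for illustration.

```python
def net_gain_from_second_interviewer(
    validity_one,              # assumed validity with one interviewer
    validity_two,              # assumed validity with two interviewers
    sd_performance_value,      # assumed value of 1 SD of job performance
    mean_z_of_hires,           # assumed average standardized score of those hired
    hires,                     # number of people hired
    extra_cost_per_applicant,  # assumed cost of the second interviewer's time
    applicants,                # number of people interviewed
):
    """Net value of adding a second interviewer, in currency units."""
    # Benefit: the gain in validity, scaled to money, for each person hired.
    benefit = (validity_two - validity_one) * sd_performance_value * mean_z_of_hires * hires
    # Cost: the second interviewer sits in on every interview, not just the hires.
    cost = extra_cost_per_applicant * applicants
    return benefit - cost


# Hypothetical scenario: a small validity gain, 50 interviews, 5 hires.
net = net_gain_from_second_interviewer(
    validity_one=0.20,
    validity_two=0.24,
    sd_performance_value=10_000,
    mean_z_of_hires=1.0,
    hires=5,
    extra_cost_per_applicant=100,
    applicants=50,
)
print(f"Net gain from second interviewer: {net:.0f}")
```

With these made-up inputs the net gain is negative: the cost is paid for every applicant, while the benefit only accrues for the few people who are actually hired. Different inputs can of course flip the sign, which is exactly why I’d have liked to see the book discuss this.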

On another note, even though they do talk about how to improve upon methods which might have some promise but are often used in a suboptimal manner, how to optimally combine methods is not really addressed in the book at the analytical level, certainly not in any detail. Perhaps it’s not really fair to criticize the authors for that, as I’d probably have had some critical remarks if they had decided to talk about stuff like that too, but it feels a little bit strange to me that such aspects are not covered. The authors talk about specific methods, and they talk about how good these are – interviews are this good, references are that good. They compare the methods with other methods which might be able to provide the same information, which is something you’d expect them to do. It’s made clear in the book, as you should be able to tell from the quotes below, that some methods overlap in terms of the information you’ll get out of them. However, might it not also be the case that a combination of methods sometimes will provide more value to the firm than the sum of the coefficients might indicate? One might conceptualize this in terms of a hurdle model where firms will only hire someone once they’re convinced that the individual is ‘good enough’, and before some individual has cleared the hurdle and proven himself worthy all efforts aimed at finding the right candidate are basically wasted – and it turns out that the firm needs GPA, biodata and interview information before anyone can be said to have cleared the hurdle, as these data sources are deemed sufficiently different from each other to enable the employer to assess the candidate along all the relevant dimensions (or something like that…).
Maybe this is not a good way to think about this, but either way many employers do use multiple methods during the same selection process. Not taking interactions among methods into account at the analytical level in any detail, and in particular implicitly focusing on only one type of interaction (overlap) in your coverage, seems problematic in light of that; you want to compare the methods, but in order to make proper comparisons you need to somehow include in your considerations/analysis the relevant potential combinations of methods as well, and the authors don’t engage in this type of analysis (there are a few remarks about ‘incremental validity’ in a couple of places, but that’s it). I don’t know – maybe they’ll talk about this stuff in more detail in the second part of the book.
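
For what it’s worth, the ‘overlap’ type of interaction is easy to illustrate. The sketch below uses the standard formula for the multiple correlation of a criterion with two predictors; the two validities are picked to be in the neighbourhood of the GPA and biodata figures quoted further down, while the correlation between the two predictors is purely hypothetical. It is my own illustration, not an analysis from the book.

```python
import math


def multiple_r(r1, r2, r12):
    """Multiple correlation of a criterion with two predictors.

    r1, r2: validity of each predictor on its own.
    r12:    correlation between the two predictors (their overlap).
    """
    r_squared = (r1**2 + r2**2 - 2 * r1 * r2 * r12) / (1 - r12**2)
    return math.sqrt(r_squared)


def incremental_validity(r1, r2, r12):
    """How much the second predictor adds on top of the first."""
    return multiple_r(r1, r2, r12) - r1


# Illustrative validities; the overlap values are assumptions.
for overlap in (0.0, 0.3, 0.5):
    gain = incremental_validity(0.30, 0.25, overlap)
    print(f"overlap r12 = {overlap:.1f}: incremental validity = {gain:.3f}")
```

The more the two methods overlap, the less the second one adds. Note that this linear setup can only ever capture overlap; a hurdle-style interaction of the kind I sketched above, where methods are worth more together than separately, would need a different model.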

I’ve added a few observations from the book below, as well as some more comments. The main thing I’ve taken away from the first part of the book is that even the most informative methods available to future employers don’t really tell you nearly as much as you might think they do (and it seems clear from the coverage that most of them probably tell the employers/interviewers much less than these people think they do) – there’s a lot of variation in performance etc. which is in some sense unaccounted for.

…

“Whatever we might believe about physiognomy and personology, it is clear that many people make inferences about others based on appearance. […] Considerable experimental evidence suggests that people can and do infer personality traits from faces […] Taken as a whole, this research shows that the process of inferring traits from faces is highly reliable. That is, different judges tend to infer similar traits from given faces. [see incidentally Funder for more details on this kind of stuff; the details are messier than the authors let on here – for example it matters a lot which traits we’re talking about here, as some are much easier to observe than are others] […] However, the picture that emerges regarding the validity of physiognomic judgements is more ambiguous. […] There remains very little evidence that body shape is a robust marker of temperament or ability and should therefore be used for personnel selection. That said, it is to be expected that people’s (for example, interviewers’) perceptions of others’ (e.g., interviewees’ or job applicants’) psychological traits will be influenced by physical traits, but this will inevitably represent a distorted and erroneous source of information and should therefore be avoided.”

“The result of an interview is usually a decision. Ideally this process involves collecting, evaluating and integrating specific salient information into a logical algorithm that has shown to be predictive.
However, there is an academic literature on impression formation that has examined experimentally how precisely people select particular pieces of information. Studies looking at the process in selection interviews have shown all too often how interviewers may make their minds up before the interview even occurs (based on the application form or CV of the candidate), or that they make up their minds too quickly based on first impression (superficial data) or their own personal implicit theories of personality. Equally, they overweigh or overemphasise negative information or bias information not in line with the algorithm they use. […] Research in this area has gone on for fifty years at least. Over the years small, relatively unsophisticated studies have been replaced by ever more useful and important meta-analyses. There are now a sufficient number of meta-analyses that some have done helpful summaries of them. Thus Cook (2004) reviewed Hunter and Hunter (1984) (30 studies); Wiesner and Cronshaw (1988) (160 studies); Huffcutt and Arthur (1994) (114 studies) and McDaniel, Whetzel, Schmidt and Maurer (1994) (245 studies). These meta-analyses covered many different studies done in different countries over different jobs and different time periods, but the results were surprisingly consistent. Results were clear: the validity coefficient for unstructured interviews as predictors of job performance is around r = .15 (range .11 – .18), while that for structured interviews is around r = .28 (range .24 – .34). Cook (2004) calculates the overall validity of all interviews over three recent meta-analyses – taking job performance as the common denominator of all criteria examined – to be around r = .23.”
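
To put those coefficients into perspective, here is a quick calculation of the ‘variance explained’ they imply, using the simple rule that the share of variance accounted for by a single predictor is the squared correlation. The validities are the ones quoted above; treating r squared as the relevant summary is my own simplification.

```python
# Validity coefficients as quoted from the book (Cook 2004 summary).
validities = {
    "unstructured interview": 0.15,
    "structured interview": 0.28,
    "all interviews (Cook 2004)": 0.23,
}

# Share of variance in job performance accounted for: r squared.
variance_explained = {name: r**2 for name, r in validities.items()}

for name, share in variance_explained.items():
    print(f"{name}: r = {validities[name]:.2f}, variance explained = {share:.2%}")
```

That works out to roughly 2%, 8% and 5% respectively, so even the structured interview leaves more than 90% of the variation in job performance unaccounted for.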

“given that interviews are used to infer information about candidates’ abilities or personality traits […], they provide very little unique information about a candidate and show little incremental validity over established psychometric tests (of ability and personality) in the prediction of future job performance […] All sorts of extraneous factors like the perfume a person wears at interview have been shown to influence ratings.”

There’s a literature on this stuff, and they provide a few samples of the sort of findings that may pop up when people look at these things. I’ve talked about some of these before, but I’ll add them here anyway. Attractive people get higher evaluations – this is not surprising. Female interviewers gave higher ratings than male interviewers. Early impressions were more important than factual information for interviewer ratings. There’s a contrast effect at work where your rating may be influenced by the rating of the guy who came before you. Non-verbal communication clearly matters – ‘applicants who looked straight ahead, as opposed to downwards, were rated as being more alert, assertive and dependable; they were also more likely to be hired. Applicants who demonstrated a greater amount of eye contact, head moving and smiling received higher evaluations.’ Interviewers give higher ratings to applicants they perceive to be similar to themselves, and/or applicants they find ‘likeable’. Interviewers may rate negative information more heavily than positive information. They have been found to talk more when they’ve formed a favourable decision. Pre-interview impressions have been found to have strong effects on the outcome of an interview; if the interviewer is favourably inclined before the interview starts, the interviewer is more likely to rate you highly and to think you handled the interview well. In terms of the time interviewers spend making a decision, it’s noteworthy that the decision to hire may be made very fast: “Interviewers reached a final decision early in the interview process; some studies have indicated the decision is made after an average of 4 minutes. Decisions to hire were made sooner than decisions not to hire.”

A bit more job interview stuff:

“Interpersonal skills manifest in interviewing can be characterised by:

Fluency: smooth, controlled, unflustered progress.
Rapidity: speedy responses to answers and issues.
Automaticity: performing tasks without having to think.
Simultaneity: the ability to mesh and coordinate multiple, verbal and non-verbal tasks at the same time.
Knowledge: Knowing the what, how, when and why of the whole interview process.

Skills also involve understanding the real goal of the interview, being perceptive, understanding what is and what is not being said, and empathy. Recent research in the past decade has argued that the key issue assessed by the employment interview is the person–organisational fit […] [however] most interviewers try to assess candidate’s personality traits, followed closely by social or interpersonal skills, and not that closely by intelligence and knowledge. On a few occasions, interviewers focus on assessing interviewees’ preferences or interests and physical attributes, and the variable of least interest appears to be fit […] It is noteworthy that all these variables can be assessed via reliable and valid psychometric tests […], which begs the question of what if any unique information (that is reliable and valid) can be extracted from employment interviews.”

What about references? One funny observation I hadn’t thought about in this context is that if you’re an employer who wants to get rid of a guy, a ‘good way’ to help him on his way is to write a nice reference letter. In a way you have a much stronger incentive to provide the low-productivity worker with a very nice letter of reference than you do your star employee; you’d much rather the latter didn’t go anywhere and kept working in your company. More generally, references tend to say nice things about the people who ask for them (and not only nice things, but the same nice things – ‘referees tend to write similar references for all candidates’), meaning that the variance is low – references (in particular the unstructured ones) don’t actually tell you very much because they tend to look very similar. Here’s part of what they write in the book:

“References are almost as widely used in personnel selection as the interview […] Yet there has been a surprising dearth of research on the reliability and validity of the reference letter; and, as shown in this chapter, an assessment of the existing evidence suggests that the reference is a poor indicator of candidates’ potential. Thus Judge and Higgins (1998) concluded that ‘despite widespread use, reference reports also appear to rank among the least valid selection measures’ […] The low reliability of references has been explained in terms of evaluative biases (Feldman, 1981) attributable to personality characteristics of the referee […] Most notably, the referee’s mood when writing a reference will influence whether it is more or less positive […] Some of the sources of such mood states are arguably dispositional […] and personality characteristics can have other (non-affective) effects on evaluations, too. For example, agreeable referees […] can be expected to provide more positive evaluations”

“Although the wider literature has provided compelling evidence for the fact that cognitive ability tests, particularly general mental ability scores, are the best single predictor of work performance […], Dean and Russell’s (2005) results provide a robust source of evidence in support of the validity of coherently constructed and scored biodata scales”

One big problem is that although that may be the case, that validity is mostly academic, as employers mostly don’t handle the data that way – i.e. by using ‘coherently constructed and scored scales’: “Biodata are typically obtained through application forms […] It is […] noteworthy that application forms are generally not treated or scored as biodata. Rather, they represent the collection method for obtaining biographical information and employers or recruiters often assess this information in non-structured, informal, intuitive ways”. So, yeah.

“The most important conclusion with regard to biodata is no doubt that they represent a valid approach for predicting occupational success (in its various forms). Indeed, meta-analytic estimates provided validities for biodata in the region of .25 […] In any case, this means that biodata are as valid predictors as the best personality scales, though the fact that biodata scales overlap with both personality and cognitive ability measures limits the appeal of biodata.”

“GPA-based selection has been the target of recurrent criticisms over the years and there are still many employers and recruiters who are reluctant to select on the basis of GPA. […] [however] many selection strategies use GPA to ‘sift’ or select out candidates during the early stages of the selection process […] In the past ten years meta-analysis has provided compelling evidence for the validity of GPA in occupational settings. Most notably, Roth and colleagues reported corrected validities above .30 for job performance […] and .20 for initial level of earnings […] the highest validity was found for job performance one year after graduating, with validities decreasing thereafter […] When salary is taken as the criterion, the highest validity was found for current salary, followed by starting salary, and last for salary growth […] the overall corrected validity above .30 for performance and around .20 for salary is at least comparable to and often higher than that of personality traits […] the causes of individual differences in GPA are at least in part similar to the causes of individual differences in job outcomes. […] GPA can be conceptually linked to occupational performance in that it carries variance from both ability and non-ability factors that are determinants of individual differences in real-world success”
