Abstract

This paper investigates the importance of ethnic homophily in the hiring discrimination process. Our evidence comes from a correspondence test performed in France in which we use three different kinds of ethnic identification: French-sounding names, North African-sounding names, and “foreign”-sounding names with no clear ethnic association. Within the groups of men and women, we show that all non-French applicants are equally discriminated against when compared to French applicants. Moreover, we find direct evidence of ethnic homophily: recruiters with European names are more likely to call back French-named applicants. These results show the importance of favoritism for in-group members. To test for the effect of information about applicants’ skills, we also add a signal related to language ability in all resumes sent to half the job offers. The design allows us to uniquely identify the effect of the language signal by gender. Although the inclusion of the signal significantly reduces the discrimination against non-French females, its effect is much weaker for male minorities.

Notes

Acknowledgements

We gratefully acknowledge David Neumark for sharing his data and estimation programs with us. We also thank Francis Bloch, Nick Bloom, Emmanuel Duguet, Christelle Dumas, Raquel Fernandez, Stéphane Gauthier, James Heckman, Shelly Lundberg, Muriel Niederle, Phillip Oreopoulos, Paolo Pin, Chris Taber, Marie-Anne Valfort, as well as participants at various conferences and seminars for their thoughtful comments on the paper. We are grateful to the CEPREMAP for financial support. Nicolas Jacquemet acknowledges the Institut Universitaire de France for its support. Constantine Yannelis thanks the Alexander S. Onassis Foundation for generous support.

Compliance with ethical standards

Conflict of interest

The authors declare that they have no competing interests.

Appendix

Hiring discrimination with heteroskedastic unobserved heterogeneity

The data generating process of a correspondence study stems from employers’ treatment of the content of the applications. We denote by \(P_i^*(J,X,Z)\) the productivity of application i in a given position. This productivity depends on the job’s characteristics J (measured mainly through firm-specific variables). At the individual level, productivity results from two components: a set of individual characteristics that are observable to both the econometrician and the employer, \(X_i\); and a component which is unobservable to them both, \(Z_i\). We assume that ethnicity, denoted \(R_i = 0\) for non-minorities (e.g., whites) and \(R_i = 1\) for minorities (e.g., North Africans), does not enter productivity directly. This allows us to focus on discrimination, which in this framework might occur for two reasons. First, let the employer callback decision be described by the latent variable \(T_i^*\), which determines the treatment applied to a given application (it will drive a dichotomous choice variable below, but it can be thought of as any continuous outcome, such as wage offers). Taste-based discrimination implies that such treatment depends not only on the productivity of the applicant but also on unproductive observable characteristics such as ethnicity:

which implies statistical discrimination as long as \({\Bbb E}(Z_i|R_i) \ne {\Bbb E}(Z_i)\), that is, as long as the unobserved productivity of an applicant is thought of as being dependent on ethnicity. When each employer receives two applications, one from a non-minority applicant and another from a minority one, discrimination shows up in the average differential treatment reserved to applications,

By construction, this difference in callback cannot be driven by employers’ characteristics. We thus omit the term δJ below, implicitly including it in the deterministic part of the model, βX.

The content of correspondence test data

By construction, a correspondence test controls the observables X in such a way that they do not systematically differ across sub-groups: the difference in observed characteristics must balance over job applications, hence \({\Bbb E}(X|R = 0) = {\Bbb E}(X|R = 1)\). The difference in treatment over all job applications thus arises due to two parameters: taste-based discrimination, γ, and statistical discrimination, which induces a gap in the (perceived or actual) means of the unobserved productive characteristics, \({\Bbb E}(Z|R = 0) - {\Bbb E}(Z|R = 1)\). Note that the two parameters can only be identified together unless the study provides enough control to guarantee that the distribution means are exactly equal, so that only taste-based discrimination occurs. However, all components of this aggregate effect arise as the result of an unequal treatment based on unproductive observable characteristics. As such, they correspond to the economic definition of discriminatory behavior. In what follows, we thus focus on the identification of the gross handicap experienced by minority applicants, μ, which stands for the sum of these two parameters: \(\mu = {\Bbb E}(Z|R = 0) - {\Bbb E}(Z|R = 1) + \gamma\).

As noted by Heckman and Siegelman (1993) and Heckman (1998), the observational context of a correspondence study may lead to biased estimates if the variance of the error term is ethnicity-specific. This is the case because the observed outcome is non-linearly related to the underlying discriminatory decisions. To see this explicitly, denote by c the perceived quality threshold an applicant has to reach to be invited for an interview: the outcome variable of the experiment is \(T = {\mathbf{1}}[T^ * >c]\). Further assume that the unobserved productivity Z is i.i.d. normally distributed in each ethnic sub-group, but with a (perceived or actual) variance that varies across groups. For ease of exposition, we denote \(\sigma _0^2 = Var(Z|R = 0)\) and \(\sigma _1^2 = Var(Z|R = 1)\). The statistical model generating the observed outcome thus gives the following specifications for the probability of obtaining a callback:

where Φ denotes the standard normal cumulative distribution function. The difference between these two expressions is the empirical source of identification for discrimination. However, even in the extreme case with neither statistical nor taste-based discrimination (i.e., γ = 0 and \({\Bbb E}(Z|R) = {\Bbb E}(Z)\), so that μ = 0), the difference in probabilities still depends on the comparison between σ1 and σ0. For instance, if the X have been chosen at the lower tail of the skills distribution, so that βX < c, and σ1 < σ0, then it has to be that \(\Phi [(\beta X - c){\mathrm{/}}\sigma _1] < \Phi [(\beta X - c){\mathrm{/}}\sigma _0]\) for all such X: the experiment produces (spurious) evidence of discrimination against people from R = 1 origin. In the more general case with both discrimination and differences in the variance of unobservables, this framework highlights the identification problem faced by correspondence test data.
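The mechanism can be checked numerically. The following is a minimal sketch, not part of the original analysis: it assumes μ = 0, so that both groups share the index βX − c and differ only in the standard deviation of the unobservables. The function name `callback_gap` and all numerical values are hypothetical and chosen for illustration only.

```python
from statistics import NormalDist

# Standard normal CDF, written Phi in the text.
PHI = NormalDist().cdf


def callback_gap(beta_x, c, sigma0, sigma1):
    """Callback-rate gap P(T=1|R=0) - P(T=1|R=1) when mu = 0.

    beta_x : deterministic quality index (beta * X in the text)
    c      : callback threshold
    sigma0 : std. dev. of unobserved productivity for R = 0
    sigma1 : std. dev. of unobserved productivity for R = 1
    """
    p0 = PHI((beta_x - c) / sigma0)
    p1 = PHI((beta_x - c) / sigma1)
    return p0 - p1


# Illustrative (made-up) values, not estimates from the study.
gap_equal = callback_gap(beta_x=0.0, c=1.0, sigma0=1.0, sigma1=1.0)
gap_low = callback_gap(beta_x=0.0, c=1.0, sigma0=1.0, sigma1=0.5)   # beta_x < c
gap_high = callback_gap(beta_x=2.0, c=1.0, sigma0=1.0, sigma1=0.5)  # beta_x > c
print(gap_equal, gap_low, gap_high)
```

With equal variances the gap is zero. With unequal variances it is non-zero even though μ = 0, and its sign flips depending on whether the applications sit below or above the threshold c.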

Unbiased measures of discrimination

The identification problem arises because the average treatment effect \(\Delta = \overline T _{\left| {R = 0} \right.} - \overline T _{\left| {R = 1} \right.}\) provides an estimate of \(\frac{{{\Bbb E}(Z|R = 0) - c}}{{\sigma _0}} - \frac{{{\Bbb E}(Z|R = 1) - \gamma - c}}{{\sigma _1}}\) (up to the functional form of the distribution), while what one seeks to measure is \(\mu = {\Bbb E}(Z|R = 0) - {\Bbb E}(Z|R = 1) + \gamma\): the systematic disadvantage experienced by minority applicants due to their belonging to an observable sub-population. As a result, the confounding effect of the dispersion in the unobservables shows up in the comparison of callback probabilities, thus affecting the mean comparisons between outcomes from the study.

As noted by Neumark (2012), one can, however, use restrictions on β to restore identification. If the study provides enough variation in relevant application characteristics, grouped in X in the model above, one can estimate β/σ0 and β/σ1 from the (heteroskedastic) Probit model derived in the previous section. Under the assumption that the effect of X on the callback is homogeneous across ethnicity (i.e., the true value of β is the same), the ratio of the two point estimates identifies σ0/σ1, which in turn allows us to estimate μ after the usual normalization setting one of the variance terms equal to 1. Since only one such identifying regressor is needed to achieve identification, any additional productivity control provides the usual specification tests of an over-identified model.
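The logic of this identification argument can be illustrated with population callback probabilities rather than an estimated model. The sketch below is an illustration under stated assumptions, not Neumark's estimation procedure. It assumes a single scalar regressor x observed at two quality levels, a latent index βx − c for R = 0 and βx − μ − c for R = 1, and the normalization σ0 = 1. The helper names `callback_prob` and `recover_handicap` and all parameter values are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def callback_prob(x, beta, c, mu, sigma):
    """Probit callback probability for quality x and gross handicap mu."""
    return STD_NORMAL.cdf((beta * x - mu - c) / sigma)


def recover_handicap(p0, p1, x_low, x_high):
    """Recover (sigma1/sigma0, mu) from callback rates at two quality levels.

    p0, p1 : pairs of callback probabilities (at x_low, at x_high)
             for R = 0 and R = 1; mu is expressed in units of sigma0.
    """
    z0 = [STD_NORMAL.inv_cdf(p) for p in p0]
    z1 = [STD_NORMAL.inv_cdf(p) for p in p1]
    slope0 = (z0[1] - z0[0]) / (x_high - x_low)     # beta / sigma0
    slope1 = (z1[1] - z1[0]) / (x_high - x_low)     # beta / sigma1
    sigma_ratio = slope0 / slope1                   # sigma1 / sigma0
    intercept0 = z0[0] - slope0 * x_low             # -c / sigma0
    intercept1 = z1[0] - slope1 * x_low             # -(mu + c) / sigma1
    mu_hat = intercept0 - sigma_ratio * intercept1  # mu / sigma0
    return sigma_ratio, mu_hat


# Illustrative (made-up) parameters: sigma0 = 1, sigma1 = 2, mu = 0.5.
beta, c, mu, sigma1 = 0.8, 1.0, 0.5, 2.0
p0 = (callback_prob(0.0, beta, c, 0.0, 1.0), callback_prob(1.0, beta, c, 0.0, 1.0))
p1 = (callback_prob(0.0, beta, c, mu, sigma1), callback_prob(1.0, beta, c, mu, sigma1))
sigma_ratio, mu_hat = recover_handicap(p0, p1, 0.0, 1.0)
print(sigma_ratio, mu_hat)
```

Note that the function returns σ1/σ0, the inverse of the ratio σ0/σ1 mentioned in the text; either ratio carries the same information. With sample data, the slopes and intercepts would instead come from the estimated heteroskedastic Probit and would be subject to sampling error.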