The following section of the EEOC Uniform Guidelines on Employee Selection Procedures provides a definition of "test fairness":




One of the most common ways to determine whether a test is "fair" involves what has come to be known as the "Cleary model" or "regression model." T. Anne Cleary (1968) first described this method over 30 years ago. Consider Figure 1 below in which personnel selection test scores X are plotted against subsequent measures of job performance Y.


Note: in the late 1980s my daughter wanted to see what I did at work, so I let her attend one of the master's-level statistics classes I was teaching for the Rutgers School of Management and Labor Relations Human Resources degree program. That night I covered the Cleary model and, when I asked her how she liked the course, she complimented me on my "hot dogs and sticks." Hence, a biased test will be one in which the two hot dogs are not on the same stick! If this metaphor helps you understand the issue, please see the footnotes below Figures 2 and 3 for its extension to "two hot dogs on one stick" and an Oscar Mayer product called a "cheese dog."
Figure 1: An Unfair Test (or two hot dogs on separate sticks). Note the following points of interest:

A Statistical Interlude

Skip this part if you want a conceptual understanding of test fairness and don't plan on actually crunching numbers to determine its presence or absence. Figure 1 describes a classic example of an unfair or "biased" test. The preferred method for statistically detecting this circumstance is hierarchical moderated regression analysis. This procedure involves an F test of the following null hypothesis:

With adequate sample size (which can pose a problem when N_blacks is very small), circumstances like those found in Figure 1 will yield an F statistic large enough to reject the null hypothesis. Statistically, this is equivalent to testing the null hypothesis that the slopes and intercepts of the lines running through the white and black ellipses do not differ.
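For readers who do plan to crunch numbers, the comparison can be sketched in code. The null hypothesis itself is not reproduced in this text, so the sketch assumes the standard moderated-regression formulation: regress Y on X alone, then on X, a 0/1 group dummy G, and the product X*G, and use an F test of whether the two added coefficients are jointly zero. The function names, the coding of G, and the data are all hypothetical, and the small linear solver is only there to keep the example free of dependencies; a statistics package would normally do this work.

```python
def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting.
    Assumes the system is non-singular (no check is made)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols_sse(rows, y):
    """Least squares via the normal equations; returns (coefficients, SSE)."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    sse = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, yi in zip(rows, y))
    return beta, sse


def cleary_f_test(x, y, group):
    """F statistic for H0: intercept shift = slope shift = 0 (assumed form)."""
    n = len(y)
    reduced = [[1.0, xi] for xi in x]                         # Y on X only
    full = [[1.0, xi, gi, xi * gi] for xi, gi in zip(x, group)]  # adds G, X*G
    _, sse_reduced = ols_sse(reduced, y)
    beta_full, sse_full = ols_sse(full, y)
    df_num, df_den = 2, n - 4
    f_stat = ((sse_reduced - sse_full) / df_num) / (sse_full / df_den)
    return f_stat, df_num, df_den, beta_full


# Made-up data: same slope in both groups, intercepts one point apart,
# i.e. two parallel "hot dogs" on separate sticks.
x_one = [1, 2, 3, 4, 5, 6]
noise = [0.1, -0.1, 0.1, -0.1, 0.1, -0.1]
y_g0 = [2.0 + 0.5 * xi + e for xi, e in zip(x_one, noise)]
y_g1 = [1.0 + 0.5 * xi + e for xi, e in zip(x_one, noise)]
x = x_one + x_one
y = y_g0 + y_g1
group = [0.0] * 6 + [1.0] * 6

f_stat, df_num, df_den, beta_full = cleary_f_test(x, y, group)
print(f"F({df_num}, {df_den}) = {f_stat:.2f}")
print("intercept shift:", round(beta_full[2], 3),
      "slope shift:", round(beta_full[3], 3))
```

The resulting F would be compared with a critical value for (2, n - 4) degrees of freedom. With very few cases in one group the denominator degrees of freedom shrink and the test loses power, which is the sample-size problem noted above.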

Clearly, any application of a personnel selection system exhibiting the XY relationships found in Figure 1 will cause adverse impact: no matter where a cut score is set, more whites will be selected than blacks and the proportion of blacks hired relative to their availability in the applicant pool will be meaningfully smaller than the proportion of whites hired relative to their availability (i.e., the 4/5ths rule will likely be violated). Figure 2 below describes an instance where a test does not exhibit bias (i.e., it is a fair test) yet still causes adverse impact.
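The 4/5ths rule mentioned above is simple arithmetic: each group's selection rate is divided by the rate of the group with the highest selection rate, and a ratio below 0.8 is generally taken as evidence of adverse impact. A minimal sketch follows; the function names and the applicant counts are hypothetical and chosen only for illustration.

```python
def selection_rate(hired, applicants):
    """Proportion of a group's applicants who were hired."""
    return hired / applicants


def four_fifths_check(rates):
    """Map each group to (impact ratio, flagged), where the impact ratio is
    the group's selection rate divided by the highest group's rate."""
    top = max(rates.values())
    return {g: (r / top, r / top < 0.8) for g, r in rates.items()}


# Hypothetical applicant pool: 80 white applicants, 40 black applicants.
rates = {
    "white": selection_rate(48, 80),  # 0.60
    "black": selection_rate(12, 40),  # 0.30
}
result = four_fifths_check(rates)
for group_name, (ratio, flagged) in result.items():
    print(group_name, round(ratio, 2), "adverse impact" if flagged else "ok")
```

Note that the check uses only hiring outcomes. It says nothing about whether the test predicts job performance equally well or along the same regression line for both groups, which is why a test can pass the Cleary fairness criterion and still fail this one.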

Continuing from the note below Figure 1, an unbiased test will be one in which the two hot dogs are on the same stick. Figure 3 below describes the concept of differential validity using an Oscar Mayer product called a "cheese dog."
Figure 2: A Fair Test (or two hot dogs on a single stick). Note the following points of interest:

Finally, Figure 3 shows a test that meets the EEO Guidelines fairness criterion yet exhibits differences in the strength of the XY relationship for black and white applicants. This was commonly referred to as "differential validity" during the 1970s, when conceptual definitions of test fairness were being thrashed out in the literature. See original research by Bartlett, Bobko, and Moser (1978) and Bobko and Bartlett (1978) for a sampling of studies addressing these issues. See Arvey and Faley (1992) for a complete discussion of these issues.
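The distinction between a shared regression line and a shared correlation can be illustrated numerically. The sketch below builds two made-up groups whose points scatter around the same line, one tightly and one loosely, so the fitted slope and intercept match while r_xy differs. The group labels, the line Y = 1 + 0.5X, and the spread values are assumptions for illustration only and are not taken from Figure 3.

```python
from statistics import mean


def fit_line_and_r(x, y):
    """Return (slope, intercept, r) for a simple regression of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r


# Two observations at each test score, one above and one below the line,
# so the scatter does not move the fitted line itself.
x = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]


def scores(spread):
    return [1 + 0.5 * xi + (spread if i % 2 == 0 else -spread)
            for i, xi in enumerate(x)]


y_group_a = scores(0.2)  # tight scatter: the inner "cheese"
y_group_b = scores(0.8)  # loose scatter: the outer "hot dog"

slope_a, intercept_a, r_a = fit_line_and_r(x, y_group_a)
slope_b, intercept_b, r_b = fit_line_and_r(x, y_group_b)
print("group a:", round(slope_a, 3), round(intercept_a, 3), round(r_a, 3))
print("group b:", round(slope_b, 3), round(intercept_b, 3), round(r_b, 3))
```

Both groups sit on the same stick (identical slope and intercept), so the regression criterion is met even though the test predicts performance more precisely for one group than for the other.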


Continuing from the notes below Figures 1 and 2, this is an unbiased test because the two hot dogs are on the same stick. Figure 3 is perhaps best conceived in terms of an Oscar Mayer product called a "cheese dog," i.e., a hot dog product (the outer ellipse) which has been injected with an inner "ellipse" of cheese.
Figure 3: Differential Validity in a Fair Test (cheese dog). Note the following points of interest:

It is important to note that inferences of test fairness are not related to whether the test exhibits equal (or comparable) criterion validity for black and white applicants (i.e., r_xy is the same for blacks and whites). Perhaps the most perplexing aspect of "differential validity" is the frequently reported finding of:



© Craig J. Russell, 2000