Abstract
Purpose
Berry et al.’s (J Appl Psychol 96:881–906, 2011) meta-analysis of cognitive ability test validity data across employment, college admissions, and military domains demonstrated that validity is lower for Black and Hispanic subgroups than for Asian and White subgroups. However, Berry et al. relied on observed test-criterion correlations and it is therefore not clear whether validity differences generalize beyond observed validities. The present study investigates the roles that range restriction and criterion contamination play in differential validity.
Design/Methodology/Approach
A large dataset (N > 140,000) containing SAT scores and college grades of Asian, Black, Hispanic, and White test takers was used. Within-race corrections for multivariate range restriction were applied. Differential validity analyses were carried out using freshman GPA versus individual course grades as criteria to control for the contaminating influence of individual differences between students in course choice.
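To illustrate the kind of adjustment involved, the univariate (Thorndike Case II) correction for direct range restriction can be sketched as follows. This is a simplified sketch with hypothetical numbers; the study itself applied multivariate corrections within each racial/ethnic subgroup, which this single-predictor formula does not capture.

```python
import math

def correct_for_range_restriction(r_restricted, u):
    """Thorndike Case II correction for direct range restriction.

    r_restricted : observed predictor-criterion correlation in the
                   range-restricted (selected) sample
    u            : ratio of the unrestricted to the restricted predictor
                   standard deviation (u > 1 under range restriction)
    """
    numerator = r_restricted * u
    denominator = math.sqrt(1 + r_restricted ** 2 * (u ** 2 - 1))
    return numerator / denominator

# Hypothetical example: an observed SAT-FGPA correlation of .35 in an
# admitted sample whose SAT standard deviation is two-thirds that of
# the applicant pool (u = 1.5). The corrected validity is larger.
r_corrected = correct_for_range_restriction(0.35, 1.5)
print(round(r_corrected, 3))  # → 0.489
```

Because the amount of range restriction can differ across subgroups, applying such corrections within each subgroup can change the size of subgroup validity differences, which is the logic the study's multivariate corrections follow.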
Findings
Observed validities understated the magnitude of validity differences between subgroups: when range restriction and criterion contamination were controlled, subgroup validity differences grew larger rather than smaller. Analyses also demonstrated that these validity differences would translate into larger regression slope differences (i.e., differential prediction).
Implications
Subgroup differences in range restriction and/or individual differences in course choice cannot account for the lower validity of the SAT for Black and Hispanic subgroups; controlling for these factors increased subgroup validity differences. Future research must look to other explanations for subgroup validity differences.
Originality
The present study is the first differential validity study to simultaneously control for range restriction and individual differences in course choice, and answers a call to investigate potential causes of differential validity.
Notes
The matched course datasets included only those SAT-ICG correlations drawn from courses with both minority and White students. This was the most appropriate way to control for differential course-taking patterns because most courses White students took did not include minority students. However, this greatly reduced the number of courses that would have otherwise been included for the White sample. To investigate the influence of this decision rule, we ran parallel analyses including all SAT-ICG correlations, regardless of whether they included both minority and White students. Validity estimates remained very similar and study conclusions were unchanged. Detailed results are available upon request from the first author.
Parallel ICG analyses were run using subgroup-specific estimates of number of courses taken in the freshman year (which ranged from 9.50 to 10.01 across subgroups) and intercorrelation between ICGs (which ranged between .38 and .49). No results changed by more than 3 correlation points, and the general pattern of results (i.e., Asian and White validities essentially equivalent with Black and Hispanic validities being lower) did not change.
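The sensitivity check in this note can be illustrated with the standard formula for the correlation between a predictor and an equally weighted composite of k criterion measures. The sketch below uses hypothetical inputs (a nominal SAT-ICG validity of .30) rather than the authors' actual values, varying k and the ICG intercorrelation across the ranges reported above.

```python
import math

def composite_validity(r_xy, k, r_yy):
    """Correlation between a predictor and the sum of k criterion
    measures (e.g., individual course grades), where r_xy is the
    average predictor-criterion correlation and r_yy the average
    intercorrelation among the criterion measures."""
    return (k * r_xy) / math.sqrt(k + k * (k - 1) * r_yy)

# Hypothetical SAT-ICG validity of .30, with the number of freshman
# courses (k) and ICG intercorrelation varied across the ranges
# reported in the note (k = 9.5 to 10, r_yy = .38 to .49).
low = composite_validity(0.30, 9.5, 0.49)
high = composite_validity(0.30, 10.0, 0.38)
print(round(low, 3), round(high, 3))  # → 0.407 0.451
```

The computation shows how the assumed number of courses and their intercorrelation feed into the composite validity, and hence why the authors checked that subgroup-specific values for these parameters left the pattern of results unchanged.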
References
Aguinis, H., Culpepper, S. A., & Pierce, C. A. (2010). Revival of test bias research in preemployment testing. Journal of Applied Psychology, 95, 648–680.
Aguinis, H., & Smith, M. A. (2007). Understanding the impact of test validity and bias on selection errors and adverse impact in human resource selection. Personnel Psychology, 60, 165–199.
Beaujean, A. A., Firmin, M. W., Knoop, A. J., Michonski, J. D., Berry, T. P., & Lowrie, R. E. (2006). Validation of the Frey and Detterman (2004) IQ prediction equations using the Reynolds intellectual assessment scales. Personality and Individual Differences, 41, 353–357.
Berry, C. M. (2007). Toward an understanding of evidence of differential validity of cognitive ability tests for racial/ethnic subgroups. Unpublished doctoral dissertation, University of Minnesota.
Berry, C. M., Clark, M. A., & McClure, T. K. (2011). Racial/ethnic differences in the criterion-related validity of cognitive ability tests: A qualitative and quantitative review. Journal of Applied Psychology, 96, 881–906.
Berry, C. M., & Sackett, P. R. (2009). Individual differences in course choice result in underestimation of the validity of college admissions systems. Psychological Science, 20, 822–830.
Breland, H. M. (1979). Population validity and college entrance measures (College Board Research and Development Report RDR 78–79 No. 2). Princeton, NJ: Educational Testing Service.
Bridgeman, B., McCamley-Jenkins, L., & Ervin, N. (2000). Predictions of freshman grade-point average from the revised and recentered SAT I: Reasoning test (College Board Research Report No. 2000-1; Educational Testing Service Research Report No. 00–1). New York: College Entrance Examination Board.
Campbell, J. P. (1990). An overview of the Army selection and classification project (Project A). Personnel Psychology, 43, 231–239.
Elliott, R., & Strenta, A. C. (1988). Effects of improving the reliability of the GPA on prediction generally and on comparative predictions for gender and race particularly. Journal of Educational Measurement, 25, 333–347.
Frey, M. C., & Detterman, D. K. (2004). Scholastic assessment or g? The relationship between the scholastic assessment test and general cognitive ability. Psychological Science, 15, 373–378.
Ghiselli, E. E., Campbell, J. P., & Zedeck, S. (1981). Measurement theory for the behavioral sciences. San Francisco, CA: W. H. Freeman & Co.
Gulliksen, H. (1950). Theory of mental tests. New York: Wiley.
Hunter, J. E. (1980). Validity generalization for 12,000 jobs: An application of synthetic validity and validity generalization to the general aptitude test battery (GATB). Washington, D.C.: U.S. Department of Labor, Employment Service.
Hunter, J. E., & Schmidt, F. L. (2004). Methods of meta-analysis: Correcting error and bias in research findings (2nd ed.). Thousand Oaks, CA: Sage.
Hunter, J. E., Schmidt, F. L., & Hunter, R. (1979). Differential validity of employment tests by race: A comprehensive review and analysis. Psychological Bulletin, 86, 721–735.
Hunter, J. E., Schmidt, F. L., & Le, H. (2006). Implications of direct and indirect range restriction for meta-analysis methods and findings. Journal of Applied Psychology, 91, 594–612.
Johnson, J. W., Carter, G. W., Davison, H. K., & Oliver, D. H. (2001). A synthetic validity approach to testing differential prediction hypotheses. Journal of Applied Psychology, 86, 774–780.
Keef, S. P., & Roberts, L. A. (2004). The meta-analysis of partial effect sizes. British Journal of Mathematical and Statistical Psychology, 57, 97–129.
Kuncel, N. R., Crede, M., & Thomas, L. L. (2005). The validity of self-reported grade point averages, class ranks, and test scores: A meta-analysis and review of the literature. Review of Educational Research, 75, 63–82.
Kuncel, N. R., Hezlett, S. A., & Ones, D. S. (2004). Academic performance, career potential, creativity, and job performance: Can one construct predict them all? Journal of Personality and Social Psychology, 86, 148–161.
Lautenschlager, G. J., & Mendoza, J. L. (1986). A step-down hierarchical multiple regression analysis for examining hypotheses about test bias in prediction. Applied Psychological Measurement, 10, 133–139.
Linn, R. L. (1978). Single-group validity, differential validity, and differential predictions. Journal of Applied Psychology, 63, 507–514.
Linn, R. L. (1983). Pearson selection formulas: Implications for studies of predictive bias and estimates of educational effects in selected samples. Journal of Educational Measurement, 20, 1–15.
Mattern, K. D., Patterson, B. F., Shaw, E. J., Kobrin, J. L., & Barbuti, S. M. (2008). Differential validity and prediction of the SAT (College Board Research Report No. 2008-4). New York: College Board.
McDaniel, M. A., Kepes, S., & Banks, G. C. (2011). The Uniform Guidelines are a detriment to the field of personnel selection. Industrial and Organizational Psychology: Perspectives on Science and Practice, 4, 494–514.
Mendoza, J. L., Bard, D. E., Mumford, M. D., & Ang, S. C. (2004). Criterion-related validity in multiple-hurdle designs: Estimation and bias. Organizational Research Methods, 7, 418–441.
Morgan, R. (1990). Analyses of predictive validity within student categorizations. In W. W. Willingham, C. Lewis, R. Morgan, & L. Ramist (Eds.), Predicting college grades: An analysis of institutional trends over two decades (pp. 225–238). Princeton, NJ: Educational Testing Service.
Naylor, J. C., & Shine, L. C. (1965). A table for determining the increase in mean criterion score obtained by using a selection device. Journal of Industrial Psychology, 3, 33–42.
Ramist, L., Lewis, C., & McCamley, L. (1990). Implications of using freshman GPA as the criterion for the predictive validity of the SAT. In W. W. Willingham, C. Lewis, R. Morgan, & L. Ramist (Eds.), Predicting college grades: An analysis of institutional trends over two decades (pp. 253–288). Princeton, NJ: Educational Testing Service.
Rigdon, J. L., Shen, W., Kuncel, N. R., Sackett, P. R., Beatty, A. S., & Kiger, T. B. (in press). The role of socioeconomic status in SAT-Freshman grade relationships across gender and racial subgroups. Journal of Educational Measurement.
Rotundo, M., & Sackett, P. R. (1999). Effect of rater race on conclusions regarding differential prediction in cognitive ability tests. Journal of Applied Psychology, 84, 815–822.
Ryan, A. M., McFarland, L., Baron, H., & Page, R. (1999). An international look at selection practices: Nation and culture as explanations for variability in practice. Personnel Psychology, 52, 359–391.
Sackett, P. R., & Yang, H. (2000). Correction for range restriction: An expanded typology. Journal of Applied Psychology, 85, 112–118.
Schmidt, F. L., & Hunter, J. E. (1998). The validity and utility of selection methods in personnel psychology: Practical and theoretical implications of 85 years of research findings. Psychological Bulletin, 124, 262–274.
Schmidt, F. L., Pearlman, K., & Hunter, J. E. (1980). The validity and fairness of employment and educational tests for Hispanic Americans: A review and analysis. Personnel Psychology, 33, 705–724.
Society for Industrial and Organizational Psychology, Inc. (2003). Principles for the validation and use of personnel selection procedures (4th ed.). Bowling Green, OH: SIOP.
Trattner, M. H., & O’Leary, B. S. (1980). Sample sizes for specified statistical power in testing for differential validity. Journal of Applied Psychology, 65, 127–134.
Young, J. W., & Kobrin, J. L. (2001). Differential validity, differential prediction, and college admission testing: A comprehensive review and analysis (College Board Research Report No. 2001-6). New York: College Board.
Acknowledgments
This project was supported in part by the Meredith P. Crawford Fellowship from the Human Resources Research Organization. We thank the College Board for providing the data for this project. We also thank Richard Landers and Haoyu Yu for invaluable computer programming support.
Berry, C.M., Sackett, P.R. & Sund, A. The Role of Range Restriction and Criterion Contamination in Assessing Differential Validity by Race/Ethnicity. J Bus Psychol 28, 345–359 (2013). https://doi.org/10.1007/s10869-012-9284-3