TRANSCRIPT
Copyright © 2006 www.biddle.com
Biddle Consulting Group, Inc., 193 Blue Ravine, Ste 270
Folsom, CA 95630, 1-800-999-0438, www.biddle.com
Are Employers on Safe Grounds Using Validity Generalization (VG) in
Making a Title VII Defense?
2006 SWARM Regional Conference, Little Rock, Arkansas
Contact Information
Dan A. Biddle, Ph.D.
CEO, Biddle Consulting Group, Inc.
193 Blue Ravine, Ste 270
Folsom, CA 95630
1-800-999-0438
www.biddle.com
Email: [email protected]
Overview of Biddle Consulting Group, Inc. (BCG)
• In business since 1974
• Over 200 cases in the EEO/AA area (both plaintiff and defense cases)
• Pioneers in the EEO/AA field
• Administrative Skills Testing (OPAC)
• 911 Dispatcher Testing (CritiCall)
• AAP Software and Services
• EEO Litigation Assistance (expert consulting and witness services)
Agenda
• Criterion-Related Validity
• Validity Generalization (VG)
• Title VII Requirements for Tests that Exhibit Adverse Impact
• VG, Title VII, and the Courts
• Recommendations
• Q&A
The Building Blocks for VG: Criterion-Related Validation Studies
Criterion-Related Validity
• Demonstrated by empirical data showing that the selection procedure is predictive of, or significantly correlated with, important elements of work behavior
• Relies on “correlations” between tests and job criteria
Criterion Validity
[Diagram: Test → Job Performance. The strength of this relationship is reported as a "Validity Coefficient."]
Criterion-Related Study
[Scatter plot: Test Score on the x-axis (0-100) vs. Performance Measure on the y-axis (0-70). Each point pairs a score on a "Test" with a score on some "Criteria" (e.g., job performance, days of missed work, etc.).]
Criterion-Related Study: Correlation Demo
[Scatter plot: Test Score on the x-axis (0-100) vs. Performance Measure on the y-axis (0-70). Example points: Test Score = 22, Performance = 31; Test Score = 85, Performance = 55.]
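The validity coefficient from such a study is simply the Pearson correlation between test scores and criterion scores. A minimal sketch in Python (the applicant data are hypothetical, invented for illustration; only the pairs (22, 31) and (85, 55) come from the slide):

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: the 'validity coefficient' between test and criterion scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical applicant data; (22, 31) and (85, 55) are the slide's example points.
test_scores = [22, 35, 48, 60, 72, 85]
performance = [31, 28, 40, 44, 50, 55]
r = pearson_r(test_scores, performance)
print(f"validity coefficient r = {r:.2f}")
```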
Interpreting Correlation Coefficients
+1.00
+0.50
0.00
-0.50
-1.00
The closer a coefficient is to +1.00 or -1.00, the stronger the relationship between the variables. The stronger the relationship between two variables, the better the ability to predict one given the other.
Validity Coefficient Interpretation: Guidelines for Interpreting Validity Coefficients

> .35        very beneficial
.21 - .35    likely to be useful
.11 - .20    depends on circumstances
< .11        unlikely to be useful

Source: Testing and Assessment: An Employer's Guide to Good Practices (U.S. DOL, 1999).
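These DOL bands translate directly into a lookup. A minimal sketch (the function name is my own, not from the DOL guide; taking the absolute value is my assumption, since the bands describe strength of relationship):

```python
def interpret_validity(r):
    """Classify a validity coefficient per the U.S. DOL (1999) guideline bands."""
    r = abs(r)  # assumption: bands describe magnitude, regardless of sign
    if r > 0.35:
        return "very beneficial"
    if r >= 0.21:
        return "likely to be useful"
    if r >= 0.11:
        return "depends on circumstances"
    return "unlikely to be useful"

print(interpret_validity(0.41))
```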
CRV and Statistical Power
• Power = the ability of a statistical study to find "statistical significance" if it exists
• Power is determined by:
– Sample size (N)
– Effect size (r)
– "1-tail" or "2-tail" tests, and
– Statistical significance level (p)
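The interplay of these four factors can be sketched with the standard Fisher-z approximation for the power of a correlation test (a common textbook method; not necessarily the one used to generate this deck's figures):

```python
import math
from statistics import NormalDist

def power_correlation(r, n, alpha=0.05, tails=1):
    """Approximate power to detect a population correlation r with n score pairs,
    via the Fisher z-transform normal approximation."""
    nd = NormalDist()
    z_r = math.atanh(r)                      # effect size on the Fisher-z scale
    se = 1.0 / math.sqrt(n - 3)              # standard error of z
    z_crit = nd.inv_cdf(1 - alpha / tails)   # 1.645 for 1-tail, 1.96 for 2-tail
    return nd.cdf(z_r / se - z_crit)

# Matches the deck's later claim of ~90% power for r = .25 with N = 134 (1-tail)
print(f"{power_correlation(0.25, 134):.0%}")
```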
Statistical Power for Criterion-Related Validity Studies
[Line chart: Statistical Power on the y-axis (20%-100%) vs. Sample Size on the x-axis (30-250), with separate curves for r = 0.20, r = 0.25, and r = 0.30.]
Validity Generalization (VG): A Brief Overview
VG = Meta Analysis Applied to Test Validation Research
• VG applies meta-analysis techniques to combine the results of several validation studies to form general theories about relationships between variables across different situations
• Schmidt & Hunter (1977) opened the gate to VG techniques in the personnel testing field
VG Uses and Applications
• VG is typically used to answer questions about how:
– Specific tests and/or
– Constructs (traits or abilities)
• Predict across:
– Criteria
– Occupations
– Settings
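At its core, the meta-analytic step in VG begins with a sample-size-weighted average of study correlations (the "bare-bones" Hunter-Schmidt estimator, before any artifact corrections; the three studies below are invented for illustration):

```python
def weighted_mean_r(studies):
    """Bare-bones meta-analytic mean: r_bar = sum(N_i * r_i) / sum(N_i)."""
    total_n = sum(n for n, _ in studies)
    return sum(n * r for n, r in studies) / total_n

# (N, r) pairs from three hypothetical local validation studies
studies = [(120, 0.30), (80, 0.15), (200, 0.22)]
print(f"N-weighted mean validity = {weighted_mean_r(studies):.3f}")
```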
• Meta-analysis Example: Results for Cognitive Ability for Police Officer Occupation (Aamodt, 2004)

Criterion                                    K    N       r      ρ
Academy                                      61   14,437  0.41   0.62
Supervisor Ratings                           61   16,231  0.16   0.27
Commendations                                 7    2,015 -0.01  -0.02
Activity                                      6      656  0.19   0.33
Absenteeism                                   5    1,402 -0.03  -0.05
Injuries                                      3    1,891 -0.06  -0.08
Discipline Problems                          13    4,850 -0.06  -0.11
Discipline Problems: Fired or Suspended       7    3,019 -0.12  -0.21
Discipline Problems: Complaints/Reprimands    6    1,831 -0.03   0.06

K = number of studies, N = sample size, r = mean correlation, ρ = mean correlation corrected for range restriction.
• 90% power to detect r = .25 using a sample of 134
• 12 studies (over half) showed no validity in local settings
• 8 studies had low correlations (< .11)
• VG output corrected for unreliability and:
– Direct RR: .24
– Indirect RR: .48

Study #   Validity Coefficient   Sample Size   Power (1-tail)   p-value   Valid?
1         0.030                  120           87%              0.37      No
2         0.135                  130           89%              0.06      No
3         0.180                  140           91%              0.02      Yes
4         0.290                  150           93%              0.00      Yes
5         0.340                  120           87%              0.00      Yes
6         0.180                  130           89%              0.02      Yes
7         0.150                  140           91%              0.04      Yes
8         0.110                  150           93%              0.09      No
9         0.090                  120           87%              0.16      No
10        0.126                  130           89%              0.08      No
11        0.210                  140           91%              0.01      Yes
12        0.390                  150           93%              0.00      Yes
13        0.198                  120           87%              0.02      Yes
14        0.164                  130           89%              0.03      Yes
15        0.109                  140           91%              0.10      No
16        0.094                  150           93%              0.13      No
17        0.020                  120           87%              0.41      No
18        0.114                  130           89%              0.10      No
19        0.164                  140           91%              0.03      Yes
20        0.070                  150           93%              0.20      No
21        0.010                  120           87%              0.46      No
22        0.010                  130           89%              0.46      No
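The 22-study table can be summarized the same way; a sketch that reproduces the slide's tally of non-significant local studies and the observed (uncorrected) N-weighted mean validity, which sits below the artifact-corrected VG outputs quoted above:

```python
# (validity coefficient, sample size, p-value) for the 22 studies in the table
studies = [
    (0.030, 120, 0.37), (0.135, 130, 0.06), (0.180, 140, 0.02),
    (0.290, 150, 0.00), (0.340, 120, 0.00), (0.180, 130, 0.02),
    (0.150, 140, 0.04), (0.110, 150, 0.09), (0.090, 120, 0.16),
    (0.126, 130, 0.08), (0.210, 140, 0.01), (0.390, 150, 0.00),
    (0.198, 120, 0.02), (0.164, 130, 0.03), (0.109, 140, 0.10),
    (0.094, 150, 0.13), (0.020, 120, 0.41), (0.114, 130, 0.10),
    (0.164, 140, 0.03), (0.070, 150, 0.20), (0.010, 120, 0.46),
    (0.010, 130, 0.46),
]

not_significant = sum(1 for _, _, p in studies if p >= 0.05)
total_n = sum(n for _, n, _ in studies)
mean_r = sum(r * n for r, n, _ in studies) / total_n

print(f"{not_significant} of {len(studies)} studies not significant at p < .05")
print(f"uncorrected N-weighted mean r = {mean_r:.3f}")
```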
Factors That Can Influence Validity When "Moving" Between Situations

Factors Before/At the Testing Situation:
• Sample Size
• Base Rate (% of applicants who "show up qualified")
• Competitive Environment
• Other Selection Procedures Used Before/After the Test
• Test Content
• Test Administration Conditions (proctoring, time limits, etc.)
• Test Administration Modality (e.g., written vs. online)
• Test Use (ranked, banded, cutoffs used)
• Test Reliability (e.g., internal consistency)
• Test Bias (e.g., culturally-loaded content)

Factors Occurring After Testing:
• Job Content Comparability
• Job Performance Criteria
• Reliability of Job Performance Criteria
• Level of Supervision/Autonomy
• Level/Quality of Training Provided
• Org./Unit Demands & Constraints
• Job Satisfaction
• Management Styles and Role Clarity
• Reward Structures and Processes
• Organizational Citizenship, Morale, and Commitment of the Workforce
• Organizational Culture, Norms, Beliefs, Values, Expectations Surrounding Loyalty and Conformity
• Organizational Socialization Strategies for New Employees
• Formal and Informal Communication (Style, Levels, and Networks)
• Centralization and Formalization of Decision-Making
• Organization Size
• Physical Environment
Title VII Requirements for Tests that Exhibit Adverse Impact
How Can Testing Practices be Challenged?
Title VII Disparate Impact Discrimination Flowchart

[Flowchart: TEST → Adverse Impact?
  NO → END
  YES → Is the PPT Valid?
    NO → Plaintiff Prevails
    YES → Alternative Employment Practice?
      NO → Defendant Prevails
      YES → Plaintiff Prevails]
Test Validation & Adverse Impact Civil Rights Act of 1991
Amends Section 703 of the 1964 Civil Rights Act (Title VII), (k)(1)(A): An unlawful employment practice based on disparate impact is established under this title only if:
• A(i) a complaining party demonstrates that a respondent uses a particular employment practice that causes a disparate impact on the basis of race, color, religion, sex, or national origin, and the respondent fails to demonstrate that the challenged practice is job-related for the position in question and consistent with business necessity; OR,
• A(ii) the complaining party makes the demonstration described in subparagraph (C) with respect to an alternative employment practice, and the respondent refuses to adopt such alternative employment practice.
Uniform Guidelines Transportability (7B)
[Diagram: Job duties performed by incumbents in the original validation study ↔ Job duties performed by incumbents in the new local situation → Validity can be "transported."]
EEOC v. Atlas Paper (1989, 6th Circuit)
• “. . . the expert failed to visit and inspect the Atlas office and never studied the nature and content of the Atlas clerical and office jobs involved. The VG theory utilized by Atlas with respect to this expert testimony under these circumstances is not appropriate. Linkage or similarity of jobs in dispute in this case must be shown by such on site investigation to justify application of such a theory.”
• "The premise of the VG theory . . . is that intelligence tests are always valid. The first major problem with a VG approach is that it is radically at odds with Albemarle Paper v. Moody, Griggs v. Duke Power, relevant case law within this circuit, and the EEOC Guidelines, all of which require a showing that a test is actually predictive of performance at a specific job. The VG approach simply dispenses with that similarity or manifest relationship requirement . . ." (emphasis added) (EEOC v. Atlas Paper, 868 F.2d at 1499).
VG, Title VII, and the Courts
• When the courts evaluate criterion-related validity evidence, four basic elements are typically inspected:
– Statistical significance
– Practical significance
– Type and relevance of the job criteria
– Evidence to support the specific use of the test
• VG has a difficult time answering these questions…
Recommendations for Applying VG in Personnel Testing Research
• Recommendation #1: Address the evaluation criteria provided by the Uniform Guidelines, Joint Standards, and SIOP Principles regarding the internal quality of the VG study. This will help ensure that the VG study itself can be relied upon for drawing inferences.
• Key Factors:
– Publication Bias
– Corrections Made and Underlying Assumptions/Justifications
– Similarities of Tests and Criteria
Recommendations for Applying VG in Personnel Testing Research
• Recommendation #2: Address the criteria provided by the Uniform Guidelines, Joint Standards, and SIOP Principles regarding the similarity between the VG study and the local situation.
– Helps to ensure that the VG study can be relied upon and the research is relevant to the local situation (similarities between tests, jobs, job criteria, etc.).
– The most critical factor evaluated by courts when considering VG evidence is the similarity between jobs (see also 7B of the Uniform Guidelines).
– VG evidence is strongest where there is clear evidence that the job duties of the target position and those of the positions in the VG study are highly similar, as shown by a job analysis in both situations.
Recommendations for Applying VG in Personnel Testing Research
• Recommendation #3: Use VG evidence only to supplement other sources of validity evidence (e.g., content validity or local criterion-related validation studies), never as the sole source.
– Supplementing a local criterion-related validity study with evidence from a VG study may be useful if an employer has evidence that statistical artifacts (not situational moderators) suppressed the actual validity of the test in the local situation (provided that the job comparability criteria of UGESP 7B have been met).
Recommendations for Applying VG in Personnel Testing Research
• Recommendation #4: Evaluate the test fairness evidence from the VG study using the methods outlined by the Uniform Guidelines, Joint Standards, and SIOP Principles.
• Recommendation #5: Evaluate and consider using "alternative employment practices" that are "substantially equally valid" (as required by the 1991 Civil Rights Act, Section 2000e-2[k][1][A][ii], and Section 3B of the Uniform Guidelines).
Thank you!