1 artifact corrected correlations between theoretical text complexity and empirical text complexity...

1

Artifact-Corrected Correlations between Theoretical Text Complexity and Empirical Text Complexity

PCRC – San Diego, February 7-10, 2013

A. Jackson Stenner

2

A story worth remembering!

Oasis 475

Common Core

Sarah Kershaw (FSU)

Art Graesser (University of Memphis) and Danielle McNamara (Arizona)

Bob Calfee (UC Riverside)

3

Theoretical versus Empirical Text Complexity for 719 Articles*

Reliability = 0.997

SEM = 12.8L

r = 0.968

r” = 0.969

R2” = 0.938

RMSE” = 89.6L

* Inclusion criteria: 50 encounters and 1,000 items

Mean Theoretical = 884.4L (356.2)

Mean Empirical = 884.4L (355.0)

4

A Brief Digression:

Correlation corrected for measurement error

Works for effect sizes (e.g., Cohen's d) also!
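The formula itself did not survive on this slide. As a minimal sketch, the classical correction for attenuation divides the observed correlation by the square root of the product of the two reliabilities, and the analogous correction for a standardized mean difference divides d by the square root of the outcome reliability. The function names and the arguments `rel_x` and `rel_y` are illustrative assumptions, not the author's notation; the inputs are the observed correlation and the reliability reported for the 719 articles.

```python
import math


def disattenuate_r(r_xy, rel_x=1.0, rel_y=1.0):
    """Classical correction for attenuation: r_xy / sqrt(rel_x * rel_y)."""
    if not (0.0 < rel_x <= 1.0 and 0.0 < rel_y <= 1.0):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / math.sqrt(rel_x * rel_y)


def disattenuate_d(d, rel_y):
    """Same idea for a standardized mean difference: d / sqrt(rel_y)."""
    if not 0.0 < rel_y <= 1.0:
        raise ValueError("reliability must lie in (0, 1]")
    return d / math.sqrt(rel_y)


# Observed r = .96762; reliability of empirical text complexity = .997.
# The theoretical measure is treated as error-free here (rel_x = 1.0).
r_corrected = disattenuate_r(0.96762, rel_x=1.0, rel_y=0.997)
print(round(r_corrected, 3))  # 0.969
```

Under these assumptions the result appears to agree with the r” = 0.969 shown on the previous slide.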

5

Two ends of the stick

Test for a non-nil null. P values are suggestive of a signal. These data are inconsistent with the hypothesis that there is no relationship.

Have I found anything?

Test the hypothesis that the population correlation between theory and observation is r = 1.0 after correction for artifacts.

Is there anything left to find?
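For the first end of the stick ("Have I found anything?"), a rough sketch of the usual test of the no-relationship hypothesis is the t statistic for a correlation. The function name is hypothetical, and N = 719 assumes the article count is the relevant sample size.

```python
import math


def t_for_no_relationship(r, n):
    """t statistic for H0: rho = 0, referred to n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Observed correlation and article count taken from the earlier slide.
t_stat = t_for_no_relationship(0.968, 719)
print(round(t_stat, 1))
```

A statistic this far beyond any conventional critical value answers the first question but says nothing about the second; whether anything is left to find depends on how close the artifact-corrected correlation comes to 1.0.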

6

Hunter and Schmidt (2004) identified six more artifacts relevant to our problem

7

Artifacts

1. Error of measurement (a1) in the dependent variable: Study validity will be systematically lower than true validity to the extent that empirical text complexity is measured with random error. Measurement error is ubiquitous (a1 = √0.997 = .998499).

2. Error of measurement in the independent variable: Theory validity will systematically understate the true validity of the attribute measured because the theory is imperfectly represented (a2 = 1.0*).

3. Range variation (a3) in the independent variable: Study validity will be systematically lower than true validity to the extent that the range of theoretical text complexities in the study is smaller than the population range.

8

Artifacts cont’d.

4. Range variation (a4) in the dependent variable: Study validity will be systematically lower than true validity to the extent that the range of empirical text complexities is smaller than the population range. Correction for double range restriction (a3 + a4 = .9853).

5. Deviation from perfect construct validity (a6) in the dependent variable: Task type used to measure empirical text complexity may have some specificity not shared by alternative task types, thus resulting in slightly different text orderings depending upon which task type is used (a6 = 1.0*).

6. Deviation from perfect construct validity (a5) in the independent variable: Construct mis-specification due to less than perfect operationalization of the constructs in the specification equation (a5 = √.9833 = .9916).

9

Artifacts cont’d.

7. Linear bias (a7) is quite small for large-sample studies. Bias = (2N-2)/(2N-1) = .9993

8. Transcription and reporting error (a8 = 1.0*).
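The attenuation factors that the list states with an explicit formula can be reproduced in a few lines. This is a sketch only: the variable names are illustrative, and N = 719 is an assumption (the article count), chosen because it reproduces the reported bias factor. The double range restriction factor (.9853) is not derived here because the slides do not give its inputs.

```python
import math

n_articles = 719               # assumed N for the linear bias factor
reliability_empirical = 0.997  # reliability of empirical text complexity
construct_validity_sq = 0.9833 # squared construct-validity term from the list

# Square roots convert a reliability-type quantity into an attenuation factor.
a_measurement_error = math.sqrt(reliability_empirical)
a_construct_validity = math.sqrt(construct_validity_sq)

# Linear bias factor (2N - 2) / (2N - 1).
a_linear_bias = (2 * n_articles - 2) / (2 * n_articles - 1)

print(f"{a_measurement_error:.4f} {a_construct_validity:.4f} {a_linear_bias:.4f}")
# 0.9985 0.9916 0.9993
```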

10

A = a1 a2 (a3 + a4) a5 a6 a7 a8

= (.9985) (1) (.9853) (.9916) (1) (.9993) (1)

= .9748

Corrected correlation = Observed correlation / A = .96762 / .9748 = .9926

95% confidence interval for the observed correlation (se = .00475):

.96287 ≤ ρ ≤ .97237

95% confidence interval for the corrected correlation:

.988 ≤ ρ ≤ .998
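The arithmetic on this slide can be checked with a short sketch. The dictionary keys are descriptive labels chosen for illustration, and "(a3 + a4)" is treated as a single combined factor for double range restriction rather than a sum. Multiplying the rounded factors gives a value within rounding of the reported .9748, so the reported compound factor is used for the correction step itself.

```python
import math

# Attenuation factors as reported on the slides (rounded to four places).
factors = {
    "measurement_error_dependent": 0.9985,
    "measurement_error_independent": 1.0,
    "double_range_restriction": 0.9853,
    "construct_validity_independent": 0.9916,
    "construct_validity_dependent": 1.0,
    "linear_bias": 0.9993,
    "transcription_and_reporting": 1.0,
}

# Compound attenuation factor recomputed from the rounded inputs.
a_product = math.prod(factors.values())

# Value reported on the slide, used below so the results match the deck.
a_reported = 0.9748

r_observed = 0.96762
r_corrected = r_observed / a_reported

# The interval for the corrected correlation is the observed interval
# with each endpoint divided by the same compound factor.
ci_observed = (0.96287, 0.97237)
ci_corrected = tuple(bound / a_reported for bound in ci_observed)

print(round(a_product, 4))
print(round(r_corrected, 4))
print(tuple(round(bound, 3) for bound in ci_corrected))
```

Dividing both interval endpoints by the same factor keeps the corrected interval centered on the corrected correlation, and it widens the interval slightly because the factor is below 1.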

11

Conclusion

The fact that correlations between theoretical estimates and empirical estimates are influenced by a dozen or more artifactual sources of variation poses threats to inference whenever raw correlations are interpreted. These problems are particularly troublesome when the population correlation is in fact r = 1.0. Because of attenuation due to artifacts, the raw (uncorrected) correlation, say r = .90, invites speculation about what moderator variables may have been omitted in the study. And so begins a time-consuming, potentially expensive search for moderators that don’t exist. When researchers fail to produce moderator variables that account for the unexplained variance, which they must, it is concluded that better theories or better operationalizations of key constructs are needed. When these repair strategies also fail, the research is dead-ended and assigned to the dustbin of science. We conjecture that social and human science literatures are littered with correlational studies that conform to the above depressing scenario: “Failure to correct for these artifacts results in massive underestimation of the mean correlation. Failure to control for variation in these artifacts results in a large overstatement of the variance of population correlations and, thus, in a potential false assertion of moderator variables where there are none” (Hunter and Schmidt, 2004, 132).
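The r = .90 scenario can be illustrated with a small simulation. This is a sketch under stated assumptions, not the author's analysis: a single true score drives both measures (population correlation of 1.0), each measure has a hypothetical reliability of .90, and the seed, sample size, and helper name `pearson` are arbitrary choices for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


random.seed(0)
n_cases = 20000
reliability = 0.90  # hypothetical reliability of each measure

# True scores have unit variance; the error variance is chosen so that
# true variance / (true variance + error variance) equals the reliability.
error_sd = math.sqrt((1.0 - reliability) / reliability)
true_scores = [random.gauss(0.0, 1.0) for _ in range(n_cases)]
theory = [t + random.gauss(0.0, error_sd) for t in true_scores]
observation = [t + random.gauss(0.0, error_sd) for t in true_scores]

r_raw = pearson(theory, observation)
r_corrected = r_raw / math.sqrt(reliability * reliability)

print(round(r_raw, 2), round(r_corrected, 2))
```

In this setup the raw correlation lands near .90 and the corrected value near 1.0, so the apparently unexplained variance comes from measurement error alone and no moderator exists to be found.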

12

Contact Info:

A. Jackson Stenner, Chairman & CEO, MetaMetrics

University of North Carolina, Chapel Hill

[email protected]
