
Reliability Analysis

A technique to determine the scalability and reliability of a scale with multiple items.

Cronbach’s alpha

Spearman-Brown split-half reliability


Guttman split-half reliability

Factor analysis & scale validity

Reliability Analysis: Charles M. Friel Ph.D., Criminal Justice Center, Sam Houston State University


Key Concepts

Reliability Analysis

The concept of a scale

Difference between a scale and an index

UCR Index Crime per 100,000

Sellin-Wolfgang Crime Seriousness Scale

Salient Factor Scale of the US Parole Commission

Rand Seven-Factor Scale

Key questions to ask about a scale

The concept of reliability

Operational definition of reliability

Test-retest reliability

Alternative forms reliability

Odd-even reliability

Split-half reliability

Inter-rater reliability

The concept of validity

Operational definitions of validity

Face validity

Content validity

Concurrent validity

Predictive validity

Inter-rater validity

Scale score: the sum of the score across items

Average score: (scale score) (1/n)

Classical theory of reliability

Observed score

True score

Error

Reliability as the ratio of the true score variance to the observed score variance

The relationship between the reliability of a scale and the number of items

Interpretation of Cronbach’s α

The effect on the reliability of a scale of deleting one or more items

Interpretation of Spearman-Brown split-half reliability

Assumptions

Interpretation of Guttman split-half reliability

Assumptions

Strictly parallel v. parallel models of reliability and their assumptions

The use of factor analysis in reliability analysis

The use of regression analysis and analysis of variance in reliability analysis


Lecture Outline

The concept of a scale & criminal justice examples of scales

Issues in assessing a scale: reliability & validity

An example: the training needs of court administrators

Classical theory of reliability

Cronbach’s Alpha (α)

Reliability analysis of training needs data

The concept of a split-half reliability

Spearman-Brown split-half reliability

Guttman split-half reliability

Testing assumptions about scale items:

Strictly parallel vs parallel models

Factor analysis and scale validity

Validating a scale using external criteria: factor analysis, regression and ANOVA


Reliability Analysis

Interdependency Technique

Designed to determine the consistency with which multiple items in a scale measure the same underlying trait

Assumptions

Since reliability analysis uses correlational techniques, the assumptions of correlation apply

Variables are metric

Variances of the various variables are comparable

Covariances among the various combinations of variables are comparable

Absence of outliers


The Concept of a Scale

A measuring instrument from which …

A single number can be derived

Across multiple items

Which indicates the quantity of a trait a subject possesses.

Some criminal justice examples of scales

The UCR Index Crime Rate

Sellin-Wolfgang Crime Seriousness Index

US Parole Commission Salient Factor Score

Rand Seven-Factor Index


Uniform Crime Report Index of Part I Offenses Per 100,000 Population

The UCR provides an index of crime based upon the sum of reported crimes in seven categories, including:

Violent Crimes          Property Crimes *

Homicide                Burglary

Forcible Rape           Larceny-Theft

Robbery                 Auto Theft

Aggravated Assault

This index is tracked year-to-year and it is assumed that:

If the index rises, so does the total incidence of crime, both reported & unreported

E.g. an index of 4000 is twice an index of 2000, which is taken to indicate twice the incidence of reported and unreported crime. This is a questionable assumption.


* Arson is considered a property crime but is not included in the Crime Index Total. It is included in the Modified Crime Index Total, however.
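The arithmetic behind an index rate per 100,000 can be sketched as follows. This is only an illustration: the offense counts and the population are made-up numbers, not UCR data, and the function name `index_rate_per_100k` is hypothetical.

```python
# Hypothetical reported-offense counts for one jurisdiction (illustrative only).
reported = {
    "homicide": 12,
    "forcible_rape": 40,
    "robbery": 210,
    "aggravated_assault": 530,
    "burglary": 1400,
    "larceny_theft": 3900,
    "auto_theft": 610,
}
population = 250_000  # hypothetical population


def index_rate_per_100k(counts, population):
    """Sum the seven index categories and express the total per 100,000 residents."""
    return sum(counts.values()) / population * 100_000


rate = index_rate_per_100k(reported, population)
print(round(rate, 1))
```

Note that the sum treats every offense as one unit regardless of category, which is one reason a doubling of the index is hard to interpret.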


Sellin-Wolfgang Crime Seriousness Index

(Sellin, T. & Wolfgang, M.E. The Measurement of Delinquency, Wiley, 1966)

Sellin & Wolfgang developed a technique to account for not only …

The number of crimes reported to police

But also their relative seriousness

Based upon surveys of various populations they found differential seriousness weights for various crimes. For example:

Crime                       Seriousness Weight

Assault (death)             26

Forcible Rape               11

Robbery (weapon)            5

Larceny $5000               4

Auto Theft (no damage)      2

Larceny $5                  1

Assault (minor)             1

They proposed that crimes be weighted for seriousness first, then added together to provide an index which reflects both the …


Amount of crime

And the relative seriousness of crimes

The Salient Factor Score of the US Parole Commission

(Hoffman, P. Screening for risk: a revised salient factor score. J. of Criminal Justice, 11, 1984, 539-547)

The US Parole Commission has used the Salient Factor Score to predict the likelihood of recidivism on parole.

The score ranges from 0 (poor risk) to 10 (very good risk) based on the weighting of the following factors:

1 Number of prior convictions

2 Prior commitments longer than 30 days

3 Age at the time of the current offense

4 How long the offender was at liberty since the last commitment

5 Whether the offender was on probation, parole or escape status at the time of the current offense


6 Record of heroin dependency

The Salient Factor Score combined with the seriousness of the current offense is also used by the US Sentencing Commission to provide sentencing guidelines.


Rand Seven-Factor Index: Selective Incarceration of Career Criminals

(Greenwood, P.W. & Abrahamse, A. Selective Incapacitation, Rand Corp., Santa Monica, Calif.: 1982)

The Rand Corporation developed a seven-factor scale to identify defendants likely to be high-rate serious offenders if not incarcerated.

The research was based upon self-report surveys of incarcerated robbers and burglars.

The seven factors of the scale included:

Prior conviction for the same charge

Incarcerated more than 50% of the previous 2 years

Convicted before the age of 16

Served time in a juvenile facility

Drug use in the previous 2 years

Drug use as a juvenile

Unemployed more than 50% of the last 2 years


The Index ranges from 0 (low risk) to 7 (high risk)
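A 0 to 7 range is what results if each of the seven factors is scored as present (1) or absent (0) and the factors are summed. The sketch below assumes that equal-weight scoring; the factor keys, the function name `rand_index`, and the example record are hypothetical.

```python
# The seven factors listed above, as hypothetical dictionary keys.
RAND_FACTORS = (
    "prior_conviction_same_charge",
    "incarcerated_over_half_of_last_2_years",
    "convicted_before_16",
    "served_time_in_juvenile_facility",
    "drug_use_last_2_years",
    "drug_use_as_juvenile",
    "unemployed_over_half_of_last_2_years",
)


def rand_index(record):
    """Count how many of the seven factors are present (assumed 0/1 scoring)."""
    return sum(1 for factor in RAND_FACTORS if record.get(factor, False))


# Hypothetical defendant with three factors present.
defendant = {
    "prior_conviction_same_charge": True,
    "drug_use_last_2_years": True,
    "unemployed_over_half_of_last_2_years": True,
}
print(rand_index(defendant))
```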


Key Questions About A Scale

Are the items in the scale reliable, valid?

Are the scale items additive?

Can a scale score be derived from the items?

From the sum of the items, or

The average of the items?

Is the scale score reliable, valid?

Do the items in the scale measure one or more than one trait?

To what extent are the items in the scale intercorrelated?

Can parallel forms of the scale be developed?

How well can individual items predict the scale score?


What external criteria should be used to validate the scale?


The Concept of Reliability

Reliability

How accurate is the instrument?

How accurately does the instrument measure “whatever” it measures?

How well does the instrument correlate with itself?

Operational definitions of reliability

Test-retest reliability

Alternative forms reliability

Split-half reliability

Odd-even reliability

Inter-rater reliability


Operational Definitions of Reliability

Test-Retest Reliability

Measure the same subjects twice (t1 & t2) with the same instrument & under the same conditions.

Reliability = the correlation between t1 & t2

Problems: pretest sensitivity, history, and maturation

Alternative Forms Reliability

Odd-Even Reliability: correlate the odd numbered items in a scale or test with the even numbered items.

Split-Half Reliability: correlate the 1st half of the items on the scale or test with the 2nd half of the items.

Inter-Rater Reliability

Used to determine the consistency with which 2 or more raters can independently rate the same subjects the same way.
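Each of the operational definitions above reduces to a correlation between two sets of scores. The following sketch, using only made-up ratings, shows a test-retest correlation and an odd-even correlation; the helper `pearson` and all data are hypothetical illustrations.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Test-retest: hypothetical scale scores for five subjects at t1 and t2.
t1 = [12, 18, 25, 31, 40]
t2 = [14, 17, 27, 30, 41]
test_retest = pearson(t1, t2)

# Odd-even: hypothetical responses (rows = subjects, columns = items 1..6).
responses = [
    [1, 2, 1, 2, 1, 1],
    [3, 3, 2, 3, 3, 2],
    [4, 5, 4, 4, 5, 5],
    [2, 2, 3, 2, 2, 3],
    [5, 4, 5, 5, 4, 4],
]
odd = [sum(row[0::2]) for row in responses]   # items 1, 3, 5
even = [sum(row[1::2]) for row in responses]  # items 2, 4, 6
odd_even = pearson(odd, even)

print(round(test_retest, 3), round(odd_even, 3))
```

A first-half versus second-half split works the same way, with `row[:3]` and `row[3:]` in place of the odd and even slices.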


The Concept of Validity

Validity

What is being measured?

To what extent does the instrument measure what it is designed to measure?

Is more than one trait being measured?

How well does the instrument correlate with validated external criteria?

Operational Definitions of Validity

Face validity

Content validity

Concurrent validity

Predictive validity

Inter-rater validity


Operational Definitions of Validity

Face Validity

On its face, does the measuring instrument “look” like it measures what it is designed to measure? (A non-empirical standard)

Content Validity

As on an examination, the extent to which the items on a scale or test adequately sample the full range of content to be measured

Concurrent Validity

Does the instrument measure the intended concept as it exists “now”, at the present time, vis-à-vis some future time?

Predictive Validity

Does the instrument measure the intended concept as it will be at some future point in time, as in a forecast of recidivism?

Inter-Rater Validity


The correlation between the independent assessment made by a valid expert and the assessment made with the measuring instrument


An Example: Training Needs of Court Administrators

A survey that included 13 Likert scale training needs items was distributed to 202 court administrators to determine their relative need for continuing professional education.

The items were designed to determine the perceived need for training in the following areas.

Administrative Issue

Case flow management (case_flo)

Communication skills (com_skl)

Court's role in corrections (com_cor)

Court reporting technology (rep_tec)

Security management (sec_man)

Judicial ethics (ethics)

Human resource management (hum_res)

Judge/administrator relations (jud_rel)

Integrated justice systems (int_jus)

Management information systems (info_sys)

The court as a human organization (hum_org)

Program evaluation (eval)

Strategic planning (plan)

(The terms in parentheses are the database code names of the variables)


An Example: Training Needs of Court Administrators (cont.)

Each of the 13 items in the survey was rated on the following Likert scale.

1 = no training needed

2 = minor need

3 = needed

4 = growing need

5 = very critical need

Calculating a scale score for a subject

Minimum scale score = (13) (1) = 13

Maximum scale score = (13) (5) = 65

Converting the scale score to a Likert value

Scale conversion factor: 1/13 = 0.07692

For a minimum scale score of 13

(13) (0.0769) = 1 level of need

For a maximum scale score of 65

(65) (0.0769) = 5 level of need
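The scale-score arithmetic above can be sketched in a few lines. The function names `scale_score` and `likert_level` are hypothetical, and the input check simply assumes 13 integer ratings on the 1 to 5 scale described in this example.

```python
K = 13  # number of items in the training needs scale


def scale_score(ratings):
    """Sum the Likert ratings across the 13 items."""
    if len(ratings) != K or any(r not in range(1, 6) for r in ratings):
        raise ValueError("expected 13 ratings, each between 1 and 5")
    return sum(ratings)


def likert_level(score):
    """Convert a scale score back to the 1-5 Likert metric (multiply by 1/13)."""
    return score / K


lowest = scale_score([1] * K)    # every item rated 1
highest = scale_score([5] * K)   # every item rated 5
print(lowest, highest, likert_level(lowest), likert_level(highest))
```

Dividing by 13 directly avoids the small rounding error that comes from multiplying by the truncated factor 0.0769.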


Research Questions About the Training Needs of Court Administrators

How much variability is there in the need for training across the various items?

In what area(s) is there the greatest need?

In what area(s) is there the least need?

What is the average need for training across all the 13 items?

To what extent are the training need items correlated?

Can a scale score, the sum of the items or their average, be used as an overall measure of the need for training?

Is the scale score a reliable measure of whatever the scale measures?

Can reliable alternative forms of the scale be constructed?


What is the effect on the reliability of the scale of deleting one or more items?

What does the scale measure? What are the external correlates of the scale?


Classical Theory of Reliability

The trait being measured (need for training)

True Score (tau, $\tau$)

Random Error (e)

The observed score on an item ($X_{ij}$)

$X_{ij} = \tau + e$


Definition of Reliability

Index of Reliability

The proportion of the true score variability captured across all items

Relative to the total observed score variability across all the items

$r = \sigma^2_{\text{true score}} / \sigma^2_{\text{observed score}}$

Assumptions

If the error associated with the observed scores is random,

Then when the scores are summed across items,

The errors should cancel, and

The scale score should approximate the true score being measured.

Therefore, the more items in the scale, the better the estimate of the true score, due to the greater opportunity for the errors to cancel each other
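The error-cancelling argument can be illustrated with a small simulation. This is a sketch under stated assumptions, not part of the lecture: true scores and errors are drawn from standard normal distributions, every item has the same error variance, and the names `simulate` and `pearson` are hypothetical.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def simulate(n_items, n_subjects=500, error_sd=1.0, seed=0):
    """Correlation between true scores and the average of n_items noisy items."""
    rng = random.Random(seed)
    true_scores = [rng.gauss(0, 1) for _ in range(n_subjects)]
    averages = []
    for tau in true_scores:
        # Each observed item is X = tau + e, with independent random error e.
        items = [tau + rng.gauss(0, error_sd) for _ in range(n_items)]
        averages.append(sum(items) / n_items)
    return pearson(true_scores, averages)


for k in (1, 4, 13):
    # The squared correlation estimates true-score variance / observed variance.
    print(k, round(simulate(k) ** 2, 3))
```

Under these assumptions the squared correlation should come out near k / (k + 1), so it rises toward 1 as items are added.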


Lee J. Cronbach’s Alpha

(Cronbach, L.J. Coefficient alpha and the internal structure of tests. Psychometrika, 16, 1951, 297-334)

Alpha (α) measures the extent to which the scale score measures the true score

Indicates the reliability of the scale

Ranges between 0.0 and 1.0

0.0 = no reliability

1.0 = perfect reliability

$$\alpha = \frac{k\,(\mathrm{cov}/\mathrm{var})}{1 + (k - 1)\,(\mathrm{cov}/\mathrm{var})}$$

k = the number of items in the scale

cov = the average covariance between pairs of items

var = the average variance of the items

If the scale items have been standardized

$$\alpha = \frac{k\,r}{1 + (k - 1)\,r}$$

r = the average correlation between pairs of items
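As a minimal sketch of the formula above, the function below computes α from raw ratings by averaging the item variances and the pairwise covariances. The function name `cronbach_alpha` and the small ratings matrix are hypothetical; sample (n - 1) variances and covariances are assumed.

```python
def cronbach_alpha(data):
    """Cronbach's alpha from raw ratings (rows = subjects, columns = items)."""
    n, k = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(k)]

    def cov(a, b):
        # Sample covariance between items a and b (the variance when a == b).
        return sum((row[a] - means[a]) * (row[b] - means[b]) for row in data) / (n - 1)

    avg_var = sum(cov(j, j) for j in range(k)) / k
    n_pairs = k * (k - 1) / 2
    avg_cov = sum(cov(a, b) for a in range(k) for b in range(a + 1, k)) / n_pairs
    ratio = avg_cov / avg_var
    return k * ratio / (1 + (k - 1) * ratio)


# Hypothetical ratings: five subjects, three items.
ratings = [
    [3, 4, 3],
    [2, 2, 1],
    [5, 4, 5],
    [1, 2, 2],
    [4, 5, 4],
]
print(round(cronbach_alpha(ratings), 4))
```

If every item gives identical ratings, the average covariance equals the average variance and the function returns exactly 1.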


Structure of Cronbach’s Alpha

The greater the correlation among the Items …

The higher the value of α (ranges from 0 to 1)

The greater the covariance among the Items …

The higher the value of α

The greater the number of items …

The higher the value of α

Items with high covariance are measuring the same thing, namely …

Tau, the true score
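These relationships can be checked numerically with the standardized form of α. The sketch below tabulates α for a few illustrative values of the average inter-item correlation r and the number of items k; the chosen values and the function name `standardized_alpha` are assumptions for illustration.

```python
def standardized_alpha(k, r):
    """Standardized alpha for k items with average inter-item correlation r."""
    return k * r / (1 + (k - 1) * r)


# Rows: number of items; columns: average inter-item correlation.
for k in (2, 6, 13, 26):
    row = [round(standardized_alpha(k, r), 3) for r in (0.2, 0.4, 0.6)]
    print(k, row)
```

Reading across a row shows α rising with the correlation among items; reading down a column shows α rising with the number of items.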


Descriptive Data on the Survey of Court Administrators

     Item       Mean     Std Dev   Cases

 1.  CASE_FL   3.3692   1.3222   202.0
 2.  COM_COR   2.4667   1.2628   202.0
 3.  COM_SKL   3.3692   1.1958   202.0
 4.  ETHICS    3.0923   1.3127   202.0
 5.  EVAL      2.7077   1.3352   202.0
 6.  HUM_ORG   2.8814   1.1553   202.0
 7.  HUM_RES   2.4330   1.2845   202.0
 8.  INF_SYS   3.0258   1.2476   202.0
 9.  INT_JUS   2.8299   1.3071   202.0
10.  JUD_REL   3.0464   1.3868   202.0
11.  PLAN      2.8308   1.2327   202.0
12.  REP_TEC   1.9077   1.1337   202.0
13.  SEC_MAN   3.1077   1.3727   202.0


Intercorrelation of Scale Items

The higher the intercorrelation among the scale items, the greater the reliability of the scale and the higher the value of Cronbach's alpha

Correlation Matrix

           CASE_FL  COM_COR  COM_SKL  ETHICS   EVAL

CASE_FL    1.0000
COM_COR     .2396   1.0000
COM_SKL     .4733    .3901   1.0000
ETHICS      .4510    .3049    .5336   1.0000
EVAL        .3355    .2260    .4115    .5429   1.0000
HUM_ORG     .3596    .3646    .5197    .6004    .4842
HUM_RES     .3435    .3870    .4654    .5006    .5370
INF_SYS     .4770    .2893    .4408    .4938    .4524
INT_JUS     .3716    .2612    .3117    .4607    .2855
JUD_REL     .4849    .2890    .5753    .6811    .4882
PLAN        .4462    .2921    .4765    .5628    .5149
REP_TEC     .2345    .2794    .3143    .4602    .3508
SEC_MAN     .4064    .4426    .5190    .5386    .4021

           HUM_ORG  HUM_RES  INF_SYS  INT_JUS  JUD_REL

HUM_ORG    1.0000
HUM_RES     .5262   1.0000
INF_SYS     .5647    .4155   1.0000
INT_JUS     .4813    .4098    .5243   1.0000
JUD_REL     .5871    .5253    .5025    .4982   1.0000
PLAN        .6462    .5333    .6369    .4824    .5779
REP_TEC     .4216    .4590    .3744    .3898    .3791
SEC_MAN     .4372    .4242    .3820    .4087    .4811

           PLAN     REP_TEC  SEC_MAN

PLAN       1.0000
REP_TEC     .4626   1.0000
SEC_MAN     .3603    .3419   1.0000


Item Variances and Covariances

Statistics for   Mean      Variance   Std Dev   Variables
Scale            37.0678   132.4273   11.5077   13

(The scale mean, 37.0678, is the average scale score.)

                          Mean     Minimum   Maximum   Range    Max/Min   Variance
Item Means                2.8514   1.9077    3.3692    1.4615   1.7661    .1637
Item Variances            1.6262   1.2853    1.9233     .6380   1.4964    .0390
Inter-item Covariances     .7134    .3515    1.2399     .8884   3.5276    .0313
Inter-item Correlations    .4398    .2260     .6811     .4551   3.0134    .0102

var = 1.6262, the mean of the item variances

cov = .7134, the mean of the inter-item covariances

The scale score is the sum of the Likert ratings across the 13 items in the scale. The mean score for the 202 administrators is 37.0678.


Cronbach's alpha is calculated from the average variance (var) and average covariance (cov) among the scale items.


Calculation of Cronbach’s Alpha (α)

$$\alpha = \frac{k\,(\mathrm{cov}/\mathrm{var})}{1 + (k - 1)\,(\mathrm{cov}/\mathrm{var})} = \frac{(13)\,(0.7134 / 1.6262)}{1 + (13 - 1)\,(0.7134 / 1.6262)} = 0.9104$$

A high degree of reliability

Average scale score

(2.8514, average Likert rating) (13) = 37.0682 ≈ 37.07

(2.8514) / (0.07692) ≈ 37.07

The scale score can range from a low of 13 to a high of 65.
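The calculation can be re-run from the summary statistics reported for the survey. The sketch below uses only figures that appear in the output (k = 13, mean covariance 0.7134, mean variance 1.6262, mean inter-item correlation 0.4398, mean item rating 2.8514); the variable names are hypothetical.

```python
k = 13                 # number of items
mean_cov = 0.7134      # mean inter-item covariance
mean_var = 1.6262      # mean item variance
mean_r = 0.4398        # mean inter-item correlation
mean_rating = 2.8514   # mean Likert rating across the 13 items

# Cronbach's alpha from the average covariance and average variance.
ratio = mean_cov / mean_var
alpha = k * ratio / (1 + (k - 1) * ratio)

# Standardized alpha from the average inter-item correlation.
std_alpha = k * mean_r / (1 + (k - 1) * mean_r)

# Average scale score = average Likert rating times the number of items.
mean_scale_score = mean_rating * k

print(round(alpha, 4), round(std_alpha, 4), round(mean_scale_score, 4))
```

To four decimals these agree with the reported alpha (.9104), the standardized item alpha in the item-total output (.9108), and the average scale score (37.0682).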


The Effect of Deleting an Item on the Reliability of the Scale

Q: If the item caseflow management (case_fl) is deleted from the scale, would the reliability (α) decline appreciably?

Average Likert score for all 13 items = 2.8514

Average scale score for all items = 37.0682

(2.8514) (13) = 37.0682

For case_fl, the average score = 3.3692

If case_fl is deleted from the scale

The mean scale score declines to

(37.0682) - (3.3692) = 33.699

α declines from 0.9104 to 0.9071 (cf. table in the next exhibit)

Conclusion

Deletion of case_fl does not affect the reliability of the scale very much, since its deletion does not change α appreciably


Summary Table of the Impact of Deleting Items on the Reliability of the Scale

Cronbach's α for the full scale = 0.9104

Item-total Statistics

Columns: (1) Scale mean if item deleted, (2) Scale variance if item deleted, (3) Corrected item-total correlation, (4) Squared multiple correlation, (5) Alpha if item deleted

Item        (1)        (2)         (3)      (4)      (5)

CASE_FL    33.6985    115.0976    .5492    .3710    .9071
COM_COR    34.6011    118.7323    .4397    .2811    .9113
COM_SKL    33.6985    114.3094    .6526    .4906    .9028
ETHICS     33.9755    110.2300    .7428    .6047    .8987
EVAL       34.3601    113.6023    .5988    .4418    .9050
HUM_ORG    34.1863    113.3235    .7224    .5686    .9002
HUM_RES    34.6348    112.7329    .6615    .4871    .9023
INF_SYS    34.0420    113.2100    .6652    .5248    .9022
INT_JUS    34.2379    114.4985    .5799    .4213    .9058
JUD_REL    34.0214    109.2204    .7342    .5978    .8990
PLAN       34.2370    112.0915    .7209    .5995    .8999
REP_TEC    35.1601    118.1503    .5271    .3309    .9076
SEC_MAN    33.9601    112.6420    .6144    .4535    .9044

Reliability Coefficients 13 items

Alpha = .9104 Standardized item alpha = .9108


Interpretation

Based upon the decrease in α,

The most reliable items are ethics, judicial/ administrator relations, & planning.

The least reliable items are community corrections, caseflow management, & court reporting technology.


Interpretation of the Item Deletion Summary Table

Scale mean if item deleted

The mean scale score if the associated item is deleted. The mean scale score for all 13 items is 37.0682. (cf. p. 24)

Scale variance if item deleted

The scale variance if the associated item is deleted. The variance for all 13 items is 132.4273. (cf. p. 24)

Scale item total correlation

The correlation between a deleted item and the scale score based on the remaining 12 items

If low, the item contributes little to the scale's reliability

If high, the item contributes a lot to the scale's reliability


Interpretation of the Item Deletion Summary Table (cont.)

Squared multiple correlation ($R^2$)

Regression of the deleted item on the 12 remaining items in the scale

$X_d = a + b_1 X_1 + b_2 X_2 + \dots + b_{12} X_{12}$

$X_d$ = deleted item

Indicates the proportion of variance in the deleted item explained by the 12 remaining items in the scale

If $R^2$ is high, the deleted item contributes substantially to the reliability of the scale

If $R^2$ is low, the deleted item contributes little to the reliability of the scale

Alpha if item deleted

The effect on the reliability of the scale (α) if the item is deleted.


Compare with the value of Cronbach's α for the scale including all 13 items (0.9104, cf p. 27)
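Three of the columns in the item deletion table can be sketched from raw ratings as follows. This is an illustration on made-up data, not the survey data; the function names are hypothetical, α is written in the algebraically equivalent sum-of-variances form, and the squared multiple correlation is omitted because it requires a regression fit.

```python
from math import sqrt


def variance(values):
    """Sample variance (n - 1 denominator)."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def cronbach_alpha(data):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of scale score)."""
    k = len(data[0])
    item_vars = sum(variance([row[j] for row in data]) for j in range(k))
    return k / (k - 1) * (1 - item_vars / variance([sum(row) for row in data]))


def item_deletion_table(data):
    """For each item: scale mean, corrected item-total r, and alpha if deleted."""
    table = []
    for j in range(len(data[0])):
        rest = [[v for i, v in enumerate(row) if i != j] for row in data]
        rest_scores = [sum(r) for r in rest]
        table.append({
            "item": j,
            "scale_mean_if_deleted": sum(rest_scores) / len(rest_scores),
            "corrected_item_total_r": pearson([row[j] for row in data], rest_scores),
            "alpha_if_deleted": cronbach_alpha(rest),
        })
    return table


# Hypothetical ratings: six subjects, four items; the last item runs against the rest.
ratings = [[3, 4, 3, 1], [2, 2, 1, 4], [5, 4, 5, 2],
           [1, 2, 2, 5], [4, 5, 4, 3], [3, 3, 3, 3]]
for row in item_deletion_table(ratings):
    print(row)
```

In this made-up example the last item correlates negatively with the other three, so deleting it raises α, the opposite of what the table shows for most of the training needs items.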


Split-Half Reliability

Sometimes alternative forms of the same scale are desirable, as in pre-post designs.

But will the two forms be equally reliable?

Spearman-Brown Split-Half Reliability

$r_{SB} = 2\,r_{xy} / (1 + r_{xy})$

$r_{xy}$ = correlation between the two halves of the scale

Guttman Split-Half Reliability

$r_G = 2\,(S_t^2 - S_{t1}^2 - S_{t2}^2) / S_t^2$

$S_t^2$ = total variance of entire scale

$S_{t1}^2$ = variance of the 1st half of the scale

$S_{t2}^2$ = variance of the 2nd half of the scale


Split-Half Reliability of the Training Needs Scale

Statistics for   Mean      Variance   Std Dev   Variables
Part 1           20.3196    40.0103    6.3254    7
Part 2           16.7482    32.2112    5.6755    6
Scale            37.0678   132.4273   11.5077   13

Item Means                Mean     Minimum   Maximum   Range    Max/Min   Variance
Part 1                    2.9028   2.4330    3.3692     .9362   1.3848    .1534
Part 2                    2.7914   1.9077    3.1077    1.2000   1.6290    .2008
Scale                     2.8514   1.9077    3.3692    1.4615   1.7661    .1637

Item Variances            Mean     Minimum   Maximum   Range    Max/Min   Variance
Part 1                    1.6091   1.3347    1.7828     .4481   1.3357    .0286
Part 2                    1.6462   1.2853    1.9233     .6380   1.4964    .0583
Scale                     1.6262   1.2853    1.9233     .6380   1.4964    .0390

Inter-item Covariances    Mean     Minimum   Maximum   Range    Max/Min   Variance
Part 1                     .6845    .3811     .9515     .5705   2.4969    .0261
Part 2                     .7445    .5295     .9879     .4584   1.8657    .0259
Scale                      .7134    .3515    1.2399     .8884   3.5276    .0313

Inter-item Correlations   Mean     Minimum   Maximum   Range    Max/Min   Variance
Part 1                     .4284    .2260     .6004     .3744   2.6565    .0105
Part 2                     .4535    .3419     .6369     .2950   1.8629    .0072
Scale                      .4398    .2260     .6811     .4551   3.0134    .0102

Reliability Coefficients

N of Cases = 202.0          N of Items = 13

Correlation between forms = .8385

Equal-length Spearman-Brown = .9122

Guttman Split-half = .9093

Unequal-length Spearman-Brown = .9126

7 items in Part 1           6 items in Part 2

Alpha for Part 1 = .8382    Alpha for Part 2 = .8320

Interpretation

The 13 items in the scale are divided into two parts or forms: Part 1 and Part 2

Part 1 has 7 items, Part 2 has 6 items


Split-Half Reliability of the Training Needs Scale (cont.)

The Spearman-Brown reliability for equal-length forms: $r_{SB}$ = 0.9122; for unequal-length forms: $r_{SB}$ = 0.9126

The Guttman reliability: $r_G$ = 0.9093

Cronbach's α

Part 1 = 0.8382

Part 2 = 0.8320

Compare the α's of the two parts to the α for the scale with 13 items: α = 0.9104


Calculation of the Spearman-Brown Split-Half Reliability

For forms of equal length

$r_{SB} = 2\,r_{xy} / (1 + r_{xy})$

$r_{SB} = (2)(0.8385) / (1 + 0.8385) = 0.9122$

$r_{xy}$ = correlation between the two forms of the scale

For forms of unequal length

$r_{SB} = (2.00097)(0.8385) / (1 + 0.8385) = 0.9126$

Assumptions

Parts 1 and 2 are equally reliable

Equal variances in Parts 1 and 2

Interpretation

$r_{SB}$ = 0.9126 indicates the reliability of a 13 item scale made up of two parts that correlate 0.8385
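Both Spearman-Brown figures can be recomputed from the reported correlation between forms (0.8385). For the unequal-length case the sketch below does not use the slide's adjusted multiplier; it uses a general unequal-length correction often attributed to Horst, which is an assumption here rather than something the lecture states. The function names are hypothetical.

```python
from math import sqrt


def spearman_brown_equal(r):
    """Spearman-Brown reliability for two halves of equal length."""
    return 2 * r / (1 + r)


def spearman_brown_unequal(r, k1, k2):
    """Unequal-length correction (assumed Horst-type form); requires |r| < 1."""
    c = k1 * k2 / (k1 + k2) ** 2   # equals 0.25 when the halves are the same length
    r2 = r * r
    return (-r2 + sqrt(r2 * r2 + 4 * r2 * (1 - r2) * c)) / (2 * (1 - r2) * c)


r_xy = 0.8385   # reported correlation between Part 1 (7 items) and Part 2 (6 items)
print(round(spearman_brown_equal(r_xy), 4))
print(round(spearman_brown_unequal(r_xy, 7, 6), 4))
```

When the two halves have the same number of items, the unequal-length expression reduces algebraically to 2r / (1 + r), so the two functions agree in that case.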


Calculation of Louis Guttman’s Split-Half Reliability

$r_G = 2\,(S_t^2 - S_{t1}^2 - S_{t2}^2) / S_t^2$

$S_t^2$ = variance of the 13 item scale (cf. p.31)

$S_{t1}^2$ = variance of Part 1, 7 items (cf. p. 31)

$S_{t2}^2$ = variance of Part 2, 6 items (cf. p. 31)

$r_G = (2)(132.43 - 40.01 - 32.21) / 132.43$

$r_G = 0.9093$

Assumptions

Assumes neither equal reliability in Parts 1 and 2

Nor equal variances in Parts 1 and 2

Interpretation

$r_G$ = 0.9093 indicates the reliability of a 13 item scale made up of two parts that correlate 0.8385
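The Guttman coefficient, and the correlation between the two parts, can both be recovered from the three reported variances. The sketch relies on the identity var(total) = var(Part 1) + var(Part 2) + 2 cov(Part 1, Part 2); the variable and function names are hypothetical.

```python
from math import sqrt

var_total = 132.4273   # variance of the 13-item scale
var_part1 = 40.0103    # variance of Part 1 (7 items)
var_part2 = 32.2112    # variance of Part 2 (6 items)


def guttman_split_half(v_total, v1, v2):
    """Guttman split-half reliability from the total and part variances."""
    return 2 * (v_total - v1 - v2) / v_total


r_g = guttman_split_half(var_total, var_part1, var_part2)

# The total variance exceeds the sum of the part variances by twice their covariance.
cov_parts = (var_total - var_part1 - var_part2) / 2
r_xy = cov_parts / sqrt(var_part1 * var_part2)

print(round(r_g, 4), round(r_xy, 4))
```

Because the formula works directly with the part variances, it does not need the two parts to have equal variances, which matches the assumptions listed above.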


Testing Assumptions

Do the scale items have

Equal mean estimates of the true score?

Equal variance estimates of the true score?

Two models which can be tested

Strictly Parallel Model

Parallel Model

Strictly Parallel Model

Tests whether all the items have the same means and variances for the true score

Parallel Model

Tests whether all the items have the same variances for the true score, but not necessarily the same means for the true score


Test of the Strictly Parallel Model of Assumptions

Test for Goodness of Fit of Model Strictly Parallel

Chi-square = 633.3580
Degrees of Freedom = 101
Log of determinant of unconstrained matrix = -.009346
Log of determinant of constrained matrix = 3.206150
Probability = .0000

Parameter Estimates

Estimated common mean = 2.8514
Estimated common variance = 1.7773
Error variance = 1.0720
True variance = .7053
Estimated common inter-item correlation = .3943

Estimated reliability of scale = .8943
Unbiased estimate of reliability = .8959

Strictly Parallel Model

Null hypothesis: The items have the same means and variances for the true score.

$\chi^2$ = 633.358, df = 101, p < 0.0001

Decision: Reject the null hypothesis; the items have significantly different means and variances.


Test of the Parallel Model of Assumptions

Test for Goodness of Fit of Model Parallel

Chi-square = 242.7770 Degrees of Freedom = 89 Log of determinant of unconstrained matrix = -.009346 Log of determinant of constrained matrix = 1.226618 Probability = .0000

Parameter Estimates

Estimated common variance = 1.6262
Error variance = .9128
True variance = .7134
Estimated common inter-item correlation = .4387

Estimated reliability of scale = .9104
Unbiased estimate of reliability = .9113
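The parallel-model estimates illustrate the classical decomposition of observed variance into true and error variance. A minimal sketch in Python using only the figures printed above; the variable names are illustrative only.

```python
true_var = 0.7134    # true variance, parallel model output
error_var = 0.9128   # error variance, parallel model output

# Observed (common) item variance = true variance + error variance.
common_var = true_var + error_var

# Under the parallel model, the common inter-item correlation is the
# share of item variance that is true-score variance.
common_r = true_var / common_var

print(round(common_var, 4))  # 1.6262
print(round(common_r, 4))    # 0.4387
```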

Parallel Model

Null hypothesis: The items have the same variances for the true score, but not necessarily the same means.

χ² = 242.777, df = 89, p < 0.0001

Decision: Reject the null hypothesis; the items do not all share the same variances.
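How far each chi-square lies in the upper tail can be checked without statistical tables. A rough sketch in Python, assuming the Wilson-Hilferty normal approximation to the chi-square distribution (an approximation chosen here for illustration, not the method used to produce the output); the function name is illustrative only.

```python
import math

def chi2_upper_tail_approx(x: float, df: int) -> float:
    """Approximate P(chi-square with df degrees of freedom > x)
    using the Wilson-Hilferty cube-root normal approximation."""
    c = 2.0 / (9.0 * df)
    z = ((x / df) ** (1.0 / 3.0) - (1.0 - c)) / math.sqrt(c)
    # Upper-tail probability of a standard normal deviate z.
    return 0.5 * math.erfc(z / math.sqrt(2.0))

p_strict = chi2_upper_tail_approx(633.358, 101)   # strictly parallel model
p_parallel = chi2_upper_tail_approx(242.777, 89)  # parallel model

print(p_strict < 0.0001, p_parallel < 0.0001)  # True True
```

Both approximate probabilities fall well below 0.0001, consistent with the "Probability = .0000" lines in the output.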


Page 61: Very Good Reliability Analysis

To What Extent Do the 13 Training Needs Items Measure the Same Thing?

Factor analysis of the 13 training needs items

Principal Component Analysis

With Varimax Rotation

Results

Only one factor extracted with an eigenvalue greater than 1.0

This factor accounts for 48.94% of the variance in the 13 training needs items

Conclusion

The one underlying trait being measured by the 13-item scale is the need for training.


Page 62: Very Good Reliability Analysis

Results of the Factor Analysis of the 13 Training Needs Items

Communalities

Variable    Initial    Extraction
CASE_FL     1.000      .380
COM_SKL     1.000      .508
COM_COR     1.000      .251
REP_TEC     1.000      .354
SEC_MAN     1.000      .451
ETHICS      1.000      .638
HUM_RES     1.000      .521
JUD_REL     1.000      .630
INT_JUS     1.000      .420
INF_SYS     1.000      .532
HUM_ORG     1.000      .613
EVAL        1.000      .450
PLAN        1.000      .614

Extraction Method: Principal Component Analysis.

Total Variance Explained

            Initial Eigenvalues                        Extraction Sums of Squared Loadings
Component   Total    % of Variance   Cumulative %      Total    % of Variance   Cumulative %
1           6.363    48.943          48.943            6.363    48.943          48.943
2            .975     7.500          56.443
3            .882     6.788          63.231
4            .798     6.142          69.373
5            .653     5.021          74.394
6            .581     4.470          78.864
7            .551     4.235          83.099
8            .501     3.856          86.955
9            .421     3.238          90.193
10           .363     2.793          92.986
11           .325     2.501          95.486
12           .308     2.368          97.854
13           .279     2.146         100.000
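The percentage columns and the one-factor result follow directly from the eigenvalues. A minimal sketch in Python, assuming standardized items (so total variance equals the number of items) and the eigenvalue-greater-than-1.0 retention rule stated in the results; the variable names are illustrative only.

```python
eigenvalues = [6.363, 0.975, 0.882, 0.798, 0.653, 0.581, 0.551,
               0.501, 0.421, 0.363, 0.325, 0.308, 0.279]

n_items = len(eigenvalues)  # 13 standardized items, so total variance = 13

# Percent of variance for each component, and the running total.
pct_variance = [100.0 * e / n_items for e in eigenvalues]
cumulative = []
running = 0.0
for p in pct_variance:
    running += p
    cumulative.append(running)

# Retention rule: keep components with an eigenvalue greater than 1.0.
retained = [e for e in eigenvalues if e > 1.0]

print(len(retained))              # 1
print(round(pct_variance[0], 1))  # 48.9
```

Small differences in the third decimal place from the printed table arise because the eigenvalues above are already rounded.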


Page 63: Very Good Reliability Analysis

Extraction Method: Principal Component Analysis.


Page 64: Very Good Reliability Analysis

Results of the Factor Analysis of the 13 Training Needs Items (cont.)

Component Matrix

Variable    Component 1
CASE_FL     .617
COM_SKL     .713
COM_COR     .501
REP_TEC     .595
SEC_MAN     .672
ETHICS      .799
HUM_RES     .722
JUD_REL     .793
INT_JUS     .648
INF_SYS     .729
HUM_ORG     .783
EVAL        .671
PLAN        .783

Extraction Method: Principal Component Analysis.
a 1 components extracted.
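With a single extracted component, each item's communality is its squared loading, and the squared loadings sum to the component's eigenvalue. A minimal sketch in Python that checks this against the printed tables; the variable names are illustrative only, and the values are the rounded figures from the output.

```python
loadings = {
    "CASE_FL": 0.617, "COM_SKL": 0.713, "COM_COR": 0.501, "REP_TEC": 0.595,
    "SEC_MAN": 0.672, "ETHICS": 0.799, "HUM_RES": 0.722, "JUD_REL": 0.793,
    "INT_JUS": 0.648, "INF_SYS": 0.729, "HUM_ORG": 0.783, "EVAL": 0.671,
    "PLAN": 0.783,
}

# Communality of each item = squared loading on the one component.
communalities = {name: load ** 2 for name, load in loadings.items()}

# Eigenvalue of the component = sum of the squared loadings.
eigenvalue = sum(communalities.values())
pct_variance = 100.0 * eigenvalue / len(loadings)

for name, h2 in communalities.items():
    print(f"{name:8s} {h2:.3f}")
print(f"eigenvalue = {eigenvalue:.3f}, % of variance = {pct_variance:.2f}")
```

The computed values agree with the Communalities table and with the first eigenvalue (6.363) to within rounding of the loadings.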


Page 65: Very Good Reliability Analysis

Validation of the Scale

To what extent is the need for training, as measured by the scale score sum, correlated with the following experiential variables? (* metric variable, ** nonmetric variable)

Years of experience (years)*
Education (edu)*
Participation in the state’s professional development program (pdp)*
Participation in annual professional conferences (conf)*
Years of membership in the state's professional court administration association (year_mem)*
Type of court administered (type_c)**

The Analysis

Dependent Variable = the scale score (scl_sum)

For metric independent variables

Multiple regression analysis

For the nonmetric independent variable

One-way ANOVA


Page 66: Very Good Reliability Analysis


Page 67: Very Good Reliability Analysis

The Scale Score as a Function of Metric Predictor Variables

Regression Analysis

Scale score regressed on all metric experiential variables

The metric independent variables were not significantly related to the scale score for training needs

R² = 0.047, p = 0.267

Model Summary

Model   R      R Square   Adjusted R Square   Std. Error of the Estimate
1       .217   .047       .011                11.6954

a Predictors: (Constant), YEAR_MEM, EDU, PDP, CONF, YEARS
b Dependent Variable: SCL_SUM

ANOVA

Model          Sum of Squares   df    Mean Square   F       Sig.
1 Regression   889.993          5     177.999       1.301   .267
  Residual     18055.359        132   136.783
  Total        18945.352        137
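The Model Summary figures can be derived from the sums of squares in the ANOVA table. A minimal sketch in Python using only the printed values; the variable names are illustrative only.

```python
import math

ss_regression, df_regression = 889.993, 5
ss_residual, df_residual = 18055.359, 132

ss_total = ss_regression + ss_residual
df_total = df_regression + df_residual

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

f_ratio = ms_regression / ms_residual          # overall F test
r_square = ss_regression / ss_total            # proportion of variance explained
adj_r_square = 1 - (1 - r_square) * df_total / df_residual
std_error = math.sqrt(ms_residual)             # std. error of the estimate

print(round(f_ratio, 3), round(r_square, 3),
      round(adj_r_square, 3), round(std_error, 4))
```

The computed values reproduce F = 1.301, R Square = .047, Adjusted R Square = .011, and the standard error of 11.6954 to within rounding.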


Page 68: Very Good Reliability Analysis

a Predictors: (Constant), YEAR_MEM, EDU, PDP, CONF, YEARS
b Dependent Variable: SCL_SUM

The Scale Score as a Function of Metric Predictor Variables (cont.)

Coefficients

               Unstandardized          Standardized
               Coefficients            Coefficients                    95% Confidence Interval for B
Model          B         Std. Error    Beta           t        Sig.    Lower Bound   Upper Bound
1 (Constant)   29.659    5.003                        5.928    .000    19.762        39.556
  YEARS        -.139     .285          -.060          -.487    .627    -.703         .426
  EDU          2.078     1.176         .152           1.766    .080    -.249         4.405
  PDP          2.783     2.869         .085           .970     .334    -2.893        8.459
  CONF         .898      .615          .146           1.462    .146    -.317         2.114
  YEAR_MEM     -.192     .330          -.076          -.581    .562    -.844         .461

a Dependent Variable: SCL_SUM
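Each t statistic in the Coefficients table is the coefficient divided by its standard error, and each confidence interval is the coefficient plus or minus a critical t times the standard error. A minimal sketch in Python; the critical value 1.978 is an assumed approximation for a two-sided 95% interval with 132 residual degrees of freedom, and the names are illustrative only.

```python
T_CRIT = 1.978  # assumed approximate critical t, 95% two-sided, df = 132

# (B, Std. Error) pairs copied from the Coefficients table.
coefficients = {
    "(Constant)": (29.659, 5.003),
    "YEARS": (-0.139, 0.285),
    "EDU": (2.078, 1.176),
    "PDP": (2.783, 2.869),
    "CONF": (0.898, 0.615),
    "YEAR_MEM": (-0.192, 0.330),
}

def t_and_interval(b: float, se: float, t_crit: float = T_CRIT):
    """Return (t statistic, lower bound, upper bound) for one coefficient."""
    return b / se, b - t_crit * se, b + t_crit * se

results = {name: t_and_interval(b, se) for name, (b, se) in coefficients.items()}

for name, (t, lower, upper) in results.items():
    print(f"{name:11s} t = {t:6.3f}   95% CI = [{lower:7.3f}, {upper:7.3f}]")
```

Every interval for a predictor spans zero, which is consistent with none of the Sig. values falling below 0.05.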


Page 69: Very Good Reliability Analysis

The Scale Score as a Function of Type of Court

One-way ANOVA

IV = type of court, DV = scale score

A marginally significant difference was found in the mean scale score for training needs

F = 3.344, p = 0.069

Administrators in county courts at law indicated a greater need for training than district court administrators

Univariate Analysis of Variance

Descriptives

Dependent Variable: SCL_SUM

                                                   95% Confidence Interval for Mean
         N     Mean      Std. Deviation   Std. Error   Lower Bound   Upper Bound   Minimum   Maximum
1.00     132   36.0032   11.4240          .9943        34.0361       37.9702       13.00     61.00
2.00     67    39.1652   11.7273          1.4327       36.3047       42.0257       18.00     59.00
Total    199   37.0678   11.5946          .8219        35.4469       38.6886       13.00     61.00


Page 70: Very Good Reliability Analysis


Page 71: Very Good Reliability Analysis

The Scale Score as a Function of Type of Court (cont.)

ANOVA

Dependent Variable: SCL_SUM

                 Sum of Squares   df    Mean Square   F       Sig.
Between Groups   444.347          1     444.347       3.344   .069
Within Groups    26173.536        197   132.861
Total            26617.883        198
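The one-way ANOVA table can be rebuilt from the group sizes, means, and standard deviations in the Descriptives table. A minimal sketch in Python using only those printed values; the variable names are illustrative only.

```python
# (N, mean, standard deviation) for court types 1.00 and 2.00.
groups = [(132, 36.0032, 11.4240), (67, 39.1652, 11.7273)]

n_total = sum(n for n, _, _ in groups)
grand_mean = sum(n * mean for n, mean, _ in groups) / n_total

# Between-groups: weighted squared deviations of group means from the grand mean.
ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
df_between = len(groups) - 1

# Within-groups: pooled sum of squares from each group's variance.
ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
df_within = n_total - len(groups)

f_ratio = (ss_between / df_between) / (ss_within / df_within)

print(round(grand_mean, 4), df_between, df_within, round(f_ratio, 3))
```

The result agrees with F = 3.344 on 1 and 197 degrees of freedom; the sums of squares differ from the table only in the second decimal place because the descriptives are rounded.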
