Rasch Analysis of the Rosenberg Self-Esteem Scale With African Americans

Ruth Chu-Lien Chao, Courtney Vidacovich, and Kathy E. Green
University of Denver

Psychological Assessment, 2017, Vol. 29, No. 3, 329–342
© 2016 American Psychological Association. 1040-3590/17/$12.00 http://dx.doi.org/10.1037/pas0000347

Effectively diagnosing African Americans’ self-esteem has posed an unresolved challenge. To address this assessment issue, we conducted exploratory factor analysis and Rasch analysis to assess the psychometric characteristics of the Rosenberg Self-Esteem Scale (RSES; Rosenberg, 1965) for African American college students. The dimensional structure of the RSES was first identified with the first subsample (i.e., calibration subsample) and then held up under cross-validation with a second subsample (i.e., validation subsample). Exploratory factor analysis and Rasch analysis both supported unidimensionality of the measure, with that finding replicated for a random split of the sample. Response scale use was generally appropriate, items were endorsed at a high level reflecting high levels of self-esteem, and person separation and reliability of person separation were adequate and similar to those found in prior research. However, as some categories were infrequently used, we also collapsed scale points and found a slight improvement in scale and item indices. No differential item functioning was found by sex or by having received professional assistance versus not; there were no mean score differences by age group, marital status, or year in college. Two items were seen as problematic. Implications for theory and research on multicultural mental health are discussed.

Keywords: item-level analysis, Rasch analysis, clinical distress, African Americans

This article was published Online First June 9, 2016.
Ruth Chu-Lien Chao, Courtney Vidacovich, and Kathy E. Green, Counseling Psychology Department, Morgridge College of Education, University of Denver.
Correspondence concerning this article should be addressed to Ruth Chu-Lien Chao, Counseling Psychology Department, Morgridge College of Education, University of Denver, Denver, CO 80208. E-mail: Cchao3@du.edu

This document is copyrighted by the American Psychological Association or one of its allied publishers. This article is intended solely for the personal use of the individual user and is not to be disseminated broadly.

Although the Rosenberg Self-Esteem Scale (RSES; Rosenberg, 1965) is a frequently used measure of self-esteem, there is still inconsistency in assessing African Americans’ self-esteem (Aluja, Rolland, García, & Rossier, 2007; Alwin & Jackson, 1981; Gray-Little, Hancock, & Williams, 1997; Richardson, Ratner, & Zumbo, 2009). Among African Americans, self-esteem is a crucial index reflecting their mental health. The Surgeon General, while recognizing that mental health is culturally influenced, defines it as “a state of successful performance of mental function, resulting in productive activities, fulfilling relationships with other people, and the ability to adapt to change and to cope with adversity” (U.S. Department of Health & Human Services, 2001). The marginalization and economic deprivation of African Americans contribute to the unique distribution of mental health problems among this population (Jackson et al., 2004). Therefore, Krause (1983) proposed to evaluate African Americans’ self-esteem based on their racial context.

Specifically, African Americans experience multiple oppressions related to both racism and discrimination, which have contributed to the racial context of self-esteem. Jackson et al. (2004) found that, due to multiple oppressions in their racial context, self-esteem for African Americans may indicate self-value against distorted, false, denigrating, antiself, anti-African messages. Self-concept is developed despite varied negative images related to skin color and physical appearance, as well as common stereotypes related to domestic servitude, welfare dependence, and many others. After developing his self-esteem scale in 1965, Rosenberg (1972) highlighted that scholars need to understand African Americans’ self-esteem within their “distinctive subcultures [which refer to] . . . style of life . . . conceptions of right or wrong . . . patterns of values and systems of aspirations . . .” (p. 99). Constantine and Blackmon (2002) further extended African American students’ self-esteem to consider racial socialization messages received by African Americans and culture-specific self-esteem. For African Americans, racial socialization messages reflecting pride and knowledge about African American culture were positively associated with their self-esteem.

Accordingly, to understand the self-esteem of African Americans based on their cultural context, there are two competing conclusions regarding the factors or dimensions of self-esteem. Rosenberg (1965) designed the RSES as a global measure of a single dimension of self-evaluation, and various studies have supported that. However, other previous research on the RSES has revealed a two-dimensional factor structure. Various studies have supported the contention that the RSES taps two distinct dimensions of self-esteem. Rosenberg (1965, 1972) also indicated two different types of affect (i.e., positive vs. negative) in self-esteem. In a classic longitudinal study, Owens (1994) used an eight-item version of the RSES with slightly different wording of the items to examine the relationships of the positive and negative subscales with depressive symptoms. Owens concluded that the RSES reflects two dimensions. Thus, the present study proposed to address this dispute by determining whether a global dimension or a two-dimensional structure of self-esteem was warranted for a sample of African American college students, and whether a measure developed primarily with Whites adequately captures African American students’ self-esteem.

Thus, for African American students, it is important to examine how and whether the assessment of self-esteem (i.e., RSES) may be fundamentally a valid measure for this population. Moreover, to understand African Americans’ self-esteem, scholars and practitioners should conceptualize self-esteem based on African American culture. African American students occupy a unique social status in the U.S., with evidence that this group is likely to experience a number of contextual risk factors as a function of their racial group membership. Thus, on one hand, they experience being marginalized, which may be related to negative experience or low self-esteem. On the other hand, while the unique risks and challenges that African American students experience are important to highlight, many African American students show a great deal of positive adaptation that often is unrecognized and unacknowledged (Constantine, 2007; Jackson et al., 2004). To date, there has been relatively little focus on the positive development among African American students, including personal and cultural assets. According to resiliency theory (e.g., Glantz & Johnson, 1999; Luthar, Cicchetti, & Becker, 2000), resiliency is concerned with exposure to risk but focuses on strengths instead of deficits and on understanding healthy development in spite of risk exposure. Thus, when African American students have more resilience about themselves or their culture-related beliefs, they tend to have higher self-esteem (Constantine & Blackmon, 2002). Our primary purpose in this study was to evaluate whether the RSES, developed primarily with Whites, measures self-esteem given potential differences in the meaning of the construct for an African American student population.

Additionally, our study presented an example of the potential usefulness of a widely used self-esteem scale among African Americans. A PsycINFO search for Rasch research on general populations identified more than 1,000 studies, most of which dealt with instruments designed to assess specific symptoms (e.g., posttraumatic stress disorder). Only 24 of these studies involved race-related or health disparities issues. Four of these focused on the racial differences between African Americans and Whites, and none of these studies conducted analyses on the usefulness of the RSES with African Americans. Thus, to further address the racial context in assessment, we utilized Rasch analysis to evaluate how useful the RSES was for African Americans.

Summary of Psychometric Research on the RSES

To understand the structure of self-esteem of African Americans based on their cultural context, it is important to know the background for the RSES, its development, scoring, and psychometric characteristics. The Rosenberg Self-Esteem Scale (RSES; Rosenberg, 1965) comprises 10 items, with five negatively and five positively worded, that assess a person’s global evaluation or liking of him/herself. Self-esteem is linked to a wide range of variables, such as neuroticism, depression, extraversion, conscientiousness, and attachment style (Kuster, Orth, & Meier, 2012). The RSES is perhaps the most widely used instrument to measure self-esteem. It has been used extensively internationally (e.g., Baranik et al., 2008; Schmitt & Allik, 2005) and translated into multiple languages (e.g., Aluja et al., 2007; Roth, Decker, Herzberg, & Brähler, 2008).

The structure of the RSES has been the subject of extensive study. While results are mixed, more recent work with structural equation analyses supports an overall global self-esteem dimension with method factors needed to adequately model RSES data (Marsh, Scalas, & Nagengast, 2010). In contrast, though in agreement with some results of earlier exploratory and confirmatory factor analyses, studies of RSES structure employing item response theory have generally resulted in the conclusion that the instrument is reasonably unidimensional (Baranik et al., 2008; Gray-Little, Hancock, & Williams, 1997; Mannarini, 2010; Quintão, Delgado, & Prieto, 2011; Roth et al., 2008; Song, Cai, Brown, & Grimm, 2011). Yet Classen, Velozo, and Mann (2007) found that all items fit a unidimensional model but also that the residual variance of some negatively worded items loaded on a potential second dimension. Roth, Decker, Herzberg, and Brähler (2008) suggested item response theory “can provide useful information about a questionnaire, in particular when the results of preceding factor analyses are ambiguous” (p. 194). In studies employing item response theory, Items 5 (Mannarini, 2010) and 8 (Baranik et al., 2008) were found to be problematic, with Items 8, 9, and 10 (Gray-Little et al., 1997) less discriminating than the remaining items. Item 5 states “I feel I do not have much to be proud of,” and Item 8 states “I wish I could have more respect for myself.” Negatively worded items evidenced differential item functioning between U.S. and Chinese undergraduates (Song et al., 2011). Quintão, Delgado, and Prieto (2011) found that some items were offset in targeting (items were generally too easy for these samples) with samples of Portuguese college students. Furthermore, Classen et al. (2007) conducted analysis with a noninstitutionalized elderly sample and likewise found an offset in targeting (items were too easy for the sample).

RSES and African Americans’ Self-Esteem

Despite the RSES being one of the most widely used instruments for measuring level of self-esteem, unfortunately, there are few studies concentrating on the structure of self-esteem using the RSES for African Americans. Thus, scholars studying African American assessments (Broman, Torres, Canady, Neighbors, & Jackson, 2010; Jackson et al., 2004) advocated for focusing on representative data on mental health, such as self-esteem, among African Americans. Therefore, to follow the suggestion of these leading scholars, the authors concentrated data collection on African Americans.

In addition to focusing on African American data, scholars also proposed to examine self-esteem based on the racial context of African Americans. Given the unique nature of the self-esteem of this group, impacted by experiences of racism and discrimination, the accurate measurement of this construct is important (Hatcher, 2007). To date, among the very few studies located that examined RSES structure for U.S. African American samples, some studies (Seaton, Caldwell, Sellers, & Jackson, 2010) followed the original conceptualization and validated self-esteem as a global concept of self-worth and self-value. However, some scholars argued that, when considering Black self-esteem, global self-esteem is not “domain specific and unidimensional” for African Americans and should incorporate positive or negative perceptions (Hatcher & Hall, 2009; Krause, 1983). Specifically, with the cultural influences of racial esteem and the rejection of negative stereotypes forming an important and distinct aspect of this concept, the RSES should be used and interpreted with caution in this population given these findings and until the measure structure and psychometric properties are examined for African Americans. Within the racial context, African Americans may be able to maintain positive self-feelings while simultaneously incorporating negative self-views. Hatcher and Hall (2009) engaged a sample of 98 African American women and found the RSES to be best explained as two dimensions, negative and positive self-regard, with lower item discrimination for Items 5 and 8. Hatcher (2007) cited seven studies in which the RSES was employed along with other measures in an examination of RSES score validity, noted the lack of psychometric examination of the measure for African Americans, and questioned the validity of the RSES for use with this population. A two-dimension model (i.e., a positive self-esteem subscale and a negative self-esteem subscale) was also found among 508 nursing assistants, 434 of whom were African American.

Instrument Assessment With Rasch Analysis

Due to inconsistent results regarding the RSES structure when using factor analytic techniques, Rasch (1960, 1980) analysis was employed. Classical test theory has been criticized on several counts, with Rasch (and other IRT models) seen as the often more informative measurement model. Among the criticisms of classical test theory remediated by the Rasch and other IRT models are treatment of ordinal raw scores as interval, treatment of ordinal scale points as interval, sample dependence of person and item position statistics, and assessment of dimensionality with the effects of the first measurement dimension included (e.g., Hambleton & Jones, 1993; Petrillo, Cano, McLeod, & Coon, 2015; Prieto, Alonso, & Lamarca, 2003). Rasch models also provide indices of model fit not available with classical test theory. For these reasons, Rasch rating scale analyses were applied in addition to an exploratory factor analysis (more specifically, principal axis factoring). Most prior analyses of the RSES were conducted using classical test theory, with the exceptions cited above. Two studies of the RSES using an IRT model found support for a unidimensional structure (Roth et al., 2008). However, Roth et al. (2008) analyzed the dimensionality of the German version of the RSES using confirmatory factor analysis, with one- and two-dimensional models tested, and found that the RSES is a two-dimensional scale. They also found that the RSES comprised the highly correlated components of positive and negative self-evaluation, which constitute a unitary construct of global self-esteem at the second-order level. They then conducted item response theory (IRT) analysis, which supported a one-dimensional view of the RSES. Classen et al. (2007) used the Rasch model to determine the item-level psychometrics of the RSES in 986 noninstitutionalized elderly adults. They found that the RSES showed appropriate item fit statistics and person separation, but had limitations in the spread of items, unidimensionality, the logical progression of difficulty among the items, negative statements included in the rating scale, and the contextual relevance of items underlying the self-esteem construct. In summary, results of RSES dimensional analysis using IRT seem to favor a unidimensional structure but are not consistent with prior analyses of RSES structure. Thus, the current study was designed to add to the literature on RSES structure specifically for a sample of African Americans.

Here we summarize the approach we took to assess the psychometric quality of RSES scores. In agreement with Roth et al. (2008), we used multiple methods to determine the structure of the RSES, including exploratory factor analysis and Rasch analyses. Exploratory factor analysis stemming from classical test theory has been extremely useful in identifying items that reflect latent dimensions but is limited for the reasons provided above. Rasch analysis provides a more detailed examination of the scale use and provides an alternative identification of structure based on conversion of raw scores to interval measures, with a wider array of information regarding fit, a map of the construct, and a varied interpretation (separation) of consistency. The theory underlying the Rasch model specifies that useful measurement mandates a unidimensional construct arranged in a monotonically increasing pattern (e.g., more than/less than) along an equal interval continuum. If the data (e.g., calibration or invariance subsample in our study) fit the Rasch model, item and person estimates can be interpreted in terms of abstract, equal-interval units created by natural log transformations of raw data odds, within standard error estimates (Bond & Fox, 2007). Instruments such as the RSES often use 4-point rating scales as vehicles through which clients can express their views and experiences relevant to the item content. To effectively use data obtained from scale administration, we must evaluate whether individuals respond to the rating scale in the manner intended by the scale developers.

Importantly, utilizing Rasch analysis to examine RSES items for African Americans allows researchers to evaluate the extent to which items are useful in reflecting a unidimensional scale. Rasch fit indices assess whether items contribute to the construct as expected. Fit statistics are transformations of chi-square statistics, with expected values of the mean square (MNSQ) and standardized fit indices of 1.0 and 0.0, respectively, if the data fit the model. Fit is weighted by the difference between the item and the person parameter (termed infit) or is unweighted (outfit). Outfit is more sensitive to extreme responses. Items and persons can underfit or overfit the model. Underfit occurs when there is too much noise in an estimate; overfit occurs when there is less than the model-expected noise in an estimate. Underfit, or MNSQ or standardized fit exceeding an arbitrary but not capricious cut-off, occurs for items eliciting idiosyncratic responses or items that are less strongly related to the measure core. Overfit, or MNSQ or standardized fit below a cut-off, typically occurs for items that show very little noise, possibly by holding a strong relationship to the measure core. Underfit may be considered more concerning than overfit; thus, different cut-offs may be useful for underfit compared with overfit. Cut-offs for identifying fitting items and persons are sufficiently flexible to allow for researcher judgment. As noted by Smith, Rush, Fallowfield, Velikova, and Sharpe (2008), there continues to be a debate around the issue of whether mean square or standardized fit indices are preferable. The mean square (MNSQ) infit and outfit indices were used in this study, with acceptable fit defined as values between .6 and 1.4. Additionally, a principal components analysis of residuals (PCAR) was used to determine whether a second dimension was indicated by the data.
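The infit and outfit mean squares described above can be illustrated with a minimal sketch. It assumes the simpler dichotomous Rasch model rather than the rating scale model used in this study, and the function names (`rasch_probability`, `item_fit`, `within_fit_criterion`) are hypothetical. Treating the .6 and 1.4 bounds as inclusive is also an assumption.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of endorsing a dichotomous item under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def item_fit(responses, abilities, difficulty):
    """Return (infit_mnsq, outfit_mnsq) for one dichotomous item.

    responses: 0/1 answers, one per person
    abilities: person logit positions, in the same order as responses
    """
    squared_residuals = []
    variances = []
    for x, b in zip(responses, abilities):
        p = rasch_probability(b, difficulty)
        squared_residuals.append((x - p) ** 2)
        variances.append(p * (1.0 - p))  # model variance of the response
    # Outfit: unweighted mean of squared standardized residuals.
    outfit = sum(r / w for r, w in zip(squared_residuals, variances)) / len(responses)
    # Infit: squared residuals weighted by the model variance (information).
    infit = sum(squared_residuals) / sum(variances)
    return infit, outfit


def within_fit_criterion(mnsq, lower=0.6, upper=1.4):
    """Check a mean square against the .6 to 1.4 range (bounds assumed inclusive)."""
    return lower <= mnsq <= upper


if __name__ == "__main__":
    # Hypothetical data: one high-ability person unexpectedly fails an easy item.
    infit, outfit = item_fit([0, 1, 0, 1], [3.0, 0.0, 0.0, 0.0], 0.0)
    print(round(infit, 2), round(outfit, 2), within_fit_criterion(infit))
```

In this hypothetical example the single unexpected response from an off-target person inflates outfit more than infit, which is the sense in which outfit is more sensitive to extreme responses.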
In contrast to an exploratory factor analysis, which creates eigenvectors from solution of a matrix of item covariances, a principal components analysis of residuals (PCAR) identifies common characteristics shared by items once the dominant linear dimension has been removed by differencing model-predicted and observed values. PCAR in concept addresses the hypothesis that the residuals are random noise by finding the component that explains the largest possible amount of variance in the residuals. The largest amount of variance in the residuals is the “first contrast.” Set criteria have not been established for when deviation from the core item set indicates a second dimension, though Linacre (2004) suggested an instrument may be considered unidimensional if variance explained by the first dimension is substantial (e.g., >40%), the eigenvalue for the first contrast (analogous to but computed differently from the eigenvalue for the second component in a principal components analysis) is less than or equal to 2.0, and the variance explained by the first contrast is less than 5%. These criteria are used as guidelines; Linacre (2014, p. 10) also states “a secondary dimension must have the strength of at least three items so if the first contrast has units (i.e., eigenvalue) less than 3.0 (from reasonable length tests) then the test is probably unidimensional” (dimensionality: contrasts & variances). If Linacre’s criteria are not met, further diagnostic tables are reviewed. For example, items are placed into three clusters; if the disattenuated correlation between clusters is high, the information provided by different sets of items is similar, indicative of little to be gained by pursuing a second dimension in the data.
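The guideline values in the preceding paragraph can be collected into a small screening function. This is only a sketch: the function name and inputs are hypothetical, the figures in the usage lines are made up for illustration, and reading the 40% guideline as "more than 40%" is an assumption. It applies the three-part guideline only and does not replace review of the further diagnostic tables.

```python
def looks_unidimensional(variance_explained_pct,
                         first_contrast_eigenvalue,
                         first_contrast_pct):
    """Screen PCAR output against the three guideline values.

    Returns (all_met, per_criterion_results).
    """
    checks = {
        # Variance explained by the first (Rasch) dimension is substantial.
        "measure_variance_over_40pct": variance_explained_pct > 40.0,
        # First-contrast eigenvalue is at most 2.0.
        "first_contrast_eigenvalue_at_most_2": first_contrast_eigenvalue <= 2.0,
        # First contrast explains less than 5% of the variance.
        "first_contrast_variance_under_5pct": first_contrast_pct < 5.0,
    }
    return all(checks.values()), checks


if __name__ == "__main__":
    # Hypothetical PCAR summaries, not results from this study.
    print(looks_unidimensional(52.0, 1.8, 4.1))
    print(looks_unidimensional(35.0, 2.6, 8.0))
```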

To further address the usage of the RSES among African Americans, we evaluated the person separation estimates to see how well items assess different levels of the measure on a less-to-more continuum and to identify the number of subgroups of persons that the instrument can discriminate. Separation equals the true standard deviation divided by root mean square error. Separation is higher if an item hierarchy is evident. Separation and reliability of separation are transformations of one another (Smith, 2001), and the reliability of person separation is equal to $\text{Separation}^2 / (1 + \text{Separation}^2)$. Rasch reliability indices based on interval linear measures are suitable for subsequent parametric calculations of means and standard deviations (Merbitz, Morris, & Grip, 1989). Separation values less than 2.0 imply the instrument may not be sufficiently sensitive to distinguish between more than high and low scorers (Linacre, 2016). Higher values of separation represent greater coverage of the construct along a continuum.
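As a worked illustration of the relationship between separation and reliability of separation, the following sketch converts in both directions. The function names are hypothetical, and the input values are illustrative rather than results from this study.

```python
import math


def separation(true_sd, rmse):
    """Separation: true standard deviation divided by root mean square error."""
    return true_sd / rmse


def reliability_from_separation(sep):
    """Reliability of separation: Separation^2 / (1 + Separation^2)."""
    return sep ** 2 / (1.0 + sep ** 2)


def separation_from_reliability(reliability):
    """Inverse transformation: sqrt(reliability / (1 - reliability))."""
    return math.sqrt(reliability / (1.0 - reliability))


if __name__ == "__main__":
    sep = separation(true_sd=1.2, rmse=0.6)  # illustrative values
    print(sep, reliability_from_separation(sep))
```

Under this transformation, the separation cut-off of 2.0 corresponds to a reliability of .80.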

Response scale use is intended to reflect a less-to-more scale, with lower category responses being easier to agree with than higher category responses. Scale use is evaluated using a number of indices. Linacre (2014) provided the following recommendations for fruitful use of a rating scale: The number of responses in any category should exceed 10, a smooth distribution of frequencies across categories is preferred, average measure should advance clearly from one category to the next higher one, mean square category fit should be close to or less than 1.0, and step calibrations should advance from one category to the next with a change in step difficulty of more than 1.4 and less than 5.0 logits.
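The quantitative parts of these recommendations can be expressed as a simple checklist, sketched below. The function name, the input layout, and the `mnsq_tolerance` used to operationalize "close to or less than 1.0" are assumptions; the preference for a smooth frequency distribution is left to judgment and is not coded. The category statistics in the usage lines are hypothetical.

```python
def check_rating_scale(counts, average_measures, category_mnsq,
                       step_calibrations, mnsq_tolerance=0.2):
    """Return guideline name -> True/False for one rating scale.

    counts, average_measures, category_mnsq: one value per category,
        ordered from the lowest to the highest category.
    step_calibrations: one value per step between adjacent categories.
    mnsq_tolerance: assumed margin for "close to" 1.0.
    """
    step_gaps = [b - a for a, b in zip(step_calibrations, step_calibrations[1:])]
    return {
        "more_than_10_responses_per_category": all(c > 10 for c in counts),
        "average_measures_advance": all(
            b > a for a, b in zip(average_measures, average_measures[1:])
        ),
        "category_fit_near_or_below_1": all(
            m <= 1.0 + mnsq_tolerance for m in category_mnsq
        ),
        "step_gaps_between_1.4_and_5.0": all(1.4 < g < 5.0 for g in step_gaps),
    }


if __name__ == "__main__":
    # Hypothetical statistics for a 4-point scale with a rarely used lowest category.
    result = check_rating_scale(
        counts=[8, 40, 120, 90],
        average_measures=[-0.9, 0.1, 1.3, 2.6],
        category_mnsq=[1.15, 0.98, 0.95, 1.02],
        step_calibrations=[-2.1, -0.3, 2.4],
    )
    for name, passed in result.items():
        print(name, passed)
```

A failed count check of this kind is the sort of result that would motivate collapsing adjacent scale points.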

Invariance means the item positions are the same (within some level of error) across distinct groups, such as by sex or mental health status. Invariance is assessed by examination of differential item functioning (DIF). DIF occurs when the item is perceived differently by members of one group than another, and so the item position varies by group. DIF is assessed using significance tests of differences in item position along with a suggestion that DIF of less than about half a logit is not a large enough effect to declare an item as failing invariance (Wright & Panchapakesan, 1969).
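The two-part DIF decision described above (a significant difference in item position that is also at least about half a logit) might be sketched as follows. The function name is hypothetical, and using a normal-approximation critical value of 1.96 for the significance test is an assumption; the source does not specify the test. The item positions and standard errors in the usage lines are invented.

```python
import math


def dif_contrast(position_a, se_a, position_b, se_b,
                 min_logits=0.5, critical_t=1.96):
    """Compare one item's logit position across two groups.

    Returns (contrast, t, flagged). An item is flagged only when the
    difference is both statistically significant (assumed |t| > critical_t)
    and at least min_logits in size.
    """
    contrast = position_a - position_b
    joint_se = math.sqrt(se_a ** 2 + se_b ** 2)
    t = contrast / joint_se
    flagged = abs(t) > critical_t and abs(contrast) >= min_logits
    return contrast, t, flagged


if __name__ == "__main__":
    # Hypothetical item positions for two groups.
    print(dif_contrast(0.90, 0.12, 0.20, 0.13))  # large and significant
    print(dif_contrast(0.50, 0.05, 0.20, 0.05))  # significant but under half a logit
```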

As both calibration and validation samples were used in this study, we investigated displacement in item position from the first to the second sample. The displacement statistic “approximates the displacement of the estimate away from the statistically better value which would result from the best fit of your data to the model” (Linacre, 2016). With item positions anchored based on the calibration sample, a free (unanchored) parameter estimate was provided in the validation sample for all of the items. Displacement is the direct comparison of the anchored difficulty value with the value from the free estimation arising from the second sample. If displacement is small, the item positions are considered stable over samples.
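A minimal sketch of this comparison is shown below. It treats displacement simply as the free estimate minus the anchored value, which is a simplification of the statistic that Winsteps reports. The function names, the 0.5-logit tolerance for "small," and the item values are all assumptions for illustration.

```python
def displacements(anchored, free):
    """Free (validation-sample) estimate minus anchored (calibration) value, per item."""
    return {item: free[item] - anchored[item] for item in anchored}


def unstable_items(anchored, free, tolerance=0.5):
    """Items whose absolute displacement exceeds the assumed tolerance in logits."""
    return [
        item
        for item, d in displacements(anchored, free).items()
        if abs(d) > tolerance
    ]


if __name__ == "__main__":
    # Hypothetical item difficulties in logits.
    anchored = {"item1": -0.40, "item5": 0.30, "item8": 0.90}
    free = {"item1": -0.35, "item5": 0.95, "item8": 0.80}
    print(unstable_items(anchored, free))
```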

Rasch analysis can be used to identify gaps in the construct continuum by identifying items and persons that are not in equivalent positions. A well-targeted test has a person logit mean of close to 0.0, indicating persons and items are well matched in position (Chao & Green, 2013; Crandal et al., 2015). Where items and persons are not well targeted, they have larger standard error estimates, although this can potentially be compensated for by an increase in the sample size (Linacre, 1994). Also, as the offset between the item mean of 0.0 and the person mean increases, the proportion of extreme scores increases and all reliabilities decrease (Linacre, 1997). Substantial offset indicates that the test is either too easy or too difficult for the sample and so does not function well as a test intended for a comparable population. In contrast, one would expect a diagnostic test to show a substantial offset between the person and item means when given to a general population. Gaps in construct coverage provide information about how well the instrument measures what it is intended to measure within reasonable ranges of the measure and also where items can be directed to further improve it. Thus, such gaps can provide information regarding how well the RSES measures self-esteem for African Americans. The appropriateness of measurement can answer whether there is a propensity to perpetuate health disparities in the use of the RSES as a psychological assessment.

Rasch models comprise a family of models, each model using a different response scale. The Rasch model used in this analysis was the polytomous rating scale model (Wright & Masters, 1982). Winsteps (Linacre, 2016) software was used to generate the analyses. The rating scale model incorporates a parameter for the scale along with parameters for item and person position, as can be seen below:

$$\ln\left(\frac{P_{nij}}{P_{ni(j-1)}}\right) = B_n - D_i - F_j$$

($P_{nij}$ = the probability that person $n$ encountering Item $i$ is observed in category $j$; $B_n$ = logit position of person $n$; $D_i$ = logit position of Item $i$; $F_j$ = logit position of rating scale step $j$).
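To make the model concrete, the sketch below turns the adjacent-category log-odds into category probabilities. It is a minimal illustration rather than the Winsteps estimation routine: the function name is hypothetical, categories are indexed from 0 (the paper codes them 1 to 4), the person and item positions are chosen arbitrarily, and the step values are the calibration-sample step structure values reported in Table 3.

```python
import math


def rsm_category_probs(b, d, steps):
    """Category probabilities under the rating scale model.

    b: person logit position (B_n); d: item logit position (D_i);
    steps: step positions F_1..F_m. Returns probabilities for categories 0..m.
    """
    # Cumulative sums of (b - d - F_j); the bottom category has an empty sum of 0.
    cumulative = [0.0]
    for f in steps:
        cumulative.append(cumulative[-1] + (b - d - f))
    # Exponentiate and normalize; subtracting the maximum avoids overflow.
    peak = max(cumulative)
    weights = [math.exp(c - peak) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


# Illustrative person at -2.0 logits and item at 0.0 logits (both assumed).
probs = rsm_category_probs(b=-2.0, d=0.0, steps=[-1.78, 0.29, 1.49])
print([round(p, 3) for p in probs])
```

By construction, the log of the ratio of any two adjacent probabilities equals $B_n - D_i - F_j$, which is the defining property in the equation above.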

The RSES, to our knowledge, has only infrequently been subjected to Rasch or other item response theory analyses. Our general purpose was to use Rasch analysis to provide a detailed understanding of the RSES when administered to African Americans. Such a comprehensive understanding would make it possible to address health disparity in psychological assessment and would allow equitable interpretation of the RSES for African Americans. We used the following steps to assess the functioning of the RSES for our sample: fit of data to the model, item logit positions that evidenced no DIF and made conceptual sense in terms of a less-to-more interpretation, targeting of items, and expected relationships with background variables. To rigorously examine the validity of the RSES for African Americans, we conducted the Rasch analysis with two subsamples: a calibration subsample and an invariance subsample.

Even within African American communities, individuals’ functioning can be affected by demographic background variables such as sex and prior psychological counseling (Trippi & Cheatham, 1991). According to Ward, Wiltshire, Detry, and Brown (2013), it is important to know that age, sex, and other demographic variables may place African Americans in different positions with regard to their self-esteem. Thus, we followed Ward et al. (2013) in examining whether age and sex contribute to differences in RSES items. On one hand, African American men and women may experience unique discrimination related to their sex (Clark, Anderson, Clark, & Williams, 1999) that perhaps makes them vulnerable to different psychological struggles. For example, African American men tend to have an increased propensity for anger issues, while African American women may be vulnerable to depression (Chao, Mallinckrodt, & Wei, 2012). On the other hand, because male identity and masculinity are associated with African American men’s health-related behaviors, African American men may be less motivated to use counseling than women due to stigma and pride (Wade, 2008). Previous studies found that RSES scores could differ because of demographic variables such as age, grade point average (GPA), current medication use, past medication use, and physical and mental disabilities (Ridley, 2005). For example, due to historical hostility, older African Americans seek counseling less than younger African Americans (Constantine & Sue, 2006; Ridley, 2005). Among African American students, grade point average is related to mental health (Gutman, Friedel, & Hitt, 2003). Additionally, African Americans’ mental health may be related to marital status, which could be a resource for them (Lincoln & Chae, 2010). Thus, it is important to clarify the impact of demographic variables on RSES scores. Although African American college students may suffer from health disparities, we recognized that examining African American college students may limit the generalization of our findings to other racial/ethnic populations. Importantly, our questions below echoed recommendations from the Standards for Educational and Psychological Assessment (Geisinger et al., 2013). For instance, Question 5 below relates to validity evidence based on the standards.

We enlisted an African American student sample to identify the structure and functioning of RSES items with these African American students. Specifically, the following questions were addressed:

1. What is the structure of the RSES for a sample of African American college students? Is the measure unidimensional? Results from exploratory factor analysis, Rasch principal components analysis of residuals, and Rasch rating scale analyses were used to assess dimensionality.

2. Is the use of the rating scale consistent and appropriate?

3. What measurement gaps and redundancies exist along the RSES continuum and do these potential gaps indicate the need for adding or deleting items?

4. Is differential item functioning (DIF) found between groups defined by the two variables: sex and having received therapy or professional assistance? An interest in the examination of DIF by sex was based on results of Ward et al. (2013), who found differences in functioning by sex; examination of DIF by having received therapy was motivated by the idea that participants with past experience with therapy may respond differently to RSES items. Is item displacement small?

5. Are there differences or correlations for RSES raw scores and logit person position by age, sex, and variables related to mental health? The purpose of these analyses was to see whether a refined measure, using person logit position, reflected anticipated differences by mental health service use. We anticipated differences in self-esteem for those participants reporting use of mental health services.

Method

Participants

A total of 719 African American students were invited from five universities, and 575 agreed to participate in this study; that is, 80% of the invited students responded. Fifty-one participants did not complete the survey and were removed from the 575 respondents. The final sample size was 524, so data from 73% of the invited students (524/719 = 73%) were used in our study.

The participants were 524 African American students seeking services at a university counseling center from five predominantly White public universities in the Midwestern United States. Each university had an annual enrollment of 18,000–35,000 students. Of these African American college students, 316 (60.3%) were female and 118 (22.5%) were male; 90 individuals (17.2%) did not specify sex. Most participants were undergraduate students (21.2% freshmen, 22.9% sophomores, 19.5% juniors, 17.6% seniors; with 1.3% in graduate school and 17.6% not specified). Most participants were single (82.1%), with 13.2% married and 1.5% divorced. About 87.2% of participants were �22 years old, and ages ranged from 18 to 35 years. Relatively low percentages of the sample had received therapy or professional assistance (10.2%), were on medication related to mental illness (1.0%), had been hospitalized for psychological/psychiatric problems (1.1%), or had attempted suicide (4.2%), while a somewhat higher percentage reported that a family member suffered from a mental or emotional disorder (18.6%). The sample of African American college students was randomly assigned to either a calibration or an invariance subsample. Table 1 provides a description of the background variables for the two subsamples.

Measures

Demographic questionnaire. Participants were asked to provide demographic information on age, gender, marital status, and mental health variables.

Rosenberg Self-Esteem Scale (RSES; Rosenberg, 1965). There are 10 items in the Rosenberg Self-Esteem Scale (RSES) used to measure a person’s perception of their self-esteem. The response scale is a 4-point rating scale (strongly disagree to strongly agree). Five items are positively worded (e.g., “On the whole, I am satisfied with myself”) and five items are negatively worded (e.g., “At times I think I am no good at all”). The positively worded items were reverse coded for a total score that is uniformly interpreted, with higher scores corresponding to lower levels of self-esteem. The intent of this coding was to use high scores to identify individuals with low self-esteem who may be in need of continued services. Four demographic questions were asked regarding level in college, age, gender, and marital status. Additionally, there were five questions regarding mental health issues: time in therapy, suicide attempts, hospitalization for mental illness, taking medication for a mental illness, and family history.
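The reverse coding described above can be sketched as follows. The item numbers treated as positively worded follow the commonly published RSES ordering, and the 1 (strongly disagree) to 4 (strongly agree) numeric coding is likewise assumed; neither is stated explicitly in this section, so both should be checked against the form actually administered.

```python
# Assumption: positively worded items in the commonly published RSES ordering.
POSITIVE_ITEMS = {1, 3, 4, 7, 10}


def score_rses(responses, low=1, high=4):
    """Reverse code positively worded items so that higher totals mean lower self-esteem.

    responses: dict mapping item number -> raw rating
    (assumed coding: 1 = strongly disagree ... 4 = strongly agree).
    Returns (recoded ratings, total score).
    """
    recoded = {}
    for item, rating in responses.items():
        if not low <= rating <= high:
            raise ValueError(f"rating {rating} for item {item} is out of range")
        # Agreement with a positive item signals high self-esteem, so flip it.
        recoded[item] = (low + high - rating) if item in POSITIVE_ITEMS else rating
    return recoded, sum(recoded.values())


# A respondent who strongly agrees with every positive item and
# strongly disagrees with every negative item (highest self-esteem).
raw = {i: (4 if i in POSITIVE_ITEMS else 1) for i in range(1, 11)}
recoded, total = score_rses(raw)
print(total)
```

Under this coding the respondent with the highest possible self-esteem receives the minimum total, which matches the direction of scoring used in the paper.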

Procedure

Participants were African American college students who attended one of the five universities. Before the survey began, a research coordinator contacted university counseling centers to explain the purpose of the study. After understanding the purpose of this study, participants signed a consent form prior to their participation. They then were asked to complete the demographic questionnaire and RSES. Participants could enter a lottery for a gift card of $20, one given per campus.

Results

Question 1

What is the structure of the RSES? Is the measure unidimensional? To answer Question 1, we evaluated dimensionality using a classical test theory principal components analysis, a Rasch principal components analysis of residuals, and item fit and separation.

Exploratory factor analyses. Exploratory factor analysis (EFA) using principal axis factoring of the calibration and validation subsamples was conducted. Prior to EFA, assumptions were tested. Both samples were suitable for factor analysis, based on Kaiser-Meyer-Olkin indices of .87 and .88, respectively, for the calibration and validation subsamples. Bartlett’s test of sphericity, converted to a chi-square statistic, was statistically significant at p < .001 for both samples, indicating that the correlation matrix did not come from a population in which the correlation matrix is an identity matrix and that an EFA was useful; the sample size was large enough to allow dimension structure analysis. A Cattell’s scree plot suggested that a one-, two-, or three-dimension solution might be interpretable, though there was a distinct elbow after the first dimension. Results of a parallel analysis suggested that only one dimension be interpreted. We conducted the analysis using both orthogonal and oblique rotations for one- and two-dimension solutions and obtained similar results. The EFA results were used as an indicator of how many dimensions to expect (one in this case). We also used factor loadings to identify potentially problematic items and found no problematic items in the EFA. The Appendix provides brief item summary information from the EFA. We decided that one dimension was adequate to explain variance in the item intercorrelation matrix for this sample. No item loaded at less than .51 on the first dimension for the calibration subsample (factor loadings from .51 to .69; first eigenvalue = 4.12, second eigenvalue = 1.11); for the validation subsample, all items loaded at more than .43 (.43 to .68; first eigenvalue = 4.29, second eigenvalue = 1.26). Variance explained by the first dimension was 41% for the calibration and 43% for the validation subsamples. Little support was found for a second dimension in the EFA, as all items loaded adequately on a first dimension.
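The reported percentages of variance can be checked against the reported eigenvalues with simple arithmetic. The sketch assumes the eigenvalues are those of the 10 × 10 item correlation matrix, so that total variance equals the number of items; the function name is hypothetical.

```python
def percent_variance(eigenvalue, n_items):
    """Percent of total variance for one dimension of a correlation matrix."""
    return 100.0 * eigenvalue / n_items


N_ITEMS = 10  # the RSES has 10 items

calibration_pct = percent_variance(4.12, N_ITEMS)
validation_pct = percent_variance(4.29, N_ITEMS)

# Ratio of first to second eigenvalue, a rough descriptive index of dominance.
calibration_ratio = 4.12 / 1.11
validation_ratio = 4.29 / 1.26

print(round(calibration_pct, 1), round(validation_pct, 1))
print(round(calibration_ratio, 2), round(validation_ratio, 2))
```

Under that assumption the first eigenvalues correspond to about 41% and 43% of the variance, in line with the figures given in the text.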

Rasch principal components analysis of residuals. We performed an appraisal to evaluate whether the data fit the model using principal components analysis of residuals (PCAR). For the calibration sample, the measure explained 41.2% of the variance, with the unexplained variance in the first contrast having an eigenvalue of 1.7 and 9.5% unexplained variance. For the validation subsample, the measure explained 42.9% of the variance, with the unexplained variance in the first contrast having an eigenvalue of 2.0 and 11.5% unexplained variance. We also combined the calibration and validation subsamples into a total sample. For the combined sample, the measure explained 45.4% of the variance, with the unexplained variance in the first contrast having an eigenvalue of 1.8 and 10.1% unexplained variance. The higher unexplained variance could be the result of the potential second dimension seen in previous research studies, but with only 10 items in the measure a higher proportion of variance in the first contrast is not unexpected. The measure generally meets the expectations set by Linacre (2012) for unidimensionality. However, further diagnostic information regarding dimensionality was reviewed. The disattenuated correlation between item clusters approached 1.0, indicating that person estimates based on different item subsets were highly correlated. This diagnostic information supports a decision to treat the measure as unidimensional.

Item fit to a Rasch rating scale model. Item fit was examined to ensure that each item fit the Rasch model. According to Wright and Linacre (1994), the infit MNSQ values for a rating scale such as this should be between .6 and 1.4. Item MNSQ infit for both samples ranged from .69 to 1.45, with one item exceeding the 1.4 rule of thumb cut-off. In the calibration subsample, no item showed a standardized MNSQ infit exceeding 1.4, so no items were marked for deletion in that analysis. An infit of 1.45 for Item 5 for the validation sample was of concern, but the outfit MNSQ was within the proper range and the infit value in the calibration sample was 1.22, so the item was retained. Also, Item 8 had an infit MNSQ of 1.31 for the calibration sample and 1.43 for the validation sample. Items 5 and 8 were found to be problematic in previous studies (Baranik et al., 2008; Mannarini, 2010), but because they seem to contribute when considering both infit and outfit MNSQ in both samples, we chose to keep them. Based on results for both the calibration and validation samples and analysis of the samples combined, all of the 10 items in this survey fit the model at an acceptable level, with the worst fit for Items 5 and 8. Thus, results of both the EFA and PCAR suggest reasonable fit to a unidimensional model.

Table 1
Subsample Demographics

| Variable | Calibration sample (n = 255) | Validation sample (n = 269) |
|---|---|---|
| Age, mean | 20.56 | 20.67 |
| Age, range | 17–38 | 17–45 |
| % female | 65.5% | 68.8% |
| % been in therapy for mental illness | 10.6% | 9.7% |
| % on medication for mental illness | .8% | 1.1% |
| % been hospitalized for mental illness | 1.2% | 1.1% |
| % attempted suicide | 3.5% | 4.8% |
| % family history of “emotional” or “mental” disorder | 19.6% | 17.1% |

We also examined point-measure correlations, which ranged from .47 to .69. The item with the lowest point-measure correlation for both samples, Item 4, still had adequate fit (MNSQ infit and outfit �1.0) in the two subsamples. This item fell in the middle of the other items in terms of difficulty and appeared to perform in a similar manner to the other items.

Displacement of item positions was assessed by anchoring item positions on the calibration subsample and then computing the validation subsample item positions against those calibration anchors. There were no displacement values exceeding .5 between the subsamples, indicating no substantial displacement of items.

Examination of the construct laid out in the item-person map (see Figure 1) shows a progression from respect for self as easiest to endorse to feeling like a failure as most difficult to endorse.

Question 2

Is the use of the rating response scale consistent and appropriate?

In terms of scale use, the scale appears to be used in the way that the RSES author (i.e., Rosenberg, 1965) intended. Response scale use reflected a less-to-more interpretation of the rating scale, but the proportion of the sample with responses in the upper categories was low. Category probability curves for the calibration subsample (see Figure 2; the plot for the validation sample was very similar and so is not provided) indicate that there was a smooth distribution of categories with clearly advancing steps. There was a minor reversal in the observed average at Step 3 for the calibration subsample, likely because this category was infrequently used. A reversal occurs when a lower category has a higher logit position than does a higher category, and this reversal contradicts the notion of a less-to-more use of the scale. (When the data were analyzed with the two subsamples combined, a reversal was still found.) The structure calibration showed no disorder, with steps going from lowest value to highest value. To ensure that each step fits the model, the infit MNSQ value should be from .5 to 2.0. The infit MNSQ values between .80 and 1.89 (Table 3) provide additional support that the RSES response scale functions appropriately. Table 3 provides observed average and step structure values by category for the calibration subsample, the validation subsample, and the two subsamples combined. Category 4 had the highest MNSQ fit values (1.89, 1.57, and 1.72, respectively), but this is not surprising when examining the observed counts. The participants in this study used the lower categories (i.e., 1 and 2), which endorse higher levels of self-esteem, more frequently than they used the higher categories. In fact, Category 4 was used in only 2% of the responses. This shows high levels of self-esteem for this sample, which is seen in the item-person map (see Figure 1) as well. While the lowest category response was appropriate for most of the sample, there were cases where self-esteem was higher than could be captured by even the lowest category response. Higher categories (indicating lower self-esteem) were not used to any extent by this sample.

Figure 1. Item-person map for the calibration subsample. Each “#” in the person column is three persons and each “.” is one to two persons; entries in the figure are RSES item numbers. The x’s in the second column are located at the points on the latent variable where a person would have a 50% chance of being observed in the bottom category and a 50% chance of being observed in a higher category (e.g., the “item difficulty” of the bottom category); the third column x’s mark the item logit measure, and the x’s in the fourth column are located at the points on the latent variable where a person would have a 50% chance of being observed in the top category and a 50% chance of being observed in a lower category.

For rating scales with very low endorsement of rating scale categories (here 2% endorsement of Category 4), collapsing the scale is sometimes recommended (e.g., Royal et al., 2010). We therefore collapsed the rating scale from four to three points, combining Categories 3 and 4, which remedied the minor reversal in observed average found in the calibration subsample. We conducted this analysis for both subsamples and for the full sample. Results of this change in the rating scale are shown in Table 3, and effects on the overall scale statistics are shown in Table 2. Scale and item statistics improved slightly with this change to the rating scale, with a slightly higher person separation, reliability of person separation, and Cronbach’s alpha.
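The recoding step itself is simple and can be sketched as below. The function name is hypothetical, and the response vector is rebuilt from the calibration-sample category counts in Table 3 purely to show that merging Categories 3 and 4 reproduces the collapsed count reported there.

```python
from collections import Counter


def collapse_top_categories(ratings, merge_from=3):
    """Recode every rating at or above merge_from to merge_from (here 4 -> 3)."""
    return [min(rating, merge_from) for rating in ratings]


# Calibration-sample observed counts for the original 4-category scale (Table 3).
counts_4cat = {1: 1536, 2: 755, 3: 204, 4: 49}

# Expand the counts into a flat list of ratings, then collapse and recount.
ratings = [category for category, n in counts_4cat.items() for _ in range(n)]
counts_3cat = Counter(collapse_top_categories(ratings))

print(dict(counts_3cat))
```

The merged top category holds 204 + 49 = 253 responses, matching the collapsed calibration row in Table 3; the Rasch model would then be re-estimated on the recoded data.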

Question 3

What measurement gaps and redundancies exist along the RSES continuum and do these potential gaps indicate the need for adding or deleting items?

Figure 2. Category probability curves. Step calibrations, also called Andrich thresholds, are the points at which the probability curves of adjacent categories intersect.

Table 2
Dimensionality, Item Fit, and Separation

| Index | Calibration: original 4-category scale | Calibration: collapsed 3-category scale | Validation: original 4-category scale | Validation: collapsed 3-category scale | Combined: original 4-category scale | Combined: collapsed 3-category scale |
|---|---|---|---|---|---|---|
| Dimensionality: eigenvalue for first contrast | 1.7 | 1.7 | 2.0 | 2.0 | 1.9 | 1.8 |
| Mean MNSQ infit | 1.07 | 1.04 | 1.06 | 1.08 | 1.06 | 1.03 |
| SD MNSQ infit | .71 | .47 | .67 | .51 | .69 | .48 |
| Mean MNSQ outfit | 1.04 | 1.03 | 1.01 | 1.06 | 1.02 | 1.01 |
| SD MNSQ outfit | .77 | .65 | .68 | .59 | .72 | .57 |
| Real person separation | 1.38 | 1.56 | 1.50 | 1.67 | 1.44 | 1.61 |
| Real person root mean square error | .75 | .78 | .73 | .79 | .74 | .77 |
| Real reliability of person separation | .66 | .71 | .69 | .74 | .67 | .72 |
| Cronbach’s alpha | .82 | .84 | .84 | .85 | .83 | .85 |
| Person logit mean | -2.23 | -1.51 | -2.18 | -1.50 | -2.20 | -1.47 |
| Real item separation | 6.05 | 6.31 | 5.42 | 6.27 | 8.18 | 8.44 |
| Real item root mean square error | .13 | .14 | .13 | .14 | .09 | .10 |
| Real reliability of item separation | .97 | .98 | .97 | .98 | .99 | .99 |

Note. Mean MNSQ infit measures average deviation from the measurement model and provides sensitivity to on-target (i.e., midrange) observations. Mean MNSQ outfit measures deviation from the measurement model and provides sensitivity to off-target, extreme responses. Real person/item separation is the ratio of the true standard deviation (SD adjusted for measurement error) to the error standard deviation (root mean square error). Real separation is computed on the basis that misfit is due to departures in the data from model specifications. Real person/item root mean square error = standard error of the measure inflated for misfit. Real reliability of person/item separation = Separation²/(1 + Separation²). Person logit mean is the average logit position of all persons whose position could be calibrated.
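The relation between separation and reliability given in the table note can be verified directly. The sketch below assumes the operator lost in extraction is a plus sign, that is, reliability = Separation²/(1 + Separation²); the person separation and reliability pairs are copied from Table 2, and the function name is hypothetical.

```python
def separation_reliability(separation):
    """Reliability implied by a separation index: G^2 / (1 + G^2)."""
    return separation ** 2 / (1 + separation ** 2)


# (real person separation, reported real reliability) pairs from Table 2.
person_pairs = [
    (1.38, 0.66), (1.56, 0.71),  # calibration: 4-category, 3-category
    (1.50, 0.69), (1.67, 0.74),  # validation: 4-category, 3-category
    (1.44, 0.67), (1.61, 0.72),  # combined: 4-category, 3-category
]

for separation, reported in person_pairs:
    implied = round(separation_reliability(separation), 2)
    print(separation, reported, implied)
```

Each implied value rounds to the reliability reported in the table, which supports reading the formula with a plus sign in the denominator.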


In terms of targeting and construct coverage, the Rasch model allows us to put items and persons on the same scale to understand how the scale is functioning for this sample. This scale measures African American college students’ perceptions of self-esteem. The persons near the top, on the left-hand side of Figure 1, are those who perceived their self-esteem to be low, while persons near the bottom are those who had high self-esteem. Figure 1 is the map for the calibration subsample; a very similar display was found for the validation subsample. Persons are not spread evenly throughout the map; there are numerous persons at the low end of the trait continuum whose positions are not assessed well by the items and the response scale used. The person mean was -1.74 for the calibration subsample, -1.82 for the validation subsample, and -1.47 for the two subsamples combined. If this scale were to be revised, some items at the same logit position might be deleted and replaced with very easy items to extend construct coverage and to measure participants with high self-esteem better. They could also be replaced with some items at intermediate positions to fill in gaps, though items are fairly well distributed along the continuum covered. This is an indication that this measure may not be ideal for African Americans. The distribution of items, seen on the right-hand side of Figure 1, had values between -2 and 1, showing that this is a good measure along this range.

Although the RSES has been shown to be reliable in previous studies, our purpose was to verify that it is a reliable measure specifically for African American college students. Reliability is measured by computing person and item spread across the measure. Regarding person separation, a separation of at least 2.0 is desirable, and higher levels of separation indicate a greater spread of items and persons. In the calibration subsample, person separation was 1.61, with reliability of person separation of .66 and a Cronbach’s alpha of .82. Person separation was 1.72 for the validation subsample, with reliability of person separation of .69 and a Cronbach’s alpha of .84. For the combined samples, person separation was 1.44, with reliability of person separation of .67 and a Cronbach’s alpha of .83. These are indicators that this is a reliable measure for this sample of African Americans but with minimal separation.

Table 3
Calibration, Validation, and Full Sample Category Structure

Calibration sample with 4-category scale

| Category | Observed count | Observed % | Observed average | Sample expect | Infit MNSQ | Outfit MNSQ | Step structure | Category measure |
|---|---|---|---|---|---|---|---|---|
| 1 | 1,536 | 60 | -3.03 | -2.99 | .92 | .94 | NONE | (-2.96) |
| 2 | 755 | 30 | -1.57 | -1.62 | .91 | .95 | -1.78 | -.83 |
| 3 | 204 | 8 | -.19 | -.45 | .80 | .71 | .29 | .95 |
| 4 | 49 | 2 | -.40 | .56 | 1.89 | 2.76 | 1.49 | (2.77) |

Calibration sample with Categories 3 and 4 collapsed

| Category | Observed count | Observed % | Observed average | Sample expect | Infit MNSQ | Outfit MNSQ | Step structure | Category measure |
|---|---|---|---|---|---|---|---|---|
| 1 | 1,536 | 60 | -2.44 | -2.42 | .96 | .97 | NONE | (-2.27) |
| 2 | 755 | 30 | -.75 | -.81 | .93 | .96 | -1.10 | .00 |
| 3 | 253 | 10 | .81 | .89 | 1.11 | 1.24 | 1.10 | (2.27) |

Validation sample with 4-category scale

| Category | Observed count | Observed % | Observed average | Sample expect | Infit MNSQ | Outfit MNSQ | Step structure | Category measure |
|---|---|---|---|---|---|---|---|---|
| 1 | 1,551 | 58 | -3.03 | -3.00 | .96 | .97 | NONE | (-3.14) |
| 2 | 882 | 33 | -1.57 | -1.60 | .91 | .89 | -1.98 | -.83 |
| 3 | 196 | 7 | -.24 | -.43 | .86 | .86 | .51 | 1.04 |
| 4 | 48 | 2 | -.12 | .53 | 1.57 | 2.15 | 1.47 | (2.79) |

Validation sample with Categories 3 and 4 collapsed

| Category | Observed count | Observed % | Observed average | Sample expect | Infit MNSQ | Outfit MNSQ | Step structure | Category measure |
|---|---|---|---|---|---|---|---|---|
| 1 | 1,551 | 58 | -2.51 | -2.54 | 1.03 | 1.03 | NONE | (-2.51) |
| 2 | 882 | 33 | -.80 | -.80 | 1.00 | .99 | -1.36 | .00 |
| 3 | 244 | 9 | .91 | 1.02 | 1.14 | 1.25 | 1.36 | (2.51) |

Full sample with 4-category scale

| Category | Observed count | Observed % | Observed average | Sample expect | Infit MNSQ | Outfit MNSQ | Step structure | Category measure |
|---|---|---|---|---|---|---|---|---|
| 1 | 3,087 | 59 | -3.02 | -2.99 | .94 | .95 | NONE | (-3.05) |
| 2 | 1,637 | 31 | -1.56 | -1.60 | .91 | .91 | -1.87 | -.83 |
| 3 | 400 | 8 | -.22 | -.44 | .83 | .79 | .40 | .99 |
| 4 | 97 | 2 | -.27 | .54 | 1.72 | 2.46 | 1.47 | (2.77) |

Full sample with Categories 3 and 4 collapsed

| Category | Observed count | Observed % | Observed average | Sample expect | Infit MNSQ | Outfit MNSQ | Step structure | Category measure |
|---|---|---|---|---|---|---|---|---|
| 1 | 3,087 | 59 | -2.41 | -2.39 | .96 | .97 | NONE | (-2.35) |
| 2 | 1,637 | 31 | -.76 | -.81 | .93 | .92 | -1.19 | .00 |
| 3 | 497 | 10 | .79 | .87 | 1.11 | 1.23 | 1.19 | (2.35) |

Note. Observed count is the number of all responses to a category. Observed percentage is the percent of all responses in that category. Observed average is the average of the measures that are modeled to produce the responses observed in the category. Infit MNSQ is the average of the infit MNSQs associated with responses in that category. Step structure is the logit position at which transition is made from a lower category to this category.

Table 4
Results of Independent Samples t Tests for Medication Use and Attempted Suicide

| Medication and attempted suicide | n | Mean | Standard deviation |
|---|---|---|---|
| On medication? Yes | 5 | -.99 | 1.51 |
| On medication? No | 518 | -2.62 | 1.64 |
| Attempted suicide? Yes | 22 | -1.02 | 1.11 |
| Attempted suicide? No | 501 | -2.68 | 4.20 |

Question 4

Is differential item functioning found between groups defined by sex and having received therapy or professional assistance?

Regarding invariance, differential item functioning (DIF) was examined to ensure that the items were not functioning differently between groups. DIF introduces measurement bias, which occurs when people from different groups (e.g., defined by sex) have a different probability of giving a certain response on the RSES. Specifically, if an item measures the same content of self-esteem across African American men and women students, then, except for random variations, the same or similar pattern of item scores in self-esteem should be found irrespective of the nature of the group. Items in the RSES that give different score patterns for African American men and women students display DIF (Holland & Wainer, 1993), and items displaying DIF would normally be revised or discarded. Indeed, DIF in psychometric tests has long been recognized as a potential source of bias in person measurement (Lord, 1980). We assessed DIF with a t test of the significance of differences in item logit position and also used the Mantel-Haenszel test (Holland & Wainer, 1993). Evaluated at a significance level of .01, there was no DIF between male and female participants, or between those who had and had not received therapy or professional assistance, with either statistical test. Thus, no items in the RSES displayed DIF, and no items would be revised or discarded. These comparisons were assessed for both subsamples with the same results. This means that we can consider the measure as invariant across those variables (Lord, 1980).
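The t test on item logit positions can be sketched as a contrast divided by its joint standard error. This is an illustration under stated assumptions rather than the exact Winsteps procedure: the item positions and standard errors are invented, and the p value uses a normal approximation instead of the t distribution, which is reasonable only for large groups.

```python
import math
from statistics import NormalDist


def dif_contrast(d_group1, se1, d_group2, se2):
    """Return the DIF contrast, its t statistic, and an approximate two-sided p value."""
    contrast = d_group1 - d_group2               # difference in item logit position
    joint_se = math.sqrt(se1 ** 2 + se2 ** 2)    # standard error of the difference
    t = contrast / joint_se
    p = 2 * (1 - NormalDist().cdf(abs(t)))       # normal approximation (assumption)
    return contrast, t, p


# Hypothetical item positions (logits) and standard errors for two groups.
contrast, t, p = dif_contrast(d_group1=0.40, se1=0.12, d_group2=0.10, se2=0.20)
shows_dif = p < 0.01  # significance level used in the paper

print(round(contrast, 2), round(t, 2), round(p, 3), shows_dif)
```

With these invented values the item would not be flagged at the .01 level; in practice the comparison would be repeated for every item and each grouping variable.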

Question 5

Are there differences or correlations for RSES raw scores and logit person position by sex, age, and variables related to mental health?

Relationships with background variables. The Rasch person logit position was used to assess relationships and differences with background variables. No significant differences were found for sex. No relationship was found with age.

Differences in person logit position. Because the mental health questions yielded unbalanced groups in their responses, with only 2% responding yes on some questions, the differences in groups were assessed with t tests using the entire sample. A person logit position score was calculated for each person, and then five separate t tests were conducted. Assumptions of normality and homogeneity of variance were checked and upheld, with the exception of a slight violation of normality for the hospitalization and family emotional/mental disorder items. There was no statistically significant difference found for “Have you been in therapy before or received any professional assistance?” or “Does any member of your family suffer from an emotional or mental disorder?” or “Have you ever been hospitalized for psychological/psychiatric problems?” There was a statistically significant difference for the question “Are you on medication related to mental illness?”, t(521) = -2.22, p = .027, d = 1.03, with a higher logit position for those answering yes (see Table 4), and therefore lower self-esteem. There were few persons responding “yes” to this question, however. There was also a statistically significant difference for the question “Have you ever attempted suicide?”, t(506) = -4.82, p < .001, d = .54, with a higher mean logit position for those answering yes (see Table 4), and therefore lower self-esteem. This means that those who were on medication for a mental illness or had attempted suicide reported lower self-esteem than those who had not. We controlled the Type I error rate when we conducted the five t tests. If a Benjamini-Hochberg (Benjamini & Hochberg, 1995) control for Type I error was used, only the result pertaining to suicide attempts would be declared significant.
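The Benjamini-Hochberg step-up procedure mentioned above can be sketched as follows. Only two p values are reported in the text (.027, and a value below .001 that is represented here by .001 as an upper bound); the three non-significant p values are placeholders invented for illustration, and the overall level of .05 is an assumption.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return {label: True if rejected} under the Benjamini-Hochberg procedure."""
    ranked = sorted(p_values.items(), key=lambda pair: pair[1])
    m = len(ranked)
    # Find the largest rank k whose p value is at or below (k / m) * alpha.
    cutoff_rank = 0
    for rank, (_, p) in enumerate(ranked, start=1):
        if p <= rank / m * alpha:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below that cutoff.
    return {label: rank <= cutoff_rank
            for rank, (label, _) in enumerate(ranked, start=1)}


p_values = {
    "attempted suicide": 0.001,  # reported as p < .001; upper bound used here
    "on medication": 0.027,      # reported
    "therapy": 0.20,             # hypothetical placeholder (reported only as n.s.)
    "family history": 0.35,      # hypothetical placeholder (reported only as n.s.)
    "hospitalized": 0.50,        # hypothetical placeholder (reported only as n.s.)
}

decisions = benjamini_hochberg(p_values)
print(decisions)
```

With five tests, the second-smallest p value must be at or below .02 to be retained, so .027 is not; any placeholder values above .05 lead to the same outcome, with only the suicide-attempt comparison remaining significant.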

Discussion

The purpose of this study was to evaluate the use of the RSES with African American clients by using Rasch analysis as well as EFA. Our analyses provide several unique contributions to understanding the functionality of the RSES not present in earlier research. First, the RSES appeared to have adequate measurement properties for two subsamples, and we found that the RSES was reasonably unidimensional for African Americans. Additionally, our findings showed that self-esteem was quite high in these African American undergraduates. However, we also found several issues that may affect African American students’ self-esteem. First, those who were on medication for a mental disorder had lower self-esteem than those who were not on medication. Second, although it is not surprising that those who have attempted suicide had lower self-esteem, this finding could serve as an alert to educators, practitioners, and parents that lower levels of self-esteem may make African American students vulnerable to psychological risks such as a suicide attempt.

A question remains of whether the RSES targets this population well. Our findings indicate that the fit between this scale and African American students was not strong unless some questions are added to widen the “ruler” on the higher self-esteem side. This offset in targeting has been noted in other studies (e.g., Classen, Velozo, & Mann, 2007). If the RSES is used diagnostically to help identify African American students with low self-esteem, then there is no need to add items. However, the current scale does not adequately reflect African American students with high self-esteem and therefore needs revision if it is to be used normatively, as was its original intent. An alternative to changing items, however, might be to modify the response scale. An unbalanced response scale might be used in future research, with response options such as extremely strongly agree, strongly agree, somewhat agree, agree, disagree, which could extend the reach of the scale to higher levels of self-esteem.
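Why poor targeting limits measurement can be illustrated with the Andrich rating scale model (see Wright & Masters, 1982), in which the probability of each response category depends on the distance between the person measure and the item location. The sketch below is illustrative only: the function name, the threshold values, and the person and item locations are hypothetical and are not estimates from this study.

```python
import math


def category_probabilities(theta, delta, thresholds):
    """Rating scale model: P(X = k) for k = 0..len(thresholds).

    theta: person measure (logits); delta: item location (logits);
    thresholds: step thresholds tau_1..tau_m shared across items.
    """
    # Cumulative sums of (theta - delta - tau_j); category 0 has an empty sum.
    cumulative = [0.0]
    for tau in thresholds:
        cumulative.append(cumulative[-1] + (theta - delta - tau))
    numerators = [math.exp(c) for c in cumulative]
    total = sum(numerators)
    return [n / total for n in numerators]


# Hypothetical values for illustration only (not estimates from this study).
thresholds = [-1.5, 0.0, 1.5]  # three thresholds give four response categories
item_location = 0.0            # item location in logits

for person in (0.0, 2.0, 4.0):  # person measures at increasing distance from the item
    probs = category_probabilities(person, item_location, thresholds)
    print(person, [round(p, 2) for p in probs])
```

As the person measure moves away from the item location, responses concentrate in one extreme category, so the item contributes little to separating persons in that region. Adding items located nearer those persons, or adding response categories with more extreme thresholds, is what widening the “ruler” amounts to in this model.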

Our findings show that some items (Items 5 and 8) were problematic for one of the two samples, although not for both, and that these items have been problematic not only in our Rasch analysis but also in the literature. That the same items are problematic in our analysis and in the literature suggests that concerns with those items could exist across populations. As scholars (e.g., Jackson et al., 2004; Rosenberg, 1965) have suggested that African Americans’ self-esteem should be understood within their own cultural context, Item 5, “I feel I do not have much to be proud of,” seemed inconsistent with African Americans’ experiences, in which success in the everyday world is interpreted as a way to demonstrate confidence and empowerment. Additionally, Item 8, “I wish I could have more respect for myself,” may miss African Americans’ racial pride (Jackson et al., 2004).

Because African American clients have been reported to have low utilization of mental health services (Constantine, 2007), we hope that our results regarding the fit of the RSES for African American clients can help psychologists provide services that are sensitive to them. Specifically, African American clients are appropriately cautious about seeking mental health services, including psychological assessment and testing. Historically, those individuals who sought services were pathologized, overmedicated, and exposed to insensitive therapists who did not appreciate the role of African American culture in clients’ problems. Psychologists who use the RSES with African American clients should be aware of the limitations in using these two items to describe African American clients’ self-esteem. Additionally, as African American students’ college enrollment increased from 10% to 15% between 2002 and 2012 (U.S. Department of Education, 2015), our results can help people understand African American students’ self-esteem and further consider whether and how the RSES describes their self-esteem based on their worldviews.

The RSES reliability found in this study is similar to that found in previous African American samples, with alphas ranging from .83 to .86 (e.g., Mobley et al., 2005; Utsey, Ponterotto, Reynolds, & Cancelli, 2000). The self-esteem of the African American students enlisted in this study was higher than that reported by Franck, De Raedt, Barbez, and Rosseel (2008) for a sample of Dutch adults, higher than that reported by Richardson, Ratner, and Zumbo (2009) for a sample of British Columbia adults, and lower than that reported by Elion, Wang, Slaney, and French (2012) for two samples of African American university students from Mid-Atlantic and Southern universities. It is recommended that future research continue with the goal of providing normative information on RSES scores by categories of demographic variables.

Importantly, Figure 1 shows that the items are distributed between −2 and 1, and this distribution indicates that the RSES is a good measure along this range. Yet, if the RSES were to be revised, some questions (e.g., “I feel I do not have much to be proud of”) need to be considered for revision, such as deletion or expanding the range. There is a noticeable problem when looking at the items and persons on the same map. The items are not sufficiently different to provide coverage of the construct for the entire population, particularly those with high self-esteem. The items measure the portion of the population with lower self-esteem, but the portion with higher self-esteem is not as well measured. This is problematic because the lower portion of the graph (high self-esteem) is where the majority of this sample lies. More items need to be added to better measure and separate persons with higher self-esteem, since it appears that African American college students have high self-esteem in general, a finding supported by Hatcher (2007). In short, items are needed that expand coverage of the construct. Because the RSES is targeted to students with lower self-esteem, it could be a supplement for diagnosis. It is important to know that African American students’ high self-esteem may relate to their racial pride (Rowles & Duan, 2012). For African Americans, higher levels of self-esteem may be related to positive experiences of their cultural heritage, which may not be clearly measured by the items that Rosenberg (1965) developed for a global definition of self-esteem. When African American students have a proud, informed, and sober perspective of their race/ethnicity, they are more likely to experience increased success in academic work. That is, higher self-esteem for African Americans may imply that they engage in activities that promote feelings of racial knowledge, pride, and connection. The high self-esteem of African American students indicates that positive racial socialization may be beneficial to the mental health of African American youth and underscores the possible limitations of applying Rosenberg’s (1965) RSES. Thus, the validity of the RSES for this group needs to be examined, and if the RSES is used to assess students with high self-esteem, items should be added to capture high levels of self-esteem.

Psychologists and scholars often use the RSES to measure African Americans just as they measure the general population, without considering how well targeted the RSES is for their African American clients (see Figure 1). It should be noted that the normative sample of the RSES came from the general population, who may have had very different life experiences than African Americans. According to our results, when addressing African Americans’ self-esteem, it is important to include their negative life experiences and positive protective factors (Gibbs, 1997). For example, the U.S. Surgeon General (2001) indicated that African Americans have a significant sphere of collectively defined interests instead of suffering from low self-esteem. Such psychological resources have enabled many African Americans to overcome adversity and remind researchers and psychologists to understand African American self-esteem from their cultural framework.

Limitations and Suggestions for Future Research

Our study had three limitations, which lead us to suggest directions for future research. First, this study was based on self-reported data, in which participants may interpret the items differently from the original intentions of Rosenberg (1965). Future studies can incorporate other scales, such as a social desirability scale, to control for participants’ tendency to present their self-image positively. Future studies can also include African American-centered items in which African American self-esteem is understood in its own black community cultural setting. Second, although our study revealed that those who were on medication or had attempted suicide had lower levels of self-esteem, this study was not designed to explore causation; that is, to decide whether low self-esteem caused suicide attempts or vice versa (i.e., a suicide attempt caused low self-esteem). Future studies are needed to examine the relationship between medication, suicide attempts, and self-esteem.

Future research should continue examination of the dimensional structure of the RSES along with item functioning. It is recommended that two new items be proposed to replace Items 5 and 8, which have been found to function poorly in several studies. While these items fit adequately in this study, replacement of these two items by items reflecting higher levels of self-esteem would expand construct coverage and enhance the scale’s use as a normative measure. Specifically, the added items should reflect self-esteem based on African American heritage or racial pride. Some suggested items could be “I feel proud of being an African American” or “I have a positive attitude toward my culture.” Also, overall scale indices (e.g., reliability of person separation) improved slightly when the rating scale was collapsed from four to three points. Future studies should examine whether this result is confirmed with new participant samples. Future studies could also use an unbalanced rating scale to extend the measure to better assess higher self-esteem.
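Collapsing a rating scale from four to three points amounts to recoding responses before the Rasch model is re-estimated. The sketch below shows one possible recoding. It is an assumption for illustration: this section does not state which adjacent categories were merged, so the mapping (merging the two lowest-coded categories), the 0 to 3 coding, and the responses are all hypothetical.

```python
from collections import Counter

# Hypothetical recoding: merge the two lowest of four categories (coded 0-3)
# into one, giving three categories (coded 0-2). The merge actually used in
# the study is not stated in this section.
COLLAPSE_MAP = {0: 0, 1: 0, 2: 1, 3: 2}


def collapse(responses, mapping=COLLAPSE_MAP):
    """Recode each response with the mapping; unknown codes raise KeyError."""
    return [mapping[r] for r in responses]


# Made-up responses to a single item, for illustration only.
responses = [3, 3, 2, 3, 1, 2, 3, 0, 2, 3]
recoded = collapse(responses)

print(Counter(responses))  # category counts before collapsing
print(Counter(recoded))    # category counts after collapsing
```

Merging sparsely used adjacent categories pools their observations, which tends to stabilize threshold estimates. Whether indices such as person separation reliability improve must be checked by refitting the model to the recoded data.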

Finally, as several scholars (e.g., Broman et al., 2010) have noted, most studies on self-esteem have been conducted with Eurocentric theories of causation and effects. To make the self-esteem items more attuned to African Americans, future studies can consider self-worth and self-value items pertaining to African Americans within the historical context of African American slavery, of postslavery racism, and of today’s environmental conditions. To further advance our knowledge about the self-esteem of other ethnic/racial groups, further studies can conduct Rasch analysis of the RSES with members of different ethnic/racial groups.

In conclusion, we conducted EFA and Rasch analysis to evaluate the appropriate use of the RSES with African American students. Because of the social and cultural experiences African American students have, it is necessary to evaluate the appropriateness of the RSES for this population. With EFA and Rasch analysis, we found that the RSES could serve as a supplement to the diagnosis of African American students with low self-esteem. However, for African American students with higher self-esteem, we suggest adding or revising items to include culture-related strengths or racial pride to better describe the perception of self-esteem from an African American cultural position. As higher categories of the scale were infrequently used, we also recommend that future research review psychometric properties when an unbalanced rating scale is employed.

References

Aluja, A., Rolland, J.-P., García, L. F., & Rossier, J. (2007). Dimensionality of the Rosenberg Self-Esteem Scale and its relationships with the three- and the five-factor personality models. Journal of Personality Assessment, 88, 246–249. http://dx.doi.org/10.1080/00223890701268116

Alwin, D. F., & Jackson, D. J. (1981). Application of simultaneous factor analysis to issues of factorial invariance. In D. J. Jackson & E. F. Borgatta (Eds.), Factor analysis and measurement in sociological research: A multidimensional perspective (pp. 249–279). Beverly Hills, CA: Sage.

Baranik, L. E., Meade, A. W., Lakey, C. E., Lance, C. E., Hu, C., Hua, W., & Michalos, A. (2008). Examining the differential item functioning of the Rosenberg Self-Esteem Scale across eight countries. Journal of Applied Social Psychology, 38, 1867–1904.

Benjamini, Y., & Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society Series B. Methodological, 57, 289–300.

Bond, T. G., & Fox, C. M. (2007). Applying the Rasch model: Fundamental measurement in the human sciences (2nd ed.). Mahwah, NJ: Erlbaum.

Broman, C. L., Torres, M., Canady, R. B., Neighbors, H. W., & Jackson, J. S. (2010). Race and ethnic self-identification influences on physical and mental health statuses among blacks. Race and Social Problems, 2, 81–91. http://dx.doi.org/10.1007/s12552-010-9032-0

Chao, R. C.-L., & Green, K. E. (2013). Rasch analysis of the outcome questionnaire with African Americans. Psychological Assessment, 25, 568–582. http://dx.doi.org/10.1037/a0032083

Chao, R. C.-L., Mallinckrodt, B., & Wei, M. (2012). Co-occurring presenting problems in African American college clients reporting racial discrimination distress. Professional Psychology: Research and Practice, 43, 199–207. http://dx.doi.org/10.1037/a0027861

Clark, R., Anderson, N. B., Clark, V. R., & Williams, D. R. (1999). Racism as a stressor for African Americans: A biopsychosocial model. American Psychologist, 54, 805–816. http://dx.doi.org/10.1037/0003-066X.54.10.805

Classen, S., Velozo, C. A., & Mann, W. C. (2007). The Rosenberg Self-Esteem Scale as a measure of self-esteem for the noninstitutionalized elderly. Clinical Gerontologist, 31, 77–93. http://dx.doi.org/10.1300/J018v31n01_06

Constantine, M. G. (2007). Racial microaggressions against African American clients in cross-racial counseling relationships. Journal of Counseling Psychology, 54, 1–16. http://dx.doi.org/10.1037/0022-0167.54.1.1

Constantine, M. G., & Blackmon, S. M. (2002). Black adolescents’ racial socialization experiences: Their relations to home, school, and peer self-esteem. Journal of Black Studies, 32, 322–335. http://dx.doi.org/10.1177/002193470203200303

Constantine, M. G., & Sue, D. W. (Eds.). (2006). Addressing racism: Facilitating cultural competence in mental health and educational settings. Hoboken, NJ: Wiley.

Crandal, B. R., Foster, S. L., Chapman, J. E., Cunningham, P. B., Brennan, P. A., & Whitmore, E. A. (2015). Therapist perception of treatment outcome: Evaluating treatment outcomes among youth with antisocial behavior problems. Psychological Assessment, 27, 710–725. http://dx.doi.org/10.1037/a0038555

Elion, A. A., Wang, K. T., Slaney, R. B., & French, B. H. (2012). Perfectionism in African American students: Relationship to racial identity, GPA, self-esteem, and depression. Cultural Diversity & Ethnic Minority Psychology, 18, 118–127. http://dx.doi.org/10.1037/a0026491

Franck, E., De Raedt, R., Barbez, C., & Rosseel, Y. (2008). Psychometric properties of the Dutch Rosenberg Self-Esteem Scale. Psychologica Belgica, 48, 25–35. http://dx.doi.org/10.5334/pb-48-1-25

Geisinger, K. F., Bracken, B. A., Carlson, J. F., Hansen, J. C., Kuncel, N. R., Reise, S. P., & Rodriguez, M. C. (2013). APA handbook of testing and assessment in psychology, Vol. 2: Testing and assessment in clinical and counseling psychology. Washington, DC: American Psychological Association. http://dx.doi.org/10.1037/14048-000

Gibbs, J. T. (1997). African American suicide: A cultural paradox. Suicide & Life-Threatening Behavior, 27, 68–79.

Glantz, M. D., & Johnson, J. L. (Eds.). (1999). Resilience and development: Positive life adaptations. New York, NY: Kluwer/Plenum Press.

Gray-Little, B., Williams, V. S. L., & Hancock, T. D. (1997). An item response theory analysis of the Rosenberg Self-Esteem Scale. Personality and Social Psychology Bulletin, 23, 443–451. http://dx.doi.org/10.1177/0146167297235001

Gutman, L. M., Friedel, J. N., & Hitt, R. (2003). Keeping adolescents safe from harm: Management strategies of African-American families in a high-risk community. Journal of School Psychology, 41, 167–184. http://dx.doi.org/10.1016/S0022-4405(03)00043-8

Hambleton, R. K., & Jones, R. W. (1993). Comparison of classical test theory and item response theory and their applications to test development. Educational Measurement: Issues and Practice, 12, 38–47. http://dx.doi.org/10.1111/j.1745-3992.1993.tb00543.x

Hatcher, J. (2007). The state of measurement of self-esteem of African American women. Journal of Transcultural Nursing, 18, 224–232. http://dx.doi.org/10.1177/1043659607301299

Hatcher, J., & Hall, L. A. (2009). Psychometric properties of the Rosenberg Self-Esteem Scale in African American single mothers. Issues in Mental Health Nursing, 30, 70–77. http://dx.doi.org/10.1080/01612840802595113

Holland, P. W., & Wainer, H. (1993). Differential item functioning. Hillsdale, NJ: Erlbaum.

Jackson, J. S., Torres, M., Caldwell, C. H., Neighbors, H. W., Nesse, R. M., Taylor, R. J., . . . Williams, D. R. (2004). The National Survey of American Life: A study of racial, ethnic and cultural influences on mental disorders and mental health. International Journal of Methods in Psychiatric Research, 13, 196–207. http://dx.doi.org/10.1002/mpr.177

Krause, N. (1983). The racial context of Black self-esteem. Social Psychology Quarterly, 46, 98–107. http://dx.doi.org/10.2307/3033846

Kuster, F., Orth, U., & Meier, L. L. (2012). Rumination mediates the prospective effect of low self-esteem on depression: A five-wave longitudinal study. Personality and Social Psychology Bulletin, 38, 747–759. http://dx.doi.org/10.1177/0146167212437250

Linacre, J. M. (1994). Sample size and item calibration stability. Rasch Measurement Transactions, 7, 328.

Linacre, J. M. (1997). KR-20/Cronbach Alpha or Rasch Person Reliability: Which tells the “truth?”. Rasch Measurement Transactions, 11, 580–581.

Linacre, J. M. (2004). Rasch model estimation: Further topics. Journal of Applied Measurement, 5, 95–110.

Linacre, J. M. (2012). A user’s guide to Winsteps Ministep 3.70.0: Rasch-model computer programs. Chicago, IL: Winsteps.

Linacre, J. M. (2014). Dimensionality: Contrasts and variance. Retrieved from www.winsteps.com/winman/principalcomponents.htm

Linacre, J. M. (2016). A user’s guide to Winsteps 3.70.0: Rasch-model computer programs. Chicago, IL: Winsteps.

Lincoln, K. D., & Chae, D. H. (2010). Stress, marital satisfaction, and psychological distress among African Americans. Journal of Family Issues, 31, 1081–1105. http://dx.doi.org/10.1177/0192513X10365826

Lord, F. M. (1980). Applications of item response theory to practical testing problems. Hillsdale, NJ: Erlbaum.

Luthar, S. S., Cicchetti, D., & Becker, B. (2000). The construct of resilience: A critical evaluation and guidelines for future work. Child Development, 71, 543–562. http://dx.doi.org/10.1111/1467-8624.00164

Mannarini, S. (2010). Assessing the Rosenberg Self-Esteem Scale dimensionality and items functioning in relation to self-efficacy and attachment styles. Testing Psicometria Metodologia, 17, 229–242.

Marsh, H. W., Scalas, L. F., & Nagengast, B. (2010). Longitudinal tests of competing factor structures for the Rosenberg Self-Esteem Scale: Traits, ephemeral artifacts, and stable response styles. Psychological Assessment, 22, 366–381. http://dx.doi.org/10.1037/a0019225

Merbitz, C., Morris, J., & Grip, J. C. (1989). Ordinal scales and foundations of misinference. Archives of Physical Medicine and Rehabilitation, 70, 308–312.

Mobley, M., Slaney, R. B., & Rice, K. G. (2005). Cultural validity of the Almost Perfect Scale–Revised for African American college students. Journal of Counseling Psychology, 52, 629–639.

Owens, T. J. (1994). Two dimensions of self-esteem: Reciprocal effects of positive self-worth and self-deprecation on adolescent problems. American Sociological Review, 59, 391–407. http://dx.doi.org/10.2307/2095940

Petrillo, J., Cano, S. J., McLeod, L. D., & Coon, C. D. (2015). Using classical test theory, item response theory, and Rasch measurement theory to evaluate patient-reported outcome measures: A comparison of worked examples. Value in Health, 18, 25–34. http://dx.doi.org/10.1016/j.jval.2014.10.005

Prieto, L., Alonso, J., & Lamarca, R. (2003). Classical Test Theory versus Rasch analysis for quality of life questionnaire reduction. Health and Quality of Life Outcomes, 1, 27–40. http://dx.doi.org/10.1186/1477-7525-1-27

Quintão, S., Delgado, A. R., & Prieto, G. (2011). Avaliação da escala de auto-estima de Rosenberg mediante o modelo de Rasch [Evaluation of Rosenberg Self-Esteem Scale using Rasch model]. Psicologia: Revista da Associação Portuguesa Psicologia, 25, 87–101.

Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests. Copenhagen, Denmark: Danmarks Paedagogiske Institut.

Rasch, G. (1980). Probabilistic models for some intelligence and attainment tests. Chicago, IL: University of Chicago.

Richardson, C. G., Ratner, P. A., & Zumbo, B. D. (2009). Further support for multidimensionality within the Rosenberg Self-Esteem Scale. Current Psychology, 28, 98–114. http://dx.doi.org/10.1007/s12144-009-9052-3

Ridley, C. R. (2005). Overcoming unintentional racism in counseling and therapy: A practitioner’s guide to intentional intervention. Thousand Oaks, CA: Sage.

Rosenberg, M. (1965). Society and the adolescent self-image. Princeton, NJ: Princeton University Press.

Rosenberg, M. (1972). Race, ethnicity, and self-esteem. In S. S. Guterman (Ed.), Black psyche: The modal personality patterns of black Americans. Oxford, UK: Glendessary.

Roth, M., Decker, O., Herzberg, P. Y., & Brähler, E. (2008). Dimensionality and norms of the Rosenberg Self-esteem Scale in a German general population sample. European Journal of Psychological Assessment, 24, 190–197. http://dx.doi.org/10.1027/1015-5759.24.3.190

Rowles, J., & Duan, C. (2012). Perceived racism and encouragement among African American adults. Journal of Multicultural Counseling and Development, 40, 11–23. http://dx.doi.org/10.1111/j.2161-1912.2012.00002.x

Royal, K. D., Ensslen, A., Ellis, A., & Homan, A. (2010). Rating scale optimization in survey research: An application of the Rasch Rating Scale Model. Journal of Applied Quantitative Methods. Advance online publication.

Schmitt, D. P., & Allik, J. (2005). Simultaneous administration of the Rosenberg Self-Esteem Scale in 53 nations: Exploring the universal and culture-specific features of global self-esteem. Journal of Personality and Social Psychology, 89, 623–642. http://dx.doi.org/10.1037/0022-3514.89.4.623

Seaton, E. K., Caldwell, C. H., Sellers, R. M., & Jackson, J. S. (2010). An intersectional approach for understanding perceived discrimination and psychological well-being among African American and Caribbean Black youth. Developmental Psychology, 46, 1372–1379. http://dx.doi.org/10.1037/a0019869

Smith, A. B., Rush, R., Fallowfield, L. J., Velikova, G., & Sharpe, M. (2008). Rasch fit statistics and sample size considerations for polytomous data. BMC Medical Research Methodology, 8, 33–43. http://dx.doi.org/10.1186/1471-2288-8-33

Smith, E. V., Jr. (2001). Evidence for the reliability of measures and validity of measure interpretation: A Rasch measurement perspective. Journal of Applied Measurement, 2, 281–311.

Song, H., Cai, H., Brown, J. D., & Grimm, K. J. (2011). Differential item functioning of the Rosenberg Self-Esteem Scale in the U.S. and China: Measurement bias matters. Asian Journal of Social Psychology, 14, 176–188.

Trippi, J., & Cheatham, H. E. (1991). Counseling effects on African American college student graduation. Journal of College Student Development, 32, 342–349.

U.S. Department of Education, National Center for Education Statistics. (2015). Digest of Education Statistics. Washington, DC: National Center for Education Statistics.

U.S. Department of Health and Human Services. (2001). Mental health: A report of the Surgeon General. Rockville, MD: U.S. Department of Health and Human Services.

U.S. Surgeon General. (2001). Mental health: Culture, race, and ethnicity–A supplement to Mental health: A report of the Surgeon General. Rockville, MD: U.S. Department of Health and Human Services.

Utsey, S. O., Ponterotto, J. G., Reynolds, A. L., & Cancelli, A. A. (2000). Racial discrimination, coping, life satisfaction, and self-esteem among African Americans. Journal of Counseling and Development, 78, 72–80. http://dx.doi.org/10.1002/j.1556-6676.2000.tb02562.x

Wade, J. C. (2008). Masculinity ideology, male reference group identity dependence, and African American men’s health-related attitudes and behaviors. Psychology of Men & Masculinity, 9, 5–16. http://dx.doi.org/10.1037/1524-9220.9.1.5

Ward, E. C., Wiltshire, J. C., Detry, M. A., & Brown, R. L. (2013). African American men and women’s attitude toward mental illness, perceptions of stigma, and preferred coping behaviors. Nursing Research, 62, 185–194. http://dx.doi.org/10.1097/NNR.0b013e31827bf533

Wright, B. D., & Linacre, J. M. (1994). Reasonable mean-square fit values. Rasch Measurement Transactions, 8, 370–371.

Wright, B. D., & Masters, G. N. (1982). Rating scale analysis. Chicago, IL: MESA Press.

Wright, B. D., & Panchapakesan, N. (1969). A procedure for sample-free item analysis. Educational and Psychological Measurement, 29, 23–48. http://dx.doi.org/10.1177/001316446902900102

Appendix

Item Factor Loadings and Mean Square Fit Indices

                                       Calibration sample                 Validation sample
Item                                   Factor    Infit mean  Outfit mean  Factor    Infit mean  Outfit mean
                                       loading   square      square       loading   square      square
Positive attitude toward myself (R)    .69       .71         .70          .68       .72         .79
I have good qualities (R)              .67       .67         .92          .62       .90         .87
Feel a failure                         .66       .85         .67          .62       .99         .79
Feel useless                           .64       .93         .90          .67       .92         .92
Am a person of worth (R)               .59       .89         .96          .42       1.34        1.50
Satisfied with myself (R)              .56       .82         1.08         .63       .79         .97
Do things as well as most (R)          .54       .91         1.26         .57       .95         1.32
Am no good                             .53       1.18        1.24         .68       .98         1.00
Do not have much to be proud of        .53       1.45        1.36         .61       1.22        1.05
Wish had more respect for myself       .51       1.31        1.28         .58       1.43        1.45
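The fit indices in this table can be screened programmatically. The sketch below flags items whose infit or outfit mean square falls outside a chosen range. The range of 0.6 to 1.4 is an assumption, taken from the guideline commonly attributed to Wright and Linacre (1994) for rating scale data. The criterion applied in the study itself may differ, and the function and variable names are hypothetical.

```python
# Infit and outfit mean squares copied from the Appendix table, in the order
# (calibration infit, calibration outfit, validation infit, validation outfit).
FIT = {
    "Positive attitude toward myself (R)": (0.71, 0.70, 0.72, 0.79),
    "I have good qualities (R)": (0.67, 0.92, 0.90, 0.87),
    "Feel a failure": (0.85, 0.67, 0.99, 0.79),
    "Feel useless": (0.93, 0.90, 0.92, 0.92),
    "Am a person of worth (R)": (0.89, 0.96, 1.34, 1.50),
    "Satisfied with myself (R)": (0.82, 1.08, 0.79, 0.97),
    "Do things as well as most (R)": (0.91, 1.26, 0.95, 1.32),
    "Am no good": (1.18, 1.24, 0.98, 1.00),
    "Do not have much to be proud of": (1.45, 1.36, 1.22, 1.05),
    "Wish had more respect for myself": (1.31, 1.28, 1.43, 1.45),
}

LOWER, UPPER = 0.6, 1.4  # assumed cutoffs, not confirmed as the study's criterion
OFFSETS = {"calibration": 0, "validation": 2}


def misfitting(sample):
    """Return items whose infit or outfit lies outside [LOWER, UPPER] in a sample."""
    start = OFFSETS[sample]
    return [
        item
        for item, values in FIT.items()
        if any(not (LOWER <= v <= UPPER) for v in values[start:start + 2])
    ]


for sample in OFFSETS:
    print(sample, misfitting(sample))
```

Under these assumed cutoffs, no item is flagged in both samples, which is in line with the observation in the Discussion that problematic items appeared in one sample but not the other.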

Received July 16, 2015
Revision received April 23, 2016
Accepted May 5, 2016
