

Evaluation and Program Planning 35 (2012) 439–444

Essential competencies for program evaluators in a diverse cultural context

Yi-Fang Lee a,1, James W. Altschuld b,*, Lung-Sheng Lee c,2

a National Taiwan Normal University, 162 HePing East Road Section 1, Taipei 10610, Taiwan
b The Ohio State University, 3253 Newgate Court, Dublin, OH 43017, United States
c National United University, 1 Lienda, Miaoli 36003, Taiwan

Contents lists available at SciVerse ScienceDirect

Evaluation and Program Planning

journal homepage: www.elsevier.com/locate/evalprogplan

ARTICLE INFO

Article history:

Received 21 February 2011

Received in revised form 10 July 2011

Accepted 21 January 2012

Available online 8 February 2012

Keywords:

Cultural context

Evaluator competencies

Fuzzy Delphi study

Professional development

ABSTRACT

Essential evaluator competencies as identified by Stevahn, King, Ghere, and Minnema (2005) were studied in regard to how well they generalize to an Asian (Taiwan) context. A fuzzy Delphi survey with two iterations was used to collect data from 12 experts. While most competencies fit Taiwan, there were a number of unique ones. A complete set of results is provided along with the implications of the findings and what they might mean for evaluation in Taiwan, particularly in relationship to the professionalization of evaluation.

© 2012 Elsevier Ltd. All rights reserved.

1. Introduction

It is accepted that a set of core competencies is a sine qua non condition of a profession (Stevahn, King, Ghere, & Minnema, 2006; Worthen, 1999). What are the core competencies of evaluation, and where are evaluators in thinking about them and the training needed for quality assessments of programs and projects? A variety of skills, knowledge and dispositions have been proposed for evaluators (Anderson & Ball, 1978; Dewey, Montrosse, Schroter, Sullins, & Mattox, 2008; Sanders & Worthen, 1970; Treasury Board of Canada Secretariat, 2001; Worthen, 1975), although consensus regarding what is required has not yet been reached.

King and her colleagues (King et al., 2001; Stevahn et al., 2005, 2006) identified what they termed the essential competencies for program evaluators (ECPE). That groundbreaking work would be especially valuable if it generalized to a context quite different from the one in which it was initially developed. To that end, the ECPE was studied in Taiwan through the use of a fuzzy Delphi survey. The major questions were:

(1) Does the ECPE framework fit the context of an Asian country?
(2) Are there unique competencies in Taiwan?
(3) If so, what factors contribute to them?

* Corresponding author. Tel.: +1 614 389 4585; fax: +1 614 688 3258.
E-mail addresses: [email protected] (Y.-F. Lee), [email protected] (J.W. Altschuld), [email protected] (L.-S. Lee).
1 Tel.: +886 2 7734 3388; fax: +886 2 2392 9449.
2 Tel.: +886 37 381871; fax: +886 37 320610.
0149-7189/$ – see front matter © 2012 Elsevier Ltd. All rights reserved.
doi:10.1016/j.evalprogplan.2012.01.005

2. Theoretical review

Competence signifies some level of expertise with the multifaceted abilities that a person needs to be successful in a field (Stevahn et al., 2005). These are complex action systems of knowledge, skills and strategies that can be applied to work within a profession in conjunction with proper emotions, attitudes, and effective self-regulation (Rychen, 2001). Competencies guide training programs, enhance reflective practice, and drive credentialing, licensure, or program accreditation procedures (Altschuld, 1999; Ghere, King, Stevahn, & Minnema, 2006; Stevahn et al., 2005). Recently, evaluation has directed attention to core competencies to improve its status as a profession or "near-profession" (Worthen, 1999).

Smith (1999, 2003) concluded that although the theory and practice of evaluation continued to evolve, there were persistent schisms within it. One is an epistemological divide between constructivists (Lincoln, 1994) and positivists (Shadish, Cook, & Campbell, 2002), although this may be diminishing with time. Another is whether an evaluator should be a program advocate, who assists programs (Fetterman, 2001), or an independent assessor, who values the importance of objectivity (Scriven, 1997).

Despite such differences, other writers suggest that the field has attained legitimacy with a recognized body of knowledge, methods of inquiry, and procedures (Altschuld, 2005; Davis, 1986; Gussman, 2005). Two evaluation association-based projects (by the Canadian Evaluation Society, CES, and the American Evaluation Association, AEA) focused on developing inventories of evaluator competencies. The Canadian project was led by Zorzi, McGuire and Perrin (McGuire & Zorzi, 2005; Zorzi, McGuire, & Perrin, 2002; Zorzi, Perrin, McGuire, Long, & Lee, 2002), and for AEA it was conducted by King, Stevahn, Ghere and Minnema (Ghere et al., 2006; King et al., 2001; Stevahn et al., 2005, 2006). While the goals of the two efforts were similar (producing lists of competencies applicable in a wide range of settings), the approaches were somewhat different.

The CES model included not only competencies but types of evaluations and phases of evaluation, which led to some overlap or repeating of competencies (Huse & McDavid, 2006). ECPE, on the other hand, linked or cross-walked several sources of competencies (the CES Essential Skills Series in Evaluation, 1991; the AEA Guiding Principles for Evaluators, 1995; and the Joint Committee on Standards for Educational Evaluation, 1994). Due to that feature, we chose ECPE as the framework for our study.

ECPE, based upon a review of the literature, consisted of an initial list of competencies which was then validated for appropriateness (King et al., 2001). Feedback was collected from conference presentations, expert consultations, and comparisons to existing sources in order to be as comprehensive as possible (Stevahn et al., 2005). The final product is a user-friendly taxonomy of 61 competencies in six categories: (a) professional practice (professional norms and values); (b) systematic inquiry (technical aspects of evaluations); (c) situational analysis (analyzing and attending to the contextual and political issues related to the evaluation); (d) project management (the nuts-and-bolts of managing an evaluation); (e) reflective practice (understanding one's practice and level of evaluation expertise); and (f) interpersonal competence (skills needed for implementing an evaluation). The six categories and the skills within them were a springboard for engaging evaluators in self-analysis and group discussion on an array of issues associated with practice. Stevahn et al. (2005) stressed that systematic studies are needed to achieve agreement on the importance of the essential competencies.

We explored the applicability of ECPE to an Asian context distinct from the one in which it was developed. Chouinard and Cousins (2009) proposed that cultural differences influence evaluation methodology and methods selection, intergroup dynamics, cross-cultural understanding, and evaluator roles, and that they provide insight into value perspectives and conflicts (SenGupta, Hopson, & Thompson-Robinson, 2004). Lee's study (1997) noted contrasts between Eastern and Western views. The East, to a notable degree, sees interpersonal relationships, cooperation, and an authoritarian orientation as more meaningful, compared to the West's focus on self-development, self-fulfillment, competition, and democratic decision making. Such conceptual views have an effect on evaluation practice and on what it means to be a qualified evaluator; as an example, the East's greater emphasis on relationships could lead to less willingness to deal with or report negative results.

3. Description of research context

The focus of this study was the perceptions of evaluators involved in the Program Accreditation administered by the Higher Education Evaluation and Accreditation Council (HEEAC) of Taiwan. The missions of the HEEAC are to: develop indicators and mechanisms for university evaluation; conduct higher education evaluation and accreditation commissioned by the Ministry of Education (MOE); and advise the MOE on policy based on evaluation reports. Predicated on accreditation results, all programs are classified as accredited, conditionally accredited, or failed. Such designations are important and affect funding allocations and the size of student enrollments.

Accreditation began in 2006, in response to a national law that required all programs to be evaluated once every 5 years. The first-cycle evaluation (from 2006 to 2010) stressed input and process evaluations, with the emphasis shifting to process and outcome in the second cycle (from 2011 to 2015). Procedures include a self-evaluation and a 2-day on-site visit. The HEEAC invites 4–6 professors or experts in related fields to serve as evaluators for each site visit. They attend an orientation and a short training workshop held by the HEEAC. Activities during the site visit consist of interviews with stakeholders (faculty, staff, students, graduates, etc.), document review, survey administration, and classroom observation.

Because the visits had to be done in a competent manner, there was a need to identify essential competencies for sound evaluation practice. This is especially relevant in Taiwan, where there has been progress toward the professionalization of evaluation but, at the same time, shortages of experienced evaluators are apparent, stable career opportunities in the field are lacking, and formal preparation programs need to be developed (Lee, Altschuld, & Hung, 2008). Evaluation has been influenced by the US (Lee et al., 2008), yet cultural forces shape what an evaluator does and what constitutes meaningful evaluation work. These were major considerations in this study.

4. Methodology

The main goal was to understand whether a western model of essential competencies for evaluators would be viewed as compatible with an Asian setting. To what degree does it fit or not fit, and if there are unique skills in Taiwan, what are the reasons for that? A secondary issue was to pinpoint needs for training evaluators.

A Delphi technique with two iterations was utilized. Its basic feature is sequential questionnaires interspersed with summary and feedback of opinions derived from previous responses (Dalkey & Helmer, 1963; Hung, Altschuld, & Lee, 2008; Linstone & Turoff, 1975; Powell, 2003). The researchers generated a nationwide list of 12 panelists using the criteria of acknowledged expertise and 10 years of evaluation experience. A letter was sent inviting them to join the panel, and all agreed to serve on two rounds of the survey. They averaged 17 years of teaching evaluation, doing relevant research, and conducting evaluations. They also had extensive service with prior evaluations of higher education as well as having been involved in numerous evaluation endeavors.

4.1. First survey

The first survey was developed from the Stevahn et al. framework of 61 competencies in six categories. The quality of the Mandarin translation was reviewed by several faculty members, and minor modifications were made in accord with their suggestions. Then the panelists judged the fit of each item to Taiwan in terms of whether it should be retained, removed, or modified. For items to be removed or modified, the panel provided concrete suggestions for refinement. One open-ended question was placed at the end of the survey for other comments.

All panelists completed the instrument. Nine items were retained and 44 were modified, with changes noted in the text (see Table 1). It must be emphasized that most modifications were considered to be minor and straightforward (language), with little or no alteration to the main concept/content of the item. A few needed more than fine-tuning. As an example, for the item using the phrase 'specify program theory', respondents noted that program theory would not have meaning in Taiwan; thus wording related to program logic models and the assumptions underlying the effectiveness of a program was used. Another case was the phrase 'writes formal agreements', which would not be commonly understood, so it was changed to 'writes evaluation proposals.' In addition, 10 new items were generated and eight were deleted or merged based on panel feedback (Table 2). Many of the


Table 1
Retention, removal, modification, or addition of essential competency items for program evaluators in Taiwan from the 1st survey.

Category                   Stevahn et al.    Retention   Removal   Modification   Addition   Total
                           (2005) # of items
Professional practice       6                 0           2          4              7          11
Systematic inquiry         20                 4           2         14              0          18
Situational analysis       12                 2           2          8              0          10
Project management         12                 3           2          7              1          11
Reflective practice         5                 0           0          5              0           5
Interpersonal competence    6                 0           0          6              2           8
Total                      61                 9           8         44             10          63

Table 2
Changes made to the first survey for the second iteration.

Professional practice
  Removals/mergers: Conveys personal evaluation approaches and skills to potential clients (a); Contributes to the knowledge base of evaluation (a)
  Additions: Prepares well prior to the evaluation process; Ensures the confidentiality of information; Respects and follows the evaluation process; Avoids possible inside benefit; Remains open to input from others (moved from another category); Recognizes and responds to policies related to evaluation practice; Identifies and supports the purposes of evaluation tasks

Systematic inquiry
  Removals/mergers: Assesses validity and reliability of data (merged two items together); Analyzes and interprets data (merged from "Interprets data" and "Analyzes data")

Situational analysis
  Removals/mergers: Remains open to input from others (moved to the 1st category) (a); Modifies the study as needed (a)

Project management
  Removals/mergers: Responds to requests for proposals (a); Budgets an evaluation and justifies cost needed (merged two items together)
  Additions: Assesses the cost and benefit for the project

Interpersonal competence
  Additions: Conducts interviewing task; Considers peer relationship

(a) Item has been removed.


adjustments arose from cultural considerations (see later discussion). The result was a 63-item survey for the second round.

4.2. Second survey

The second survey had two scales per item: importance and current level of competency. Each competency was rated as to how important it was for a qualified evaluator in Taiwan to have and the current competency level for evaluators (the experts on the panel were instructed to answer the latter based upon an understanding of all evaluators in the country, not on their own level of skill). A fuzzy Delphi technique was used, which has a scale range instead of a single score to represent a rating. The fuzzy approach has the potential to deal with uncertainty/ambiguity in ratings, as is often the case in social science (Ross, 2004). It might be more realistic in terms of how respondents think when providing a score for an item, rather than making a single point value judgment. Some other benefits of the scale are novelty and improving the response rate, since respondents can more honestly make their ratings (Chang & Wang, 2006; Kaufmann & Gupta, 1988). Fig. 1 is an example of the fuzzy Delphi scales employed in the second survey.

[Fig. 1. Fuzzy Delphi scales used in the second survey.]

Panelists marked a range from 0 on the low side to 1 on the high end in .1 increments for importance (what should be) and competency (what is). Complex mathematical calculation is required to generate fuzzy number results (detailed information is in an in-process paper). Each item yields the following summary results:

mR  the highest level of the panel's judgment
mL  the lowest
mT  a single score for an overall group judgment

The distance between mR and mL is an indication of spread, and only mT values higher than a specified criterion indicate essential


Table 3
Number of items in competency categories and those exceeding the criterion for importance per category.

Category                   Number of items   Number of items exceeding .675
Professional practice      11                 8
Systematic inquiry (a)     18                11
Situational analysis       10                 4
Project management (a)     11                 5
Reflective practice         5                 4
Interpersonal competence    8                 6
Total                      63                38

(a) Some respondents gave a score of 0 for the lower limit of an item, which made their rating unusable for fuzzy numbers, where calculations come from geometric means. These scores were excluded, and thus computations were from a lesser number of respondents.


competencies in terms of importance. It is also possible to derive need indices from the gap between the mT of importance and the mT of competence, and the mT scores for an entire category can be estimated. The rationale for needs identification was based on the gap between "what is," or the current state of affairs, and "what should be," or the desired state (Altschuld & Witkin, 2000). All 12 panelists responded to the second round. A criterion of .675 for importance was determined by averaging group perceptions of what the standard should be for high-importance areas.
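The paper defers the full fuzzy-number arithmetic to an in-process paper, noting only that each item is summarized by mR, mL, and mT and (in the table footnotes) that the calculations come from geometric means. As a hedged illustration only, the sketch below implements one common fuzzy-Delphi aggregation consistent with those hints; the exact formulas, the function name `fuzzy_summary`, and the sample ratings are our assumptions, not the authors' published procedure.

```python
from math import prod

def fuzzy_summary(intervals):
    """Aggregate panelists' [low, high] interval ratings for one item.

    intervals: list of (low, high) tuples on the 0-1 scale.
    Returns (mL, mR, mT). Ratings with a 0 lower bound are dropped
    first, because a single 0 drives a geometric mean to 0 -- the
    problem the paper's table footnotes describe.
    Assumed formulation: mL and mR are geometric means of the lower
    and upper bounds, and mT is the geometric mean of mL and mR.
    """
    usable = [(lo, hi) for lo, hi in intervals if lo > 0]
    n = len(usable)
    mL = prod(lo for lo, _ in usable) ** (1 / n)   # group lower bound
    mR = prod(hi for _, hi in usable) ** (1 / n)   # group upper bound
    mT = (mL * mR) ** 0.5                          # single overall group score
    return mL, mR, mT

# Hypothetical ratings from three panelists for one competency item
ratings = [(0.5, 0.8), (0.6, 0.9), (0.4, 0.7)]
mL, mR, mT = fuzzy_summary(ratings)
```

Under this formulation, mR minus mL gives the spread mentioned in the text, and mT is the value compared against the .675 criterion.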

5. Results

5.1. Judgment for essential competencies in the setting

Thirty-eight out of 63 items had mT values for importance at or surpassing the criterion, with the highest percentage (80%) in the reflective practice category, followed closely by interpersonal competence (75%), as noted in Table 3. The highest competencies for importance across categories were 'Avoids possible inside benefit' (mT = .77), 'Assesses validity and reliability of data' (.76), 'Develops recommendations' (.75), 'Prepares well prior to the evaluation process' (.75), and 'Acts ethically and strives for integrity and honesty in conducting evaluations' (.75). Five items exceeding the criterion for importance were new ones different from Stevahn et al.'s framework: 'Prepares well prior to the evaluation process', 'Ensures the confidentiality of information', 'Respects and follows the evaluation process', 'Avoids possible inside benefit', and 'Conducts interviewing task'.

All six of the categories attained mT values for importance across their item sets of greater than .66, and four of them surpassed the .675 criterion (Table 4). Because all of the categories were preselected, such patterns were not surprising. 'Professional practice' was considered the most important, and the least were 'Project management' and 'Situational analysis.' Generally, the results were in a restricted range of mT values, from .66 to .70.

Table 4
Fuzzy number values for importance and current competency per category.

                           Importance           Current competency
Category                   mR     mL     mT     mR     mL     mT
Professional practice      .82    .43    .70    .74    .53    .60
Systematic inquiry (a)     .80    .43    .69    .74    .54    .60
Situational analysis       .79    .47    .66    .73    .56    .59
Project management (a)     .78    .46    .66    .73    .55    .59
Reflective practice        .81    .51    .68    .75    .57    .61
Interpersonal competence   .81    .42    .69    .76    .52    .62

(a) Some respondents gave a score of 0 for the lower limit of an item, which made their rating unusable for fuzzy numbers. These scores were excluded from the analysis.

5.2. Comparison of importance and current competency levels

Similar to importance, a narrow range of mT values for competency per category was observed, .59–.62 (Table 4). Evaluators on the whole were more competent in interpersonal competence and less so in project management and situational analysis. Among the individual items in the framework, evaluators were best equipped with the competencies of 'Aware of self as an evaluator' (mT value for current competency level = .67), 'Acts ethically and strives for integrity and honesty in conducting evaluations' (.66), and 'Communicates with clients throughout the evaluation process' (.66). While mT ratings for importance tended to be higher than those for current competency, the discrepancy for categories was small, ranging from .07 to .10.

5.3. Needs index values for competencies

Means difference analysis (MDA) is a commonly accepted procedure for studying discrepancies, and it was assumed to be a reasonable index for fuzzy scores. A mean of the importance and competency ratings across all items in a category was computed, and then the gap between them became the standard for looking at discrepancies of individual items. If an item's gap was higher than the standard, the item was a need. Thirty-four items were greater than the standard, but the highest MDA value was only .19, suggesting that the needs for the competencies were not especially large (Table 5). The competencies with higher needs indices than others were 'Prepares well prior to the evaluation process', 'Assesses validity and reliability of data', and 'Reflects on personal evaluation practice'.
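The means difference analysis can be sketched in a few lines of Python. The logic follows the description above (the mean importance-to-competency gap across a category's items is the standard, and an item is flagged as a need when its own gap exceeds that standard), but the function name `mda_needs` and the item scores below are hypothetical illustrations, not the study's data.

```python
def mda_needs(items):
    """Means difference analysis for one competency category.

    items: dict mapping item name -> (importance_mT, competency_mT).
    The standard (cutoff) is the mean importance-competency gap
    across all items in the category; items whose own gap exceeds
    it are flagged as needs. Returns (cutoff, needs), with needs
    sorted by descending gap.
    """
    gaps = {name: imp - comp for name, (imp, comp) in items.items()}
    cutoff = sum(gaps.values()) / len(gaps)          # category standard
    needs = sorted(
        ((name, round(gap, 2)) for name, gap in gaps.items() if gap > cutoff),
        key=lambda pair: -pair[1],
    )
    return cutoff, needs

# Hypothetical category: (importance mT, current competency mT) per item
category = {
    "Prepares well prior to the evaluation process": (0.75, 0.56),
    "Ensures the confidentiality of information": (0.74, 0.59),
    "Acts ethically in conducting evaluations": (0.75, 0.66),
    "Uses evaluation standards": (0.68, 0.63),
}
cutoff, needs = mda_needs(category)
```

With these made-up scores, the category standard works out to .12, and the first two items (gaps of .19 and .15) are flagged as needs, mirroring the scale of the MDA values reported in Table 5.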

6. Discussion

6.1. Generalization of competencies to Taiwan

From the first survey, all panel members agreed with the approach of Stevahn et al. in relation to the clustered competencies, with mostly small changes needed for the Taiwanese context. A majority of items in each category were retained with minor modifications in wording for the second instrument, and on the second round they were judged high in terms of importance. Based upon the data, Stevahn et al.'s framework worked relatively well, although contextual impacts were identified.

6.2. Contextual influences in the competencies

A few competencies were different for evaluators in Taiwan, such as skills in program management: 'Writes formal agreements', 'Budgets an evaluation', and 'Trains others involved in conducting the evaluation'. Despite the fact that the panel was expert in evaluation, the HEEAC context probably influenced responses. The tasks noted above were done by a HEEAC team, not by evaluators, and in Taiwan a large number of evaluators have quite variable knowledge/experience in program management.

Similar observations were seen in the following competencies that are normally completed before an evaluation begins or early in the process: 'Conducts literature review', 'Frames evaluation questions', and 'Develops evaluation designs'. Evaluators seldom or less than fully deal with these activities, so their judgments of importance in the second survey tended to be on the lower end of the spectrum (mT values less than .60). In regard to 'Systematic inquiry' skills, respondents commented that a detailed list of such competencies was not germane, since the career of evaluation was not well developed in the country. Most faculty, even those more specialized in evaluation, work on it only part-time. This was also


Table 5
Examples of items with higher needs indices per category.

Category                   Examples of items exceeding the cutoff score (a)             MDA (b)
Professional practice      Prepares well prior to the evaluation process                .19
                           Ensures the confidentiality of information                   .15
                           Avoids possible inside benefit                               .13
Systematic inquiry         Assesses validity and reliability of data                    .17
                           Analyzes and interprets data                                 .14
                           Identifies data sources                                      .12
Situational analysis       Serves the information needs of intended users               .14
                           Respects the uniqueness of the evaluation site and client    .13
                           Addresses conflicts                                          .11
Project management         Trains others involved in conducting the evaluation          .14
                           Budgets an evaluation and justifies cost needed              .14
                           Assesses the cost and benefit for the project                .10
Reflective practice        Reflects on personal evaluation practice                     .16
                           Pursues professional development in evaluation               .11
                           Pursues professional development in relevant content areas   .10
Interpersonal competence   Uses written communication skills                            .11
                           Conducts interviewing task                                   .10
                           Uses negotiation skills                                      .10

(a) The cutoff is the overall importance and competency difference per category; the cutoff scores were .1, .09, .07, .07, .07, and .07 for the six categories, respectively.
(b) The top three discrepancies that exceeded the cutoff score per category.


apparent in the items dealing with reliability and validity. These topics were combined on the second survey, as were 'Analyzes data' and 'Interprets data' (Table 2).

Two items differing from Stevahn et al.'s framework related to contextual influences. One was 'Conducts interviewing task'. In Taiwan, all evaluators were required to conduct interviews with faculty, students and staff in on-site visits; thus the ability was viewed as important. The other was the removal of 'Contributes to the knowledge base of evaluation', where the panel commented that they contributed more to enhancing program quality than to knowledge enrichment.

6.3. The impact of culture on responses

In Table 2, several items revealed the top-down or authoritarian environment. This was evident in the items that were added to the survey ('Respects and follows the evaluation process', 'Recognizes and responds to policies related to evaluation practice', and 'Identifies and supports the purposes of evaluation tasks') and those that were removed ('Modifies the study as needed' and 'Conveys personal evaluation approaches and skills to potential clients'). Some respondents mentioned that because program accreditation was a government-mandated process, evaluators were expected to adhere to (as opposed to 'follow', in the American context) HEEAC procedures designed to ensure consistency and fairness. In Taiwan it was critical that evaluators assess outcomes related to policies; obviously, accreditation was a means to carry out government policy.

Another cultural dimension was apparent in two new items in professional practice: 'Avoids possible inside benefit' and 'Ensures the confidentiality of information'. Due to the relatively small size of the country, evaluators and faculty members in the program being evaluated might know each other, even to the extent of long-term friendships. It was observed (anecdotally) that a few evaluators leaked preliminary results/inside information, which undermined the fairness of the evaluation. Hence the concern noted above. On the other hand, these were delicate interpersonal connections intricately interwoven into the fabric of a society, and rigid or harsh rules might not fully work. Instrument reviewers were keenly aware of this, leading to the item 'Considers peer relationship'.

6.4. The usage of fuzzy numbers (potential biases in results)

The idea of using a fuzzy scale to capture the subtle nature of responses appeared to work well in the study. All respondents made judgments in a range of score values, and discrepancies between the upper and lower ends were achieved (Table 4). The approach seemed utilitarian, but a few problems occurred in data analysis. First, it relies heavily on complex mathematical models, and there are multiple ways to deal with the data (Chen & Hwang, 1992). The technique, which is mainly employed in fields like engineering and computer science, has had limited application and research in social science and education. Second, this was a needs assessment, an area where such research is even less abundant, and it was more of a preliminary or exploratory study.

Given these factors, a straightforward way to calculate scores based on Chen and Hwang was used. Does it really fit this situation? Is it appropriate? Are there problems in the calculations? What about the validity and reliability of collecting data with fuzzy scales? These questions underscore the need for further investigation.

Another issue is that some respondents gave a score of 0 for the lower limit of an item, which made their rating unsuitable for fuzzy numbers. The calculations are based on geometric means, so in the future such responses will have to be eliminated through better directions to respondents.
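The zero-lower-limit problem can be seen concretely: a geometric mean collapses to 0 whenever any value is 0, so a single such response wipes out the whole panel's lower bound. A minimal sketch of a screening step follows; the helper name is hypothetical, not something the study itself used.

```python
from math import prod

def geometric_mean(values):
    """Geometric mean; meaningful only for strictly positive values."""
    return prod(values) ** (1.0 / len(values))

# One panelist's lower bound of 0 drags the whole geometric mean to 0,
# regardless of how the other panelists rated the item.
lows_with_zero = [0, 5, 6, 7]
assert geometric_mean(lows_with_zero) == 0.0

def usable(ratings):
    """Keep only (lower, middle, upper) responses whose lower limit is
    strictly positive, so geometric-mean aggregation stays meaningful."""
    return [r for r in ratings if r[0] > 0]

panel = [(0, 3, 6), (5, 7, 9), (6, 8, 10)]
assert usable(panel) == [(5, 7, 9), (6, 8, 10)]
```

Better directions to respondents (e.g. requiring lower limits of at least 1) would make such after-the-fact screening unnecessary.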

6.5. No competencies obtained high need index values

As mentioned previously, a secondary interest of the study was to begin to identify training needs for evaluators. In Table 5, even though some competencies had higher need indices than others, they were not very pronounced and pressing needs for improvement were not clear. This might stem from the HEEAC’s requirement for all evaluators to complete short-term training on evaluation knowledge, skills, and ethics prior to the evaluation. Moreover, there may be errors inherent in mR and mL values for the needs index due to how they were calculated and the small number of panel members.

7. Lessons learned

Our study to a high degree validated Stevahn et al.’s taxonomy of evaluator competencies in a context quite different from the one in which it was developed. It fit Taiwan with the exception of some unique competencies. Additional items (like ‘Avoids possible inside benefit’ and ‘Ensures the confidentiality of information’) were highly associated with the norms and values of the national culture. Thus the assessment of the core body of skills, knowledge and dispositions for evaluators was improved by taking into consideration subtle aspects of the setting.

Another relevant factor affecting how competencies are viewed might be the maturity of the evaluation profession in the country. Compared with the West, evaluation in Taiwan is at an early, unsettled stage: unstable career opportunities, limited demand for evaluation specialists, and less formal preparation for evaluators. Under these conditions, detailed descriptions of competencies might not have mattered that much, which is why some items had to be merged or were rated low in importance.

Lastly, this was an exploratory investigation and more validation is needed. Would other ways of calculating fuzzy numbers produce different results? Will the same results occur with a larger sample? Would there be similar outcomes in other Asian countries? If not, what cultural factors influence the results? We just do not know. To answer a few of these questions, the researchers are conducting a follow-up study to contrast fuzzy and Likert scales as well as to replicate the findings with a much bigger group of respondents.

Acknowledgement

This study was supported under a National Science Council Research Grant in Taiwan (NSC 98-2410-H-260-034).

References

Altschuld, J. W. (1999). The certification of evaluators: Highlights from a report submitted to the Board of Directors of the American Evaluation Association. American Journal of Evaluation, 20(3), 481–493.

Altschuld, J. W. (2005). Certification, credentialing, licensure, competencies, and the like: Issues confronting the field of evaluation. Canadian Journal of Program Evaluation, 20(2), 157–168.

Altschuld, J. W., & Witkin, B. R. (2000). From needs assessment to action. Thousand Oaks, CA: SAGE Publications.

Anderson, S. B., & Ball, S. (1978). The profession and practice of program evaluators. San Francisco, CA: Jossey-Bass.

Chang, P. C., & Wang, Y. W. (2006). Fuzzy Delphi and back-propagation model for sales forecasting in PCB industry. Expert Systems with Applications, 30(4), 715–726.

Chen, S. J., & Hwang, C. L. (1992). Fuzzy multiple attribute decision making: Methods and applications. New York: Springer-Verlag.

Chouinard, J. A., & Cousins, J. B. (2009). A review and synthesis of current research on cross-cultural evaluation. American Journal of Evaluation, 30(4), 457–494.

Dalkey, N. C., & Helmer, O. (1963). An experimental application of the Delphi method to the use of experts. Management Science, 9(3), 458–467.

Davis, B. G. (1986). Overview of the teaching of evaluation across disciplines. New Directions for Program Evaluation, 29, 5–14.

Dewey, J. D., Montrosse, B. E., Schroter, D. C., Sullins, C. D., & Mattox, J. R. (2008). Evaluator competencies: What’s taught versus what’s sought. American Journal of Evaluation, 29(3), 268–287.

Fetterman, D. M. (2001). The transformation of evaluation into a collaboration: A vision of evaluation in the 21st century. American Journal of Evaluation, 22(3), 381–385.

Ghere, G., King, J. A., Stevahn, L., & Minnema, J. (2006). A professional development unit for reflecting on program evaluator competencies. American Journal of Evaluation, 27(1), 108–123.

Gussman, T. K. (2005). Improving the professionalism of evaluation. Ottawa: Centre for Excellence in Evaluation, Treasury Board Secretariat of Canada.

Hung, H. L., Altschuld, J. W., & Lee, Y. F. (2008). Methodological and conceptual issues confronting a cross-country Delphi study of educational program evaluation. Evaluation and Program Planning, 31(2), 191–198.

Huse, I., & McDavid, J. C. (2006). Literature review: Professionalization of evaluators. Prepared for the CES evaluation professionalization project. University of Victoria.

Kaufmann, A., & Gupta, M. M. (1988). Fuzzy mathematical models in engineering and management science. Amsterdam: North-Holland.

King, J. A., Stevahn, L., Ghere, G., & Minnema, J. (2001). Toward a taxonomy of essential evaluator competencies. American Journal of Evaluation, 22(2), 229–247.

Lee, E. (1997). Overview: The assessment and treatment of Asian American families. In E. Lee (Ed.), Working with Asian Americans: A guide for clinicians (pp. 3–36). New York: Guilford Press.

Lee, Y. F., Altschuld, J. W., & Hung, H. L. (2008). Practices and challenges in educational program evaluation in the Asia-Pacific region: Results of a Delphi study. Evaluation and Program Planning, 31(4), 368–375.

Lincoln, Y. S. (1994). Tracks toward a postmodern politics of evaluation. Evaluation Practice, 15(3), 299–310.

Linstone, H. A., & Turoff, M. (1975). The Delphi method: Techniques and applications. Reading, MA: Addison-Wesley Publishing.

McGuire, M., & Zorzi, R. (2005). Evaluator competencies and professional development. Canadian Journal of Program Evaluation, 20(2), 73–99.

Powell, C. (2003). The Delphi technique: Myths and realities. Journal of Advanced Nursing, 41(4), 376–382.

Ross, T. J. (2004). Fuzzy logic with engineering applications. Hoboken, NJ: John Wiley & Sons.

Rychen, D. S. (2001). Introduction. In D. S. Rychen & L. H. Salganik (Eds.), Defining and selecting key competencies (pp. 1–15). Seattle, WA: Hogrefe and Huber.

Sanders, J. R., & Worthen, B. R. (1970). An analysis of employers’ perceptions of the relative importance of selected research and research-related competencies and shortages of personnel with such competencies. Technical Paper No. 3. Boulder, CO: AERA Task Force on Research Training, Laboratory of Educational Research.

Scriven, M. (1997). Truth and objectivity in evaluation. In E. Chelimsky & W. R. Shadish (Eds.), Evaluation for the 21st century: A handbook. Thousand Oaks, CA: Sage Publications.

SenGupta, S., Hopson, R., & Thompson-Robinson, M. (2004). Cultural competence in evaluation: An overview. New Directions for Evaluation, 102, 5–19.

Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference. Boston, MA: Houghton Mifflin.

Smith, M. F. (1999). Should AEA begin a process for restricting membership in the profession of evaluation? American Journal of Evaluation, 20(3), 521–531.

Smith, M. F. (2003). The future of the evaluation profession. In T. Kellaghan & D. L. Stufflebeam (Eds.), International handbook of educational evaluation (pp. 373–386). Boston, MA: Kluwer Academic.

Stevahn, L., King, J. A., Ghere, G., & Minnema, J. (2005). Establishing essential competencies for program evaluators. American Journal of Evaluation, 26(1), 43–59.

Stevahn, L., King, J. A., Ghere, G., & Minnema, J. (2006). Evaluator competencies in university-based evaluation training programs. Canadian Journal of Program Evaluation, 20(2), 101–123.

Treasury Board of Canada Secretariat. (2001). Evaluation policy. Retrieved December 10, 2010, from http://www.tbs-sct.gc.ca.

Worthen, B. R. (1975). Some observations about the institutionalization of evaluation. Evaluation Practice, 16, 29–36.

Worthen, B. R. (1999). Critical challenges confronting certification of evaluators. American Journal of Evaluation, 20(3), 533–555.

Zorzi, R., McGuire, M., & Perrin, B. (2002). Evaluation benefits, outputs, and knowledge elements: Canadian Evaluation Society project in support of advocacy and professional development. Retrieved July 28, 2006, from http://consultation.evaluationcanada.ca/pdf/ZorziCESReport.pdf.

Zorzi, R., Perrin, B., McGuire, M., Long, B., & Lee, L. (2002). Defining the benefits, outputs, and knowledge elements of program evaluation. Canadian Journal of Program Evaluation, 17(3), 143–150.

Yi-Fang Lee, Ph.D., is an associate professor at National Taiwan Normal University in Taiwan. She has published and presented in the area of needs assessment and Science, Technology, Engineering, and Mathematics retention with underrepresented minorities.

James W. Altschuld, Ph.D., is a professor emeritus at The Ohio State University. He has presented and published extensively on evaluation topics, especially on needs assessment and the evaluation of science and technology education.

Lung-Sheng Lee, Ph.D., is a professor and the president of National United University, Taiwan. He has participated in a variety of institutional and program evaluations for more than 30 years.