Procedures for establishing defensible programmes for assessing practice performance

Stephen R Lew,1 Gordon G Page,2 Lambert W T Schuwirth,3 Margarita Baron-Maldonado,4 Joelle M J Lescop,5 Neil S Paget,6 Lesley J Southgate7 & Winifred B Wade8

Summary The assessment of the performance of doctors in practice is becoming more widely accepted. While there are many potential purposes for such assessments, sometimes the consequences of the assessments will be ‘high stakes’. In these circumstances, any of the many elements of the assessment programme may potentially be challenged. These assessment programmes therefore need to be robust, fair and defensible, taken from the perspectives of consumer, assessee and assessor. In order to inform the design of defensible programmes for assessing practice performance, a group of education researchers at the 10th Cambridge Conference adopted a project management approach to designing practice performance assessment programmes. This paper describes issues to consider in the articulation of the purposes and outcomes of the assessment, planning the programme, the administrative processes involved, including communication and preparation of assessees. Examples of key questions to be answered are provided, but further work is needed to test validity.

Keywords clinical competence ⁄ *standards; physician, family ⁄ *standards; education, medical ⁄ *standards; quality of health care ⁄ standards.

Medical Education 2002;36:936–941

Introduction

Assessment of the performance of doctors in practice, or practice performance assessment, while conducted informally for many years, is now being formalised. The assessment of performance can involve a constellation of activities, ranging from informal physician self-assessment, to more formally structured external practice assessment processes imposed by licensing, registration, certification or re-certification bodies. For most doctors, these activities provide reinforcement of effective practice and identify educational needs. For a small percentage of doctors, however, these activities identify serious deficiencies in practice performance and provide a basis for formal action related to prescribed remedial education, restricted licensure, or even removal of licensure.

In high stakes performance assessments, it is especially important that the outcomes of the assessment provide a comprehensive and accurate portrayal of the doctor’s practice performance. Such assessments must be fair to the doctor being reviewed, and fair to the public and other stakeholder groups whose interests are being served by the assessment.

Practice performance assessment can be viewed as a process of:

• gathering information that describes what doctors do in their care of patients, that is, their practice performance, and

• comparing that information with defined standards of practice performance, to arrive at decisions or judgements about the quality of that performance.

For such assessments to be defensible, both the data gathering process and the judgement process must be defensible. In high stakes situations where a doctor’s privilege to practise medicine is in question, experience has shown that it is most often the data gathering phase of this process that is challenged.1

1 Royal Australian College of General Practitioners, Melbourne, Australia; 2 Division of Educational Support and Development, University of British Columbia, Vancouver, Canada; 3 Department of Educational Development, Maastricht University, The Netherlands; 4 Department of Physiology, University of Alcala, Madrid, Spain; 5 Medical School of Quebec, Montreal, Canada; 6 Royal Australasian College of Physicians, Sydney, Australia; 7 Centre for Health Informatics and Multiprofessional Education, University College London, UK; 8 Royal College of Physicians, London, UK

Correspondence: Dr Stephen R Lew, Royal Australian College of General Practitioners, 1 Palmerston Crescent, South Melbourne, Victoria, Australia. Tel.: 00 61 3 9214 1409; Fax: 00 61 3 9214 1583; E-mail: [email protected]

Papers from the 10th Cambridge Conference

© Blackwell Science Ltd MEDICAL EDUCATION 2002;36:936–941

The conceptual basis for making practice performance assessment fair and defensible is described elsewhere in this journal.2 In this article we will propose practical guidelines for developing fair and defensible practice performance assessment programmes. These guidelines will be presented in the form of questions that must be addressed in the course of developing such a system.

Setting up a framework

We propose developing a document, or project plan, for setting up a practice performance assessment. This document includes a description of the rationale for each decision on the set-up of the assessment. It then serves to explain and defend the procedures to all stakeholders. We therefore aim to achieve defensibility by openness and clarity about the aims, procedures and consequences of the assessment. In this framework the document may be seen as the project plan and the assessment as the project. A framework of questions may guide designers of practice performance assessment programmes through their planning. These questions can be categorised into three groups (Table 1):

1 purposes and outcomes;

2 planning the programme, and

3 processes.

Purposes and outcomes

What are the purposes of the assessment? Whose purposes are being met? Are the purposes clearly stated and made accessible prior to implementation of the programme?

These are the first and most important questions to address in designing a practice performance assessment. There are some obvious purposes of the assessment, such as identifying aspects of performance that should be improved,3 establishing that acceptable standards of practice are met,4 providing feedback to the doctor,5,6 and selection into educational programmes (Table 2). However, there may also exist hidden agendas involving aims such as workforce manipulation or power plays. Therefore, while the purpose of the assessment is being delineated and agreed for a defined context, a parallel process of determining the needs and desires of the stakeholders must also be undertaken. Patients, doctors, licensing authorities, certifying bodies, professional groups (e.g. colleges and other professional associations), hospitals, funders and regional authorities are all likely to have different and valuable viewpoints on the purposes to be served by a practice performance assessment.7

The purposes of the assessment therefore need to be unambiguously stated. They must then be clearly understood by stakeholders. This requires an adequate amount of time for publicising the purposes prior to implementation of the programme.

Key learning points

Performance assessment is becoming more topical and is likely to face more scrutiny and challenge.

In order to establish fair and defensible programmes for performance assessment, programme designers should address issues relating to purposes and outcomes of the programme, planning the programme and the processes which enable the programme to run.

A framework is proposed to assist programme designers in developing and implementing their own practice performance programmes.

Table 1 Areas to address in setting up a framework for performance assessment

Purposes and outcomes

1 What are the purposes of the assessment? Whose purposes are being met? Are the purposes clearly stated and accessible prior to implementation of the programme?

2 What is the regulatory structure of the assessment and what are the possible outcomes and consequences of the decisions? What does the assessor expect to learn from assessing the doctor’s performance?

Planning the programme

1 What steps were taken in planning the assessment programme to ensure its fairness and defensibility? Is the plan clearly described?

2 Who are the assessees? Who are the assessors? How are the judgements determined?

3 How were the assessors chosen?

4 How is the sampling done?

5 Why were these instruments chosen? What is known about their technical characteristics?

6 How is the standard set?

Processes

1 What steps were taken in the process of administering the assessment to ensure its fairness and defensibility?

• Communication and preparation of assessees

• Preparation of assessors

• Time allocations

• Equity and security

• Ability to appeal

• How are the assessees supported through the entire process?

• Cost

What is the regulatory structure of the assessment, and what are the possible outcomes and consequences of the decisions? What does an assessor expect to learn from assessing a doctor’s performance?

The answers to these questions are directly linked with the purpose of the assessment and bear directly on how defensible the assessment needs to be. Regulatory structure implies that project planners should be concerned with addressing issues such as disjunctive and conjunctive combinations (i.e. can different methods compensate for each other or should the assessee pass them all?). Furthermore, issues such as provision for repeating tests, remediation and appeals need to be made clear. Other important issues include the method by which scores are to be combined and whether any weighting is to be applied.
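The difference between letting methods compensate for each other and requiring a pass on every method can be made concrete with a small sketch. The example below is illustrative only: the method names, scores, pass marks and weights are hypothetical and are not drawn from any programme described in this paper, and the function names are our own labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MethodResult:
    """Result on one assessment method (all values below are hypothetical)."""
    name: str
    score: float      # assessee's score on a 0-100 scale
    pass_mark: float  # minimum acceptable score on this method
    weight: float     # relative weight when scores are combined


def conjunctive_pass(results: list[MethodResult]) -> bool:
    """Conjunctive rule: the assessee must reach the pass mark on every method."""
    return all(r.score >= r.pass_mark for r in results)


def weighted_score(results: list[MethodResult]) -> float:
    """Weighted mean of the method scores."""
    total_weight = sum(r.weight for r in results)
    return sum(r.score * r.weight for r in results) / total_weight


def compensatory_pass(results: list[MethodResult], overall_pass_mark: float) -> bool:
    """Compensatory rule: a strong result on one method can offset a weak one."""
    return weighted_score(results) >= overall_pass_mark


# Hypothetical assessee: one method falls below its own pass mark.
results = [
    MethodResult("peer review", score=72.0, pass_mark=60.0, weight=2.0),
    MethodResult("record audit", score=55.0, pass_mark=60.0, weight=1.0),
    MethodResult("patient survey", score=80.0, pass_mark=60.0, weight=1.0),
]

print("conjunctive:", conjunctive_pass(results))
print("compensatory:", compensatory_pass(results, overall_pass_mark=60.0))
```

With these made-up figures the same assessee fails under the conjunctive rule and passes under the compensatory rule, which is why the choice of rule and of weights needs to be stated and justified in the project plan.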

The possible outcomes and decisions should be addressed. For example, where the purpose of the assessment is to provide an environment of continuous quality improvement with a focus on self-assessment and voluntary participation, the assessment of educational needs may represent one of the principal outcomes; in this case, high levels of physician participation and satisfaction would represent an indicator of success.8 If there are no ‘life and death’ issues surrounding participation, it is unlikely that disputes and legal challenges will result. If the purpose is to determine the ability to continue to practise medicine, the outcome may be to identify doctors whose performance is poor or potentially poor. Assessment processes may search for reasons for poor performance and result in assisting doctors to improve in selected areas, restricting their practice to ensure the safety of the community, or removing licensure altogether. Clearly, the consequences of these outcomes are considerable.

Planning the programme

What steps were taken in planning the assessment programme to ensure its fairness and defensibility? Is the plan clearly described?

As described previously, a performance assessment requires judgements to be made with the aid of a combination of instruments, each set within the context of everyday practice. Capturing everything is often not feasible, but the programme should be set up in such a way that it contains a fair and defensible combination of the methods. Planning defensible performance assessments must therefore focus on factors relating to the quality of the judgements and to the quality of the instruments.1

Who are the assessees? Who are the assessors? How are the judgements determined?

Before discussing the assessors and the instruments, an important initial step is to describe the assessees and how they came to be assessed. Are they referred for assessment or are they self-referred? If they are referred by regulatory authorities, doctors may be reluctant participants, whereas if they are self-referred they may be highly motivated. In what ways are they similar to and different from each other, for example by first language, country of origin, working conditions, practice profile and practice location? Do they have special needs that require attention, such as assistance with language, cultural understanding or medical care?

How were the assessors selected?

The role of the assessors is essential, so they must be selected carefully. It is relevant to define whether the assessors are meant to be peers of the assessees, or external ‘experts’ representing designated or desired standards of practice. If attempts have been made to ensure that they are representative, how was their ‘representativeness’ determined (taking into account the assessors’ practices, their age, gender, race, religion, culture)? If the assessors are defined as external experts, how was their expert status determined or learned? It is important to establish whether the judges are expert enough in judgmental procedures to be judges. If, for example, more senior or academic judges are used, how can it be determined that they provide a valid assessment? Further questions concern the number of judges selected and the evidence supporting the choice of the number of judges.

Care must be taken that the context is sufficiently enabling for judges to arrive at fair judgements. They will need to feel unconstrained by time, money, space and personal and professional pressures.

Table 2 Possible purposes of practice performance assessment

1 Identifying aspects of performance that should be improved

2 Determining that acceptable standards of practice are met

3 Providing feedback to the doctor

4 Selection into educational programmes

How is the sampling done?

Sampling refers to the process of testing or judging a whole by taking a specimen or collection of specimens. In terms of performance assessment, adequate sampling is critical for reducing bias and error, and for achieving high validity and reliability.2 When the performance assessment was designed, how was the sample size determined? Are the quality and quantity of samples sufficient? Do they include reference to different domains of practice and sufficient sampling times? What method is used to determine the sample size for the different instruments? What evidence supports the choice of sample size? Are the sample sizes different for the different methods? The Royal Australian College of General Practitioners (RACGP), for example, offers Practice-Based Assessment, a performance-based assessment programme, as an alternative to the College Examination for the purpose of certification to practise as a general practitioner in Australia.8 This programme samples using a blueprint derived from national general practice data covering reasons for encounter, age groups and gender distributions,9 the RACGP domains of general practice10 and the International Classification of Primary Care.
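As a rough illustration of how a blueprint can drive sampling, the sketch below allocates a fixed number of sampled consultations across the categories of a blueprint in proportion to their share of practice. It is a sketch under stated assumptions, not the RACGP procedure: the age bands, their percentages and the sample size are hypothetical, and the largest-remainder rounding rule is simply one reasonable way to make the allocation add up.

```python
def allocate_sample(blueprint: dict[str, int], sample_size: int) -> dict[str, int]:
    """Split sample_size across blueprint categories in proportion to their weights.

    Uses largest-remainder rounding with integer arithmetic, so the allocated
    counts always sum exactly to sample_size.
    """
    total = sum(blueprint.values())
    if total <= 0 or sample_size < 0:
        raise ValueError("blueprint weights must sum to a positive number")

    # Whole-number share for each category, plus the remainder left over.
    allocation = {k: (sample_size * w) // total for k, w in blueprint.items()}
    remainders = {k: (sample_size * w) % total for k, w in blueprint.items()}

    # Hand out the unallocated cases to the categories with the largest remainders.
    leftover = sample_size - sum(allocation.values())
    for k in sorted(remainders, key=remainders.get, reverse=True)[:leftover]:
        allocation[k] += 1
    return allocation


# Hypothetical blueprint: percentage of encounters by patient age group.
blueprint = {"0-14": 15, "15-44": 40, "45-64": 25, "65+": 20}
allocation = allocate_sample(blueprint, sample_size=25)
print(allocation)
```

The same function could be applied to any other blueprint dimension, such as reasons for encounter, provided the weights come from a documented source that can be cited when the sampling is challenged.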

Why were these instruments chosen? What is known about their technical characteristics?

There will be limits to how many instruments can be used, whether due to cost, feasibility or time. Yet the selection of instruments is critical to achieve adequate sampling of the field of practice.11 How then is the combination of instruments determined? Do they collectively sample all aspects of practice performance adequately?

Important issues to consider in selecting the instruments are:

• Their validity: do they measure what they purport to measure?

• Their reliability: are the results internally consistent and reproducible? What approach to reliability is used?

• Their cost: how are the costs derived and is the cost acceptable to stakeholders? What measures are taken to avoid incurring unnecessary costs?

• Their feasibility and acceptability: is the nature of the assessment acceptable to stakeholders and is their implementation feasible? Do the instruments have any shortcomings, and if so, how are they compensated for, minimised or eliminated?

• Previous experience: have these instruments been used previously and in what context? Has the benefit of others’ experience with the instruments been made available?
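One way to answer the question about internal consistency is to report a coefficient such as Cronbach's alpha. The paper does not prescribe a reliability approach, so the sketch below is only one possible choice; the function name and the rating data are hypothetical. Rows are assessees and columns are items (or assessors) of a single instrument.

```python
from statistics import variance


def cronbach_alpha(scores: list[list[float]]) -> float:
    """Cronbach's alpha for a table of scores (rows: assessees, columns: items).

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    n_items = len(scores[0])
    if len(scores) < 2 or n_items < 2:
        raise ValueError("need at least two assessees and two items")
    if any(len(row) != n_items for row in scores):
        raise ValueError("every assessee needs a score on every item")

    item_variances = [variance([row[i] for row in scores]) for i in range(n_items)]
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings of five assessees on three items of one instrument.
ratings = [
    [4, 5, 4],
    [2, 3, 3],
    [5, 5, 4],
    [3, 3, 2],
    [1, 2, 2],
]
alpha = cronbach_alpha(ratings)
print(f"alpha = {alpha:.2f}")
```

Internal consistency is only one facet of reliability; reproducibility across assessors or occasions would need a different design and a different statistic.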

Is the combination of different methods sufficient to cover the areas of practice which are the subject of the assessment? Are the assessment methods used sufficiently tailored to the context of the assessee? For example, are they culturally appropriate? Do they take account of the assessee’s location, practice profile and patient profile?

How is the standard set?

Standard setting is the process by which a standard of performance is defined relative to a minimally acceptable level of performance, which in turn is translated into a ‘passing score’ used to divide candidates into those whose performance is acceptable and those whose performance is not acceptable. Whereas many methods have been used in tests of competence over the past few decades, setting performance standards is a relatively new concept in the realm of practice performance. How then are the decisions on satisfactory ⁄ unsatisfactory performance determined? Are the methods used to set standards in tests of competence applicable in the context of performance assessment? How can decisions related to required standards be justified?
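To show how a judgement about minimally acceptable performance becomes a passing score, the sketch below uses an Angoff-style procedure, one of the methods commonly used in tests of competence. It is offered only as an illustration, since the paper leaves open whether such methods transfer to practice performance assessment. Each judge estimates the probability that a borderline candidate would succeed on each item, and the passing score is the sum of the item means. The judge ratings and function names are hypothetical.

```python
from statistics import mean


def angoff_cut_score(ratings: list[list[float]]) -> float:
    """Angoff-style passing score.

    ratings[j][i] is judge j's estimate (0 to 1) of the probability that a
    minimally acceptable candidate succeeds on item i. The passing score is
    the sum over items of the mean estimate across judges.
    """
    n_items = len(ratings[0])
    if any(len(judge) != n_items for judge in ratings):
        raise ValueError("every judge must rate every item")
    item_means = [mean(judge[i] for judge in ratings) for i in range(n_items)]
    return sum(item_means)


def is_acceptable(score: float, cut_score: float) -> bool:
    """Divide candidates into acceptable and not acceptable at the passing score."""
    return score >= cut_score


# Hypothetical estimates from three judges on a four-item instrument.
judge_ratings = [
    [0.6, 0.7, 0.5, 0.8],
    [0.5, 0.8, 0.6, 0.7],
    [0.7, 0.6, 0.4, 0.9],
]
cut = angoff_cut_score(judge_ratings)
print(f"passing score: {cut:.2f} of {len(judge_ratings[0])} items")
print("candidate scoring 3:", is_acceptable(3, cut))
```

Recording the individual judge estimates, and not only the final passing score, gives the programme a documented trail with which to justify the standard if it is challenged.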

Processes

In our previous discussion, we outlined the need for clear purposes and outcomes, and sufficient planning and selection of elements of the assessment. However, in order to ensure that the programme runs smoothly, a robust and reliable administrative structure is essential. Some key aspects of the administration include communication and preparation of assessees, preparation of assessors, time allocation, equity and security, ability to appeal, support for the assessees and cost.

What steps were taken in the process of administering the assessment to ensure its fairness and defensibility?

Communication and preparation of assessees

The project plan should address how doctors who will be assessed are prepared. Preparation may be accomplished by a combination of written materials, electronic materials and personal contact. Do they receive explanatory documentation that, according to assessees, is easy to understand? What information is available to assessees in order to help them prepare? How are stress-related problems experienced by assessees during assessment taken into account? Are there provisions for rescheduling an assessment in the event of illness or misadventure?

Preparation of assessors

Preparation of the assessors is an important element in assuring the quality of the assessment. It is advisable to describe how assessors are trained. What prerequisite conditions must assessors meet? What opportunities exist for assessors to practise assessment techniques? Are assessors specifically trained in cultural sensitivity? What quality assurance measures are in place for assessors? How are the assessors instructed to interact with the assessee? Such interactions may be formal or informal, and may vary according to the possible consequences of the assessment. However, interactions between assessors and assessees should be uniform for the same types of assessment.

Time allocations

Is there sufficient time for assessees to prepare? Is there sufficient time for assessors to complete assessments and for judges to decide on performance?

Equity and security

Aspects relating to equity and security are likely to be challenged if they are not dealt with sufficiently. The following issues should be considered. Are there opportunities for assessor discussions with other colleagues that may colour their judgement? Is there a system to allow assessors to opt out of assessing particular individuals, such as when their objectivity is compromised or may be perceived as being compromised? How are the security and confidentiality of the records of the assessment ensured to avoid theft or tampering with material? How will the confidentiality of any patient records used be maintained? Should assessors and assessees be matched, and if so, on what basis (e.g. race, gender, etc.)? What processes are in place to ensure that no prejudice, bias or discrimination interferes with the assessment process?

Ability to appeal

A fair and defensible process should ensure that the right to appeal exists and is explicitly stated. Under what circumstances may an assessee appeal against a judgement? Is the appeal process clearly stated and easily accessible to the assessee?

How are assessees supported through the entire process?

Experience with practice performance assessment shows that it may be a very threatening process for the assessee, particularly if the stakes are high. While addressing all of the above questions is important, practice performance assessment must also have structures in place to support assessees throughout the entire process of the assessment, from notification through to preparation and beyond the completion of the assessment itself. Whereas the organisation conducting the assessment may not itself provide all of these supports, it is important to consider what notification and preparation will be given to assessees, how they will access accurate information about the assessment, how and when they can obtain feedback, the circumstances in which they may request special assistance or appeal, and importantly, how they deal with the results of their assessment and where they can go for remediation, further education and counselling as required.

Cost

The project plan must clearly outline the costs. How is the cost of the assessment derived? Some of these costs will be overt, while others will be covert. Overt costs include payments to be made to the organising body and costs for hiring equipment such as videotape recorders. Covert costs may be no less substantial, and include staff costs, mailing costs and lost consulting time or leisure time. A cost-benefit analysis should be made as part of the process of cost stipulation. Is the effort required a reasonable cost considering the resources, expertise and labour involved, as well as the outcomes the programme delivers?

Conclusions and future directions

Programmes for assessing the performance of practising doctors and for identifying poorly performing doctors continue to gain momentum internationally. The framework presented in this paper offers a way forward in assisting designers of such assessment programmes. We have posed questions to guide the design of these programmes, but we do not presume to have covered all possible questions. The questions we have posed represent the key issues to consider in developing and implementing fair and defensible programmes of practice performance assessment. Research and consultation are required to support the validity of the questions we have posed.

Contributors

All authors contributed equally to the discussions undertaken during the 10th Cambridge Conference on Medical Education that led to the writing of this article. In writing the paper SRL took main responsibility for preparing the draft, co-ordinating input from the other authors and writing the final version of the paper. LWTS and GGP reviewed the first drafts and their input was used to write subsequent drafts. All other authors made valuable comments and suggestions with respect to these drafts. Their input has led to the final version of the paper.

Acknowledgements

Grateful acknowledgement is made to the sponsors of the 10th Cambridge Conference: the Medical Council of Canada, the Smith & Nephew Foundation, the American Board of Internal Medicine, the National Board of Medical Examiners and the Royal College of Physicians.

Funding

No external funding was sought for the production of this paper.

References

1 Southgate LJ, Cox J, David T, Hatch D, Howes A, Johnson N et al. The General Medical Council’s Performance Procedures: peer review of performance in the workplace. Med Educ 2001;35:9–19.

2 Schuwirth LWT, Southgate LJ, Page GG, Paget NS, Lescop JMJ, Lew SR et al. When enough is enough: a conceptual basis for fair and defensible practice performance assessment. Med Educ 2002;36:925–930.

3 Southgate LJ, Dauphinee DW. Maintaining standards in British and Canadian medicine: the developing role of the regulatory body. BMJ 1998;316:697–700.

4 Royal Australian College of General Practitioners. Standards for General Practices. 2nd edn. Melbourne: RACGP; 2000.

5 Newble DI, Paget NS. The maintenance of professional standards programmes of the Royal Australasian College of Physicians. J R Coll Physicians Lond 1996;30:252–6.

6 Cunnington JPW, Hanna E, Turnbull J, Kaigas T, Norman G. Defensible assessment of the competency of the practising physician. Acad Med 1997;72:9–12.

7 Southgate LJ, Hays RB, Norcini JJ, Mulholland H, Ayers B, Wooliscroft J et al. Setting performance standards for medical practice: a theoretical framework. Med Educ 2001;35:474–81.

8 Royal Australian College of General Practitioners. Practice Based Assessment. A Handbook for Candidates and Examiners 2002. Melbourne: RACGP; 2002.

9 Britt H, Miller GC, Charles J, Knox S, Sayer GP, Valenti L et al. General practice activity in Australia 1999–2000. Canberra: Australian Institute of Health and Welfare (General Practice Series no. 5); 2000.

10 Royal Australian College of General Practitioners. RACGP Training Program Curriculum. 2nd edn. Melbourne: RACGP; 1999.

11 Hays RB, Davies H, Caldon L, Farmer EA, Finucane P, McRorie P et al. Selecting performance assessment methods. Med Educ 2002;36:910–917.

Received 12 March 2002; editorial comments to authors 16 May 2002; accepted for publication 9 July 2002


