Procedures for establishing defensible programmes for assessing practice performance
Stephen R Lew,1 Gordon G Page,2 Lambert W T Schuwirth,3 Margarita Baron-Maldonado,4 Joelle M J Lescop,5 Neil S Paget,6 Lesley J Southgate7 & Winifred B Wade8
Summary The assessment of the performance of
doctors in practice is becoming more widely accepted.
While there are many potential purposes for such
assessments, sometimes the consequences of the
assessments will be ‘high stakes’. In these circumstances,
any of the many elements of the assessment
programme may potentially be challenged. These
assessment programmes therefore need to be robust,
fair and defensible, taken from the perspectives of
consumer, assessee and assessor. In order to inform the
design of defensible programmes for assessing practice
performance, a group of education researchers at the
10th Cambridge Conference adopted a project man-
agement approach to designing practice performance
assessment programmes. This paper describes issues to
consider in articulating the purposes and outcomes
of the assessment, planning the programme, and managing
the administrative processes involved, including
communication and preparation of assessees.
Examples of key questions to be answered are
provided, but further work is needed to test validity.
Keywords clinical competence ⁄ *standards; physician,
family ⁄ *standards; education, medical ⁄ *standards;
quality of health care ⁄ standards.
Medical Education 2002;36:936–941
Introduction
Assessment of the performance of doctors in practice,
or practice performance assessment, while conducted
informally for many years, is now being formalised. The
assessment of performance can involve a constellation
of activities, ranging from informal physician self-
assessment, to more formally structured external prac-
tice assessment processes imposed by licensing,
registration, certification or re-certification bodies. For
most doctors, these activities provide reinforcement of
effective practice and identify educational needs. For a
small percentage of doctors, however, these activities
identify serious deficiencies in practice performance
and provide a basis for formal action related to pre-
scribed remedial education, restricted licensure, or even
removal of licensure.
In high stakes performance assessments, it is especi-
ally important that the outcomes of the assessment
provide a comprehensive and accurate portrayal of the
doctor’s practice performance. Such assessments must
be fair to the doctor being reviewed, and fair to the
public and other stakeholder groups whose interests are
being served by the assessment.
Practice performance assessment can be viewed as a
process of:
• gathering information that describes what doctors do
in their care of patients, that is, their practice per-
formance, and
• comparing that information with defined standards of
practice performance, to arrive at decisions or
judgements about the quality of that performance.
For such assessments to be defensible, both the data
gathering process and the judgement process must be
defensible. In high stakes situations where a doctor’s
privilege to practise medicine is in question, experience
has shown that it is most often the data gathering phase
of this process that is challenged.1
1Royal Australian College of General Practitioners, Melbourne,
Australia, 2Division of Educational Support and Development,
University of British Columbia, Vancouver, Canada, 3Department
of Educational Development, Maastricht University, The
Netherlands, 4Department of Physiology, University of Alcala,
Madrid, Spain, 5Medical School of Quebec, Montreal, Canada, 6Royal Australasian College of Physicians, Sydney, Australia, 7Centre
for Health Informatics and Multiprofessional Education, University
College London, UK, 8Royal College of Physicians, London, UK
Correspondence: Dr Stephen R Lew, Royal Australian College of
General Practitioners, 1 Palmerston Crescent, South Melbourne,
Victoria, Australia. Tel.: 00 61 3 9214 1409; Fax: 00 61 3 9214 1583;
E-mail: [email protected]
Papers from the 10th Cambridge Conference
© Blackwell Science Ltd MEDICAL EDUCATION 2002;36:936–941
The conceptual basis for making practice perform-
ance assessment fair and defensible is described else-
where in this journal.2 In this article we will propose
practical guidelines for developing fair and defensible
practice performance assessment programmes. These
guidelines will be presented in the form of questions
that must be addressed in the course of developing such
a system.
Setting up a framework
We propose to develop a document or project plan for
the purpose of setting up a practice performance
assessment. This document includes a description of
the rationale for each decision on the set-up of the
assessment. It serves then as a document to explain and
defend the procedures to all stakeholders. We therefore
aim to achieve defensibility by openness and clarity
about the aims, procedures and consequences of the
assessment. In this framework the document may be
seen as the project plan and the assessment as the
project. A framework of questions may guide designers
of practice performance assessment programmes
through their planning. These questions can be categ-
orised into three groups (Table 1):
1 purposes and outcomes;
2 planning the programme, and
3 processes.
Purposes and outcomes
What are the purposes of the assessment? Whose purposes
are being met? Are the purposes clearly stated and made
accessible prior to implementation of the programme?
These are the first and most important questions to
address in designing a practice performance assess-
ment. There are some obvious purposes of the assess-
ment, such as identifying aspects of performance that
should be improved,3 establishing that acceptable
standards of practice are met,4 providing feedback to
the doctor,5,6 and selection into educational pro-
grammes (Table 2). However, there may also exist
hidden agendas involving aims such as workforce
manipulation or power plays. Therefore, while delin-
eating the purpose of the assessment in a defined
context and for an agreed purpose, a parallel process of
determining the needs and desires of the stakeholders
must also be undertaken. Patients, doctors, licensing
authorities, certifying bodies, professional groups (e.g.
colleges and other professional associations), hospitals,
funders and regional authorities are all likely to have
different and valuable viewpoints on the purposes to be
served by a practice performance assessment.7
The purposes of the assessment therefore need to be
unambiguously stated. They must then be clearly
understood by stakeholders. This requires an adequate
amount of time for publicising the purposes prior to
implementation of the programme.
Key learning points
• Performance assessment is becoming more topical
and is likely to face more scrutiny and challenge.
• In order to establish fair and defensible
programmes for performance assessment,
programme designers should address issues relating
to purposes and outcomes of the programme,
planning the programme and the processes which
enable the programme to run.
• A framework is proposed to assist programme
designers in developing and implementing their
own practice performance programmes.
Table 1 Areas to address in setting up a framework for performance assessment
Purposes and outcomes
1 What are the purposes of the assessment?
  Whose purposes are being met?
  Are the purposes clearly stated and accessible prior to implementation of the programme?
2 What is the regulatory structure of the assessment and what are the possible outcomes and consequences of the decisions?
  What does the assessor expect to learn from assessing the doctor’s performance?
Planning the programme
1 What steps were taken in planning the assessment programme to ensure its fairness and defensibility?
  Is the plan clearly described?
2 Who are the assessees?
  Who are the assessors?
  How are the judgements determined?
3 How were the assessors chosen?
4 How is the sampling done?
5 Why were these instruments chosen?
  What is known about their technical characteristics?
6 How is the standard set?
Processes
1 What steps were taken in the process of administering the assessment to ensure its fairness and defensibility?
  • Communication and preparation of assessees
  • Preparation of assessors
  • Time allocations
  • Equity and security
  • Ability to appeal
  • How are the assessees supported through the entire process?
  • Cost
What is the regulatory structure of the assessment, and what
are the possible outcomes and consequences of the decisions?
What does an assessor expect to learn from assessing a
doctor’s performance?
The answers to these questions are directly linked with
the purpose of the assessment and will have direct
relationships with how defensible the assessment needs
to be. Regulatory structure implies that project planners
should be concerned with addressing issues such as
disjunctive and conjunctive combinations (i.e. can
different methods compensate for each other or should
the assessee pass them all?). Furthermore, issues such
as provision for repeating tests, remediation and
appeals need to be made clear. Other important issues
include the method by which scores are to be combined
and whether any weighting is to be applied.
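The difference between these combination rules can be made concrete with a minimal sketch. It is illustrative only: the instrument names, scores, weights and cut scores below are hypothetical, and the compensatory rule is just one way of letting methods offset each other (the case the text describes as methods compensating for one another).

```python
from typing import Dict


def conjunctive_pass(scores: Dict[str, float], cut_scores: Dict[str, float]) -> bool:
    """Pass only if every instrument's score meets that instrument's own cut score."""
    return all(scores[name] >= cut_scores[name] for name in cut_scores)


def compensatory_pass(
    scores: Dict[str, float], weights: Dict[str, float], overall_cut: float
) -> bool:
    """Pass if the weighted mean across instruments meets one overall cut score.

    A strong result on one instrument can offset a weak result on another.
    """
    total_weight = sum(weights.values())
    weighted_mean = sum(scores[name] * weights[name] for name in weights) / total_weight
    return weighted_mean >= overall_cut


# Hypothetical assessee with three hypothetical instruments (scores out of 100).
scores = {"record_review": 72, "peer_rating": 55, "patient_survey": 80}
cut_scores = {"record_review": 60, "peer_rating": 60, "patient_survey": 60}
weights = {"record_review": 2, "peer_rating": 1, "patient_survey": 1}

print(conjunctive_pass(scores, cut_scores))      # fails: peer_rating is below its cut
print(compensatory_pass(scores, weights, 60.0))  # passes: weighted mean is 69.75
```

The same assessee fails under one rule and passes under the other, which is why the rule, the weights and the provisions for repeats and appeals need to be stated in the project plan before any data are gathered.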
The possible outcomes and decisions should be
addressed. For example, where the purpose of the
assessment is to provide an environment of continuous
quality improvement with a focus on self-assessment
and voluntary participation, the assessment of educa-
tional needs may represent one of the principal
outcomes; in this case, high levels of physician partici-
pation and satisfaction would represent an indicator of
success.8 If there are no ‘life and death’ issues surrounding
participation, it is unlikely that disputes and
legal challenges will result. If the purpose is to deter-
mine the ability to continue to practise medicine, the
outcome may be to identify doctors whose performance
is poor or potentially poor. Assessment processes may
search for reasons for poor performance and result in
assisting doctors to improve in selected areas, restrict-
ing their practice to ensure the safety of the community,
or removing licensure altogether. Clearly, the conse-
quences of these outcomes are considerable.
Planning the programme
What steps were taken in planning the assessment pro-
gramme to ensure its fairness and defensibility? Is the plan
clearly described?
As described previously, a performance assessment
requires judgements to be made with the aid of a
combination of instruments, each set within the context
of everyday practice. Capturing everything is often not
feasible, but the programme should be set up in such a
way that it contains a fair and defensible combination of
the methods. Planning defensible performance assess-
ments must therefore focus on factors relating to the
quality of the judgements and to the quality of the
instruments.1
Who are the assessees? Who are the assessors? How are the
judgements determined?
Before discussing the assessors and the instruments, an
important initial step is to describe the assessees and
how they came to be assessed. Are they referred for
assessment or are they self-referred? If they are referred
by regulatory authorities, doctors may be reluctant
participants, whereas if they are self-referred they may
be highly motivated. In what ways are they similar and
different from each other – for example, by first
language, country of origin, working conditions, prac-
tice profile and practice location? Do they have special
needs that require attention, such as assistance with
language, cultural understanding or medical care?
How were the assessors selected?
The role of the assessors is essential, so they must be
selected carefully. It is relevant to define whether the
assessors are meant to be peers of the assessees, or
external ‘experts’ representing designated or desired
standards of practice. If attempts have been made to
ensure that they are representative, how was their
‘representativeness’ determined (taking into account
the assessors’ practices, their age, gender, race, religion,
culture)? If the assessors are defined as external experts,
how was their expert status determined or learned? It is
important to establish whether the judges are expert
enough in judgmental procedures to be judges. If, for
example, more senior or academic judges are used, how
can it be determined that they provide a valid assess-
ment? Further questions concern the number of judges
selected and the evidence supporting the choice of the
number of judges.
Care must be taken that the context is sufficiently
enabling for judges to arrive at fair judgements. They
will need to feel unconstrained by time, money, space
and personal and professional pressures.
Table 2 Possible purposes of practice performance assessment
1 Identifying aspects of performance that should be improved
2 Determining that acceptable standards of practice are met
3 Providing feedback to the doctor
4 Selection into educational programmes
How is the sampling done?
Sampling refers to the process of testing or judging a
whole by taking a specimen or collection of specimens. In
terms of performance assessment, adequate sampling is
critical for reducing bias and error, and for achieving high
validity and reliability.2 When the performance assessment
was designed, how was the sample size determined?
Are the quality and quantity of samples sufficient? Do
they include reference to different domains of practice
and sufficient sampling times? What method is used to
determine the sample size for the different instruments?
What evidence supports the choice of sample size? Are
the sample sizes different for the different methods? The
Royal Australian College of General Practitioners (RACGP),
for example, offers Practice-Based Assessment,
a performance-based assessment programme, as an
alternative to the College Examination, for the purpose
of certification to practise as a general practitioner in
Australia.8 This programme samples using a blueprint
derived from national general practice data covering
reasons for encounter, age groups and gender distributions,9
the RACGP domains of general practice10 and the
International Classification of Primary Care.
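Blueprint-based sampling of this kind can be sketched in a few lines. The sketch below is an assumption-laden illustration, not the RACGP procedure: the function name `sample_by_blueprint`, the age-group categories and the quotas are all hypothetical. It draws the number of encounters the blueprint asks for from each category and reports any category where too few encounters are available.

```python
import random
from collections import defaultdict


def sample_by_blueprint(encounters, blueprint, key, seed=0):
    """Draw the blueprint's quota of encounters from each category.

    encounters: list of dicts describing patient encounters (hypothetical format).
    blueprint:  dict mapping category -> number of encounters required.
    key:        the encounter field that holds the category.
    Returns (sample, shortfalls), where shortfalls maps a category to the
    number of encounters that could not be found for it.
    """
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced and audited
    by_category = defaultdict(list)
    for encounter in encounters:
        by_category[encounter[key]].append(encounter)

    sample, shortfalls = [], {}
    for category, needed in blueprint.items():
        available = by_category.get(category, [])
        if len(available) < needed:
            shortfalls[category] = needed - len(available)
        sample.extend(rng.sample(available, min(needed, len(available))))
    return sample, shortfalls


# Hypothetical practice data: many younger patients, only one aged 65 or over.
encounters = (
    [{"id": i, "age_group": "0-14"} for i in range(5)]
    + [{"id": 10 + i, "age_group": "15-64"} for i in range(5)]
    + [{"id": 20, "age_group": "65+"}]
)
blueprint = {"0-14": 2, "15-64": 3, "65+": 2}

sample, shortfalls = sample_by_blueprint(encounters, blueprint, key="age_group")
print(len(sample), shortfalls)
```

Reporting shortfalls rather than silently under-sampling gives the programme a record of where an assessee's practice profile did not match the blueprint, which bears directly on whether the sample can be defended.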
Why were these instruments chosen? What is known about
their technical characteristics?
There will be limits to how many instruments can be
used, whether due to cost, feasibility or time. Yet the
selection of instruments is critical to achieve adequate
sampling of the field of practice.11 How then is the
combination of instruments determined? Do they
collectively sample all aspects of practice performance
adequately?
Important issues to consider in selecting the instru-
ments are:
• Their internal validity: do they measure what they
purport to measure?
• Their reliability: are the results internally consistent
and reproducible? What approach to reliability is used?
• Their cost: how are the costs derived and is the cost
acceptable to stakeholders? What measures are taken
to avoid incurring unnecessary costs?
• Their feasibility and acceptability: is the nature of the
assessment acceptable to stakeholders and is their
implementation feasible? Do the instruments have
any shortcomings, and if so, how are they compen-
sated for, minimised or eliminated?
• Previous experience: have these instruments been
used previously and in what context? Has the benefit
of others’ experience with the instruments been made
available?
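As one illustration of an approach to reliability, internal consistency is often summarised with Cronbach's alpha. The paper does not prescribe this statistic; the sketch below simply assumes that each assessee has a numeric score on each of several items and uses only the Python standard library.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a score matrix.

    item_scores: list of rows, one row per assessee, one score per item.
    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    if len(item_scores) < 2:
        raise ValueError("need at least two assessees")
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("need at least two items")
    if any(len(row) != k for row in item_scores):
        raise ValueError("every assessee needs a score on every item")

    # Sample variance of each item across assessees, and of the total scores.
    item_variances = [variance([row[i] for row in item_scores]) for i in range(k)]
    total_variance = variance([sum(row) for row in item_scores])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings of four assessees on three items.
ratings = [[3, 4, 3], [5, 5, 4], [2, 3, 2], [4, 4, 5]]
print(round(cronbach_alpha(ratings), 3))
```

Alpha speaks only to internal consistency. Reproducibility across assessors or occasions would need a different analysis, so the project plan should say which approach was used and why.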
Is the combination of different methods sufficient to
cover the areas of practice which are the subject of the
assessment? Are the assessment methods used suffi-
ciently tailored to the context of the assessee? For
example, are they culturally appropriate? Do they take
account of the assessee’s location, practice profile and
patient profile?
How is the standard set?
Standard setting is the process by which a standard of
performance is defined relative to a minimally accept-
able level of performance, which in turn is translated
into a ‘passing score’ used to divide candidates into
those whose performance is acceptable and those whose
performance is not acceptable. Whereas many methods
have been used in tests of competence over the past few
decades, setting performance standards is a relatively
new concept in the realm of practice performance. How
then are the decisions on satisfactory/unsatisfactory
performance determined? Are the methods used to set
standards in tests of competence applicable in the
context of performance assessment? How can decisions
related to required standards be justified?
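To make the idea of translating a minimally acceptable level into a passing score concrete, the sketch below uses an Angoff-style procedure, one of the methods long used in tests of competence. Whether it transfers to practice performance is exactly the open question raised above, so this is an illustration under assumptions rather than a recommendation. Each judge estimates, for each item, the probability that a minimally acceptable doctor would meet it; the judge data are hypothetical.

```python
from statistics import mean


def angoff_cut_score(judge_estimates):
    """Angoff-style passing score.

    judge_estimates: one list per judge, holding for each item the judge's
    estimated probability (0 to 1) that a minimally acceptable candidate
    would meet that item. The cut score is the mean, across judges, of each
    judge's summed estimates.
    """
    n_items = len(judge_estimates[0])
    if any(len(estimates) != n_items for estimates in judge_estimates):
        raise ValueError("every judge must rate every item")
    return mean(sum(estimates) for estimates in judge_estimates)


def classify(score, cut_score):
    """Divide candidates into acceptable and not acceptable at the cut score."""
    return "acceptable" if score >= cut_score else "not acceptable"


# Two hypothetical judges rating four hypothetical items.
judges = [[0.5, 0.5, 1.0, 1.0], [0.5, 0.5, 0.5, 0.5]]
cut = angoff_cut_score(judges)
print(cut, classify(3, cut), classify(2, cut))
```

Keeping the individual judges' estimates, and not only the final cut score, leaves an audit trail that can be used to justify the standard if a decision is challenged.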
Processes
In our previous discussion, we outlined the need for
clear purposes and outcomes, and sufficient planning
and selection of elements of the assessment. However,
in order to ensure that the programme runs smoothly, a
robust and reliable administrative structure is essential.
Some key aspects of the administration include com-
munication and preparation of assessees, preparation of
assessors, time allocation, equity and security, ability to
appeal, support for the assessees and cost.
What steps were taken in the process of administering
the assessment to ensure its fairness and defensibility?
Communication and preparation of assessees
Questions pertaining to how doctors who will be
assessed are prepared should be addressed as part of
the project plan. This may be accomplished by a
combination of written materials, electronic materials
and personal contact. Do they receive explanatory
documentation that, according to assessees, is easy to
understand? What information is available to assessees
in order to help them prepare? How are stress-related
problems experienced by assessees during assessment
taken into account? Are there provisions for resched-
uling an assessment in the event of illness or misad-
venture?
Preparation of assessors
Preparation of the assessors is an important element in
assuring the quality of the assessment. It is advisable to
describe how assessors are trained. What prerequisite
conditions must assessors meet? What opportunities
exist for assessors to practise assessment techniques?
Are assessors specifically trained in cultural sensitivity?
What quality assurance measures are in place for
assessors? How are the assessors instructed to interact
with the assessee? Such interactions may be formal or
informal, and may vary according to the possible
consequences of the assessment. However, interactions
between assessors and assessees should be uniform for
the same types of assessment.
Time allocations
Is there sufficient time for assessees to prepare? Is there
sufficient time for assessors to complete assessments
and for judges to decide on performance?
Equity and security
Aspects relating to equity and security are likely to be
challenged if they are not dealt with sufficiently. The
following issues should be considered. Are there
opportunities for assessor discussions with other col-
leagues that may colour their judgement? Is there a
system to allow assessors to opt out of assessing
particular individuals, such as when their objectivity is
compromised or may be perceived as being compro-
mised? How are the security and confidentiality of the
records of the assessment ensured to avoid theft or
tampering with material? How will the confidentiality of
any patient records used be maintained? Should asses-
sors and assessees be matched, and if so, on what basis
(e.g. race, gender, etc.)? What processes are in place to
ensure no prejudice, bias or discrimination interfere
with the assessment process?
Ability to appeal
A fair and defensible process should ensure that the
right to appeal exists and is explicitly stated. Under
what circumstances may an assessee appeal against a
judgement? Is the appeal process clearly stated and
easily accessible to the assessee?
How are assessees supported through the entire process?
Experience with practice performance assessment
shows that it may be a very threatening process for
the assessee, particularly if the stakes are high. While
addressing all of the above questions is important,
practice performance assessment must also have struc-
tures in place to support assessees throughout the entire
process of the assessment, from notification through to
preparation and beyond the completion of the assess-
ment itself. Whereas the organisation conducting the
assessment may not itself provide all of these supports,
it is important to consider what notification and
preparation will be given to assessees, how they will
access accurate information about the assessment, how
and when they can obtain feedback, the circumstances
in which they may request special assistance or appeal,
and importantly, how they deal with the results of their
assessment and where they can go for remediation,
further education and counselling as required.
Cost
The project plan must clearly outline the costs. How is
the cost of the assessment derived? Some of these costs
will be overt, while others will be covert. Overt costs
include payments to be made to the organising body
and costs for hiring equipment such as videotape
recorders. Covert costs may be no less substantial,
and include staff costs, mailing costs and lost consulting
time or leisure time. A cost-benefit analysis should be
made as part of the process of cost stipulation. Is the
effort required a reasonable cost considering the
resources, expertise and labour involved, as well as
the outcomes the programme delivers?
Conclusions and future directions
Programmes for assessing the performance of practising
doctors and for identifying poorly performing doctors
continue to gain momentum internationally. The
framework presented in this paper offers a way forward
in assisting designers of such assessment programmes.
We have posed questions to guide the design of these
programmes, but we do not assume to have covered all
possible questions. The questions we have posed
represent the key issues to consider in developing and
implementing fair and defensible programmes of prac-
tice performance assessment. Research and consulta-
tion are required to support the validity of the questions
we have posed.
Contributors
All authors contributed equally to the discussions
undertaken during the 10th Cambridge Conference
on Medical Education that led to the writing of this
article. In writing the paper SRL took main responsi-
bility for preparing the draft, co-ordinating input from
the other authors and writing the final version of the
paper. LWTS and GGP reviewed the first drafts and
their input was used to write subsequent drafts. All
other authors made valuable comments and suggestions
with respect to these drafts. Their input has led to the
final version of the paper.
Acknowledgements
Grateful acknowledgement is made to the sponsors of
the 10th Cambridge Conference: the Medical Council
of Canada, the Smith & Nephew Foundation, the
American Board of Internal Medicine, the National
Board of Medical Examiners and the Royal College of
Physicians.
Funding
No external funding was sought for the production of
this paper.
References
1 Southgate LJ, Cox J, David T, Hatch D, Howes A, Johnson N
et al. The General Medical Council’s Performance Proce-
dures: peer review of performance in the workplace. Med Educ
2001;35:9–19.
2 Schuwirth LWT, Southgate LJ, Page GG, Paget NS, Lescop
JMJ, Lew SR et al. When enough is enough: a conceptual
basis for fair and defensible practice performance assessment.
Med Educ 2002;36:925–930.
3 Southgate LJ, Dauphinee DW. Maintaining standards in
British and Canadian medicine: the developing role of the
regulatory body. BMJ 1998;316:697–700.
4 Royal Australian College of General Practitioners. Standards
for General Practices. 2nd edn. Melbourne: RACGP; 2000.
5 Newble DI, Paget NS. The maintenance of professional
standards programmes of the Royal Australasian College of
Physicians. J R Coll Physicians Lond 1996;30:252–6.
6 Cunnington JPW, Hanna E, Turnbull J, Kaigas T, Norman
G. Defensible assessment of the competency of the practising
physician. Acad Med 1997;72:9–12.
7 Southgate LJ, Hays RB, Norcini JJ, Mulholland H, Ayers B,
Wooliscroft J et al. Setting performance standards for medical
practice: a theoretical framework. Med Educ 2001;35:474–81.
8 Royal Australian College of General Practitioners. Practice
Based Assessment. A Handbook for Candidates and Examiners
2002. Melbourne: RACGP; 2002.
9 Britt H, Miller GC, Charles J, Knox S, Sayer GP, Valenti L et
al. General practice activity in Australia 1999–2000. Canberra:
Australian Institute of Health and Welfare (General Practice
Series no. 5); 2000.
10 Royal Australian College of General Practitioners. RACGP
Training Program Curriculum. 2nd edn. Melbourne: RACGP;
1999.
11 Hays RB, Davies H, Caldon L, Farmer EA, Finucane P,
McRorie P et al. Selecting performance assessment methods.
Med Educ 2002;36:910–917.
Received 12 March 2002; editorial comments to authors 16 May 2002;
accepted for publication 9 July 2002