
Improving safety in medical devices and systems

Harold Thimbleby
College of Science
University of Swansea, Wales
Email: [email protected]

Abstract—We need to improve healthcare technologies — electronic patient records, medical devices — by reducing use error and, in particular, unnoticed errors, since unnoticed errors cannot be managed by clinicians to reduce patient harm. Every system we have examined has multiple opportunities for safer design, suggesting a safety scoring system.

Making safety scores visible will enable all stakeholders (regulators, procurers, clinicians, incident investigators, journalists, and of course patients) to be more informed, and hence put pressure on manufacturers to improve design safety. In the longer run, safety scores will need to evolve, both to accommodate manufacturers improving device safety and to accommodate insights from further research in design-induced error.

I. INTRODUCTION

Approximately 11% of patients in UK hospitals suffer adverse events, half are preventable, and about a third lead to moderate or greater disability or death [40]. Medication errors seem to be one of the most preventable forms of error: more than 1 in 6 (17%) of medication errors involve miscalculation of doses, incorrect expression of units or incorrect administration rates [22]. The Institute of Healthcare Improvement’s “global trigger tool” suggests adverse events may be ten times higher than previously assumed [4]. These figures come from research in different countries with different methodologies and assumptions, and suffer from a lack of reliable information [39], but there is general agreement that preventable mortality is numerically comparable to road accident fatality rates [18]. However, focusing on preventable errors leading to fatalities is a mistake; any patient harm leads to unnecessary problems [4], from longer stays in hospitals, suffering — to patients and carers, including the clinicians — and additional social costs. We should focus on harm, not on error, since some errors do not lead to harm, and some errors can be mitigated; indeed this is an important distinction this paper will develop.

It is tempting, but wrong, to automatically blame the hospital staff [2]. One would imagine that computers, whether embedded in devices or in Health IT systems, would be part of the solution. Unlike calls to improve training or other human processes, if we can improve technology, then everybody can benefit. Yet mortality rates may double when computerized patient record systems are introduced [9]. Healthcare is now widely recognized as turning into an IT problem [15], and we clearly need informed, well-founded principles and strategies to improve safety.

In pharmaceuticals, it has long been recognized that unanticipated side-effects are unavoidable and often unexpected, so it is obvious that a rigorous process of translation and regulation, as well as open scientific scrutiny, is required before promising laboratory products are approved for clinical use [7]. So-called “use error” should be considered the unavoidable and expected side-effect of interactive healthcare technologies, particularly complex computer-based systems such as IT systems, infusion pumps, linear accelerators and so forth. In other words, we need a rigorous approach to developing technological interventions.

Draft — comments welcome. To appear in Proceedings IEEE International Conference on Healthcare Informatics (ICHI), Philadelphia, USA, 2013. http://www.ischool.drexel.edu/ichi2013

This paper provides a brief overview of some representative use errors, their causes, consequences and the culture in which they lie, then the paper moves to a theoretical and practical plan of action. We argue that a new approach can, over time, reduce patient harm and improve the experience of clinicians, but we need new ways to overcome deep cultural barriers to improvement. Although I have a personal bias — I want better interaction design during product development (my team’s research, e.g., [3], [28], [37], suggests ways of achieving significant reductions in patient harm) — the suggestions in this paper do not presuppose any particular methodology to achieve the improvements that are required, merely that we make safety visible. Details will be presented later.

Note. Throughout this paper I refer to “use error” and not “user error.” As Reason’s Swiss Cheese model makes clear [30] — and as illustrated in our examples below — the user is only a small part of a larger system, and in fact only exceptionally is the user the root cause of any adverse incident. Users are rarely even the final defence against harm; medical devices are usually the last line of defence. Just because users, like nurses and other clinicians, might be the most salient part of an incident (they are the final agent with free will, who made a choice that in hindsight might seem erroneous), that does not mean our language should implicitly burden them with the prejudice of user error. The error appears to become visible during use, so call it use error — but in fact the appearance is a symptom of the latent errors, the incoherence between the user and the system’s design, and the appropriate use (or not) of other information such as patient records and vital signs.

II. EXAMPLES OF DESIGN-INDUCED ERROR

It is a common design defect for a system (whether a PC or handheld device) to ignore use errors and turn them into “valid” actions that almost certainly are not what the user could have possibly intended or reasonably wanted.

A. Two radiotherapy examples

Radiotherapy involves complex calculations. Originally radiographers at the UK North Staffordshire Royal Infirmary made manual adjustments to dose, but they continued to make them after their computer system was modified to make the adjustments itself. Consequently, from 1982 to 1991 just under 1,000 patients undergoing radiotherapy received under-doses. At the Beatson Oncology Centre (BOC) in Glasgow, unfortunately the reverse happened in 2005. The Varis 7 software was upgraded and the full implications were not reviewed, and the original paper forms for performing calculations continued to be used. The consequence was that a single patient, Lisa Norris, was overdosed in a series of 19 treatments all based on the same calculation. The report [16] says:

“Changing to the new Varis 7 introduced a specific feature that if selected by the treatment planner, changed the nature of the data in the Eclipse treatment Plan Report relative to that in similar reports prior to the May 2005 upgrade . . . the outcome was that the figure entered on the planning form for one of the critical treatment delivery parameters was significantly higher than the figure that should have been used. . . . the error was not identified in the checking process . . . the setting used for each of the first 19 treatments [of Lisa Norris] was therefore too high . . . ” [6–10, p.ii]

“It should be noted that at no point in the investigation was it deemed necessary to discuss the incident with the suppliers of the equipment [Varis 7, Eclipse and RTChart] since there was no suggestion that these products contributed to the error.” [2.7, p.2]

This appears to be saying that whatever a computer system does, it does not contribute to error provided it was not operated by somebody as intended. Indeed, the report dismisses examining the design of the Varis 7 (or even why an active piece of medical equipment needs a software upgrade) and instead concentrates on the management, supervision and competence of the radiographer who made “the critical error” [10.4, p.43]. It appears that nobody evaluated the clinical safety of the new Varis 7 [6.21, p.24], despite an internal memorandum some months earlier querying unclear control of purchased software [6.22, p.24].

Sadly, although immediately surviving the overdose, Lisa Norris died. The report [16] was published just after her death.

B. Throwing away user keystrokes I

In these two radiotherapy examples (above), computer programs have contributed to adverse incidents, but the reports on the incidents are hard to interpret reliably, in large part because the computer programs are above suspicion, and in the second case do not record user actions, which would help establish what users actually did, rather than what the devices they control actually did. In contrast, the case of Grete Fossbakk is much clearer. Fossbakk’s case does not involve a clinical event, but a financial loss that was, at the last step, averted. Had she been using a clinical system, the consequence might have been an incorrect patient number rather than an incorrect bank account number.

In 2008, Grete Fossbakk transferred 500,000 kroner to her daughter using the web interface to her Union Bank of Northern Norway account [29]. Unfortunately, she admits, she mis-keyed her daughter’s bank account number and a repeated 5 in the middle of the account number made it too long. The Union Bank of Northern Norway’s web site then silently truncated the erroneous number, and this new number (which was not the number Fossbakk had keyed) happened to match an existing account number. The person who received the unexpected 500,000 kr spent it. Only on the steps to the court room did the bank relent and refund Fossbakk.

An interesting moral of this story is that, despite it being well known that users make typing errors, the bank did nothing to detect the error — the web site simply discarded the excess characters; nor did they record the user’s keystrokes, and therefore when they asked Fossbakk to prove they had made an error, they knew she would be unable to prove the error was caused or exacerbated by their design choices. Indeed, she had (after the web site ignored her error) compounded the problem by confirming the wrong number (by clicking [OK] or something).

In 2013, some five years later, I looked at my own Lloyd’s Bank web site for transferring money to another account. It uses basic HTML to discard all but the first 8 characters of an account number. It will therefore ignore the same type of error Grete Fossbakk made and, again with no keystroke logging, a user would be hard-pressed to prove the bank made (or compounded) any error.

In both cases — in 2008 and now — the way a user interface is designed ignores errors that can easily be detected by the computer; worse, they are programmed to turn such errors into potentially serious unintentional errors (i.e., unintended bank account numbers), that may then not be detected by the user. In my Lloyd’s Bank web site, there is explicit program code to ignore excess characters in bank account numbers.

Yet programming a web site to detect erroneous key presses and to log them in case of dispute is absolutely trivial. One is forced to conclude banks do not know basic interactive system design principles, or they do not know about the Fossbakk case, or they find the additional costs of programming properly excessive compared to their liabilities, or possibly, even if they understand the issues, they do not have the competence to program a web site properly. Any combination is possible, though all possibilities imply the bank does not care sufficiently to employ sufficiently competent people to do a thorough job. Possibly the courts, too, are ignorant about basic programming. If a bank does not collect keystroke logs it ought to undermine their defence, as it suggests they planned not to collect material facts that might help uncover the truth and perhaps help their customers. One wonders whether an airplane manufacturer who refused to put black boxes in their planes would be allowed to continue to operate.
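
To make the point concrete, here is a minimal sketch in Python (the function validate_account_number, the KEYSTROKE_LOG list and the 11-digit length are illustrative assumptions, not any bank's actual code): it rejects a too-long account number instead of truncating it, and keeps an auditable keystroke log.

# Minimal sketch (not any bank's real code): validate an account number
# instead of silently truncating it, and keep a keystroke log for audit.
import time

KEYSTROKE_LOG = []          # in practice this would be persistent storage

def record_key(field, key):
    """Append every key press, with a timestamp, so disputes can be settled."""
    KEYSTROKE_LOG.append((time.time(), field, key))

def validate_account_number(digits, expected_length=11):
    """Assumes an 11-digit account number (as in Norway); anything else is a use error."""
    if not digits.isdigit():
        raise ValueError("account number must contain digits only")
    if len(digits) != expected_length:
        raise ValueError(
            f"account number has {len(digits)} digits, expected {expected_length}")
    return digits

# Usage: a repeated digit makes the number too long; the error is reported,
# not silently 'repaired' by truncation.
typed = "071253555512345"      # hypothetical mis-keyed number, too long
for k in typed:
    record_key("to_account", k)
try:
    validate_account_number(typed)
except ValueError as e:
    print("Please check the account number:", e)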

Keystroke errors like Fossbakk’s happen, and better programming could detect, avoid or significantly reduce the chances of them turning into adverse outcomes. Moreover, for whatever reasons, the programmers involved seem content to leave things so that the user takes on the liabilities for errors. Medical systems are no different in these respects. We have found medical devices do not record adequate logs [21], they ignore errors [33], [36], [37], and so on.

Many errors with medical devices are hard to notice: small errors may have trivial consequences; consequences of errors may not be visible for some minutes or hours; users may be unwilling to report large errors; and patients may have co-morbidities, so it may be hard to attribute ill-health or death to a specific cause — for all these reasons, there should be a high level of care in design. Yet many devices do not provide a key click sound (or the sound can be turned off), so a missed key press or a double key press (e.g., a key bounce) may go unnoticed.

C. Throwing away user keystrokes II

Handheld calculators are used throughout healthcare, from drug dose calculations and burns resuscitation to radiotherapy, and so forth. Calculators are an easier example to discuss than many more complex medical devices, say, infusion pumps, since their purpose is well-defined: to do arithmetic dependably.

Many infusion pumps include calculators, though the examples we will show here are just about number entry and display, problems they share with all known number-dependent systems. We’ve reviewed calculator problems elsewhere [32], [36], [37], so two brief examples will serve for the present paper (in this section and the next).

I live in Wales, and I am interested in what proportion of the world’s population is Welsh. I therefore use a calculator to find out 3,063,500 ÷ 6,973,738,433, which is the population of Wales divided by the population of the world (being numbers I got off the web, so they must be right). I obtain the following results (ignoring least significant digits):

Casio HS-8V: 0.04 . . .
Apple iPhone portrait: 0.004 . . .
Apple iPhone landscape: 0.0004 . . .

These are all market-leading products, yet none of these calculators reports an error — only the last is correct. Whatever is going on inside the Apple iPhone, it could clearly report an error since it provides two different answers even if it doesn’t know which one is right!

The first electronic calculators appeared in the 1960s, and Hewlett-Packard’s first calculator was made in 1968. We are no longer constrained by technology, and we’ve had some fifty years to get their designs right; it is hard to understand why calculators used in healthcare are not more dependable.

D. Unreliable error correction

Users make mistakes, and a common approach is to provide a delete key, or perhaps an undo key. The idea is familiar from desktop systems, but devices tend to implement these error-correcting functions in broken ways. If the user corrects an error, the correction process needs to be reliable. We will show that even “high quality” calculators behave in bizarre ways, and this can only encourage further use errors. Indeed, the length and intricacy needed in this section of the paper to describe a (completely unnecessary) bug suggests how confusing a user in a clinical environment would find the problem.

On the Hewlett-Packard EasyCalc 100, the delete key behaves in a peculiar way. If the user keys [1] [0] [0] [DEL] they will get 10; the [DEL] deletes keystrokes. Yet if they key [0] [•] [•] [DEL] [5] they might be expecting 0.5, but they will get 5, ten times out. The [DEL] simply ignores decimal points — instead, it only deletes the digits. In fact, it does not even do that reliably; if the user enters a number with more than 12 digits, the calculator shows E because it cannot handle any more digits — this is a large number of digits, so this behavior is not only reasonable, but welcome (it is not present on the Casio and Apple calculators discussed above). Yet if the user continues to key digits, the [DEL] key does not delete those digits, but the twelfth digit pressed. In other words, when too many digits have been pressed, the calculator ignores what the user is doing, so [DEL] deletes the last key the calculator paid attention to. Unfortunately, pressing [DEL] corrects the error of entering too many digits, so if the user keyed 1234567890123456 [DEL] the result would be 12345678901 not 123456789012345.

That confusing example is easier to understand by writing a to mean 11 digit keystrokes and b to mean 2345 (or in fact, any sequence of digits); thus keying a b [6] [DEL] is treated as a, that is, ignoring all of the user’s keystrokes b. One might have expected pressing [DEL] to delete the most recent keystroke, in this case [6], hence correcting the user’s keypresses to a b — which is how [DEL] works everywhere else in the rest of the world. Once we see the problem, of course there is an explanation: when the calculator’s display is full, the calculator ignores keystrokes, but perversely [DEL] works regardless — but because the calculator has silently ignored keystrokes, it deletes older keystrokes.

It is not unreasonable that the calculator’s display is limited; in fact 12 digits is generous as things go (the LifeCare 4100 LED displays only 4 digits, dropping any fifth digit [1]; this is explained in the user manual, so presumably is a deliberate design choice). There are various sensible solutions: the simplest is that the calculator displays ERROR and locks up when the display is full — since there has been an error, the user should notice the error and, recognising it as such, have a chance to correct it, say by pressing [AC]; the calculator could also count how many digits are keyed and how many deletes, and unlock when digits − deletes ≤ 12.
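
As a concrete illustration of the simpler fix, here is a minimal sketch in Python (the class name NumberEntry and the 12-digit limit are illustrative, not any manufacturer's code): overflow and a second decimal point are flagged as errors that lock the display until [AC], and [DEL] always removes the most recent accepted keystroke.

# Sketch of a number-entry buffer that treats overflow as an error and
# makes DEL delete the most recent keystroke, including decimal points.
MAX_DIGITS = 12   # display width, as on the calculator discussed above

class NumberEntry:
    def __init__(self):
        self.keys = []        # accepted keystrokes: digits and "."
        self.error = False    # once set, only AC clears it

    def press(self, key):
        if self.error and key != "AC":
            return "ERROR"                       # locked until AC is pressed
        if key == "AC":
            self.keys, self.error = [], False
        elif key == "DEL":
            if self.keys:
                self.keys.pop()                  # always the most recent keystroke
        elif key == ".":
            if "." in self.keys:
                self.error = True                # second decimal point is an error
            else:
                self.keys.append(key)
        elif key.isdigit():
            if sum(k.isdigit() for k in self.keys) >= MAX_DIGITS:
                self.error = True                # overflow is an error, not ignored
            else:
                self.keys.append(key)
        return "ERROR" if self.error else ("".join(self.keys) or "0")

# Keying 0 . . DEL 5 now flags an error at the second ".", instead of being
# silently turned into 5 (ten times the intended 0.5).
entry = NumberEntry()
for k in ["0", ".", ".", "DEL", "5"]:
    display = entry.press(k)
print(display)   # ERROR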

Interestingly, the [DEL] on another Hewlett-Packard calculator, the 20S, behaves quite differently: [2] [•] [DEL] [5] will be 25 (the EasyCalc would get 5), and [2] [•] [•] [DEL] [5] would be 5 rather than 2.5.

In short, the key provided to help correct errors behaves like a conventional delete key much of the time, but not consistently, and differently on different models even from the same manufacturer. An inconsistent error correction function has to be worse than a consistent one, and one that varies across the same manufacturer’s models has to be even worse.

Similar correction features are provided on some medical devices (and of course calculators are used widely in healthcare); unlike most calculators, though, medical devices often keep a log of user actions. It is worrying to think that a user could make a keying slip, correct it, but the device would record and act on an unrelated number — because the error-correction function is badly designed. In one of the examples explained above, a nurse might have entered 0.5 (say, milliliters of morphine), but because of the broken implementation of error-correction, the infusion pump could deliver 5 mL and record 5 mL in its log — providing misleading evidence that the nurse incorrectly entered 5.

Fig. 1. Illustrating the complexity of just a small part of a routine radiotherapy calculation undertaken in an Excel spreadsheet.

These design problems must seem trivial to most people, including the manufacturers. One might try to dismiss calculator problems by arguing that such errors are very unlikely, but one should recall that in healthcare there are numerous calculations. As [37] makes clear, similar problems beset spreadsheet user interfaces — it is a design problem, not a calculator problem per se; figure 1 illustrates part of a very large spreadsheet of a routine radiotherapy calculation, where there are numerous opportunities for error just for a single patient calculation. Or perhaps one would try to argue that users should be skilled — but that is made unnecessarily hard by the idiosyncratic diversity of bad designs. One might argue that the cost of getting designs right (or reducing risk to be as low as reasonably practical) is prohibitively expensive. Then one has to wonder why the earlier HP20S is safer than the later EasyCalc.

Our next example should dispel any remaining sense of triviality.

E. Kimberly Hiatt

Kimberly Hiatt made an out-by-ten calculation error for a dose of CaCl (calcium chloride). The patient, a baby, subsequently died, though it is not obvious (because the baby was already very ill) whether the overdose contributed to the death. Hiatt reported the error and was escorted from the hospital; she subsequently committed suicide — she was the “second victim” of the incident [42]. Notably, after her death, the Nursing Commission terminated their investigation into the incident, so we will never know exactly how the calculation error occurred [38]. It is possible that Hiatt made a simple keying mistake on a calculator, such as pressing a decimal point twice, resulting in an incorrect number (see figure 2), or maybe she used a calculator with a broken delete key, as described above. We have elsewhere criticized medical devices that treat repeated decimal points in bizarre ways [33], though this is perhaps the least of their problems [32], [37].

F. Standards compliance

The ISO Standard 62366, Medical devices — Application of usability engineering to medical devices, requires an iterative development process. Devices should be evaluated and improved before marketing. The root cause analysis of the death of Denise Melanson [13], [36] included a two-hour evaluation of five nurses using the Abbott infusion pump involved in the incident. This simple study shows that the Abbott infusion pump has specific design problems. One wonders why the infusion pump was not tested in a similar way during its development, as recommended by ISO 62366.

Fig. 2. Screenshot of the Infuse app by Medical Technology Solutions. Highlighted (A) is the syntax error 0.5.5, and (B) where the user has entered “1 mcg” (1 microgram) into a field that is shown in units of mg (milligrams). No errors are reported; the 0.5.5 is apparently treated as 0.5 and the 1 mcg is apparently treated as 1 mg, a factor of 1,000 out from what the user typed. The number parser is behaving as if it is stopping (and not reporting an error) when it reads a non-digit or a second decimal point; this is an elementary bug.
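
The parsing bug described in the caption is easy to avoid. The following sketch in Python (parse_dose and its parameters are hypothetical, not the app's actual code) rejects malformed numbers such as 0.5.5 outright and refuses a unit that does not match the field, rather than silently discarding part of what the user typed.

# Sketch of strict dose parsing: reject anything that is not exactly a number,
# and refuse a unit that differs from the field's unit instead of ignoring it.
import re

# One optional decimal point, leading digit required, nothing else.
DOSE_PATTERN = re.compile(r"^\d+(\.\d+)?$")

def parse_dose(text, field_unit="mg"):
    """Return the dose as a float, or raise ValueError on any syntax or unit problem."""
    text = text.strip()
    # Split a trailing unit, if any, e.g. "1 mcg" -> ("1", "mcg")
    match = re.match(r"^([\d.]+)\s*([A-Za-z]*)$", text)
    if not match:
        raise ValueError(f"'{text}' is not a dose")
    number, unit = match.groups()
    if not DOSE_PATTERN.match(number):
        raise ValueError(f"'{number}' is not a valid number (e.g. two decimal points)")
    if unit and unit != field_unit:
        raise ValueError(f"unit '{unit}' does not match the field unit '{field_unit}'")
    return float(number)

# parse_dose("0.5.5")  -> ValueError, not silently 0.5
# parse_dose("1 mcg")  -> ValueError, not silently 1 mg (1,000 times out)
# parse_dose("0.5")    -> 0.5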

Fig. 3. A diagram drawn from the Hospira Plum A+ display panel (button labels: Decrease Setting, Increase Setting, Enter, Cancel, Back; menu items: Lighting/Contrast, Backlight Intensity, Display Contrast). Is the battery 1/3 or 2/3 charged? Note that the horizontal bars for backlight intensity and display contrast have increasing black bars from the left, whereas the battery inconsistently has black bars increasing from the right. It is not clear, then, whether the battery’s white sectors are indicating 2/3 charged or the black sectors are indicating 1/3 charged. (Note that the increase/decrease buttons are both triangles pointing up — neither left nor right.)

The ISO Standard 60878, Graphical symbols for electrical equipment in medical practice, provides a range of standardized symbols for medical device features, such as alarms and on/off buttons. The recommended symbol for a battery is shown in figure 4; the symbol used on the Hospira Plum A+ is shown in figure 3. It is certainly not clear what the chosen battery symbol means, and there appears to be no advantage in using a non-standard symbol, especially one whose design is ambiguous. We have elsewhere estimated that infusion pump problems caused by discharged batteries amount to 4% of all problems [21]; it is clearly important to be able to read battery charge levels reliably.

The Hospira Plum A+ infusion pump was drawn to my attention through a Medical Device Alert [23] issued by the Medicines and Healthcare products Regulatory Agency (MHRA) and widely distributed. The Plum A+ has a volume control for its alarm, and the MHRA alert says there is a risk of staff failing to hear the alarm because the operation of the volume control is not consistent with the operator’s manual: on some pumps, the knob is turned clockwise to increase the volume; on others, anticlockwise. Wiring it up or programming it inconsistently is one design problem, but more seriously the volume control knob is in a recess on the rear of the device. Since the notions of clockwise and anticlockwise are reversed on the rear of a device, and to turn the knob the operator’s hand must be twisted to face back towards the user, it seems quite likely that inconsistencies are the least of the user’s problems! If the sound level is critical to safe operation (it must be, else there would not have been a formal alert followed by remedial action), why is this safety-critical design feature on the rear of the device, where its operation is intrinsically confusing and where a user cannot confirm what level it has been set to? The manual clearly shows the knob has no pointer [11], and hence it is impossible to discern the alarm sound level from it.

Fig. 4. ISO Standard 60878 symbol for a battery (“Battery, general”, symbol 5001). Contrast with the symbol shown in figure 3. The standard also makes clear that the size of the darkened area is used to indicate charge. Arguably the standard might be improved if the battery were stood on end, so the darkened area “filled” the battery against gravity in the usual way of filling a vessel.

G. Zimed Syringe Driver

The Zimed AD Syringe Driver is, according to its web site [43], “a ground-breaking new infusion device that takes usability and patient safety to new levels. . . . Simple to operate for professionals, patients, parents, or helpers. With colored, on-screen guidance at each step of infusion” and a “continuous improvement programme ensures that the AD Syringe Driver continually matches and exceeds the market[’]s needs.” In our team’s paper [3] we showed that this device permits use error that it ignores, potentially allowing very large numerical errors. One particular cause for concern is over-run errors: a nurse entering a number such as 0.1 will move the cursor right to enter the least significant digits of the intended number, but an over-run (in this case, an excess number of move-right key presses) may move the cursor to the most significant digit. Thus an attempt to enter 0.1 could enter 1000.0, just through one excess keystroke. It is thus tempting to wonder to what extent the manufacturer’s claims of “usability” are based on objective measurements, as are required by relevant ISO Standards, such as 14971, Medical devices — Application of risk management to medical devices, which refer to the usability standards 9241, 62366, etc., discussed elsewhere in the present paper.

H. Hospira Symbiq

The FDA Safety Information and Adverse Event Reporting Program records many design issues. Take the Hospira Symbiq Infusion System Touchscreen: Class 1 Recall – May Not Respond to Selection, which affects the Symbiq type Infusers [6]. Due to software issues, the Symbiq’s touchscreen may not respond to user selection, may experience a delayed response or may register a different value from the value selected by the user; a Class 1 Recall is “the most serious type of recall and involves situations in which there is a reasonable probability that use of these products will cause serious adverse health consequences or death.” Yet the Symbiq had won a design award based on “systematic evaluation of three design criteria — functional obviousness, ease of operation, and creativity — and three development process criteria — user research during concept development, user research during the design process, and the use of appropriate evaluation methods” [12]. Clearly these prize criteria did not include safety.

I. Summary

Poor design and ignoring detectable error seems to be ubiquitous. The specific examples above were not picked out because they were particularly egregious — some of the systems discussed are no doubt best in their class — they were picked out because they are representative. Indeed, the list of examples could be continued indefinitely, and could cover everything from implants to national health IT systems.

III. FROM PROBLEMS TO SOLUTIONS

The common thread in all the cases above is that errors go unnoticed, and then, because they are unnoticed, they lead to loss or harm. Even in the case of Lisa Norris [16], modifications to a design that induce user error not only go unnoticed but are dismissed by investigators. Simply, if a user does not notice an error, they cannot think about it and avoid its consequences (except by chance) — the broader consequences of this seeming truism are explored in detail by Kahneman [17].

All of the example cases above, from Grete Fossbakk on, share the same pattern: a user makes some sort of slip, mistake or intentional error (or even violates the rules, with bystanders unaware of or complicit in this), and they (or their team) are unaware of this error until it is too late to avoid or mitigate the consequences. An important role of computers (beyond automation, data capture, etc.) is therefore to:

A. Reduce the probability of an error occurring, for example by reducing confusion over modes;

B. Help notice errors, for example by reporting errors so they are noticed (e.g., sounding an alarm);

C. Block errors, for example by lock-out, requiring confirmation or restarting; and

D. Enable recovery from errors, for example by providing an [UNDO] key.

The philosophy here is that errors per se do not matter; what matters is harm. Therefore unnoticed errors are a particular problem, and computers should endeavour to help reduce, detect, block or recover from errors. We will expand further on the four options below, but first some more general comments are in order.

It follows that computers (and their designers) should consider and detect actual and likely errors in order to provide this support. Interestingly, much of the focus on use error is about managing error or error recovery — for instance, providing an undo function — rather than on noticing the error, or even blocking it from occurring in the first place (see section A below).

The ISO Standard 62366 says, “Reasonably foreseeable sequences or combinations of events involving the USER INTERFACE that can result in a HAZARDOUS SITUATION associated with the MEDICAL DEVICE shall be identified. The SEVERITY of the resulting possible HARM shall be determined.” (Original emphasis.) One example would be the user turning the alarm volume right down so that alarms cannot be heard — a post-market problem of the Hospira Plum A+.

The 62366 standard later discusses categories of foreseeable user action that lead to harm, namely by using Reason’s error taxonomy [30] (intended/unintended errors, etc.). Yet this classic taxonomy of error does not directly help design. The standard is also remarkably vague on how a manufacturer is supposed to implement any of its recommendations. For example, it is basic computer science to define the language the user interface implements using a formal grammar, and routine analysis of the grammar (say, by an off-the-shelf compiler-compiler) will classify all foreseeable sequences of events that should be properly defined — and will also define, and even implement, relevant error handling. This approach is elementary, and has been taught to undergraduates since the 1970s. Yet when we look at, say, the popular Graseby 3400 syringe driver, the sequence of actions [1] [•] [7] [•] [0], instead of being recognized as an error (the number keyed has two decimal points), is treated as 1.0 [33]. The device has ignored a detectable error, and moreover, done something bizarre with the error — turned it into a valid numeric value that is wrong. Our papers [3], [37] give many more similar examples.
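
To make this concrete, here is a minimal sketch in Python (the grammar and the classify function are illustrative, not the standard's or the device's actual specification) of a formal grammar for completed number entry and a checker that classifies every keystroke sequence, so that 1 • 7 • 0 is reported as an error rather than silently treated as 1.0.

# Sketch: a formal grammar for number entry, and a checker that classifies
# every keystroke sequence as valid or as an error (never silently "repaired").
#
#   number ::= digits [ "." digits ] | "." digits
#   digits ::= digit { digit }
#
import re

NUMBER = re.compile(r"^(\d+(\.\d+)?|\.\d+)$")

def classify(keys):
    """keys is the completed sequence of keystrokes, e.g. '1.7.0'."""
    return "valid" if NUMBER.match(keys) else "error"

print(classify("1.70"))    # valid
print(classify("1.7.0"))   # error: two decimal points, must not be read as 1.0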

Finally, we note that there may be feature interaction or other unintended consequences arising through implementing these ideas. For example, detecting many errors may lead users to suffer from “alarm fatigue” and ignore warning alerts. Any systems developed should be evaluated and the designs iterated (e.g., to ISO 9241, Ergonomics of human-system interaction).

A. Reducing the probability of errors

The bulk of the literature in human-computer interaction is about usability and user experience; arguably if usability is poor, this is easy to notice and therefore likely to get fixed (if only because of market pressure), whereas poor error management is very hard to notice, but — at least for healthcare applications — much more important. Worse, almost all of the theory in human-computer interaction (e.g., the model human processor) assumes error-free performance! The Symbiq usability design award, mentioned above, is perhaps a case in point.

Nevertheless, there is a substantial and useful body of research on interaction design that has focussed on avoiding errors in the first place; notable contributions include Norman’s classic works [26], [27], the discussions in almost all relevant textbooks [25], etc., and specific clinically-related human factors engineering guidelines such as the Johns Hopkins report [10]. The ISO Standard 9241, Ergonomics of human-system interaction, is not only a basic standard to improve user interfaces, but has a useful bibliography. More specifically, reducing the probability of misreading numbers in the clinical context has received a lot of attention [14], yet I have not found a device that obeys all the rules for displaying (let alone entering) numbers.

It is remarkable that these basic and well-known rules, guidelines and methods are not more widely followed in the design of dependable (mission critical and safety critical) clinical user interfaces.

B. Blocking certain errors

Calculators are very bad at detecting error [36], detecting only the grossest, such as division by zero. This is very surprising since the syntax of arithmetic expressions is well understood (and allows for detecting many possible errors).

Normally errors in calculations are not obvious. Suppose we wish to calculate 1/0 + 5, which for illustrative purposes includes an error which, in the calm of this paper, will be quite obvious; the relevant sequence of keystrokes is [AC] [1] [÷] [0] [+] [5] [=]. We know before we start that the answer should be an error, and therefore should be blocked. On the Casio HS-8V, when [=] is pressed, the display shows ERROR 0. It might have been better if no number (0 in this case) was shown, but it shows ERROR, and the calculator is effectively locked down until the user presses [AC] to clear the error. On the other hand, the Apple iPhone calculator shows 5 when [=] is pressed, and shows no evidence of any error. In fact, the iPhone detects the 1/0 error when [+] is pressed, but the Error display is transient, probably missed by the user as they are looking for the next key to press, and certainly missed if the user proceeds — when [5] is pressed, the display of Error turns immediately to 5. The Apple calculator detects an error, but, unlike the Casio, does almost nothing to help the user notice the error.

Less blatant errors, such as keying too many digits (the problem exhibited in section II-C) or too many decimal points, are not detected by most calculators. Perhaps worse, some calculators have an idiosyncratic concept of “too many digits” if the user has pressed [DEL], as we saw in section II-D.

Again, it is remarkable that the basic and well-known methods of computer science are not used to avoid such problems. These problems can be discovered by automatic processes.

C. Blocking probable errors

Blocking probable errors is a familiar feature of desktop applications. Quitting a word processor may be the correct intention or it may inadvertently lose all recent edits on a document, so the application displays a warning message and allows the user to reconsider their action. Obviously, to know what a “probable error” is requires prior empirical work or a design decision that the cost of an unmitigated error is so high that it should always be queried.

Some errors cannot be detected with certainty. For example, without connection to the EPR, an infusion pump has no idea what a correct dose might be; it probably has no idea what the drug in use is either.

There are various techniques that attempt to reduce wrong dose errors, such as DERS, the dose error reduction system. DERS relies on an infusion pump being correctly set to the right drug, a tedious process that itself is error-prone.

Another approach is to notice that most infusions undertaken within a few minutes of a previous infusion are continuations. Since the patient is presumably still alive, the previous infusion rate was not fatal; therefore a large deviation from the previous rate should be alarmed, and at the very least the user should be asked to confirm the new value. This might cause problems when a nurse wants to give a patient a bolus and temporarily increases the rate to 999 mL per hour — but such devices ought to have a bolus button! Such an approach would have warned the nurses who mistakenly infused Denise Melanson with a chemotherapy dose 24 times too high, and thus saved her life.
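
A minimal sketch of such a check follows (Python; the function name, the ten-minute continuation window and the five-fold ratio threshold are all assumptions for illustration, not validated clinical values).

# Sketch of a "probable error" check: if a new infusion starts within a few
# minutes of the previous one, treat it as a continuation and query any large
# deviation from the previous rate. Thresholds here are illustrative only.
CONTINUATION_WINDOW_MIN = 10      # minutes; assumed, not from any standard
MAX_RATE_RATIO = 5.0              # assumed; a 24x jump would certainly be queried

def check_new_rate(new_rate, previous_rate, minutes_since_last):
    """Return None if the rate looks plausible, else a warning asking for confirmation."""
    if previous_rate is None or minutes_since_last > CONTINUATION_WINDOW_MIN:
        return None                              # not a continuation; no history to use
    ratio = max(new_rate, previous_rate) / max(min(new_rate, previous_rate), 1e-9)
    if ratio >= MAX_RATE_RATIO:
        return (f"New rate {new_rate} mL/h is {ratio:.0f}x the previous rate "
                f"{previous_rate} mL/h - please confirm (or use the bolus function).")
    return None

print(check_new_rate(new_rate=28.8, previous_rate=1.2, minutes_since_last=5))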

Fig. 5. Example EU product performance labels: tire label (left) and energy efficiency label for consumer goods (right). The A–G scale has A being best.

D. Enabling recovery

It is interesting that almost all desktop applications provide an undo function in case the user makes an error (which they notice and wish to recover from), yet virtually no device has an undo function! Of course many errors may lead to irreversible harm, and recovery as such is beyond the immediate scope of the device involved. Many obviously dangerous devices in factories have a large red [STOP] button on them (often easily pressed by any body part) — and guns have locks [34] — yet medical devices rarely have buttons to immediately stop them.

An action that may cause irreversible harm can have a delay. Sometimes the user will realize they pressed the wrong button, and the delay allows the device to undo the planned action, since it has not happened yet. Google’s mail application provides this feature: when the user presses [SEND] the email is not sent for a few seconds and, while it is still possible, a message is displayed that encourages the user to have second thoughts and consider whether they wish to undo the send.
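
A sketch of the same idea for a device action follows (Python; DelayedAction and start_infusion are hypothetical names; a real device would need a visible countdown and careful handling of the race between the timer and the undo).

# Sketch of recovery by delay: an irreversible action is scheduled a few
# seconds in the future, and can be cancelled while it is still pending.
import threading

class DelayedAction:
    def __init__(self, action, delay_seconds=5.0):
        self.timer = threading.Timer(delay_seconds, action)
        self.timer.start()                 # "Starting in 5 seconds... Undo?"

    def undo(self):
        self.timer.cancel()                # harmless if the action already ran

def start_infusion():
    print("Infusion started")              # stands in for the irreversible step

pending = DelayedAction(start_infusion, delay_seconds=5.0)
pending.undo()                             # user pressed UNDO in time: nothing happens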

E. Summary

These illustrative examples do not exhaust the possibilities. More research is needed, combined with informed economic pressures to adopt improvements (which in turn will cost-justify the investment in the research).

IV. MAKING INFORMED PURCHASING DECISIONS

European Union (EU) legislation now requires car tires to indicate their stopping distance, noise level and fuel efficiency in a simple way at the point of sale; an example is shown in figure 5. This legislation follows successful schemes to indicate energy efficiency in white goods. When somebody buys a device, they can now include the visualized factors in their decision making: and customers, understandably, want to buy better products. Indeed, product efficiency has improved, in some cases so much so that the EU has extended the scales to A+, A++, A+++, etc.

Interestingly, the EU has not specified how to make things better; they have just required the quality to be visible at the point of sale. The manufacturers, under competition with others, work out for themselves how to make their products more attractive to consumers. It does not matter whether the techniques are open or proprietary and give particular manufacturers commercial advantages — that is exactly the point: once safety is visible, commercial thinking leads to making higher quality products.

It would be a good idea to require a similar system for medical devices.

However, the people who purchase medical devices are not the people who use them (mostly nurses), nor the people who suffer from safety problems (patients). We need an adjustment to the basic scheme: the visual label of quality must not be removable: both nurses and patients must be able to see the label on the device. If a patient questions why they are being given an infusion from an F-rated device, that will tend to put pressure on the purchasing system to buy better devices.

V. WHAT TO MEASURE?

The EU tire labelling scheme is not without its critics. For example, it does not indicate tire lifetime, and some manufacturers claim this is an important feature of their products. The EU has decided that some factors are more important than others; it is also clear from the labelling schemes used on consumer goods that the criteria can be revised. For example, once all tires have excellent tire noise, there is no point indicating it and the EU could either simplify the label or target another factor.

By analogy with tires, then, medical devices should show some safety ratings — ones that are critical and which can be assessed objectively — and start with these to create market pressure to make key improvements. Obvious issues to measure include the response of the device to use error (e.g., syntax errors during number entry) and the consequences of unnoticed over-run errors, such as those that affect the Zimed infusion pump mentioned above.

Our simulation and laboratory work give many examples of possible measurements [3], [28]; the Johns Hopkins report [10] (as just one example) implicitly suggests a wide range of important empirical measurements; and just counting how many of the ISMP rules [14], mentioned earlier, are adhered to would provide a simple measure directly related to safety. There is no shortage of relevant design factors to measure, then, but further work is needed to choose a few critical measures that have an objective (“evidence-based”) impact; it must be remembered, though, that one does not need to get all the measures perfectly developed before being able to make an impact. The EU’s successful experience with labelling electrical consumer goods is an encouraging story, and incidentally shows how the labelling process can evolve over time as experience develops.
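
Purely as an illustration of how such counting could be surfaced, the sketch below (Python) scores a device against a hypothetical rule list and maps the result onto an A–G band in the style of the EU labels; the rules and band thresholds are invented for the example, not taken from ISMP [14] or any standard.

# Sketch of a visible safety score: count how many of an agreed rule set a
# device follows and map the fraction onto an A-G band, EU-label style.
RULES = {
    "rejects a second decimal point":        True,
    "reports (not ignores) digit overflow":  False,
    "logs all keystrokes":                   False,
    "uses the standard battery symbol":      True,
    "delete key behaves consistently":       False,
}

def safety_band(results):
    score = sum(results.values()) / len(results)      # fraction of rules followed
    bands = [(0.9, "A"), (0.75, "B"), (0.6, "C"), (0.45, "D"),
             (0.3, "E"), (0.15, "F"), (0.0, "G")]
    return next(band for threshold, band in bands if score >= threshold)

print(safety_band(RULES))   # 2/5 = 0.4 -> "E" on this illustrative scale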

But the experience needs to be explicit, else we cannot learn from it! It is relatively easy to measure tire stopping distances; we have agreement that stopping several meters before an obstacle instead of hitting it is desirable, and the everyday use of the meter as a unit of distance is uncontentious (and if somewhere you want to use feet, they are defined in terms of meters). There are no similar consensus measures in healthcare. Therefore any measurement process has to be combined with a process to improve the measurements and methodologies themselves. That is, we have to assess and publish the safety of devices and measure the outcomes in the environments where those devices are used. In road safety, for instance, we have national figures that allow a correlation between road accidents and types of vehicle, and if we were so inclined we could also correlate down to tires, and thus we could establish more effective metrics to use for car tire safety labeling initiatives. We should do the same in healthcare.

VI. CAUSE FOR OPTIMISM?

In the UK, about 2,000 people die on the roads annually, and cars (and tires) are subject to legally-required pre-market safety checks, and then after three years to annual safety checks, as well as random road-side checks. The safety checks are a combination of visual inspection, rigorous tests, and using advanced equipment (e.g., to measure exhaust emission quality). After a road accident, proof of roadworthiness is usually required, as it is obvious that a faulty car may cause an accident because it is faulty. In healthcare, we don’t even have a word analogous to roadworthiness!

In the UK, about 25 people die annually from electric shock. Electrical installations (e.g., house wiring) and electrical appliances (e.g., toasters) are subject to legally-required checks. Employers, additionally, have to show they ensure the safety of their workers (and visitors) by some process such as appliance testing, which is performed by suitably trained personnel who perform visual inspections and electrical tests. An appliance that has a high earth leakage current (e.g., a Class I device with more than 3.5 mA) may be failed, and labelled with a FAIL DO NOT USE sticker so it is taken out of service.

In the UK, about 14,000 people die annually in hospitals from preventable errors [40], and perhaps 10% of those errors are due to calculation error. Such relatively high figures are comparable to road fatalities, and significantly higher than electrical fatalities. Testing medical devices for safe interaction should be routine.

Manufacturers do follow standards — the Hospira Plum A+ conforms to the ANSI/AAMI ES1-1993 Standard, Safe current limits for electro-medical apparatus [11], for example, despite the “regulatory burdens” such conformance obviously incurs. The industry could conform to safe interaction design standards too; more is to be gained by doing so.

The usual regulatory interventions to improve products lead to so-called regulatory burdens, which naturally manufacturers wish to reduce. Making performance visible at the point of sale may be resisted in the short run (for which there are only cynical explanations), but once the market sees the advantages of safety, manufacturers will willingly compete to excel. The car industry is a case in point: safety sells.

In short, technologies where we have a mature relationship set an optimistic note for the improvement of medical technologies. If we can start to use schemes like safety labels — which are an uncontentious idea with safer technologies — then this will start to raise awareness, creating a virtuous circle.

VII. CAUSE FOR PESSIMISM?

Most common programming languages are hard to use to develop safe programs [35] and major programming books ignore error [5]. If programmers want to manage use errors, there is a huge cultural gap to leap; worse, many programmers are not very good, because programming seems easier than it is. That is, almost anybody can write a program that seems to work OK. A few lines of code can make a computer do things, but for it to do those things reliably under all potential circumstances is a major engineering problem, typically involving concepts (e.g., invariant, verification and proof) beyond the everyday needs of most programmers — and this is before the challenges of good interaction design (e.g., detecting and managing use error) are added to the requirements. Typically, badly-designed programs that appear to work are sold to users, with neither adequate engineering to avoid bugs nor formative evaluation to find and help correct design defects when the systems are used. In the entire world, only a few hundred people are certified IEEE software developers! Again, there is a relevant international standard: ISO 19759, Software Engineering — Guide to the Software Engineering Body of Knowledge.

The Dunning-Kruger Effect [20] explains why just telling people they need to program better will not help — they aren’t skilled enough to understand the problems. Don Berwick expresses the same sentiment well:

“Commercial air travel didn’t get safer by exhorting pilots to please not crash. It got safer by designing planes and air travel systems that support pilots and others to succeed in a very, very complex environment. We can do that in healthcare, too.”
— Don Berwick (former President and Chief Executive of the US Institute for Healthcare Improvement)

Healthcare systems (whether devices or IT systems, like EPRs) are driven by “clinical needs,” which in turn are conceived and promoted by clinicians. It takes a significant level of maturity to recognize the difference between the alluring nature of consumer products and their likely effectiveness in clinical practice. Consumer products, such as smart phones and tablet computers, are wonderful, but they are designed for individuals. Naturally, individual consumers like such devices. However, it does not follow that they are appropriate for healthcare — some arguments for advanced computers in hospitals have as much logic as arguing for ice skating. Ice skating is exciting; clinicians may well find it exciting; but that is not the same as it being relevant to effectively supporting any clinical task!

To compound the problems, most consumer devices are built by very large design and engineering teams; where are the comparably-resourced teams in healthcare? The consumer market wants rapid churn (a continual supply of new products) to keep up with new features, whereas healthcare needs systems that might still be in use in decades’ time. Most healthcare organisations are short of money; buying technologies that (thanks to consumer market pressures) are obsolete in a couple of years does not seem like a sound long-term investment.

The consumer industry obviously needs consumers to want to purchase its devices. Usability sells interactive technology. It is then, unfortunately, a short step to argue against innovations for safety because they are “less usable” and therefore people will not like them. For example, an extra confirmation step may increase safety but will certainly add keystrokes, and the more keystrokes there are the “less usable” a device must be. This is a misconception. In fact, manufacturers have diverted our attention from the dependable success of tasks to the sense of flow when an error-free task is accomplished; our natural cognitive biases tend to ignore the times we failed (we may even blame ourselves rather than the design). A useful analogy from cars is that brakes actually make cars slower (that is what they are for!) yet it is obvious that a car without brakes would have accidents so often that task completion rates (safely completed journeys) would suffer. So, a cause for pessimism is that consumers want usability at any cost — thus we need medical device procurement to become more professional.
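
A back-of-the-envelope illustration of the distinction (with entirely assumed numbers, introduced only for this example) is that a design that adds a confirmation keystroke can still come out ahead once the measure is safely completed tasks rather than keystroke counts:

```python
# Purely illustrative arithmetic with assumed numbers; nothing here is
# measured data. The point is the choice of metric, not the values.

def safe_completion_rate(error_rate: float, caught_fraction: float) -> float:
    """Fraction of tasks that finish without an unnoticed error."""
    unnoticed = error_rate * (1.0 - caught_fraction)
    return 1.0 - unnoticed

# Hypothetical design A: fewer keystrokes, slips are rarely caught.
a = safe_completion_rate(error_rate=0.02, caught_fraction=0.10)

# Hypothetical design B: one extra confirmation step catches most slips.
b = safe_completion_rate(error_rate=0.02, caught_fraction=0.90)

print(f"A: {a:.1%} of tasks completed safely")  # 98.2%
print(f"B: {b:.1%} of tasks completed safely")  # 99.8%
```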

There are many differences between consumer product design and healthcare device design. Some of the things that are very enticing about modern consumer products are mobile devices (smart phones, tablets), social applications (like Facebook and Twitter), multiplayer games, and so forth; these are all new, constructed activities that are highly seductive. We did not do Facebook before it was invented, and we had no idea how much fun it would be. It is no accident that Apple has, at times, held more cash than the US Treasury.

These prominent mass-market devices were not designed for healthcare. For instance, for the consumer purposes for which they were designed, it is not a serious problem if they disconnect or fail, even for a day or so. But a healthcare application with the poor dependability of a typical consumer device would be dangerous.

Of course new technologies can inspire new healthcare approaches (such as Patients Online), but it must be emphasized that one’s personal enjoyment of a design does not imply it is appropriate (or even enjoyable) in a work environment, let alone in a complex healthcare environment. It is very hard for all of us to make this distinction, as our next example shows.

There is good evidence that nomograms are effective and accurate ways of performing drug and other calculations [41]. In a peer-reviewed article that appeared in the Annals of Internal Medicine [8], Grimes disparaged nomograms simply because they use an obsolete technology — paper. Grimes makes numerous factual errors, including making an unfavorable comparison with digital calculators because (he claims) decimal points do not get misplaced with calculators. In this claim he is simply mistaken [19], [36], and he parades a dangerously false sense of security by saying so. Yet he is sufficiently convinced that in a later follow-up in the same journal he suggests that asking for empirical evidence to support his claims is as useful as asking for evidence for the safety of parachutes: indeed, who would sign up to a blinded, randomized trial to defy gravity? The point is, his enthusiasm for modern computer technology (apparently shared by the journal’s reviewers and editors) completely overrides any objectivity, even the basic scientific discipline of distinguishing personal opinion from fact.

It is interesting to read critiques of pharmaceutical development [7] and realize that, at least in pharmaceuticals, there is a consensus that scientific methods should be used, even if the science actually done has many shortcomings. In healthcare devices, such as infusion pumps, tablet computers, calculators, and so forth, there isn’t even any awareness that these things do need evaluating, in much the same way that proposed treatments of diseases need rigorous trials.

Fig. 6. (left) Wilhelm Rontgen’s X-ray of his wife Anna’s hand, taken in 1895; and (right) illustration of Clarence Dally X-raying his hand, from the New York World, August 3, 1903, page 1.

VIII. CONCLUSIONS

Today’s acceptance of computers (Health IT) and devices with embedded computers (infusion pumps, etc.) recalls the original enthusiastic acceptance of X-rays (figure 6) — Clarence Dally, an early adopter, was already suffering radiation damage in 1900, only five years after Rontgen’s first publication [31]; he died from cancer in 1904. When the problems of X-rays became widely recognized, their risk/benefit trade-offs were assimilated, and it is now obvious that X-rays carry risks and have to be used carefully.

Or consider when Ralph Nader published Unsafe at Any Speed: The Designed-In Dangers of the American Automobile in the 1960s [24]: the car industry then operated much as many programmers do today. It assumed “drivers have accidents” and therefore felt no obligation to improve the design of cars to make them safer. Cars from the 1960s look obviously dangerous with the benefit of our hindsight today, even to naïve eyes: they are covered in design features that are sharp and pointy — and they have no seat belts, let alone air bags.

Today’s medical devices are badly programmed, and the culture is to blame users for errors — thus removing the need to closely examine design. Moreover, manufacturers often require users to sign “hold blameless” contracts [19], turning attention away from design to other possible factors, such as the hapless user. Papers like [9] suggest that mortality rates in hospitals can double when computerized patient record systems are introduced: even if such results are contentious, it is clear that computers are not the unqualified blessings (as X-rays were once thought to be) that they are often promoted as being.

Reason’s Swiss Cheese Model [30] makes it very clear that the nurse (or other clinician) is never the only cause of patient harm. This paper has shown that even if the user has a “hole” (in the sense of the Swiss Cheese metaphor), the holes in the programmed devices are largely unnecessary and should have been avoided by better design. We may always have human error regardless of training and skill level, but basic programming errors, particularly in computationally simple devices like calculators and infusion pumps, are in principle completely avoidable — the only obstacles are culture and economics. We have therefore proposed some simple ideas — that work well in other complex technology areas — to help put economic pressure on manufacturers to improve their designs to make them more appropriate for healthcare applications.

We need to improve. It would be nice to think that Kimberley Hiatt and many other people might still be alive if their calculators, infusion pumps, linear accelerators and so forth had been made more dependable.

A safety labeling scheme would raise awareness of the issues and increase market competition for safer devices and systems. It could be done purely voluntarily in the first place, with no regulatory burden on manufacturers. By keeping rating labels on devices for their lifetime, end users and patients would also gain increased awareness of the issues. We could start with a simple scheme and, with experience, improve it.
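
To indicate how lightweight a voluntary scheme could be, the sketch below maps a handful of pass/fail design checks onto a coarse letter grade; the particular checks, equal weighting and grade bands are invented here purely for illustration and are not the specific scoring criteria discussed earlier in the paper.

```python
# Hypothetical sketch: check names, weighting and bands are invented for
# illustration, not taken from the paper's own scoring scheme.

EXAMPLE_CHECKS = {
    "rejects malformed number entry": True,
    "keeps an accessible log of use errors": True,
    "never silently discards key presses": False,
    "destructive actions can be undone": False,
}

def safety_label(checks: dict) -> str:
    """Map the fraction of passed checks onto a coarse A-E style label."""
    score = sum(checks.values()) / len(checks)
    bands = [(0.8, "A"), (0.6, "B"), (0.4, "C"), (0.2, "D")]
    return next((label for threshold, label in bands if score >= threshold), "E")

print(safety_label(EXAMPLE_CHECKS))  # "C" for this hypothetical device
```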

Acknowledgements

Partly funded by EPSRC Grant EP/G059063/1. Particular thanks to and appreciation of comments from my colleagues and co-workers: Ross Anderson, Ann Blandford, Abigail Cauchi, Paul Cairns, Anna Cox, Paul Curzon, Andy Gimblett, Michael Harrison, Paul Lee, Paolo Masci, Patrick Oladimeji, Ben Shneiderman, Peter Szolovits, Martyn Thomas, Frankie Thompson, Dave Williams.

REFERENCES

[1] Abbott Laboratories. LifeCare 4100 PCA Plus II Infuser. Rev. 10/91.
[2] P. Aspden, J. Wolcott, J. L. Bootman, and L. R. Cronenwett. Preventing Medication Errors. National Academies Press, 2007.
[3] A. Cauchi, P. Curzon, A. Gimblett, P. Masci, and H. Thimbleby. Safer “5-key” number entry user interfaces using differential formal analysis. In Proceedings BCS Conference on HCI, volume XXVI, pages 29–38. Oxford University Press, 2012.
[4] D. C. Classen, R. Resar, F. Griffin, F. Federico, T. Frankel, N. Kimmel, J. C. Whittington, A. Frankel, A. Seger, and B. C. James. ‘Global trigger tool’ shows that adverse events in hospitals may be ten times greater than previously measured. Health Affairs, 30(4):581–589, 2011.
[5] T. Cormen, C. Leiserson, R. Rivest, and C. Stein. Introduction to Algorithms. MIT Press, third edition, 2009.
[6] Food and Drug Administration, US FDA. Hospira Symbiq Infusion System Touchscreen: Class 1 Recall. www.fda.gov/Safety/MedWatch/SafetyInformation/SafetyAlertsforHumanMedicalProducts/ucm326080.htm, 2012.
[7] B. Goldacre. Bad Pharma: How drug companies mislead doctors and harm patients. Fourth Estate, 2012.
[8] D. A. Grimes. The nomogram epidemic: Resurgence of a medical relic. Annals of Internal Medicine, 149(4):273–275, 2008.
[9] Y. Y. Han, J. A. Carcillo, S. T. Venkataraman, R. S. Clark, R. S. Watson, T. C. Nguyen, H. Bayir, and R. A. Orr. Unexpected increased mortality after implementation of a commercially sold computerized physician order entry system. Pediatrics, 116:1506–1512, 2005.
[10] Johns Hopkins University. Infusion Pump Workshop 2012: A Systems Engineering Approach for Human Factors Solutions. 2012.
[11] Hospira. System Operating Manual Plum A+. 2005.
[12] Human Factors and Ergonomics Society. Two Medical Devices Win HFES Product Design Technical Group’s User-Centered Product Design Award. www.hfes.org/web/DetailNews.aspx?Id=116 (2006), visited 2013.
[13] Institute for Safe Medication Practices, ISMP. Fluorouracil Incident Root Cause Analysis. 2007.
[14] Institute for Safe Medication Practices, ISMP. List of Error-Prone Abbreviations, Symbols and Dose Designations. 2013.
[15] Institute of Medicine. Health IT and Patient Safety. National Academies Press, 2011.
[16] A. M. Johnston. Unintended overexposure of patient Lisa Norris during radiotherapy treatment at the Beatson Oncology Centre, Glasgow in January 2006. Scottish Executive, 2006.
[17] D. Kahneman. Thinking, Fast and Slow. Penguin, 2012.
[18] L. T. Kohn, J. M. Corrigan, and M. S. Donaldson. To Err Is Human: Building a Safer Health System. National Academies Press, 2000.
[19] R. Koppel and S. Gordon, editors. First Do Less Harm: Confronting the Inconvenient Problems of Patient Safety. Cornell University Press, 2012.
[20] J. Kruger and D. Dunning. Unskilled and unaware of it: How difficulties in recognizing one’s own incompetence lead to inflated self-assessments. Journal of Personality and Social Psychology, 77(6):1121–1134, 1999.
[21] P. Lee, H. Thimbleby, and F. Thompson. Analysis of infusion pump error logs and their significance for healthcare. British Journal of Nursing, 21(8):S12–S22, 2012.
[22] T. Lesar, L. Briceland, and D. S. Stein. Factors related to errors in medication prescribing. Journal of the American Medical Association, 277:312–317, 1997.
[23] Medicines and Healthcare products Regulatory Agency, MHRA. Medical Device Alert, MDA/2013/004. 2013.
[24] R. Nader. Unsafe at Any Speed: The Designed-In Dangers of the American Automobile. Pocket Books, 1966.
[25] J. Nielsen. Usability Engineering. Academic Press, 1993.
[26] D. A. Norman. Design rules based on analyses of human error. Communications of the ACM, 26(4):254–258, April 1983.
[27] D. A. Norman. The Design of Everyday Things. Basic Books, revised edition, 2002.
[28] P. Oladimeji, H. Thimbleby, and A. Cox. Number entry interfaces and their effects on errors and number perception. In Proceedings IFIP Conference on Human-Computer Interaction, Interact 2011, volume IV, pages 178–185, 2011.
[29] K. A. Olsen. The $100,000 keying error. IEEE Computer, 41(4):108–106, 2008.
[30] J. Reason. Human Error. Cambridge University Press, 1990.
[31] W. C. Rontgen. On a new kind of rays (translated into English). British Journal of Radiology, (4):32–33, 1931.
[32] H. Thimbleby. Calculators are needlessly bad. International Journal of Human-Computer Studies, 52(6):1031–1069, 2000.
[33] H. Thimbleby. Interaction walkthrough: Evaluation of safety critical interactive systems. In Proceedings of the XIII International Workshop on Design, Specification and Verification of Interactive Systems — DSVIS 2006, volume 4323, pages 52–66. Springer-Verlag, 2007.
[34] H. Thimbleby. Think! Interactive systems need safety locks. Journal of Computing and Information Technology, 18(4):349–360, 2010.
[35] H. Thimbleby. Heedless programming: Ignoring detectable error is a widespread hazard. Software — Practice & Experience, 42(11):1393–1407, 2012.
[36] H. Thimbleby. Ignorance of interaction programming is killing people. ACM Interactions, pages 52–57, September+October 2008.
[37] H. Thimbleby and P. Cairns. Reducing number entry errors: Solving a widespread, serious problem. Journal of the Royal Society Interface, 7(51):1429–1439, 2010.
[38] Various authors. runningahospital.blogspot.co.uk/2011/05/i-wish-we-were-less-patient.html; josephineensign.wordpress.com/2011/04/24/to-err-is-human-medical-errors-and-the-consequences-for-nurses, accessed 2013.
[39] C. Vincent, P. Aylin, B. D. Franklin, S. Iskander, A. Jacklin, and K. Moorthy. Is health care getting safer? British Medical Journal, 337:1205–1207, 2008.
[40] C. Vincent, G. Neale, and M. Woloshynowych. Adverse events in British hospitals: Preliminary retrospective record review. British Medical Journal, 322:517–519, 2001.
[41] D. Williams and H. Thimbleby. Using nomograms to reduce harm from clinical calculations. In IEEE International Conference on Healthcare Informatics, submitted.
[42] A. W. Wu. Medical error: the second victim. British Medical Journal, 320, 2000.
[43] Zimed. AD Syringe Driver. www.zimed.cncpt.co.uk/product/ad-syringe-driver, visited 2013.