Examining changes in certification/licensure requirements and the international medical graduate examinee pool

Danette W. McKinley • Brian J. Hess • John R. Boulet • Rebecca S. Lipner

Received: 21 September 2012 / Accepted: 25 March 2013 / Published online: 20 April 2013

© Springer Science+Business Media Dordrecht 2013

Abstract Changes in certification requirements and examinee characteristics are likely to influence the validity of the evidence associated with interpretations made based on test data. We examined whether changes in Educational Commission for Foreign Medical Graduates (ECFMG) certification requirements over time were associated with changes in internal medicine (IM) residency program director ratings and certification examination scores. Comparisons were made between physicians who were ECFMG-certified before and after the Clinical Skills Assessment (CSA) requirement. A multivariate analysis of covariance was conducted to examine the differences in program director ratings based on CSA cohort and whether the examinees emigrated for undergraduate medical education (national vs. international students). A univariate analysis of covariance was conducted to examine differences in scores from the American Board of Internal Medicine (ABIM) Internal Medicine Certification Examination. For both analyses, United States Medical Licensing Examination (USMLE) Step 1 and Step 2 scores were used as covariates. Results indicate that, of those certified by ECFMG between 1993 and 1997, 17 % (n = 1,775) left their country of citizenship for undergraduate medical education. In contrast, 38 % (n = 1,874) of those certified between 1999 and 2003 were international students. After adjustment by covariates, the main effect of cohort membership on the program director ratings was statistically significant (Wilks’ λ = 0.99, F(5, 15391) = 19.9, P < 0.001). However, the strength of the relationship between cohort group and the ratings was weak (η = 0.01). The main effect of migration status was statistically significant and weak (Wilks’ λ = 0.98, F(5, 15391) = 45.3, P < 0.01; η = 0.02). Differences in ABIM Internal Medicine Certification Examination scores based on whether or not the CSA was required were statistically significant, although the magnitude of the association between these variables was very small. The findings suggest that the implementation of an additional evaluation of skills (e.g., history-taking, physical examination) as a prerequisite to postgraduate medical education (residency) provides some additional, relevant data to those who select ECFMG-certified residents.

D. W. McKinley (✉) · J. R. Boulet
Foundation for Advancement of International Medical Education and Research (FAIMER®), 3624 Market Street, 4th Floor, Philadelphia, PA 19104, USA
e-mail: [email protected]

B. J. Hess · R. S. Lipner
American Board of Internal Medicine (ABIM®), Philadelphia, PA, USA

Adv in Health Sci Educ (2014) 19:19–28

DOI 10.1007/s10459-013-9456-6

Keywords Clinical skills assessment · Internal medicine certification · Competency ratings · Validity

Introduction

To ensure a minimal level of competence in a profession (e.g., law, teaching, or medicine),

licensure or certification is usually a prerequisite to practice. Licensure and certification

programs often consist of a series of examinations, as well as assurance that minimal educational

requirements are met. Inappropriate actions by individuals in these professions could

result in serious adverse effects, so protection of the public is of great concern. In the United

States (US), various agencies are charged with protection of the public; this is commonly

accomplished through examination and credentials verification. Ensuring equivalence of

education and experience prior to practice can be challenging, particularly if those seeking

entry into the profession received their professional training in other countries.

The US physician certification and licensure requirements for those who obtained their

medical education abroad have changed over time, usually in accordance with public policy

changes. After World War II, as increasing numbers of physicians trained outside the US sought

opportunities in the US healthcare system, the need to assess the training and qualifications of

these physicians became evident (Gary et al. 1997). In 1956, the Educational Commission for

Foreign Medical Graduates (ECFMG™) was established to ensure that graduates of medical

schools outside the US and Canada were qualified to enter graduate medical education (residency)

programs and to pursue licensure in the US. While ECFMG certification requirements

for international medical graduates (IMGs) have changed over time, they have always included

assessment of biomedical and clinical knowledge, English proficiency, and verification of

medical education credentials (Gary et al. 1997; Melnick 2006).

Beginning in 1988, IMGs were allowed to take the same professional examinations as US

medical school students and graduates. Between 1988 and 1993, IMGs could take either the

Foreign Medical Graduates Examination in the Medical Sciences (FMGEMS) or the National

Board of Medical Examiners’ Part I and Part II examinations. In 1993, FMGEMS was

administered for the last time, and beginning in 1994, was replaced by the United States

Medical Licensing Examination™ (USMLE™) Step 1 and Step 2 examinations. Each of

these examination sequences consisted of multiple-choice questions. Because of concerns

regarding the instruction and evaluation of clinical skills internationally, ECFMG, in 1998,

incorporated an additional, performance-based examination, the clinical skills assessment

(CSA), into the certification process (Hallock and Kostis 2006; Whelan et al. 2005).

When scores from assessments are interpreted as an indication of proficiency in practice,

performance on the test is typically extrapolated to indicate performance in practice (Kane

1994, 2013). Research has shown that changes in examination requirements may be associated

with shifts in the characteristics and ability of the applicants (Whelan et al. 2002). Since

certification test results (i.e., pass-fail decisions) can be interpreted as an indication that those

certified have the minimal skills to competently practice (Kane 1994; Tamblyn et al. 2002), it is

prudent to monitor changes in the requirements and performance over time. Changes in physician

licensure examination requirements provide an opportunity to determine the extent to

which the requirement of an assessment of clinical skills and other changes affect the validity of

these interpretations. The purpose of the current investigation was to compare two cohorts of

ECFMG-certified physicians who entered internal medicine (IM) residency programs and

sought IM certification from the American Board of Internal Medicine (ABIM). Comparisons

were made between those who were certified between 1993 and 1997 (before the CSA

requirement) and those who were certified between 1999 and 2003 (post-CSA implementation)

on residency program director ratings and IM certification examination scores. We examined

whether there were changes in the characteristics of examinees over time, and whether these

changes were reflected in residency program director ratings and IM certification examination scores.

Method

The ECFMG applicant database was used to extract demographic information, and two

cohorts were identified based on their year of ECFMG certification. Scores for the USMLE

Step 1 (Basic Sciences) and Step 2 (Clinical Knowledge) were extracted from the ECFMG

examination history database. Through a unique identifier, ECFMG examinee records were

merged with information from the ABIM. Once the data were merged, all identifiers were

removed; only a sequence number was retained to link the de-identified records. A retrospective,

observational study was conducted to examine changes in examinee characteristics

and physician ability as measured by IM ratings and certification examination scores.

Measures

ABIM program director ratings

Program directors use a standardized tool to rate IM residents annually on several skills,

including medical knowledge, history taking, physical examination, procedural skills, and

professionalism. Each component has descriptive anchors that illustrate characteristics of

the worst and best performance. The ratings are scored on a nine-point Likert scale that is

divided into three categories: ‘‘Unsatisfactory’’ (score of 1–3), ‘‘Satisfactory’’ (score of

4–6), and ‘‘Superior’’ (score of 7–9). At the completion of residency training, a satisfactory

rating or higher in all components, including a rating of overall clinical competence, is

required to take the ABIM Internal Medicine Certification Examination. For the purpose of

the current study, 77,005 ratings (five dimensions per resident) from the end of residents’

first year in the program were used, since these data were closest in time to the administration

of the ECFMG CSA. Prior research has shown that the overall clinical competence

ratings correlate significantly with certification examination scores (Shea et al. 1993).
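The rating bands described above can be illustrated with a minimal sketch. The function and variable names are hypothetical and are not part of the ABIM rating tool; the resident count of 15,401 is the final sample size reported in the Results, and the eligibility check only restates the rule that every component must be rated satisfactory or higher.

```python
# Illustrative sketch only: names are hypothetical, not ABIM's own tooling.
DIMENSIONS = (
    "medical knowledge",
    "history taking",
    "physical examination",
    "procedural skills",
    "professionalism",
)

# Final sample size reported in the Results section of this article.
RESIDENTS_IN_SAMPLE = 15_401
# Five first-year ratings per resident.
EXPECTED_RATING_COUNT = RESIDENTS_IN_SAMPLE * len(DIMENSIONS)


def rating_category(score: int) -> str:
    """Map a rating on the nine-point scale to its category band."""
    if not 1 <= score <= 9:
        raise ValueError("rating must be an integer from 1 to 9")
    if score <= 3:
        return "Unsatisfactory"
    if score <= 6:
        return "Satisfactory"
    return "Superior"


def eligible_for_exam(ratings: dict) -> bool:
    """True when every rated component is satisfactory (4) or higher."""
    return all(score >= 4 for score in ratings.values())
```

Under these assumptions, five ratings for each of the 15,401 residents gives the 77,005 ratings cited in the text.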

ABIM internal medicine certification examination

As an additional measure of knowledge, we used the ABIM Internal Medicine Certification

Examination, a secure multiple-choice examination comprised of 200 scoreable questions

using patient vignettes that require a single-best-answer response, to measure practicing

physicians’ cognitive skills. Scores reflect fund of medical knowledge, diagnostic acumen,

and clinical judgment in general internal medicine. Physicians were expected to integrate

information, prioritize alternatives, and/or use clinical judgment to reach an appropriate

decision about a course of action in each question. We used physicians’ scores from their first

attempt on the paper-and-pencil administrations of the examination. Because physicians took

the examination at different times (tests were taken between 1991 and 2005), equated scores

(Holland and Dorans 2006) reported on a standardized score scale (Mean = 500, SD = 100)

were used as another measure of physician proficiency.
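As a rough illustration of the reporting scale only, the sketch below converts between a standardized (z) score and a scale with mean 500 and SD 100. It assumes a simple linear transformation; the equating procedure itself (Holland and Dorans 2006) is not shown, and the function names are hypothetical.

```python
# Assumption: the reported scale is a linear transformation of a z score.
SCALE_MEAN = 500.0
SCALE_SD = 100.0


def z_to_scale(z: float) -> float:
    """Convert a z score to the 500/100 reporting scale."""
    return SCALE_MEAN + SCALE_SD * z


def scale_to_z(score: float) -> float:
    """Convert a reported scale score back to a z score."""
    return (score - SCALE_MEAN) / SCALE_SD
```

On such a scale, a score of 450 would sit half a standard deviation below the scale mean.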

Analysis

Descriptive statistics on the CSA cohorts (i.e., those ECFMG-certified before and after the

CSA requirement) were provided for demographic variables: gender and student migration

status (national student vs. international student). Research has shown that some physicians

leave their home countries to pursue medical education (Hallock et al. 2007), and for these

physicians, a relationship between student migration status and performance may exist

(Boulet et al. 2009; Norcini et al. 2006; van Zanten et al. 2007). In light of the results of

research in this area, migration status was included as a factor in the analysis of the data.

To examine the differences between CSA cohorts on the set of five program director

ratings, a multivariate analysis of covariance (MANCOVA) was conducted. Follow-up

univariate F tests were then conducted to assess differences on each skill rated by the

program director. To control for potential ability differences in the two cohorts, USMLE

Step 1 and Step 2 scores were used as covariates. Cohen’s d (standardized mean differences)

was calculated for each skill dimension to determine the relative magnitude of

differences in mean scores in standard deviation units (Rosnow and Rosenthal 2003). To

compare ABIM Internal Medicine Certification Examination scores, a univariate analysis

of covariance (ANCOVA) was conducted, with cohort membership (1993–1997 vs.

1999–2003) and migration status (national vs. international) as the independent variables.

We assessed statistical significance using an alpha of 0.05. Analyses were performed using

SPSS, Version 19.0 (IBM 2010). This study was exempt from IRB review. Personal

identification information was removed, and only group-level results are reported.

Results

Demographic characteristics

When the ECFMG data were merged with the program director ratings and IM certification

examination results obtained from the ABIM, a total of 15,609 examinees across both

cohorts were matched. Of those, 205 (1.3 %) were missing program director ratings and

three were missing IM certification scores, leaving 15,401 in the final examinee sample. Of

the 15,401 ECFMG certificate holders, 10,458 (68 %) were certified between 1993 and

1997. When comparing the two groups, 4,101 (39 %) of those certified by ECFMG

between 1993 and 1997 were female, while 2,129 (43 %) of those certified between 1999

and 2003 were female. Those ECFMG-certified between 1993 and 1997 attended 806

medical schools in 122 countries/territories; those certified between 1999 and 2003

attended 654 schools in 120 countries. Cohort differences were observed based on student

migration status; of those certified between 1993 and 1997, 17 % (n = 1,775) left their

country of citizenship for undergraduate medical education. The international students

were citizens of the US (n = 724), Canada (n = 43), and other countries (n = 1,008). In

contrast, 38 % (n = 1,874) of those certified between 1999 and 2003 were international

students. In this group, more international students were citizens of the US (n = 1,278) and

Canada (n = 63).
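The sample counts and percentages in this section can be cross-checked with simple arithmetic. The sketch below uses only figures reported in the text; the variable names are hypothetical.

```python
# Counts as reported in the text; variable names are illustrative.
matched = 15_609
missing_ratings = 205
missing_scores = 3
final_sample = matched - missing_ratings - missing_scores

pre_csa_n = 10_458                      # certified 1993-1997
post_csa_n = final_sample - pre_csa_n   # certified 1999-2003

intl_pre = 1_775
intl_post = 1_874
# Citizenship breakdown of international students in the 1993-1997 cohort.
intl_pre_by_citizenship = {"US": 724, "Canada": 43, "other": 1_008}


def pct(part: int, whole: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)
```

With these inputs the final sample is 15,401, the later cohort has 4,943 members, and the proportions of international students round to 17 % and 38 %.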

Program director ratings

Results of the MANCOVA indicated a statistically significant interaction between cohort group and migration status after controlling for USMLE Step 1 and 2 scores (Wilks’ λ = 1.0, F(5, 15391) = 6.7, P < 0.01), indicating that the differences in ratings between the two cohorts differed for national and international students. Follow-up F tests showed that this interaction was significant only for medical knowledge ratings (F(1, 15395) = 13.6, P < 0.01). Figure 1 provides a depiction of the interaction between cohort group and migration status.

After adjustment by covariates, the main effect of cohort membership was statistically significant (Wilks’ λ = 0.99, F(5, 15391) = 19.9, P < 0.001), indicating that physicians’ performance as measured by program director ratings differed for the CSA cohorts. The follow-up F tests showed that the mean ratings of those in the post-CSA group were higher, on average, than those in the pre-CSA group on all ratings. Table 1 provides mean program director ratings, standardized mean differences (Cohen’s d), and follow-up F test results based on cohort membership. Standardized mean differences ranged from 0.12 (Medical Knowledge) to 0.19. The largest differences were found for History-Taking (d = 0.19) and Procedural Skills (d = 0.19). The strength of the relationship between cohort group and the ratings was weak, however, with η = 0.01.
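The standardized mean differences can be approximated from the summary statistics in Table 1. The sketch below assumes the pooled-standard-deviation form of Cohen's d; the article cites Rosnow and Rosenthal (2003) but does not state the exact formula used, and the function names are hypothetical. Because the published means and SDs are rounded, recomputed values may differ from the table in the second decimal place.

```python
from math import sqrt


def pooled_sd(n1: int, sd1: float, n2: int, sd2: float) -> float:
    """Pooled standard deviation of two independent groups."""
    return sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))


def cohens_d(mean1: float, sd1: float, n1: int,
             mean2: float, sd2: float, n2: int) -> float:
    """Standardized mean difference (group 2 minus group 1)."""
    return (mean2 - mean1) / pooled_sd(n1, sd1, n2, sd2)


PRE_N, POST_N = 10_458, 4_943

# (pre mean, pre SD, post mean, post SD) as printed in Table 1.
d_medical_knowledge = cohens_d(5.81, 1.1, PRE_N, 5.95, 1.2, POST_N)
d_procedural_skills = cohens_d(5.56, 1.2, PRE_N, 5.77, 0.9, POST_N)
```

Under this assumption the two recomputed values fall close to the 0.12 and 0.19 reported for medical knowledge and procedural skills.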

Similarly, the main effect of migration status was statistically significant (Wilks’ λ = 0.98, F(5, 15391) = 45.3, P < 0.01). The follow-up F tests showed that international students (those who attended medical school outside their country of citizenship) outperformed national students on all ratings except for medical knowledge (Table 2), although the differences in mean program director ratings of physical examination and procedural skills were not statistically significant. The strength of the relationship between migration status and the ratings was weak, however, with η = 0.02.

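Fig. 1 Mean program director ratings by ECFMG-certified cohort and migration status. Vertical axis: mean rating (1–9). Horizontal axis: medical knowledge, history taking, physical examination, procedural skills, professionalism. Series: national 1993–1997, national 1999–2003, international 1993–1997, international 1999–2003.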


ABIM Internal Medicine Certification Examinations

The ANCOVA revealed a statistically significant interaction between cohort group and migration status for the ABIM Internal Medicine Certification Examination scores (F(1, 15395) = 10.7, P < 0.01), after controlling for USMLE Step 1 and Step 2 scores. This interaction indicates that performance differences between the cohort groups varied based on migration status; however, the interaction effect was negligible (η = 0.001). This result is depicted in Fig. 2; the increase in mean certification score based on cohort membership was larger for national students. There was a main effect for cohort group (F(1, 15395) = 90.9, P = 0.001; η = 0.01), denoting a statistically significant difference in mean scores obtained by those who were certified by ECFMG prior to the CSA requirement (Mean = 481.1, SD = 90.8) compared to those who were certified post-CSA (Mean = 483.7, SD = 85.8). However, the standardized mean difference was small (d = 0.03), and the association between cohort membership and examination scores was minimal. There was also a statistically significant main effect for migration status (F(1, 15395) = 516.7, P < 0.01). Those examinees who attended medical school in their country of citizenship obtained higher scores, on average (Mean = 491.9, SD = 84.4), than those who emigrated for their undergraduate medical education (Mean = 449.6, SD = 96.2). The standardized mean difference based on migration status was 0.47, indicating a difference of almost one-half of a standard deviation between the two groups. The association between migration status and certification examination scores was still small, however (η = 0.03).

Table 1 Descriptive statistics: program director ratings by ECFMG CSA certification cohort

| Skill rated | Pre (1993–1997), n = 10,458: Mean | SD | Post (1999–2003), n = 4,943: Mean | SD | Standardized mean difference | F* | P value |
|---|---|---|---|---|---|---|---|
| Medical knowledge | 5.81 | 1.1 | 5.95 | 1.2 | 0.12 | 5.6 | 0.02 |
| History taking | 5.81 | 0.9 | 5.99 | 0.9 | 0.19 | 12.0 | 0.00 |
| Physical examination | 5.81 | 0.9 | 5.97 | 0.9 | 0.17 | 4.4 | 0.04 |
| Procedural skills | 5.56 | 1.2 | 5.77 | 0.9 | 0.19 | 32.6 | 0.00 |
| Professionalism | 6.43 | 1.1 | 6.59 | 1.1 | 0.14 | 13.1 | 0.00 |

* F values were obtained from individual significance tests that followed MANCOVA

ECFMG Educational Commission for Foreign Medical Graduates, CSA clinical skills assessment

Table 2 Descriptive statistics: program director ratings by migration status

| Skill rated | National, n = 11,752: Mean | SD | International, n = 3,649: Mean | SD | Standardized mean difference | F* | P value |
|---|---|---|---|---|---|---|---|
| Medical knowledge | 5.90 | 1.1 | 5.71 | 1.1 | 0.16 | 44.0 | 0.00 |
| History taking | 5.85 | 0.9 | 5.92 | 1.0 | 0.07 | 17.9 | 0.00 |
| Physical examination | 5.86 | 0.9 | 5.88 | 0.9 | 0.02 | 3.1 | 0.08 |
| Procedural skills | 5.62 | 1.1 | 5.66 | 1.1 | 0.04 | 2.1 | 0.15 |
| Professionalism | 6.46 | 1.1 | 6.56 | 1.1 | 0.09 | 20.4 | 0.00 |

* F values were obtained from individual significance tests that followed MANCOVA
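The article does not state how its strength-of-association values were computed. As a hedged cross-check, the sketch below assumes they correspond to partial eta squared, recovered from each F statistic and its degrees of freedom, and, for the multivariate tests, to one minus Wilks' lambda (the single-degree-of-freedom case). The function names are hypothetical.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F statistic (assumed measure)."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def multivariate_eta_squared(wilks_lambda: float, s: int = 1) -> float:
    """Multivariate effect size from Wilks' lambda; s = 1 for a two-group effect."""
    return 1.0 - wilks_lambda ** (1.0 / s)


# F statistics for the certification examination scores, as reported above.
eta_interaction = partial_eta_squared(10.7, 1, 15_395)
eta_cohort = partial_eta_squared(90.9, 1, 15_395)
eta_migration = partial_eta_squared(516.7, 1, 15_395)

# Wilks' lambda values reported for the program director ratings.
eta_mv_cohort = multivariate_eta_squared(0.99)
eta_mv_migration = multivariate_eta_squared(0.98)
```

Under these assumptions the recomputed values round to the reported 0.001, 0.01, and 0.03 for the examination scores and to 0.01 and 0.02 for the ratings, which is consistent with, but does not confirm, the measure the authors used.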

Discussion

For all skills rated by program directors, there were consistent increases in average ratings from pre- to post-CSA requirement. Program director ratings from the first year of residency suggested that evaluation of skills measured by the ECFMG CSA allows extrapolation of these results to performance in residency; first-year ratings of medical knowledge, history taking, physical examination, and procedural skills were higher for IMGs admitted to programs once the CSA requirement for certification was implemented. These findings provide evidence to support the validity of the CSA. The additional requirement appears to be related to improved performance of IMGs during their first year of residency in IM; the mean program director ratings of those who were in their first year of residency after the CSA requirement were higher than those of physicians who were in residency before the CSA requirement on all measures. There was also an association between student migration status and program director ratings. Here, the pattern suggests that those who obtained their undergraduate medical education in their home country received higher program director ratings in medical knowledge, on average, than those who emigrated for their undergraduate medical education. However, for all other residency training measures investigated, international students received higher program director ratings, on average. The interaction between CSA cohort and migration status reflects that, although there is a small association between cohort membership and program director ratings, the effect is confounded with whether or not the physician emigrated for their undergraduate medical education.

Fig. 2 Mean internal medicine certification scores by ECFMG-certified cohort and migration status

After controlling for differences in prior ability (using USMLE Step 1 and Step 2 CK scores), the difference in ABIM Internal Medicine Certification Examination scores based on whether or not the CSA was required was statistically significant, although the magnitude of the association between these two variables was very small. The graduate medical education received up to the point of taking the ABIM certification examination may offset any differences in ability that would be detected by the certification examination scores in association with CSA cohort. However, physicians who emigrated for their medical education had significantly lower IM certification examination scores, on average, than those who attended medical school in their country of citizenship. The difference in scores based on student migration status was almost half a standard deviation in magnitude, similar in direction to the finding with respect to program directors’ ratings of medical knowledge. Still, the association between this factor and certification examination scores was small, indicating that factors not accounted for in the analysis may better explain the differences observed.

While these findings provide minimal support for the interpretations associated with the requirements of ECFMG’s certification process (i.e., including verification of credentials, medical education requirements, and performance on USMLE examinations), the current investigation is not without limitations. First, because ECFMG certification is required for entry to graduate medical education in the United States, it is challenging to separate differences in performance based on factors other than the implementation of the CSA requirement. That is, factors other than those considered in this study (e.g., changes in medical school curricula, clinical experiences offered during medical education) may affect the findings. However, even after accounting for ability as measured by USMLE Steps 1 and 2, differences based on the time periods studied persisted.

An alternate explanation regarding changes in ability as assessed by the measures used

in this study is associated with changes to the passing scores for the examinations. Testing

organizations regularly review and revise the standards and passing scores associated with

their examinations. For example, standard setting research resulted in changes to the

ECFMG CSA and USMLE Steps 1 and 2. Between 1997 and 1999, the Step 1 passing

score was raised; between 1997 and 2003 the Step 2 passing score was raised four times. In

addition to the implementation of CSA, higher passing scores for the USMLE examinations

taken prior to residency are most certainly associated with improved performance for

the latter cohort. While the findings support the more general assertion that changes in

examination requirements are likely to be associated with improved performance, it is

impossible to disentangle this effect from the implementation of the CSA.

Migration status based on emigration for undergraduate medical education, while considered, was not fully studied. There may be characteristics of the medical schools that émigrés attend that would elucidate performance differences. While international students included in the current investigation did not perform as well on the IM certification examination, the ratings they received from program directors were equal to or slightly higher, on average, than those of students who attended medical school in their country of citizenship. Because this variable is related to performance, and the number of students emigrating for medical education appears to be increasing (Hallock et al. 2007), additional research on the differences between the experiences of national and international students in medical school is warranted.

Third, the study is limited to a single medical specialty (internal medicine) and the

findings cannot be generalized to other specialties. Finally, the comparisons made were

limited to those graduates of international medical education programs and did not include

graduates of medical schools in the US and Canada. Now that the USMLE examination

series includes a clinical skills examination as part of licensure requirements, it is possible

to conduct a similar study for graduates of US and Canadian medical schools.

Despite these limitations, it is interesting to note that the results provide some evidence

of the validity of the implementation of an additional requirement for ECFMG certification.

Those applicants who obtained their certification after the CSA requirement was

implemented demonstrated that they had slightly greater knowledge and skills than those

certified before the requirement, although this finding is likely to be associated with

changes in the prerequisite licensure examinations (USMLE) as well. Still, the findings

suggest that the implementation of an additional evaluation of skills (e.g., history-taking,

physical examination as measured by the ECFMG CSA) as a prerequisite to practice has

provided valuable new data to those who select ECFMG-certified residents. While

self-selection may have played a role, in addition to the factors studied, the results suggest that

this additional screening may have provided a more able pool of physicians available to

practice in the US healthcare system.

Acknowledgments This research was supported by the Foundation for Advancement of International Medical Education and Research and the American Board of Internal Medicine. The findings and conclusions do not necessarily reflect the opinions of the organizations.

References

Boulet, J. R., Cooper, R. A., Seeling, S. S., Norcini, J. J., & McKinley, D. W. (2009). U.S. citizens who obtain their medical degrees abroad: An overview, 1992–2006. Health Affairs, 28(1), 226–233.

Gary, N. E., Sabo, M. M., Shafron, M. L., Wald, M. K., Ben-David, M. F., & Kelly, W. C. (1997). Graduates of foreign medical schools: Progression to certification by the Educational Commission for Foreign Medical Graduates. Academic Medicine: Journal of the Association of American Medical Colleges, 72(1), 17–22.

Hallock, J. A., & Kostis, J. B. (2006). Celebrating 50 years of experience: An ECFMG perspective. Academic Medicine: Journal of the Association of American Medical Colleges, 81(12 Suppl), S7–16. doi:10.1097/01.ACM.0000243344.55996.1e.

Hallock, J. A., McKinley, D. W., & Boulet, J. R. (2007). Migration of doctors for undergraduate medical education. Medical Teacher, 29(2–3), 98–105.

Holland, P. W., & Dorans, N. J. (2006). Linking and equating. In R. Brennan (Ed.), Educational measurement (4th ed., pp. 187–220). Westport, CT: Praeger Publishers.

IBM. (2010). SPSS. Chicago, IL.

Kane, M. T. (1994). Validating interpretive arguments for licensure and certification examinations. Evaluation and the Health Professions, 17(2), 133–159; discussion 236–241.

Kane, M. T. (2013). Validating the interpretations and uses of test scores. Journal of Educational Measurement, 50(1), 1–73. doi:10.1111/jedm.12000.

Melnick, D. E. (2006). From defending the walls to improving global medical education: Fifty years of collaboration between the ECFMG and the NBME. Academic Medicine, 81(12 Suppl), S30–S35. doi:10.1097/01.ACM.0000243462.05719.e1.

Norcini, J., Anderson, M. B., & McKinley, D. W. (2006). The medical education of United States citizens who train abroad. Surgery, 140(3), 338–346.


Rosnow, R. L., & Rosenthal, R. (2003). Effect sizes for experimenting psychologists. Canadian Journal of Experimental Psychology/Revue Canadienne de Psychologie Experimentale, 57, 221–237. doi:10.1037/h0087427.

Shea, J. A., Norcini, J. J., & Kimball, H. R. (1993). Relationships of ratings of clinical competence and ABIM scores to certification status. Academic Medicine, 68(10 Suppl), S22–S24.

Tamblyn, R., Abrahamowicz, M., Dauphinee, W. D., Hanley, J. A., Norcini, J., Girard, N., et al. (2002). Association between licensure examination scores and practice in primary care. JAMA, 288(23), 3019–3026.

Van Zanten, M., Boulet, J. R., McKinley, D. W., De Champlain, A., & Jobe, A. C. (2007). Assessing the communication and interpersonal skills of graduates of international medical schools as part of the United States Medical Licensing Exam (USMLE) Step 2 Clinical Skills (CS) Exam. Academic Medicine, 82(10 Suppl), S65–S68.

Whelan, G. P., Boulet, J. R., McKinley, D. W., Norcini, J. J., Van Zanten, M., Hambleton, R. K., et al. (2005). Scoring standardized patient examinations: Lessons learned from the development and administration of the ECFMG clinical skills assessment (CSA). Medical Teacher, 27(3), 200–206.

Whelan, G. P., Gary, N. E., Kostis, J., Boulet, J. R., & Hallock, J. A. (2002). The changing pool of international medical graduates seeking certification training in US graduate medical education programs. JAMA, 288(9), 1079–1084.

