Examining changes in certification/licensure requirements and the international medical graduate examinee pool
Danette W. McKinley • Brian J. Hess • John R. Boulet •
Rebecca S. Lipner
Received: 21 September 2012 / Accepted: 25 March 2013 / Published online: 20 April 2013
© Springer Science+Business Media Dordrecht 2013
Abstract Changes in certification requirements and examinee characteristics are likely to
influence the validity of the evidence associated with interpretations made based on test data. We
examined whether changes in Educational Commission for Foreign Medical Graduates (ECFMG)
certification requirements over time were associated with changes in internal medicine
(IM) residency program director ratings and certification examination scores. Comparisons were
made between physicians who were ECFMG-certified before and after the Clinical Skills
Assessment (CSA) requirement. A multivariate analysis of covariance was conducted to examine
the differences in program director ratings based on CSA cohort and whether the examinees
emigrated for undergraduate medical education (national vs. international students). A univariate
analysis of covariance was conducted to examine differences in scores from the American Board
of Internal Medicine (ABIM) Internal Medicine Certification Examination. For both analyses,
United States Medical Licensing Examination (USMLE) Step 1 and Step 2 scores were used as
covariates. Results indicate that, of those certified by ECFMG between 1993 and 1997, 17 %
(n = 1,775) left their country of citizenship for undergraduate medical education. In contrast,
38 % (n = 1,874) of those certified between 1999 and 2003 were international students. After
adjustment by covariates, the main effect of cohort membership on the program director ratings
was statistically significant (Wilks’ λ = 0.99, F(5, 15391) = 19.9, P < 0.001). However, the
strength of the relationship between cohort group and the ratings was weak (η = 0.01). The main
effect of migration status was statistically significant and weak (Wilks’ λ = 0.98,
F(5, 15391) = 45.3, P < 0.01; η = 0.02). Differences in ABIM Internal Medicine Certification
Examination scores based on whether or not the CSA was required were statistically significant,
although the magnitude of the association between these variables was very small. The findings
suggest that the implementation of an additional evaluation of skills (e.g., history-taking, physical
examination) as a prerequisite to postgraduate medical education (residency) provides some
additional, relevant data to those who select ECFMG-certified residents.
D. W. McKinley (✉) • J. R. Boulet
Foundation for Advancement of International Medical Education and Research (FAIMER®), 3624 Market Street, 4th Floor, Philadelphia, PA 19104, USA
e-mail: [email protected]
B. J. Hess • R. S. Lipner
American Board of Internal Medicine (ABIM®), Philadelphia, PA, USA
Adv in Health Sci Educ (2014) 19:19–28
DOI 10.1007/s10459-013-9456-6
Keywords Clinical skills assessment • Internal medicine certification • Competency ratings • Validity
Introduction
To ensure a minimal level of competence in a profession (e.g., law, teaching, or medicine),
licensure or certification is usually a prerequisite to practice. Licensure and certification
programs often consist of a series of examinations, as well as assurance that minimal educational
requirements are met. Inappropriate actions by individuals in these professions could
result in serious adverse effects, so protection of the public is of great concern. In the United
States (US), various agencies are charged with protection of the public; this is commonly
accomplished through examination and credentials verification. Ensuring equivalence of
education and experience prior to practice can be challenging, particularly if those seeking
entry into the profession received their professional training in other countries.
The US physician certification and licensure requirements for those who obtained their
medical education abroad have changed over time, usually in accordance with public policy
changes. After World War II, as increasing numbers of physicians trained outside the US sought
opportunities in the US healthcare system, the need to assess the training and qualifications of
these physicians became evident (Gary et al. 1997). In 1956, the Educational Commission for
Foreign Medical Graduates (ECFMG™) was established to ensure that graduates of medical
schools outside the US and Canada were qualified to enter graduate medical education (residency)
programs and to pursue licensure in the US. While ECFMG certification requirements
for international medical graduates (IMGs) have changed over time, they have always included
assessment of biomedical and clinical knowledge, English proficiency, and verification of
medical education credentials (Gary et al. 1997; Melnick 2006).
Beginning in 1988, IMGs were allowed to take the same professional examinations as US
medical school students and graduates. Between 1988 and 1993, IMGs could take either the
Foreign Medical Graduates Examination in the Medical Sciences (FMGEMS) or the National
Board of Medical Examiners’ Part I and Part II examinations. In 1993, FMGEMS was
administered for the last time, and beginning in 1994, was replaced by the United States
Medical Licensing Examination™ (USMLE™) Step 1 and Step 2 examinations. Each of
these examination sequences consisted of multiple-choice questions. Because of concerns
regarding the instruction and evaluation of clinical skills internationally, ECFMG, in 1998,
incorporated an additional, performance-based examination, the clinical skills assessment
(CSA), into the certification process (Hallock and Kostis 2006; Whelan et al. 2005).
When scores from assessments are interpreted as an indication of proficiency in practice,
performance on the test is typically extrapolated to indicate performance in practice (Kane
1994, 2013). Research has shown that changes in examination requirements may be associated
with shifts in the characteristics and ability of the applicants (Whelan et al. 2002). Since
certification test results (i.e., pass-fail decisions) can be interpreted as an indication that those
certified have the minimal skills to competently practice (Kane 1994; Tamblyn et al. 2002), it is
prudent to monitor changes in the requirements and performance over time. Changes in physician
licensure examination requirements provide an opportunity to determine the extent to
which the requirement of an assessment of clinical skills and other changes affect the validity of
these interpretations. The purpose of the current investigation was to compare two cohorts of
ECFMG-certified physicians who entered internal medicine (IM) residency programs and
sought IM certification from the American Board of Internal Medicine (ABIM). Comparisons
were made between those who were certified between 1993 and 1997 (before the CSA
requirement) and those who were certified between 1999 and 2003 (post-CSA implementation)
on residency program director ratings and IM certification examination scores. We examined
whether there were changes in the characteristics of examinees over time, and whether these
changes were reflected in residency program director ratings and IM certification examination scores.
Method
The ECFMG applicant database was used to extract demographic information, and two
cohorts were identified based on their year of ECFMG certification. Scores for the USMLE
Step 1 (Basic Sciences) and Step 2 (Clinical Knowledge) were extracted from the ECFMG
examination history database. Through a unique identifier, ECFMG examinee records were
merged with information from the ABIM. Once the data were merged, all identifiers were
removed; only a sequence number was retained to link the de-identified records. A retrospective,
observational study was conducted to examine changes in examinee characteristics
and physician ability as measured by IM ratings and certification examination scores.
Measures
ABIM program director ratings
Program directors use a standardized tool to rate IM residents annually on several skills,
including medical knowledge, history taking, physical examination, procedural skills, and
professionalism. Each component has descriptive anchors that illustrate characteristics of
the worst and best performance. The ratings are scored on a nine-point Likert scale that is
divided into three categories: ‘‘Unsatisfactory’’ (score of 1–3), ‘‘Satisfactory’’ (score of
4–6), and ‘‘Superior’’ (score of 7–9). At the completion of residency training, a satisfactory
rating or higher in all components, including a rating of overall clinical competence, is
required to take the ABIM Internal Medicine Certification Examination. For the purpose of
the current study, 77,005 ratings (five dimensions per resident) from the end of residents’
first year in the program were used, since these data were closest in time to the administration
of the ECFMG CSA. Prior research has shown that the overall clinical competence
ratings correlate significantly with certification examination scores (Shea et al. 1993).
ABIM internal medicine certification examination
As an additional measure of knowledge, we used the ABIM Internal Medicine Certification
Examination, a secure multiple-choice examination comprised of 200 scoreable questions
using patient vignettes that require a single-best-answer response, to measure practicing
physicians’ cognitive skills. Scores reflect fund of medical knowledge, diagnostic acumen,
and clinical judgment in general internal medicine. Physicians were expected to integrate
information, prioritize alternatives, and/or use clinical judgment to reach an appropriate
decision about a course of action in each question. We used physicians’ scores from their first
attempt on the paper-and-pencil administrations of the examination. Because physicians took
the examination at different times (tests were taken between 1991 and 2005), equated scores
(Holland and Dorans 2006) reported on a standardized score scale (Mean = 500, SD = 100)
were used as another measure of physician proficiency.
Analysis
Descriptive statistics on the CSA cohorts (i.e., those ECFMG-certified before and after the
CSA requirement) were provided for demographic variables: gender and student migration
status (national student vs. international student). Research has shown that some physicians
leave their home countries to pursue medical education (Hallock et al. 2007), and for these
physicians, a relationship between student migration status and performance may exist
(Boulet et al. 2009; Norcini et al. 2006; van Zanten et al. 2007). In light of the results of
research in this area, migration status was included as a factor in the analysis of the data.
To examine the differences between CSA cohorts on the set of five program director
ratings, a multivariate analysis of covariance (MANCOVA) was conducted. Follow-up
univariate F tests were then conducted to assess differences on each skill rated by the
program director. To control for potential ability differences in the two cohorts, USMLE
Step 1 and Step 2 scores were used as covariates. Cohen’s d (standardized mean differences)
was calculated for each skill dimension to determine the relative magnitude of
differences in mean scores in standard deviation units (Rosnow and Rosenthal 2003). To
compare ABIM Internal Medicine Certification Examination scores, a univariate analysis
of variance (ANOVA) was conducted, with cohort membership (1993–1997 vs.
1999–2003) and migration status (national vs. international) as the independent variables.
We assessed statistical significance using an alpha of 0.05. Analyses were performed using
SPSS, Version 19.0 (IBM 2010). This study was exempt from IRB review. Personal
identification information was removed, and only group-level results are reported.
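The standardized mean difference described above can be illustrated with a short calculation. The sketch below assumes a pooled standard deviation as the denominator (the text does not state which standard deviation was used), and the function name `cohens_d` is illustrative. The inputs are the medical knowledge means, SDs, and cohort sizes reported in Table 1.

```python
import math


def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Standardized mean difference (group b minus group a).

    Assumption: the denominator is the pooled standard deviation of the
    two groups; the paper does not specify its exact choice.
    """
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_b - mean_a) / math.sqrt(pooled_var)


# Medical knowledge ratings from Table 1: pre-CSA cohort vs. post-CSA cohort.
d_med_knowledge = cohens_d(5.81, 1.1, 10_458, 5.95, 1.2, 4_943)
print(round(d_med_knowledge, 2))  # prints 0.12
```

Under this assumption the medical knowledge value matches the 0.12 reported in Table 1. Other rows may differ in the second decimal place because the published means and SDs are rounded.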
Results
Demographic characteristics
When the ECFMG data were merged with the program director ratings and IM certification
examination results obtained from the ABIM, a total of 15,609 examinees across both
cohorts were matched. Of those, 205 (1.3 %) were missing program director ratings and
three were missing IM certification scores, leaving 15,401 in the final examinee sample. Of
the 15,401 ECFMG certificate holders, 10,458 (68 %) were certified between 1993 and
1997. When comparing the two groups, 4,101 (39 %) of those certified by ECFMG
between 1993 and 1997 were female, while 2,129 (43 %) of those certified between 1999
and 2003 were female. Those ECFMG-certified between 1993 and 1997 attended 806
medical schools in 122 countries/territories; those certified between 1999 and 2003
attended 654 schools in 120 countries. Cohort differences were observed based on student
migration status; of those certified between 1993 and 1997, 17 % (n = 1,775) left their
country of citizenship for undergraduate medical education. The international students
were citizens of the US (n = 724), Canada (n = 43), and other countries (n = 1,008). In
contrast, 38 % (n = 1,874) of those certified between 1999 and 2003 were international
students. In this group, more international students were citizens of the US (n = 1,278) and
Canada (n = 63).
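The sample accounting and percentages in this paragraph can be checked with simple arithmetic. The sketch below uses only counts stated in the text; the variable names and the `pct` helper are illustrative, and the count of post-CSA international students from other countries is derived by subtraction rather than reported.

```python
# Counts reported in the text.
matched = 15_609
missing_ratings = 205
missing_scores = 3
final_n = matched - missing_ratings - missing_scores  # final examinee sample

pre_n = 10_458            # certified 1993-1997
post_n = final_n - pre_n  # certified 1999-2003


def pct(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


summary = {
    "pre_share_of_sample": pct(pre_n, final_n),
    "pre_female": pct(4_101, pre_n),
    "post_female": pct(2_129, post_n),
    "pre_international": pct(1_775, pre_n),
    "post_international": pct(1_874, post_n),
}

# Derived, not reported: post-CSA international students who were
# citizens of countries other than the US and Canada.
post_other = 1_874 - 1_278 - 63

print(final_n, post_n, summary, post_other)
```

The recomputed values (15,401 examinees; 68 %, 39 %, 43 %, 17 %, and 38 %) agree with the figures in the text, and the two international-student counts sum to the 3,649 shown in Table 2.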
Program director ratings
Results of the MANCOVA indicated a statistically significant interaction between cohort
group and migration status after controlling for USMLE Step 1 and 2 scores (Wilks’
λ = 1.0, F(5, 15391) = 6.7, P < 0.01), indicating that the differences in ratings between the
two cohorts were different for national and international students. Follow-up F tests
showed that this interaction was significant only for medical knowledge ratings
(F(1, 15395) = 13.6, P < 0.01). Figure 1 provides a depiction of the interaction between
cohort group and migration status.
After adjustment by covariates, the main effect of cohort membership was statistically
significant (Wilks’ λ = 0.99, F(5, 15391) = 19.9, P < 0.001), indicating that physicians’
performance as measured by program director ratings differed for the CSA cohorts. The
follow-up F tests showed that the mean ratings of those in the post-CSA group were higher,
on average, than those in the pre-CSA group on all ratings. Table 1 provides mean program
director ratings, standardized mean differences (Cohen’s d), and follow-up F test results
based on cohort membership. Standardized mean differences ranged from 0.12 (Medical
Knowledge) to 0.19. The largest differences were found for History-Taking (d = 0.19) and
Procedural Skills (d = 0.19). The strength of the relationship between cohort group and the
ratings was weak, however, with η = 0.01.
Similarly, the main effect of migration status was statistically significant (Wilks’
λ = 0.98, F(5, 15391) = 45.3, P < 0.01). The follow-up F tests showed that international
students (those who attended medical school outside their country of citizenship) received
higher mean ratings than national students on all dimensions except medical knowledge
(Table 2), although the differences in ratings of physical examination and procedural
skills were not statistically significant. The strength of the relationship between migration status and
the ratings was weak, however, with η = 0.02.
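One way to relate the reported Wilks’ λ values to the strength-of-association values is the multivariate partial eta squared, 1 − λ^(1/s). The sketch below assumes that this is the statistic reported (the text does not name it) and that s = 1, which holds when a factor has only two levels; the function name is illustrative.

```python
def eta_sq_from_wilks(wilks_lambda, s=1):
    """Multivariate partial eta squared from Wilks' lambda.

    Assumption: s is the smaller of the number of dependent variables and
    the hypothesis degrees of freedom, so s = 1 for a two-level factor.
    """
    return 1 - wilks_lambda ** (1 / s)


cohort_effect = eta_sq_from_wilks(0.99)     # main effect of cohort
migration_effect = eta_sq_from_wilks(0.98)  # main effect of migration status
print(round(cohort_effect, 2), round(migration_effect, 2))  # prints 0.01 0.02
```

Under these assumptions the results agree with the reported values of 0.01 and 0.02. Because λ is reported to only two decimal places, the agreement is approximate.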
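[Figure 1 not reproduced. Vertical axis: mean rating (1–9). Horizontal axis: medical knowledge, history taking, physical examination, procedural skills, professionalism. Series: national 1993–1997, national 1999–2003, international 1993–1997, international 1999–2003.]
Fig. 1 Mean program director ratings by ECFMG-certified cohort and migration status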
Table 1 Descriptive statistics: program director ratings by ECFMG CSA certification cohort
                        Certification cohort
                        Pre (1993–1997)   Post (1999–2003)   Standardized
                        (n = 10,458)      (n = 4,943)        mean
Skill rated             Mean    SD        Mean    SD         difference     F*      P value
Medical knowledge       5.81    1.1       5.95    1.2        0.12            5.6    0.02
History taking          5.81    0.9       5.99    0.9        0.19           12.0    0.00
Physical examination    5.81    0.9       5.97    0.9        0.17            4.4    0.04
Procedural skills       5.56    1.2       5.77    0.9        0.19           32.6    0.00
Professionalism         6.43    1.1       6.59    1.1        0.14           13.1    0.00
* F values were obtained from individual significance tests that followed MANCOVA
ECFMG Educational Commission for Foreign Medical Graduates, CSA clinical skills assessment
Table 2 Descriptive statistics: program director ratings by migration status
                        Student migration status
                        National          International      Standardized
                        (n = 11,752)      (n = 3,649)        mean
Skill rated             Mean    SD        Mean    SD         difference     F*      P value
Medical knowledge       5.90    1.1       5.71    1.1        0.16           44.0    0.00
History taking          5.85    0.9       5.92    1.0        0.07           17.9    0.00
Physical examination    5.86    0.9       5.88    0.9        0.02            3.1    0.08
Procedural skills       5.62    1.1       5.66    1.1        0.04            2.1    0.15
Professionalism         6.46    1.1       6.56    1.1        0.09           20.4    0.00
* F values were obtained from individual significance tests that followed MANCOVA
ABIM Internal Medicine Certification Examinations
ANOVA revealed a statistically significant interaction between cohort group and migration
status for the ABIM Internal Medicine Certification Examination scores (F(1, 15395) = 10.7,
P < 0.01), after controlling for USMLE Step 1 and Step 2 scores. This interaction indicates
that performance differences between the cohort groups varied based on migration status;
however, the interaction effect was negligible (η = 0.001). This result is depicted in Fig. 2;
the increase in mean certification score based on cohort membership was larger for national
students. There was a main effect for cohort group (F(1, 15395) = 90.9, P = 0.001; η = 0.01),
denoting a statistically significant difference in mean scores obtained by those who were
certified by ECFMG prior to the CSA requirement (Mean = 481.1, SD = 90.8) compared
to those who were certified post-CSA (Mean = 483.7, SD = 85.8). However, the standardized
mean difference was small (d = 0.03), and the association between cohort
membership and examination scores was minimal. There was also a statistically significant
main effect for migration status (F(1, 15395) = 516.7, P < 0.01). Those examinees who
attended medical school in their country of citizenship obtained higher scores, on average
(Mean = 491.9, SD = 84.4), than those who emigrated for their undergraduate medical
education (Mean = 449.6, SD = 96.2). The standardized mean difference based on
migration status was 0.47, indicating a difference of almost one-half of a standard deviation
between the two groups. The association between migration status and certification
examination scores was still small, however (η = 0.03).
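The association values reported for the univariate analysis can be related to the F statistics through the usual partial eta squared conversion, F·df1 / (F·df1 + df2). The sketch below assumes that the reported effect size is partial eta squared, which the text does not state explicitly; the function name is illustrative.

```python
def partial_eta_sq(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F statistics reported for the certification examination scores (df = 1, 15395).
interaction = partial_eta_sq(10.7, 1, 15_395)
cohort = partial_eta_sq(90.9, 1, 15_395)
migration = partial_eta_sq(516.7, 1, 15_395)

print(round(interaction, 3), round(cohort, 2), round(migration, 2))  # prints 0.001 0.01 0.03
```

Under that assumption, the recomputed values round to the 0.001, 0.01, and 0.03 reported above. The example also shows how a very large F statistic in a sample of this size can correspond to a small proportion of explained variance.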
Discussion
For all skills rated by program directors, there were consistent increases in average ratings
from pre- to post-CSA requirement. Program director ratings from the first year of residency
suggested that evaluation of skills measured by the ECFMG CSA allows extrapolation
of these results to performance in residency; first-year ratings of medical knowledge,
history taking, physical examination, and procedural skills were higher for IMGs admitted
to programs once the CSA requirement for certification was implemented. These findings
provide evidence to support the validity of the CSA. The additional requirement appears to
be related to improved performance of IMGs during their first year of residency in IM; the
mean program director ratings of those who were in their first year of residency after the
CSA requirement were higher than those who were in residency before the CSA
requirement on all measures. There was also an association between student migration
status and program director ratings. Here, the pattern suggests that those who obtained their
undergraduate medical education in their home country received higher program director
ratings in medical knowledge, on average, than those who emigrated for their undergraduate
medical education. However, for all other residency training measures investigated,
international students received higher program director ratings, on average. The
interaction between CSA cohort and migration status reflects that, although there is a small
association between cohort membership and program director ratings, the effect is confounded
with whether or not the physician emigrated for their undergraduate medical
education.
Fig. 2 Mean internal medicine certification scores by ECFMG-certified cohort and migration status
After controlling for differences in prior ability (using USMLE Step 1 and Step 2 CK
scores), the difference in ABIM Internal Medicine Certification Examination scores based
on whether or not the CSA was required was statistically significant, although the
magnitude of the association between these two variables was very small. The graduate
medical education received up to the point of taking the ABIM certification examination
may offset any differences in ability that would be detected by the certification examination
scores in association with CSA cohort. However, physicians who emigrated for their
medical education had significantly lower IM certification examination scores, on average,
than those who attended medical school in their country of citizenship. The difference in
scores based on student migration status was almost half a standard deviation in magnitude,
similar to the finding with respect to program directors’ ratings of medical knowledge.
Still, the association between this factor and certification examination scores was
small, indicating that factors not accounted for in the analysis may better explain the
differences observed. While these findings provide minimal support for the interpretations
associated with the requirements of ECFMG’s certification process (i.e., including verification
of credentials, medical education requirements, and performance on USMLE
examinations), the current investigation is not without limitations. First, because ECFMG
certification is required for entry to graduate medical education in the United States, it is
challenging to separate differences in performance based on factors other than the
implementation of the CSA requirement. That is, factors other than those considered in this
study (e.g., changes in medical school curricula, clinical experiences offered during
medical education) may affect the findings. However, even after accounting for ability as
measured by USMLE Steps 1 and 2, differences based on the time periods studied
persisted.
An alternate explanation regarding changes in ability as assessed by the measures used
in this study is associated with changes to the passing scores for the examinations. Testing
organizations regularly review and revise the standards and passing scores associated with
their examinations. For example, standard setting research resulted in changes to the
ECFMG CSA and USMLE Steps 1 and 2. Between 1997 and 1999, the Step 1 passing
score was raised; between 1997 and 2003 the Step 2 passing score was raised four times. In
addition to the implementation of CSA, higher passing scores for the USMLE examinations
taken prior to residency are most certainly associated with improved performance for
the latter cohort. While the findings support the more general assertion that changes in
examination requirements are likely to be associated with improved performance, it is
impossible to disentangle this effect from the implementation of the CSA.
Second, migration status based on emigration for undergraduate medical education, while
considered, was not fully studied. There may be characteristics of the medical schools that
emigres attend that would elucidate performance differences. While international students
included in the current investigation did not perform as well on the IM certification
examination, the ratings they received from program directors were equal to or slightly
higher, on average, than those of students who attended medical school in their country of citizenship.
Because this variable is related to performance, and the number of students
emigrating for medical education appears to be increasing (Hallock et al. 2007), additional
research on the differences between the experiences of national and international students
in medical school is warranted.
Third, the study is limited to a single medical specialty (internal medicine) and the
findings cannot be generalized to other specialties. Finally, the comparisons made were
limited to those graduates of international medical education programs and did not include
graduates of medical schools in the US and Canada. Now that the USMLE examination
series includes a clinical skills examination as part of licensure requirements, it is possible
to conduct a similar study for graduates of US and Canadian medical schools.
Despite these limitations, it is interesting to note that the results provide some evidence
of the validity of the implementation of an additional requirement for ECFMG certification.
Those applicants who obtained their certification after the CSA requirement was
implemented demonstrated that they had slightly greater knowledge and skills than those
certified before the requirement, although this finding is likely to be associated with
changes in the prerequisite licensure examinations (USMLE) as well. Still, the findings
suggest that the implementation of an additional evaluation of skills (e.g., history-taking,
physical examination as measured by the ECFMG CSA) as a prerequisite to practice has
provided valuable new data to those who select ECFMG-certified residents. While self-
selection may have played a role, in addition to the factors studied, the results suggest that
this additional screening may have provided a more able pool of physicians available to
practice in the US healthcare system.
Acknowledgments This research was supported by the Foundation for Advancement of International Medical Education and Research and the American Board of Internal Medicine. The findings and conclusions do not necessarily reflect the opinions of the organizations.
References
Boulet, J. R., Cooper, R. A., Seeling, S. S., Norcini, J. J., & McKinley, D. W. (2009). U.S. citizens who obtain their medical degrees abroad: An overview, 1992–2006. Health Affairs, 28(1), 226–233.
Gary, N. E., Sabo, M. M., Shafron, M. L., Wald, M. K., Ben-David, M. F., & Kelly, W. C. (1997). Graduates of foreign medical schools: Progression to certification by the Educational Commission for Foreign Medical Graduates. Academic Medicine: Journal of the Association of American Medical Colleges, 72(1), 17–22.
Hallock, J. A., & Kostis, J. B. (2006). Celebrating 50 years of experience: An ECFMG perspective. Academic Medicine: Journal of the Association of American Medical Colleges, 81(12 Suppl), S7–16. doi:10.1097/01.ACM.0000243344.55996.1e.
Hallock, J. A., McKinley, D. W., & Boulet, J. R. (2007). Migration of doctors for undergraduate medical education. Medical Teacher, 29(2–3), 98–105.
Holland, P. W., & Dorans, N. J. (2006). Linking and equating. In R. Brennan (Ed.), Educational measurement (4th ed., pp. 187–220). Westport, CT: Praeger Publishers.
IBM. (2010). SPSS. Chicago, IL.
Kane, M. T. (1994). Validating interpretive arguments for licensure and certification examinations. Evaluation and the Health Professions, 17(2), 133–159; discussion 236–241.
Kane, M. T. (2013). Validating the interpretations and uses of test scores. Journal of Educational Measurement, 50(1), 1–73. doi:10.1111/jedm.12000.
Melnick, D. E. (2006). From defending the walls to improving global medical education: Fifty years of collaboration between the ECFMG and the NBME. Academic Medicine, 81(12 Suppl), S30–S35. doi:10.1097/01.ACM.0000243462.05719.e1.
Norcini, J., Anderson, M. B., & McKinley, D. W. (2006). The medical education of United States citizens who train abroad. Surgery, 140(3), 338–346.
Rosnow, R. L., & Rosenthal, R. (2003). Effect sizes for experimenting psychologists. Canadian Journal of Experimental Psychology/Revue Canadienne de Psychologie Experimentale, 57, 221–237. doi:10.1037/h0087427.
Shea, J. A., Norcini, J. J., & Kimball, H. R. (1993). Relationships of ratings of clinical competence and ABIM scores to certification status. Academic Medicine, 68(10 Suppl), S22–S24.
Tamblyn, R., Abrahamowicz, M., Dauphinee, W. D., Hanley, J. A., Norcini, J., Girard, N., et al. (2002). Association between licensure examination scores and practice in primary care. JAMA, 288(23), 3019–3026.
Van Zanten, M., Boulet, J. R., McKinley, D. W., De Champlain, A., & Jobe, A. C. (2007). Assessing the communication and interpersonal skills of graduates of international medical schools as part of the United States Medical Licensing Exam (USMLE) Step 2 Clinical Skills (CS) Exam. Academic Medicine, 82(10 Suppl), S65–S68.
Whelan, G. P., Boulet, J. R., McKinley, D. W., Norcini, J. J., Van Zanten, M., Hambleton, R. K., et al. (2005). Scoring standardized patient examinations: Lessons learned from the development and administration of the ECFMG clinical skills assessment (CSA). Medical Teacher, 27(3), 200–206.
Whelan, G. P., Gary, N. E., Kostis, J., Boulet, J. R., & Hallock, J. A. (2002). The changing pool of international medical graduates seeking certification training in US graduate medical education programs. JAMA, 288(9), 1079–1084.