Medical school grades may predict future clinical competence

1. INTRODUCTION

In the clinical setting, the incompetence of trainee doctors is always a difficult issue. These problematic trainees usually require remediation, creating significant workflow issues and negatively impacting future recruitment.1 Moreover, incompetence may also result in mistreatment that harms patients.2 In real-world practice, one crucial problem is delayed intervention for “problematic trainees”; that is, intervention usually occurs only after incompetent behavior has been observed and has developed. This phenomenon contradicts current theories of medical education, which hold that “formative feedback” provided during training is far more helpful than “summative feedback” given after training.3 However, effective and timely formative feedback depends heavily on the early detection and identification of problems by medical educators. In the United States, studies using medical school data to predict future clinical performance have shown conflicting results.4–6 An updated meta-analysis showed that examination-based selection strategies (e.g., the USMLE board examinations) correlate strongly with in-training examination performance, whereas medical school grades show only a moderate association with examination-based outcomes.7 One study analyzing data from Harvard Medical School found that the “bottom quartile” of preclerkship grades could predict academic difficulties, measured by clinical clerkship GPA and OSCEs.8 Conversely, another study showed that USMLE Step 1 scores may even correlate negatively with future clinical performance and professionalism.9 In Taiwan, no reliable predictors of future clinical performance and competency development have been established. Hence, this study aimed to identify predictors of clinical performance in the Taiwanese context by using a large-scale database from the Taipei Veterans General Hospital.

2. METHODS

This study analyzed the data of medical students who graduated from National Yang-Ming University (NYMU, renamed National Yang Ming Chiao Tung University in 2021) in July 2018 and July 2020. We used the 2018 data as the derivation cohort and the 2020 data as the validation cohort to avoid mixing data from students who delayed graduation by one year because of special programs (e.g., physician-scientist tracks) within the curriculum or their own career planning.

2.1. The inclusion and exclusion criteria

The inclusion criteria were as follows: (1) completion of internship training at Taipei Veterans General Hospital (Taipei VGH) and (2) participation in a postgraduate year (PGY) interview at Taipei VGH. Trainees without national OSCE data or who had not participated in the PGY interview were excluded.

2.2. The procedure and scoring of national OSCE

The procedure of the national OSCE has been validated and used in Taiwan’s teaching hospitals for more than 10 years.10 All examinees are presented with the same problems and asked to execute the same tasks.11 The examiners evaluate the examinees’ performance using a standardized checklist. Standardized patients are trained by healthcare professionals to act as patients according to standardized role-play scripts.12 In Taipei VGH, the national OSCE includes 12 stations, each lasting 8 minutes. The stations commonly cover history taking, communication skills, procedural skills, physical examination, differential diagnosis, and clinical reasoning. At each station, each examinee receives checklist scores and a global rating score. Checklist items are scored 0 (not at all), 2 (partially achieved), or 3 (completely achieved). Because the number of checklist items differs between stations, scores were transformed into a percentage of each station’s total possible score. The global rating score at each station is classified into five levels: 1 (poor), 2 (fair), 3 (average), 4 (good), and 5 (excellent). In the national OSCE, the passing standard is established with the examinee-centered borderline group method with regression.13–15 The passing standard of each station is the mean checklist score of examinees rated level 2 on the global rating.13,15–20 In this study, we used the “percentage of scores above the qualification standard” as the outcome measure of national OSCE performance, defined as the difference between the actual score and the passing standard, divided by the passing standard.
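To make the scoring concrete, the sketch below (hypothetical data and variable names, not the study’s code) computes a borderline-group passing standard for one station and the resulting percentage of scores above that standard; it uses the simple borderline-group mean rather than the regression variant described above.

```python
# Minimal sketch of the two scoring steps described above (hypothetical data):
# (1) a borderline-group passing standard per station and
# (2) the "percentage of scores above the qualification standard".
import numpy as np

# Checklist percentages and global ratings (1-5) for one station's examinees
checklist_pct = np.array([55.0, 62.5, 48.0, 80.0, 70.0, 66.0])
global_rating = np.array([2, 3, 2, 5, 4, 3])

# Borderline-group standard: mean checklist score of examinees rated 2 ("fair")
passing_standard = checklist_pct[global_rating == 2].mean()  # (55.0 + 48.0) / 2 = 51.5

# Outcome measure: (actual score - passing standard) / passing standard
pct_above_standard = (checklist_pct - passing_standard) / passing_standard * 100
print(passing_standard, pct_above_standard.round(1))
```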

2.3. The procedure of structured PGY interview

In Taipei VGH, the structured PGY interview has been in use since 2014. Each station requires three attending physicians from different subspecialties as interviewers. Applicants are required to provide an autobiography, a portfolio of their clinical training, and three clinical cases; the cases should include at least one internal medicine case and one surgical case. At the beginning of the structured interview, applicants give a brief self-introduction and answer the interviewers’ questions about their autobiography and clinical training portfolio. Applicants then present the clinical cases selected by the interviewers. During the interview, the interviewers also ask about clinical reasoning, treatment strategies, ethical issues, or evidence-based medicine related to the presented cases. In this study, we used the mean of the three interviewers’ grades as the measure of the structured PGY interview. For analyses, all structured PGY interview grades were transformed into percentile rank scores.
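As an illustration of this scoring, the short sketch below (hypothetical grades, not the study’s code) averages the three interviewers’ grades and converts the cohort to percentile rank scores.

```python
# Minimal sketch: mean of three interviewers' grades, then percentile ranks
import numpy as np
from scipy.stats import rankdata

# rows = applicants, columns = the three interviewers' grades (hypothetical)
interview_grades = np.array([
    [82, 78, 85],
    [90, 88, 86],
    [70, 75, 72],
    [88, 84, 90],
])

mean_grade = interview_grades.mean(axis=1)

# Percentile rank within the cohort (higher grade -> higher percentile)
percentile_rank = rankdata(mean_grade, method="average") / len(mean_grade) * 100
print(mean_grade, percentile_rank)
```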

2.4. The medical school grades

In this study, medical school grades (from the first to the fifth year) were obtained from the same institution (NYMU) to ensure the reliability of the grading system. The grades from 2018 (derivation cohort) and 2020 (validation cohort) were then separately transformed into percentile rank scores. The medical school grades were divided into quartiles to test the feasibility of the “bottom quartile” of medical school grades as a predictor of clinical performance.8
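The quartile split can be illustrated as follows (hypothetical percentile ranks; quartile labels follow the paper’s convention of Q1 = highest and Q4 = lowest).

```python
# Minimal sketch: split percentile-ranked grades into quartiles and flag Q4
import pandas as pd

grades_percentile = pd.Series([12.0, 35.5, 58.0, 76.5, 91.0, 22.5, 64.0, 88.0])

# qcut orders bins from lowest to highest values, so label them Q4 ... Q1
quartile = pd.qcut(grades_percentile, 4, labels=["Q4", "Q3", "Q2", "Q1"])
bottom_quartile = quartile == "Q4"
print(pd.DataFrame({"grade_pct": grades_percentile,
                    "quartile": quartile,
                    "bottom_quartile": bottom_quartile}))
```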

2.5. Statistics

Data on the baseline demographics of the derivation and validation cohorts are reported as means and standard deviations. All comparisons between categorical and continuous variables were analyzed using chi-square or t tests, as appropriate. We used Pearson’s correlation analyses to examine potential associations between the grades from the structured PGY interview, the national OSCE, and medical school. We further divided the dataset into quartiles to construct a prediction model that would be more easily applicable in real-world medical education. The significant variables were used to construct a decision tree based on the exhaustive chi-squared automatic interaction detection (CHAID) algorithm with the following settings: ≥20 cases per parent node, ≥5 cases per child node, and automatic maximum tree depth.21 For all analyses, results were considered significant at p < 0.05.
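The sketch below (hypothetical data, not the study’s code) illustrates this pipeline: Pearson correlations, a quartile-based t test, and a decision-tree model. Exhaustive CHAID is not available in scikit-learn, so a standard CART tree is used here purely as a stand-in, mirroring the stated node-size settings.

```python
# Minimal, hypothetical sketch of the analysis pipeline described above
import numpy as np
from scipy import stats
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 50
school_grade_pct = rng.uniform(0, 100, n)      # percentile-ranked grades
interview_pct = rng.uniform(0, 100, n)         # percentile-ranked interview scores
osce_pct_above = 0.2 * school_grade_pct + rng.normal(30, 8, n)

# Pearson correlations between each predictor and OSCE performance
r_grade, p_grade = stats.pearsonr(school_grade_pct, osce_pct_above)
r_int, p_int = stats.pearsonr(interview_pct, osce_pct_above)

# t test: medical school grades, lowest OSCE quartile (Q4) vs Q1-Q3
q4 = osce_pct_above <= np.percentile(osce_pct_above, 25)
t, p_t = stats.ttest_ind(school_grade_pct[q4], school_grade_pct[~q4])

# Decision-tree stand-in for the CHAID model predicting Q4 membership
X = np.column_stack([school_grade_pct, interview_pct])
tree = DecisionTreeClassifier(min_samples_split=20, min_samples_leaf=5, random_state=0)
tree.fit(X, q4)
print(f"grades: r={r_grade:.2f}, p={p_grade:.3f}; interview: r={r_int:.2f}, p={p_int:.3f}; "
      f"Q4 vs Q1-Q3 t test p={p_t:.3f}; tree training accuracy={tree.score(X, q4):.2f}")
```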

3. RESULTS

3.1. Demographics and baseline characteristics

The derivation cohort included 50 medical students (20 women and 30 men) who graduated from NYMU in July 2018, received clinical internship training, and participated in the PGY interview at Taipei VGH. In addition, data from 56 medical students (21 women and 35 men) who graduated from NYMU in July 2020 fulfilled the inclusion criteria and were used as the validation cohort. There were no differences between the derivation and validation cohorts in national OSCE performance, medical school grades, or rank scores in the structured PGY interview (Table 1).

Table 1 - Demographics and descriptive statistics

| Variables | Derivation cohort (mean ± SD) | Validation cohort (mean ± SD) | p* |
|---|---|---|---|
| Sex (n, %) | | | 0.792 |
| Men | 30 (60%) | 35 (62.5%) | |
| Women | 20 (40%) | 21 (37.5%) | |
| Percentage of scores above the qualification standard in national OSCE (%) | 39.7 ± 8.7 | 37.5 ± 8.5 | 0.211 |
| Grades in medical school (%) | 40.4 ± 20.6 | 39.5 ± 19.2 | 0.809 |
| Rank scores in structured PGY interview (%) | 67.1 ± 23.1 | 59.7 ± 25.1 | 0.119 |

* All comparisons between categorical and continuous variables were analyzed using chi-square or t tests, as appropriate; results were considered significant at p < 0.05.


3.2. Factors associated with the performance of national OSCE

In this study, we found that medical school grades were associated with national OSCE performance (defined as the percentage of scores above the qualification standard) (Pearson r = 0.34, p = 0.017). However, the mean score of the structured PGY interview was not associated with national OSCE performance (Pearson r = 0.269, p = 0.06). We further divided the medical students into quartiles based on national OSCE performance and found that the highest quartile (“outperformed” group) and the lowest quartile (“improvement required” group) differed significantly in medical school grades (Q1 vs Q4 = 48.1% ± 21.0% vs 29.2% ± 23.9%, p = 0.046) but not in structured PGY interview scores (Q1 vs Q4 = 79.1% ± 33.6% vs 74.4% ± 31.0%, p = 0.719) (Fig. 1; Table 2).

Table 2 - Factors associated with performance of national OSCE

| Variables | Grades in medical school | p* | Structured PGY interview | p* |
|---|---|---|---|---|
| Highest performance (Q1) | 48.1% ± 21.0% | 0.046** | 79.1% ± 33.6% | 0.719 |
| Lowest performance (Q4) | 29.2% ± 23.9% | | 74.4% ± 31.0% | |
| Highest performance (Q1) | 48.1% ± 21.0% | 0.120 | 79.1% ± 33.6% | 0.321 |
| Acceptable-to-lowest performance (Q2-Q4) | 37.7% ± 20.1% | | 69.5% ± 28.4% | |
| High-to-acceptable performance (Q1-Q3) | 43.9% ± 18.4% | 0.029** | 71.2% ± 29.7% | 0.752 |
| Lowest performance (Q4) | 29.1% ± 23.9% | | 74.4% ± 31.0% | |

*All comparisons among continuous variables were analyzed using t tests.

**Results were considered significant by p < 0.05.


Fig. 1:

Factors associated with highest and lowest performance in national objective structured clinical examinations. PGY = postgraduate year.

3.3. Factors associated with “outperformed” and “improvement required” performance in the national OSCE

Undergraduates in the lowest quartile of the national OSCE had lower medical school grades (Q1-Q3 vs Q4 = 43.9% ± 18.4% vs 29.1% ± 23.9%, p = 0.029), but there was no difference in the structured PGY interview (Q1-Q3 vs Q4 = 71.2% ± 29.7% vs 74.4% ± 31.0%, p = 0.752). On the other hand, undergraduates in the highest quartile of the national OSCE did not have significantly higher medical school grades (Q1 vs Q2-Q4 = 48.1% ± 21.0% vs 37.7% ± 20.1%, p = 0.120) or structured PGY interview scores (Q1 vs Q2-Q4 = 79.1% ± 33.6% vs 69.5% ± 28.4%, p = 0.321).

3.4. Predicting “improvement required” undergraduates in the national OSCE

Based on these findings, we constructed a prediction model using medical school grades. According to this model, students in the lowest quartile of medical school grades were more likely to belong to the lowest quartile in the national OSCE (Q1-Q3 vs Q4 = 15% vs 60%; odds ratio = 8.5 [95% confidence interval, 1.8-39.4], p = 0.029). Using trainees from the following year as the validation cohort (n = 56), our prediction model correctly classified 76.7% of students as “improvement required” or “nonimprovement required” (Fig. 2).
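As a quick check of the reported effect size, the arithmetic below (using only the quoted proportions, since the underlying counts are not shown here) reproduces the odds ratio.

```python
# Worked check of the odds ratio from the quoted proportions (illustrative only)
p_bottom_grades = 0.60  # lowest-quartile grades: 60% fell into lowest OSCE quartile
p_other_grades = 0.15   # Q1-Q3 grades: 15% fell into lowest OSCE quartile

odds_bottom = p_bottom_grades / (1 - p_bottom_grades)  # 1.5
odds_other = p_other_grades / (1 - p_other_grades)     # ~0.176
print(round(odds_bottom / odds_other, 1))              # 8.5, matching the reported value
```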

Fig. 2:

Prediction model for clinical competencies. OSCE = objective structured clinical examinations.

4. DISCUSSION

In this study, we found that medical school grades were associated with OSCE performance, whereas the structured interview was not. Placement in the bottom 25% of medical school grades predicted “improvement required” (bottom 25%) performance in the OSCE. However, neither medical school grades nor the structured interview helped identify the “outperformed” population in the OSCE.

Our study had several strengths. First, we used the national OSCE, a significant predictor of postgraduate medical expertise scores, to measure clinical performance. Second, we used a validation cohort to ensure the reliability of the prediction model. A reliable predictor of future clinical competencies is crucial in medical education, and several studies have aimed to identify “outcome predictors” for medical education.9 Several studies found that the interview moderately predicts clinical performance,22–27 but others showed no or only a weak association with residents’ future performance.28–30 Our study found no correlation between interview grades and national OSCE performance. However, this should not be interpreted as the structured interview being unhelpful for trainee selection. The interview results may reflect other noncognitive factors, such as interpersonal communication skills, interest in the field, dependability, and honesty.31,32 In addition, the interview may not linearly reflect clinical competencies, but it can help identify obvious negative applicant characteristics.33 Moreover, assessing some important applicant characteristics may require the interviewer’s “gut feeling” or clinical experience.34,35 Hence, the interview process should be viewed as an important “gatekeeper” for certain negative characteristics relevant to future clinical performance that are difficult to assess with OSCEs.

Another important question is how to identify “improvement required” trainees. Early studies found no correlation between school grades and future clinical performance.36 However, recent studies have found that school grades correlate with clinical performance in some institutions.37 This inconsistency may be due to the different curricular designs and grading methods used in medical schools. The key factor behind the high predictive value of grades from NYMU may be the extensive integration of clinical and basic medicine in the curriculum.38,39 For example, problem-based learning courses have been widely applied in basic medicine, usually incorporating a clinical problem or scenario.12 Moreover, a basic clinical skills course was introduced in the fourth year of the curriculum, and previous comparative studies showed that trainees who received basic clinical skills training in medical school outperformed those trained under the traditional curriculum.40–42 In the present study, we found that school grades could be a reliable predictor of clinical performance; our prediction model accurately predicted clinical performance in the validation cohort from the same institution. Using this model, we can set a threshold on medical school performance to identify students likely to become “improvement required” trainees earlier and provide them with additional intervention before incompetency develops. The success of such an intervention would be demonstrated if the prediction model subsequently failed to classify those students as “improvement required.”

Our study had several limitations. First, it used a surrogate parameter, the national OSCE, as the outcome measure of clinical performance. We used the national OSCE grade because the procedure has been validated and widely applied in Taiwan’s teaching hospitals for more than 10 years,10 with standardized problems and tasks during the examination11 and performance evaluated using a standardized checklist. To avoid the possible confounding associated with evaluation forms of trainees’ clinical performance, the national OSCE grade seems to be a more reliable parameter of clinical ability. Second, clinical competency includes several dimensions: patient care, medical knowledge, practice-based learning and improvement, interpersonal and communication skills, professionalism, and systems-based practice. Standardized OSCEs may have limited value for evaluating some of these dimensions, such as systems-based practice. Third, the structured PGY interview and the national OSCE evaluated competencies over a short time interval (i.e., 1 day), so these measurements can hardly capture other important factors, such as resilience. For example, our study did not find an association between interview grades and OSCE performance, whereas a United Kingdom study found that the multiple mini-interview was associated with subsequent OSCE performance.43 This inconsistency suggests that a series of mini-interviews may be more helpful for evaluating trainees’ clinical competencies, and further studies using different interview protocols are warranted to improve the predictive value of the interview process. Fourth, we divided the grades from the structured PGY interview and medical school into quartiles, which might be arbitrary. One reason for dividing the dataset into quartiles was to develop an easily applicable prediction model; our study also aimed to examine the generalizability of the “bottom quartile of medical school grades,” proposed at Harvard Medical School,8 as a predictor of less optimal clinical performance. According to our findings, the “bottom quartile of medical school grades” is also a useful predictor for identifying “improvement required” clinical trainees in the Taiwanese context.

In conclusion, using the national OSCE as a standardized surrogate parameter for clinical performance, we found that medical school grades can predict poor performance but are ineffective in predicting outstanding performance. Based on the prediction model, additional support should be provided to undergraduates with unsatisfactory medical school grades to ensure the quality of their clinical performance in daily practice.

ACKNOWLEDGMENTS

We wish to express our gratitude to the diligent staff of the Department of Medical Education, Taipei VGH. This work was supported by Taipei Veterans General Hospital (grant numbers 110EA-007, V110C-033, and PED1090388) and the Ministry of Science and Technology, Taiwan (grant numbers MOST 110-2314-B-075-081, MOST 109-2314-B-010-032-MY3, and MOST-110-2511-H-A491-504-MY3).

REFERENCES

1. Stephenson-Famy A, Houmard BS, Oberoi S, Manyak A, Chiang S, Kim S. Use of the interview in resident candidate selection: a review of the literature. J Grad Med Educ. 2015;7:539–48.
2. Fleming AE, Smith S. Mistreatment of medical trainees: time for a new approach. JAMA Netw Open. 2018;1:e180869.
3. Andreassen P, Malling B. How are formative assessment methods used in the clinical setting? A qualitative study. Int J Med Educ. 2019;10:208–15.
4. Gonnella JS, Hojat M. Relationship between performance in medical school and postgraduate competence. J Med Educ. 1983;58:679–85.
5. Wingard JR, Williamson JW. Grades as predictors of physicians’ career performance: an evaluative literature review. J Med Educ. 1973;48:311–22.
6. Taylor CW, Albo D Jr. Measuring and predicting the performances of practicing physicians: an overview of two decades of research at the University of Utah. Acad Med. 1993;68(2 Suppl):S65–7.
7. Kenny S, McInnes M, Singh V. Associations between residency selection strategies and doctor performance: a meta-analysis. Med Educ. 2013;47:790–800.
8. Krupat E, Pelletier SR, Dienstag JL. Academic performance on first-year medical school exams: how well does it predict later performance on knowledge-based and clinical assessments? Teach Learn Med. 2017;29:181–7.
9. Brothers TE, Wetherholt S. Importance of the faculty interview during the resident application process. J Surg Educ. 2007;64:378–85.
10. Huang CC, Chan CY, Wu CL, Chen YL, Yang HW, Huang CC, et al. Assessment of clinical competence of medical students using the objective structured clinical examination: first 2 years’ experience in Taipei Veterans General Hospital. J Chin Med Assoc. 2010;73:589–95.
11. Chong L, Taylor S, Haywood M, Adelstein BA, Shulruf B. The sights and insights of examiners in objective structured clinical examinations. J Educ Eval Health Prof. 2017;14:34.
12. Chang CC, Lirng JF, Wang PN, Wang SJ, Chen CH, Yang LY, et al. A pilot study of integrating standardized patients in problem-based learning tutorial in Taiwan. J Chin Med Assoc. 2019;82:464–8.
13. Wood TJ, Humphrey-Murto SM, Norman GR. Standard setting in a small scale OSCE: a comparison of the Modified Borderline-Group Method and the Borderline Regression Method. Adv Health Sci Educ Theory Pract. 2006;11:115–22.
14. Hejri SM, Jalili M, Muijtjens AM, Van Der Vleuten CP. Assessing the reliability of the borderline regression method as a standard setting procedure for objective structured clinical examination. J Res Med Sci. 2013;18:887–91.
15. Homer M, Pell G. The impact of the inclusion of simulated patient ratings on the reliability of OSCE assessments under the borderline regression method. Med Teach. 2009;31:420–5.
16. Yousuf N, Violato C, Zuberi RW. Standard setting methods for pass/fail decisions on high-stakes objective structured clinical examinations: a validity study. Teach Learn Med. 2015;27:280–91.
17. Norcini JJ. Setting standards on educational tests. Med Educ. 2003;37:464–9.
18. Dwivedi NR, Vijayashankar NP, Hansda M, Dubey AK, Nwachukwu F, Curran V, et al. Comparing standard setting methods for objective structured clinical examinations in a Caribbean medical school. J Med Educ Curric Dev. 2020;7:2382120520981992.
19. Kramer A, Muijtjens A, Jansen K, Düsman H, Tan L, Van Der Vleuten C. Comparison of a rational and an empirical standard setting procedure for an OSCE. Med Educ. 2003;37:132–9.
20. Shulruf B, Turner R, Poole P, Wilkinson T. The Objective Borderline method (OBM): a probability-based model for setting up an objective pass/fail cut-off score in medical programme assessments. Adv Health Sci Educ Theory Pract. 2013;18:231–44.
21. Kass GV. An exploratory technique for investigating large quantities of categorical data. J R Stat Soc Ser C Appl Stat. 1980;29:119–27.
22. Ozuah PO. Predicting residents’ performance: a prospective study. BMC Med Educ. 2002;2:7.
23. Shiroma PR, Alarcon RD. Selection factors among international medical graduates and psychiatric residency performance. Acad Psychiatry. 2010;34:128–31.
24. Eva KW, Reiter HI, Trinh K, Wasi P, Rosenfeld J, Norman GR. Predictive validity of the multiple mini-interview for selecting medical trainees. Med Educ. 2009;43:767–75.
25. Grewal SG, Yeung LS, Brandes SB. Predictors of success in a urology residency program. J Surg Educ. 2013;70:138–43.
26. Olawaiye A, Yeh J, Withiam-Leitch M. Resident selection process and prediction of clinical performance in an obstetrics and gynecology program. Teach Learn Med. 2006;18:310–5.
27. Wood PS, Smith WL, Altmaier EM, Tarico VS, Franken EA Jr. A prospective study of cognitive and noncognitive selection criteria as predictors of resident performance. Invest Radiol. 1990;25:855–9.
28. Fryer JP, Corcoran N, George B, Wang E, Darosa D. Does resident ranking during recruitment accurately predict subsequent performance as a surgical resident? J Surg Educ. 2012;69:724–30.
29. George JM, Young D, Metz EN. Evaluating selected internship candidates and their subsequent performances. Acad Med. 1989;64:480–2.
30. Oldfield Z, Beasley SW, Smith J, Anthony A, Watt A. Correlation of selection scores with subsequent assessment scores during surgical training. ANZ J Surg. 2013;83:412–6.
31. LaGrasso JR, Kennedy DA, Hoehn JG, Ashruf S, Przybyla AM. Selection criteria for the integrated model of plastic surgery residency. Plast Reconstr Surg. 2008;121:121e–5e.
32. Wagoner NE, Suriano JR, Stoner JA. Factors used by program directors to select residents. J Med Educ. 1986;61:10–21.
33. Al Khalili K, Chalouhi N, Tjoumakaris S, Gonzalez LF, Starke RM, Rosenwasser R, et al. Programs selection criteria for neurological surgery applicants in the United States: a national survey for neurological surgery program directors. World Neurosurg. 2014;81:473–7.e2.
34. Parker AM, Petroze RT, Schirmer BD, Calland JF. Surgical residency market research-what are applicants looking for? J Surg Educ. 2013;70:232–6.
35. Otero HJ, Erturk SM, Ondategui-Parra S, Ros PR. Key criteria for selection of radiology residents: results of a national survey. Acad Radiol. 2006;13:1155–64.
36. Agahi F, Speicher MR, Cisek G. Association between undergraduate performance predictors and academic and clinical performance of osteopathic medical students. J Am Osteopath Assoc. 2018;118:106–14.
37. Gadbury-Amyot CC, Bray KK, Branson BS, Holt L, Keselyak N, Mitchell TV, et al. Predictive validity of dental hygiene competency assessment measures on one-shot clinical licensure examinations. J Dent Educ. 2005;69:363–70.
38. Vink SC, Van Tartwijk J, Bolk J, Verloop N. Integration of clinical and basic sciences in concept maps: a mixed-method study on teacher learning. BMC Med Educ. 2015;15:20.
39. Hale JF, Cahan MA, Zanetti ML. Integration of basic clinical skills training in medical education: an interprofessional simulated teaching experience. Teach Learn Med. 2011;23:278–84.
40. Jünger J, Schäfer S, Roth C, Schellberg D, Friedman Ben-David M, Nikendei C. Effects of basic clinical skills training on objective structured clinical examination performance. Med Educ. 2005;39:1015–20.
41. Remmen R, Scherpbier A, van der Vleuten C, Denekens J, Derese A, Hermann I, et al. Effectiveness of basic clinical skills training programmes: a cross-sectional comparison of four medical schools. Med Educ. 2001;35:121–8.
42. Yang YY, Wang SJ, Yang LY, Lirng JF, Huang CC, Liang JF, et al. Effects of a new parallel primary healthcare centre and on-campus training programme on history taking, physical examination skills and medical students’ preparedness: a prospective comparative study in Taiwan. BMJ Open. 2017;7:e016294.
43. Kumar N, Bhardwaj S, Rahman E. Multiple mini-interview as a predictor of performance in the objective structured clinical examination among Physician Associates in the United Kingdom: a cohort study. Adv Med Educ Pract. 2018;9:239–45.
