Application of a Novel Multimodal-Based Deep Learning Model for the Prediction of Papillary Thyroid Carcinoma Recurrence

Introduction

Thyroid cancer is the most common endocrine malignancy, and papillary thyroid carcinoma (PTC) is the most frequent thyroid malignancy, accounting for >80% of cases.1 The incidence of thyroid cancer has shown an increasing trend in most countries over the past few decades.2 In Korea, the incidence of thyroid cancer has increased rapidly since 1999, demonstrating the highest incidence worldwide.3,4 According to a recent investigation, although the upward trend has slowed, the prevalence remains the highest worldwide for both sexes.5

Despite its high incidence, studies regarding the prognosis of patients with thyroid cancer are fewer than those for other cancers. This may be attributed to the excellent prognosis of patients with thyroid cancer. Notably, mortality due to thyroid cancer did not increase during the period of dramatic increase in incidence.3 Moreover, the 5-year survival rate of patients with thyroid cancer in Korea has reached approximately 100%.5 Although the mortality rate is low, some patients experience cancer recurrence during follow-up. Previous studies showed that the recurrence rate was 8–28%.6,7 In one study, the time to recurrence was 8.1 years, and 11% of recurrences occurred after 20 years.7 Considering the high incidence of thyroid cancer in the younger population (ages of 15–34 years) and long follow-up period, recurrence should not be overlooked. Early detection of recurrence helps improve patient outcomes and reduces socioeconomic burden.

According to the guidelines of the American Thyroid Association, several factors are used to assess the risk of structural disease recurrence in patients without structurally identifiable disease after initial therapy.8 These include clinicopathological features such as Tumor Node Metastasis (TNM) stage, microscopic extrathyroidal extension, cervical lymph node metastases, vascular invasion, and aggressive tumor histology. In clinical practice, patients are classified as having a low, intermediate, or high risk of recurrence based on a comprehensive consideration of these factors. However, within this three-category risk stratification system, patients in the same risk category can have widely differing recurrence risks depending on their specific clinical features. Therefore, ambiguity arises in actual clinical practice, and it is difficult to express the recurrence risk as a specific numerical value that comprehensively accounts for each patient’s risk factors.

Recently, machine learning and deep learning technologies have been widely used in the medical field, particularly for the assessment of image data.9,10 However, the application of machine learning models to predict PTC recurrence is rare. Previous studies have predicted the recurrence of PTC using machine learning models consisting of clinicopathological parameters.11–13 In these studies, the prediction accuracy for recurrence varied between 71.4% and 95.0%. However, a recurrence prediction model that considers continuous changes in clinical data during postoperative follow-up has not yet been developed.

In this study, we investigated the accuracy of a novel multimodal model by simultaneously analyzing numerical and time-series data to predict recurrence in patients with PTC after thyroidectomy.

Materials and Methods

This research team has previously published results on the AI prediction model. For detailed methodologies, please refer to the corresponding paper.14

Study Population

We analyzed patients with thyroid carcinoma who underwent thyroid lobectomy or total thyroidectomy at Chungbuk National University Hospital (CBNUH) between January 2006 and December 2021. The inclusion criteria were as follows: surgery for thyroid carcinoma, with histopathology reports stored in the CBNUH database; a diagnosis of PTC confirmed by postoperative histopathological examination; and sequential follow-up for >5 years after surgery.15 The exclusion criteria were as follows: follow-up ≤5 years after surgery;15 postoperative histopathological examination revealing the absence of PTC (including mixed cases of different types of thyroid carcinoma); and follow-up data that were intermittent or not continuously acquired (Figure 1).

Figure 1 Flow chart of the data acquisition process.

Predictors

To detect PTC recurrence, we acquired clinical data, including demographic information (sex, age), ultrasonography (US) reports, pathology reports (histopathology and cytopathology reports), whole-body iodine scan, and thyroid function test (TFT) results (thyroid-stimulating hormone [TSH], triiodothyronine [T3], free thyroxine [fT4], and thyroglobulin [Tg]).8 Whole-body iodine scan and US were used to confirm recurrence and were excluded from the model input. PTC recurrence was defined as a new suspicious lesion detected by comparing the previous US results with current US results, which was confirmed using fine-needle aspiration cytology.16–18

We extracted and analyzed tumor size, tumor multiplicity, extrathyroidal extension (ETE), extranodal extension (ENE), and TNM stage (T, N classification) from the pathology reports.13 TNM stage was sorted according to the 7th American Joint Committee on Cancer staging system.8,19 Patients whose N classification was evaluated as N_x because they did not undergo intraoperative neck dissection were labeled as “ENE not occurred.”

TFT results, including the TSH, T3, fT4, and Tg levels, were acquired after PTC surgery.1,8,20 TFT results were acquired every 6 months (± 3 months) from the surgery date as time-series data. We used time-dependent linear interpolation and backward-filling to process the missing TFT results. Furthermore, TFT results were used only for the prior 5 years; in patients with recurrence, results after the recurrence date were not used.
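The missing-value handling described above can be illustrated with a minimal sketch. It assumes one TFT series per patient sampled at nominal 6-month visits, with `None` marking a missing result; the function name `fill_series` and the example values are hypothetical, and leaving trailing gaps unfilled is an assumption, since the text does not say how they were treated.

```python
from typing import List, Optional


def fill_series(times: List[float], values: List[Optional[float]]) -> List[Optional[float]]:
    """Time-dependent linear interpolation followed by backward-filling."""
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        return filled  # nothing observed, nothing to fill

    # Interpolate each gap between two observed results, weighting by elapsed time.
    for left, right in zip(known, known[1:]):
        for i in range(left + 1, right):
            frac = (times[i] - times[left]) / (times[right] - times[left])
            filled[i] = values[left] + frac * (values[right] - values[left])

    # Backward-fill: visits before the first observation take its value.
    for i in range(known[0]):
        filled[i] = values[known[0]]

    # Assumption: gaps after the last observation are left as None.
    return filled


# Hypothetical TSH series (months after surgery, measured value or None).
months = [6, 12, 18, 24, 30]
tsh = [None, 0.4, None, None, 1.0]
print(fill_series(months, tsh))
```

In this example the first visit is backward-filled from month 12, and months 18 and 24 are placed on the straight line between the month-12 and month-30 results.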

Model Architecture

We proposed a novel multimodal deep learning model to predict PTC recurrence. The proposed model used numerical data, including clinical information at the time of surgery, and time-series data, including postoperative TFT results. Our model comprised three blocks (Figure 2). First, a multilayer perceptron block took the numerical data as input and comprised two dense layers, together with a batch normalization layer and an activation layer (ReLU). We included the batch normalization layer to improve the training speed and reduce generalization error.21 Second, a long short-term memory (LSTM) block took the time-series data as input and comprised three LSTM models. To sufficiently reflect data from the past, we adopted the LSTM model, which learns temporal trends by maintaining separate short-term and long-term memory.22 Finally, an ensemble block took the concatenation of the feature vectors from the two preceding blocks as input; it comprised a dense layer, a batch normalization layer, and an activation layer (ReLU), and calculated a probability value as the output through a sigmoid function.
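The three-block design can be sketched in Keras as follows. This is one possible reading of the description, not the published implementation: the layer widths, the input sizes (8 numerical features, 10 time steps, 4 TFT channels), the placement of batch normalization after each dense layer, the stacking of the three LSTM layers, and the final single-unit sigmoid layer are all assumptions made for illustration.

```python
import tensorflow as tf

layers = tf.keras.layers


def build_model(n_numeric: int = 8, n_steps: int = 10, n_series: int = 4) -> tf.keras.Model:
    """Multimodal sketch: MLP block + LSTM block + ensemble block."""
    # Block 1: multilayer perceptron over numerical data at the time of surgery.
    numeric_in = layers.Input(shape=(n_numeric,), name="numeric")
    x = numeric_in
    for units in (32, 16):  # hypothetical widths
        x = layers.Dense(units)(x)
        x = layers.BatchNormalization()(x)
        x = layers.Activation("relu")(x)

    # Block 2: three stacked LSTM layers over the postoperative TFT series.
    series_in = layers.Input(shape=(n_steps, n_series), name="tft_series")
    h = layers.LSTM(32, return_sequences=True)(series_in)
    h = layers.LSTM(32, return_sequences=True)(h)
    h = layers.LSTM(16)(h)  # last layer returns only the final state

    # Block 3: ensemble block over the concatenated feature vectors.
    z = layers.Concatenate()([x, h])
    z = layers.Dense(16)(z)
    z = layers.BatchNormalization()(z)
    z = layers.Activation("relu")(z)
    # Assumed output head: one unit with a sigmoid to give a probability.
    prob = layers.Dense(1, activation="sigmoid", name="recurrence_prob")(z)

    return tf.keras.Model(inputs=[numeric_in, series_in], outputs=prob)


model = build_model()
model.summary()
```

Concatenating the two feature vectors before a shared dense layer lets the final block weigh static surgical findings against the hormone trajectory when producing a single recurrence probability.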

Figure 2 The proposed model architecture for predicting papillary thyroid carcinoma recurrence.

Model Training

To train the model with unbalanced data, we employed weighted binary cross-entropy with weights of 0.8 for the positive (recurrence) group and 0.2 for the negative (nonrecurrence) group. The Adam optimizer was used at a learning rate of 1.0e-3. When the area under the receiver operating characteristic (ROC) curve (AUROC) for the validation data did not increase over 10 epochs, the learning rate was multiplied by 1.0e-1.
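The loss weighting and learning-rate schedule can be written out in plain Python to make the arithmetic explicit. This is a sketch under stated assumptions: the function names are hypothetical, the loss is averaged over samples, and the patience counter is assumed to reset after each reduction.

```python
import math
from typing import Sequence


def weighted_bce(y_true: Sequence[int], y_prob: Sequence[float],
                 w_pos: float = 0.8, w_neg: float = 0.2, eps: float = 1e-7) -> float:
    """Mean binary cross-entropy with class weights (recurrence = 1)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= w_pos * y * math.log(p) + w_neg * (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


def lr_after_plateaus(val_aurocs: Sequence[float], lr: float = 1e-3,
                      factor: float = 0.1, patience: int = 10) -> float:
    """Multiply lr by `factor` whenever validation AUROC fails to improve
    for `patience` consecutive epochs; return the final learning rate."""
    best = float("-inf")
    wait = 0
    for auroc in val_aurocs:
        if auroc > best:
            best, wait = auroc, 0
        else:
            wait += 1
            if wait >= patience:
                lr *= factor
                wait = 0  # assumed: counter resets after a reduction
    return lr


# An equally uncertain prediction costs four times more on a recurrence case.
print(weighted_bce([1], [0.5]) / weighted_bce([0], [0.5]))
```

With these weights, a missed recurrence contributes four times as much to the loss as an equally confident error on a nonrecurrent case, which pushes training toward higher sensitivity on the minority class.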

The proposed model was implemented using the TensorFlow 2.4.1 library in Python 3.7 and was trained on the CUDA 11.0.3 toolkit using a desktop computer with an NVIDIA GeForce RTX 3080 GPU and Intel Core i7-11700K 3.60GHz CPU.

Evaluation of Model Performance

We performed four-fold cross-validation to evaluate the performance of the proposed model. The dataset was equally divided into four folds. In each of the four iterations, three folds (75%) were used for training, and the remaining fold (25%) was used for validation and testing, with a different fold held out each time. We used four evaluation metrics: sensitivity, specificity, F1-score, and AUROC. The AUROC ranges from 0 to 1, with 0.5 corresponding to random guessing and 1 indicating a perfect classifier.
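The fold rotation and the threshold-based metrics can be sketched as follows. The helper names are hypothetical, and shuffling with a fixed seed before splitting is an assumption; the text does not state whether the folds were shuffled or stratified by recurrence status. AUROC is omitted here because it is computed from the predicted probabilities rather than from thresholded labels.

```python
import random
from typing import Dict, List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int = 4, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Return (train, held_out) index lists for each of k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # assumed: shuffle before splitting
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, held_out in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        splits.append((train, held_out))
    return splits


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Sensitivity, specificity, and F1-score from binary labels (recurrence = 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "f1": 2 * tp / (2 * tp + fp + fn) if 2 * tp + fp + fn else 0.0,
    }


for train, held_out in k_fold_indices(1613, k=4):
    print(len(train), len(held_out))
```

Each patient appears in the held-out fold exactly once, so every prediction used for evaluation comes from a model that did not see that patient during training.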

Statistical Analysis

Continuous variables were presented as means with 95% confidence intervals (CIs) based on the t-distribution, and categorical variables were presented as proportions. To evaluate the association between thyroid cancer recurrence and categorical variables, Pearson’s χ2 test or Fisher’s exact test was used. All continuous variables were considered to satisfy normality according to the central limit theorem, as the number of patients in each group (with and without recurrence) was ≥30.23 Therefore, the association between continuous variables and PTC recurrence was evaluated using the t-test, with the form of the test chosen according to a test of equal variances (F-test). All statistical analyses were performed using R version 4.1.2, and p<0.05 was considered statistically significant.

Results

Demographics and PTC Characteristics

Our dataset consisted of 1613 patients — including 1550 patients with nonrecurrent PTC (nr_PTC) and 63 patients with recurrent PTC (r_PTC) — who underwent total thyroidectomy or thyroid lobectomy for PTC at the CBNUH. The mean age of the 1550 patients with nr_PTC was 48.26 ± 0.57 years; among these patients, 1268 were female (82%). The mean age of the 63 patients with r_PTC was 47.73 ± 3.93 years; among these patients, 42 were female (67%). The mean tumor size was 1.02 ± 0.04 in the nr_PTC group and 1.66 ± 0.33 in the r_PTC group. Tumor multiplicity was 553 (36%) in the nr_PTC group and 37 (59%) in the r_PTC group. Pathological diagnoses of ETE and ENE were observed in 837 (54%) and 180 (12%) patients with nr_PTC, respectively, and in 54 (86%) and 27 (43%) patients with r_PTC, respectively. According to the 7th edition of the American Joint Committee on Cancer/Union for International Cancer Control TNM staging system, T classification was as follows: 572 (37%), 108 (7%), 21 (1%), 841 (54%), and 8 (1%) patients with nr_PTC; and 6 (10%), 2 (3%), 0 (0%), 54 (86%), and 1 (2%) patients with r_PTC were classified as T_1a, T_1b, T_2, T_3, and T_4, respectively. Additionally, N classification was as follows: 887 (57%), 476 (31%), 73 (5%), and 114 (7%) patients with nr_PTC; and 7 (11%), 26 (41%), 27 (43%), and 3 (5%) patients with r_PTC were classified as N_0, N_1a, N_1b, N_x, respectively (Table 1).

Table 1 Demographics and Papillary Thyroid Carcinoma Characteristics of Patients

We analyzed the association between clinical data at the time of surgery and PTC recurrence. Sex, tumor size, tumor multiplicity, ETE, ENE, T classification, and N classification were significantly different between patients with and without recurrence.

Performance Evaluation

We performed four-fold cross-validation of the dataset to evaluate the model performance. The ROC curve for the four folds and average ROC curve are presented in Figure 3A, whereas the sum of the confusion matrix for the four folds is shown in Figure 3B. The overall sensitivity, specificity, and F1-score for the optimal threshold obtained from the ROC curve and AUROC are presented in Table 2. The proposed model achieved an average AUROC of 0.9622, F1-score of 0.4603, sensitivity of 0.9042, and specificity of 0.9077.
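The F1-score is much lower than the sensitivity and specificity because F1 depends on precision, which is strongly affected by class imbalance. The following back-of-the-envelope sketch applies the reported average sensitivity and specificity to the cohort sizes (63 recurrent, 1550 nonrecurrent). The resulting counts are an approximation for illustration only, not the confusion matrix in Figure 3B, and the pooled F1 it yields need not equal the reported fold-averaged value.

```python
from typing import Tuple


def f1_from_rates(sensitivity: float, specificity: float,
                  n_pos: int, n_neg: int) -> Tuple[float, float]:
    """Approximate precision and F1 implied by sensitivity, specificity, and class sizes."""
    tp = sensitivity * n_pos          # expected true positives
    fp = (1.0 - specificity) * n_neg  # expected false positives
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return precision, f1


precision, f1 = f1_from_rates(0.9042, 0.9077, n_pos=63, n_neg=1550)
print(f"precision ~ {precision:.2f}, F1 ~ {f1:.2f}")
```

Because nonrecurrent patients outnumber recurrent patients by roughly 25 to 1, even a false-positive rate below 10% produces more false alarms than true detections, which caps precision and therefore F1.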

Table 2 Four-Fold Cross-Validation Results of the Proposed Model

Figure 3 (A) Receiver operating characteristic (ROC) curve plot of the proposed model evaluated through four-fold cross-validation. (B) Four-fold confusion matrix sum.

Real-Time Prediction

The experimental results showed that the proposed model could predict recurrence after PTC surgery at least 1 year before it occurred (Figure 4). The left y-axis of the graph represents TSH, T3, and fT4 levels, and the right y-axis represents the probability of recurrence predicted by the model in a 6-month cycle until the most recent visit or the recurrence date. A recurrence probability >0.5 was considered to indicate PTC recurrence. TSH and Tg levels significantly affected the probability of recurrence. For patient B, when the TSH value increased from 0.36 to 2.37 μIU/mL and the Tg value increased from 0.22 to 1.19 ng/mL, the recurrence probability increased from 0.08 to 0.95. For patient C, when the TSH value increased from 0.09 to 100 μIU/mL and the Tg value increased from 0.83 to 13.06 ng/mL, the recurrence probability increased from 0.04 to 0.98.
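The visit-by-visit use of the model can be sketched as a simple monitoring loop. Everything here is hypothetical: `predict` stands in for a trained model that maps the TFT history up to a given visit to a probability, and the probabilities and visit values below are made up for illustration rather than taken from any patient in Figure 4.

```python
from typing import Callable, List, Optional, Sequence

Visit = Sequence[float]  # one visit's TFT values, e.g. (TSH, T3, fT4, Tg)


def first_alert_visit(visits: List[Visit],
                      predict: Callable[[List[Visit]], float],
                      threshold: float = 0.5) -> Optional[int]:
    """Index of the first visit at which the predicted probability exceeds the threshold."""
    for k in range(1, len(visits) + 1):
        if predict(visits[:k]) > threshold:  # only data up to visit k is used
            return k - 1
    return None


def lead_time_months(alert_visit: int, recurrence_visit: int, interval_months: int = 6) -> int:
    """Months between the first alert and the confirmed recurrence."""
    return (recurrence_visit - alert_visit) * interval_months


# Made-up probabilities standing in for a trained model's outputs.
toy_probs = [0.08, 0.12, 0.62, 0.95]


def toy_predict(history: List[Visit]) -> float:
    return toy_probs[len(history) - 1]


visits = [(0.4, 1.0, 1.2, 0.2)] * 4  # placeholder TFT values
alert = first_alert_visit(visits, toy_predict)
print(alert, lead_time_months(alert, recurrence_visit=4))
```

In this toy example the alert is raised at the third visit, two 6-month cycles before a recurrence confirmed at visit index 4, which corresponds to a lead time of 12 months.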

Figure 4 Eight examples of real-time prediction via the proposed model. (A–D) Patients with recurrence; (E–H) patients without recurrence. Numeric units are as follows: thyroid-stimulating hormone (TSH) (μIU/mL), triiodothyronine (T3) (ng/mL), free thyroxine (fT4) (ng/dL), and thyroglobulin (Tg) (ng/mL).

Discussion

In this study, we developed and validated a novel multimodal model to predict recurrence in patients with PTC after thyroidectomy by simultaneously analyzing numerical and time-series data. Our model showed an average AUROC of 0.9622, F1-score of 0.4603, sensitivity of 0.9042, and specificity of 0.9077. Moreover, our model exhibited better prediction accuracy than models in previous studies regarding sensitivity, F1-score, and AUROC.12,13,24,25

During follow-up in patients with cancer, early detection of recurrence and appropriate treatment are critical for improving prognosis. PTC has a better prognosis than other cancers; however, a substantial number of patients experience recurrence. Although PTC recurrence has a good prognosis, it can increase medical expenses and induce stress in patients. Moreover, lifelong follow-up is needed, and tests are required at each visit to check for recurrence. Thus, support programs for predicting recurrence can reduce the burden on clinicians and help in decision-making.

Some studies have evaluated the use of machine learning-based models to predict the recurrence of PTC. In previous studies focused on lymph node metastasis,24,25 the AUROCs were 0.75 and 0.86. These studies reported lower AUROCs than our study; further, they used limited data, such as computed tomography images or single-point data, at the time of initial surgical resection. Another study using data on patient and tumor characteristics also demonstrated a lower predictive value than the present study, including an F1-score of 0.431.11 A study by Kim et al included patients who underwent total thyroidectomy and radioiodine therapy, and used laboratory data on Tg levels during follow-up.12 In that study, the model was able to predict 71.4% (10/14) of recurrences. However, the number of patients was relatively small compared with that in our study (785 vs 1613). The authors of that study included only patients who received radioiodine therapy, whereas our research included patients irrespective of the use of radioiodine therapy. Moreover, the model used in our study included most of the data that could be obtained during regular follow-up, such as blood and imaging tests. These data were serially collected at each visit during the follow-up period.

According to the guidelines of the American Thyroid Association and Korean Thyroid Association, the risk of structural disease recurrence is classified as low, intermediate, or high.8,26 Various clinicopathological features at the time of initial therapy are used to assess the risk of recurrence. However, to date, there is no method for evaluating how this risk changes over time during long-term follow-up. This study suggests the possibility of evaluating recurrence at the time of examination by assessing results obtained during the follow-up. The experimental results in the present study showed that the proposed model could predict recurrence at least 1 year before its occurrence.

This study had several strengths. First, the proposed model exhibited good prediction performance. Second, we simultaneously used numerical and time-series data to predict recurrence. As mentioned above, different types of data were collected at various time points during the follow-up, similar to the method used in daily clinical practice. Third, the sample size was relatively large. PTC inherently has a lower recurrence rate than other cancers; therefore, to increase the number of recurrent cases, we attempted to collect data from a large number of patients with long follow-up periods after thyroidectomy.

However, this study had some limitations. First, it had a retrospective design and was performed using data from a single institution; therefore, there was a possibility of bias. Second, we could not include all the initial features of thyroid cancer, such as gene mutations and detailed treatment or drug compliance information. Some of these factors may influence the risk of recurrence, and their effects cannot be excluded.

Conclusion

In conclusion, this study is the first to attempt to predict thyroid cancer recurrence using a deep-learning model that utilizes numerical and time-series data from patients with PTC after thyroidectomy. If a robust predictive model for PTC recurrence is established, high-risk patients can be selected for customized treatment according to risk stratification. However, further research is required to validate these results.

Data Sharing Statement

We may share anonymized key data upon reasonable scientific request to the corresponding author.

Ethics Approval and Informed Consent

This study was approved by the Institutional Review Board of the Chungbuk National University Hospital (CBNUH) (approval no. 2021-07-010-001). This retrospective observational study was conducted in accordance with the principles of the Declaration of Helsinki. The Institutional Review Board of Chungbuk National University Hospital approved the study protocol and waived the requirement for informed consent, as it involved only the review of medical records. All patient data were anonymized and maintained with strict confidentiality throughout the study.

Acknowledgment

The abstract of this paper was presented at the 25th European Congress of Endocrinology as a poster presentation with interim findings. The poster’s abstract was published in ‘Eposter Presentations’ in 2023 25th European Congress of Endocrinology: https://www.endocrine-abstracts.org/ea/0090/ea0090ep951.

Author Contributions

All authors made a significant contribution to the work reported, whether that is in the conception, study design, execution, acquisition of data, analysis and interpretation, or in all these areas; took part in drafting, revising or critically reviewing the article; gave final approval of the version to be published; have agreed on the journal to which the article has been submitted; and agree to be accountable for all aspects of the work.

Funding

This research was supported by the National IT Industry Promotion Agency (NIPA) grant funded by the Ministry of Science and ICT (MSIT) (grant number: S0252-21-1001, Development of AI Precision Medical Solution [Doctor Answer 2.0]).

Disclosure

The authors report no conflicts of interest in this work.

References

1. Sherman SI. Thyroid carcinoma. Lancet. 2003;361(9356):501–511. doi:10.1016/S0140-6736(03)12488-9

2. La Vecchia C, Malvezzi M, Bosetti C, et al. Thyroid cancer mortality and incidence: a global overview. Int J Cancer. 2015;136(9):2187–2195. doi:10.1002/ijc.29251

3. Ahn HS, Kim HJ, Welch HG. Korea’s thyroid-cancer “epidemic”--screening and overdiagnosis. N Engl J Med. 2014;371(19):1765–1767. doi:10.1056/NEJMp1409841

4. Jung KW, Won YJ, Kong HJ, et al. Cancer statistics in Korea: incidence, mortality, survival, and prevalence in 2012. Cancer Res Treat. 2015;47(2):127–141. doi:10.4143/crt.2015.060

5. Kang MJ, Won YJ, Lee JJ, et al. Cancer Statistics in Korea: incidence, Mortality, Survival, and Prevalence in 2019. Cancer Res Treat. 2022;54(2):330–344. doi:10.4143/crt.2022.128

6. Wiltshire JJ, Drake TM, Uttley L, Balasubramanian SP. Systematic Review of Trends in the Incidence Rates of Thyroid Cancer. Thyroid. 2016;26(11):1541–1552. doi:10.1089/thy.2016.0100

7. Grogan RH, Kaplan SP, Cao H, et al. A study of recurrence and death from papillary thyroid cancer with 27 years of median follow-up. Surgery. 2013;154(6):1436–1446. doi:10.1016/j.surg.2013.07.008

8. Haugen BR, Alexander EK, Bible KC, et al. American Thyroid Association Management Guidelines for Adult Patients with Thyroid Nodules and Differentiated Thyroid Cancer: the American Thyroid Association Guidelines Task Force on Thyroid Nodules and Differentiated Thyroid Cancer. Thyroid. 2016;26(1):1–133. doi:10.1089/thy.2015.0020

9. Tagliafico AS, Piana M, Schenone D, Lai R, Massone AM, Houssami N. Overview of radiomics in breast cancer diagnosis and prognostication. Breast. 2020;49:74–80. doi:10.1016/j.breast.2019.10.018

10. Madabhushi A, Lee G. Image analysis and machine learning in digital pathology: challenges and opportunities. Med Image Anal. 2016;33:170–175. doi:10.1016/j.media.2016.06.037

11. Mourad M, Moubayed S, Dezube A, et al. Machine Learning and Feature Selection Applied to SEER Data to Reliably Assess Thyroid Cancer Prognosis. Sci Rep. 2020;10(1):5176. doi:10.1038/s41598-020-62023-w

12. Kim SY, Kim YI, Kim HJ, et al. New approach of prediction of recurrence in thyroid cancer patients using machine learning. Medicine. 2021;100(42):e27493. doi:10.1097/MD.0000000000027493

13. Park YM, Lee BJ. Machine learning-based prediction model using clinico-pathologic factors for papillary thyroid carcinoma recurrence. Sci Rep. 2021;11(1):4948. doi:10.1038/s41598-021-84504-2

14. Kim GH, Lee DH, Choi JW, Jeon HJ, Park S. Multimodal Neural Network for Recurrence Prediction of Papillary Thyroid Carcinoma. Adv Intell Syst. 2023;5(2):2200365.

15. Durante C, Montesano T, Torlontano M, et al. Papillary thyroid cancer: time course of recurrences during postsurgery surveillance. J Clin Endocrinol Metab. 2013;98(2):636–642. doi:10.1210/jc.2012-3401

16. Zhao L, Gong Y, Wang J, et al. Ultrasound-guided fine-needle aspiration biopsy of thyroid bed lesions from patients with thyroidectomy for thyroid carcinomas. Cancer Cytopathol. 2013;121(2):101–107. doi:10.1002/cncy.21202

17. Baloch ZW, LiVolsi VA. Cytologic and architectural mimics of papillary thyroid carcinoma. Diagnostic challenges in fine-needle aspiration and surgical pathology specimens. Am J Clin Pathol. 2006;125:S135–144. doi:10.1309/YY72M308WPEKL1YY

18. Moon WJ, Baek JH, Jung SL, et al. Ultrasonography and the ultrasound-based management of thyroid nodules: consensus statement and recommendations. Korean J Radiol. 2011;12(1):1–14. doi:10.3348/kjr.2011.12.1.1

19. Metere A, Aceti V, Giacomelli L. The surgical management of locally advanced well-differentiated thyroid carcinoma: changes over the years according to the AJCC 8th edition Cancer Staging Manual. Thyroid Res. 2019;12(1):10. doi:10.1186/s13044-019-0071-3

20. Tuttle RM, Ball DW, Byrd D, et al. Thyroid carcinoma. J Natl Compr Canc Netw. 2010;8(11):1228–1274. doi:10.6004/jnccn.2010.0093

21. Ioffe S, Szegedy C. Batch Normalization: accelerating Deep Network Training by Reducing Internal Covariate Shift. Proceedings of the 32nd International Conference on Machine Learning; Proceedings of Machine Learning Research. 2015.

22. Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput. 1997;9(8):1735–1780. doi:10.1162/neco.1997.9.8.1735

23. Kwak SG, Kim JH. Central limit theorem: the cornerstone of modern statistics. Korean J Anesthesiol. 2017;70(2):144–156. doi:10.4097/kjae.2017.70.2.144

24. Zhu J, Zheng J, Li L, et al. Application of Machine Learning Algorithms to Predict Central Lymph Node Metastasis in T1-T2, Non-invasive, and Clinically Node Negative Papillary Thyroid Carcinoma. Front Med (Lausanne). 2021;8:635771. doi:10.3389/fmed.2021.635771

25. Masuda T, Nakaura T, Funama Y, et al. Machine learning to identify lymph node metastasis from thyroid cancer in patients undergoing contrast-enhanced CT studies. Radiography. 2021;27(3):920–926. doi:10.1016/j.radi.2021.03.001

26. Yi KH, Lee EK, Kang H-C, et al. 2016 Revised Korean Thyroid Association Management Guidelines for Patients with Thyroid Nodules and Thyroid Cancer. Int J Thyroidol. 2016;9(2):59–126.
