Engaging Preference Optimization Alignment in Large Language Model for Continual Radiology Report Generation: A Hybrid Approach

ESR. Medical imaging in personalised medicine: a white paper of the research committee of the European Society of Radiology (ESR). Insights into imaging. 2015;6:141–155.

Ouis MY, Akhloufi M. Deep learning for report generation on chest X-ray images. Comput Med Imag Graph. 2023;102320.

Alfarghaly O, Khaled R, Elkorany A, Helal M, Fahmy A. Automated radiology report generation using conditioned transformers. Inf Med Unlocked. 2021;24:100557.

Liao Y, Liu H, Spasić I. Deep learning approaches to automatic radiology report generation: a systematic review. Inf Med Unlocked. 2023;101273.

Henderson M. Radiology facing a global shortage. Online. Available from: https://www.rsna.org/news/2022/may/global-radiologist-shortage. Accessed 11 May 2023

Fleishon HB. Radiology workforce shortage: the “silver squad” option. J Am Coll Radiol. 2024.

Singh AK, Kumar A, Mahmud M, Kaiser MS, Kishore A. COVID-19 infection detection from chest X-ray images using hybrid social group optimization and support vector classifier. Cogn Comput. 2024;16(4):1765–77.

Nazi ZA, Peng W. Large language models in healthcare and medical domain: a review. In: Informatics. vol. 11. MDPI; 2024. p. 57.

Xu L, Tang Q, Lv J, Zheng B, Zeng X, Li W. Deep image captioning: a review of methods, trends and future challenges. Neurocomputing. 2023;126287.

Demner-Fushman D, Kohli MD, Rosenman MB, Shooshan SE, Rodriguez L, Antani S, et al. Preparing a collection of radiology examinations for distribution and retrieval. J Am Med Inform Assoc. 2016;23(2):304–10.

He K, Mao R, Lin Q, Ruan Y, Lan X, Feng M, et al. A survey of large language models for healthcare: from data, technology, and applications to accountability and ethics. 2023. arXiv preprint arXiv:2310.05694

Rafailov R, Sharma A, Mitchell E, Manning CD, Ermon S, Finn C. Direct preference optimization: your language model is secretly a reward model. Adv Neural Inf Process Syst. 2024;36.

Selivanov A, Rogov OY, Chesakov D, Shelmanov A, Fedulova I, Dylov DV. Medical image captioning via generative pretrained transformers. Sci Rep. 2023;13(1):4171.

Thieme A, Rajamohan A, Cooper B, Groombridge H, Simister R, Wong B, et al. Challenges for responsible AI design and workflow integration in healthcare: a case study of automatic feeding tube qualification in radiology. 2024. arXiv preprint arXiv:2405.05299

Hyland SL, Bannur S, Bouzid K, Castro DC, Ranjit M, Schwaighofer A, et al. Maira-1: a specialised large multimodal model for radiology report generation. 2023. arXiv preprint arXiv:2311.13668

Hochreiter S. Long short-term memory. Neural Computation MIT-Press. 1997.

Cho K. Learning phrase representations using RNN encoder-decoder for statistical machine translation. 2014. arXiv preprint arXiv:1406.1078

Paalvast O, Nauta M, Koelle M, Geerdink J, Vijlbrief O, Hegeman JH, et al. Radiology report generation for proximal femur fractures using deep classification and language generation models. Artif Intell Med. 2022;128:102281.

Gajbhiye GO, Nandedkar AV, Faye I. Translating medical image to radiological report: adaptive multilevel multi-attention approach. Comput Methods Programs Biomed. 2022;221:106853.

Yang S, Niu J, Wu J, Wang Y, Liu X, Li Q. Automatic ultrasound image report generation with adaptive multimodal attention mechanism. Neurocomputing. 2021;427:40–9.

Wang F, Liang X, Xu L, Lin L. Unifying relational sentence generation and retrieval for medical image report composition. IEEE Trans Cybern. 2020;52(6):5015–25.

Vaswani A. Attention is all you need. Adv Neural Inf Process Syst. 2017.

Aksoy N, Ravikumar N, Frangi AF. Radiology report generation using transformers conditioned with non-imaging data. In: Medical Imaging 2023: Imaging Informatics for Healthcare, Research, and Applications. vol. 12469. SPIE; 2023. p. 146–154.

Zhang S, Zhou C, Chen L, Li Z, Gao Y, Chen Y. Visual prior-based cross-modal alignment network for radiology report generation. Comput Biol Med. 2023;166:107522.

Pahwa E, Mehta D, Kapadia S, Jain D, Luthra A. Medskip: medical report generation using skip connections and integrated attention. In: Proceedings of the IEEE/CVF International Conference on Computer Vision; 2021. p. 3409–3415.

Chen Z, Song Y, Chang TH, Wan X. Generating radiology reports via memory-driven transformer. 2020. arXiv preprint arXiv:2010.16056

Mohsan MM, Akram MU, Rasool G, Alghamdi NS, Baqai MAA, Abbas M. Vision transformer and language model based radiology report generation. IEEE Access. 2022;11:1814–24.

Wang Z, Liu L, Wang L, Zhou L. Metransformer: radiology report generation by transformer with multiple learnable expert tokens. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2023. p. 11558–11567.

Yang S, Wu X, Ge S, Zhou SK, Xiao L. Knowledge matters: chest radiology report generation with general and specific knowledge. Med Image Anal. 2022;80:102510.

Liu F, Wu X, Ge S, Fan W, Zou Y. Exploring and distilling posterior and prior knowledge for radiology report generation. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition; 2021. p. 13753–13762.

Touvron H, Martin L, Stone K, Albert P, Almahairi A, Babaei Y, et al. Llama 2: open foundation and fine-tuned chat models. 2023. arXiv preprint arXiv:2307.09288

Zheng L, Chiang WL, Sheng Y, Zhuang S, Wu Z, Zhuang Y, et al. Judging LLM-as-a-judge with MT-bench and chatbot arena.

Jiang AQ, Sablayrolles A, Mensch A, Bamford C, Chaplot DS, Casas Ddl, et al. Mistral 7B. 2023. arXiv preprint arXiv:2310.06825

Han T, Adams LC, Papaioannou JM, Grundmann P, Oberhauser T, Löser A, et al. MedAlpaca–an open-source collection of medical conversational AI models and training data. 2023. arXiv preprint arXiv:2304.08247

Nakaura T, Yoshida N, Kobayashi N, Shiraishi K, Nagayama Y, Uetani H, et al. Preliminary assessment of automated radiology report generation with generative pre-trained transformers: comparing results to radiologist-generated reports. Jpn J Radiol. 2023;1–11.

Radford A, Wu J, Child R, Luan D, Amodei D, Sutskever I, et al. Language models are unsupervised multitask learners. OpenAI blog. 2019;1(8):9.

Brown TB. Language models are few-shot learners. 2020. arXiv preprint arXiv:2005.14165

Jiang Z, Cai X, Yang L, Gao D, Zhao W, Han J, et al. Learning to summarize Chinese radiology findings with a pre-trained encoder. IEEE Trans Biomed Eng. 2023.

Wang Z, Liu L, Wang L, Zhou L. R2GenGPT: radiology report generation with frozen LLMs. Meta-Radiology. 2023;1(3):100033.

Jin H, Che H, Lin Y, Chen H. PromptMRG: diagnosis-driven prompts for medical report generation. In: Proceedings of the AAAI Conference on Artificial Intelligence. vol. 38; 2024. p. 2607–2615.

Tunstall L, Beeching E, Lambert N, Rajani N, Rasul K, Belkada Y, et al. Zephyr: direct distillation of LM alignment. 2023. arXiv preprint arXiv:2310.16944

Abdin M, Jacobs SA, Awan AA, Aneja J, Awadallah A, Awadalla H, et al. Phi-3 technical report: a highly capable language model locally on your phone. 2024. arXiv preprint arXiv:2404.14219

Su J, Ahmed M, Lu Y, Pan S, Bo W, Liu Y. RoFormer: enhanced transformer with rotary position embedding. Neurocomputing. 2024;568:127063.

Hu EJ, Shen Y, Wallis P, Allen-Zhu Z, Li Y, Wang S, et al. LoRA: low-rank adaptation of large language models. 2021. arXiv preprint arXiv:2106.09685

Hong J, Lee N, Thorne J. ORPO: monolithic preference optimization without reference model. 2024. arXiv preprint arXiv:2403.07691

Liu Z, Lin Y, Cao Y, Hu H, Wei Y, Zhang Z, et al. Swin transformer: hierarchical vision transformer using shifted windows. In: Proceedings of the IEEE/CVF international conference on computer vision; 2021. p. 10012–10022.

Li M, Liu R, Wang F, Chang X, Liang X. Auxiliary signal-guided knowledge encoder-decoder for medical report generation. World Wide Web. 2023;26(1):253–70.

Sennrich R. Neural machine translation of rare words with subword units. 2015. arXiv preprint arXiv:1508.07909

Loshchilov I, Hutter F. Decoupled weight decay regularization. 2017. arXiv preprint arXiv:1711.05101

Loshchilov I, Hutter F. SGDR: stochastic gradient descent with warm restarts. 2016. arXiv preprint arXiv:1608.03983

Papineni K, Roukos S, Ward T, Zhu WJ. Bleu: a method for automatic evaluation of machine translation. In: Proceedings of the 40th annual meeting of the Association for Computational Linguistics; 2002. p. 311–318.

Lin CY. Rouge: a package for automatic evaluation of summaries. In: Text summarization branches out; 2004. p. 74–81.

Banerjee S, Lavie A. METEOR: an automatic metric for MT evaluation with improved correlation with human judgments. In: Proceedings of the acl workshop on intrinsic and extrinsic evaluation measures for machine translation and/or summarization; 2005. p. 65–72.

Delbrouck JB, Chambon P, Bluethgen C, Tsai E, Almusa O, Langlotz C. Improving the factual correctness of radiology report generation with semantic rewards. In: Findings of the Association for Computational Linguistics: EMNLP 2022; 2022. p. 4348–4360.

Li Y, Liang X, Hu Z, Xing EP. Hybrid retrieval-generation reinforced agent for medical image report generation. Adv Neural Inf Process Syst. 2018;31.

Li CY, Liang X, Hu Z, Xing EP. Knowledge-driven encode, retrieve, paraphrase for medical image report generation. In: Proceedings of the AAAI conference on artificial intelligence. vol. 33; 2019. p. 6666–6673.

Biswal S, Xiao C, Glass LM, Westover B, Sun J. CLARA: clinical report auto-completion. In: Proceedings of The Web Conference 2020; 2020. p. 541–550.

Jing B, Wang Z, Xing E. Show, describe and conclude: on exploiting the structure information of chest X-ray reports. 2020. arXiv preprint arXiv:2004.12274

Jing B, Xie P, Xing E. On the automatic generation of medical imaging reports. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers); 2018. p. 2577–2586.

Wang Z, Zhou L, Wang L, Li X. A self-boosting framework for automated radiographic report generation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2021. p. 2433–2442.

Wang X, Peng Y, Lu L, Lu Z, Summers RM. TieNet: text-image embedding network for common thorax disease classification and reporting in chest X-rays. In: Proceedings of the IEEE conference on computer vision and pattern recognition; 2018. p. 9049–9058.

Xue Y, Xu T, Rodney Long L, Xue Z, Antani S, Thoma GR, et al. Multimodal recurrent model with attention for automated radiology report generation. In: Medical Image Computing and Computer Assisted Intervention–MICCAI 2018: 21st International Conference, Granada, Spain, September 16-20, 2018, Proceedings, Part I. Springer; 2018. p. 457–466.

Liu G, Hsu TMH, McDermott M, Boag W, Weng WH, Szolovits P, et al. Clinically accurate chest X-ray report generation. In: Machine Learning for Healthcare Conference. PMLR; 2019. p. 249–269.

Xue Y, Huang X. Improved disease classification in chest X-rays with transferred features from report generation. In: Information Processing in Medical Imaging: 26th International Conference, IPMI 2019, Hong Kong, China, June 2–7, 2019, Proceedings 26. Springer; 2019. p. 125–138.

Xiong Y, Du B, Yan P. Reinforced transformer for medical image captioning. In: Machine Learning in Medical Imaging: 10th International Workshop, MLMI 2019, Held in Conjunction with MICCAI 2019, Shenzhen, China, October 13, 2019, Proceedings 10. Springer; 2019. p. 673–680.

Li M, Lin B, Chen Z, Lin H, Liang X, Chang X. Dynamic graph enhanced contrastive learning for chest X-ray report generation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2023. p. 3334–3343.

Vinyals O, Toshev A, Bengio S, Erhan D. Show and tell: a neural image caption generator. In: Proceedings of the IEEE conference on computer vision and pattern recognition; 2015. p. 3156–3164.

Rennie SJ, Marcheret E, Mroueh Y, Ross J, Goel V. Self-critical sequence training for image captioning. In: Proceedings of the IEEE conference on computer vision and pattern recognition; 2017. p. 7008–7024.

Lu J, Xiong C, Parikh D, Socher R. Knowing when to look: adaptive attention via a visual sentinel for image captioning. In: Proceedings of the IEEE conference on computer vision and pattern recognition; 2017. p. 375–383.

Artetxe M, Ruder S, Yogatama D. On the cross-lingual transferability of monolingual representations. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics; 2020.

Gao Y, Xiong Y, Gao X, Jia K, Pan J, Bi Y, et al. Retrieval-augmented generation for large language models: a survey. 2023. arXiv preprint arXiv:2312.10997

Li C, Wong C, Zhang S, Usuyama N, Liu H, Yang J, et al. LLaVA-Med: training a large language-and-vision assistant for biomedicine in one day. Adv Neural Inf Process Syst. 2024;36.

Johnson AE, Pollard TJ, Berkowitz SJ, Greenbaum NR, Lungren MP, Deng CY, et al. MIMIC-CXR, a de-identified publicly available database of chest radiographs with free-text reports. Sci Data. 2019;6(1):317.
