Evaluation of the accuracy of an artificial intelligence in identifying contraindications to exercise therapy - Comparison with and interrater reliability of physical therapists judgments

Ali O, Abdelbaki W, Shrestha A, et al. A systematic literature review of artificial intelligence in the healthcare sector: Benefits, challenges, methodologies, and functionalities. J Innov Knowl. 2023;8:100333. https://doi.org/10.1016/j.jik.2023.100333.

Goodman K, Zandi D, Reis A, Vayena E. Balancing risks and benefits of artificial intelligence in the health sector. Bull World Health Organ. 2020;98:230-230A. https://doi.org/10.2471/blt.20.253823.

Pawloski PA, Brooks GA, Nielsen ME, Olson-Bullis BA. A systematic review of clinical decision support systems for clinical oncology practice. J Natl Compr Canc Netw. 2019;17:331–8. https://doi.org/10.6004/jnccn.2018.7104.

Verboven L, Calders T, Callens S, et al. A treatment recommender clinical decision support system for personalized medicine: method development and proof-of-concept for drug resistant tuberculosis. BMC Med Inform Decis Mak. 2022;22:56. https://doi.org/10.1186/s12911-022-01790-0.

Fiske A, Henningsen P, Buyx A. Your robot therapist will see you now: Ethical implications of embodied artificial intelligence in psychiatry, psychology, and psychotherapy. J Med Internet Res. 2019;21:e13216. https://doi.org/10.2196/13216.

El Asmar ML, Dharmayat KI, Vallejo-Vaz AJ, et al. Effect of computerised, knowledge-based, clinical decision support systems on patient-reported and clinical outcomes of patients with chronic disease managed in primary care settings: a systematic review. BMJ Open. 2021;11:e054659. https://doi.org/10.1136/bmjopen-2021-054659.

Rughani G, Nilsen TIL, Wood K, et al. The selfBACK artificial intelligence-based smartphone app can improve low back pain outcome even in patients with high levels of depression or stress. Eur J Pain. 2023;27:568–79. https://doi.org/10.1002/ejp.2080.

Lewis R, Gómez Álvarez CB, Rayman M, et al. Strategies for optimising musculoskeletal health in the 21st century. BMC Musculoskelet Disord. 2019;20:164. https://doi.org/10.1186/s12891-019-2510-7.

Briggs AM, Cross MJ, Hoy DG, et al. Musculoskeletal Health Conditions Represent a Global Threat to Healthy Aging: A Report for the 2015 World Health Organization World Report on Ageing and Health. Gerontologist. 2016;56(Suppl 2):S243–55. https://doi.org/10.1093/geront/gnw002.

Bonanni R, Cariati I, Tancredi V, et al. Chronic pain in musculoskeletal diseases: Do you know your enemy? J Clin Med. 2022;11:2609. https://doi.org/10.3390/jcm11092609.

Teepe GW, Kowatsch T, Hans FP, Benning L. Postmarketing follow-up of a digital home exercise program for back, hip, and knee pain: Retrospective observational study with a time-series and matched-pair analysis. J Med Internet Res. 2023;25:e43775. https://doi.org/10.2196/43775.

Areias AC, Costa F, Janela D, et al. Impact on productivity impairment of a digital care program for chronic low back pain: A prospective longitudinal cohort study. Musculoskelet Sci Pract. 2023;63:102709. https://doi.org/10.1016/j.msksp.2022.102709.

Chhabra HS, Sharma S, Verma S. Smartphone app in self-management of chronic low back pain: a randomized controlled trial. Eur Spine J. 2018;27:2862–74. https://doi.org/10.1007/s00586-018-5788-5.

Marcuzzi A, Nordstoga AL, Bach K, et al. Effect of an artificial intelligence–based self-management app on musculoskeletal health in patients with neck and/or low back pain referred to specialist care. JAMA Netw Open. 2023;6:e2320400. https://doi.org/10.1001/jamanetworkopen.2023.20400.

Mathews SC, McShea MJ, Hanley CL, et al. Digital health: A path to validation. NPJ Digit Med. 2019;2:38. https://doi.org/10.1038/s41746-019-0111-3.

Bossuyt PM, Reitsma JB, Bruns DE, et al. STARD 2015: An Updated List of Essential Items for Reporting Diagnostic Accuracy Studies. Clin Chem. 2015;61:1446–52. https://doi.org/10.1373/clinchem.2015.246280.

Sounderajah V, Ashrafian H, Golub RM, et al. Developing a reporting guideline for artificial intelligence-centred diagnostic test accuracy studies: the STARD-AI protocol. BMJ Open. 2021;11:e047709. https://doi.org/10.1136/bmjopen-2020-047709.

World Physiotherapy. Policy statement: Physical therapists as exercise and physical activity experts across the life span. World Physiotherapy. 2019. https://world.physio/sites/default/files/2020-09/PS-2019-Exercise-experts.pdf

Jette DU, Ardleigh K, Chandler K, McShea L. Decision-making ability of physical therapists: physical therapy intervention or medical referral. Phys Ther. 2006;86:1619–29. https://doi.org/10.2522/ptj.20050393.

Gallotti M, Campagnola B, Cocchieri A, et al. Effectiveness and consequences of direct access in physiotherapy: A systematic review. J Clin Med. 2023;12:5832. https://doi.org/10.3390/jcm12185832.

Lange T, Kopkow C, Lützner J, et al. Comparison of different rating scales for the use in Delphi studies: different scales lead to different consensus and show different test-retest reliability. BMC Med Res Methodol. 2020;20:28. https://doi.org/10.1186/s12874-020-0912-8.

Diamond IR, Grant RC, Feldman BM, et al. Defining consensus: a systematic review recommends methodologic criteria for reporting of Delphi studies. J Clin Epidemiol. 2014;67:401–9. https://doi.org/10.1016/j.jclinepi.2013.12.002.

Esteva A, Kuprel B, Novoa RA, et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature. 2017;542:115–8. https://doi.org/10.1038/nature21056.

Fleiss JL. Measuring nominal scale agreement among many raters. Psychol Bull. 1971;76:378–82. https://doi.org/10.1037/h0031619.

De Vries H, Elliott MN, Kanouse DE, Teleki SS. Using pooled kappa to summarize interrater agreement across many items. Field Methods. 2008;20:272–82. https://doi.org/10.1177/1525822x08317166.

Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33:159–74.

Terwee CB, Prinsen CAC, Chiarotto A, et al. COSMIN methodology for evaluating the content validity of patient-reported outcome measures: a Delphi study. Qual Life Res. 2018;27:1159–70. https://doi.org/10.1007/s11136-018-1829-0.

Mokkink LB, Boers M, van der Vleuten CPM, et al. COSMIN Risk of Bias tool to assess the quality of studies on reliability or measurement error of outcome measurement instruments: A Delphi study. BMC Med Res Methodol. 2020;20:293. https://doi.org/10.1186/s12874-020-01179-5.

Sokolova M, Japkowicz N, Szpakowicz S. Beyond accuracy, F-score and ROC: A family of discriminant measures for performance evaluation. In: Lecture Notes in Computer Science. Berlin, Heidelberg: Springer; 2006. p. 1015–21.

Yacouby R, Axman D. Probabilistic extension of precision, recall, and F1 score for more thorough evaluation of classification models. In: Proceedings of the First Workshop on Evaluation and Comparison of NLP Systems. Online: Association for Computational Linguistics; 2020. p. 79–91.

Lalkhen AG, McCluskey A. Clinical tests: sensitivity and specificity. Contin Educ Anaesth Crit Care Pain. 2008;8:221–3. https://doi.org/10.1093/bjaceaccp/mkn041.

Dukic V, Gatsonis C. Meta-analysis of diagnostic test accuracy assessment studies with varying number of thresholds. Biometrics. 2003;59:936–46. https://doi.org/10.1111/j.0006-341x.2003.00108.x.

Armstrong RA. When to use the Bonferroni correction. Ophthalmic Physiol Opt. 2014;34:502–8. https://doi.org/10.1111/opo.12131.

Redier H, Daures JP, Michel C, et al. Assessment of the severity of asthma by an expert system. Description and evaluation. Am J Respir Crit Care Med. 1995;151:345–52. https://doi.org/10.1164/ajrccm.151.2.7842190.

Gudmundsson HT, Hansen KE, Halldorsson BV, et al. Clinical decision support system for the management of osteoporosis compared to NOGG guidelines and an osteology specialist: A validation pilot study. BMC Med Inform Decis Mak. 2019;19:27. https://doi.org/10.1186/s12911-019-0749-4.

Farmer N. An update and further testing of a knowledge-based diagnostic clinical decision support system for musculoskeletal disorders of the shoulder for use in a primary care setting. J Eval Clin Pract. 2014;20:589–95. https://doi.org/10.1111/jep.12153.
