Precision information extraction for rare disease epidemiology at scale

Health Promotion and Disease Prevention Amendments of 1984. In: 21 USC 360bb, 98th Congress, 2nd Session edition. United States of America: U.S. Government Printing Office; 1984. p. 2817.

Regulation (EC) N°141/2000 of the European Parliament and of the Council of 16 December 1999 on orphan medicinal products. European Union; 2000. p. 1.

Dicken J. Rare diseases: although limited, available evidence suggests medical and other costs can be substantial. Government Accountability Office (GAO); 2021.

Nguengang Wakap S, Lambert DM, Olry A, Rodwell C, Gueydan C, Lanneau V, Murphy D, Le Cam Y, Rath A. Estimating cumulative point prevalence of rare diseases: analysis of the Orphanet database. Eur J Hum Genet. 2020;28(2):165–73.

Stanarevic KS. Health information behaviour of rare disease patients: seeking, finding and sharing health information. Health Info Libr J. 2019;36(4):341–56.

Orphan Drug Act. In: 21 USC, 97th Congress, 2nd Session edition. United States of America: U.S. Government Printing Office; 1983. p. 2049.

Bruckner-Tuderman L. Epidemiology of rare diseases is important. J Eur Acad Dermatol Venereol. 2021;35(4):783–4.

Valdez R, Ouyang L, Bolen J. Public health and rare diseases: oxymoron no more. Prev Chronic Dis. 2016;13:E05.

Puerto Rico Heart Health Program. [https://biolincc.nhlbi.nih.gov/studies/prhhp/]

Kuakini Honolulu Heart Program. [https://www.kuakini.org/wps/portal/kuakini-research/research-home/kuakini-research-programs/kuakini-honolulu-heart-program]

Breen N, Correa-de-Araujo R, Amarreh I, Araojo R, Arispe I, Ashman J, Berchick E, Chaves K, Bronson J, Chandra A, et al. Compendium of federal datasets addressing health disparities. U.S. Department of Health and Human Services, U.S. Public Health Service; 2019.

National Health and Nutrition Examination Survey. https://www.cdc.gov/nchs/nhanes/index.htm

National Health Interview Survey. https://www.cdc.gov/nchs/nhis/about_nhis.htm

National Patient Information Reporting System. https://www.ihs.gov/npirs/

Duggan MA, Anderson WF, Altekruse S, Penberthy L, Sherman ME. The surveillance, epidemiology, and end results (SEER) program and pathology: toward strengthening the critical relationship. Am J Surg Pathol. 2016;40(12):e94–102.

Hankey BF, Ries LA, Edwards BK. The surveillance, epidemiology, and end results program: a national resource. Cancer Epidemiol Prevent Biomarkers. 1999;8(12):1117–21.

National Notifiable Diseases Surveillance System. https://www.cdc.gov/nndss/index.html

Orphanet. Procedural document on Epidemiology of rare disease in Orphanet (Prevalence, incidence and number of published cases or families). Orphanet; 2019.

American College of Medical Genetics Newborn Screening Expert Group. Newborn screening: toward a uniform screening panel and system—executive summary. Pediatrics. 2006;117(5 Pt 2):S296–307.

About Cystic Fibrosis. https://www.cff.org/What-is-CF/About-Cystic-Fibrosis/

Buiting K, Williams C, Horsthemke B. Angelman syndrome—insights into a rare neurogenetic disorder. Nat Rev Neurol. 2016;12(10):584–93.

Maas NM, Van Buggenhout G, Hannes F, Thienpont B, Sanlaville D, Kok K, Midro A, Andrieux J, Anderlid BM, Schoumans J, et al. Genotype-phenotype correlation in 21 patients with Wolf-Hirschhorn syndrome using high resolution array comparative genome hybridisation (CGH). J Med Genet. 2008;45(2):71–80.

Labuda SM, Williams SH, Mukasa LN, McGhee L. Hansen’s disease and complications among Marshallese persons residing in Northwest Arkansas, 2003–2017. Am J Trop Med Hyg. 2020;103(5):1810–2.

AFM Cases and Outbreaks. https://www.cdc.gov/acute-flaccid-myelitis/cases-in-us.html

Birnbaum ZW, Sirken MG. Design of sample surveys to estimate the prevalence of rare diseases: three unbiased estimates. Vital Health Stat 2. 1965;(11):1–8.

Barendregt JJ, van Oortmarssen GJ, Vos T, Murray CJ. A generic model for the assessment of disease epidemiology: the computational basis of DisMod II. Popul Health Metr. 2003;1(1):4.

Addressing the challenges of persons living with a rare disease and their families. United Nations General Assembly; 2021.

Genetic and Rare Diseases Information Center. https://rarediseases.info.nih.gov/.

About Orphanet. https://www.orpha.net/consor/cgi-bin/Education_AboutOrphanet.php?lng=EN

Hamosh A, Scott AF, Amberger JS, Bocchini CA, McKusick VA. Online Mendelian Inheritance in Man (OMIM), a knowledgebase of human genes and genetic disorders. Nucleic Acids Res. 2005;33(Database issue):D514-517.

Karystianis G, Thayer K, Wolfe M, Tsafnat G. Evaluation of a rule-based method for epidemiological document classification towards the automation of systematic reviews. J Biomed Inform. 2017;70:27–34.

Huertas-Quintero JA, Losada-Trujillo N, Cuellar-Ortiz DA, Velasco-Parra HM. Hypophosphatemic rickets in Colombia: a prevalence-estimation model in rare diseases. Lancet Reg Health Am. 2021;7:100131.

Wasserman RC. Electronic medical records (EMRs), epidemiology, and epistemology: reflections on EMRs and future pediatric clinical research. Acad Pediatr. 2011;11(4):280–7.

Tisdale A, Cutillo CM, Nathan R, Russo P, Laraway B, Haendel M, Nowak D, Hasche C, Chan CH, Griese E, et al. The IDeaS initiative: pilot study to assess the impact of rare diseases on patients and healthcare systems. Orphanet J Rare Dis. 2021;16(1):429.

Gokhale KM, Chandan JS, Toulis K, Gkoutos G, Tino P, Nirantharakumar K. Data extraction for epidemiological research (DExtER): a novel tool for automated clinical epidemiology studies. Eur J Epidemiol. 2021;36(2):165–78.

Cameron D, Smith GA, Daniulaityte R, Sheth AP, Dave D, Chen L, Anand G, Carlson R, Watkins KZ, Falck R. PREDOSE: a semantic web platform for drug abuse epidemiology using social media. J Biomed Inform. 2013;46(6):985–97.

Osborne JD, Wyatt M, Westfall AO, Willig J, Bethard S, Gordon G. Efficient identification of nationally mandated reportable cancer cases using natural language processing and machine learning. J Am Med Inform Assoc. 2016;23(6):1077–84.

Yoon HJ, Stanley C, Christian JB, Klasky HB, Blanchard AE, Durbin EB, Wu XC, Stroup A, Doherty J, Schwartz SM, et al. Optimal vocabulary selection approaches for privacy-preserving deep NLP model training for information extraction and cancer epidemiology. Cancer Biomark. 2022;33(2):185–98.

Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I. Attention is all you need. In: 31st conference on neural information processing systems (NIPS 2017), vol. 30. Long Beach, CA; 2017.

Devlin J, Chang M-W, Lee K, Toutanova K. BERT: Pre-training of deep bidirectional transformers for language understanding. In: Proceedings of NAACL-HLT 2019. Minneapolis, Minnesota: Association for Computational Linguistics; 2019. p. 4171–4186.

Lee J, Yoon W, Kim S, Kim D, Kim S, So CH, Kang J. BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics. 2020;36(4):1234–40.

Gu Y, Tinn R, Cheng H, Lucas M, Usuyama N, Liu X, Naumann T, Gao J, Poon H. Domain-specific language model pretraining for biomedical natural language processing. ACM Trans Comput Healthc. 2022;2022(1):1–23.

Ji Z, Wei Q, Xu H. BERT-based ranking for biomedical entity normalization. AMIA Jt Summits Transl Sci Proc. 2020;2020:269–77.

Alsentzer E, Murphy J, Boag W, Weng W-H, Jindi D, Naumann T, McDermott M. Publicly available Clinical BERT embeddings. In: 2nd clinical natural language processing workshop. Minneapolis, Minnesota, USA. 2019. p. 72–78.

Si Y, Wang J, Xu H, Roberts K. Enhancing clinical concept extraction with contextual embeddings. J Am Med Inform Assoc. 2019;26(11):1297–304.

Peng Y, Yan S, Lu Z. Transfer learning in biomedical natural language processing: an evaluation of BERT and ELMo on ten benchmarking datasets. In: 18th BioNLP workshop and shared task. Florence, Italy; 2019. p. 58–65.

Li F, Jin Y, Liu W, Rawat BPS, Cai P, Yu H. Fine-tuning bidirectional encoder representations from transformers (BERT)-based models on large-scale electronic health record notes: an empirical study. JMIR Med Inform. 2019;7(3):e14830.

Mahajan D, Poddar A, Liang JJ, Lin YT, Prager JM, Suryanarayanan P, Raghavan P, Tsou CH. Identification of semantically similar sentences in clinical notes: iterative intermediate training using multi-task learning. JMIR Med Inform. 2020;8(11):e22508.

Mitra A, Rawat BPS, McManus DD, Yu H. Relation classification for bleeding events from electronic health records using deep learning systems: an empirical study. JMIR Med Inform. 2021;9(7):e27527.

Rasmy L, Xiang Y, Xie Z, Tao C, Zhi D. Med-BERT: pretrained contextualized embeddings on large-scale structured electronic health records for disease prediction. NPJ Digit Med. 2021;4(1):86.

Zhou ZH. A brief introduction to weakly supervised learning. Natl Sci Rev. 2018;5(1):44–53.

Sedova A, Stephan A, Speranskaya M, Roth B. Knodle: modular weakly supervised learning with PyTorch. In: Proceedings of the 6th workshop on representation learning for NLP (RepL4NLP-2021); Online. Association for Computational Linguistics; 2021. p. 100–111.

Honnibal M, Montani I. spaCy 2: Natural language understanding with Bloom embeddings, convolutional neural networks and incremental parsing. 2017.

Patrini G, Nielsen F, Nock R, Carioni M. Loss factorization, weakly supervised learning and label noise robustness. In: The 33rd international conference on machine learning. 2016. p. 708–717.

Ba JL, Kiros JR, Hinton GE. Layer normalization. arXiv preprint; 2016.

Zhu Q, Nguyen DT, Grishagin I, Southall N, Sid E, Pariser A. An integrative knowledge graph for rare diseases, derived from the Genetic and Rare Diseases Information Center (GARD). J Biomed Semantics. 2020;11(1):13.

Madeira F, Park YM, Lee J, Buso N, Gur T, Madhusoodanan N, Basutkar P, Tivey ARN, Potter SC, Finn RD, Lopez R. The EMBL-EBI search and sequence analysis tools APIs in 2019. Nucleic Acids Res. 2019;47(W1):W636–41.

John JN, Sid E, Zhu Q. Recurrent neural networks to automatically identify rare disease epidemiologic studies from PubMed. AMIA Annu Symp Proc. 2021;2021:325–34.

Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput. 1997;9(8):1735–80.

Dai Z, Callan J. Deeper text understanding for IR with contextual neural language modeling. In: Proceedings of the 42nd international ACM SIGIR conference on research and development in information retrieval. 2019. p. 985–988.
