Available online 9 September 2022, 104200
Highlights
• We developed comprehensive annotation criteria for patient status, comprising 46 entity types, 9 attributes, and 36 relations, through recursive annotation of case reports.
• Comprehensiveness requires that assertions be expressed as entity types, not as attributes.
• Our corpus, comprising 182 semi-automatically annotated cases in Japanese, is publicly available.
Abstract
In clinical records, much of the clinical information is recorded as free text, which necessitates advanced automatic information extraction technology. Developing practical technologies requires a corpus with fine-grained annotations that describe the information it contains, but such annotation criteria have not yet been sufficiently researched. This study aimed to develop fine-grained annotation criteria that exhaustively cover patients’ states in case reports. We collected 362 case reports, written in Japanese, of intractable diseases that were expected to contain a broad range of patients’ states. Criteria were developed by repeatedly annotating the clinical case report text and revising the criteria. The resulting set of annotation criteria for patients’ states consists of 46 entity types, 9 attributes, and 36 relations. It expresses more detailed information than previous studies by covering a broader range of concept types, including treatment, and it captures clinical information based on a combination of causality and judgment, which could not be expressed before.
Keywords
Natural language processing
Datasets
Medical records
Abbreviations
NLP: natural language processing
CUIs: concept-unique identifiers
i2b2: Informatics for Integrating Biology & the Bedside
© 2022 The Authors. Published by Elsevier Inc.