Annotation of epilepsy clinic letters for natural language processing

Synthetic letters

We produced 200 synthetic epilepsy clinic letters based on United Kingdom (UK) hospital outpatient epilepsy clinic consultations. Epilepsy clinic letters are written by clinicians and describe relevant details, discussions, investigations, and management plans. They form part of the patient health record and are written in a variety of styles, lengths, and formats.

The synthetic letters were written by neurology consultants, specialist trainees, and epilepsy specialist nurses to ensure variation in writing styles and content. They were based on real clinic letters but contained entirely synthetic information; any patient or clinician details in the letters are fictitious, i.e. no real personal, demographic, or clinical information is included. Four letters were duplicated within the set to test for consistency in annotations.

Annotations

The letters were double-annotated by four trained researchers and clinicians (100 letters each) according to annotation guidelines developed during the creation of ExECT. The guidelines were based on previous annotation sessions and modified to incorporate annotators’ suggestions, with examples derived from clinic letters to assist with more difficult cases.

We used the annotation tool Markup [15], with an epilepsy concept list based on the Unified Medical Language System (UMLS) ontology [16] and mapping of terms from the International League Against Epilepsy (ILAE) epilepsy and seizure classifications [17, 18]. Markup provides annotators with a list of entities (concepts) to be annotated and drop-down lists of features (attributes to be assigned to each entity, including UMLS concept unique identifiers [CUIs]) associated with each diagnostic or treatment term (Fig. 1). We ran several trial sessions to ensure familiarity with Markup and the annotation process before the annotation task.

Entities that were annotated included:

Birth history: birth age, perinatal events, normal/abnormal birth;

Diagnosis: epilepsy, epilepsy type/syndrome, seizure type;

Epilepsy cause: clear statements identifying past events or comorbidities causing an individual’s epilepsy;

Investigations: EEG (including examination type), CT, and MRI results, annotated as normal, abnormal, or not stated;

Onset: time of onset of epilepsy or specific seizure types, expressed as age, date, or time since the first epileptic seizure or mention of epilepsy;

Patient history: unspecified seizures (seizures, blank episodes), febrile seizures, major health events, and comorbidities, with age, date, or time since/onset of the event;

Prescriptions: currently prescribed antiseizure medications (ASMs) with dose, dose unit, and frequency;

Seizure frequency: number of seizures, by type if stated (including periods of seizure freedom), since or during a specific point in time/time period/date, or changes in seizure frequency since/during a specified time or since the last clinic visit;

When diagnosed: age, date, or time since the diagnosis of epilepsy.

Levels of certainty expressed in the statements, ranging from 1 (negation) to 5 (strong affirmation), were assigned to phrases relating to diagnosis and patient history (Supplementary Table 1).

Fig. 1 Annotating synthetic letters in Markup (www.getmarkup.com). Annotation types are listed on the left-hand side, above the UMLS selection dropdown. Completed annotations are listed on the right-hand side

Inter-annotator agreement

We combined the annotation sets from all four annotators to create two complete sets of 200 annotated letters each, and compared these two sets using inter-annotator agreement (IAA). IAA, which assesses the level of agreement between annotators, was calculated using the F1 score, the harmonic mean of precision (positive predictive value) and recall (sensitivity) and an established information retrieval performance measure [19]. We defined agreement as two annotators selecting the same entity and attributes for a specific term. All annotations were reviewed during consensus meetings.
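As a minimal sketch of this calculation (the function name and example annotations are illustrative, not the study's actual scripts or data), entity-level IAA can be scored with F1 by treating an annotation as matched only when both annotators selected the same span, entity, and attributes:

```python
# Illustrative sketch of entity-level inter-annotator agreement (IAA) via F1.
# Annotations are modelled as tuples; a match requires identical span, entity,
# and attribute. Annotator A is treated as reference, annotator B as response.

def iaa_f1(set_a, set_b):
    """F1 between two sets of (start, end, entity, attribute) annotations."""
    if not set_a or not set_b:
        return 0.0
    matched = len(set_a & set_b)      # annotations both annotators agree on
    precision = matched / len(set_b)  # share of B's annotations found in A
    recall = matched / len(set_a)     # share of A's annotations found in B
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)  # harmonic mean

# Hypothetical annotations: (start offset, end offset, entity, attribute)
a = {(10, 18, "Diagnosis", "epilepsy"), (40, 52, "Prescriptions", "lamotrigine")}
b = {(10, 18, "Diagnosis", "epilepsy"), (60, 66, "Onset", "age 12")}

print(iaa_f1(a, b))  # 0.5: one of the two annotations in each set agrees
```

Because F1 is symmetric in this setting (each annotator's precision is the other's recall), it is a common choice for IAA when, as here, spans make chance-corrected measures such as kappa hard to apply.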

The final corrected set, representing consensus opinion, formed the gold standard that we used to validate ExECTv2, with the IAA scores providing a benchmark for the pipeline’s performance [20]. ExECT is an epilepsy NLP pipeline written within GATE (General Architecture for Text Engineering). See the supplementary information for a figure detailing the ExECT pipeline (Supplementary Fig. 1) and Fonferko-Shadrach et al. for further details on ExECT [14]. ExECTv2 has several improvements over version 1, including an expanded range of extracted terms, updated gazetteers incorporating the most recent ILAE classification system, and added rules for combined seizure and epilepsy terms [21]. We used R version 4.1.0 to calculate per-item (every mention of the entity) and per-letter (correct extraction of the term in a letter) validation scores.
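The per-item versus per-letter distinction can be sketched as follows (a hedged illustration in Python rather than the study's R scripts; the function names and data are hypothetical): per item, every gold-standard mention counts individually, whereas per letter, a term counts once if it is correctly extracted anywhere in that letter.

```python
# Illustrative sketch of per-item vs per-letter recall. Data are hypothetical:
# each dict maps a letter id to the list of seizure-type mentions found in it.

def per_item_recall(gold, extracted):
    """Fraction of all gold mentions, across letters, that were extracted."""
    total = sum(len(mentions) for mentions in gold.values())
    found = sum(len(set(gold[l]) & set(extracted.get(l, []))) for l in gold)
    return found / total if total else 0.0

def per_letter_recall(gold, extracted):
    """Fraction of letters in which the term was extracted at least once."""
    letters = [l for l in gold if gold[l]]
    hit = sum(bool(set(gold[l]) & set(extracted.get(l, []))) for l in letters)
    return hit / len(letters) if letters else 0.0

gold = {"letter1": ["focal", "focal"], "letter2": ["absence"]}
out = {"letter1": ["focal"], "letter2": []}

print(per_item_recall(gold, out))    # 1 of 3 gold mentions extracted
print(per_letter_recall(gold, out))  # 1 of 2 letters has a correct extraction
```

Per-letter scores are typically higher, since a single correct extraction is enough for a letter to count, which is why reporting both gives a fuller picture of pipeline performance.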
