Supporting SNOMED CT postcoordination with knowledge graph embeddings

SNOMED CT postcoordination is an underused mechanism that can help to implement advanced systems for the automatic extraction and encoding of clinical information from text. It allows concepts that do not exist in SNOMED CT to be defined through their relationships with existing ones. Manually building postcoordinated expressions is difficult: it requires deep knowledge of the terminology and the support of specialised tools, which barely exist. To support the building of postcoordinated expressions, we have implemented KGE4SCT, a method that suggests the corresponding SNOMED CT postcoordinated expression for a given clinical term. We leverage the SNOMED CT ontology and its graph-like structure and use knowledge graph embeddings (KGEs). The objective of such embeddings is to represent knowledge graph components (e.g. entities and relations) in a vector space in a way that captures the structure of the graph. We then use vector similarity and analogies to obtain the postcoordinated expression of a given clinical term. We obtained a semantic type accuracy of 98%, a relationship accuracy of 90%, and an analogy accuracy of 60%, with an overall completeness of postcoordination of 52% for the Spanish SNOMED CT version. We have also applied the method to the English SNOMED CT version, where it outperformed state-of-the-art methods both in corpus generation for language model training for this task (an improvement of 6% in analogy accuracy) and in automatic postcoordination of SNOMED CT expressions (an increase of 17% in partial conversion rate).
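
To illustrate the general idea of solving an analogy over embedding vectors, the following is a minimal sketch and not the KGE4SCT implementation. It assumes a common vector-offset formulation ("a is to b as c is to ?", answered by the nearest neighbour of b - a + c under cosine similarity). The concept labels, the three-dimensional vectors, and the names `EMBEDDINGS`, `cosine`, and `solve_analogy` are all hypothetical; real embeddings would be learned from the SNOMED CT graph.

```python
import math

# Toy 3-dimensional embeddings, hand-written for illustration only.
# Real KGEs would be learned from the SNOMED CT graph (assumption).
EMBEDDINGS = {
    "fracture of femur": [1.0, 0.0, 1.0],
    "femur structure": [1.0, 0.0, 0.0],
    "fracture of tibia": [0.0, 1.0, 1.0],
    "tibia structure": [0.0, 1.0, 0.0],
    "fracture": [0.0, 0.0, 1.0],
}


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def solve_analogy(a, b, c, embeddings):
    """Answer 'a is to b as c is to ?' with the vector-offset method."""
    # Target vector: b - a + c, computed component by component.
    target = [
        vb - va + vc
        for va, vb, vc in zip(embeddings[a], embeddings[b], embeddings[c])
    ]
    # Exclude the three query terms, then pick the most similar candidate.
    candidates = [name for name in embeddings if name not in (a, b, c)]
    return max(candidates, key=lambda name: cosine(embeddings[name], target))


if __name__ == "__main__":
    answer = solve_analogy(
        "fracture of femur", "femur structure", "fracture of tibia", EMBEDDINGS
    )
    print(answer)  # prints "tibia structure"
```

In this toy setting, the known pair (a fracture and its finding site) supplies the offset, and applying that offset to a new term retrieves a candidate attribute value for it. How KGE4SCT actually selects the reference pairs and the relationships is not described in this abstract.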
