Defining health data elements under the HL7 development framework for metadata management

Data elements were derived by constraining metadata (Class, Attribute and Data Type) in the HCDM and were described according to the ISO/IEC 11179 metamodel, which defines how a data element can be classified, semantically described, named, identified, stored, retrieved, and managed [45, 46]. In the ISO/IEC 11179 metamodel, a data element comprises two parts: a Data Element Concept and a Value Domain. A Data Element Concept joins an Object Class (such as a person) with one of its Properties (such as sex) [47]. The Value Domain is the set of permissible values for one or more data elements. The ISO/IEC 11179 metamodel maps to the HCDM as follows: the Object Class corresponds to the Class in the HCDM, the Property of the Object Class corresponds to the Attribute of the Class, and the data type of the Value Domain corresponds to the Data Type of the attribute in the HCDM (Table 4).
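The two-part structure and its mapping to the HCDM can be illustrated with a minimal Python sketch. The class and field names below are illustrative assumptions rather than identifiers from ISO/IEC 11179 or the HCDM; the example values ("person", "address", "AD") are those used in Fig. 2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElementConcept:
    """An Object Class joined with one of its Properties."""
    object_class: str   # corresponds to a Class in the HCDM
    property_name: str  # corresponds to an Attribute of that Class


@dataclass(frozen=True)
class ValueDomain:
    """Reduced here to a data type plus an optional enumeration of values."""
    data_type: str                  # corresponds to the attribute's Data Type
    permissible_values: tuple = ()  # left empty when values are not enumerated


@dataclass(frozen=True)
class DataElement:
    concept: DataElementConcept
    value_domain: ValueDomain

    def hcdm_mapping(self) -> dict:
        """Return the HCDM counterpart of each ISO/IEC 11179 part."""
        return {
            "Class": self.concept.object_class,
            "Attribute": self.concept.property_name,
            "Data Type": self.value_domain.data_type,
        }


person_address = DataElement(
    DataElementConcept("person", "address"),
    ValueDomain("AD"),
)
print(person_address.hcdm_mapping())
```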

Table 4 Mapping relationship between ISO/IEC 11179 metamodel and HCDM

Based on the HCDM, a national health data dictionary (NHDD) comprising three types of DEs (initial DE, general DE and domain DE) was developed; it was issued as a health industry standard of China in May 2020 [48]. Initial DEs were formed by combining classes, attributes and data types in the HCDM. General DEs were generated by decomposing the semantic components of the data types of initial DEs. Domain DEs were defined by constraining general DEs with terms from controlled vocabularies.

Initial data elements

A total of 100 initial DEs were extracted from the HCDM and represented through data types (foundation, basic and quantities). Initial DEs serve as a bridge between the HCDM and general DEs, so they have no corresponding specification of their semantic expression. As shown in Fig. 2, the initial DE “person’s address” is formed by constraining the Class (DE: Object Class) “person”, the Attribute (DE: Property) “address” of person and the Data Type (DE: data type) “AD”.

Fig. 2

Abstract process of initial data elements. The left side indicates the initial data elements abstract process, and the right side shows an example for initial data element person’s address, which is formed by constraining the object class “person”, the attribute “address” of person and data type “AD” in the Health Concept Data Model
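The combination step can be sketched in a few lines of Python. The model fragment is an assumption made for illustration: only the entry person / address / "AD" comes from Fig. 2, while the second attribute and its data type are hypothetical placeholders.

```python
from typing import Iterator, NamedTuple


class InitialDE(NamedTuple):
    name: str
    object_class: str
    property_name: str
    data_type: str


# Fragment of a class model: class -> attribute -> data type.
HCDM_FRAGMENT = {
    "person": {
        "address": "AD",  # example from Fig. 2
        "name": "EN",     # hypothetical entry, for illustration only
    },
}


def initial_data_elements(model: dict) -> Iterator[InitialDE]:
    """Form one initial DE per (class, attribute, data type) combination."""
    for object_class, attributes in model.items():
        for attribute, data_type in attributes.items():
            yield InitialDE(
                f"{object_class}'s {attribute}", object_class, attribute, data_type
            )


initial_des = list(initial_data_elements(HCDM_FRAGMENT))
for de in initial_des:
    print(de.name, "->", de.data_type)
```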

General data element

General DEs are independent of any specific domain context so that they can be maintained at a higher level. A total of 144 general DEs were developed from the initial DEs. The mapping from the ISO/IEC 11179 metamodel to the HCDM was the same as in the derivation of initial DEs, but the data types of general DEs were developed by further specializing the data types of initial DEs. Starting from the data types of the initial DEs, we unfolded the components of the HCDM data types. Each general DE was then formed by combining an initial DE with one of the unfolded components of its Data Type.
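As a rough sketch of this unfolding, the Python snippet below pairs an initial DE with each component of its data type. The component table, the data type label "CD" and the initial DE "person nationality" are assumptions chosen for illustration; the actual components are defined by the HCDM data types.

```python
# Hypothetical component table: the parts each data type unfolds into.
DATA_TYPE_COMPONENTS = {
    "CD": ("code", "code system identifier", "code system name"),
}


def unfold(initial_de_name: str, data_type: str) -> list:
    """Form one general DE per unfolded component of the data type."""
    components = DATA_TYPE_COMPONENTS.get(data_type, ())
    return [f"{initial_de_name} {component}" for component in components]


general_des = unfold("person nationality", "CD")
for name in general_des:
    print(name)
```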

Such specialization mainly targeted ANY, the data type for values from medical observations. ANY can be specialized into quantitative measurements, liter, index values, ranges, ordinals, nominals, etc. Based on actual demand, 19 metadata items from ISO/IEC 11179 were adopted in this work to describe general DEs. Table 5 presents the standardized description of a general DE, taking Person Nationality Code as an example.

Table 5 Standardized description of general DE Person Nationality Code

In addition, six categories of representation format for general DEs were defined according to ISO/IEC 11179-3: text, symbols, values, date, time and code. When similar DEs appeared repeatedly, only one was retained. For example, code system identifiers and system names were repeated in all general DEs with coded attributes (entity class code, entity code, role code, act code, etc.), so only one code system identifier and one system name were retained in the NHDD.
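The retention rule for repeated DEs can be sketched as follows. The DE names and the suffix-matching approach are assumptions made for illustration and are not taken from the NHDD itself.

```python
# Components that recur for every coded attribute (illustrative names).
SHARED_COMPONENTS = ("code system identifier", "code system name")


def retain_shared_once(general_des: list) -> list:
    """Keep each shared component once; keep all other DEs as they are."""
    retained, seen_shared = [], set()
    for name in general_des:
        shared = next((c for c in SHARED_COMPONENTS if name.endswith(c)), None)
        if shared is None:
            retained.append(name)
        elif shared not in seen_shared:
            seen_shared.add(shared)
            retained.append(shared)
    return retained


candidates = [
    "entity class code",
    "entity class code system identifier",
    "entity class code system name",
    "role code",
    "role code system identifier",
    "role code system name",
]
deduplicated = retain_shared_once(candidates)
print(deduplicated)
```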

Domain DE and Controlled vocabulary

General DEs are largely independent of specific domain context and usually need to be localized before being adopted by domain data developers. Such localization should follow a unified rule to avoid semantic confusion in information sharing. Controlled vocabularies were developed on the basis of the standard Health Information Value Codes (standard number: WS 364) and by referring to HL7 vocabularies [50]. There are currently 12 controlled vocabularies in the NHDD: Entity classCode and Entity code, Entity determinerCode, Entity URLScheme, Entity telecommunicationAddressUse, Person addressType, Role classCode and Role code, Rolelink code, Participation typeCode, Act classCode and Act code, Act moodCode, Act relationshipCode, and Act statusCode.

The Entity classCode vocabulary provides, for each object class, all possible subtypes (which can be further subdivided) or instances (which cannot be further subdivided) for localizing the general DEs. It thereby restricts how a general DE can be specified into one or more domain DEs. Entity is specialized into instances of humans, microorganisms, animals and plants, which are listed in the controlled vocabularies for the general DEs Entity classCode and Entity code. The link between the controlled vocabularies Entity Class Code and Entity Code is shown in Table 6, in which the codes are the permissible value sets for classCode and code of “Entity” in Fig. 1.

Table 6 Controlled vocabularies Entity Class Code and Entity Code

Consequently, related general DEs can be constrained into specific domain DEs. As shown in Fig. 3, the general DE “Entity name” can be constrained to the domain DE “doctor’s name” based on the terms “human” and “doctor”, and to the domain DE “surgeon’s name” based on the terms “human” and “surgeon” (a subtype of “doctor”), in the vocabularies “Entity Code” and “Role Code”. It can likewise be constrained to “operator’s name” based on the terms “human” and “operator” in the vocabularies “Entity Code” and “Participation Code Type”. “Entity name” can also be constrained to the domain DE “operation doctor’s name” based on a pre-coordinated combination of the vocabularies “Entity Code (term: human)”, “Role Code (term: doctor)” and “Participation Code Type (term: operator)”.

Fig. 3

The relationship of general DE, controlled vocabulary and domain DE. The general DE “Entity name” can be constrained to the domain DE “doctor’s name” based on the term “doctor” and to the domain DE “surgeon’s name” based on the term “surgeon” (a subtype of “doctor”) in the vocabulary “roleCode”, and to “operator’s name” based on the term “operator” in the vocabulary “participationCodeType”. “Entity name” can also be constrained to the domain DE “operation doctor’s name” based on a pre-coordinated combination of the vocabularies “roleCode (term: doctor)” and “participationCodeType (term: operator)”
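The constraint step in Fig. 3 can be sketched in Python as a lookup against controlled vocabularies. The vocabulary contents are limited to the terms named in the text, and the function signature and naming scheme are assumptions for illustration, not the NHDD's own rules.

```python
from dataclasses import dataclass

# Controlled vocabularies, limited to the terms mentioned in the text.
VOCABULARIES = {
    "Entity Code": {"human"},
    "Role Code": {"doctor", "surgeon"},  # "surgeon" is a subtype of "doctor"
    "Participation Code Type": {"operator"},
}


@dataclass(frozen=True)
class DomainDE:
    name: str
    general_de: str
    constraints: tuple  # ((vocabulary, term), ...)


def constrain(general_de: str, attribute: str, label: str, terms: dict) -> DomainDE:
    """Constrain a general DE with one term per vocabulary (pre-coordinated)."""
    for vocabulary, term in terms.items():
        if term not in VOCABULARIES.get(vocabulary, set()):
            raise ValueError(f"{term!r} is not a term of {vocabulary!r}")
    return DomainDE(f"{label}'s {attribute}", general_de, tuple(sorted(terms.items())))


doctor_name = constrain(
    "Entity name", "name", "doctor",
    {"Entity Code": "human", "Role Code": "doctor"},
)
operation_doctor_name = constrain(
    "Entity name", "name", "operation doctor",
    {"Entity Code": "human", "Role Code": "doctor",
     "Participation Code Type": "operator"},
)
print(doctor_name.name)
print(operation_doctor_name.name)
```

Rejecting terms that are absent from a vocabulary reflects the idea that localization follows a unified rule: a domain DE can only be formed from terms the controlled vocabularies permit.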

In total, domain DEs are standardized through 22 metadata items, including 14 data element attributes and 6 value domain attributes, all of which are taken from the ISO/IEC 11179 model. Among them, the metadata item “Metadata Reference” can be related to the NHDD, and “Relation Type” can be constrained to a class in the HCDM. Value domain attributes indicate the relationship between domain DEs and controlled vocabularies.

The relationships of HCDM, initial DEs, general DEs and domain DEs are shown in Table 7.

Table 7 The process of forming initial data elements, general data elements and domain data elements in the class “Entity”

The web-based system for HCDM

Based on the HCDM and NHDD, a web-based system (http://222.249.173.28:38646/STDWEB/) was developed to facilitate centralized management of healthcare metadata. The main functions of the system include data element management (input, search, browse, edit, etc. for data elements and other metadata items such as data element concepts, value domains and data sets), import and export of DEs and data sets (Excel, Word, PDF and XML formats), and system maintenance. Users can be authorized to browse or edit the content of the system. A user who needs to add a new metadata item or update an existing one must first apply for user permission, and the added or updated metadata must be inspected and approved by an authorized organization before publication.
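The permission and approval rule can be sketched as a small Python class. This is only an illustration of the described workflow: the class, method names, user names and item identifier are hypothetical and do not reflect the internals of the actual system.

```python
class MetadataRegistry:
    """Hypothetical registry: edits stay pending until an approver publishes them."""

    def __init__(self, editors, approvers):
        self.editors = set(editors)      # users granted edit permission
        self.approvers = set(approvers)  # authorized reviewing organizations
        self.pending = {}
        self.published = {}

    def submit(self, user, item_id, content):
        """Add or update a metadata item; requires edit permission."""
        if user not in self.editors:
            raise PermissionError(f"{user} has no edit permission")
        self.pending[item_id] = content

    def approve(self, reviewer, item_id):
        """Publish a pending item after inspection by an authorized reviewer."""
        if reviewer not in self.approvers:
            raise PermissionError(f"{reviewer} is not an authorized approver")
        self.published[item_id] = self.pending.pop(item_id)


registry = MetadataRegistry(editors={"alice"}, approvers={"standards_office"})
registry.submit("alice", "DE-001", {"name": "person's address", "data_type": "AD"})
print("published before approval:", "DE-001" in registry.published)
registry.approve("standards_office", "DE-001")
print("published after approval:", "DE-001" in registry.published)
```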

The system was built on a cloud architecture using Java 2 Platform, Enterprise Edition (J2EE). It supports cross-platform, cross-region and cross-network access, and supports standards such as the Simple Object Access Protocol (SOAP), eXtensible Markup Language (XML) and those of the Workflow Management Coalition. A distributed transaction processing mechanism was adopted to ensure high consistency of distributed transactions and information, and to prevent data inconsistency caused by partial server or network failures at runtime.

The HCDM, data elements and value domains are connected through web links in the system. The value sets of general DEs are linked to the classification scheme, which contains the value codes of general DEs and domain DEs. Figure 4 shows the display interface for initial DEs in the system, including each DE’s Chinese name, English name, data type and edit function. The input and revise interface for domain DEs is shown in Fig. 5. For instance, by constraining “entity” and “role” (from the HCDM) to “person” and “patient” (from the controlled vocabularies), the general DE “person’s marital status code” is constrained to the domain DE “patient’s marital status code”.

Fig. 4

A display interface of initial DE in the system, including initial DE’s Chinese name, English name, data type, edit and delete function

Fig. 5

The input and revise interface of domain DE in the system. Domain DEs are standardized through 22 metadata descriptions, including 14 data element attributes and 6 value code attributes. Among them, data element attributes reflect relationships among domain DEs, HCDM and NHDD. Value domain attributes reflect the relationship between domain DEs and controlled vocabularies
