CIDO ontology updates and secondary analysis of host responses to COVID-19 infection based on ImmPort reports and literature

ImmPort data exploration on the basis of existing CIDO development

Our study follows the recursive XOD strategy (Fig. 1): the XOD development pipeline is applied repeatedly to incorporate and integrate new knowledge and data into CIDO. Any added information should be semantically aligned with the existing ontology structure.

Figure 2 illustrates how CIDO has been developed and how ImmPort data can fit into the existing CIDO structure. A major task addressed in this study was to use CIDO as the basis, add new knowledge learned from the ImmPort COVID-19 studies (Table 1) to the current version of CIDO, and then perform secondary analysis to identify new scientific insights about host-SARS-CoV-2 interactions.

Fig. 2

CIDO representation of host-coronavirus interactions and the addition of new knowledge learned from papers of ImmPort studies to CIDO. ImmPort studies are labeled with their study IDs (e.g., SDY1667). The highlighted terms are newly added to incorporate the knowledge learned from the corresponding ImmPort studies. This figure shows how CIDO can be updated through our ontology reinforcement strategy

In our study, CIDO is used as a foundation and platform for the semantic representation of host responses to COVID-19 infection. CIDO provides basic information about host-coronavirus interactions. Overall, SARS-CoV-2 viral processes include viral binding to the host cell, entry into the cell, viral genetic replication, assembly of the virion, and release of the virion. CIDO also includes terms and axioms about immune responses to SARS-CoV-2, and its modeling has been expanded to account for the unique changes brought about by a pandemic. A portion of this information is represented in Fig. 2, which shows parts of the SARS-CoV-2 life cycle including viral invasion of the host cell, viral replication, and viral shedding from the cell.

CIDO representation of S protein focused coronavirus invasion, host response, and viral mutation

Modeling of viral invasion and host immune response by S protein

The S protein plays a key role in COVID-19 viral infection and disease pathology. Fig. 3a illustrates how the CIDO ontology represents various coronaviral processes on the surface of and inside the host, such as the main steps of viral infection, reproduction, and shedding (i.e., viral release out of host cell). The process of ‘SARS-CoV-2 S binding to human ACE2’ is defined with the following axioms:

‘has participant’ some ‘ACE2 (human)’

‘has participant’ some ‘S protein of SARS-CoV-2’

‘part of’ some ‘SARS-CoV-2 entry to human cell’

‘SARS-CoV-2 binding to human cell’

‘SARS-CoV-2 S-ACE2 binding’

Fig. 3

Ontological representation of S protein processes and variations. a CIDO ontological representation of the viral life cycle in the host (cell invasion, genetic replication, and release from the cell to invade other cells) of virulent SARS-CoV-2, together with the CIDO ontological representation of coronaviral molecular processes. The bottom right of the screenshot shows the axioms for SARS-CoV-2 S binding to human ACE2. b Seven SARS-CoV-2 clades are defined, each with its own specific definition. For example, SARS-CoV-2 clade G virus has AA variant S-D614G, which is a variant of the S protein of SARS-CoV-2. c CIDO representation of SARS-CoV-2 virus variants, such as the SARS-CoV-2 Delta variant, based on the WHO classification

Such S-ACE2 binding is critical not only to viral invasion but also to the manifestation of many COVID-19 phenotypes, such as pneumonia, hypertension, heart disease, and acute kidney injury [18, 19].

Additionally, the existence of different clades raises a concern about unique epitopes. A representative study showing their significance for ontology modeling is the ImmPort study SDY1667 [20], which identified 142 SARS-CoV-2 T-cell epitopes that are homologous between SARS-CoV-2 and multiple common cold human coronaviruses. Homologous epitopes are defined as any two epitopes, A and B, that exhibit sufficient homology such that, when A elicits a host immune response and becomes part of the host immune memory, the A-specific memory will also recognize B. The Immune Epitope Database (IEDB) [21] has collected over 2000 known SARS-CoV-2-specific T or B cell epitopes, and the numbers are updated every week. The consensus is that IEDB is the proper database to store and maintain these epitopes, and that it is inappropriate for CIDO to record all of them. Instead of listing all individual epitopes identified in the ImmPort study (SDY1667) [20], we propose an ontology design pattern that represents the relation between two proteins that have epitope cross-reactivity. An example of such a representation is shown below:

‘spike glycoprotein (SARS-CoV-2)’: ‘has epitope cross-reactivity’ some ‘spike glycoprotein (HCoV-OC43)’

CIDO modeling of S protein mutation to avoid active host response

As a virus, SARS-CoV-2 is by its nature under selective pressure, which results in new variations within the viral proteins. One example is the B.1.1.7 clade variant that emerged in the UK [22], for which there is already early evidence of increased transmissibility and potentially higher lethality [23]. While ontological representations already exist for immune responses to proteins in specific species, the representation of notable protein mutations has not previously been implemented. In Fig. 3a, we show how this is modeled. Each mutant is identified by the protein name followed by a dash and the type of mutation. These proteins are identified at the individual amino acid level, as shown below.

‘S-D614G’: ‘variant of’ some ‘spike glycoprotein (SARS-CoV-2)’

S-D614G is interpreted as the S protein with a missense mutation that causes the 614th amino acid, aspartic acid (D), to become glycine (G). CIDO has incorporated this convention to provide a standard set of annotations. A virus variant may have multiple mutations (Fig. 3b), which can also be systematically represented with an axiom, as exemplified below:

‘SARS-CoV-2 GRY (B.1.1.7)’: ‘SARS-CoV-2 clade GR virus’ and (‘has AA variant’ some (S-H69del and S-V70del and S-Y144del and S-N501Y and N-G204R))
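The naming convention described above (protein name, dash, then the amino acid change) is regular enough to be read by a program. The following is a minimal Python sketch, not part of CIDO itself: the `AAVariant` type, the `parse_variant_label` function, and the regular expression are hypothetical names and choices introduced here for illustration, and the pattern assumes that only single-residue substitutions and `del` deletions occur, as in the labels quoted in the text.

```python
import re
from typing import NamedTuple, Optional


class AAVariant(NamedTuple):
    """Hypothetical container for one amino acid variant label."""
    protein: str                 # protein symbol before the dash, e.g. "S" or "N"
    reference: str               # reference amino acid (one-letter code)
    position: int                # 1-based residue position
    replacement: Optional[str]   # new amino acid, or None for a deletion


# Assumed label shape: <protein>-<reference aa><position><new aa | "del">
_LABEL = re.compile(r"^([A-Za-z0-9]+)-([A-Z])(\d+)([A-Z]|del)$")


def parse_variant_label(label: str) -> AAVariant:
    """Split a label such as 'S-D614G' or 'S-H69del' into its parts."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"unrecognized variant label: {label!r}")
    protein, reference, position, change = match.groups()
    replacement = None if change == "del" else change
    return AAVariant(protein, reference, int(position), replacement)


# The variants listed in the B.1.1.7 axiom quoted in the text.
b117_labels = ["S-H69del", "S-V70del", "S-Y144del", "S-N501Y", "N-G204R"]
b117_variants = [parse_variant_label(label) for label in b117_labels]

# Positions of the changes that fall in the S protein.
spike_positions = [v.position for v in b117_variants if v.protein == "S"]
```

Parsing the labels this way keeps the protein, position, and change as separate fields, so variants can be grouped by protein or compared by position without string matching.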

One of the challenges for CIDO has been the ontological classification of newly emerged SARS-CoV-2 lineages and strains. Multiple naming schemas exist, each with different criteria for its categories. The World Health Organization (WHO) has designated certain SARS-CoV-2 variants as either variants of concern or variants of interest and named them with Greek letters such as Alpha, Beta, Gamma, and Delta (Fig. 3c). For example, the SARS-CoV-2 Delta variant is defined with an equivalence axiom:

‘SARS-CoV-2 Delta variant’: ‘SARS-CoV-2 B.1.617.2 virus’ or (‘Severe acute respiratory syndrome coronavirus 2’ and (‘derives from’ some ‘SARS-CoV-2 B.1.617.2 virus’))

Here ‘SARS-CoV-2 B.1.617.2 virus’ is a variant classification based on Phylogenetic Assignment of Named Global Outbreak Lineages (PANGO) [24]. The relation ‘derives from’ indicates that the Delta variant includes the SARS-CoV-2 B.1.617.2 virus and any other viral variant derived from it. In addition, CIDO also represents the variant classification assigned by the Global Initiative on Sharing All Influenza Data (GISAID) [25] (Fig. 3c).
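The logic of this equivalence axiom can be sketched outside OWL as a simple membership check. The Python below is only an illustration under stated assumptions: the `DERIVES_FROM` table and the `is_delta_variant` function are hypothetical and are not taken from CIDO, the lineage links are written by hand for the example, and the sketch follows ‘derives from’ links transitively, which is an assumption about how the relation is meant to be applied.

```python
from typing import Dict

DELTA_ROOT = "B.1.617.2"

# Illustrative 'derives from' links (child lineage -> parent lineage).
# Hand-written for this example; not extracted from CIDO or PANGO.
DERIVES_FROM: Dict[str, str] = {
    "B.1.617.2": "B.1.617",
    "AY.4": "B.1.617.2",
    "AY.4.2": "AY.4",
    "B.1.1.7": "B.1.1",
}


def is_delta_variant(lineage: str, derives_from: Dict[str, str] = DERIVES_FROM) -> bool:
    """Return True if the lineage is B.1.617.2 or derives from it.

    Mirrors the two branches of the axiom: the lineage is either the
    B.1.617.2 virus itself, or a virus that 'derives from' it.
    """
    current = lineage
    seen = set()  # guards against accidental cycles in the table
    while current is not None and current not in seen:
        if current == DELTA_ROOT:
            return True
        seen.add(current)
        current = derives_from.get(current)
    return False


delta_flags = {name: is_delta_variant(name) for name in ["B.1.617.2", "AY.4.2", "B.1.1.7"]}
```

In an OWL reasoner the same classification would be inferred from the axiom rather than computed by an explicit loop; the sketch only makes the two branches of the definition visible.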

CIDO representation of RAS related drug interruption for treating COVID-19

The ImmPort study (SDY1641) investigated the roles of renin-angiotensin system (RAS) inhibitors, including angiotensin-converting enzyme inhibitors and angiotensin II receptor blockers, in treating COVID-19 patients with hypertension [26]. Patients treated with angiotensin-converting enzyme (ACE) inhibitors or angiotensin II receptor blockers had a lower rate of severe disease and lower IL-6 levels in peripheral blood. ACE inhibitor or angiotensin II receptor blocker therapy also increased the CD3 and CD8 T cell counts in peripheral blood and decreased the peak viral load compared to other antihypertensive drugs [26].

RAS is also closely associated with the coronavirus S protein, since the S protein binds to host angiotensin-converting enzyme 2 (ACE2), a key RAS component. Binding between the S glycoprotein and ACE2 needs to be activated by TMPRSS2, a host cell serine protease [9] (Fig. 2). Such binding leads to the subsequent downregulation of ACE2 [27, 28]. Angiotensin-converting enzyme inhibitors inhibit the activity of ACE, an important RAS component that converts angiotensin I to angiotensin II. Therefore, angiotensin-converting enzyme inhibitors decrease the formation of angiotensin II, a vasoconstrictor. Angiotensin II receptor blockers bind to and inhibit the angiotensin II receptor type 1 (AT1), a receptor with a vasoconstrictive role. By preventing angiotensin II from binding to and activating AT1, angiotensin II receptor blockers help treat hypertension [29].

We have modeled the above RAS-related process as shown in Fig. 4. Note that many terms and axioms were already represented in our previous CIDO modeling. To add the new results obtained from the ImmPort study (SDY1641), multiple new terms and axioms were added, shown as bold terms in Fig. 4. In our CIDO modeling, we defined many roles, such as ‘ACE inhibitor role’, ‘angiotensin II receptor blocker role’, ‘vasoconstrictor role’, and ‘vasodilator role’. These roles can then be used to annotate different drugs or molecules, for example:

perindopril: ‘has role’ some ‘ACE inhibitor role’

nifedipine: ‘has role’ some ‘angiotensin II receptor blocker role’

Fig. 4

Ontological representation of RAS pathway and drug roles. The bold text represents newly added terms from the ImmPort-focused data annotations

By doing so, the biological relevance of the drugs and molecules can be clearly noted and understood by humans and computers.
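To illustrate what it means for such role annotations to be understood by computers, the short Python sketch below stores annotations as subject, relation, object triples and groups entities by role. It is an assumption-laden toy, not CIDO's OWL implementation: the `ANNOTATIONS` list and the `entities_by_role` function are hypothetical, the perindopril triple is the axiom quoted in the text, and the angiotensin II triple is an illustrative addition based on the prose description of angiotensin II as a vasoconstrictor.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]

ANNOTATIONS: List[Triple] = [
    # Axiom quoted in the text.
    ("perindopril", "has role", "ACE inhibitor role"),
    # Illustrative only: inferred from the prose, not a quoted CIDO axiom.
    ("angiotensin II", "has role", "vasoconstrictor role"),
]


def entities_by_role(triples: List[Triple]) -> Dict[str, Set[str]]:
    """Group annotated entities under each role they are asserted to have."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for subject, relation, obj in triples:
        if relation == "has role":
            index[obj].add(subject)
    return dict(index)


ROLE_INDEX = entities_by_role(ANNOTATIONS)
ace_inhibitors = ROLE_INDEX.get("ACE inhibitor role", set())
```

A query such as "which annotated entities are ACE inhibitors" then becomes a dictionary lookup, which is the kind of question a reasoner can answer directly from the role axioms.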

CIDO representation of host immune markers between immune profiles and covariates that correlate with COVID-19 outcomes

Many host immune markers correlate with COVID-19 outcomes. Figure 5a shows the general pattern of gene expression in an ontological representation of genes (including gene markers) that are susceptible to up-regulation under a specific condition such as SARS-CoV-2 infection.

Two ImmPort publications from two studies report host immune markers that correlate with COVID-19 outcomes: one introduces an inflammatory cytokine signature that predicts COVID-19 severity and survival (ImmPort Study SDY1662) [30], and the other introduces many more immune signatures associated with severe COVID-19 (ImmPort Study SDY1665) [31]. The first paper [30] demonstrates that IL-6 and TNF-alpha are both strong independent predictors of disease severity and death, with IL-18 also serving as a strong, but not independent, predictor. Elevated IL-6 levels are associated with cytokine release syndrome (CRS), a condition that SARS-CoV-2 infection also causes [32]. The second paper [31] generated an immune profile by analyzing the immune responses in 113 patients with moderate or severe COVID-19, uncovering an overall increase in innate cell types and a concomitant reduction in T cell number. Severe COVID-19 was found to be associated with the elevation of cytokines and immune pathways associated with type 1 (antiviral), type 2 (anti-helminth), and type 3 (antifungal) responses, and with higher levels of growth factors, type 1/2/3 cytokines, and chemokines. In contrast, patients with moderate COVID-19 showed a progressive reduction in type 1 (antiviral) and type 3 (antifungal) responses after an early increase in cytokines, along with an enrichment of growth factors [31].

The initial immune signature of IL-6 for COVID-19 disease pathology has been further investigated and associated with other pathologies. For example, COVID-19 is linked to cytokine release syndrome (CRS), and the pathogenesis of CRS is associated with IL-6-mediated production of hyperinflammatory cytokines and plasminogen activator inhibitor-1 (PAI-1) [32]. Inhibition of IL-6 signaling using tocilizumab decreased PAI-1 production and alleviated the clinical symptoms in severe COVID-19 patients [32]. However, Kang et al. [32] also show that, while still elevated compared to healthy controls, IL-6, IL-8, and MCP-1 are lower than in other CRS diseases. In children, three cytokines were increased: interferon (IFN)-γ-induced protein 10 (IP10), interleukin (IL)-10, and IL-16 [33].

To model these results, we implemented new classes for biomarker and immune signature. A biomarker is a material entity whose change in expression is associated with a specific response to some specific biological process. An immune signature is a biomarker for some specific disease process. We also included new object relations to model how these markers differ across SARS-CoV-2 disease processes, for example:

IL-6: ‘up-expressed as immune signature of’ some (‘severe COVID-19 disease’ and ‘death stage’)

However, these immune markers and profiles also depend on host qualities. Figure 5a shows that different qualities, such as biological sex (F/M), age, and comorbidities, will affect disease outcomes. Figure 5b provides a CIDO representation of gene expression patterns in SARS-CoV-2 infected patients. Here we focus on the sex comparison as an example to illustrate the effect of biological sex on the disease outcome.

Fig. 5

Ontological representation of gene signature and quality-based immune responses. a General pattern of gene expression patterns in human. b CIDO representation of gene expression patterns in SARS-CoV-2 infected patients

Increasing evidence shows that male sex is a risk factor for a more severe COVID-19 disease outcome [12]. In one of the early studies with data from Wuhan, China, 12.8% (11/86) of male COVID-19 patients died, compared with 7.3% (6/82) of female patients [34]. A cohort study of 17 million adults in England reported a strong association between male sex and risk of COVID-19-related death [35]. Globally, approximately 60% of COVID-19 associated deaths are reported in men [36].
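The percentages quoted for the Wuhan data can be checked with a few lines of arithmetic. The Python sketch below recomputes them from the counts given in the text; the `death_proportion` helper is a hypothetical name, and the crude ratio of the two proportions is computed here only for illustration. It is unadjusted for age or comorbidities and is not a figure reported in the cited study.

```python
def death_proportion(deaths: int, patients: int) -> float:
    """Fraction of patients who died."""
    return deaths / patients


# Counts quoted in the text for the Wuhan cohort [34].
male_proportion = death_proportion(11, 86)
female_proportion = death_proportion(6, 82)

# Crude (unadjusted) ratio of the two proportions; illustrative only.
crude_ratio = male_proportion / female_proportion

male_percent = round(male_proportion * 100, 1)
female_percent = round(female_proportion * 100, 1)
```

The rounded percentages agree with the 12.8% and 7.3% stated in the text.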

In CIDO, we represent the higher susceptibility of males to death using the following axiom:

‘male infected with SARS-CoV-2’: ‘has increased susceptibility compared to female to’ some ‘death stage’

This raises an important question about the molecular mechanisms underlying this sex difference and prompted further investigation through secondary analysis of the ImmPort studies. A total of 11 genes from Takahashi et al. [12] were collected and, for each sex, compared between patients and health care workers after correction for age and body mass index (Table 2). From this gene list, males and females showed statistically significant increases in 7 and 10 genes, respectively. To represent these differences between individuals (sex, exposure), we added new CIDO terms, as illustrated below (Fig. 6a).

‘symptomatic human male infected by SARS-CoV-2’: ‘organism susceptibly has up-regulated gene’ some ‘CCL4’

Such modeling allows us to perform semantic queries, as exemplified in Fig. 6c. In this example, we used a DL query to identify the up-regulated genes that are shared by male and female patients with COVID-19 (Fig. 6b).

Fig. 6

Sex differences in gene ontology and DL query. a CIDO ontological representation of sex-based immune response for SARS-CoV-2. The genes listed are chosen from the results in Table 2. b DL query whose inferred results provide a list of genes shared between males and females
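The idea behind the shared-gene query can be sketched as a set intersection. The Python below is a simplified stand-in for the DL query, not the query itself: the `UP_REGULATED` table and the `shared_up_regulated` function are hypothetical, the female class label is assumed by analogy with the male class quoted in the text, and every gene name except CCL4 is a placeholder rather than an entry from Table 2.

```python
from typing import Dict, Set

MALE = "symptomatic human male infected by SARS-CoV-2"
# Assumed label, formed by analogy with the male class quoted in the text.
FEMALE = "symptomatic human female infected by SARS-CoV-2"

# 'organism susceptibly has up-regulated gene' annotations per class.
# CCL4 for males comes from the quoted axiom; GENE_A, GENE_B, and GENE_C
# are placeholders, not genes from Table 2.
UP_REGULATED: Dict[str, Set[str]] = {
    MALE: {"CCL4", "GENE_A", "GENE_B"},
    FEMALE: {"GENE_A", "GENE_B", "GENE_C"},
}


def shared_up_regulated(annotations: Dict[str, Set[str]], *classes: str) -> Set[str]:
    """Genes annotated as up-regulated in every one of the given classes."""
    gene_sets = [annotations[name] for name in classes]
    return set.intersection(*gene_sets) if gene_sets else set()


shared_genes = shared_up_regulated(UP_REGULATED, MALE, FEMALE)
male_only_genes = UP_REGULATED[MALE] - UP_REGULATED[FEMALE]
```

With the real Table 2 gene lists in place of the placeholders, the same intersection would give the shared genes that the DL query is described as retrieving.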

In addition to gene list identification and modeling, we performed a secondary data analysis on pathways. The gene lists were submitted to Reactome to identify the pathways in which they were enriched. For this analysis, we restricted the background to the 58 cytokines mapped from Takahashi et al. [12] (out of 61 assays) and found that IL-10 immune pathways were significant in both males and females (p values of 8.47E-5 and 3.56E-5, respectively) despite the differences in genes. The implications of this result are described in the Discussion section.
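Restricting the background matters because an over-representation p value depends on the size of the gene universe. The Python sketch below shows a textbook one-sided hypergeometric test under stated assumptions: it is not claimed to reproduce Reactome's exact computation or any multiple-testing correction, the `hypergeom_enrichment_p` function is a hypothetical helper, and the pathway size and overlap in the example are invented for illustration. Only the background of 58 and the count of 7 significant genes in males come from the text.

```python
from math import comb


def hypergeom_enrichment_p(background: int, pathway_size: int,
                           selected: int, overlap: int) -> float:
    """One-sided p value P(X >= overlap) for over-representation.

    background   : number of genes in the restricted universe
    pathway_size : genes in that universe annotated to the pathway
    selected     : size of the submitted gene list
    overlap      : submitted genes that fall in the pathway
    """
    upper = min(pathway_size, selected)
    total = comb(background, selected)
    tail = sum(
        comb(pathway_size, k) * comb(background - pathway_size, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Background of 58 and 7 significant genes (males) are from the text;
# pathway_size=10 and overlap=4 are hypothetical values for illustration.
example_p = hypergeom_enrichment_p(background=58, pathway_size=10,
                                   selected=7, overlap=4)
```

With a smaller background, the same overlap is less surprising than it would be against the whole genome, which is why the choice of the 58-cytokine background affects the reported significance.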
