An epigenome‐wide DNA methylation study of patients with COVID‐19

1 INTRODUCTION

Among the varieties of viruses, coronaviruses with the largest RNA genome have drawn much attention in light of the recent emergence of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) (Yang et al., 2020; Zhou et al., 2020). Several studies have described the clinical characteristics of COVID-19 (Chen, Zhou et al., 2020; Wang et al., 2020). Infected patients with SARS-CoV-2 have varied symptoms, from mild feverish symptoms to those with severe clinical outcomes (Yang et al., 2020). Additionally, it is worth noting that there were increased concentrations of proinflammatory cytokines (e.g., interferon γ [IFNγ] and interleukin 6 [IL-6]) in coronavirus disease 2019 (COVID-19) patients, and these were associated with extensive lung damage and disease severity (Huang et al., 2020). In recent studies, mild and severe COVID-19 patients were observed to have different inflammatory characteristics, and the disturbed immune responses were related to the severe manifestations of COVID-19 (Bost et al., 2020; Huang et al., 2020). Poorly understood interactions between SARS-CoV-2 and the host limit the development of efficient treatment strategies. A thorough understanding of the interactions between the host and virus during infections will likely save lives.

DNA methylation is a potent epigenetic mechanism that regulates gene expression without altering DNA sequences. It has been shown that the production rate of the ACE2 receptor by its gene, known as the most critical virus receptor for SARS-CoV-2, is under epigenetic control (Zill et al., 2012). Age correlates with hypomethylation of the ACE2 gene in lung tissue, which could, to some extent, explain why aging is a risk factor for SARS-CoV-2 fatality (Corley & Ndhlovu, 2020). Existing studies have revealed that DNA methylation is implicated in the pathophysiology of many different diseases including, virus infections (Silmon de Monerri & Kim, 2014; Chlamydas et al., 2020). For many years, it has been known that viral infections usually use epigenetic mechanisms to find ways to induce syncytium development (Pruimboom, 2020).

Moreover, virus-driven dysregulation of gene methylation levels could also help induce aberrant host immune responses and influence disease development (Kuss-Duerkop et al., 2018). As an individual risk factor for COVID-19, some tumors showed hypo-DNA methylation levels of ACE2, which may cause ACE2 to be highly expressed and increase COVID-19 susceptibility and severity (Chai et al., 2020). In patients with other comorbidities, epigenetic dysregulation, especially DNA methylation, may lead to an abnormal increase in the expression levels of genes and elicit more robust immune response to SARS-CoV-2 (Sawalha et al., 2020). The focus of this study was on the epigenetic events in the pathogenesis of SARS-CoV-2 infection. We aimed to shed light on the DNA methylation changes in healthy subjects and patients with different severities of COVID-19 to explore novel targets and effective methods for treating COVID-19.

2 MATERIALS AND METHODS 2.1 Clinical data

This study was approved by the ethics committee of Qilu Hospital of Shandong University (ethical approval no: KYLL-2020-GXB-001). After receiving written informed consent, 21 blood samples were obtained from seven healthy volunteers and 14 COVID-19 patients just before the discharge. These 21 subjects (12 male, ninr female) were from Shandong Province Chest Hospital (Jinan, China). The patients admitted to hospitals with confirmed SARS-CoV-2 infection were diagnosed with mild or severe COVID-19 between January 27, 2020, and May 18, 2020. All the subjects were biologically unrelated to each other. The severity of the disease was classified based on the “Guideline on the management of COVID-19″ published by the National Health Commission of the People's Republic of China (version 8.0). Mild patients had mild symptoms and no imaging observation of pneumonia. According to the diagnostic criteria for COVID-19 in adults “the New Coronavirus Pneumonia Prevention and Control Program” published by the National Health Commission of China, severe patients had one of the following characteristics: dyspnea with a respiratory rate ≥30 breaths per min, oxygen saturation ≤93%, or arterial blood oxygen partial pressure (PaO2)/oxygen concentration (FiO2) ≤40 kPa.

2.2 DNA extraction

Genomic DNA was extracted from 200 μl of whole blood using a QIAamp DNA Blood Mini Kit (Qiagen) following the manufacturer's instructions. A total of 200 μl of the blood sample and 20 μl of proteinase K were mixed in a 1.5-ml microcentrifuge tube. A total of 200 μl of buffer AL was added to the sample. The solution was mixed with pulse-vortexing for 15 s and then incubated at 56°C for 10 min. A total of 200 μl absolute ethanol was added and mixed again with pulse-vortexing for 15 s. After mixing, the 1.5-ml microcentrifuge tube was centrifuged to remove drops from the inside of the lid. Buffer AW1 (500 μl) was added and centrifuged at 8000 rpm for 1 min. Then the collection tube containing the filtrate was discarded. Buffer AW2 (500 μl) was added and centrifuged at 14,000 rpm for 3 min. The spin column was placed in a new collection tube. The tube was centrifuged at full speed for 1 min. The column was placed in a clean microcentrifuge tube, and the collection tube containing the filtrate was discarded. Buffer AE (200 μl) was then added. The solution was incubated at room temperature for 1 min, and then centrifuged at 8000 rpm for 1 min. All of the above steps were completed according to the guidelines of the Institute for Prevention and Control of Viral Infectious Diseases of Shandong Center for Disease Control and Prevention.

2.3 Detection of DNA methylation in the peripheral blood by microarray

In this study, the methylation-specific chip used was the Illumina Infinium Methylation EPIC 850 K BeadChip (Illumina). The applicable protocol for the 850 K BeadChip followed the manufacturer's guidelines from OE biotech Co., Ltd (Shanghai, China). After extracting DNA from the peripheral blood, gDNA was obtained by bisulfite treatment. After hybridization with each probe, the specific capture probe on the chip was used to combine with the complementary gene fragments. The original data were obtained by hybridization overnight followed by cleaning, extension, staining, and scanning. The degree of methylation was determined by the fluorescence intensity of the probes.

2.4 Screening differential genes based on bioinformatics analysis

The data expression profiles of COVID-19 patients were searched in the GEO database, and one microarray analysis of the whole genome transcriptome to peripheral blood sample taken from severe and mild COVID-19 patients, GSE164805, was selected. Differential mRNAs were screened and adjusted P value < 0.05 is considered as statistically significant.

2.5 Statistical analysis

The signal value was extracted from the original data using a genome studio (genomic software 2011.1; Illumina), and the resulting data were normalized by performing background correction using the methylation module. The diffscore value (diffscore = 10*sgn(βref – βcond)*log10(p)) of the experimental group was less than −13 or > 13; and delta beta > 0.17 or < −0.17, which indicated it was the differential methylation site. (Note: The calculation of the delta beta value is based on the control group and the experimental group. Results for the AVG beta difference; that is, the degree of methylation difference at each site between the experimental group and the normal control group; a negative value indicated that it is a low methylation site compared with the control group, otherwise it was a high methylation site; For a P-value of 0.05, diffscore = ±13.) Both Gene Ontology (GO) and the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed to clarify the biological and functional pathways of differentially expressed methylation loci. GO terms and KEGG with P-value < 0.05 were considered significantly enriched from our data. Cytoscape v3.0 software was used for constructing an interaction network, which was based on differentially methylated genes (DMGs). The matrix of gene expression values was visualized graphically. The gene-interaction relationship was represented by nodes and edges in the graph. In the signaling network, the color of the cycle was considered to be the frequency of the gene interaction. The most prominent central genes in the network indicated the genes with the highest frequency. The top genes which had the highest degree of nodes and hit more than five cytohubba algorithms among the DMGs were defined as hub genes.

3 RESULTS 3.1 Baseline data of patients and healthy volunteers

Data were collected from designated hospitals for COVID-19 treatment in Shandong Province, China. Eight patients (57%) were male. Among the three study groups, the sex proportion was similar. The average age for all subjects was 48.5 years (interquartile range, 42.0−54.0), ranging from 34 years to 88 years. Though severely ill patients had a relatively older median age [55.0 (48.0−57.5) years] than healthy volunteers [48.1 (43.5−52.0) years] and mild patients [42.4 (36.5−47.0) years], there was no significant statistical difference in the age among the three groups (Table 1). The average duration from the onset of symptoms to admission was 5.0 (0.5−9.5) days for mild patients, while that from admission to discharge was 15.1 (11.5−18.0) days. The time from admission to discharge was 36.9 (34.0−41.0) days for severely ill patients. Some laboratory indexes from COVID-19 patients on admission were also analyzed statistically.

TABLE 1. Demographics and baseline characteristics Control (n = 7) Mild (n = 7) Severe (n = 7) All (n = 21) P value Age 48.1(43.5–52.0) 42.4(36.5–47.0) 55(48.0–57.5) 48.5(42.0–54.0) 0.17 Sex Male 4(57%) 4(57%) 4(57%) 12(57%) .. Female 3(43%) 3(43%) 3(43%) 9(43%) .. Days from illness onset to admission .. 5.0(0.5–9.5) 3.4(2.0–4.5) .. 0.75a Days from admission to discharge .. 15.1(11.5–18.0) 36.9(34.0–41.0) .. <0.01a White blood cell count, × 10⁹ per L .. 6.1(4.5–7.2) 7.8(5.3–7.2) .. 0.28a Neutrophil count, × 10⁹ per L 4.1(2.9–5.1) 5.8(3.6–6.0) .. 0.11a Lymphocyte count, × 10⁹ per L .. 1.4(0.7–1.9) 0.9(0.7–1.0) .. 0.37a Current smoker 0 0 0 0 .. Cardiovascular diseases 0 0 0 0 .. Obesity 0 0 0 0 .. Data are shown as the average (interquartile range) or n (n/N%) where N is the total number of subjects with available data. P values comparing the ages of healthy controls, mild patients, and severely ill patients are from a one-way ANOVA. aP values comparing basic information and laboratory parameters at discharge of severe and mild patients are from a Mann–Whitney U test. 3.2 Identification of differential methylation

Following our screening method of differential methylation loci (the sample diffscore value was less than –13 or greater than 13, and delta beta was greater than 0.17 or less than –0.17), 214, 245, and 228 differentially methylated loci were identified in each group (Severe vs Control, Severe vs Mild and Mild vs Control, respectively). Among them, there were 114, 126, and 113 hypermethylation sites and 100, 119, and 115 low methylation sites. All delta beta values, including maximum and minimum values, are shown in Figure 1D as violin plots. We then compared the methylation status of different substructures, such as CpG islands, shores (the 2 kb flanking the islands), shelves (the 2 kb flanking the shores), and sea (regions outside the previous three categories) in our datasets (Figure 1). The results showed robust alteration in DNA methylation of open sea probes in all samples. Moreover, hypomethylated sites were preferentially distributed in shelves and shores, and hypermethylated sites were preferentially located in shores rather than islands and shelves (Figure 1).

image Distribution of differentially methylated sites. (A–C) Distribution of sites in CpG islands, shores, shelves, and sea. (D) Distribution of delta beta values in each group. (E–F) Sites relative to UCSC_RefSeq gene promoters, gene bodies, and intergenic regions. All the differentially methylated sites are referred to simply as “Total array” [Color figure can be viewed at wileyonlinelibrary.com] 3.3 Cluster analysis

Unsupervised cluster analysis clustered the samples into two groups (Figure 2). These results showed statistically significant differences in DNA methylation in the healthy controls and COVID-19 patients with different disease severities.

image Unsupervised hierarchical clustering. (A) Two different groups are represented: mild and control groups. (B) Two different groups are represented: severe and control groups. (C) Two different groups are represented: severe and mild groups. Each column represents an individual patient and each row represents an individual CpG. Z-scores of the β-values are shown in the heatmap [Color figure can be viewed at wileyonlinelibrary.com] 3.4 Intersection of two groups

To find the key “marker” genes in the process of SARS-CoV-2 infection, we carried out Venn gathering analysis. The distribution of the target genes in two groups (“Mild vs Control” and “Severe vs Control”) is shown in Figure 3. As shown, a total of 35 target genes were detected simultaneously in the two groups. In this context, 35 genes were identified as “markers” of SARS-CoV-2 infection based on gathering analysis between the “Severe versus Control” group and the “Mild versus Control” group (common genes are listed in Table 2).

image Venn diagram of two groups. The blue circle indicates datasets of Mild vs Control and yellow circle indicates datasets of Severe vs Control [Color figure can be viewed at wileyonlinelibrary.com] TABLE 2. “Markers” of SARS-CoV-2 infection Target ID Gene Methylation status UCSC ref gene group cg19080354 ATHL1 Hypomethylated TSS1500 cg22451923 BTNL2 Hypomethylated Body cg18698799 C6orf10 Hypermethylated Body cg12469381 CHN2 Hypomethylated TSS200; Body cg01825287 CHST15 Hypermethylated 5'UTR cg26440059 COL23A1 Hypermethylated Body cg11128983 CPLX2 Hypomethylated TSS1500; 5'UTR cg03165426 CRHR2 Hypermethylated Body;5'UTR cg00597445 CRMP1 Hypermethylated Body cg25538415 DCAKD Hypomethylated TSS1500;5'UTR cg09119656 EPAS1 Hypomethylated Body cg23836570 FBRSL1 Hypermethylated Body cg07733779 GNAI2 Hypermethylated TSS1500; 5'UTR; Body cg06096382 HECW1 Hypermethylated TSS1500 cg27287527 HYAL1 Hypermethylated TSS1500; 5'UTR cg20107632 JPH3 Hypermethylated Body cg10644916 KLHL17 Hypermethylated Body cg12033072 KLRG2 Hypomethylated Body cg20963263 MGMT Hypomethylated Body cg06875704 MIR510 Hypermethylated TSS1500 cg02745784 PDE11A Hypermethylated 5'UTR cg18467790 RADIL Hypomethylated Body cg03086067 SH2D4B Hypermethylated Body cg07404046 SIRPB1 Hypomethylated Body cg03959986 SMG6 Hypermethylated TSS1500; TSS200 cg07615678 SPON1 Hypermethylated Body 3.5 GO functional enrichment analysis of differential methylated genes

To address the functional significance of the widespread changes in DNA methylation, we performed a GO analysis of all the datasets. The top 20 biologically enriched results of the significantly enriched GO terms were highlighted (Supporting Information Table 1). Enrichment was seen in GO (Severe vs Control): peptide antigen binding (P = 1.5 × 10−5), neuromuscular process controlling balance (P = 1.06 × 10−4), integral component of the luminal side of endoplasmic reticulum membrane (P = 6.09 × 10−4); GO (Severe vs Mild): neuropeptide signaling pathway (P = 4.28 × 10−4), positive regulation of interleukin-13 secretion (P = 9.98 × 10−4), positive regulation of T-helper 2 cell differentiation (P = 1.592 × 10−3); GO (Mild vs Control): PDZ domain binding (P = 2.52 × 10−4), cytosol (P = 3.56 × 10−4), high-density lipoprotein particle (P = 4.47 × 10−4).

3.6 KEGG functional enrichment analysis of differential methylated genes

The results from KEGG functional enrichment analysis showed that there were 121 (Severe vs Control), 145 (Severe vs Mild), and 183 (Mild vs Control) signal pathways involved in the three groups of differential methylation genes (DMGs); 23, 9, and 21 of which met the screening criteria (P < 0.05) involving 14, 16, and 25 DMGs, respectively (Figure 4).

image Top 20 KEGG pathways of differential methylated genes in the three groups. (A–C) represents “Mild vs Control,” “Severe vs Control,” and “Severe vs Control,” respectively. Top 20 enriched KEGG pathways were filtered with enrichment factors. The size and color of the dots indicate the number of differential methylated genes hit and the p-value of the enriched KEGG pathway, respectively [Color figure can be viewed at wileyonlinelibrary.com] 3.7 Analysis of differential methylated sites in COVID-19 groups with different severities

We compared the methylation β-values of CpG sites between severe and mild samples using a t-test model. Figure 5 presents a general view of the differentially methylated loci in a single chromosome. The PPI network of DMGs in the Severe vs Mild group was constructed with Cytoscape v3.0 software (Figure 6). Hub genes (including GNG7, GNAZ, PRKCZ, and PRKAG2) had the highest degree of nodes and hit more than five cytoHubba algorithms among the DMGs (Supporting Information Table 2). In addition, DMGs involved in the progression of tumors, such as NMU (Chen et al., 2018), were found. Moreover, IL4R and IL17RA, reported to be associated with the pathogenesis of several inflammatory disorders, were also shown to be significant in the PPI network (O'Connor et al., 2009; Massoud et al., 2016). The genes that have up-regulated or down-regulated methylation levels had complex interactions with each other in the signaling network (Figure 6).

image Circos plot of genome-wide DNA methylation changes between the severe and mild group. Red dots denote hypermethylated sites (β > 0.17), green dots denote hypomethylated sites (β > −0.17), and gray dots denote sites |β| wileyonlinelibrary.com] image PPI analysis. Protein interactions between the differential methylated genes. Darker colors indicate higher scores that represent more network involvement and significance [Color figure can be viewed at wileyonlinelibrary.com] 3.8 Combined methylation and transcriptome analysis to validate hub genes

Next, we analyzed a public database GSE689128 including ten COVID-19 patients, whose COVID-19 diagnostic and severity criteria was identical to us. |FC| > 1.5 and adjusted P value < 0.05 were considered as criteria to screen the differential expression (DE)-mRNAs between severe group and mild group. As shown in Supporting Information Figure S1, 14132 DE-mRNAs, including 8056 downregulated and 6076 upregulated mRNA, were identified in the GSE164805 profile. For four hub genes, we found that the expression of GNAS gene (adjusted P value < 0.01) and GNG7 gene (adjusted P value = 0.026) were both down-regulated in the severe group, and the mRNA expression of the other two hub genes had no significant difference between the two groups (adjusted P value > 0.05) (Table 3).

TABLE 3. mRNA expression analysis of four hub genes (severe vs mild) Gene Delta Beta DiffScore Fold change P value adjusted P value symbol (methylation difference) (methylation difference) (mRNA expression) (mRNA expression) (mRNA expression) GNAS −0.23643 −14.44971 −2.6818 0.00033 0.00585 GNG7 −0.36106 −15.92767 −2.02017 0.00446 0.02627 PRKAG2 −0.26721 −15.82896 1.61695 0.02680 0.08674 PRKCZ 0.28300 15.68705 1.33840 0.36300 0.49675 4 DISCUSSION

At present, there is a lack of research that provides evidence of the COVID-19 pathogenesis pathway by epigenetic regulation. Epigenetic disorders might favor viral infection and provide useful biomarkers to stratify disease severity in patients more prone to disseminating SARS-CoV-2 infection (Crimi et al., 2020). In this study, the Illumina 850k methylation chip was used to detect the changes of DNA methylation expression in whole blood samples from a cohort of COVID-19 patients and subjects with no clinically detectable SARS-CoV-2 infection. This study laid a foundation for finding novel SARS-CoV-2 infection “markers” and therapeutic targets. The hub genes and signal pathways identified also point to the need to study the COVID-19 regulation mechanism.

Through the Venn diagram analysis, we combined the DMGs of the “Severe versus Control” and “Mild versus Control” groups and found 35 “marker” genes that acted as candidate genes that may play an important role in indicating SARS-CoV-2 infection. Previous studies have shown that the methylation status in the promoter region (including 5′UTR, TSS200, and TSS1500) of genes can regulate gene expression, and studies based on DNA methylation have also focused on these regions (Bayarsaihan, 2011; Bierne et al., 2012). In these 35 candidate genes, differential methylation sites of 12 genes (ATHL1, CHN2, CHST15, CPLX2, CRHR2, DCAKD, GNAI2, HECW1, HYAL1, MIR510, PDE11A, and SMG6) were situated in the promoter region. Among them, the molecular functions of ATHL1, CHN2, DCAKD, MIR510, and HECW1 are largely unexplored. Emerging evidence suggests that CHST15 involves multiple tumor progression and metastasis (Liu et al., 2019). CPLX2 regulates the formation of synaptic vesicles and acts as an inhibitor of secretion of some autoimmune antibodies, thus it may participate in the pathogenesis of some related autoimmune diseases (Tsuru et al., 2019). Both CRHR2 and PDE11A play roles in signal transduction by regulating the intracellular concentration of cyclic nucleotide cAMP and are reported to be associated with many human cancers (Gu et al., 2010; Koch, 2011). GNAI2 can be upregulated by NF-κB activation and is associated with spontaneous colitis and colitis-associated cancer (Li et al., 2019). HYAL1 catalyzes the hydrolyzation of 20 kDa HA into low-molecular-weight HA fragments with pathophysiological effects including angiogenic, inflammatory, and immunosuppressive effects. In a recent study, researchers revealed overexpression of HYAL1 could prevent fibroblast fibroproliferation. They also showed significant downregulation of HYAL1 in idiopathic pulmonary fibrosis lung tissues (Leng et al., 2020). As an RNA surveillance protein, SMG6 could affect HIV-1 gene expression via the influence of viral reactivation at a posttranscriptional level (Rao et al., 2018). It has been suggested that elucidation of the role of SMG6 may aid in explaining the maintenance of SARS-CoV-2 persistence. In addition to these 12 genes, some other genes whose differential methylation sites were identified in the gene body may also play crucial roles, such as BTNL2, which is involved in multiple inflammatory and autoimmune diseases (Zhao et al., 2020). Whether these DMGs, can be used as new biomarkers of SARS-CoV-2 infection need further study. The function of these DMGs and their role in SARS-CoV-2 infection need to be explored. Further study may provide new direction and strategies for specific treatments and early diagnosis of COVID-19.

Function enrichment and pathway enrichment analyses were conducted in three groups. Among these, the results regarding the “Severe versus Mild” group may be of great value. GO pathway analysis showed that, in the “Severe versus Mild” group, the functions of the DMGs were mainly focused on neuropeptides, interleukin-13, chemokine secretion, T-helper 2 cell differentiation, and mast cell degranulation, indicating that the activation of inflammatory and allergic reactions are closely related to the progression of COVID-19. In the KEGG analysis, DMGs were involved in the sphingolipid signaling pathway, hippo signaling pathway, phospholipase D signaling pathway, hypertrophic cardiomyopathy (HCM), oxytocin signaling pathway, dilated cardiomyopathy, type I diabetes mellitus, and serotonergic synapses. It has been suggested that the occurrence and development of COVID-19 is caused by the joint action of multiple molecular biological signal pathways. Sphingosine has been confirmed to prevent the binding of SARS-CoV-2 spikes to its cellular receptor ACE2 (Edwards et al., 2020). Its derivatives also act as a selective and potent candidate for COVID-19 treatment (Dave et al., 2020). In addition, sphingosine 1-phosphate mediated signaling was suggested to be a potential target for ameliorating neurological symptoms induced by SARS-CoV-2 (Meacci et al., 2020). In a recent study, oxytocin was shown to have anti-inflammatory and proimmune functions, and they act as a potential defense against COVID-19 (Thakur et al., 2020). As for HCM, Bos et al. (2020) extracted RNA from cardiac tissue of patients with HCM and healthy donor hearts. Their results showed a 5-fold increase in ACE2 protein in patients with HCM compared to healthy donors, which may lead to an increased risk for COVID-19 outcomes. In line with the study by Bos et al., a study by Chen, Li et al. yielded similar results. They found patients with basic heart failure disease (including HCM and dilated cardiomyopathy) exhibited increased expression of ACE2, and might have a high possibility of heart attack and progression to a severe condition after SARS-CoV-2 infection (Chen, Li et al., 2020). However, the mechanisms of hippo signaling, phospholipase D, and serotonergic synapse sig

留言 (0)

沒有登入
gif