Whole blood transcriptome signature predicts severe forms of COVID-19: Results from the COVIDeF cohort study

Cohort presentation

A total of 159 samples, collected from patients enrolled during the first COVID-19 outbreak (April-June 2020) and presenting mild pneumonia at the moment of first clinical evaluation, were analysed. They included 115 patients with COVID-19 diagnosis and 44 patients with pneumonia not related to COVID-19. Among the 115 patients diagnosed with COVID-19, 11 evolved towards severe pneumonia, 51 towards intermediate pneumonia and 53 towards mild pneumonia (Supplementary Table S1).

In line with established risk factors (Li et al. 2021; Bergantini et al. 2021), the risk of evolution towards severe pneumonia was associated with age, diabetes, temperature, C-reactive protein (CRP), procalcitonin, fibrinogen, neutrophils and lymphocytes at inclusion, oxygen saturation and CT scan findings (Table 1).

Table 1 Cohort presentation, descriptive statistics and group comparisons. *Comparison of COVID-19 patients and a group of controls presenting non-COVID-19 pneumonia, using Wilcoxon and Fisher’s tests for quantitative and qualitative variable comparisons, respectively; **Comparison of COVID-19 patients depending on their outcome (towards mild, moderate or severe pneumonia), using Kruskal–Wallis and Fisher’s tests for quantitative and qualitative variable comparisons, respectively. §Neutrophil and lymphocyte scores, reflecting the scaled relative fractions inferred from transcriptome data. #For hospitalisation duration, patients with fatal outcome were excluded. BMI body mass index, CRP C-reactive protein, CT scan computed tomography scan, ECMO extracorporeal membrane oxygenation, HFNO high-flow nasal oxygen, IMV invasive mechanical ventilation, NIV non-invasive ventilation, PCR polymerase chain reaction. Median values are reported (ranges in brackets).

Unsupervised transcriptome-based classification of samples

Unsupervised principal component analysis of the whole transcriptome dataset discriminated patients with COVID-19 from controls (first principal component, PC1), and patients who would later develop severe COVID-19 pneumonia from those with mild or intermediate evolution (second principal component, PC2; Fig. 1a). Over-representation analysis of the genes contributing most to PC1 showed an enrichment in signalling pathways mainly related to neutrophil activation, while the 100 genes contributing most to PC2 were enriched in signalling pathways related to the immune response to virus infection, including complement activation, regulation of humoral immune response, response to type I interferon, and regulation of viral genome replication (Supplementary Table S2). Consistently, the gene contributing most to PC1 was CD177, a marker of neutrophil activation. For PC2, the genes contributing most were IFI27, involved in the type I interferon cell response, and OTOF, both over-expressed in COVID-19 patients and associated with the severity of evolution (Fig. 1b).
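As an illustration of how the genes contributing most to a principal component can be read from the loadings, the following is a minimal sketch on synthetic data. It assumes a samples-by-genes expression matrix and uses a plain singular value decomposition; the function name `pca_top_genes`, the toy gene names and the simulated group effect are hypothetical and are not taken from the study pipeline.

```python
import numpy as np

def pca_top_genes(expr, gene_names, n_top=3):
    """PCA on a samples x genes matrix; returns sample scores and the
    genes with the largest absolute loadings on PC1 and PC2."""
    centered = expr - expr.mean(axis=0)            # centre each gene
    # Rows of vt are the principal axes, i.e. the gene loadings per component
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u * s                                 # sample projections on each PC
    top = {}
    for pc in range(2):
        order = np.argsort(-np.abs(vt[pc]))[:n_top]
        top[f"PC{pc + 1}"] = [gene_names[i] for i in order]
    return scores, top

# Synthetic toy data (assumption: 40 samples, 6 genes, two groups of 20)
rng = np.random.default_rng(0)
n_samples, n_genes = 40, 6
genes = [f"gene_{i}" for i in range(n_genes)]
expr = rng.normal(size=(n_samples, n_genes))
group = np.repeat([0, 1], n_samples // 2)
expr[:, 2] += 10 * group                           # gene_2 carries a strong group effect

scores, top = pca_top_genes(expr, genes)
print(top["PC1"])                                  # gene_2 is expected to rank first
```

In this toy setting the gene carrying the group effect dominates PC1, which mirrors the reasoning used above to link PC1 to CD177 and PC2 to IFI27 and OTOF.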

Fig. 1

Global blood transcriptome discriminates patients depending on the type of pneumonia and the evolution of COVID-19 pneumonia. a) Sample projections based on the combination of the first two principal components (PC1, PC2) of unsupervised PCA performed on the whole dataset (n = 16001 genes, n = 159 samples). The centre of each group is indicated by the larger circles. b) Boxplot of CD177, IFI27 and OTOF gene expression in the different analysis groups. *Student’s t-test p-value < 0.05; **Student’s t-test p-value < 0.001; ***Student’s t-test p-value < 10e-6

Considering the variability possibly due to blood cell composition, we inferred the score of different blood cell subtypes for each sample (Supplementary Table S3), then evaluated the impact of cell composition on sample classification. Globally, compared to controls, COVID-19 samples showed lower neutrophil scores and higher scores for lymphocytes and lymphocyte subtypes (Table 1, Supplementary Table S4). Indeed, an inverse correlation was observed between the neutrophil and global lymphocyte proportions, and in particular between neutrophils and the CD8+ T and resting memory CD4+ T lymphocyte subtypes (Supplementary Figure S1a). However, the variability due to the global blood formula alone could not properly discriminate COVID-19 patients in terms of pneumonia evolution (Supplementary Figure S1b).

Blood early transcriptome signature of COVID-19 pneumonia

By comparing COVID-19 samples (n = 115) to controls (n = 44), and after adjustment for age and blood cell composition, we identified 68 differentially expressed genes (Benjamini-Hochberg adjusted p-value < 0.05 and logFC > 1.5; Supplementary Table S5), mostly over-expressed in COVID-19 (n = 52/68). Gene ontology analysis of the genes over-expressed in the COVID-19 samples showed an enrichment in pathways related to virus response, mainly involving type I interferon signalling (Fig. 2a and b; Supplementary Figure S2; Supplementary Table S6).
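The selection rule quoted above (adjusted p-value < 0.05 and logFC > 1.5) can be sketched as follows. This is a minimal pure-Python illustration of the Benjamini-Hochberg step-up adjustment followed by a fold-change filter. It assumes that the logFC cut-off is applied to the absolute value, since both over- and under-expressed genes are reported; the gene names and numbers are invented for the example.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def select_genes(results, alpha=0.05, min_abs_logfc=1.5):
    """results: list of (gene, logFC, raw p-value) tuples.
    Assumption: the fold-change cut-off is applied to |logFC|."""
    adj = benjamini_hochberg([p for _, _, p in results])
    return [(gene, lfc, q)
            for (gene, lfc, _), q in zip(results, adj)
            if q < alpha and abs(lfc) > min_abs_logfc]

# Hypothetical toy results, not data from the study
toy = [("gene_a", 2.1, 0.0004), ("gene_b", -1.8, 0.001),
       ("gene_c", 0.4, 0.0002), ("gene_d", 1.9, 0.2)]
selected = select_genes(toy)
print([gene for gene, _, _ in selected])
```

Here `gene_c` is dropped for its small fold change and `gene_d` for its adjusted p-value, leaving one over-expressed and one under-expressed gene.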

Fig. 2

Differentially expressed genes in early COVID-19 pneumonia. a) Volcano plot of the top differentially expressed genes in COVID-19 (n = 115) versus controls (n = 44). b) Dot plot of the 10 most enriched GO signalling pathways among the genes over-expressed in COVID-19 samples versus controls

Blood early transcriptome signature of future severe COVID-19 pneumonia

In patients with early COVID-19 pneumonia, blood samples were compared between those evolving towards severe pneumonia (n = 11) and those remaining mild (n = 53). After adjustment for age and blood cell composition, we identified 345 differentially expressed genes (Benjamini-Hochberg adjusted p-value < 0.05 and logFC > 1.5; Supplementary Table S7). The enriched signalling pathways corresponded to a response to virus infection involving a response to type I interferon, as assessed by gene set enrichment analysis (GSEA) (Fig. 3a and b; Supplementary Figure S3; Supplementary Table S8).

Fig. 3

Differentially expressed genes in patients with early COVID-19 pneumonia evolving towards severity. a) Volcano plot of the differentially expressed genes in patients with severe (n = 11) versus mild (n = 53) evolution. b) Dot plot of the top 10 activated and the top 10 suppressed signalling pathways enriched for the differentially expressed genes in patients with future severe COVID-19 pneumonia. c) Venn diagram representation of differentially expressed genes in patients with severe COVID-19 pneumonia in our study (in red) and in the studies of Wang et al. and Jackson et al. (in orange) (Jackson et al. 2022; Wang et al. 2022). d) Venn diagram representation of differentially expressed genes in patients with severe COVID-19 pneumonia in our study (in red) and in patients with severe Influenza infection in the studies of Zerbib et al. and Dunning et al. (in green) (Zerbib et al. 2020; Dunning et al. 2018)

We then tested how similar our early transcriptome signature of future severe COVID-19 pneumonia was to the longitudinal signature of evolution from mild towards severe COVID-19 pneumonia. To that end, we analysed the overlap of our signature with published signatures reflecting the longitudinal evolution of the blood transcriptome towards severe COVID-19 pneumonia (Jackson et al. 2022; Wang et al. 2022). Similarities were observed, including increased expression of CD177, IFI27 and OTOF (Fig. 3c; Supplementary Table S9). Of note, CD177 was also shared when studying the overlap between our early signature and differentially expressed genes associated with the evolution towards severe Influenza infection (Fig. 3d; Supplementary Table S10), another virus infection, underlining the importance of neutrophil induction beyond interferon activation in virus infections (Zerbib et al. 2020; Dunning et al. 2018).

Early prediction of severe forms of COVID-19

To select a limited set of genes predicting the severity of COVID-19, we trained an Elastic Net-penalized linear model on the sub-cohort of COVID-19 mild pneumonia patients with severe or mild evolution of the disease (n = 11 and n = 53, respectively), starting from the 2500 most variable genes. Forty-eight genes were selected (Supplementary Table S11), properly discriminating severe from mild evolution of COVID-19 pneumonia in the training cohort. Of note, patients with intermediate COVID-19 pneumonia outcome, who were not used for training, were scattered between patients with mild and severe COVID-19 pneumonia outcome and were referred to as a grey zone. Using receiver operating characteristic (ROC) curve analysis, optimal thresholds (-0.02 and 7.69) were identified on the first component of the 48-gene principal component analysis projection (Supplementary Figures S4-6). Using an independent validation cohort of 77 patients (28 with severe outcome, 23 with intermediate outcome and 26 with mild outcome), we could confirm the classification performance (Fig. 4). Sensitivity, specificity and accuracy for predicting severe outcome were 0.64, 0.91 and 0.81, respectively (Supplementary Table S12). In a multivariate model combining the 48-gene predictor, age, sex and blood cell composition, the 48-gene predictor remained a highly significant predictor of severe versus mild outcome (logistic regression p-value < 0.001; Table 2). Of note, this signature did not discriminate between COVID-19 and control patients (data not shown).
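The thresholding and performance steps described above can be sketched as follows. This is a simplified illustration with several assumptions: the text does not state how the optimal ROC cut-offs were chosen, so Youden's J statistic is used here as one common option; higher predictor values are assumed to indicate more severe evolution; and the scores and labels are invented toy values. Only the two thresholds (-0.02 and 7.69) come from the text.

```python
def youden_threshold(scores, labels):
    """Cut-off maximising sensitivity + specificity - 1 (labels: 1 = positive).
    Assumption: Youden's J is used; the study's criterion is not stated."""
    pos = sum(labels)
    neg = len(labels) - pos
    best_cut, best_j = None, -1.0
    for cut in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cut)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cut)
        j = tp / pos + tn / neg - 1
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j

def classify(score, low=-0.02, high=7.69):
    """Three-zone call on the predictor value, using the thresholds quoted above."""
    if score < low:
        return "mild"
    if score > high:
        return "severe"
    return "grey zone"

def severe_metrics(predicted, actual_severe):
    """Sensitivity, specificity and accuracy for the 'severe' call."""
    pairs = list(zip(predicted, actual_severe))
    tp = sum(1 for p, y in pairs if p == "severe" and y == 1)
    tn = sum(1 for p, y in pairs if p != "severe" and y == 0)
    fp = sum(1 for p, y in pairs if p == "severe" and y == 0)
    fn = sum(1 for p, y in pairs if p != "severe" and y == 1)
    return tp / (tp + fn), tn / (tn + fp), (tp + tn) / len(pairs)

# Hypothetical predictor values and outcomes (1 = severe), not study data
toy_scores = [-3.0, -1.0, 0.5, 2.0, 6.0, 8.5, 9.0, 11.0]
toy_labels = [0, 0, 0, 0, 1, 0, 1, 1]

cut, j = youden_threshold(toy_scores, toy_labels)
calls = [classify(s) for s in toy_scores]
sensitivity, specificity, accuracy = severe_metrics(calls, toy_labels)
print(cut, calls, sensitivity, specificity, accuracy)
```

Keeping a grey zone between the two cut-offs trades some sensitivity for the "severe" call against fewer patients being confidently misclassified, which is the compromise discussed for the intermediate patients.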

Fig. 4

Classification of samples based on the 48 selected genes discriminating COVID-19 patients depending on pneumonia evolution. Sample projection based on the first two principal components (PC1, PC2) of unsupervised PCA performed using the 48 genes selected by Elastic Net regression on the training cohort. Faint circles represent the samples from the training cohort, on which gene selection was optimized. Bright squares represent the samples from the external independent validation cohort

Table 2 Multivariate model combining transcriptome, age, sex, and neutrophil predictors of COVID-19 pneumonia severity. Training and validation cohorts were combined. The evolution towards severe COVID-19 pneumonia was considered. OR = odds ratio, CI = confidence interval

Post hoc analyses showed a positive correlation between the 48-gene predictor and the following biochemical variables at admission: C-reactive protein (r = 0.65, p-value = 2.603e-07), procalcitonin (r = 0.70, p-value = 2.073e-06) and fibrinogen (r = 0.62, p-value = 4.364e-05). Of note, diabetes, a well-established risk factor for the evolution towards severe COVID-19 pneumonia, was weakly correlated with the 48-gene predictor (r = 0.37, p-value = 0.0256).

Ability of the 48-gene predictor of severe outcome to monitor longitudinal evolution towards severe COVID-19 pneumonia

To explore the longitudinal performance of our transcriptomic signature, we computed our 48-gene predictor in a published cohort (Supplementary Fig. 7) (Wang et al. 2022). Globally, we found a positive correlation between the 48-gene predictor values and COVID-19 severity assessed by the WHO severity level (r = 0.53, p-value = 6.945e-16; Supplementary Fig. 8). For patients with COVID-19 severity levels 2 to 4, the 48-gene predictor decreased over time, while for patients with COVID-19 severity levels 6 to 9, its longitudinal evolution was more variable (Fig. 5). Finally, among the 8 patients with a change in COVID-19 severity level, the 48-gene predictor, used with our 7.69 threshold, broadly discriminated the patients with worsening pneumonia (Fig. 5).

Fig. 5

Longitudinal ability of the 48-gene predictor to monitor evolution of COVID-19 pneumonia towards severity. The red and blue curves represent the mean value of the 48-gene predictor over time in 13 patients with WHO COVID-19 severity levels 6–9 and 14 patients with levels 2–4, respectively. For 8 patients with a change of severity level during follow-up, individual values are provided (broken lines), with colours reflecting the severity level at each time point during follow-up. The dashed horizontal lines indicate the 48-gene predictor thresholds established on the training cohort

Discussion

In this study focusing on patients with early-stage COVID-19 pneumonia, we identified a blood transcriptome signature predicting the risk of evolving towards severe pneumonia. This signature could help to improve patient management by guiding specific surveillance and treatments. It corresponds to a differential inflammatory profile between patients with severe and mild outcome, involving the humoral immune response, complement activation and the interferon signalling pathway. It includes the over-expression of inflammation markers previously reported in patients with severe COVID-19 pneumonia, such as IFI27 (Shojaei et al. 2023) or CD177 (Jackson et al. 2022; Wang et al. 2022; Lévy et al. 2019; An et al. 2021; An et al. 2022). This signature shows a gradient of expression from mild to intermediate and severe forms. Thus, the inflammatory signature observed in patients with severe COVID-19 pneumonia seems to be present early in the course of the disease, and to reflect the risk of developing a severe outcome.

Based on this observation, we designed a predictor, with optimal selection of transcriptome biomarkers able to classify patients depending on their COVID-19 pneumonia evolution. This predictor could be validated on an independent validation cohort, where patients were evaluated longitudinally. However, with an accuracy of 0.81, the prediction of evolution towards severe COVID-19 pneumonia is not absolute. This relates to the limited sensitivity in the validation cohort, with some COVID-19 patients with severe outcome grouped with patients with intermediate outcome. Nevertheless, given the importance of both identifying early the patients at risk of evolving towards severe pneumonia and optimizing healthcare resources, it is worth noting that, in the validation cohort, none of the patients with severe COVID-19 pneumonia outcome were predicted as mild and only one patient with an actual mild COVID-19 pneumonia outcome was predicted as severe. In addition, the clinical criteria for classifying COVID-19 patients in the validation cohort, and the time of sampling during the course of the disease, may have affected the evaluation of accuracy on the validation cohort. Another limitation of the prediction of outcome is the broad distribution of patients evolving towards intermediate forms of pneumonia, with some overlap with patients evolving towards mild pneumonia (mainly in the training cohort) and with those evolving towards severe pneumonia (mainly in the validation cohort). Although important, the proper evaluation of intermediate patients is difficult for two reasons. Firstly, no clear clinical definition is available, with variable diagnostic criteria and thresholds to define intermediate COVID-19 pneumonia (Jackson et al. 2022; Wang et al. 2022). Including these intermediate patients would have increased the size of the training cohort, but also the risk of misclassification. Limiting the inclusion to “mild” and “severe” classes ensured well-defined labels for training the predictor. Secondly, the cohort size is not large enough to assess the existence of statistically significant thresholds in the 48-gene predictor in patients with intermediate COVID-19 pneumonia. However, in our study, these intermediate patients indisputably fall between patients with mild and severe outcome as a continuum, and they probably represent patients who deserve close clinical surveillance, which overall strengthens the validity of the transcriptome prediction. This grey zone may also reflect biological variability, and thus a certain limitation of the ability to predict outcome with this technique. Another potential issue in this study is the risk of overfitting due to the limited number of patients and the high number of features. This risk was mitigated by the cross-validation strategy in the training cohort, and the validation in two independent cohorts. Finally, a post hoc analysis of the 48-gene predictor against clinical and biochemical variables showed several significant associations. However, the prognostic value of these associated variables could not be tested in the final multivariate model due to the limited cohort size, and to the limited data available in the independent validation cohort.

This prognostic signature appears specific to COVID-19 patients, compared to controls (patients with non-COVID-19 pneumonia). Compared to controls, COVID-19 patients present a transcriptome signature reflecting pathways related to virus response, mainly involving the type I interferon response. Type I interferon activation has been well established when COVID-19 progresses towards severe pneumonia in longitudinal series, with several markers reported including IFI27, SIGLEC1, OAS1/2, IFI44, IFI44L and ISG15 (Shaath et al. 2020; Krämer et al. 2021; Masood et al. 2021; Khorramdelazad et al. 2022; Xu et al. 2022). Another potentially relevant gene identified here is OTOF, associated with inflammation and described as a type I IFN-induced effector (Roberson et al. 2022; Ding et al. 2022). Our results show that type I interferon activation occurs early in the course of COVID-19 pneumonia. This signature could contribute to diagnosing SARS-CoV-2 infection.

In addition, beyond COVID-19 pneumonia, the extent to which this type I interferon signature is systematically present in SARS-CoV-2 infected patients remains to be established, especially in patients with mild or asymptomatic forms of the disease.

This study comes after several publications showing the implication of type I interferon in the evolution towards severe COVID-19 pneumonia. However, in contrast with the vast majority of previous works, this study focuses on early-stage patients, when COVID-19 pneumonia is still mild. The clinical characterization is quite extensive, and the follow-up well documented, enabling a proper classification in terms of outcome. This original design was required for demonstrating the existence of a signature predicting the outcome. Of note, the inclusion of patients is restricted to the first outbreak wave in the training cohort. The extent to which these signatures hold for the new SARS-CoV-2 variants remains to be established. In addition, the relative proportion of severe pneumonia decreased in the subsequent outbreak waves, in the context of the progressive immunisation of the population through vaccines and previous COVID-19 infections. However, severe pneumonia still occurs, and disease complications are still challenging to predict at the individual level. The early transcriptome signature proposed here may help improve this challenging detection.

In conclusion, the whole blood transcriptome can predict the outcome of COVID-19 pneumonia at an early stage. This discrimination mainly relies on type I interferon activation, along with other immune alterations, which are already present at an early stage of the disease in patients later developing severe pneumonia.
