Integrated analysis of single-cell and bulk RNA sequencing data reveals an immunostimulatory microenvironment in tumor thrombus of osteosarcoma

Clinical specimens and patient features

Clinical specimens collected and analyzed in this study were harvested through informed consent. The study protocol was approved by the ethics committees of Peking University People’s Hospital and Beijing Children’s Hospital of Capital Medical University. The clinical characteristics of enrolled patients are listed in Table 1.

Table 1 Demographic and clinical characteristics of the enrolled patients.Bulk RNA-seq data analysis

We obtained clean data with the adapter removed and low-quality reads filtered. The paired-end reads were mapped to the human genome (UCSC hg38) by Hisat2 (version 2.1.0). Gene annotation and gene expression reads-count were performed by HTSeq (version 0.11.2). Then we selected the protein-coding genes and calculated the normalized expression value using the DESeq2 package (version 1.32.0) on the R platform (version 4.1.0).

Gene-set enrichment analysis (GSEA)

The normalized expression matrix was uploaded to the gene set enrichment analysis (GSEA) software (version 4.1.0). The hallmark gene sets provided by the Molecular Signatures Database (MSigDB) were selected as a reference database. The GSEA program was run with 1000 permutations for statistical significance estimation, and the default signal-to-noise metric between the two phenotypes was used to rank all genes.

Ingenuity pathway analysis (IPA)

Differentially expressed genes (DEGs) were analyzed by the DESeq2 package. DEGs were judged with the threshold adjusted p-value less than 0.05 and absolute log2 fold-change greater than 1. Then these DEGs were uploaded to the Ingenuity Pathway Analysis (IPA) software. Canonical pathways, causal networks, and upstream regulators were enriched with default parameters in this software.

Cell proportion estimation

CIBERSORT package (version 0.1.0) in R was used to deconvolve the bulk RNAseq data. A widely used leukocyte signature matrix LM22 were used as the reference to estimate the proportion of 22 immune cell types in each sample. Then we merged some cell types based on our single-cell RNA sequencing results: “B cells naive”, “B cells memory” and “Plasma cells” were combined as “B cells”; “T cells CD4 memory resting” and “T cells CD4 memory activated” were combined as “CD4Tm”; “NK cells activated” and “NK cells resting” were combined as “NK”. Cell types which were not detected in scRNAseq and cell types estimated in less than 3 samples were removed.

Sample collection and preparation of single-cell suspensions

Obtained during surgery, fresh tumor and thrombus tissues were stored in Dulbecco’s modified Eagle’s medium (DMEM, Gibco) with 10% fetal bovine serum (FBS, Gibco) and processed on ice. Subsequently, the tissues were rinsed with cold phosphate buffer saline (PBS, Gibco) three times and minced into ~1mm3 pieces. The tissue pieces were digested into single-cell suspensions using collagenase-2 (1 mg/mL) at 37 °C for 45 min and filtered through a 70-μm cell strainer. Upon digestion, the single-cell suspensions were centrifugated at 400 g for 5 min and the supernatant was discarded. Adding red blood cell lysis buffer (Solarbio) to remove red blood cells, the suspensions were then passed through a 40-μm filter and centrifugated at 500 g for 5 min, and resuspended in PBS. Cell viability was validated under the phase-contrast light microscope (Leica) after staining with trypan blue (ACMEC).

Single-cell transcriptome library preparation and sequencing

After resection, tissue specimens were rapidly processed for single-cell RNA sequencing. Single-cell suspensions were prepared according to the protocol of Chromium™ Single Cell 3’ Solution (V2 chemistry). All specimens were washed twice with cold 1 × PBS. A hemocytometer (Thermo) was used to evaluate cell viability rates. Then, we used Countess (Thermo) to determine the concentration of the single-cell suspension and adjust the concentration to 1000 cells/μl. Samples with a cell concentration lower than that defined in the user guide (i.e., <400 cells/µl) were pelleted and resuspended in a reduced volume, and the concentration of the new solution was determined again. Finally, the cells in each sample were loaded, and libraries were constructed using a Chromium single-cell kit. Single-cell libraries were subjected to 150-bp paired-end sequencing on the Illumina NavoSeq6000 platform.

Single-cell RNA-seq data preprocessing and quality control

After obtaining the paired-end raw reads, we used CellRanger (10x Genomics, v3.1.0) to preprocess the single-cell RNA-seq data. The cell barcodes and unique molecular identifiers (UMIs) of the library were extracted from read1. Then, the reads were split according to their cell (barcode) IDs, and the UMI sequences from read2 were simultaneously recorded for each cell. Quality control of these raw readings was subsequently performed to eliminate adapter contamination, duplicates, and low-quality bases. After filtering barcodes and low-quality readings that were not related to cells, we used STAR (version 2.5.1b) to map the clean reads to the human genome (hg19), and we retained the uniquely mapped reads for UMI counts. Next, we estimated the molecular counts and generated a UMI count matrix for each cell by counting the UMIs for each sample. Finally, we generated a gene-barcode matrix that showed the barcoded cells and gene expression counts. Based on the number of total reads, the number of detected gene features, and the percentage of mitochondrial genes, we performed quality control filtering through Seurat to discard low-quality cells (v3.1.5). Briefly, the proportion of mitochondrial genes inside one cell was calculated to be lower than 10%, and the number of total reads in one cell was below 50000. Additionally, the cells were further filtered according to the following criteria: each sample with no more than 5000 genes detected, respectively, and at least 200 genes detected per cell in any sample. Low-quality cells and outliers were discarded, and the single cells that passed the QC criteria were used for downstream analyses. Doublets were predicted by DoubletFinder (v2.0).

Clustering analysis and cell phenotype recognition

The Seurat software package was used to perform cell clustering analysis to identify major cell types. All Seurat objects constructed from the filtered UMI-based gene expression matrixes of given samples were merged. We first applied the “SCTransform” function to implement normalization, variance stabilization, and feature selection through a regularized negative binomial model. Then, we reduced dimensionality through principal component analysis (PCA). According to standard steps implemented in Seurat, highly variable numbers of principal components (PCs) 1 to 20 were selected and used for clustering using the Uniform Manifold Approximation and Projection (UMAP). We identified the cell types of these groups based on the expression of canonical cell type markers. Finally, the 5 groups of T cells and 2 groups of ADSCs were respectively clustered for downstream analysis.

DEG analysis

The cell-type-specific genes were identified by running Seurat with the ‘FindAllMarkers’ function on a log-transformed expression matrix with the following parameter settings: min.pct = 0.25, logfc.threshold = 0.25 (that is, there is at least a 0.25 log-scale fold change between the cells inside and outside a cluster), and only.pos = TRUE (that is, only positive markers are returned). For heatmap and violin plots, the SCT-transformed data from the Seurat pipeline were used. Using the Seurat ‘FindMarkers’ function, we found differentially expressed genes (DEGs) between the two cell types. We also used the R package clusterProfiler with the default parameters to identify gene sets that exhibited significant and consistent differences between two given biological states.

Multiplexed immunofluorescence (mIF) staining

To validate the immunostimulatory microenvironment in the tumor thrombus of OS, mIF staining was performed using PANO 7-plex IHC kit (Cat# 0004100100, Panovue). CD3 (Cat# ab135372, Abcam), CD4 (Cat# ZM0418, ZSGB-BIO), CD8A (Cat# CST70306, CST), CD68 (Cat# CST76437, CST), CCL4 (Cat# ab235961, Abcam) and DAPI (Cat# D9542, Sigma-Aldrich) antibodies were sequentially applied, followed by horseradish peroxidase-conjugated secondary antibody incubation and tyramide signal amplification. Next, the slides were microwave heat-treated after each tyramide signal amplification operation. Nuclei were stained with DAPI after all of the antigens above had been labeled. To obtain multispectral images, the stained slides were scanned using the Mantra System (PerkinElmer), which captures the fluorescent spectra at 20-nm wavelength intervals from 420 to 720 nm with identical exposure time. 5 fields of immune cell enriched tumoral area for each slide were selected for image capture. The selected field were scanned to obtain multispectral images using the Mantra System, which captures the fluorescent spectra at 20-nm wavelength intervals from 420 to 720 nm with identical exposure time.

Immunohistochemistry (IHC) staining

Paraffin sections of paired tumor thrombus and primary tumor underwent incubation with anti-IFN-γ (abcam, ab231036) and TGF-β (abcam, ab215715) antibodies at 4 °C overnight. Following this, the sections were visualized under a Leica microscope (Germany). Two independent pathologists, who were blinded to the clinical information of the specimens, assessed the staining scores. The staining intensity and percentage of positive cells were quantified and recorded as follows: staining intensity was graded on a scale of 0 to 3 (negative, low, moderate, and strong staining, respectively), while the percentage of positive cells was graded on a scale of 0 to 4 (0–5%, 5–25%, 25–50%, 50–75%, and >75% of total cell numbers). The weighted score for each area was calculated by multiplying the staining intensity by the percentage of positive cells. The final staining score for each case was determined by calculating the average score of 5 random areas. The grading system was based on the average scores, with grades assigned as follows: 0–1 (negative); 1–4 (positive); 4–8 (positive + ); and >8 (positive + +).

Flow cytometry

Following two washes with phosphate-buffered saline (PBS), cells were stained with CD206 (Biolegend, 321109) and CD86 (Biolegend, 374202) antibodies in PBS containing 0.1% BSA for 30 min at 4 °C. Subsequently, the cells were subjected to flow cytometry analysis, and data were acquired and analyzed using FlowJo V10 (BD Biosciences) software.

Quantification and statistical analysis

The specific tests used to analyze each set of experiments are indicated in the figure legends. Comparisons between two groups after IHC staining were performed using a two-tailed Student’s t-test, and comparisons among three or more groups were performed using one-way ANOVA. The significance of numerical values of clinical features was determined using unpaired multiple T-test (discovery determined using the two-stage linear step-up procedure of Benjamini, Krieger, and Yekutieli, with Q = 1%). Each row was analysed individually, without assuming a consistent SD. Numerical values were corrected for multiple comparisons using the Holm-Sidak method (alpha = 0.05, presented with padj). All statistical calculations were performed using GraphPad Prism software (ver. 7.0, GraphPad, USA) or R software (https://www.r-project.org/).

留言 (0)

沒有登入
gif