We recruited 47 patients (46–83 years old) with established evidence of lung adenocarcinoma. Written informed consent was obtained from all participants. The study protocol was approved by the Human Subjects Institutional Review Board at Hartford Hospital, Hartford, USA, and the study was performed under grant U19AI142733 at the Jackson Laboratory.
Sample collection
Nares and oral samples were obtained using sterile swabs (Puritan™ PurFlock Ultra, #22-029-506, Guilford, ME, USA). For the nares, two swabs pre-moistened in nuclease-free water (Qiagen, Hilden, Germany) were inserted 2 cm into one nostril and rotated against the anterior nasal mucosal epithelium for up to 10 s. The tongue dorsum was gently rubbed with two dry swabs for up to 30 s.
One swab from each site was stored in 350 μl of Tissue and Cell Lysis Solution (Lucigen, #MTC096H, Middleton, WI) with 100 μl of glass beads (0.1 mm diameter, BioSpec Products, Cat No. 11079101, Bartlesville, OK, USA). The other swab was stored in R2A (Innovation Diagnostics, LAB203-A) for future microorganism recovery. All samples were stored at − 80 °C until shipping. Samples were shipped on dry ice to The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut, USA, and stored at − 80 °C until DNA extraction or cultivation.
mWGS DNA extraction
Metagenomic whole genome shotgun sequencing (mWGS) allows for comprehensive sampling of all genes in all organisms present in a given complex sample and subsequent identification of the bacteria, viruses, and fungi present. Genomic DNA was extracted from frozen samples using the GenElute Bacterial DNA Isolation kit (Sigma Aldrich, NA2110-1KT, St. Louis, MO, USA) with the following minor modifications, as previously described [67]. Briefly, 5 μl of lysozyme (10 mg/ml, Sigma-Aldrich, L6876, St. Louis, MO, USA), 1 μl of lysostaphin (5000 U/ml, Sigma-Aldrich, L9043, St. Louis, MO, USA), and 1 μl of mutanolysin (5000 U/ml, Sigma-Aldrich, M9901, St. Louis, MO, USA) were added to each sample for digestion at 37 °C for 30 min. Samples were mechanically disrupted 2 × for 3 min at 30 Hz (TissueLyser II, Qiagen, Hilden, Germany). Then, 5 μl of proteinase K (20 mg/ml, Sigma-Aldrich) and 300 μl of Solution C were added, and the samples were incubated at 55 °C for 30 min. The samples were precipitated by adding 300 μl of 100% EtOH (Fisher Scientific, Fair Lawn, NJ, USA), and the lysates were loaded on the GenElute columns. Subsequent steps were carried out according to the manufacturer’s instructions.
mWGS DNA sequencing
Libraries were prepared using the Nextera XT DNA Library Prep Kit (Illumina, San Diego, CA, USA) according to the manufacturer’s instructions, but using one-quarter of each reaction volume. Whole genome sequencing (WGS) was carried out using a 2 × 150 bp (paired-end) sequencing protocol for the Illumina NovaSeq 6000 Sequencing System (Illumina, San Diego, CA, USA) according to the manufacturer’s manual. Sequencing was conducted at the Genome Technologies core facility at the Jackson Laboratory for Genomic Medicine, Farmington, USA.
Positive and negative controls
Per patient, one air sample (negative control) was collected by waving a moist swab through the air, to collect potential environmental contaminants, and extracted as described above. Samples that yielded a Qubit measurement were processed for library preparation and sequenced as described. Per extraction round, one sample of a defined, in-house mock community (25 diverse Gram-positive and Gram-negative bacteria and fungi) and a negative reagent control (nuclease-free water, Qiagen, Hilden, Germany) were included. For library generation, a negative control (nuclease-free water, Qiagen, Hilden, Germany) was included as well. One mock community sample was added to each sequencing run and a library/extraction negative control was sequenced if a library product was measurable on the Qubit 2.0 Fluorometer (Thermo Fisher Scientific, Waltham, Massachusetts, USA) or identified on the 4200 TapeStation System (Agilent Technologies, Santa Clara, CA) using the High Sensitivity D1000 ScreenTape Assay (Agilent Technologies, Santa Clara, CA).
mWGS data processing for taxonomic classification
The mWGS data analysis includes both taxonomic and functional profiling from complex microbial communities. After demultiplexing, the fastq files were processed with MetaPhlAn 4 [68] using the flags --add_viruses and --unclassified_estimation. Samples with no reads mapping to any taxa were excluded.
Relative proportions were used for all analyses. All taxonomic features at the species level with a mean relative abundance < 0.0001% (denoise function [69]) across the dataset were removed from the dataset to reduce potential false positives and allow for multiple hypothesis correction (except for the comparison with the culturomics data).
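The abundance filter described above can be illustrated with a minimal Python sketch. It is not the denoise function of [69]; the function name, the input layout (species mapped to per-sample relative abundances in percent), and the example values are assumptions made for illustration, and no re-normalization after filtering is attempted.

```python
from statistics import mean

def filter_low_abundance(profiles, min_mean_pct=0.0001):
    """Keep species whose mean relative abundance (%) across all samples
    is at least min_mean_pct; species below the cut-off are dropped.

    profiles: dict mapping species name -> list of relative abundances (%),
              one value per sample (hypothetical layout).
    """
    return {
        species: values
        for species, values in profiles.items()
        if mean(values) >= min_mean_pct
    }

# Illustrative values only, not data from the study.
example = {
    "species_A": [12.5, 8.0, 0.0],
    "species_B": [0.0002, 0.0, 0.0],        # mean below 0.0001 % -> removed
    "species_C": [0.0003, 0.0001, 0.0002],  # mean above 0.0001 % -> kept
}
kept = filter_low_abundance(example)
print(sorted(kept))
```

Filtering on the dataset-wide mean, rather than per sample, keeps a species that is abundant in only a few samples while discarding features that are near zero everywhere.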
Isolate culturing and identification
Oral and nares samples that would be used for culturing were thoroughly vortexed and then diluted 1:100 and 1:1000 in R2A, to increase the chance of recovering single colonies. Fifty microliters from each dilution was then spread on half of an agar plate (R2A, LB, TSA blood agar, Chocolate agar, aerobic only: SaSelect) for each cultivation condition using a sterile spread tool (Thomas Scientific, 229616) as previously described [28]. Briefly, for agar cultivation, the following media were sourced from Fisher Scientific: Luria Broth (LB) agar (BP1425500), R2A agar (R454372), Tryptic Soy Broth (TSB) (DF0370-17-3), Chocolate agar (B21169X), and Tryptic Soy Agar (TSA) with 5% sheep blood (B21261X). SaSelect agar plates (63748) were obtained from BioRad. The anaerobic atmosphere consisted of 5% hydrogen, 5% carbon dioxide, and 90% nitrogen (Airgas, Z03NI9022000008). Aerobic cultures were conducted in ambient atmosphere.
Microbial isolates were identified as previously described [28] using matrix assisted laser desorption ionization-time of flight (MALDI-TOF, Bruker Daltonics, Germany) mass spectrometry. Rapid DNA extraction from microbial isolates was adapted from Köser et al. [70] as previously described [28] with the following modifications. A 2-mL overnight culture was centrifuged at 20,000 × g for 1 min until a bacterial pellet was formed. The pellet was then resuspended in 150 µL of 10 mM Tris and transferred to a 2-mL bead-beating tube containing 150 µL of 10 mM Tris and 100 µL 0.5-mm diameter glass beads (BioSpec Products, NC0417355). The microbial DNA was then sequenced as previously described [28] on the NovaSeq 6000 Sequencing System (Illumina, San Diego, CA, USA), 2 × 150 bp paired end to approximately 100X coverage per genome. Following sequencing, genomic reads were preprocessed and dereplicated using PRINSEQ lite [71] (-derep 1) and trimmed using trimmomatic [72]. Reads were assembled into contigs using SPAdes [73]. QUAST [74] was used to assess contig quality. All assembly steps used default parameters unless otherwise noted. A reference genome was used for Candida glabrata (GCF_000002545.3). Taxonomic classification was assigned to the contigs and the contigs were placed in a phylogenetic tree using GTDB [75, 76] using default parameters. The phylogenetic tree was visualized using the R package ggtree [77].
Statistics and sample comparisons for metagenomics and culturomics data
Overlap of metagenomics and culturomics data was calculated by extracting the uniquely identified species of genera in the culturomics data and metagenomics data. Data was analyzed and visualized using the following R packages: reshape2 [78], ggplot2 [79], tidyverse [80], ggpubr [81], dplyr [82], plyr [83], and ggvenn [84].
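As a rough illustration of the overlap calculation, the sketch below compares two lists of taxon names with set operations. The analysis itself was carried out in R; the function name and the taxa listed here are placeholders and do not represent study results.

```python
def overlap_summary(metagenomic_taxa, cultured_taxa):
    """Split unique taxon names into shared and method-specific sets."""
    meta = set(metagenomic_taxa)
    cult = set(cultured_taxa)
    return {
        "shared": meta & cult,
        "metagenomics_only": meta - cult,
        "culturomics_only": cult - meta,
    }

# Placeholder taxa for illustration only.
result = overlap_summary(
    ["Staphylococcus epidermidis", "Rothia mucilaginosa", "Rothia mucilaginosa"],
    ["Staphylococcus epidermidis", "Candida glabrata"],
)
for group, taxa in result.items():
    print(group, sorted(taxa))
```

The three sets correspond to the regions of a two-set Venn diagram, which is the kind of figure a package such as ggvenn draws.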
Comparative genomic analysis
Gene coding sequences of bacterial genomes were predicted using Prokka [85], and Funannotate [86] was used to sort and predict gene coding sequences for the Candida glabrata genome, all with default parameters.
KEGG orthologs were annotated by searching the microbial genomes against the Functional Mapping and Analysis Pipeline (FMAP) database [45] (downloaded in 2020), which consists of the UniProt [46] reference cluster filtered for bacteria, fungi, and archaea sequences possessing a KEGG functional classification, using UBLAST [87]. A maximum e-value of 10⁻⁹ was required. KEGG ortholog hits were subsequently filtered for a minimum percent identity of 80%, a minimum of 80% coverage of the query and target sequences, differential ortholog presence between the microbial genomes, and absence from either all ISG stimulator or all non-stimulator genomes.
Microbial genomes were annotated for virulence factors by searching them against VFDB’s protein database A (verified virulence factors, downloaded in 2023) [53] using UBLAST [87] with an e-value threshold of 10⁻⁹. Hits were subsequently filtered for a minimum percent identity of 80% and a minimum of 80% coverage of the query and target sequences.
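The hit-filtering criteria used for both the KEGG and VFDB searches (e-value, percent identity, and coverage of query and target) can be sketched as follows. The `Hit` record and its field names are hypothetical and do not correspond to a specific UBLAST output format; treating the thresholds as inclusive is also an assumption.

```python
from dataclasses import dataclass

@dataclass
class Hit:
    # Hypothetical record; field names are illustrative, not UBLAST columns.
    query: str
    target: str
    percent_identity: float  # 0-100
    evalue: float
    query_aligned: int       # aligned positions on the query
    query_length: int
    target_aligned: int      # aligned positions on the target
    target_length: int

def passes_filters(hit, max_evalue=1e-9, min_identity=80.0, min_coverage=0.80):
    """Apply the e-value, identity, and two-sided coverage thresholds."""
    query_cov = hit.query_aligned / hit.query_length
    target_cov = hit.target_aligned / hit.target_length
    return (
        hit.evalue <= max_evalue
        and hit.percent_identity >= min_identity
        and query_cov >= min_coverage
        and target_cov >= min_coverage
    )

hits = [
    Hit("gene1", "VF_a", 92.0, 1e-50, 290, 300, 290, 310),  # passes
    Hit("gene2", "VF_b", 85.0, 1e-12, 100, 300, 100, 110),  # low query coverage
    Hit("gene3", "VF_c", 70.0, 1e-30, 295, 300, 295, 300),  # low identity
]
retained = [h.query for h in hits if passes_filters(h)]
print(retained)
```

Requiring coverage on both the query and the target guards against short domain-level matches being counted as full-length orthologs.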
Microbial files were annotated for secondary metabolites using antiSMASH [55] (fungiSMASH for the fungal isolate) using strictness “relaxed.” antiSMASH annotates secondary metabolite gene clusters by identifying experimentally characterized proteins and filters for gene clusters that include the minimal core components of each gene cluster. The --genefinding-tool option was set to use Prodigal [88] if any gene was lacking an annotation. Defaults were used for other parameters.
Data was analyzed and visualized using the following R packages: ggplot2 [79], ggpubr [81], dplyr [82], rstatix [89], RColorBrewer [90], tidyverse [80], and pheatmap [91].
Air–liquid interface cell culture cultivation
Nine-millimeter primary normal human bronchial epithelial air–liquid interface (ALI) tissue cultures were obtained from MatTek Corporation (EpiAirway AIR-100, Gothenburg, Sweden). The ALI cultures had been switched to antibiotic-free media 3 days prior to receipt to facilitate microbial growth. All batches were grown using cells from MatTek Corporation’s EpiAirway AIR-100 standard donor, a healthy adult male. ALI cultures were maintained according to the manufacturer’s directions. Briefly, upon arrival, the ALI cultures were placed in 6-well plates with 1 mL of warmed antibiotic-free EpiAirway AIR-100 Maintenance Media (MatTek Corporation, Gothenburg, Sweden) or the equivalent EpiAirway AIR-100 Assay Media (MatTek Corporation, Gothenburg, Sweden) basally per well. The basal media was replaced daily, and the ALI cultures were kept at 37 °C with 5% CO2.
ALI treatments
For each microbial isolate, a single colony was grown overnight in sterile 1X Tryptic Soy Broth (TSB, Becton, Dickinson, and Company, #211825, Sparks, MD) with 0.1 mg Vitamin K (Sigma Aldrich, #95271, St. Louis, MO) and 5 mg heme / 1 L TSB. For isolates that could not be grown from a single colony, a single colony was patched, and that patch was used to start a liquid culture. ODs were obtained by measuring absorbance at 600 nm in a 96-well plate using Cytation Station 5 (BioTek Agilent Technologies, Santa Clara, CA). From each liquid culture, 10⁸ colony-forming units (CFUs) were taken, washed with ultrapure water (Fisher Scientific, #AAJ71786AP, Hampton, NH), and resuspended in 100 µL of ultrapure water (Fisher Scientific, #AAJ71786AP, Hampton, NH), giving a final concentration of 10⁷ CFUs per 10 µL.
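The inoculum arithmetic (10^8 CFUs resuspended in 100 µL, of which 10 µL is later applied per culture) can be checked with a short sketch. The function name is illustrative, and the calculation assumes that no cells are lost during the wash.

```python
def cfu_per_dose(total_cfu, resuspension_volume_ul, dose_volume_ul):
    """CFUs delivered in one dose drawn from a uniform suspension."""
    concentration_per_ul = total_cfu / resuspension_volume_ul
    return concentration_per_ul * dose_volume_ul

# 10^8 CFUs in 100 µL, dosed at 10 µL per ALI culture.
dose = cfu_per_dose(total_cfu=1e8, resuspension_volume_ul=100, dose_volume_ul=10)
print(f"{dose:.0e} CFUs per 10 µL dose")
```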
Just prior to treatment, the ALI cultures were quickly washed twice with transepithelial electrical resistance buffer to remove excess mucus. Each wash consisted of the addition and immediate removal of 400 µL of the buffer, without an incubation period. ALI cultures were then dosed with 10 µL of microbial isolate or vehicle (ultrapure water, Fisher Scientific, #AAJ71786AP, Hampton, NH). Microbial isolates were dosed at 10⁷ CFUs. Dosed ALI cultures were incubated for 18 or 48 h prior to harvest. Extra inoculum was serially diluted in sterile PBS (MatTek Corporation, Gothenburg, Sweden) and grown on TSA with 5% sheep’s blood (Fisher Scientific, #221261, Hampton, NH) to determine the number of microbes added to the ALI cultures. To minimize the risk of confounding batch effects, replicates for each microbial treatment were spread across batches to distribute potential variance. Therefore, sample size indicates the number of independent experiments.
At harvest, 200 µL of transepithelial electrical resistance buffer (MatTek Corporation, Gothenburg, Sweden) was added to the apical surface of each ALI culture, pipette mixed, and removed. One hundred forty microliters of Buffer RLT (Qiagen, Hilden, Germany) + 1% beta-mercaptoethanol was added to each ALI culture, dissolving the tissue. The Buffer RLT-tissue solution was frozen at − 80 °C until RNA extraction. S. epidermidis Tü3298-GFP-colonized ALI were visualized under blue light for a qualitative examination of colonization. The apical wash was serially diluted in PBS (MatTek Corporation, Gothenburg, Sweden) and plated on TSA with 5% sheep’s blood (Fisher Scientific, #221261, Hampton, NH) or 1X TSB (Becton, Dickinson, and Company, #211825, Sparks, MD) with 1X Bacto Dehydrated Agar (Fisher Scientific, #214010, Hampton, NH) for CFU counts. Basal media was collected and frozen at − 80 °C for cytokine bead array assays.
RNA extraction and RNA-seq
All RNA extraction and sequencing library preparation steps were performed in a sterile tissue culture hood. RNA was extracted using the RNeasy 96 QIAcube HT kit (Qiagen, Hilden, Germany) according to the manufacturer’s directions. Samples were eluted in nuclease-free water (Qiagen, Hilden, Germany) and frozen at − 80 °C until sequencing preparation. RNA quality was evaluated using the 4200 TapeStation System (Agilent Technologies, Santa Clara, CA) with the High Sensitivity RNA ScreenTape Analysis (Agilent Technologies, Santa Clara, CA) or RNA ScreenTape Analysis (Agilent Technologies, Santa Clara, CA). RINs ranged from 1.7 to 9.9 with an average of 8.4 and a median of 8.9. RNA quantity was measured on the Qubit 2.0 Fluorometer (Thermo Fisher Scientific, Waltham, Massachusetts, USA). The sequencing libraries were prepared with the NEBNext rRNA Depletion Kit v2 (New England Biolabs, Ipswich, MA) and the NEBNext Ultra II Directional RNA Library Prep Kit for Illumina (New England Biolabs, Ipswich, MA) following the manufacturer’s directions. Library quality was evaluated using the 4200 TapeStation System (Agilent Technologies, Santa Clara, CA) with the High Sensitivity D1000 ScreenTape Assay (Agilent Technologies, Santa Clara, CA). Library quantity was measured on the Qubit 2.0 Fluorometer (Thermo Fisher Scientific, Waltham, Massachusetts, USA). Samples were sequenced on an Illumina NovaSeq, targeting 30 million reads per sample. Reads per sample ranged from 76 thousand to 318 million with an average and median of 29 million reads.
Transcriptional profiling, gene set enrichment analysis, and CFU confounder analyses
Raw RNA-seq reads were processed with trimmomatic 0.39 [72] to remove low-quality reads. Reads were then mapped to the T2T-CHM13v2.0 reference genome using STAR 2.5.3a [92]. The raw read counts were calculated using featureCounts (from Subread 1.6.4) [93]. The raw read counts were normalized by RUVg [94] within each comparison group of microbe-treated samples versus vehicle controls using 5000 of the most stably expressed genes. Differentially expressed genes (DEGs) were identified and expression level changes of each gene were computed using DESeq2 [95] with shrinkage by rlogTransformation and a DEG threshold of |log2FC| > 1 and adjusted p-value < 0.05. Gene set enrichment analysis was performed on pre-ranked gene lists (sign(log2FC) × 2^|log2FC| × −log10(p-value)) calculated from the output of DESeq2 on ALI transcriptomes by GSEA 4.3.2 [96, 97] with the gene ontology (GO) database or clusterProfiler [98, 99] with the gseGO method, a p-value cut-off of 0.05, and FDR p-value adjustment. CFU confounder analyses were generated by MaAsLin2 [34] with TSS (total sum-scaling) normalization. Fixed effects included CFUs (at the time of harvest), ALI batch, and timepoint. CFUs were log10 transformed prior to MaAsLin2. An effect was considered a confounder if the FDR-adjusted p-value was < 0.05.
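The DEG threshold and the pre-ranking score can be written out as a small Python sketch. It reads the exponent term of the ranking formula as 2 raised to |log2FC|, which is an interpretation of the printed expression rather than a confirmed detail; the function names are illustrative and no DESeq2 or GSEA API is reproduced here.

```python
import math

def rank_score(log2fc, p_value):
    """sign(log2FC) * 2**|log2FC| * -log10(p-value).

    Assumes p_value > 0; a p-value of exactly 0 would need special handling.
    """
    sign = (log2fc > 0) - (log2fc < 0)  # -1, 0, or +1
    return sign * 2 ** abs(log2fc) * -math.log10(p_value)

def is_deg(log2fc, adjusted_p, lfc_cutoff=1.0, alpha=0.05):
    """Apply |log2FC| > 1 and adjusted p-value < 0.05."""
    return abs(log2fc) > lfc_cutoff and adjusted_p < alpha

# Illustrative values only.
print(rank_score(2.0, 0.01))    # up-regulated gene, positive score
print(rank_score(-1.0, 0.001))  # down-regulated gene, negative score
print(is_deg(1.5, 0.01), is_deg(0.8, 0.001))
```

Under this reading the score weights genes by both fold change and significance while preserving direction, so strongly induced and strongly repressed genes fall at opposite ends of the ranked list.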
Gene list generation and determination of responsive genes in RNA-seq data
ISG gene lists were adapted from the literature: total ISG list [100], 6-gene ISG list [36], and 30-gene ISG list [37]. The antibacterial gene list was derived from MSigDB [38] with gene ontologies [39, 40] (GO:0005125, GO:0061844, GO:0019730, GO:0050832, GO:0009620). The keratin gene list was adapted from the literature [43]. The junction and mucin gene lists were derived from HGNC (HUGO Gene Nomenclature Committee) [44]. The junction gene list was further adapted from the literature [42]. Within each gene list, genes were further processed by removing genes with a raw read count lower than 5. The remaining genes were further subselected with hierarchical clustering using the seaborn Python package [101]. Clusters of genes that were enriched with increased expression, as determined by high log2FC, were defined as responsive genes. The degree of increased expression depended on the gene list.
Statistics and sample comparisons for RNA-seq data
Data was analyzed and visualized using the following Python packages: seaborn [101], scikit-learn [102], statannot [103], pandas [104], numpy [105], and matplotlib [106]; and R packages: ComplexHeatmap [107], reshape2 [78], and ggplot2 [79].
Cytokine measurement by Cytometric Bead Array (CBA)
Each cultured microbe was grown overnight in TSB with 0.1 mg vitamin K and 5 mg heme / 1 L TSB in 96-well deep-well plates (#503501, Southern Labware, Cumming, GA, USA). ODs were obtained by measuring absorbance at 600 nm in a 96-well plate using Cytation Station 5 (BioTek Agilent Technologies, Santa Clara, CA). The liquid cultures were centrifuged at 3500 rpm for 5 min (Sorvall Legend X1R M20 rotor, Thermo Scientific, Waltham, MA, USA) to separate the bacterial pellet and bacterial supernatant. The bacterial supernatant was collected and sterile filtered (0.2 μm) using a filter plate (#MSGVS2210, Sigma Aldrich, St. Louis, MO, USA) at 3500 rpm for 5 min (Sorvall Legend X1R M20 rotor, Thermo Scientific, Waltham, MA, USA). The sterile bacterial supernatants were stored at − 80 °C until subsequent use. Sterile bacterial supernatants and ALI basal media were shipped at − 80 °C to Washington University School of Medicine, St. Louis, MO.
To test suppression or induction of cytokines by conditioned bacterial supernatant, the lung epithelial cell line A549 was stimulated overnight with sterile (0.2 μm filtered) bacterial supernatants. Briefly, 1 × 10⁵ cells per well were stimulated overnight with 20% (volume/volume) of each supernatant. The conditioned A549 cell media was collected for CBA analysis.
CBA analysis was conducted using six bead populations with distinct fluorescence intensities that were coated with capture antibodies specific for IL-8, IL-6, IL-10, IL-1, TNF, and IL-12p70 proteins (Becton, Dickinson, and Company BioSciences, #551811, Sparks, MD). The beads were incubated together with the samples (sterile bacterial supernatant or microbially colonized ALIs’ basal media), negative controls (sterile bacterial growth media or basal media from vehicle-treated ALI), or recombinant standards, and with PE-conjugated detection antibodies, to form sandwich complexes that were detected by flow cytometry. Results were generated in graphical and tabular format using the BD CBA Analysis Software (Becton, Dickinson, and Company BioSciences, Sparks, MD). The standard curve for each protein covers a defined set of concentrations from 20 to 5000 pg/ml.
Data was analyzed and visualized using the following R packages: ggplot2 [79], ggpubr [81], dplyr [82], rstatix [89], tidyverse [80], forcats [108], and RColorBrewer [90].
Immunofluorescence staining
ALI cultures were washed with 1X PBS (MatTek Corporation, Gothenburg, Sweden), then embedded in OCT (Sakura Finetek USA, Torrance, CA, USA) and snap-frozen at − 80 °C. Frozen sections were cut at 8 µm, air dried on Superfrost plus slides, fixed with 4% paraformaldehyde (#28906, Thermo Fisher Scientific, Waltham, MA, USA) for 15 min, then permeabilized with 1X PBS/0.1% Triton-X (#HFH10, Thermo Fisher Scientific, Waltham, MA, USA) for 15 min. Tissue sections were treated with Fc Receptor Block (#NB309, Innovex Bioscience, Richmond, CA, USA), followed by Background Buster (#NB306, Innovex Bioscience, Richmond, CA, USA). The sections were then stained with anti-MX1 primary antibody (polyclonal Rabbit N2C2, #GTX110256, GeneTex, Irvine, CA, USA) for 1 h followed by secondary antibody (anti-rabbit IgG AF555, #406412, BioLegend, San Diego, CA, USA) for 30 min at room temperature in 1X PBS/5% BSA/0.05% saponin. Then, sections were washed three times with 1X PBS for 15 min. Finally, sections were counterstained with 1 µg/ml of 4',6-diamidino-2-phenylindole (DAPI, #D1306, Thermo Fisher Scientific, Waltham, MA, USA) and mounted with Fluoromount-G (#00-4958-02, Thermo Fisher Scientific, Waltham, MA, USA). Images were acquired using a Leica SP8 confocal microscope (Leica Microsystems, Wetzlar, Germany) for high-resolution images and a Thunder widefield microscope (Leica Microsystems, Wetzlar, Germany) for quantification, both with Leica LAS X software (Leica Microsystems, Wetzlar, Germany), and analyzed using Imaris software (Bitplane, Oxford Instruments, Abingdon, United Kingdom). Following Imaris image analysis, the mean intensity of each marker was quantified using histocytometry (FlowJo).
Data was analyzed and visualized using the following R packages: ggplot2 [79], ggpubr [81], dplyr [82], rstatix [89], and RColorBrewer [90]. All results are expressed as a mean of the replicates, unless specified. All comparisons were made between infection conditions with time point-matched, uninfected controls.
Transepithelial electrical resistance (TEER) measurements
TEER measurements were taken using the EVOM Manual for TEER Measurement (#EVM-MT-03-01, WPI, Sarasota, FL, USA) and the STX4 EVOM Electrode (#EVM-EL-03-03-01, WPI, Sarasota, FL, USA) following 18 and 48 h of colonization. The electrodes were equilibrated following the manufacturer’s instructions. ALIs were transferred to a 12-well plate. One milliliter of media was added to the basal side and 300 µL of TEER buffer was added to the apical surface. Three readings were taken of each insert, evenly spread across the insert. In between each reading, the electrodes were washed in 70% ethanol and then washed in TEER buffer.
Data was analyzed and visualized using the following R packages: ggplot2 [79], ggpubr [81], dplyr [82], rstatix [89], and RColorBrewer [90].
LDH quantification
LDH was quantified from ALI apical media, basal media, and cell lysate using the CyQuant LDH Cytotoxicity Assay (Thermo Fisher Scientific, #C20301, Waltham, MA) following the manufacturer’s directions, with the following modifications. ALI were colonized for 48 h as described above. Following colonization, 200 µL of TEER buffer was added to the apical surface of each ALI. ALI were subsequently incubated at 37 °C for 10 min. Following incubation, the TEER buffer was pipetted up and down 5 times prior to removal to maximize the amount of bacteria removed and minimize confounding signal. The apical wash was saved to quantify apical LDH signal. An aliquot of the basal media was also saved to determine basal LDH secretion. Three hundred microliters of 5X lysis buffer was then added to the apical surface of each ALI and ALI were incubated at 37 °C for 1.5 h. In the middle of the incubation, a pipette tip was used to mechanically disrupt the ALI to ensure thorough cell lysis. LDH was quantified in the basal media, apical wash, and ALI cell lysate. Samples were run in a 384-well plate format and the signal was read using Cytation Station 5 (BioTek Agilent Technologies, Santa Clara, CA). Apical wash and basal media were diluted 1:2 and cell lysate 1:20 in TEER buffer.
Additional ALI, following the apical wash, were each transferred to a 1.5-mL Eppendorf tube with ~ 200 µL of TEER buffer and 100 µL of 1 mm Zirconia/Silica beads (BioSpec Products, #11079110z, Bartlesville, OK) and were mechanically homogenized for 3 min at 30 Hz using TissueLyser II (Qiagen, Hilden, Germany). The cell lysates were then serially diluted and plated on TSA to quantify live, adherent bacteria for each microbial treatment.
To correct the cell lysate LDH signal for any confounding bacterial LDH that was released during lysis, fresh TSB cultures were grown for each microbial treatment. Bacterial pellets were washed with sterile water to remove excess TSB, and then each microbial treatment was serially diluted to create a standard curve ranging from 10⁴ to 10⁹ CFUs. The bacterial standard curves were incubated for 1.5 h at 37 °C in 100 µL of 10X lysis buffer and then LDH from the bacterial lysate was measured. A higher concentration of lysis buffer was used for the bacteria than for the ALI in order to determine the maximal contribution of bacterial LDH to the cell lysate LDH.
Bacterial LDH signal was subtracted from the cell lysate LDH, based on the CFUs determined from the cell homogenate. Replicates were averaged together. Data was analyzed and visualized using the following R packages: ggplot2 [79], dplyr [82], rstatix [89], and RColorBrewer [90].
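One way to carry out the bacterial LDH subtraction is sketched below. The text does not specify how the standard curve was interpolated, so linear interpolation on log10(CFU), clamping outside the curve range, the function names, and all signal values are assumptions made for illustration.

```python
import math

def bacterial_ldh_at(cfu, curve):
    """Estimate bacterial LDH signal for a CFU count from a standard curve.

    curve: list of (cfu, ldh_signal) pairs. Interpolates linearly on
    log10(CFU); values outside the curve are clamped to the end points
    (an assumption, not a documented step).
    """
    if cfu <= 0:
        return 0.0
    points = sorted(curve)
    if cfu <= points[0][0]:
        return points[0][1]
    if cfu >= points[-1][0]:
        return points[-1][1]
    x = math.log10(cfu)
    for (c0, s0), (c1, s1) in zip(points, points[1:]):
        if c0 <= cfu <= c1:
            x0, x1 = math.log10(c0), math.log10(c1)
            return s0 + (s1 - s0) * (x - x0) / (x1 - x0)

def corrected_lysate_ldh(lysate_signal, adherent_cfu, curve):
    """Subtract the estimated bacterial contribution from the lysate signal."""
    return lysate_signal - bacterial_ldh_at(adherent_cfu, curve)

# Hypothetical standard curve spanning 10^4 to 10^9 CFUs.
curve = [(1e4, 0.01), (1e5, 0.02), (1e6, 0.05),
         (1e7, 0.10), (1e8, 0.30), (1e9, 0.80)]
print(corrected_lysate_ldh(lysate_signal=1.5, adherent_cfu=1e7, curve=curve))
```

Because the bacterial curve was generated with a stronger lysis buffer than the ALI lysate, the subtracted value represents an upper bound on the bacterial contribution, so the corrected signal is a conservative estimate.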