CRISPR-StAR enables high-resolution genetic screening in complex in vivo models

StAR vector design and generation

Several CRISPR-StAR vectors were created to identify a design that yields an equal ratio of recombination products. For the initial proof-of-concept screen in mESCs, CRISPR-StAR vector 1 was generated from the original Switch-ON vector (p206)13, which has a retroviral backbone, by inserting a lox5171 site at an XbaI restriction site and a second lox5171 site at an EcoRI restriction site (Supplementary Table 2). Additionally, an extra antibiotic selection cassette was ligated between the sets of lox sites at the SbfI restriction site, using oligos with an SbfI recognition sequence overhang to amplify PGK-blasticidin. In this way, cells that had undergone recombination before induction with 4-OH tamoxifen could be removed from the population by blasticidin selection. Cells transduced with the CRISPR-StAR vector can be selected with neomycin.

Vector 1 was used for further adaptations: an extra loxP site was added after the first lox5171 site (vector 2), and, additionally, sequence between the lox5171 sites was removed to shorten the active sgRNA product (vector 3). Vector 4 again contained two sets of lox sites but kept the reduced sequence for the shorter active sgRNA product. Vector 4 was then used to amplify the full CRISPR-StAR cassette with oligos containing NheI recognition sites, which were used to ligate the purified PCR product into a lentiviral backbone. The lentiviral vectors are 5–7 and differ only in their antibiotic-resistance or GFP cassettes. Vector 4B has only the blasticidin selection cassette. Vector 4BN contains the blasticidin and a GFP–neomycin selection cassette, and vector 4GN carries the GFP–neomycin cassette between the lox sites and no longer has a blasticidin selection cassette. All vectors were verified by Sanger sequencing.

Library cloning

The human and mouse genome-wide sgRNA libraries were synthesized by Twist Bioscience. Each single-stranded oligonucleotide consists of an sgRNA of 18–20 bases, flanked by the sequences 5′-CGTCTCACACC-3′ and 5′-GTTTAGAGACGT-3′, which contain BsmBI restriction sites and a fitting overhang sequence. Additionally, primer binding sites for PCR amplification of each subpool were added. Each oligo subpool was independently amplified using NEBNext High-Fidelity 2× PCR Master Mix (for primers, see Supplementary Table 3). The purified PCR product was cloned into the CRISPR-StAR (4GN lentiviral vector) or pLenti-UMI by Golden Gate cloning. Each of the vectors contained random barcodes, which were integrated as previously described in Michlits et al.14. For the Golden Gate assembly, five parallel reactions with 245 ng of vector, 5.8 ng of dsDNA, 15 U of BsmBI v2 (New England Biolabs (NEB)), 1,000 U of T4 DNA ligase (highly concentrated; NEB) and 2.5 µl of T4 ligase buffer in 25 µl were subjected to 75 cycles of 5 min at 37 °C followed by 5 min at 16 °C. The assembled plasmids were purified and transformed into Endura electrocompetent cells (Lucigen) with a Bio-Rad Pulser II according to the manufacturer’s instructions. Bacteria were recovered for 1 h at 37 °C and 200 r.p.m. and subsequently expanded for approximately 16 h at 37 °C and 200 r.p.m. in LB medium with ampicillin. Transformation efficiency was confirmed to be at least 1,000-fold over the size of the pool. Plasmid DNA was harvested using a Macherey-Nagel NucleoBond Midi Prep Kit. Subpools were mixed according to the number of sgRNAs in each pool to generate the two batches in which the genome-wide screen was performed. All plasmid sequences are available as a supplementary file; plasmids may be further obtained from Addgene. sgRNAs and sublibrary information are available in Supplementary Tables 5–9.
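The oligo layout described above can be illustrated with a minimal Python sketch. It is an illustration only, not the authors' design pipeline: the function names and the example spacer are hypothetical, the subpool primer binding sites are omitted, and the check for additional BsmBI sites inside the spacer is an assumption about a sensible design control that the text does not state.

```python
# Flanking sequences as given in the text; each carries one BsmBI site
# (CGTCTC on the top strand, or its reverse complement GAGACG).
FLANK_5 = "CGTCTCACACC"
FLANK_3 = "GTTTAGAGACGT"
BSMBI_SITE = "CGTCTC"


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def build_insert(spacer):
    """Assemble flank + spacer + flank (primer binding sites not included)."""
    spacer = spacer.upper()
    if not 18 <= len(spacer) <= 20:
        raise ValueError("spacer must be 18-20 bases long")
    return FLANK_5 + spacer + FLANK_3


def count_bsmbi_sites(seq):
    """Count BsmBI recognition sites on both strands, allowing overlaps."""
    total = 0
    for strand in (seq, reverse_complement(seq)):
        total += sum(
            1 for i in range(len(strand) - len(BSMBI_SITE) + 1)
            if strand.startswith(BSMBI_SITE, i)
        )
    return total


def has_extra_bsmbi_site(spacer):
    """True if the insert has more than the two sites contributed by the flanks."""
    return count_bsmbi_sites(build_insert(spacer)) != 2
```

A spacer for which `has_extra_bsmbi_site` returns `True` would be cut internally during Golden Gate assembly, which is why such a check can be useful before ordering a pool.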

Cell culture

mESCs (feeder free) were maintained in DMEM supplemented with 15% FCS, 2 mM L-glutamine (Gibco), 1× non-essential amino acids (NEAA; Gibco), 1 mM sodium pyruvate (Sigma-Aldrich), 1× penicillin–streptomycin (Sigma-Aldrich), 100 µM β-mercaptoethanol and 100 U ml−1 leukemia inhibitory factor (LIF).

Yumm1.7, Yumm1.7 450R, A375 and A375R cells were cultured in DMEM/F12 medium supplemented with 10% FCS, 2 mM L-glutamine (Gibco) and penicillin–streptomycin (Sigma-Aldrich). Additionally, Yumm1.7 450R medium contained 100 nM B-Raf inhibitor dabrafenib (Selleck Chemicals), and A375R medium contained 100 nM vemurafenib (Selleck Chemicals).

4T1, 4T07, B16F0, B16F1, B16F10 and Hepa1.6 cells were grown in DMEM high-glucose medium supplemented with 10% FCS, 2 mM L-glutamine (Gibco), 1× NEAA (Gibco), 1 mM sodium pyruvate (Sigma-Aldrich) and penicillin–streptomycin (Sigma-Aldrich).

Lastly, EMT6, EO771 and CT26 cells were grown in RPMI 1640 medium supplemented with 10% FCS, 2 mM L-glutamine (Gibco), 1× NEAA (Gibco), 1 mM sodium pyruvate (Sigma-Aldrich) and penicillin–streptomycin (Sigma-Aldrich).

Generation of Cas9-CreERT2-expressing monoclonal cells

All cell lines for CRISPR screening were generated by transduction with a lentivirus containing Cas9-p2A-Puro and subsequent selection with puromycin. Cells that were also used for CRISPR-StAR screening (Yumm1.7 450R, A375R and A375) subsequently received retrovirus with CreERT2 and GFP or mCherry. Cells with the fluorescent marker were single-cell sorted and expanded. Using CRISPR Switch-ON13 and an sgRNA against GFP or mCherry, the monoclonal cells were tested in vitro for Cas9 and CreERT2 induction, functionality and tightness. Seven days after CreERT2 recombinase induction, clones were subjected to single-cell fluorescence-activated cell sorting (FACS) analysis, and GFP reduction was measured. We aimed for a GFP reduction of at least 80–90% and, without 4-OH tamoxifen treatment, little to no reduction of GFP. Clones that matched these criteria were used for in vivo testing. For in vivo testing, we transduced the cells with a sublibrary and assessed the dropout of essential genes by NGS. To maximize Cas9 functionality (and minimize silencing), we transduced the Yumm1.7 450R cells with an additional Cas9-p2A-blasticidin. Another option to select for functional clones is a PCR-based strategy (Supplementary Fig. 5a). This assay uses a common sgRNA against an essential gene but different UMIs, which are transduced into the individual clones to be tested. After pooling of all cellular clones in vitro, engraftment, tamoxifen induction and tumor harvest, each clone can be assessed individually for whether (1) it engrafted well (that is, the UMI is detected); (2) it recombined well (the active and inactive bands of the CRISPR-StAR construct appear); and (3) Cas9 is active in all cells (that is, the active sgRNA band disappears due to gene essentiality) (Supplementary Fig. 5b).

Lentivirus production

For the production of lentivirus, Lenti-X cells (Clontech, 632180) were transiently transfected with the DNA library pool. Cells were seeded at 70% confluency in 15-cm dishes and transfected 6 h later. Then, 25 µg of library DNA, 12 µg of Gag-pol and 6 μg of VSV-G were mixed together with 129 µl of PEI in 3 ml of DMEM high-glucose medium for the transfection of one 15-cm dish of Lenti-X cells. The transfection mixture was incubated for 20 min at room temperature and added dropwise to the cells. The next morning, media were exchanged, and viral supernatant was harvested 48 h after transfection. The virus-containing supernatant was pooled, filtered through a 0.45-μm PES filter (VWR) and frozen at −70 °C. With a virus titration assay, the dilution ratio was determined for each viral sgRNA library pool to achieve a multiplicity of infection (MOI) of 0.25.

In vitro screening

For in vitro screening, at least 1,000 cells per sgRNA were infected at an MOI of 0.25 to ensure integration of one sgRNA per cell. The determined amount of viral supernatant was added to the cells, together with the appropriate medium for each cell line and 4 μg ml−1 polybrene (Merck Millipore). Cells were maintained as three technical replicates, with sgRNA coverage kept at more than 500 cells per sgRNA. Over the course of 5–7 d, cells were selected with 1 mg ml−1 neomycin (500 µg ml−1, G418; Life Technologies, 11811-031). A culture dish with a control population (cells without library) was carried along to assess neomycin selection. After elimination of all control cells, the library-containing cells were used for in vivo screening or treated with 4-OH tamoxifen to continue the in vitro screen for 14 d.
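The rationale for a low MOI can be made concrete with a short Python sketch. It assumes that the number of integrations per cell follows a Poisson distribution with mean equal to the MOI, which is a common modelling assumption and is not stated in the text; the function names are hypothetical.

```python
import math


def poisson_pmf(k, moi):
    """Probability that a cell receives exactly k integrations at a given MOI."""
    return math.exp(-moi) * moi ** k / math.factorial(k)


def fraction_transduced(moi):
    """Fraction of cells with at least one integration."""
    return 1.0 - poisson_pmf(0, moi)


def fraction_single_among_transduced(moi):
    """Among transduced cells, the fraction carrying exactly one integration."""
    return poisson_pmf(1, moi) / fraction_transduced(moi)
```

Under this assumption, an MOI of 0.25 transduces about 22% of cells, and roughly 88% of the transduced cells carry exactly one integration.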

Mice

For all screening experiments, 5–16-week-old male and female immunocompromised B6(Cg)-Rag2tm1.1Cgn/J or B6.SJL-Rag2tm1FwaPtprca mice (Rag2−/−) were used. Additionally, for the experiments to assess the engraftment rate of cell lines, WT C57BL/6J and BALB/cJ mice were used. Mice were bred in-house, and all animal experiments were carried out according to project licenses approved by Austrian veterinary authorities. All mice were housed in groups of 3–6 in individually ventilated cages (IVCs), type II (Tecniplast), in a specific pathogen-free (SPF) animal facility. The animal housing rooms were temperature and humidity controlled (20 ± 3 °C, 50 ± 15%) with a 14/10-h light/dark cycle. All animals had ad libitum access to food (Ssniff) and water throughout the entire study.

In vivo CRISPR screening

Mice were anesthetized with isoflurane, followed by the subcutaneous injection of 5 × 10⁵–5 × 10⁶ cells harboring Cas9-CreERT2 and the StAR sgRNA library into the flank of Rag2−/− mice in 50 µl of Matrigel:PBS (1:1; Corning). The cells of the three technical replicates were injected into equal groups of mice. Tumor size was measured 1–2 times per week using a calliper and calculated with the following formula: volume (in mm³) = (long side × short side²) / 2. At day 10 (Yumm1.7 450R) or day 14 (A375R) after cell injection, tamoxifen (dissolved in corn oil; Sigma-Aldrich) was administered intraperitoneally for CreERT2 recombinase induction, followed by 14 d of screening time. Tumors were harvested, cut into pieces and subsequently lysed with lysis buffer (10 mM Tris pH 8, 10 mM EDTA pH 8, 100 nM NaCl, 1% SDS) and 1 mg ml−1 Proteinase K (VWR, 1.24568.0500) for 48–72 h.
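The tumor volume formula can be written as a small Python helper. The function and parameter names are hypothetical; both calliper measurements are assumed to be in millimetres.

```python
def tumor_volume_mm3(long_side_mm, short_side_mm):
    """Tumor volume in cubic millimetres: (long side x short side squared) / 2."""
    if long_side_mm < 0 or short_side_mm < 0:
        raise ValueError("calliper measurements must be non-negative")
    return long_side_mm * short_side_mm ** 2 / 2
```

For example, a tumor measuring 10 mm by 8 mm gives 10 × 64 / 2 = 320 mm³.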

Genomic DNA extraction and NGS library preparation

Lysed tumor samples were combined per 5–7 mice, after which genomic DNA was extracted from lysed cell or tumor samples by phenol-chloroform extraction and precipitated with isopropanol. PacI restriction digestion facilitated the accessibility of the DNA for PCR amplification. Each sample was individually amplified with barcoded primers (Supplementary Table 4) in 48 reactions of 50 µl, with an input of 4 µg of DNA per reaction, using KAPA HiFi DNA Polymerase HotStart ReadyMix (Roche). PCR reactions were pooled per sample, purified and quantified by a fragment analyzer. Finally, samples were pooled equally by concentration and sequenced on an Illumina NextSeq2000 P2 in a paired-end sequencing run, using custom primers (Supplementary Table 4). To later distinguish active from inactive sgRNAs, at least 75 bp were read in read 1; 9 bp were read for the dual index barcodes (index 1 and index 2), and 11 bp were read for the UMI barcode.

Data analysis

For data analysis, BAM files were retrieved from Illumina, and samples were demultiplexed using the distinct experimental indices. sgRNA and UMI sequences were then identified using Bowtie, SAMtools and FASTX-Toolkit. To classify an sgRNA as active or inactive, we assessed the last six bases of each read 1 sequence. Reads that contained ‘TTTT’ at the end were assigned as inactive sgRNAs, and ‘CAGC’-containing reads were marked as active sgRNAs.
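The read classification rule can be sketched in a few lines of Python. This is one reading of the rule, not the authors' script: the function name is hypothetical, and it assumes that both motifs are searched for within the last six bases of read 1 and that reads matching neither motif are left unassigned.

```python
INACTIVE_MOTIF = "TTTT"
ACTIVE_MOTIF = "CAGC"
TAIL_LENGTH = 6  # number of bases inspected at the end of read 1


def classify_read(read1):
    """Label a read 1 sequence as 'active', 'inactive' or 'unassigned'."""
    tail = read1[-TAIL_LENGTH:].upper()
    is_inactive = INACTIVE_MOTIF in tail
    is_active = ACTIVE_MOTIF in tail
    if is_inactive and not is_active:
        return "inactive"
    if is_active and not is_inactive:
        return "active"
    return "unassigned"
```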

Subsequently, data were processed in RStudio using a wide variety of packages, such as tidyverse, data.table, dplyr, ggExtra, ggpubr, ggrepel, stringr, patchwork and pROC.

For the in vitro CRISPR-StAR data, reads per UMI were pooled on the sgRNA level, and a pseudocount of 0.5 was added to each active and inactive sgRNA, to avoid 0 values. Because the screen was performed in triplicate, we could do paired analysis. MAGeCK version 0.5.9-foss-2018b37 was used to calculate gene effects, by defining the treatment sample (-t) as the active reads and the control sample (-c) as the inactive reads. The MAGeCK software provided output files with the median log2FC per gene (neg.lfc in output files) and RRA score (neg.score in output files).
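To show how the pseudocount enters the active-versus-inactive comparison, the following Python sketch computes a per-sgRNA log2 fold change and a per-gene median. It is a simplified illustration and not a reimplementation of MAGeCK: MAGeCK's normalization, paired-sample handling and RRA scoring are omitted, and the function names and input layout are hypothetical.

```python
import math
from collections import defaultdict
from statistics import median

PSEUDOCOUNT = 0.5  # added to active and inactive counts to avoid zeros


def sgrna_log2fc(active, inactive, pseudocount=PSEUDOCOUNT):
    """log2 ratio of active (treatment) over inactive (control) reads."""
    return math.log2((active + pseudocount) / (inactive + pseudocount))


def gene_median_log2fc(counts):
    """Median sgRNA log2FC per gene.

    counts: iterable of (gene, sgrna, active_reads, inactive_reads) tuples.
    """
    per_gene = defaultdict(list)
    for gene, _sgrna, active, inactive in counts:
        per_gene[gene].append(sgrna_log2fc(active, inactive))
    return {gene: median(values) for gene, values in per_gene.items()}
```

Because the inactive sgRNA in each sample serves as the internal control, a strongly negative value indicates depletion of cells carrying the active sgRNA relative to their inactive counterparts.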

In vivo CRISPR-StAR NGS data were, after separation of the active and inactive sgRNAs in R, processed at a UMI level. Before log2FC gene effects were calculated, we implemented several filtering steps to clean the data in a standardized manner.

UMI hopping

The first filtering step addressed UMI hopping, which refers to instances where multiple distinct sgRNAs are erroneously assigned the same UMI due to technical limitations in the NGS process. Because the UMI barcodes and the sgRNA library are cloned at very high complexity, it is uncommon for multiple sgRNAs to be genuinely combined with the same UMI. However, many sgRNAs were found to share a UMI with another sgRNA while having a much lower read count. To remove these, we calculated the sum of the active and inactive reads of each unique sgRNA–UMI combination (sumReads) and sorted the dataset in descending order based on the total number of reads associated with each UMI. We then iterated through the UMIs, starting with the one with the highest number of reads and taking the next UMI from the top of the sorted list in each iteration, until the total number of reads associated with a UMI dropped below the predefined threshold of 100,000. For each occurrence of a UMI, a ratio was calculated by dividing its number of reads by the maximum number of reads among all occurrences of the same UMI. Occurrences with a ratio less than or equal to 0.001 were filtered out, as this indicates a low proportion of reads compared to the maximum observed for that UMI. This process ensures that sgRNA–UMI combinations with a disproportionately small number of reads, compared to the highest observed count for that UMI, are removed, thereby mitigating the effects of UMI hopping and improving the accuracy of downstream analyses.
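The UMI hopping filter can be sketched in Python as follows. This is one interpretation of the description rather than the authors' R code: it assumes that the 100,000-read threshold applies to the total reads summed over all sgRNAs sharing a UMI, that UMIs below the threshold are left untouched, and that records are dictionaries with the hypothetical keys `sgrna`, `umi`, `active` and `inactive`. Instead of iterating over a sorted list, it applies the same criteria in two passes.

```python
from collections import defaultdict

READ_THRESHOLD = 100_000  # UMIs with fewer total reads are not inspected
MIN_RATIO = 0.001         # occurrences at or below this ratio are removed


def filter_umi_hopping(records, read_threshold=READ_THRESHOLD, min_ratio=MIN_RATIO):
    """Remove sgRNA-UMI combinations that look like UMI hopping artifacts."""
    total_reads = defaultdict(int)  # reads summed over all sgRNAs per UMI
    max_reads = defaultdict(int)    # highest sumReads of any sgRNA per UMI
    for rec in records:
        sum_reads = rec["active"] + rec["inactive"]
        total_reads[rec["umi"]] += sum_reads
        max_reads[rec["umi"]] = max(max_reads[rec["umi"]], sum_reads)

    kept = []
    for rec in records:
        umi = rec["umi"]
        sum_reads = rec["active"] + rec["inactive"]
        if total_reads[umi] >= read_threshold:
            # Ratio relative to the dominant sgRNA carrying this UMI.
            if sum_reads / max_reads[umi] <= min_ratio:
                continue
        kept.append(rec)
    return kept
```

With these settings, an sgRNA with 100 reads on a UMI whose dominant sgRNA has 200,000 reads (ratio 0.0005) is removed, whereas one with 1,000 reads (ratio 0.005) is kept.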

Polymeric UMIs

The second filtering step addressed polymeric sequences within the UMIs. Polymeric sequences are repetitive runs of a single nucleotide that may arise due to technical artifacts or biases during library preparation or sequencing. Again, because of the high complexity of the library, an extremely high number of reads is not expected for polymeric UMIs. For this reason, we excluded UMIs containing seven or more consecutive cytosines, seven or more consecutive adenines, seven or more consecutive thymines or five or more consecutive guanines. For guanine, a lower threshold was used, as Illumina’s NextSeq 2000 assigns nucleotides with a two-color/channel sequencing-by-synthesis technology in which the absence of color is interpreted as guanine91, and this is more likely to happen than misdetection of a color.
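The homopolymer thresholds translate directly into a regular expression. The Python sketch below is illustrative, and the function name is hypothetical.

```python
import re

# Seven or more consecutive A, C or T, or five or more consecutive G.
POLYMER_PATTERN = re.compile(r"A{7,}|C{7,}|T{7,}|G{5,}")


def is_polymeric(umi):
    """True if the UMI contains a homopolymer run at or above the thresholds."""
    return POLYMER_PATTERN.search(umi.upper()) is not None


def drop_polymeric_umis(umis):
    """Keep only UMIs that pass the homopolymer filter."""
    return [umi for umi in umis if not is_polymeric(umi)]
```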

Active and inactive read filter

After combining the filtered data of the two screened batches, we applied a final filter. The sumReads of every sgRNA–UMI combination in each replicate of the two screening batches was calculated. Subsequently, we omitted the sgRNA–UMI replicate combinations with 20 or fewer sumReads.

Lastly, a pseudocount of 0.5 was added to each active and inactive sgRNA–UMI replicate, and gene effects were calculated with MAGeCK in the same way as for the in vitro CRISPR-StAR data. For the in vivo CRISPR-StAR gene effect, only genes with three or more UMIs were used for further analysis.
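The read-depth filter and the minimum number of UMIs per gene can be sketched together in Python. The names are hypothetical, and the sketch assumes that "three or more UMIs" means three or more distinct sgRNA–UMI combinations per gene remaining after the read filter; the text does not specify how UMIs are counted.

```python
from collections import defaultdict

MAX_DROPPED_SUM_READS = 20  # combinations with 20 or fewer sumReads are omitted
MIN_UMIS_PER_GENE = 3


def filter_low_reads(records, max_dropped=MAX_DROPPED_SUM_READS):
    """Keep sgRNA-UMI replicate records with more than `max_dropped` sumReads."""
    return [r for r in records if r["active"] + r["inactive"] > max_dropped]


def genes_with_enough_umis(records, min_umis=MIN_UMIS_PER_GENE):
    """Return the set of genes supported by at least `min_umis` sgRNA-UMI pairs."""
    pairs = defaultdict(set)
    for r in records:
        pairs[r["gene"]].add((r["sgrna"], r["umi"]))
    return {gene for gene, seen in pairs.items() if len(seen) >= min_umis}
```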

To pool the data of batch 1 and batch 2 of the genome-wide screen, we calculated the median ratio in each batch for the non-essential genes. Hereafter, the normalization factor between batch 1 and batch 2 for the active and inactive reads could be defined for both the in vitro and the in vivo screen. Before pooling the data of the two batches, reads of batch 2 were multiplied by these normalization factors.

The sgRNAs in the plasmid library subpools were sequenced with individual experimental indices to assess the representation of the library (Extended Data Fig. 5c–f). For the conventional analysis, the reads for each subpool were normalized to the median sgRNA reads of all subpools. Subsequently, we performed MAGeCK analysis, in which active reads were assigned as -t and the sgRNA reads in the library as -c, resulting in log2FC gene effect values. Note that UMI-level analysis is not possible for the conventional genome-wide screen in vivo, because comparing the highly complex plasmid library with the very small number of UMIs in the tumors would not be a fair comparison.

Various gene groups were defined in this paper. IDGs are genes that were, in the genome-wide screen, found to be essential, as these had an effect of LFC lower than −3 and an RRA score of at least −log10 10 or higher. Non-essential genes were, in DepMap1, found in no screen to have a gene effect below −0.5. Lastly, core essential genes were, in DepMap, the genes with a gene effect of −1 and lower in at least 400 screens.

TripleColour-StAR assay

The TripleColour-StAR vector was generated from the lentiviral CRISPR-StAR vector by introducing EF1a-TagBFP after the stop cassette, NLS-P2A-miRFP703-T2A-NeoR replacing EGFP-P2A-NeoR and PGK-dTomato after the tracr. These cassettes were synthesized and PCR amplified for subcloning using restriction enzymes. Each sgRNA used for the TripleColour-StAR assay was synthesized, annealed and subsequently inserted into the vector digested with Esp3I restriction enzyme. TripleColour-StAR vectors containing each sgRNA were transfected into Lenti-X cells as previously described. Yumm1.7 450R cells with Cas9 and CreERT2 were treated with the lentivirus and selected with neomycin. After selection, cells were expanded and either subcutaneously injected into Rag2−/− mice or cultured in vitro with and without 4-OH tamoxifen. In vitro samples were analyzed on day 7 and day 14 after 4-OH tamoxifen treatment via FACS. The Cre recombination in mice was induced at day 7 after cell injection with intraperitoneal tamoxifen. Tumors were harvested on day 14 after tamoxifen injection, dissected into small pieces and incubated with Collagenase IV (2 mg ml−1; Worthington, LS004188) and DNaseI (0.2 mg ml−1; Sigma-Aldrich, 4536282001) for 1 h at 37 °C. Cells were then filtered (70 µm), cultured overnight to isolate tumor cells and subsequently analyzed via FACS. FACS data were analyzed using FlowJo version 10.9.0 software (Supplementary Fig. 6 and Supplementary Table 1). Cells carrying TripleColour-StAR with sgEGFP were re-plated on eight-well chamber slides (ibidi, 80826), incubated overnight, fixed with 4% paraformaldehyde solution (Sigma-Aldrich, 28908) and washed with PBS.

Fixed cells were permeabilized with 0.1% Triton X-100 (Sigma-Aldrich, T8787) and blocked with 10% donkey serum (Sigma-Aldrich, D9663). Subsequently, cells were stained with primary antibodies (TagBFP: NanoTag Biotechnologies, N0502-CUSTOM-FluoTag-X2 anti-TagBFP Dylight 405, 1:500; EGFP: Abcam, A13970, 1:1,000; dTomato: Biorbyt, ORB11618, 1:500) and secondary antibodies (Invitrogen, A78948 and A11057; 1:500). Nuclear staining was performed with To-Pro-3 (Invitrogen, T3605; 1:1000). Imaging was conducted using an LSM 800 Axio Observer microscope, and ZEN Black (version 2.3) and ZEN Blue (version 2.3 Sp1) were used for image processing.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
