Discovery and structural mechanism of DNA endonucleases guided by RAGATH-18-derived RNAs

Identification of the RAGATH-18 RNA-associated protein

To discover specific molecular components in the “defense islands” of microbial genomes, we analyzed terabase-scale genomic and metagenomic microbiome data (Fig. 1a) from a wide range of organisms including humans, other mammals, birds, fish and other marine organisms, invertebrates, insects, plants and porifera (Fig. 1b; Supplementary information, Table S1).29,30,31,32,33,34,35 We first performed a comprehensive annotation of genes related to defense systems, including CRISPR-Cas, RM, and TA systems, and defined the boundaries of the “defense islands” relative to non-defense genes.6,36,37 Then we annotated the RNAs within intergenic regions (IGRs) proximal to defense-associated genes using Rfam38 (Fig. 1a). This analysis identified a number of conserved RNAs with known functions38 (Supplementary information, Table S1). Additionally, we discovered novel RNAs that have yet to be characterized biologically and biochemically, including RAGATH-18.28 To explore whether these conserved RNAs co-occur with any proteins, we performed gene family analysis of the 5 upstream and 5 downstream proteins of each conserved RNA (Fig. 1a). We noticed a co-occurrence of IS607 TnpBs with RAGATH-18 RNAs: 64% (10,258/16,029) of the RAGATH-18 RNAs are located next to IS607 TnpBs, in 9172 strains (Fig. 1a; Supplementary information, Table S1c). Notably, 84% (8614/10,258) of them are present in the human gut microbial metagenomic data (Fig. 1c; Supplementary information, Table S1c). The 9172 strains cover more than 130 bacterial species, which belong to seven phyla including Firmicutes, Bacteroidota, Fusobacteria, Actinobacteria, Cyanobacteria, Euryarchaeota, and Thermotogae. However, 97.8% of the strains are from the Firmicutes phylum (Supplementary information, Fig. S1a and Table S1), which is mainly comprised of low G + C Gram-positive bacteria in the human gut.39
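The neighborhood step and the two percentages quoted above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the feature dictionaries, the `type` and `name` keys, the function names and the toy contig are all hypothetical, and only the four counts are taken from the text.

```python
def flanking_proteins(features, rna_index, k=5):
    """Return up to k protein-coding features on each side of one RNA feature.

    `features` is a list of dicts ordered by genomic position on one contig;
    the "type" and "name" keys are hypothetical, not the authors' data model.
    """
    upstream = [f for f in features[:rna_index] if f["type"] == "CDS"][-k:]
    downstream = [f for f in features[rna_index + 1:] if f["type"] == "CDS"][:k]
    return upstream, downstream


def percent(part, whole):
    """Express part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Toy contig: seven made-up genes, one RNA hit, then three more made-up genes.
contig = [{"type": "CDS", "name": f"gene{i}"} for i in range(7)]
contig.append({"type": "ncRNA", "name": "RAGATH-18"})
contig += [{"type": "CDS", "name": f"gene{i}"} for i in range(7, 10)]

up, down = flanking_proteins(contig, rna_index=7)
up_names = [f["name"] for f in up]
down_names = [f["name"] for f in down]

# Counts quoted in the text: RNAs next to IS607 TnpB, and the human gut share.
share_next_to_tnpb = percent(10258, 16029)
share_human_gut = percent(8614, 10258)
```

Near a contig edge the window simply returns fewer than five proteins, as on the downstream side of the toy contig. The two shares round to the 64% and 84% reported in the text.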

Fig. 1: Identification of the RAGATH-18 RNA-associated protein.

a Scheme of the computational pipeline for the identification of the RNA-associated proteins across all genomic and metagenomic data. We identified non-coding RNAs with predicted conserved secondary structures which are enriched proximal to the defense-associated genes, and clustered both five upstream and five downstream proteins of these non-coding RNAs to explore conserved cassettes. DEGD, defense associated gene database; HKGD, housekeeping gene database; RM, restriction modification; IGR, intergenic regions. Gabija is a recently described defense system.11 Group II introns, T-box, pemK, and RAGATH-18 are examples of predicted RNAs in the vicinity of the defense associated genes. b Source of metagenomic data of microbiome used for analysis. c Source distribution of RAGATH-18 RNA-associated protein. d The phylogenetic tree of RAGATH-18 RNA-associated proteins. The microbial source is shown on the outermost with the human gut microbiome in brown. The cluster identifier, clade identifier, and copy number of IS607 TnpBs are shown on the ring from innermost to the second outermost.

We further found that 80% of the IS607 loci encode two ORFs, TnpA and TnpB, and that 30% of the TnpA ORFs are pseudogenes owing to a frameshift or an internal stop codon. By contrast, TnpB encodes a predicted RuvC domain-containing protein.40 Furthermore, RAGATH-18 RNAs co-occur only with IS607 TnpB and not with other IS family members. IS607 TnpBs co-existing with RAGATH-18 RNAs can be divided into two clades (six clusters) with less than 40% similarity (Fig. 1d; Supplementary information, Fig. S1b). The RAGATH-18 RNAs detected by a covariance model have an average length of 73 nucleotides (nt). The RAGATH-18 RNA sequences from various species are highly conserved, and the most conserved nucleotides form a possible E-loop and kink-turn.41 The 5’ and 3’ flanking sequences of the RAGATH-18 RNAs are also conserved and are approximately 50 nt and 60 nt in length, respectively.

IS607 TnpBs are RNA-guided DNA endonucleases

The co-occurrence of RAGATH-18 RNAs and IS607 TnpBs suggests that they function together. We therefore used small RNA sequencing to investigate whether an IS607 TnpB protein and its adjacent RAGATH-18 RNA bind to each other (Supplementary information, Fig. S2a). A recombinant plasmid was constructed for co-expression in E. coli of IS607 TnpB and its reRNA, containing the 5’ flanking, RAGATH-18, 3’ flanking and 3’-end non-conserved sequences, from the Firmicutes bacterium AM43-11BH strain. Protein purification showed that the IS607 TnpB protein co-purified with a bound nucleic acid (Supplementary information, Fig. S2b). RNase treatment completely degraded the TnpB-bound nucleic acid. By contrast, the bound nucleic acid was insensitive to DNase treatment, indicating that the IS607 TnpB protein formed a complex with an RNA (Supplementary information, Fig. S2b). Small RNA sequencing showed that the TnpB-bound RNA contained 184 nucleotides, including a 37-nt 5’ flanking motif beginning 13 nt downstream of the gene encoding IS607 TnpB, a 73-nt RAGATH-18 motif, a 64-nt 3’ flanking motif, and a 10-nt non-conserved sequence adjacent to the 3’ flanking motif (Fig. 2a). These results demonstrate that the Firmicutes bacterium IS607 TnpB forms a specific RNA–protein complex with its adjacent reRNA.

Fig. 2: IS607 TnpBs are RNA-guided DNA endonucleases.

a Mapping of small RNA sequencing data to the IS607 TnpB non-coding region in Firmicutes bacterium. b Scheme of the biochemical assay used to discover the TAM position and identity. c Weblogo of the identified ISFba1 TAM sequence (left panel); in vitro cleavage assay of ISFba1 TnpB using linearized dsDNA substrates (right panel). The agarose gel was visualized by ethidium bromide (EB) staining. TAM, Target adjacent motif; 2.7 kb, dsDNA substrate; 1.7 kb and 1 kb, cleavage product. Data shown are representative of three independent experiments. d Quantification of the in vitro cleavage assays mediated by ISFba1 TnpB using linearized dsDNA substrates with different TAM sequences. Data represent means ± SD of three biological replicates. e Run-off Sanger sequencing for the ISFba1 TnpB-cleaved plasmid. NTS, the non-targeted strand; TS, the target strand. The major and minor cleavage sites are indicated by red and green triangles. f The proportion of the non-target strand (left panel) and target strand (right panel) cleavage sites quantified by the high-throughput sequencing of the ISFba1 TnpB-cleaved results. Data represent means ± SD of three biological replicates. g In vitro cleavage assay of the wild-type ISFba1 TnpB (WT) and RuvC active site mutated ISFba1 TnpB (D194A, E290A, D371A) using linearized dsDNA substrates. The agarose gel was visualized by EB staining. Data shown are representative of three independent experiments.

Recent studies have shown that the TnpB protein of IS200/IS605 is a programmable endonuclease.23,24 We hypothesized that the IS607 TnpB protein may function similarly given its specific reRNA-binding activity. To test this hypothesis, we constructed a pUC19-based plasmid library containing seven randomized DNA nucleotides immediately upstream of a target sequence. For the programmable endonuclease of IS200/IS605 TnpB, the variable sequence at the 3’ end adjacent to the conserved flanking motif of the protein-bound RNA serves as a guide sequence for targeting DNA substrates.23,24 Inspired by the effective guide sequence length of existing programmable endonucleases, such as CRISPR/Cas9, we replaced the variable sequence of the Firmicutes bacterium reRNA with a 20-nt sequence capable of targeting the plasmid library. The hepatitis delta virus (HDV) self-cleaving ribozyme was used to remove the sequence at the end of the transcribed Firmicutes bacterium reRNA. Based on sequence similarity clustering and taxonomy (up to 6 proteins per species), 43 TnpBs from 41 bacterial strains of 25 different species (Supplementary information, Table S2) were selected for testing dsDNA cleavage activity in vitro. We transformed the TnpB expression plasmids into E. coli, purified the RNP complexes and used them to cleave the constructed plasmid library in vitro, followed by high-throughput sequencing (Fig. 2b). The high-throughput sequencing results showed that the purified RNP complexes had reRNA-guided dsDNA cleavage activity, and revealed their target-adjacent motif (TAM)23 sequences (Supplementary information, Fig. S2c).
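To give a sense of the size of the sequence space sampled by a seven-nucleotide randomized region, the following Python sketch enumerates every possible 7-mer. It assumes all four bases are equally allowed at each position, which the text does not state explicitly. The function name is hypothetical.

```python
from itertools import product


def randomized_library(n=7, alphabet="ACGT"):
    """Enumerate every sequence of length n over the given alphabet."""
    return ["".join(bases) for bases in product(alphabet, repeat=n)]


library = randomized_library()
library_size = len(library)  # 4 ** 7 distinct 7-mers
```

Under that assumption the library contains 4^7 = 16,384 distinct variants upstream of the fixed target sequence.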

Sequencing data analysis showed that 23 IS607 TnpBs from 22 bacterial strains of 17 different species cleaved dsDNAs with a TAM sequence of 5’-AGGAG (18/22 strains) or 5’-GAGGG (4/22 strains) (Fig. 2c; Supplementary information, Fig. S2c). To verify an essential role of the TAM sequences in the endonuclease activity of the IS607 TnpB systems, we purified the IS607 TnpB protein from Firmicutes bacterium AM43-11BH (ISFba1 TnpB, 387 aa) and assayed its DNase activity with different dsDNA substrates. Consistent with the sequencing data (Supplementary information, Fig. S2c), ISFba1 was fully active in cleaving dsDNA with 5’-AGGAG as the TAM sequence (Fig. 2c). Single mutations of any of these five TAM nucleotides resulted in complete loss of the dsDNA cleaving activity (Fig. 2d), indicating that the TAM sequence for ISFba1 TnpB system is highly specific. Sanger sequencing revealed that the ISFba1 TnpB-cleaved products formed a 5’-overhang end at the TAM-distal region (Fig. 2e), and most of the cleavage sites were located at 22 nt (> 60%) of the target strand and 16 nt (> 80%) of the non-target strand (Fig. 2f). This activity was dependent on the predicted RuvC active site (D194, E290 and D371) of ISFba1 TnpB (Fig. 2g; Supplementary information, Fig. S2d), indicating that RuvC is responsible for cutting both strands of DNA substrates. This is reminiscent of Type V CRISPR-Cas12 family nucleases.23,42 BLASTP43 analysis showed that the similarity between IS607 TnpB family effectors and previously reported active TnpB23,24 is less than 35%, and the regions of similarity were primarily concentrated within the RuvC domain. Together, these results demonstrate that ISFba1 TnpB is a novel TAM-dependent DNA endonuclease and cleaves dsDNA substrates in an reRNA-dependent manner.
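As an illustration of what TAM dependence implies for target selection, the sketch below scans a DNA sequence for candidate sites. It assumes that the TAM is read 5’ to 3’ on the strand being scanned and that the 20-nt target starts immediately 3’ of it. That orientation is an interpretation of the description above, and the function names and test sequence are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def find_tam_targets(seq, tam="AGGAG", guide_len=20):
    """List (strand, tam_start, target) for each TAM followed by guide_len nt.

    Both strands are scanned; tam_start is 0-based on the scanned strand.
    Assumes the target lies immediately 3' of the TAM (see lead-in).
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        start = s.find(tam)
        while start != -1:
            target = s[start + len(tam):start + len(tam) + guide_len]
            if len(target) == guide_len:  # skip TAMs too close to the end
                hits.append((strand, start, target))
            start = s.find(tam, start + 1)
    return hits


# Made-up example: 3-nt spacer, TAM, 20-nt target, 3-nt spacer.
example = "TTT" + "AGGAG" + "ACGTACGTACGTACGTACGT" + "CCC"
sites = find_tam_targets(example)
```

Passing `tam="GAGGG"` would scan for the second TAM reported above.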

We biochemically characterized the compact ISFba1 TnpB by evaluating the effects of divalent metal ions, salt concentration, and temperature on its dsDNA cleavage activity in vitro. Our results showed that the endonuclease activity is Mg²⁺-dependent, whereas in the presence of Mn²⁺ or Ca²⁺ the enzyme showed only low DNase activity (Supplementary information, Fig. S3a, b). The highest endonuclease activity was achieved with 25–100 mM NaCl at the optimum temperature of 37 °C (Supplementary information, Fig. S3c, d). Screening guide sequences of different lengths in the Firmicutes bacterium reRNA showed that 20 nt is an optimal length for the DNase activity of ISFba1 TnpB (Supplementary information, Fig. S3e).

Trans-cleavage activity toward non-target single-stranded DNA (ssDNA) has been observed in RuvC domain-containing Cas12 family nucleases and adapted for detecting nucleic acids.44,45,46 To investigate whether the ISFba1 TnpB system has trans-cleavage activity, we performed a cleavage assay using 5’-FAM-labeled non-target ssDNA as the substrate and a target ssDNA as an activator. The results showed that ISFba1 TnpB had trans-cleavage activity toward the non-target ssDNA in the presence of the ssDNA activator, as observed for CRISPR-Cas12a (Supplementary information, Fig. S2e).45,47

DNA interference activity of IS607 TnpBs in bacteria

The results above demonstrate that IS607 TnpBs are programmable endonucleases. We then investigated whether the IS607 TnpB systems have DNA interference activity in bacteria. To this end, we performed plasmid and endogenous genomic DNA interference assays in E. coli. For the plasmid interference experiment, plasmids expressing 23 IS607 TnpB RNP complexes from 22 bacterial strains of 17 different species were individually co-transformed into E. coli with a spectinomycin-resistant plasmid containing a 5’-AGGAG or 5’-GAGGG TAM and a 20-nt target sequence. After induction of expression, the clones were selected on plates containing kanamycin and spectinomycin using 10-fold serial dilutions (Fig. 3a). Compared with the non-target control, the co-transformation of plasmids encoding 9 IS607 TnpB members from 8 strains of 6 different species and plasmids with target sequences led to more than 10⁴-fold colony reduction, demonstrating robust plasmid interference ability of these IS607 TnpB systems in E. coli (Fig. 3b, c; Supplementary information, Fig. S3f, g and Table S2). Inactivation of the RuvC active site (D194, E290 and D371) prevented ISFba1 TnpB from generating DNA interference (Fig. 3b), indicating that the DNA interference activity of ISFba1 TnpB is dependent on its enzymatic activity. For the genomic DNA interference assays, plasmids expressing IS607 TnpB RNP complexes with a 20-nt guide sequence targeting E. coli genome sites were constructed, transformed into E. coli and selected with kanamycin. A 10-fold dilution titration assay showed that 12 IS607 TnpB members from 11 strains of 7 different species effectively mediated genomic DNA interference and inhibited the growth of E. coli (Fig. 3d; Supplementary information, Fig. S3h, i and Table S2). These data collectively demonstrate that the IS607 TnpB systems cleave dsDNA substrates in bacteria.

Fig. 3: IS607 TnpB systems have the DNA interference capability in bacteria.

a Scheme of plasmid interference in E. coli mediated by the IS607 TnpB systems. KanR, kanamycin resistant; SpeR, spectinomycin resistant. b Plasmid interference assay of wild-type ISFba1 TnpB (WT) and RuvC active site mutated ISFba1 TnpB (D194A, E290A, D371A) using 5’-AGGAG TAM. The culture samples were serially diluted (10×) and selected by the media supplemented with kanamycin (Kan) and spectinomycin (Spe) (left panel). NT, non-target control group; T, target group. Data shown are representatives of three independent experiments. Quantification of the corresponding plasmid interference assay (right panel). Data represent means ± SD of three biological replicates. Two-tailed unpaired t-test: ***P < 0.001, ns, not significant. c Quantification of the plasmid interference by TnpBs of ISFba1, ISCba1, ISRin, ISEre1, ISBsp4, ISClsp2, ISClsp3, ISFba3 and ISFba4 using 5’-AGGAG TAM. The corresponding serial dilution assay is shown in Supplementary information, Fig. S3f. The detailed information of the IS607 TnpB proteins is provided in Supplementary information, Table S2. Data represent means ± SD of three biological replicates. Two-tailed unpaired t-test: ***P < 0.001. d Quantification of the genomic DNA interference mediated by TnpBs of ISFba1, ISCba1, ISRin, ISEre1, ISBsp4, ISClsp2, ISClsp3, ISFba3 and ISFba4 in E. coli. The corresponding serial dilution assay is shown in Supplementary information, Fig. S3h. Data represent means ± SD of three biological replicates. Two-tailed unpaired t-test: ***P < 0.001.

IS607 TnpB-mediated genome editing in human cells

Having established that the IS607 TnpB systems cleave both plasmid and genomic DNA in bacteria, we further explored their genome editing ability in mammalian cells. We constructed plasmids encoding 37 EGFP-linked IS607 TnpB proteins and their corresponding reRNAs carrying 20-nt genomic DNA-targeting sequences. These 37 TnpBs included 14 TnpBs with low homology (< 50% sequence similarity) to ISFba1 TnpB within the same species or with high homology (> 80% sequence similarity) to ISFba1 TnpB in different species, and the 23 TnpBs that had been tested in bacteria (Supplementary information, Table S2). The processed TnpB contained an SV40 nuclear localization peptide at the N-terminus and a nuclear localization signal peptide at the C-terminus. The recombinant plasmids were transfected into 293F cells. At 72 h after transfection, 293F cells expressing green fluorescent protein were sorted and their genomic DNA was extracted for high-throughput sequencing of the targeted sites. The DNA editing efficiencies of the IS607 TnpBs expressed in the human cells were quantified by the frequencies of insertions or deletions (indels) generated at the target sites (Fig. 4a). The data showed that ISFba1 TnpB displayed editing efficiency at the EMX1 and VEGFA loci: 42.5% at EMX1-T1, 17.0% at EMX1-T2 and 17.9% at VEGFA-T1 (Fig. 4b, c). These results indicate that ISFba1 TnpB has genome editing activity comparable to that of SpCas9 and LbCas12a when they were originally developed.15,16 IS607 TnpBs from other strains also demonstrated DNA editing activity, albeit with lower efficiency (Fig. 4d).

Fig. 4: IS607 TnpB-mediated genome editing in human cells.

a Scheme of the human cell line (HEK293F) genome-editing experiment. b The indel efficiencies of ISFba1 TnpB on endogenous loci EMX1-T1, EMX1-T2, VEGFA-T1, DNMT1-T1, DNMT1-T2 and PITX1-T1 in HEK293F cells, determined by next-generation sequencing (NGS). Data represent means ± SD of three biological replicates. Two-tailed unpaired t-test: ***P < 0.001, **P < 0.01. c NGS statistic of indel efficiency of ISFba1 TnpB on the endogenous locus EMX1-T1. d The indel efficiencies mediated by TnpBs of ISFba1, ISRin, ISFba4, ISEre1, ISAre, ISClsp3, ISDfo4, ISAfa and ISHun on the endogenous loci. Data represent means ± SD of three biological replicates. Two-tailed unpaired t-test: ***P < 0.001, **P < 0.01, *P < 0.05. The detailed information of different IS607 TnpB proteins is provided in Supplementary information, Table S2. The guide sequences are provided in Supplementary information, Table S5. e Heatmap representation of the ISFba1 TnpB cleavage efficiency with 60 single-nucleotide-mutated guide RNAs for EMX1-T1 target site. The identities of single base pair substitutions are indicated on the left; original guide sequence is shown at the bottom and highlighted in the heatmap (gray squares). The cleavage efficiencies were monitored by high-throughput sequencing. Modification efficiencies (increasing from white to blue) are normalized to the original guide sequence. 1–20, mismatch position 1–20. f Heatmap of the single mismatch tolerance of ISFba1 TnpB on the EMX1-T1, EMX1-T2, VEGFA-T1 target sites and SpCas9 on the EMX1 target site. 1–20, mismatch position 1–20; WT, original guide sequence. Data shown are representative of three independent experiments. The guide sequences are provided in Supplementary information, Table S5.

We next investigated the substrate specificity of ISFba1 TnpB using the EMX1 and VEGFA sites in 293F cells. A set of 60 different guide RNAs was generated containing all possible single-nucleotide substitutions at positions 1–20 adjacent to the 5’-AGGAG TAM (Fig. 4e). A single mismatch between the guide and bp 1–13 of the target sequence resulted in loss or significant impairment of the endonuclease activity of ISFba1 TnpB (Fig. 4f). In contrast to SpCas9, ISFba1 TnpB tolerated more single mismatches between the guide and target sequences (Fig. 4f).48,49 Consistently, ISFba1 TnpB shows more sensitivity to double mismatches when compared with SpCas9 (Supplementary information, Fig. S3j). Based on the EMX1-T1 site described above, we computationally selected 9 candidate off-target sites in the human genome with a 5’-AGGAG TAM. High-throughput sequencing revealed no gene editing at any of the predicted off-target sites (Supplementary information, Fig. S3k). These results indicate that the IS607 TnpB system can be harnessed as a programmable genome editing tool with high specificity.
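The figure of 60 guides follows from 20 positions with three alternative bases each. A minimal Python sketch of that enumeration is given below. The placeholder guide is arbitrary and is not one of the guide sequences in Table S5, and the function name is hypothetical.

```python
def single_substitution_guides(guide, alphabet="ACGU"):
    """Return every variant of `guide` that differs at exactly one position.

    Each entry is (position, original_base, new_base, variant_sequence),
    with positions numbered from 1.
    """
    variants = []
    for pos, original in enumerate(guide, start=1):
        for base in alphabet:
            if base != original:
                variant = guide[:pos - 1] + base + guide[pos:]
                variants.append((pos, original, base, variant))
    return variants


# Arbitrary 20-nt placeholder guide (hypothetical, not from the paper).
placeholder_guide = "ACGUACGUACGUACGUACGU"
mismatch_guides = single_substitution_guides(placeholder_guide)
```

For any 20-nt guide the function returns 20 × 3 = 60 variants, matching the size of the mismatch panel described above.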

Cryo-EM structure of ISFba1 TnpB–reRNA–target DNA

To elucidate the structural mechanism of reRNA-guided DNA targeting by the miniature IS607 TnpB systems, we focused on the ISFba1 TnpB system, which has high DNA cleavage activity both in vitro and in vivo. We determined the structure of a ternary complex comprising ISFba1 TnpB (D371A catalytic mutant), a 207-nt reRNA containing a 20-nt guide segment, a 36-nt target DNA strand and a 20-nt non-target DNA strand with a 5’-AGGAG TAM using cryo-EM at a resolution of 3.0 Å (Fig. 5a–c; Supplementary information, Figs. S4, S5 and Table S3). The resultant cryo-EM density map allowed us to build the atomic model of the whole ternary complex (Fig. 5b), except for three residues at the N-terminus of the ISFba1 TnpB protein and flexible loop regions in the reRNA and the target DNA. The structure revealed that ISFba1 TnpB and reRNA form a complex with 1:1 stoichiometry (Fig. 5a–c). ISFba1 TnpB consists of an amino-terminal TAM-interacting domain (TID) and a non-target-strand recognition (REC1) domain, a target-strand recognition (REC2) domain, a reRNA-binding domain (RBD), and a RuvC domain at its carboxy-terminus. The guide RNA–target DNA heteroduplex binds to the central channel formed by TID and REC1 on one side and the REC2 and RuvC domains on the other side (Fig. 5a–c). The architectural organization of the ISFba1 TnpB ternary complex is distinct from those of Cas9, Cas12a and Cas12f (Supplementary information, Fig. S6).50,51,52 One significant difference is that ISFba1 TnpB possesses much smaller REC domains. Additionally, the reRNA in the ISFba1 TnpB system adopts a strikingly different fold from the RNAs in the other three systems. In the ISFba1 TnpB system, the reRNA plays a scaffolding role in organizing the complex by stabilizing the conformation of the ISFba1 TnpB effector protein for targeting the dsDNA substrate. This results in solvent exposure of a large portion of the bound reRNA. By contrast, a much lower percentage of the bound RNAs in Cas9 and Cas12a is solvent exposed.50,53,54

Fig. 5: Cryo-EM structure of the ISFba1 TnpB–reRNA–dsDNA complex.

a Domain organization of ISFba1 TnpB. TID, TAM-interacting domain; REC, recognition domain. TID and RuvC domains are separated into two and three segments, respectively. b, c Cryo-EM reconstruction at 3.0 Å (b) and cartoon representations of the ISFba1 TnpB–reRNA–dsDNA complex (c).

ReRNA architecture

The reRNA from Firmicutes bacterium consists of a guide sequence of 20 nt, which forms an RNA–DNA heteroduplex with the target DNA strand, and the 187-nt RNA scaffold. The RNA scaffold contains a 5’ flanking (G(–22)–G(–50)) motif, RAGATH-18 (A(–51)–A(–122)) motif and a 3’ flanking (A(–123)–G(–187)) motif. The latter can be divided into a joint region (G(–22)–G(–50)), three stem loops (SL1–3) and two pseudoknots (PK1 and PK2) (Fig. 6a–c). The R-loop region contains the 10-bp 5’-AGGAG TAM-containing dsDNA, the 14-bp RNA–DNA heteroduplex and 6-nt non-target DNA strand (Fig. 6a). The upper region of the 5’ flanking motif (C(–1)–U(–21)) and the loop regions of SL1 and SL3 are disordered in the cryo-EM structure, suggesting flexibility of these regions in solution. Deletion and mutations of these disordered regions had no detectable impact on IS607 TnpB-mediated DNA cleavage (Fig. 6d), suggesting that they are dispensable for the formation of the IS607 TnpB complex. The 5’ flanking motif is located at the central portion of the reRNA, adopting a curved conformation to interact with both the RAGATH-18 and the 3’ flanking motif (Fig. 6b). Notably, the looped-out bases G(–30)–A(–34) of the apical loop of the 5’ flanking motif base pair with U(–182)–A(–186) of the 3’ flanking motif, forming PK2 to further stabilize the reRNA structure (Fig. 6c, left panel). Mutations of the bases G(–30)–A(–34) impaired ISFba1 TnpB RNP complex formation (data not shown), highlighting the importance of PK2 for ISFba1 TnpB function. The RAGATH-18 motif adopts a supercoiled structure with a long stem loop emanating from the three-motif junction region. The 3’ flanking motif contains two stem loops, with U(–134)–A(–136) flipping out from SL2 to form PK1 with bases of A(–54)–U(–53) and G(–120) of the RAGATH-18 RNA, further stabilizing the reRNA scaffold structure (Fig. 6c, right panel). Mutations disrupting base pairs in PK1 abolished the DNA cleavage activity of ISFba1 TnpB, supporting the functional importance of PK1 in the ISFba1 TnpB system (Fig. 6d).

Fig. 6: Structural organization of the reRNA.

a Schematic representation of the reRNA and target DNA. The disordered regions are enclosed in black dashed boxes. The pseudoknot regions are enclosed in red dashed boxes. The nucleotide colors are the same as the cartoon representations in Fig. 5c. TS, target strand; NTS, non-target strand; TAM, target adjacent motif; PK, pseudoknot. b Structure of the reRNA scaffold. c The conformation of PK1 (right panel) and PK2 (left panel). d In vitro cleavage of dsDNA substrates by ISFba1 TnpB RNP with full-length reRNA or truncated or mutant reRNA. AAAA, four consecutive ATP linker used to replace the nucleotides in the mutant regions. The agarose gel was visualized by EB staining. Data shown are representative of three independent experiments.

Recognition of the reRNA and target dsDNA by ISFba1 TnpB

The REC2 (residues R227, R230, R234, N238 and N249), RuvC (residues K299, N300 and R301) and RBD (residues K341 and K354) domains pack against the sugar-phosphate backbone of the reRNA at one side (Fig. 7a–c; Supplementary information, Fig. S7), whereas TID is anchored on SL3 of the 3’ flanking and 5’ flanking motifs of the reRNA through polar interactions via R130, R132 and R142 (Fig. 7d; Supplementary information, Fig. S7). In addition, H131 and Y148 of TID and W336 of RuvC stack against the nucleobases C(–150), G(–187) and A(–83) of reRNA, respectively, contributing to the interaction between ISFba1 TnpB and the reRNA scaffold. Sequence analysis of IS607 TnpBs from different bacterial species showed that these reRNA-recognizing residues of ISFba1 TnpB are conserved (Supplementary information, Fig. S8). These results suggest conserved reRNA-binding and consequently programmable endonuclease activities of the IS607 TnpB effector proteins. This conclusion is supported by our observation that many IS607 TnpB proteins displayed reRNA-guided dsDNA cleavage activity (Supplementary information, Fig. S2c).

Fig. 7: Recognition of the reRNA and target DNA.

a Inset shows the location of zoomed structure in b–d. b–d reRNA recognition by REC2 (b), RuvC and RBD (c) and TID (d). e Inset shows the location of zoomed structure in f–h. f TAM recognition by the TID and REC1 domains. g In vitro cleavage of dsDNA substrates by ISFba1 TnpB RNP with wild-type (WT) or mutant ISFba1 TnpB protein. The agarose gel was visualized by EB staining. Data shown are representative of three independent experiments. h RNA–DNA he
