Inteins: A Swiss army knife for synthetic biology

The central dogma of molecular biology describes the sequential, residue-by-residue flow of information between DNA, RNA and protein sequences Crick (1970). However, discrepancies were observed between the information encoded in DNA and the resulting RNA and proteins. These discrepancies were later reconciled through the phenomenon of splicing.

In 1977, a mechanism known as RNA splicing was discovered, which explained the variations between mRNA and DNA sequences Chow et al. (1977); Berget et al. (1977). RNA splicing plays an essential role in eukaryotic cells as it involves the removal of specific sequences, called introns, from pre-mRNA Wilkinson et al. (2020). The remaining sequences, known as exons, are then joined together (see Fig. 1(a)). This reaction can occur either through self-catalysis Sharp (1994) or with the assistance of ribonucleoprotein complexes like the spliceosome Wilkinson et al. (2020). The resulting product, referred to as mature mRNA, is subsequently exported from the nucleus and serves as the template for translation. A remarkable aspect of RNA splicing is its ability to generate a wide range of coding mRNAs from a single gene and thereby enable the production of diverse proteins. This cellular mechanism, known as alternative splicing, allows nucleotide sequences to be reused and rearranged, ultimately enhancing the cell's fitness by modifying the properties of its proteins Black (2003).

RNA splicing explains the disparities between DNA and mRNA sequences, but it does not account for the differences between mRNA and protein sequences. This puzzle was unraveled in 1990 when protein splicing was discovered Kane et al. (1990); Shah and Muir (2011); Mootz (2009). Since its discovery, this process has been observed in all domains of life Nanda et al. (2020). Intein-based protein splicing is autocatalytic and does not require additional cellular machinery such as spliceosomes Perler (2005). By analogy with introns and exons in RNA splicing, the corresponding components in protein splicing are termed inteins and exteins (see Fig. 1(a)). Once the intein is translated, it autocatalytically performs the protein splicing reaction by breaking and re-forming peptide bonds.

It is worth noting that, unlike RNA splicing, no apparent advantage has been found for host organisms harboring inteins. While there are speculations that inteins can benefit the host by regulating protein expression, inteins are generally believed to act as selfish or parasitic genes Nanda et al. (2020); Green et al. (2018). In fact, there is an observed negative correlation between the fitness of an organism and the number of inteins in its genome Nanda et al. (2020). Furthermore, many inteins naturally possess homing endonucleases, which facilitate their invasion of different genomic loci. Horizontal gene transfer, commonly performed by viral vectors, is widely speculated to be the primary mechanism by which inteins spread between organisms Novikova et al. (2014); Green et al. (2018). When an intein is introduced into a new host, the homing endonucleases target and disrupt conserved essential genes, such as polymerases, which the host cannot effectively silence Novikova et al. (2014). The insertion itself is not lethal due to the intein's scarless excision upon translation. However, this property makes it exceptionally challenging for hosts to eliminate these “parasites” since their removal necessitates a precise deletion of the intein sequence. In fact, there is strong selection pressure to optimize the intein-based splicing reaction in order to restore as much of the gene product as possible and mitigate the negative fitness impact incurred by the intein Shah and Muir (2014).

Another intriguing observation is the absence of inteins in the genomes of multicellular organisms Shah and Muir (2014). The reason behind this phenomenon remains unclear. However, one hypothesis posits that multicellular organisms are typically diploid or polyploid, meaning they possess multiple copies of the genome. In such cases, the chromosome without the intein can serve as a template for homologous recombination, effectively removing the intein entirely through a precise process.

Apart from the standard inteins described so far, which are also commonly referred to as “full” or “maxi” inteins, there are two additional types Nanda et al. (2020). The so-called “mini” inteins are considerably shorter due to in-frame deletions of the homing endonuclease. Although they can still catalyze the splicing reaction, they do not have the ability to invade other loci and are therefore no longer considered mobile genetic elements. The other type is known as the “split” intein Wu et al. (1998). As the name suggests, the split intein is separated into two parts, which are usually translated from two separate genes. The N-terminal part is abbreviated as IntN and the C-terminal part as IntC. It is assumed that split inteins naturally arise through a DNA strand break within the intein itself, followed by a translocation event Shah and Muir (2014). Splitting an essential gene at the intein is in general lethal unless the second part of the intein happens to be inserted downstream of another promoter. As demonstrated in Fig. 1(a) and (b), some split inteins have the ability to heterodimerize through a process called capture and collapse and perform a trans-splicing reaction, reconstituting the product of an essential gene by ligating the N-terminal extein of IntN with the C-terminal extein of IntC Shah et al. (2013).
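The difference between cis- and trans-splicing described above can be illustrated with a toy string model. This is only a minimal sketch, not a biochemical simulation: the class names, the function names, and the single-letter intein labels are hypothetical, and the model assumes that splicing either happens completely and without a scar or does not happen at all.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NFragment:
    """N-terminal extein fused to IntN (toy representation)."""
    extein: str
    intein_id: str  # hypothetical label identifying the split intein pair


@dataclass(frozen=True)
class CFragment:
    """IntC fused to the C-terminal extein (toy representation)."""
    intein_id: str  # hypothetical label identifying the split intein pair
    extein: str


def cis_splice(n_extein: str, intein: str, c_extein: str) -> str:
    """Excise a contiguous (maxi or mini) intein and join the exteins."""
    # The intein sequence is dropped entirely: scarless excision.
    return n_extein + c_extein


def trans_splice(n: NFragment, c: CFragment) -> Optional[str]:
    """Join two exteins carried on separate polypeptides.

    In this simplified model the two halves associate only when they
    come from the same split intein; otherwise nothing is produced.
    """
    if n.intein_id != c.intein_id:
        return None
    return n.extein + c.extein


if __name__ == "__main__":
    print(cis_splice("MKT", "XXXXXX", "SGH"))
    print(trans_splice(NFragment("MKT", "A"), CFragment("A", "SGH")))
    print(trans_splice(NFragment("MKT", "A"), CFragment("B", "SGH")))
```

In this picture, a gene interrupted by a split intein is restored at the protein level only when both fragments are expressed and their intein halves match, which is the property that the trans-splicing applications rely on.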

Mini inteins and split inteins, in particular, are frequently utilized in biochemistry. In addition to their natural occurrence, they can also be generated synthetically from maxi or mini inteins, respectively Pinto et al. (2020); Lin et al. (2013); Mootz (2009); Beyer et al. (2020); Aranko et al. (2014). In fact, several inteins have been successfully split into three pieces, with the additional third part referred to as IntM (middle part; see Fig. 1(a)) Sun et al. (2004); Lienert et al. (2013); Weis et al. (2023). IntM does not directly participate in the splicing reaction, but it plays an essential role in the proper dimerization and folding of all the intein fragments.

Inteins have a rich history of extensive utilization in various in vitro biochemical processes Shah and Muir (2014). They have gained significant popularity among chemists due to their safety, autocatalytic nature, rapid reaction rates, precise splicing, and compatibility with physiological conditions. This is particularly advantageous as organic molecules like proteins can be sensitive to harsh conditions, such as high temperatures or extreme pH values, which can limit the use of conventional chemical techniques. In fact, inteins have been successfully applied in diverse areas, including protein synthesis Agouridas et al. (2019), protein purification Prabhala et al. (2022), protein stabilization Hayes et al. (2021), protein labeling Volkmann et al. (2012), and biosensors Kang et al. (2022); Jeon et al. (2018) among others. Their use in these applications has proven effective and reliable, contributing to advancements in the field of biochemistry and protein manipulation.

While inteins have long been embraced and utilized by chemists, their widespread application within the synthetic biology community has been relatively limited until recently. However, synthetic biologists are increasingly recognizing the value of inteins, thanks to the discovery of new inteins, advancements in their synthetic generation, and more comprehensive in vivo characterization Wang et al. (2022); Shah and Stevens (2020); Dassa et al. (2009); Stevens et al. (2016).

Inteins form a large and diverse protein class with thousands of inteins identified (NCBI, 2024), exhibiting significant variations in size, speed, and efficiency Aranko et al. (2014). Additionally, they have evolved in different environments Hiltunen et al. (2021), resulting in varying temperature Carvajal-Vallejos et al. (2012), salt Ciragan et al. (2016), and pH requirements Lahiry et al. (2018). An in vivo cross-comparison study conducted by Pinto et al. in E. coli stands out as one of the most comprehensive thus far Pinto et al. (2020). The study involved the insertion and disruption of mCherry with 34 different mini inteins. The efficiency of the cis-splicing reaction was assessed by quantifying the reconstitution of mCherry. Based on these results, a library of 20 reliable split inteins was generated. Three split sites were selected for every intein, generating a new library of 60 split intein pairs. The trans-splicing properties of the split versions were quantified in a similar fashion by evaluating the reconstitution of the two split mCherry proteins expressed from separate promoters. The successful pairs were then cross-compared, leading to the creation of a highly valuable library containing 15 mutually orthogonal split inteins.
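To make the idea of a mutually orthogonal library concrete, the following sketch selects a subset of split inteins from a table of pairwise reconstitution signals. It is an illustration under stated assumptions and not the procedure used by Pinto et al.: the function name, the normalized signal values, the threshold, and the greedy selection rule are all hypothetical.

```python
from typing import Dict, List, Tuple

# signal[(n, c)] is a hypothetical normalized reporter signal measured when
# IntN of intein `n` is co-expressed with IntC of intein `c`.
Signal = Dict[Tuple[str, str], float]


def mutually_orthogonal(signal: Signal, names: List[str],
                        threshold: float) -> List[str]:
    """Greedily pick inteins that splice with their own partner only.

    An intein is kept when its cognate pair reaches the threshold and
    neither of its halves reaches the threshold with any intein that
    has already been kept. Missing measurements are treated as 0.0.
    """
    kept: List[str] = []
    for cand in names:
        if signal.get((cand, cand), 0.0) < threshold:
            continue  # the cognate pair itself does not splice well enough
        no_crosstalk = all(
            signal.get((cand, k), 0.0) < threshold
            and signal.get((k, cand), 0.0) < threshold
            for k in kept
        )
        if no_crosstalk:
            kept.append(cand)
    return kept


if __name__ == "__main__":
    # Made-up example values for three hypothetical inteins A, B and C.
    example: Signal = {
        ("A", "A"): 0.9, ("B", "B"): 0.8, ("C", "C"): 0.85,
        ("A", "B"): 0.6, ("B", "A"): 0.1,
        ("A", "C"): 0.05, ("C", "A"): 0.02,
        ("B", "C"): 0.0, ("C", "B"): 0.0,
    }
    print(mutually_orthogonal(example, ["A", "B", "C"], threshold=0.5))
```

A greedy pass depends on the order of the candidates and does not guarantee the largest possible orthogonal set; it is used here only because it keeps the example short.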

In this review, we will highlight how synthetic biology can benefit from such libraries and explore how inteins have been employed in the past to address a wide range of problems. In the upcoming sections, we will explore the difficulties associated with the utilization of inteins and strategies for surmounting these difficulties. Additionally, we will delve into the realm of conditional inteins and their practical applications. Lastly, we will discuss the utilization of inteins as crucial genetic components for executing computations and implementing feedback control mechanisms.
