Genome characterization and CRISPR-Cas9 editing of a human neocentromere

Neocentromeres and evolutionary new centromeres (ENCs), defined as repositioned centromeres fixed in primate species that distinguish orthologous chromosomes (Cardone et al. 2007; Rocchi et al. 2009), have been extensively characterized, and the existence of latent centromeres has been proposed as a possible reason for the emergence of neocentromeres (Ventura et al. 2003). Nevertheless, little is known about the mechanisms underlying their formation and evolution and, more generally, data explaining the essential features of centromeric function are scarce, mostly because the highly repetitive nature of primate centromeres has long hindered accurate molecular characterization of these regions (Murillo-Pineda and Jansen 2020). Recently, the first telomere-to-telomere assemblies of human chromosomes, which include the repeat arrays at centromeres and pericentromeres, have been produced with breakthrough technologies, opening the field to previously impossible analyses (Logsdon et al. 2021; Miga et al. 2020). These assemblies will allow, for example, detailed intra- and inter-species comparisons of the genomic sequence, organization, and epigenetics of centromeres, neocentromeres, and ENCs without the limitation of the sequence gaps mostly caused by repetitive sequences.

With this work, we investigate two main aspects: (i) we characterize the sequence and organization of a region harboring a human neocentromere that can be inherited through successive generations, and (ii) we test how this chromosome evolves after a chromosome break. The questions we have attempted to answer are: is the canonical centromere pushed to reactivate in a hybrid cell line? What is the fate of two chromosome fragments, one with an active and one with an inactive centromere, in a context of low selective pressure?

With these questions in mind, starting from a stabilized cell line derived from a male fetus (Ventura et al. 2004) heterozygous for Neo3, we created a somatic cell hybrid to genetically isolate Neo3 from Cen3, its wild-type homolog. This cell line represents a useful cellular model to study the peculiar characteristics underlying the formation and, if followed over time, the evolution of a centromere. By means of a ChIP-on-chip experiment, we defined a segment of about 160 kb as the main region affected by centromeric repositioning (Fig. 1). In addition, we sequenced the main domain of the neocentromere and its flanking regions by long-read methods, for a total of about 300 kb, revealing no major differences from the human reference genome (GRCh38/hg38).

No structural variation is present in the assembled sequence. Both regions (Neo3 and the corresponding region on GRCh38/hg38) are gene deserts, as expected. They show comparable features: both are AT-rich and have a repeat content above 40%. An AT-rich base composition is considered a prerequisite for the establishment of a new centromere, since centromeric regions are AT-rich owing to the presence of alpha satellite. However, the Neo3 sequence, like that of almost all neocentromeres, is AT-rich but devoid of alpha satellite, which points to the importance of epigenetic control. In this view, sequencing neocentromeres advances our knowledge of the structural features underlying their formation. Interestingly, the number of LINE-1 (L1) insertions is increased in the Neo3 sequence (File S5), in agreement with the view that L1 plays a role in regulating neocentromere activity (Chueh et al. 2009). However, the LINE-1 family is reported to insert preferentially in regions of constitutive AT-rich heterochromatin (Acosta et al. 2008; Marsano and Dimitri 2022; Waters et al. 2004), and the higher number of elements found in the sequenced Neo3 region is probably a consequence of this preference. In addition, these extra LINE-1 elements enlarged the Neo3 region relative to the wild-type sequence.
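The two sequence features compared above, AT content and repeat content, can be estimated directly from a sequence. The following is a minimal Python sketch, not the pipeline used in this study: the function names are hypothetical, and the repeat estimate assumes a soft-masked input in which annotated repeats are written in lowercase.

```python
def at_fraction(seq: str) -> float:
    """Return the fraction of A/T among unambiguous bases (A, C, G, T).

    Ambiguous symbols such as N are ignored.
    """
    s = seq.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    if at + gc == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return at / (at + gc)


def soft_masked_fraction(seq: str) -> float:
    """Return the fraction of positions written in lowercase.

    Assumption: the input is soft-masked, so lowercase marks repeats.
    """
    if not seq:
        raise ValueError("empty sequence")
    return sum(1 for base in seq if base.islower()) / len(seq)


if __name__ == "__main__":
    toy = "AATTATGCaattttaaGCAT"  # toy sequence, not real Neo3 data
    print(f"AT fraction: {at_fraction(toy):.2f}")
    print(f"Soft-masked fraction: {soft_masked_fraction(toy):.2f}")
```

Applied to the Neo3 assembly and to the matching GRCh38/hg38 interval, the same two numbers would give a like-for-like comparison of the kind described in the text.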

Therefore, our data consolidate the hypothesis that neocentromere formation and, more generally, centromeric function are essentially epigenetic, as previously postulated (Gary et al. 1997), but they also open the way to currently purely speculative considerations on a possible structural role played by retrotransposons such as L1.

In the last decade, the introduction of editing methods such as CRISPR-Cas9 has provided an accessible tool for genome manipulation. Different genomic structural variations have been induced for very different purposes (Blasco et al. 2014), and neocentromere formation by deletion of the endogenous centromere has been induced in different model organisms such as Schizosaccharomyces pombe, Candida albicans, Cryptococcus deuterogattii, or chicken cells (Ishii et al. 2008; Ketel et al. 2009; Schotanus and Heitman 2020; Shang et al. 2010). Recently, CRISPR-Cas9 methods have been successfully used for the first time to induce the seeding of a neocentromere (on chromosome 4) in the complex context of human cells, by excising an 8-Mb centromeric region and thus providing an excellent system to study the chromosomal site “before” and “after” the centromere activation (Murillo-Pineda et al. 2021). Although the neocentromere region was gene poor, neither sequence nor transcription changes have been revealed at the seeding site after 200 cell divisions, indicating that the satellite acquisition observed at newly formed centromeres over the course of evolution takes much longer evolutionary times (Rocchi et al. 2012; Tolomeo et al. 2017).

We here apply the CRISPR-Cas9 technology and induce a peri-neocentromeric break in chromosome 3 to generate a large acentric fragment containing the inactivated Cen3 and a small 3q terminal section harboring Neo3.

Our results show that no reactivation of the canonical, alpha-satellite-rich Cen3 was induced to rescue this acentric fragment. Although Cen3 likely bears the “memory” of an active centromere (Cardone et al. 2006; Ventura et al. 2003; Ventura et al. 2004), the absence of selective pressure probably worked against its reactivation. Indeed, in a hybrid cell line, the human chromosomal content has no role in cell propagation, so the presence or absence of one or more chromosomes has no effect. On the other hand, the absence of a homologous chromosome allowed the recovery of an apparently high number of chromosomal rearrangements (7 out of 20 isolated clones, 35%) detectable only by cytogenetic methods (Rayner et al. 2019). Importantly, for a linear chromosome to be stable in a cell line, a functional centromere is not sufficient: two intact telomeres are also essential. Therefore, linear chromosomes with terminal breaks are rescued if they are stabilized by further structural rearrangements (O’Sullivan and Karlseder 2010).
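Because the 35% figure rests on only 20 clones, its uncertainty is wide. As a rough illustration, and not an analysis performed in this study, the sketch below computes a Wilson score interval for a proportion. The function name and the choice of a 95% level (z = 1.96) are assumptions made for the example.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Return the Wilson score interval (lower, upper) for a binomial proportion.

    z = 1.96 corresponds approximately to a 95% confidence level.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half_width, center + half_width


if __name__ == "__main__":
    # 7 rearranged clones out of 20 isolated clones, as reported in the text.
    low, high = wilson_interval(7, 20)
    print(f"Observed: {7 / 20:.0%}; Wilson interval: {low:.1%} to {high:.1%}")
```

The Wilson interval is used here rather than the simpler normal approximation because it behaves better for small samples.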

Our genome editing procedure allowed us to follow the fate of the two derivative chromosomal fragments: a large, acentric, satellite-rich fragment of roughly 147 Mb, and a small (about 51 Mb) acrocentric fragment containing the neocentromere. Following the breakage, the larger fragment was completely lost, likely because it could neither repair the terminal damage nor activate any centromere. In contrast, the small terminal fragment containing Neo3 stabilized in three different ways: by forming a small acrocentric chromosome, by fusing with hamster chromosomes, or by creating a small metacentric chromosome (Fig. 2, S6 and S7).

Of particular interest is the clone shown in File S8, where a partial duplication of the signal produced by the distal probe (green signal) appears near the break site. The duplication likely includes telomeric sequences that stabilize the broken chromosomal fragment, but further experiments are needed to verify this.

Detailed cytogenetic characterization of the small metacentric chromosome showed that it was, in fact, a newly formed pseudodicentric isochromosome with a functional centromere, in which the terminal 3q composes both chromosomal arms (Fig. 2f, g). Isochromosomes form as a consequence of centromeric misdivision, a transverse division that separates the p and q arms (Wolff et al. 1996). Fig. 4 shows a model of the mechanism leading to the derivative isodicentric chromosome we found.

Fig. 4 Model of isochromosome formation. Shown is a hypothetical model of the creation of the isochromosome containing two copies of the region 3q24 to qter, after the induced breakage of Neo3 by CRISPR-Cas9 methods.

Although strongly negatively selected in vivo, isochromosomes are far from rare in clinical cases, where they are associated with neoplasia (Mertens et al. 1994) and genetic disorders such as Turner (Dalton et al. 1998) and Pallister-Killian (Izumi and Krantz 2014) syndromes. Interestingly, the specific occurrence of 3q small supernumerary marker chromosomes (sSMCs) has already been reported (Barbi et al. 2003; Cunha et al. 2016; Gimelli et al. 2007; Izumi et al. 2008), and previous studies have proposed the presence of the BCL6 gene (3q27.3) as an explanation for the positive selection of cells containing multiple copies of this small fragment of the genome. This gene is, indeed, considered responsible for the proliferative advantage seen in lymphomas (Batanian et al. 2006).

However, isochromosome formation following a chromosome break has already been described in other systems, for example in maize (Douglas et al. 2021).

In conclusion, we have shown that the selective pressure exerted by a living organism, or in vitro by a cell line, is essential to rescue chromosome fragments derived from a double-strand break.

We have also shown that CRISPR-Cas9 technology makes it possible to generate a pseudodicentric isochromosome in an in vitro system. This will be useful for building cellular models that simulate the biological and pathological conditions in which isodicentric chromosomes are often observed.
