Hidden secrets of the cancer genome: unlocking the impact of non-coding mutations in gene regulatory elements

The Human Genome Project, which generated the first map of the human reference genome, marked a pivotal milestone in genetics research. However, the significance of non-coding genomic regions, formerly considered “junk DNA,” remained largely unexplored. The convergence of large-scale sequencing technologies and computational biology pipelines in the field of functional genomics has revealed the importance of non-coding regions in orchestrating gene expression programs [1]. These regions, collectively known as gene regulatory elements (GREs) [2], have been classified based on their impact on gene expression into gene promoters, enhancer elements (EEs), insulator elements (IEs), and silencers. Additionally, genomic alterations in GREs, ranging from single nucleotide variants (SNVs) to larger structural variants (SVs), can disrupt the expression of regional as well as distant genes in disease states, particularly in cancer [3]. Consequently, these previously overlooked genetic modifications can dramatically impact normal gene expression programs [4, 5] by affecting the binding of transcription factors (TFs) [6], altering genome organization [7], modulating chromatin accessibility [8], or changing regional DNA methylation levels [9] at GREs.

Two main types of mutations that play a pivotal role in various diseases are involved in GRE dysregulation: germline single nucleotide polymorphisms (SNPs) and somatic SNVs [10]. Notably, genome-wide association studies (GWAS) have linked different SNPs located within non-coding regions to various types of cancer [11]. In parallel, projects such as the Pan-Cancer Analysis of Whole Genomes (PCAWG) have identified thousands of non-coding somatic SNVs in numerous cancer types [12, 13]. Regardless of their origin, these point mutations are enriched within the transcription factor binding sites (TFBSs) of GRE sequences in cancer [14,15,16,17]. This area of functional genomics opens an opportunity to leverage the clinical utility of non-coding mutations in different disease states, particularly in cancer, where they may refine diagnostic, prognostic, and predictive models and thereby improve patients' clinical outcomes. While the impact of large SVs on precision oncology has been discussed elsewhere [5], this review provides an overview of recent findings on the functional impact of non-coding somatic and germline single nucleotide alterations affecting GREs in cancer. Given the growing body of evidence highlighting the clinical significance of SNVs within non-coding regions of the genome, there has been a surge of innovation in technologies aimed at their comprehensive characterization and at the exploration of their intricate molecular mechanisms [18, 19], which are also discussed.

Types and definitions of gene regulatory elements

GREs are defined by a specific combination of histone marks and conglomerates of TFBSs [20,21,22,23]. Based on their impact on the expression of regional as well as distant genes, GREs are classified into promoters, enhancer elements (EEs), insulator elements (IEs), and silencers (Fig. 1). Regarding the annotation of GREs, it is crucial to acknowledge the work conducted by the ENCODE (Encyclopedia of DNA Elements) consortium, which employed different experimental techniques (including ChIP-seq of TFs and histone marks and RNA-seq, among others) to characterize the regulatory elements in the human genome [24, 25]. This section describes each type of GRE to clarify the impact of single nucleotide variations on cancer biology.

Fig. 1

Schematic overview of GREs and their chromatin interactions. The figure shows a DNA strand with promoters (P with red circles), enhancer elements or super-enhancer elements (EEs or SEEs in yellow), insulator elements (IEs in green) with CTCF binding, and a silencer element (grey). The representation includes a Topologically Associating Domain (TAD), transcription factors (TFs), the mediator protein complex, histones, and the histone marks of each type of GRE

Gene promoters comprise sequences upstream of the transcription start site (TSS), where the transcription machinery is assembled [26]. Many genes have been described to have alternative TSSs [27]; as a result, different promoters can be associated with a single gene. However, the impact of a gene promoter is usually limited to a single nearby gene. On the other hand, EEs are defined by clusters of TFBSs whose activation may affect the expression of both regional and distant genes through the cooperative recruitment of coactivators [28]. Thus, these cis-regulatory elements have a highly variable location relative to their target genes [29]. EEs are activated or repressed in a spatiotemporal manner to define cellular fate during development [30]. As a consequence of their activation, the chromatin is looped, bringing EEs and promoters into proximity through the action of the mediator and cohesin complexes [31]. Moreover, a single EE can regulate multiple genes, and one gene can be regulated by multiple EEs [32]. In addition, conglomerates of EEs have been defined as super-enhancer elements (SEEs). These GREs span large genomic regions and are enriched in binding motifs for master TFs and cofactors [33, 34]. Multiple TFs can occupy SEEs, modulating gene expression through SEE-promoter interactions and forming core transcriptional regulatory circuits [35]. These elements are capable of driving cell-type-specific genes involved in key homeostatic functions and in defining cell fates. Thus, the alteration of SEEs has been demonstrated to be crucial for tumor development and progression, as well as for therapeutic drug resistance or insensitivity [36, 37]. Another type of GRE, the silencer element, has the opposite effect to that of EEs. These regulatory elements repress gene expression by blocking the aggregation of TFs on either the gene promoter or upstream regulatory elements [23, 38]. Moreover, dual-function regulatory elements (REs) have been characterized in Drosophila [39], yet their presence in mammals remains largely unexplored. These genomic regions exhibit the capacity to function as both EEs and silencer elements. Notably, more than 5% of human silencers display dual-function RE properties, underscoring the versatility of REs [40]. Finally, interactions between gene promoters and EEs can be influenced by another type of GRE that acts as a boundary element, known as the insulator element (IE) [41]. These GREs are responsible for generating and maintaining the chromatin structural units called Topologically Associating Domains (TADs), which divide the genome into different compartments and confine the interactions of GREs within each TAD [42]. Thus, alterations affecting IEs disrupt TAD organization and have also been confirmed to contribute to tumorigenesis [43]. Activation of IEs mainly involves the binding of two critical proteins, CCCTC-binding factor (CTCF) and cohesin (RAD21) [44, 45]. Therefore, dysregulation of IEs alters gene expression programs by reshaping the landscape of promoter-EE interactions. Apart from single nucleotide mutations involving CTCF binding sites, many IEs can be impaired through abnormal DNA methylation [46, 47].

Cancer-associated non-coding single nucleotide mutations in GREs

Numerous SNPs and SNVs have been identified outside of coding genomic regions [48, 49]. Mechanistically, these alterations can influence the stability of GREs, shifting the balance between the expression of tumor suppressor genes and oncogenes [50,51,52]. In this context, genomic alterations that lack measurable biological or phenotypic effects are often referred to as "passenger mutations" [53], whereas mutations conferring advantages to tumors are denoted as "driver mutations". The latter can be further categorized as either "major drivers" or "mini drivers", based on the magnitude of their impact [54]. Another important factor in determining the impact of an SNV is the type of GRE affected. Tables 1, 2, and 3 highlight the most important single nucleotide mutations associated with cancer, including both SNPs and somatic SNVs, that affect promoters, EEs, and IEs, respectively.

Table 1 Non-coding mutations in promoter regions with impact on cancer development

Table 2 Non-coding mutations in EEs and SEEs promoting alterations in TF affinity with impact on cancer development

Table 3 Single nucleotide mutations impacting IEs stability on cancer development

Non-coding single nucleotide mutations within gene promoters

SNPs in promoter regions that disrupt TFBSs have been studied across various tumor types, including lung cancer [55], hepatocellular carcinoma [56], neuroblastoma [57,58,59], and breast cancer [60,61,62]. A well-described example of germline single nucleotide mutations in tumorigenesis is the set of SNPs located in the promoter region of the oncogene Murine Double Minute 2 homolog (MDM2) [63]. MDM2, which is under the control of two distinct promoters, P1 and P2 [64], can negatively modulate the tumor suppressor p53, targeting it for proteasomal degradation [65]. For example, the G allele of rs2279744 (known as SNP309), located in the P2 promoter, increases MDM2 expression by elongating the Sp1 TFBS. This alteration significantly reduces the levels of the tumor suppressor p53 [66], ultimately enhancing the risk of cancer development in humans, as depicted in Fig. 2A. In the context of melanoma pathogenesis, the SNP309 variation generates a stronger E2F1 binding site (Fig. 2B), which is responsible for cyclin D1 modulation and tumor proliferation [67]. Another germline mutation described within this promoter (rs117039649), located just 24 bp upstream of SNP309, has the opposite impact, reducing the Sp1 binding affinity and, therefore, the expression levels of MDM2 in ovarian and breast cancer [68]. Furthermore, a third SNP (rs2870820) found in the MDM2 promoter, known as SNP55, leads to allele-specific expression by impairing NF-κB binding (Fig. 2C) [69]. Thus, the MDM2 gene highlights the complex interplay between genetic variation and gene regulation, demonstrating that the same promoter can be affected by different SNPs with substantially different effects on pathogenesis.

Fig. 2

SNPs in the MDM2-P2 promoter and their oncogenic consequences. A The germline alteration rs2279744 promotes Sp1 binding, diminishing the p53 tumor suppressor pathway [66]. B The same mutation in the MDM2-P2 promoter generates a strong affinity for the E2F1 TF, modulating cyclin D1 and promoting tumor proliferation [67]. C rs2870820 (SNP55C > T) is related to MDM2-P2 transcriptional activity. The SNP55 C allele has affinity for NF-κB p50 homodimers, which suppress transcription of the oncogene MDM2. The T allele, however, does not retain this affinity for NF-κB p50, favoring oncogene transcription [69]

Somatic SNVs have been identified as affecting gene promoters in different cancer types as well [70,71,72]. One of the most relevant findings concerns the human telomerase reverse transcriptase (TERT) gene [73, 74]. In glioblastoma, Bell et al. discovered two somatic SNVs (chr5:1,295,411; G > A and chr5:1,295,433; G > A) in the TERT core promoter, which led to enhanced GABP recruitment [75]. In melanoma, the TERT promoter contains two highly recurrent somatic SNVs (chr5:1,295,228; C > T, and chr5:1,295,250; C > T) that allow the binding of ETS TFs [76]. The consequence of the increased affinity of these TFs is the reactivation of TERT, a common mechanism in multiple cancers that allows cells to bypass replicative senescence [76]. Another example is found in the promoter of SEMA3C, a gene related to tumor development in glioma stem cells [77]. A somatic SNV (chr7:80,552,013; T > C) has been found to modify the binding affinity of several TFs, such as RUNX1, ZNF354C, FOXA2, and EN1. Importantly, this mutation alters the binding site for FOXA1 in the SEMA3C promoter, leading to reduced TF binding to the region [78]. Similarly, a somatic SNV in the FOXA1 promoter region (chr14:38,064,406; G > A) has been detected in primary breast cancers [79]. The mutant motif creates a stronger binding site for TFs of the E2F family, promoting high expression levels of FOXA1. This gene works as a transcriptional pioneer factor in breast cancer, enhancing chromatin accessibility for the interaction of the estrogen receptor with its genomic targets [80], and has been linked to a decreased response to fulvestrant, an estrogen receptor antagonist [81, 82]. In melanoma, the SDHD promoter contains different C > T transitions within the core ETS TF binding motifs, such as C524T and C523T, specifically affecting the binding of GABPA, GABPB1, and ETS1 [71, 83]. These alterations lead to decreased expression of SDHD, which is associated with an unfavorable prognosis [83]. Furthermore, in primary liver cancer, Lowdon et al. identified a somatic mutation (chr4:81,187,908; A > T) in the FGF5 promoter region, which generates a new MYC binding site and enhances FGF5 expression [84]. SNVs at promoter regions affecting gene expression in cancer have been compiled in Table 1.
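
Several of the promoter SNVs described above act by strengthening or weakening a TF binding motif. As a rough illustration of how such an effect is commonly estimated in silico, the following sketch scores a reference and a mutated sequence against a position weight matrix and reports the difference. The count matrix, the 12-bp sequence, and the function names are all invented for illustration; they are not taken from the studies cited here and do not represent a curated ETS, GABP, or any other real TF motif.

```python
import math

# Toy count matrix for a hypothetical 6-bp motif (consensus CCGGAA).
# ASSUMPTION: these counts are invented and are not a curated TF matrix.
TOY_COUNTS = {
    "A": [2, 1, 0, 0, 18, 17],
    "C": [10, 12, 0, 1, 1, 1],
    "G": [4, 3, 20, 19, 0, 1],
    "T": [4, 4, 0, 0, 1, 1],
}


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2 odds against a uniform background."""
    width = len(counts["A"])
    matrix = []
    for i in range(width):
        total = sum(counts[b][i] + pseudocount for b in "ACGT")
        matrix.append(
            {b: math.log2((counts[b][i] + pseudocount) / total / background)
             for b in "ACGT"}
        )
    return matrix


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def best_score(seq, matrix):
    """Highest-scoring motif window found on either strand of seq."""
    width = len(matrix)
    scores = []
    for strand in (seq, reverse_complement(seq)):
        for start in range(len(strand) - width + 1):
            scores.append(sum(matrix[i][strand[start + i]] for i in range(width)))
    return max(scores)


def delta_score(ref_seq, pos, alt, matrix):
    """Best score of the mutated sequence minus best score of the reference.

    pos is a 0-based index into ref_seq; alt is the substituted base.
    """
    alt_seq = ref_seq[:pos] + alt + ref_seq[pos + 1:]
    return best_score(alt_seq, matrix) - best_score(ref_seq, matrix)


PWM = log_odds_matrix(TOY_COUNTS)
REF = "TGACCGGAGGTC"  # hypothetical reference sequence, not a real locus
gain = delta_score(REF, 8, "A", PWM)  # G > A at index 8 creates CCGGAA
print(f"delta score: {gain:.2f}")
```

Under these assumptions, a positive difference suggests a predicted gain of binding and a negative one a predicted loss. Scores of this kind are only a prioritization aid; allele-specific effects on TF binding still require experimental confirmation.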

Single nucleotide mutations affecting enhancer and super-enhancer elements in cancer

Non-coding single nucleotide mutations within EEs and SEEs have been shown to disrupt critical TFBSs and influence transcriptional regulation through intricate interactions between these genetic variations and the epigenomic landscape. GWAS studies have demonstrated this phenomenon across a spectrum of cancer types, including but not limited to ovarian cancer [
