Making sense of the linear genome, gene function and TADs

Characteristics of TADs in cortical neurons and embryonic stem cells (ESCs)

We analysed ESC and cortical neuron Hi-C data from Bonev et al. [25] using the Juicer pipeline [26]. These data represent some of the highest quality mammalian Hi-C data currently in the public domain. Throughout this work we have focused on autosomal TADs, so unless explicitly stated otherwise, "TADs" refers to autosomal TADs only. We annotated 8371 (median size 0.29 Mb) and 5950 (median size 0.29 Mb) TADs in ESCs, and 8001 (median size 0.32 Mb) and 5430 (median size 0.33 Mb) TADs in cortical neurons, using two TAD callers, Arrowhead [26] and TopDom [27], respectively (Fig. 1A–B). We detected similar sized TADs with Arrowhead and TopDom for both cell types. Our results confirm the finding from Bonev et al. (found using the directionality index TAD calling method) that there are more and smaller TADs in ESCs than cortical neurons [25].

Fig. 1

Features of autosomal TADs in ESCs and cortical neurons. A The number of TADs called using Arrowhead and TopDom in ESCs and cortical neurons. More TADs are called in ESCs than cortical neurons with both TAD callers. B Size of TADs called using Arrowhead and TopDom in ESCs and cortical neurons (plotted on a log10 scale). Both Arrowhead and TopDom call significantly smaller TADs in ESCs than cortical neurons (Wilcoxon test, p-value: p < 0.001 = ***, p < 0.01 = **, p < 0.05 = *, ES = effect size calculated using r for Wilcoxon). C The number of TADs per chromosome is strongly correlated with the size of the chromosome. D In both cell types and with both TAD callers most TADs have few genes. Overall, there is a low correlation between TAD size and gene number. Several TADs containing many genes were further investigated and found to contain multiple members of large gene families (annotated)

To learn more about the distribution of TADs in the genome, we looked at their association with chromosomes and protein-coding genes. Matching the expected null, we observed that, for both TAD callers and datasets, the number of TADs on a chromosome correlates strongly with the size of the chromosome (r range 0.94 to 0.98) (Fig. 1C).

Most TADs contain relatively few genes (mean number of genes: ~ 2.74 and ~ 3.49 (Arrowhead), and ~ 2.15 and ~ 2.62 (TopDom) in ESCs and cortical neurons, respectively) and there is little correlation between the number of genes within a TAD and the TAD size (r range −0.04 to 0.25) (Fig. 1D). We investigated several TADs which contained a large number of genes and found that they contained genes from large paralog families, e.g. olfactory genes and protocadherin (Fig. 1D). This is consistent with previous studies which have noted that genes from these functional families tend to fall within the same TAD, likely due to their shared regulatory requirements [12].

TAD randomisation

In order to globally assess the functional similarity between genes in the same TADs, we sought to synthesise “random TADs” representing the null distribution. We developed two randomisation strategies in order to generate two null distributions controlling for different possible confounding signals. In the first randomisation strategy, which we refer to as random TADs, we maintained the basic structure of real TADs, i.e. TAD size, number of genes within the TAD and the approximate TAD overlap structure. This allowed us to control for the influence of linear gene order and distance, which are known to correlate with gene functional similarity [23, 24]. In this randomisation strategy, the position of each TAD was randomised within the same chromosome to a new region of the same size, containing the same number of genes as the original TAD. For TopDom random TADs, TAD overlapping was prevented, reflecting the non-overlapping structure of TopDom TADs. For Arrowhead random TADs, if the new random TAD overlapped an existing random TAD, the overlap was controlled in order to favour “nested” TADs, thereby approximating the global TAD overlap structure seen in Arrowhead TADs (see “Methods”) (Figs. 2B, 3A). In the second randomisation method, which we refer to as random genome TADs, we again maintained the basic TAD structure but removed the signal attributed to linear gene order. In order to do this, the positions of TADs were maintained but the identities of the genes in the genome were randomly shuffled within each chromosome (Fig. 2C). Using both randomisation strategies allows us to disentangle the functional similarity of genes within the same TAD from the functional similarity which can be attributed to proximity in the linear genome. Each TAD randomisation method was run 100 times for each cell type and each TAD caller.
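The two null models described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the representation of genes as (start, gene ID) tuples, the rule that a gene belongs to a region when its start coordinate falls inside it, and the rejection-sampling loop are all assumptions made for the example. The overlap and nesting control used for TopDom and Arrowhead random TADs is deliberately left out.

```python
import random
from bisect import bisect_left


def shuffle_gene_identities(genes, rng):
    """Random genome sketch: keep gene positions, shuffle which gene sits where.

    genes: list of (start, gene_id) tuples for ONE chromosome (assumed layout).
    """
    starts = [start for start, _ in genes]
    ids = [gene_id for _, gene_id in genes]
    rng.shuffle(ids)
    return list(zip(starts, ids))


def count_genes(sorted_starts, lo, hi):
    """Number of gene starts in the half-open interval [lo, hi)."""
    return bisect_left(sorted_starts, hi) - bisect_left(sorted_starts, lo)


def relocate_tad(tad, sorted_starts, chrom_len, rng, max_tries=10000):
    """Random TAD sketch: same size and same gene count, new position.

    Uses simple rejection sampling (an assumption, not the published method).
    Returns None if no matching region is found within max_tries.
    """
    start, end = tad
    size = end - start
    target = count_genes(sorted_starts, start, end)
    for _ in range(max_tries):
        new_start = rng.randrange(0, chrom_len - size + 1)
        if count_genes(sorted_starts, new_start, new_start + size) == target:
            return (new_start, new_start + size)
    return None
```

Running each function 100 times with different seeds would give sets of null TADs analogous to the 100 randomisations per cell type and TAD caller described above.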

Fig. 2

Randomisation and functional analysis procedure. A Schematic representing the structure of annotated TADs. B Null dataset one: random TADs. TADs were randomised within the same chromosome by selecting regions of equal size to the original TAD which also contain the same number of genes, thus controlling for the effect of the linear genome. C Null dataset two: random genome TADs. In order to remove the effect of the linear genome another null TAD set was generated in which the TADs remained in the same positions but the order of genes on the chromosome was randomised. D Pairwise strategy for comparing functional similarity between genes in the same TAD. All possible pairs of genes in each TAD were compared

Fig. 3

Features of autosomal TADs vs random TADs. A TADs (black) vs an example set of random TADs (blue) shown on the Hi-C matrix for the equivalent region of Chr2 in both ESCs and cortical neurons (CN). Matrices visualised using JuiceBox. B Median distance between gene start coordinates in TADs (dotted line) vs the median distance between genes in 100 sets of random TADs (plotted on a log10 scale). Genes are significantly closer together in random TADs than TADs (Wilcoxon test, median p-value: p < 0.001 = ***, p < 0.01 = **, p < 0.05 = *, ES = median effect size calculated using r for Wilcoxon). C Proportion of TADs with a CTCF binding site within ± 10 kb of both boundaries, one boundary or neither boundary. As expected a greater proportion of TADs have a CTCF binding site near both boundaries than in an example set of random TADs

In order to compare the functional similarity of genes within TADs to genes within random TADs or random genome TADs, we adopted a pairwise comparison approach (Fig. 2D). For every feature investigated, every possible pair of genes within a TAD/random TAD was compared in order to generate a distribution of scores. As TADs detected using Arrowhead can be nested or overlapping, the same pair of genes can exist in multiple Arrowhead TADs. In these cases each gene pair was considered only once. This means that for nested TADs (one TAD falls fully within another) pairs of genes from the largest of the nested TADs were considered, and for overlapping TADs (only part of one TAD falls within another) pairs of genes from both TADs were considered, with pairs of genes from the overlapping region considered once. The distribution of scores for each feature in TADs was compared to each of 100 sets of random TADs and the median p-value was reported.
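The pair enumeration above, including the rule that a gene pair shared by nested or overlapping TADs is counted once, can be sketched as follows. The function name and the input layout (one collection of gene IDs per TAD) are assumptions for illustration only.

```python
from itertools import combinations


def unique_gene_pairs(tads):
    """Collect every within-TAD gene pair once, even if it occurs in several TADs.

    tads: iterable of gene-ID collections, one per TAD (assumed layout).
    Pairs are stored as sorted tuples so (a, b) and (b, a) are the same pair.
    """
    pairs = set()
    for gene_ids in tads:
        for a, b in combinations(sorted(set(gene_ids)), 2):
            pairs.add((a, b))
    return pairs
```

Because a nested TAD contains only pairs that are already present in its enclosing TAD, taking the set union is equivalent to keeping pairs from the largest of the nested TADs.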

To assess the gene distribution within TADs, we compared the distance between genes in TADs to genes in random TADs. We found that for both TAD callers, and both cell types, genes are significantly further apart in TADs than in random TADs (Fig. 3B, median p-value < 0.001).

It has previously been shown that TAD boundaries are enriched for CTCF, which is hypothesised to play a crucial role in TAD formation by loop extrusion [5,6,7,8]. To assess this in our data we tested for the presence of CTCF ChIP-seq peaks near TAD boundaries vs random TAD boundaries (Fig. 3C, Additional file 1: Fig. S2). We observed that ~ 62%, 59%, 28%, 35% of ESC Arrowhead TADs, ESC TopDom TADs, cortical neuron Arrowhead TADs and cortical neuron TopDom TADs, respectively, had a CTCF ChIP-seq peak within ± 10 kb of both TAD boundaries. This is compared to ~ 29%, 31%, 4.5%, 4.9% of ESC Arrowhead random TADs, ESC TopDom random TADs, cortical neuron Arrowhead random TADs and cortical neuron TopDom random TADs, respectively. Supporting previous reports [4, 6, 28, 29], this suggests that CTCF binding is common at the boundaries of TADs and is more prevalent than expected if TADs were randomly placed. This result also shows that more ESC TADs have a CTCF ChIP-seq peak near both boundaries than cortical neuron TADs. This could be due to a reduction in the number of chromatin domains formed by loop extrusion involving CTCF during differentiation. We noted that this still left 30%, 30%, 46% and 42% which had a CTCF ChIP-seq peak near only one boundary and 7.9%, 12%, 27% and 23% which did not have a CTCF ChIP-seq peak near either boundary, in ESC Arrowhead TADs, ESC TopDom TADs, cortical neuron Arrowhead TADs and cortical neuron TopDom TADs, respectively. This suggests that these domains may not be formed by loop extrusion involving CTCF and therefore may represent a different class of domains [11].

In order to assess the features of these TADs separately, we split TADs into CTCF TADs (which we define as TADs with a CTCF ChIP-seq peak within ± 10 kb of both boundaries) and nonCTCF TADs (which we define as TADs with a CTCF ChIP-seq peak within ± 10 kb of one, or neither boundary) (Additional file 1: Fig. S2). We compared the size of CTCF TADs and nonCTCF TADs between ESCs and cortical neurons. In both CTCF TADs and nonCTCF TADs we found cortical neuron TADs were significantly larger (p-value < 0.001, except Arrowhead nonCTCF TADs which was not significantly different) (Additional file 1: Fig. S3A–B). We also compared the distance between genes in CTCF and nonCTCF TADs to random CTCF/nonCTCF TADs, respectively. We found that genes are significantly further apart in both CTCF TADs and nonCTCF TADs than expected in random CTCF/nonCTCF TADs (median p-value < 0.001) (Additional file 1: Fig. S3C–D).
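The CTCF/nonCTCF definition above amounts to a window lookup around each boundary. The sketch below assumes each ChIP-seq peak is reduced to a single coordinate (for example its midpoint) on the same chromosome as the TAD, and that the ± 10 kb window is inclusive; the text specifies neither detail, and the function names are hypothetical.

```python
from bisect import bisect_left, bisect_right

WINDOW = 10_000  # +/- 10 kb around each boundary, as defined in the text


def has_peak_near(sorted_peaks, boundary, window=WINDOW):
    """True if any peak coordinate lies within boundary +/- window (inclusive)."""
    lo = bisect_left(sorted_peaks, boundary - window)
    hi = bisect_right(sorted_peaks, boundary + window)
    return hi > lo


def classify_tad(tad, sorted_peaks, window=WINDOW):
    """Label a (start, end) TAD 'CTCF' only if BOTH boundaries have a nearby peak."""
    start, end = tad
    near_both = has_peak_near(sorted_peaks, start, window) and has_peak_near(
        sorted_peaks, end, window
    )
    return "CTCF" if near_both else "nonCTCF"
```

TADs with a peak near one boundary or neither boundary both fall into the nonCTCF class, matching the definition used in the text.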

To further investigate the biological context of CTCF TADs and nonCTCF TADs we calculated the percentage of TADs which were CTCF TADs or nonCTCF TADs, and in the A or B compartments (Additional file 2: Table S4). We found that for both TAD callers and tissues CTCF TADs were most commonly in the A compartment (~ 44%, 40%, 20% and 23% of TADs are CTCF TADs and in the A compartment compared to ~ 18%, 18%, 7% and 12% of TADs which are CTCF TADs and in the B compartment in ESC Arrowhead TADs, ESC TopDom TADs, cortical neuron Arrowhead TADs and cortical neuron TopDom TADs, respectively). In contrast, ESC nonCTCF TADs are more commonly found in the B compartment (percentage of TADs which are nonCTCF in compartment A vs B in ESC Arrowhead TADs: ~ 11% vs 26%, ESC TopDom TADs: ~ 8% vs 33%), and cortical neuron nonCTCF TADs are split relatively equally between A and B compartments (cortical neuron Arrowhead TADs: ~ 42% vs 31% and cortical neuron TopDom TADs: ~ 30% vs 35%).

TADs vs paralogy and gene constraint

We have shown examples of TADs that contain a large number of genes from the same paralogous families (Fig. 1D), suggesting that genes within TADs could be more functionally similar due to shared ancestry [30]. We therefore investigated whether genes within TADs are enriched for paralogous gene pairs, genome wide. To do this we assessed the proportion of paralogous gene pairs within TADs and random TADs. Similarly to Ibn-Salem et al. [31] we found a greater proportion of paralogous gene pairs fall within TADs compared to random TADs (Fig. 4A, median p-value of TADs vs random TADs < 0.001). This suggests that pairs of paralogous genes are more likely to fall within the same TAD than can be explained by the linear proximity of the genes alone. We further investigated this relationship and found that Arrowhead TADs which contain at least one pair of paralogs are significantly larger in size than TADs with no pairs of paralogs (Fig. 4B, p-value < 0.001). However, no difference in size was observed between TopDom TADs containing at least one pair of paralogs or no pairs of paralogs. Despite this both Arrowhead and TopDom TADs containing a pair of paralogs were significantly larger than observed in random TADs containing a pair of paralogs, and significantly smaller than random genome TADs containing a pair of paralogs (Additional file 1: Fig. S4A) (median p-value < 0.01). This suggests that TADs containing a pair of paralogs are significantly larger than expected if TADs are randomly placed in the genome and significantly smaller than expected if the effect of the linear order of the genome is randomised (the latter result is probably due to the increased probability of larger TADs containing a pair of paralogs in a randomised genome). No significant difference was observed between the size of TADs and random TADs which contained no pairs of paralogs.
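The comparison of paralog-pair proportions between TADs and random TADs can be framed as a 2 × 2 contingency table, which Fig. 4A tests with Fisher's exact test. The sketch below computes a one-sided p-value directly from the hypergeometric distribution using only the standard library. The choice of a one-sided test and the table layout are assumptions for illustration; the text does not state which alternative was used, and any counts passed in are the caller's own.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact p-value: P(top-left cell >= a | fixed margins).

    Assumed table layout:
                     paralog pairs   other pairs
        real TADs          a              b
        random TADs        c              d
    """
    row1 = a + b          # all gene pairs in real TADs
    col1 = a + c          # all paralog pairs
    n = a + b + c + d     # all gene pairs
    upper = min(row1, col1)
    numerator = sum(
        comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1)
    )
    return numerator / comb(n, col1)
```

In the design described above, one such p-value would be computed against each of the 100 random TAD sets and the median reported.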

Fig. 4

Paralogs and constraint vs autosomal TADs. A Proportion of paralogous gene pairs in TADs, the median proportion in 100 sets of random TADs, and the median proportion in 100 sets of random genome TADs. TADs contain significantly more pairs of paralogous genes than both random TADs and random genome TADs (Fisher’s exact test, median p-value: p < 0.001 = ***, p < 0.01 = **, p < 0.05 = *). B Size of TADs containing pairs of paralogs vs TADs (with > 1 gene) containing no pairs of paralogs (plotted on a log10 scale). For Arrowhead TADs, TADs which contain pairs of paralogs are larger than TADs which have no paralog pairs. (Wilcoxon test, p-value: p < 0.001 = ***, p < 0.01 = **, p < 0.05 = *, ES = effect size calculated using r for Wilcoxon). C Distribution of mean constraint scores of genes occupying the same TAD. TADs are split depending on the number of genes they contain (1, 2, 3, 4, 5, 6+). Dots indicate the mean of the distribution. D Table showing FDR corrected p-values of differences between groups in C calculated with the Wilcoxon test. Significant p-values are highlighted red. Genes singly occupying a TAD have a significantly higher constraint score than the average constraint of genes in TADs with > 1 gene. E Biological processes GO term functional enrichment of genes singly, doubly or triply occupying an Arrowhead TAD. Only the top 25 most significant GO terms passing a p-value threshold of < 0.05 (multiple testing corrected using the “gSCS” option) are shown for each gene set

When TADs are split into CTCF TADs and nonCTCF TADs we find that pairs of paralogs are significantly enriched in nonCTCF TADs compared to random nonCTCF TADs (median p-value of nonCTCF TADs vs random nonCTCF TADs < 0.001). However, on the whole this is not true for CTCF TADs where paralogous pairs are largely depleted (with the exception of ESC TopDom TADs) (median p-value of CTCF TADs vs random CTCF TADs < 0.001 for ESC Arrowhead and cortical neuron Arrowhead and < 0.05 for cortical neuron TopDom) (Additional file 1: Fig. S5). This suggests that although pairs of paralogs are enriched in TADs they are largely depleted in CTCF TADs. This raises the possibility that TADs detected by Arrowhead and TopDom may be made up of two functional groups. As CTCF TADs are bounded by CTCF, they are likely to have been formed by loop extrusion involving CTCF. In contrast, nonCTCF TADs may have been formed by other mechanisms and therefore may represent other types of domain, e.g. compartment domains or TADs formed by loop extrusion not involving CTCF [10].

In order to further assess the impact of evolutionary forces on genes within TADs, we assessed the average constraint scores of genes in TADs. Constraint scores quantify the degree of selective constraint acting on protein-coding genes, with a higher score indicating a greater strength of purifying selection [32]. Selective constraint can change over evolutionary time, and we therefore considered constraint scores calculated in the mouse lineage [33]. We find that protein-coding genes which singly occupy a TAD are significantly more constrained than the mean constraint of genes co-occupying TADs (Fig. 4C–D, Additional file 2: Table S1). Genes singly occupying a TAD also have significantly higher constraint than seen in random TADs (with the exception of cortical neuron TopDom random TADs), suggesting the result cannot be explained by the structure of the linear genome alone. Genes singly occupying a TAD are also significantly more constrained than seen in random genome TADs (FDR corrected median p-value of genes singly occupying TADs vs genes singly occupying TADs in random TADs < 0.05 or random genome TADs < 0.001, Additional file 1: Fig. S6). This suggests that genes which singly occupy TADs may be under higher selective constraint and may be more functionally important than genes which co-occupy a TAD. This in turn suggests that the protection from aberrant regulation of functionally important genes, implied by being in a private TAD, is under selective constraint.

We next sought to test if the relationship between TADs and average gene constraint is observable in both CTCF TADs and nonCTCF TADs. When considering either CTCF TADs or nonCTCF TADs, as seen above, we find that generally the constraint of genes in singly occupied TADs is significantly higher relative to the average constraint of genes co-occupying a CTCF/nonCTCF TAD (Additional file 1: Fig. S7, Additional file 2: Tables S2 and S3).

In order to assess which biological processes genes which singly occupy a TAD are involved in, we carried out a functional enrichment analysis (see “Methods”) using Biological process GO terms (Fig. 4E and Additional file 1: Fig. S8). We found that genes which singly occupy a TAD are highly enriched for developmental processes, genes which occupy a TAD with one other gene (double occupancy) are also enriched for developmental processes but to a lesser extent and genes which occupy a TAD with two other genes (triple occupancy) are less enriched for developmental processes still. For example, “system development” is the most significant GO term associated with singly occupied Arrowhead TADs in both ESCs and cortical neurons (p-value = 8.96 × 10–48 and 4.02 × 10–43, respectively), but it is less significantly associated in doubly occupied or triply occupied TADs (doubly occupied: p-value = 5.31 × 10–20 and 3.11 × 10–20, triply occupied: p-value = 9.30 × 10–06 and 1.35 × 10–06 for ESC and cortical neuron, respectively). We repeated the enrichment analysis using genes which singly, doubly and triply occupy random TADs in order to establish whether randomly placed TADs with similar features (e.g. only one gene in the length of the TAD) have a similar pattern of enrichment (Additional file 1: Fig. S8). We found that genes that singly occupy a random TAD are also enriched for developmental processes but to a lesser degree than TADs. This suggests that the enrichment for developmental function observed in genes that singly occupy a TAD cannot be explained by the linear genome alone.

In order to further investigate the similarity of sets of genes singly occupying TADs between cell types, we compared genes singly occupying a TAD in ESC with cortical neuron. We found that for both Arrowhead and TopDom, genes which singly occupied TADs were very similar between ESC and cortical neuron (~ 64% and ~ 58% of ESC and ~ 68% and ~ 61% of cortical neuron genes singly occupying TADs were also singly occupying TADs in the other tissue for Arrowhead and TopDom, respectively) (Additional file 1: Fig. S9A). We next tested the functional enrichment of the genes singly occupying TADs in: ESC only, cortical neuron only or both ESC and cortical neuron. In general we found enrichment for similar developmental functions between the groups (Additional file 1: Fig. S9B). Together these results suggest that similar genes singly occupy TADs in both tissues.

We also assessed singly occupied TADs in the context of their compartment and CTCF/nonCTCF TAD identity (Additional file 2: Table S5). We found that ESC singly occupied TADs were most commonly CTCF TADs in the A compartment, suggesting that they are more likely to be transcriptionally active and formed by loop extrusion involving CTCF. On the other hand, cortical neuron singly occupied TADs were most commonly nonCTCF TADs with similar proportions in the A and B compartments.

Expression and functional similarity of non-paralogous genes in CTCF TADs

We have shown that CTCF TADs and nonCTCF TADs are unequal in their functional relevance, with nonCTCF TADs enriched for paralogous gene pairs. We therefore next focused on the functional similarity of pairs of genes in CTCF TADs (Fig. 2). Since paralogous gene pairs are highly likely to share functional similarity and we have previously assessed their relationship with TADs (Fig. 4A–B, Additional file 1: Fig. S5) we excluded all pairs of paralogous genes and removed the olfactory genes (see “Methods”) in all functional analyses. This will allow the assessment of functional similarity between genes within TADs without recent shared ancestry.

In order to assess whether pairs of genes in the same TAD have correlated expression patterns we used FPKM/RPKM counts from RNA-seq expression data. We used RNA-seq generated during neural differentiation from Bonev et al. [25], together with RNA-seq from ENCODE or GEO for the cell types/tissues most closely matching ESCs and cortical neurons which had greater than three samples (mouse ESCs differentiating to primordial germ cell like cells (PGC) and forebrain at different embryonic stages, respectively) [34,35,36,37]. Using these expression counts we calculated Spearman’s rank correlation coefficient between pairs of genes in CTCF TADs, 100 sets of random CTCF TADs, and 100 sets of random genome CTCF TADs (Fig. 5C–D). We found pairs of genes in CTCF TADs have a significantly higher expression correlation than pairs of genes in random genome CTCF TADs in all comparisons (median p-value < 0.05). This is an expected result because randomising the genome removes the effect of linear gene proximity. However, in 7 out of 8 comparisons we find no significant difference in expression correlation between pairs of genes in CTCF TADs and pairs of genes in random CTCF TADs (at a significance threshold of median p-value < 0.05). Thus, contrary to the majority of other studies [17,18,19,20], we find little evidence that pairs of genes sharing a TAD are more likely to have similar expression patterns than can be explained by their linear proximity. A study by Soler-Oliva et al. found that algorithmically identified co-expression domains in breast tissue/breast cancer tend not to coincide with TADs, which supports our findings [38].
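For readers unfamiliar with the statistic, Spearman's rank correlation between two genes is the Pearson correlation of their expression ranks across samples. The sketch below is a standard-library illustration with average ranks for ties; the function names and the plain-list input are assumptions, and in practice a library routine such as `scipy.stats.spearmanr` would normally be used instead.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns NaN if either vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def spearman(expr_a, expr_b):
    """Spearman's rho for two genes' expression values across the same samples."""
    return pearson(average_ranks(expr_a), average_ranks(expr_b))
```

Applying `spearman` to every within-TAD gene pair gives the distribution of correlation coefficients that is then compared against the random and random genome sets.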

Fig. 5

Pairwise gene co-expression in autosomal CTCF TADs. Olfactory genes and paralogous gene pairs have been excluded in all panels. A–B Top panel: median expression correlation coefficient (Spearman) for pairs of genes vs binned distance in the real genome and 1000 random genomes. Bottom panel: stars indicate bins with a significantly higher median expression correlation in the real genome vs 1000 random genomes (FDR corrected p-value < 0.05). Jitter has been applied on the y axis to allow clearer visualisation of close together points. A Expression correlation coefficients were calculated using RNA-seq from two replicates each of ESC, neural progenitor cells (NPC) and cortical neuron cells. B Expression correlation coefficients were generated using mouse ESCs differentiating to primordial germ cell like cells (PGC) and forebrain RNA-seq from ENCODE. The mouse ESCs differentiating to PGC RNA-seq was generated with three replicates each of ESCs, epiblast like cells (day 2), PGC (day 4) and PGC (day 6). The forebrain RNA-seq was generated with two replicates each, of embryos of varying ages. C–D Median expression correlation coefficient (Spearman) for pairs of genes in CTCF TADs (dotted line) and median expression correlation coefficient (Spearman) in 100 sets each of: random CTCF TADs and random genome CTCF TADs. CTCF TADs called using both Arrowhead and TopDom, in both ESC and cortical neuron (CN) Hi-C (Wilcoxon test, median p-value: p < 0.001 = ***, p < 0.01 = **, p < 0.05 = *, NS = not significant, median ES = effect size calculated using r for Wilcoxon). C Expression correlation coefficients were calculated using RNA-seq from A. D Expression correlation coefficients were calculated using RNA-seq from B

Next, we sought to assess whether pairs of genes within the same CTCF TAD are more likely to share functional annotations than pairs of genes in random CTCF TADs or random genome CTCF TADs. Here, we used molecular function (MF) GO semantic similarity scores, shared pathways, and PPI (see “Methods”). In 11 out of 12 comparisons we found that pairs of genes in CTCF TADs are significantly more similar (median p-value < 0.01) in terms of functional annotation than pairs of genes in random genome CTCF TADs (Fig. 6D–F). Again, this is expected as randomising the genome removes functional similarity that can be explained by linear proximity. Using binned linear distance, we found greater similarity between pairs of genes which are in very close linear proximity than expected if genes were randomly ordered on the chromosome (Fig. 6A–C). We next compared the functional annotations of genes in CTCF TADs with genes in random CTCF TADs. We found that for the majority of comparisons there was no significant difference (10 out of 12 comparisons, at a significance threshold of median p-value < 0.05) (Fig. 6D–F). Pairs of genes in Arrowhead ESC CTCF TADs and Arrowhead cortical neuron CTCF TADs have significantly more similar MF GO terms than pairs of genes in random CTCF TADs (median p-value < 0.01 and < 0.05, respectively). A similar trend in MF GO similarity was observed for all other CTCF TADs compared to random CTCF TADs, but the difference was not significant in the other 2 comparisons (at a significance threshold of median p-value < 0.05). This could indicate that pairs of genes in CTCF TADs have slightly more similar MF GO term annotations than pairs of genes in random CTCF TADs. However, this may be limited to a few TADs, as the increase in similarity is very small and often not significant. Overall, we find the biggest contribution to the functional similarity between pairs of genes in CTCF TADs can be attributed to their linear proximity in the genome. When we control for linear proximity we find a less consistent picture, but in the majority of comparisons, pairs of genes in CTCF TADs are no more likely to be functionally similar than if CTCF TADs were randomly placed.

Fig. 6

Functional similarity of pairs of genes in autosomal CTCF TADs. Olfactory genes and paralogous gene pairs have been excluded in all panels. A–C Top panel: median GO semantic similarity, number of gene pairs with ≥ 1 shared pathways or number of gene pairs with ≥ 1 shared PPI for pairs of genes vs binned distance in the genome and 1000 random genomes. Bottom panel: stars indicate bins with a significantly higher functional similarity in the genome vs 1000 random genomes (FDR corrected p-value < 0.05). D–F TADs called using both Arrowhead and TopDom; in both ESC and cortical neuron Hi-C. Median p-value: p < 0.001 = ***, p < 0.01 = **, p < 0.05 = *, NS = not significant. A Distribution of MF GO semantic similarity for pairs of genes binned by distance in the genome vs 1000 random genomes. B Distribution of the number of pairs of genes sharing ≥ 1 pathway binned by distance in the genome vs 1000 random genomes. C Distribution of the number of pairs of genes sharing ≥ 1 PPI binned by distance in the genome vs 1000 random genomes. D Median MF GO semantic similarity for pairs of genes in CTCF TADs (dotted line) compared to the distributions of median MF semantic similarity for 100 sets each of: random CTCF TADs and random genome CTCF TADs (Wilcoxon test, median p-value, median ES = effect size calculated using r for Wilcoxon). E Proportion of annotated pairs of genes sharing ≥ 1 pathway in CTCF TADs and the median proportion of pairs of genes sharing ≥ 1 pathway in 100 sets each of: random CTCF TADs and random genome CTCF TADs (Fisher’s exact test, median p-value). F Proportion of annotated pairs of genes sharing ≥ 1 PPI in CTCF TADs and the median proportion of pairs of genes sharing ≥ 1 PPI in 100 sets each of: random CTCF TADs and random genome CTCF TADs (Fisher’s exact test, median p-value)
