A fast and adaptive detection framework for genome-wide chromatin loop mapping from Hi-C data [METHODS]

Siyuan Chen1,2,3,8, Jiuming Wang4,8, Inkyung Jung5, Zhaowen Qiu6, Xin Gao1,2,3 and Yu Li4,7

1 Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia

2 Center of Excellence on Smart Health, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia

3 Center of Excellence for Generative AI, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia

4 Department of Computer Science and Engineering, The Chinese University of Hong Kong (CUHK), Hong Kong SAR 999077, China

5 Department of Biological Sciences, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea

6 Institute of Information and Computer Engineering, NorthEast Forestry University, Harbin 150040, China

7 The CUHK Shenzhen Research Institute, Hi-Tech Park, Nanshan, Shenzhen 518057, China

8 These authors contributed equally to this work.

Corresponding authors: liyu@cse.cuhk.edu.hk, xin.gao@kaust.edu.sa

Abstract

Chromatin loop identification plays an important role in molecular biology and 3D genomics research, as it constitutes a fundamental process in transcription and gene regulation. Such precise chromatin structures can be identified across genome-wide interaction matrices via Hi-C data analysis, which is essential for unraveling the intricacies of transcriptional regulation. Given the increasing number of genome-wide contact maps, derived from both in situ Hi-C and single-cell Hi-C experiments, there is a pressing need for efficient and resilient algorithms capable of processing data from diverse experiments rapidly and adaptively. Here, we propose YOLOOP, a novel detection-based framework that is different from the conventional paradigm. YOLOOP stands out for its speed, surpassing the performance of previous state-of-the-art (SOTA) chromatin loop detection methods. It achieves a 30-fold acceleration compared with classification-based methods, up to 20-fold acceleration compared with the SOTA kernel-based framework, and a fivefold acceleration compared with statistical algorithms. Furthermore, the proposed framework is capable of generalizing across various cell types, multiresolution Hi-C maps, and diverse experimental protocols. Compared with the existing paradigms, YOLOOP shows up to a 10% increase in recall and a 15% increase in F1-score, particularly noteworthy in the GM12878 cell line. YOLOOP also offers fast adaptability with straightforward fine-tuning, making it readily applicable to extremely sparse single-cell Hi-C contact maps. It maintains its exceptional speed, completing genome-wide detection at a 10 kb resolution for a single-cell contact map within 1 min and for a 900-cell-superimposed contact map within 3 min, enabling fast analysis of large-scale single-cell data.

Footnotes

[Supplemental material is available for this article.]

Article published online before print. Article, supplemental material, and publication date are at https://www.genome.org/cgi/doi/10.1101/gr.279274.124.

Freely available online through the Genome Research Open Access option.

Received March 5, 2024. Accepted August 8, 2024.
