Decoding biology with massively parallel reporter assays and machine learning [Reviews]

Alyssa La Fleur1, Yongsheng Shi2 and Georg Seelig1,3

1Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, Washington 98195, USA; 2Department of Microbiology and Molecular Genetics, School of Medicine, University of California, Irvine, Irvine, California 92697, USA; 3Department of Electrical & Computer Engineering, University of Washington, Seattle, Washington 98195, USA

Corresponding authors: gseelig@uw.edu, yongshes@uci.edu

Abstract

Massively parallel reporter assays (MPRAs) are powerful tools for quantifying the impacts of sequence variation on gene expression. Reading out molecular phenotypes with sequencing enables interrogating the impact of sequence variation beyond genome scale. Machine learning models integrate and codify information learned from MPRAs and enable generalization by predicting sequences outside the training data set. Models can provide a quantitative understanding of cis-regulatory codes controlling gene expression, enable variant stratification, and guide the design of synthetic regulatory elements for applications from synthetic biology to mRNA and gene therapy. This review focuses on cis-regulatory MPRAs, particularly those that interrogate cotranscriptional and post-transcriptional processes: alternative splicing, cleavage and polyadenylation, translation, and mRNA decay.
