RNAirport: a deep neural network-based database characterizing representative gene models in plants

Elsevier

Available online 20 March 2024

Journal of Genetics and Genomics

Abstract

A 5′-leader, initially known as the 5′-untranslated region, exists in multiple isoforms owing to alternative splicing (aS) and alternative transcription start sites (aTSS). A representative 5′-leader is therefore needed to examine the embedded RNA regulatory elements that control translation efficiency. Here, we develop a ranking algorithm and a deep-learning model to annotate representative 5′-leaders for five plant species. We rank the intra- and inter-sample frequency of aS-mediated transcript isoforms using a Kruskal-Wallis test-based algorithm and identify the representative aS-5′-leader. To further assign a representative 5′-end, we train the deep-learning model 5′leaderP to learn aTSS-mediated 5′-end distribution patterns from cap analysis of gene expression (CAGE) data. The model accurately predicts the 5′-end, as confirmed experimentally in Arabidopsis and rice. The gene models containing the representative 5′-leaders, together with 5′leaderP, can be accessed at RNAirport (http://www.rnairport.com/leader5P/). This stage 1 5′-leader annotation records 5′-leader diversity and will pave the way for Ribo-Seq ORF annotation, mirroring the project recently initiated by human GENCODE.
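
The abstract does not spell out the ranking algorithm, so the following is only a minimal sketch of how a Kruskal-Wallis rank statistic could be used to compare isoform frequencies across samples and pick a candidate representative isoform. The function names, the input layout (isoform name mapped to per-sample usage fractions), and the "highest mean rank wins" rule are all assumptions made for illustration; they are not taken from the paper, and the statistic is computed without a tie correction.

```python
from itertools import chain


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H (no tie correction) for a dict: name -> list of values.

    Returns the H statistic and the mean pooled rank of each group.
    """
    pooled = list(chain.from_iterable(groups.values()))
    n = len(pooled)
    ranks = average_ranks(pooled)
    rank_term, start = 0.0, 0
    mean_ranks = {}
    for name, vals in groups.items():
        group_ranks = ranks[start:start + len(vals)]
        start += len(vals)
        rank_term += sum(group_ranks) ** 2 / len(vals)
        mean_ranks[name] = sum(group_ranks) / len(vals)
    h = 12.0 / (n * (n + 1)) * rank_term - 3 * (n + 1)
    return h, mean_ranks


def pick_representative(groups):
    """Hypothetical selection rule: the isoform with the highest mean rank."""
    h, mean_ranks = kruskal_wallis_h(groups)
    best = max(mean_ranks, key=mean_ranks.get)
    return best, h


# Made-up usage fractions of three isoforms of one gene in three samples.
example = {
    "isoform_a": [0.70, 0.65, 0.80],
    "isoform_b": [0.20, 0.25, 0.15],
    "isoform_c": [0.10, 0.10, 0.05],
}
best_isoform, h_stat = pick_representative(example)
print(best_isoform, h_stat)
```

A larger H indicates that the isoforms differ more clearly in their pooled ranks; in practice H would be compared against a chi-squared distribution with (number of isoforms minus 1) degrees of freedom before trusting the top-ranked isoform.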

Keywords

5′-leader

Transcript isoforms

RNA regulatory elements

uORF

Deep learning

Synthetic biology

Translational control

© 2024 The Authors. Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, and Genetics Society of China. Published by Elsevier Limited and Science Press.
