Available online 11 August 2022
Highlights
• CRISPR-OTE extracts multi-dimensional sequence features.
• The optimal subset of physicochemical features is selected as prior knowledge.
• Transfer learning enables CRISPR-OTE to be successfully applied to different CRISPR systems and species.
Abstract
Objective
Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) is a powerful genome editing technology. Guide RNA (gRNA) plays an essential guiding role in the CRISPR system through complementary base pairing with the target DNA. Because the CRISPR targeting mechanism is not yet fully understood, predicting gRNA on-target efficiency remains a challenge. Current gRNA design tools often extract information inefficiently and fail to fully learn on-target efficiency patterns.
Material and Methods
This study proposes CRISPR-OTE, a simple but effective framework that combines multi-dimensional sequence information with complementary prior knowledge. CRISPR-OTE consists of two branches: a local-contextual information branch and a prior knowledge branch. The local-contextual information branch extracts multi-dimensional sequence features from the DNA primary sequence using a parallel framework of Convolutional Neural Networks (CNN) and bidirectional Long Short-Term Memory networks (biLSTM). The prior knowledge branch selects the optimal subset of physicochemical features to provide the neural network with complementary knowledge, such as complex secondary structures. A simple feature fusion strategy is adopted to fully utilize the multi-modal data from the two branches.
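To make the two-branch input concrete, the following is a minimal sketch of how a target sequence and a prior-knowledge vector might be prepared and fused. It is not the authors' implementation: the function names, the example sequence, the placeholder prior values, and the choice of concatenation as the "simple feature fusion strategy" are all assumptions made for illustration. In the actual model, the sequence branch would pass the encoded sequence through the CNN and biLSTM before fusion; here the flattened one-hot matrix stands in for those learned features.

```python
from typing import List

NUCLEOTIDES = "ACGT"  # column order of the one-hot encoding (assumed)


def one_hot_encode(seq: str) -> List[List[int]]:
    """Encode a DNA sequence as a len(seq) x 4 one-hot matrix (A, C, G, T)."""
    encoded = []
    for base in seq.upper():
        if base not in NUCLEOTIDES:
            raise ValueError(f"unexpected base {base!r}")
        encoded.append([1 if base == n else 0 for n in NUCLEOTIDES])
    return encoded


def fuse_features(sequence_features: List[float],
                  prior_features: List[float]) -> List[float]:
    """Fuse the two branch outputs by concatenation (an assumed strategy)."""
    return list(sequence_features) + list(prior_features)


# Hypothetical 34-nt target sequence, used only for illustration.
target = "TTTAGCTGACCGTAGGCTAACGTTAGCCATGCAA"

# Stand-in for the sequence branch: the flattened one-hot matrix.
one_hot = one_hot_encode(target)
sequence_features = [float(v) for row in one_hot for v in row]

# Placeholder prior-knowledge vector: (secondary-structure score, melting
# temperature). The zeros are dummy values, not measured or reported data.
prior_features = [0.0, 0.0]

fused = fuse_features(sequence_features, prior_features)
print(len(one_hot), len(fused))
```

Concatenation keeps the two modalities separate until the final prediction layers, which is one common way to combine learned sequence features with hand-selected descriptors.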
Results
The experimental results show that the optimal subset of physicochemical features (RNA secondary structure and melting temperature of the 34-nt target) effectively improves prediction performance. Additionally, combining multi-dimensional sequence features with multi-modal features extracts information more comprehensively. Through transfer learning, CRISPR-OTE trained on the CRISPR-Cpf1 system can also be successfully applied to the CRISPR-Cas9 system.
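As an illustration of what a melting-temperature feature for a 34-nt target could look like, the sketch below uses a widely cited GC-content approximation for sequences longer than about 13 nt, Tm ≈ 64.9 + 41 · (G + C − 16.4) / N. This formula and the function names are assumptions chosen for the example; the abstract does not state which melting-temperature model the authors used.

```python
def gc_count(seq: str) -> int:
    """Count G and C bases in a DNA sequence."""
    seq = seq.upper()
    return seq.count("G") + seq.count("C")


def approx_melting_temperature(seq: str) -> float:
    """Rough Tm in degrees Celsius from GC content.

    Uses the basic approximation Tm = 64.9 + 41 * (G + C - 16.4) / N,
    which ignores salt concentration and nearest-neighbour effects.
    """
    n = len(seq)
    if n == 0:
        raise ValueError("sequence must not be empty")
    return 64.9 + 41.0 * (gc_count(seq) - 16.4) / n


# Hypothetical 34-nt sequences, used only to show the trend.
at_rich = "A" * 34
gc_rich = "G" * 17 + "A" * 17

print(round(approx_melting_temperature(at_rich), 2))
print(round(approx_melting_temperature(gc_rich), 2))
```

Under this approximation, a higher GC fraction always yields a higher estimated melting temperature, which is the kind of sequence-level stability signal a prior knowledge branch could supply alongside learned features.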
Conclusion
CRISPR-OTE outperforms other methods across different CRISPR systems and species. CRISPR-OTE is therefore a simple on-target efficiency prediction framework with better accuracy and generalization performance.
Keywords
Genome editing
CRISPR
On-target efficiency
Deep learning
Prior knowledge
© 2022 Published by Elsevier Masson SAS on behalf of AGBM.