In silico approaches for prediction of anti-CRISPR proteins

Journal of Molecular Biology

Highlights

Numerous viruses infecting bacteria and archaea encode one or more proteins that protect them against CRISPR-mediated immunity (anti-CRISPR proteins, Acrs).

Most Acrs are small proteins with low sequence conservation, which hampers their identification by computational methods.

Several computational pipelines for Acr prediction have been developed based on their diagnostic features.

Acrs have been predicted using comparative genomics, the guilt-by-association approach, identification of self-targeting spacers, and machine learning techniques.

Progress in Acr prediction hinges on the availability of efficient wet lab pipelines to expand the repertoire of known acr genes and validate the computationally discovered candidates.

Abstract

Numerous viruses infecting bacteria and archaea encode CRISPR-Cas system inhibitors, known as anti-CRISPR proteins (Acrs). Acrs are typically highly specific for particular CRISPR variants, which results in remarkable sequence and structural diversity and complicates their accurate prediction and identification. In addition to their intrinsic interest for understanding the coevolution of defense and counter-defense systems in prokaryotes, Acrs could serve as natural, potent on-off switches for CRISPR-based biotechnological tools, so their discovery, characterization and application are of major importance. Here we discuss the computational approaches for Acr prediction. Due to the enormous diversity and likely multiple origins of the Acrs, sequence similarity searches are of limited use. However, multiple features of protein and gene organization have been successfully harnessed to this end, including the small protein size and distinct amino acid composition of the Acrs, the association of acr genes in virus genomes with genes encoding helix-turn-helix proteins that regulate Acr expression (Acr-associated proteins, Aca), and the presence of self-targeting CRISPR spacers in bacterial and archaeal genomes containing Acr-encoding proviruses. Productive approaches for Acr prediction also involve genome comparison of closely related viruses, of which one is resistant and the other is sensitive to a particular CRISPR variant, and “guilt by association”, whereby genes adjacent to a homolog of a known Aca are identified as candidate Acrs. The distinctive features of Acrs are employed for Acr prediction both by developing dedicated search algorithms and through machine learning. New approaches will be needed to identify novel types of Acrs that are likely to exist.
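
As a rough illustration of the guilt-by-association idea described in the abstract, the sketch below flags small genes that sit next to an Aca homolog on the same strand. It is a minimal sketch under stated assumptions and not the method of any published pipeline: the `Gene` record, the function name `candidate_acrs`, and the thresholds (`max_len`, `max_gap`, `window`) are hypothetical, and Aca homology is assumed to have been determined beforehand by a separate search.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Gene:
    """Hypothetical gene record; coordinates are on a single contig."""
    gene_id: str
    start: int
    end: int
    strand: str           # "+" or "-"
    protein_len: int      # length of the encoded protein in amino acids
    is_aca_homolog: bool = False  # assumed to come from a prior homology search


def candidate_acrs(genes: List[Gene],
                   max_len: int = 200,   # illustrative size cutoff, an assumption
                   max_gap: int = 150,   # illustrative intergenic distance, an assumption
                   window: int = 3) -> List[str]:
    """Return IDs of small genes near an Aca homolog on the same strand.

    `genes` must be sorted by start coordinate. From each Aca homolog the
    search walks up to `window` genes in both directions and stops when the
    strand changes or the intergenic gap exceeds `max_gap`.
    """
    candidates: List[str] = []
    for i, aca in enumerate(genes):
        if not aca.is_aca_homolog:
            continue
        for step in (-1, 1):
            prev = aca
            for k in range(1, window + 1):
                j = i + step * k
                if j < 0 or j >= len(genes):
                    break
                nb = genes[j]
                # Distance between the neighbor and the previously visited gene.
                gap = prev.start - nb.end if step == -1 else nb.start - prev.end
                if nb.strand != aca.strand or gap > max_gap:
                    break
                small = nb.protein_len <= max_len
                if small and not nb.is_aca_homolog and nb.gene_id not in candidates:
                    candidates.append(nb.gene_id)
                prev = nb
    return candidates
```

Candidates produced this way are only hypotheses. Features mentioned in the abstract, such as amino acid composition or the presence of self-targeting spacers in the host genome, could be used to rank them before experimental validation.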

Keywords

anti-CRISPR proteins

comparative genomics

self-targeting

guilt-by-association

machine learning

Published by Elsevier Ltd.
