Carbohydrate binding modules: Compact yet potent accessories in the specific substrate binding and performance evolution of carbohydrate-active enzymes

The non-catalytic carbohydrate binding modules (CBMs) are ubiquitous protein domains that play a crucial role in carbohydrate metabolism in various organisms. These modules were originally defined as cellulose-binding domains (CBDs) because crystalline cellulose was their primary ligand (van Tilbeurgh et al., 1986). CBMs were initially reported mainly in cellulases of fungal origin (Várnai et al., 2014), and the distribution, function, and application of cellulose-binding CBMs have been systematically reviewed (Liu et al., 2022; Sidar et al., 2020). Subsequently, more CBMs with diverse substrate specificities have been found in a wide range of carbohydrate-active enzymes (CAZymes) derived from bacteria, eukaryotes, archaea, and their viruses (Shi et al., 2023), which are involved in the assembly and degradation of carbohydrates (Cantarel et al., 2009). CBMs are known for their ability to target their parent proteins to specific carbohydrate substrates, which helps to increase the enzyme concentration around the substrates and maintain the protein–ligand association for more efficient catalysis by CAZymes (Bolam et al., 1998). The specific recognition and binding functions of CBMs make them excellent model systems for studying the mechanism of protein–carbohydrate interactions, as well as powerful tools for the targeted processing of carbohydrates in numerous biotechnological applications (Tomme et al., 1998).

Based on amino acid sequence similarity, CBMs are currently divided into 99 discrete families in the CAZy database (Carbohydrate Active enzyme database, http://www.cazy.org/CAZy), excluding the deleted CBM7 family and the re-classified CBM33 family. The number of CBM families has more than doubled in the last two decades, given that only 43 different families had been identified as of 2006 (Shoseyov et al., 2006). CBM families are typically named according to their discovery order or structural features. The vast majority of CBMs are found in CAZymes from the glycoside hydrolase (GH) and glycosyltransferase (GT) families, which are related to the hydrolysis or rearrangement of glycosidic bonds (Boraston et al., 2004), and the CBMs derived from GTs have been adequately enumerated in previous reviews (Gomez-Casati et al., 2013; Guillén et al., 2010). Nevertheless, several other CBM members have also been found in CAZymes from the polysaccharide lyase (PL), carbohydrate esterase (CE), and auxiliary activity (AA) families during the last decade, including CBM2 in lytic polysaccharide monooxygenase (AA), CBM32 in alginate lyase (PL), CBM48 in feruloyl esterase (CE), CBM70 in hyaluronate lyase (PL), CBM77 in pectate lyase (PL), and CBM87 in N-acetylgalactosamine deacetylase (CE) (Bamford et al., 2020; Courtade et al., 2018; Forsberg and Courtade, 2023; Pereira et al., 2021; Sim et al., 2017; Suits et al., 2014; Venditto et al., 2016). The CBM33 members were originally classified as chitin-binding proteins, but were later identified as oxidative enzymes and re-classified into the family AA10 (Vaaje-Kolstad et al., 2010). There remain a few members of CBM families that are not associated with CAZymes, defined as “orphan” CBMs (Abbott and Bueren, 2014).

CBMs are structurally characterized by their compact folds containing 30–200 amino acids (Gomez-Casati et al., 2013; Sidar et al., 2020). To date, the structures of CBMs from 65 different families have been determined by X-ray crystallography and nuclear magnetic resonance (NMR) spectroscopy (summarized in Table 1), and the number has more than doubled over the last two decades (Hashimoto, 2006). The currently known CBMs have been grouped into seven fold families based on their three-dimensional structures (Boraston et al., 2004). β-Sandwiches, which occur in a variety of topologies including the jelly-roll and immunoglobulin-like topologies, are the most abundant CBM folds found in CAZymes. They are composed of two stacked β-sheets and several flexible loops, and the β-sheets commonly consist of 3–6 anti-parallel β-strands, forming a convex and a concave surface (Richardson, 1981; Boraston et al., 2004). The ligand binding sites are normally located in a groove on the concave surface (Cattaneo et al., 2018; Luís et al., 2013; Najmudin et al., 2006). Several other folds and structural motifs have also been identified in CBMs, such as cysteine knots in CBM1, oligonucleotide and oligosaccharide binding (OB) folds in CBM10, β-trefoils in CBM13 and CBM42 (Fujimoto et al., 2000; Miyanaga et al., 2006), hevein-like folds in CBM14 and CBM18 (Harata et al., 2001; Madland et al., 2019), βαβ folds in CBM43 (Hurtado-Guerrero et al., 2009), and βααβ folds in CBM50 (Kitaoku et al., 2022). The differences in the spatial structure of CBMs fundamentally lead to the diversity of substrate preferences and binding patterns.
CBMs have been grouped into three types according to their binding patterns with ligands: type A (surface-binding), type B (glycan chain-binding), and type C (small sugar-binding) (Boraston et al., 2004; Shi et al., 2023). These types are further defined as CBMs that bind the crystalline surfaces of polysaccharides, the internal regions of glycan chains (endo-type), and the termini of glycans (exo-type), respectively (Gilbert et al., 2013). The structural features of the ligand binding sites of several typical CBMs belonging to these three types are shown in Fig. 1. Type A CBMs have been widely reported to possess a flat binding surface with three aligned aromatic residues for ligand docking (Hanazono et al., 2016). Type B CBMs generally have a shallow groove that binds glycans through the aromatic side chains of conserved residues (Doxey et al., 2010). The binding site of type C CBMs can commonly accommodate only the termini of glycan chains or small monosaccharides and saccharic acids (Notenboom et al., 2001b).

With the rapid expansion of discovered CAZymes with different catalytic abilities, CBMs have been found to exhibit a broad affinity for various saccharides, such as cellulose, starch, α-glucan, β-glucan, xylan, mannan, chitin, agarose, galactan, glucomannan, and others (Boraston et al., 2001; Cid et al., 2010; Henshaw et al., 2006; Houston et al., 2004; Janeček et al., 2019; McLean et al., 2000; Mikami et al., 2006; Tunnicliffe et al., 2005; Venditto et al., 2016; Yaniv et al., 2012). The substrate variety, enzyme source, and binding type of CBMs from different families are listed in Table 1. Although CBMs within the same family share significant sequence similarities, they may be appended to the catalytic domains of different parent enzymes, display differences in their binding site structures and binding patterns, and show binding specificities for various substrates. Families with CBM members derived from cellulases that hydrolyze crystalline cellulose usually also have members from related enzymes that show affinity for other insoluble substrates, including lignin, chitosan, and xylan. Moreover, CBMs that share affinity for the same type of substrate may show similar structures in their binding sites, even if they are classified into different families. Janeček et al. comprehensively reviewed the history, evolution, structure, and function of starch-binding domains from different CBM families (Janeček and Ševčík, 1999; Janeček et al., 2011; Janeček et al., 2019), indicating that the complex substrate recognition and binding mechanisms of CBMs depend not only on the characteristic motifs in their amino acid sequences, but also on the key amino acid residues and spatial structure of their binding sites.

Much effort has been devoted to studying the structure-function relationships of CBMs. The high specificity and efficiency of CBMs in ligand binding enable them to play an important role as stable and portable protein accessories in the processing of valuable carbohydrates and in the development and application of novel CAZymes. This paper provides a comprehensive overview of the function and application of CBMs derived from CAZymes, highlighting the importance of CBMs in carbohydrate recognition and binding. The understanding of CBMs has significant implications in various fields, ranging from basic research on protein–carbohydrate interactions to the sustainable development of innovative biotechnology.
