Methods for Cryo-EM Single Particle Reconstruction of Macromolecules Having Continuous Heterogeneity

Over the past few years, the combination of cryo-electron microscopy (cryo-EM) imaging and single-particle analysis has been established as the method of choice for studying the structure of large protein complexes at atomic or near-atomic resolution [1]. Its recent success has been enabled by advances in detector technology, sample preparation techniques and the availability of advanced image processing software packages. The two other major structure-determination methods are X-ray crystallography, which requires the sample to be crystallized, and NMR, which is useful only with relatively small proteins.

Cryo-EM single-particle analysis (which we will refer to simply as cryo-EM) involves imaging individual copies (called particles) of a macromolecular structure. By computationally processing 10^4 to 10^6 such particle images, 3D density maps can be obtained through single-particle reconstruction. Atomic structures can then be fitted to density maps of sufficient resolution, with ∼4 Å being the coarsest resolution at which ab-initio fitting succeeds. Because the macromolecules are suspended in solution before they are rapidly frozen, particle images are likely to reflect a more native conformation, but they may also contain frozen instances of flexibility or conformational variation. Therefore, one of the promises of cryo-EM is that researchers will be able to construct a complete picture of all the possible conformations of the imaged structures.

Conformational changes are key to the function of many macromolecular machines. The molecular motors dynein and kinesin undergo cyclical changes as a chemical reaction (hydrolysis of ATP) drives a mechanical stepping motion that moves cargo along a microtubule filament. Sodium-coupled glucose transporters allow cells to take up glucose through a conformational cycle that couples the transport of one or two Na+ ions to each glucose molecule. DNA replication is carried out by the replisome, a large assembly of molecular machines that, in a stepwise fashion, unwinds the double-stranded DNA and synthesizes new complementary DNA strands.

While existing methods excel at reconstructing clearly defined discrete conformations from cryo-EM data, most state-of-the-art methods fall short when the conformations of a macromolecule vary continuously. This is the continuous heterogeneity reconstruction problem. Fortunately, there has been a surge in the effort devoted to developing computational tools for continuous heterogeneity reconstruction, and these tools are the focus of this survey article. Our aim is to highlight the defining characteristics of each method and the conceptual overlaps and differences between them.

We note that there are many ideas and approaches for continuous heterogeneity, and that many papers introduce several new ideas and combine several approaches. For brevity and clarity, we cluster different works together and omit important implementation details. This paper covers the conceptual families of ideas and does not make specific recommendations about which software to use. Some of the work surveyed is theoretical or less accessible to the user. Where available, we include links to some of the software that may be more accessible.

There are a number of recent surveys covering other aspects of the cryo-EM pipeline.

A general review of the computational challenges and the main components of the analysis pipeline is available in [2]. A comprehensive description of the mathematical aspects of the problem, focusing on homogeneous reconstruction and validation, is available in [3].

Discrete heterogeneity is discussed in [4]. A survey of earlier work on continuous heterogeneity, with a comprehensive survey of normal mode analysis (Section 6), is available in [5]. In [6], the focus is on the interpretation of the energy landscape resulting from the heterogeneity analysis using likelihood-based methods. An up-to-date overview of the full cryo-EM pipeline, from sample preparation to the latest reconstruction methods, including time-resolved cryo-EM, is available in [7].

The recent reviews [8], [9] focus on machine learning approaches to cryo-EM. Specifically, the former gives an overview of the machine learning algorithms used in each step of the cryo-EM pipeline, from pre-processing and particle picking to 3D reconstruction and post-processing, while the latter is a thorough survey of deep generative modeling techniques for 3D reconstruction.

In this review, we summarize the state-of-the-art methods for analyzing continuous heterogeneity in cryo-EM. Our aim is to sort the methods in this area into families and to convey the central ideas of each technique.

We first discuss a simplified image formation model that most cryo-EM reconstruction methods assume, as well as discrete heterogeneous reconstruction and multi-body refinement, in Section 2.

In Section 3 and Section 4, we describe manifold learning approaches to continuous heterogeneity reconstruction, specifically applied to particle images and reconstructed volumes respectively. In Section 5 and Section 6, we discuss linear models based on covariance estimation and normal mode analysis respectively.

In Section 7, we discuss nonlinear models for continuous heterogeneous reconstruction that conceptually fit into the “hyper-molecules” framework, including both traditional and deep learning algorithms. In Section 8, we discuss inference methods that bypass the traditional treatment of latent variables.

Finally, we conclude the article with a discussion in Section 9.
