On kinetics and extreme values in systems with random interactions

Biological environments such as the cytoplasm comprise many different molecules, which makes explicit modeling intractable. In the spirit of Wigner, one may be tempted to assume that the interactions derive from a random distribution. Under this approximation, the system can be treated efficiently at the mean-field level, and general statements can be made about the expected behavior of such systems. Here, I study systems of particles interacting via random potentials, outside of mean-field approximations. These systems exhibit a phase transition temperature below which part of the components precipitate. The nature of this transition appears to be non-universal and to depend intimately on the underlying distribution of interactions. Above the phase transition temperature, the system can be treated efficiently using a Bethe approximation, which shows a dependence on extreme value statistics. Relaxation timescales of these systems tend to be long, but relaxation can be made arbitrarily fast by increasing the number of neighbors of each particle.

Intracellular environments, in particular the cytoplasm, comprise thousands of different molecules organized in compartments. Beyond the membrane-bound organelles, one finds additional organization within a given milieu, for instance in the form of stress granules in the cytosol or lipid rafts in membranes. There is growing evidence that this second level of organization is linked to phase separation processes. Direct modeling of such a complex environment is unfortunately intractable.

Salvation may come from assuming that such interactions are random, that is, that the interaction matrix between different chemical species is a random matrix. This kind of model was introduced to physics by Eugene Wigner to model the spectral lines of heavy atoms [1]. The underlying postulate was that interactions in quantum mechanics were so complicated that they could be treated as if they were statistically random. The power of this approach is that the probability distribution of the eigenvalues of random matrices is given straightforwardly by the Wigner semi-circle law. One can therefore make general statements about the expected average behavior of the system. An introduction to random matrices can be found in [2], and reviews of applications of random matrices to various systems can be found in [3, 4]. The approach has been used in a wide range of contexts, including population dynamics [5] and RNA folding [6].

This approach has been applied to the phase behavior of biological systems through the work of Sear and Cuesta [7], in order to model the cytosol. In this picture, the phase behavior of proteins depends on a large number of microscopic details, which are coupled in a complex fashion. For instance, protein valency is known to play an important role [8], and valency in turn can depend on protein structure and environment [9]. The hypothesis of Sear and Cuesta can therefore be interpreted as interactions being so complex that they appear as essentially drawn from random distributions. In their original work, Sear and Cuesta derive results for Gaussian-distributed random values, and establish phase separation criteria that depend on the ratio of the variance to the number of components. Subsequent work by Jacobs and Frenkel has shown the system to be completely characterized by the mean and variance of the generating distribution, independently of its shape [10], and the demixing temperature was later shown to be related to extreme values of the interaction matrix [11].

Recent extensions of this model have targeted systems in which a portion of the interactions is structured [12]. In this type of extension, interactions derive from the sum of microscopic details and noise, which presents a more realistic picture of biological systems. Two regimes then exist for phase separation, depending on whether the structured part (microscopic details) or the noise dominates. Understanding the behavior of systems with random interactions therefore provides a pathway to understanding the effects of noise in other systems.

Studies dealing with random interactions have so far relied on mean-field approaches to understand the system. At first glance, an approximate solution seems good enough for a problem that is itself only an approximate representation of biological systems. The question I want to answer here is whether a mean-field treatment of the system provides reasonable answers, and whether extreme values of the interaction matrix play a role. Here, I show that every system exhibits a demixing transition, in which some components precipitate out of the fully mixed system. I investigate conditions for phase separation for various distributions of interactions by building a separate partition function for demixed states. Results for Gaussian and Laplace distributed interactions are consistent with [11]. However, I find that the predictions of [10, 11] and of this new partition function all fail to explain the observed dependencies, suggesting that the demixed state may not correspond to the ground state. I also show that an exact partition function can be written for the mixed state. Using this partition function, I show that the system tends to exhibit extremely slow relaxation near the phase separation temperature. This can be remedied by constructing an appropriate polymer-like mean-field model. I then discuss the perspectives of such models in the context of biological environments.

I make use of an L × L lattice model, with a fixed number of neighbors z = 4. The lattice size is set to L = 32, as no significant effect of this parameter was noted. The system comprises N molecule types, and a single molecule is placed on each lattice site. Simulations are run in the semi-grand canonical ensemble in the absence of solvent, with all chemical potentials equal. While I simulate 2D models, the theoretical partition functions are derived for arbitrary dimensions and neighbor number z, and I therefore expect results to hold in 3D in regions where these partition functions are valid. In particular, no partition function is derived in the vicinity of the transition itself, whose behavior may depend on the dimensionality of the system. The system is governed by the following Hamiltonian:

$\mathcal{H} = \sum_{\langle i,j \rangle} \eta(s_i, s_j) \qquad (1)$

where $s_i$ is the state of the particle at lattice point i ($s_i \in \{1, 2, \ldots, N\}$) and $\langle i,j \rangle$ runs over every adjacent pair of particles on the lattice. Here, $\eta(s_i, s_j)$ represents the contact energy between any two particles and is symmetric, such that $\eta(s_i, s_j) = \eta(s_j, s_i)$. These interactions are modeled statistically: each contact energy derives from a distribution $\mathcal{D}$. We further assume that there are no correlations between the different interactions. For a given realization of the system, η can therefore be viewed as a symmetric N × N random matrix, which is quenched. From a microscopic perspective, this model assumes that interactions depend on so many microscopic variables that they can be viewed as deriving from a statistical distribution. Changes in the variance of $\mathcal{D}$ only lead to a rescaling of the temperature, and we therefore use the standardized distributions defined in the SI. The inverse temperature is denoted by $\beta = 1 / kT$, and the inverse phase separation temperature, if it exists, by $\beta_T$. This temperature cannot be identified through the usual free energy methods due to the large number of basins, and we therefore rely on energy discontinuities to identify $\beta_T$. Heavy-tailed (sub-exponential) distributions do not have a stable high-temperature regime in the large N limit and are therefore not discussed in the main text (see SI section 'Heavy tailed distributions' and in particular equation S1, its derivation and the subsequent discussion).
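To make this setup concrete, the sketch below (my own illustration, not code from this work; the function names and the Gaussian choice for $\mathcal{D}$ are arbitrary) builds a quenched symmetric interaction matrix and evaluates the Hamiltonian of equation (1) on a periodic L × L lattice with z = 4.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_eta(N, draw=rng.standard_normal):
    """Quenched, symmetric N x N interaction matrix with i.i.d. entries."""
    upper = np.triu(draw((N, N)))          # keep upper triangle (incl. diagonal)
    return upper + np.triu(upper, 1).T     # mirror it: eta(i, j) == eta(j, i)

def lattice_energy(s, eta):
    """Hamiltonian of equation (1): sum of eta(s_i, s_j) over nearest-neighbor pairs."""
    right = eta[s, np.roll(s, -1, axis=1)]  # horizontal bonds (periodic boundaries)
    down = eta[s, np.roll(s, -1, axis=0)]   # vertical bonds (periodic boundaries)
    return right.sum() + down.sum()

N, L = 64, 32                               # molecule types, lattice size
eta = make_eta(N)
s = rng.integers(N, size=(L, L))            # one molecule placed on each lattice site
print(lattice_energy(s, eta))
```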

Simulations are equilibrated over $2^{24}$ sweeps, where a sweep represents a Monte-Carlo move attempt on every lattice site of every replica. Statistics are then gathered every 32 sweeps over a further $2^{24}$ sweeps. I make use of parallel tempering, using $2^{10}$ replicas and attempting a swap every 4 sweeps during equilibration and every $2^{15}$ sweeps during statistics gathering. Results are averaged over 20 realizations of the interactions η. Exact mathematical definitions of the distributions used in this article are available in the SI. Results presented in the main text are averaged over realizations; in particular, the reported $\beta_T$ is the median over realizations. Details of the averaging methods and of the variation between realizations are available in the SI.
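For illustration only, here is a minimal sketch of the elementary move implied by this protocol: a single sweep in the semi-grand canonical ensemble, where equal chemical potentials mean that only the contact-energy change enters the acceptance rule. Parallel tempering, replica handling and the actual production parameters are not reproduced, and the variable names are mine.

```python
import numpy as np

rng = np.random.default_rng(1)
N, L, beta = 64, 32, 1.0
a = rng.standard_normal((N, N))
eta = np.triu(a) + np.triu(a, 1).T          # quenched symmetric interactions
s = rng.integers(N, size=(L, L))            # one molecule type per site

def metropolis_sweep(s, eta, beta):
    """One sweep: a species-change attempt at every lattice site."""
    L = s.shape[0]
    for i in range(L):
        for j in range(L):
            new = rng.integers(eta.shape[0])            # proposed species (equal chemical potentials)
            nbrs = (s[(i + 1) % L, j], s[(i - 1) % L, j],
                    s[i, (j + 1) % L], s[i, (j - 1) % L])  # z = 4 neighbors, periodic boundaries
            dE = sum(eta[new, n] - eta[s[i, j], n] for n in nbrs)
            if dE <= 0 or rng.random() < np.exp(-beta * dE):
                s[i, j] = new                           # Metropolis acceptance
    return s

s = metropolis_sweep(s, eta, beta)
```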

For bounded distributions, general asymptotic solutions are available based on the order of the distribution. I classify a bounded distribution as being of order γ if its probability density near its lower bound can be expanded as a polynomial of leading order γ. Mathematically, if the probability density $\mathcal{D}(E)$ has a lower bound $E_0$, it is of order γ iff $\mathcal{D}(E_0 + \epsilon) \propto \epsilon^{\gamma}$ as $\epsilon \rightarrow 0^{+}$, where $\gamma \gt -1$. For instance, uniform distributions are of order 0 in this classification.
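As a worked example of this classification (using the symmetric Beta family that reappears in section 3.2, written here on an interval $[E_0, E_1]$), the density behaves near its lower bound as

$$\mathcal{D}(E_0+\epsilon) \propto \epsilon^{\alpha-1}\,(E_1-E_0-\epsilon)^{\alpha-1} \;\xrightarrow{\;\epsilon\to 0^{+}\;}\; \epsilon^{\alpha-1},$$

so a $\mathrm{Beta}(\alpha,\alpha)$ distribution is of order $\gamma = \alpha - 1$, and a triangular distribution, whose density grows linearly from its lower bound, is of order 1.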

Results are denoted by a subscript when they depend on the distribution. I use the following labels: $\mathcal{N}$ for Gaussian, $\mathcal{L}$ for Laplace and γ for bounded distributions.

The reader should also be careful about equilibration in these simulations. In the presence of exponential relaxation, one would generally require simulation lengths of at least $t \gtrsim 5\,\tau_{\mathrm{auto}} N^2$, where $\tau_{\mathrm{auto}}$ is the autocorrelation time. This condition is not always achieved at large β or N, which implies that some simulations may be out of equilibrium.

First, I review the correlation-less mean-field predictions of [11], for a system where all sites are filled (φ = 1 in the language of [11]). This system shows two regions in phase space: a fully mixed phase found at high temperatures and a demixed phase at low temperatures. I denote the inverse phase transition temperature by $\beta_T$. In [11], the authors call these regions 'condensed' and 'demixed'. In the mixed phase, the mean energy is simply given by $2 \langle E \rangle = z\, \vec{\phi} \cdot \eta \cdot \vec{\phi} = z \langle \eta \rangle$, where $\vec{\phi}$ is the homogeneous composition vector ($\phi_i = 1/N$). Additionally, the condition for a phase with $n^2$ interactions to have the same free energy as a phase with $N^2$ interactions is simply:

$\dfrac{\beta z}{2}\,\dfrac{\vec{u}_n \cdot \Delta\eta \cdot \vec{u}_n}{n^2} = \log\!\left(\dfrac{n}{N}\right) \qquad (2)$

where $\Delta\eta = \eta - \langle\eta \rangle$ and $\vec{u}_n$ is a vector comprised of n ones and N − n zeroes that selects the components that are precipitating. Details can be found in [11], where the authors propose the following argument: as additional interactions are incorporated, the enthalpy gain tends to regress towards the mean, and phase separation therefore occurs for small values of n. Consequently, the enthalpy gain corresponds to an extreme value of the interaction matrix. There is therefore always a demixing transition, whose temperature depends on the specific distribution of η. In particular, with this argument, for $\eta \sim \mathcal{N}$ the authors of [11] find that the transition temperature scales as $\beta_{T,\mathcal{N}} \sim \sqrt{\log N}$. If the same argument is applied to other distributions, one finds that $\beta_{T,\mathcal{L}} \sim \mathcal{O}(1)$ and $\beta_{T,\gamma} \sim \log(N)$ (see SI for the derivation).
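A compressed, heuristic version of this balance (my paraphrase; constants and the precise sample size are dropped, and the full derivations are in [11] and the SI): equating the per-site entropy cost of demixing a small set of components with the extreme-value enthalpy gain gives

$$\log N \sim \frac{\beta_T\, z}{2}\,\bigl|\Delta\eta_{\min}\bigr| .$$

Since the extreme value $|\Delta\eta_{\min}|$ grows as $\sqrt{\log N}$ for Gaussian interactions, as $\log N$ for Laplace interactions, and saturates at a constant for bounded distributions, one obtains the three scalings quoted above.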

An alternative derivation, based on thermodynamic stability, can be established by examining the mean-field free energy, which is written as:

$\beta F = \sum_{i=1}^{N} \phi_i \log \phi_i + \dfrac{\beta z}{2} \sum_{i,j} \eta_{ij}\, \phi_i \phi_j \qquad (3)$

Properties of the system are then sought for a homogeneous composition, i.e. $\phi_i = 1 / N$. In particular, thermodynamic stability requires that the free energy be convex, which can in turn be related to the eigenvalues of the interaction matrix, given by the Wigner semi-circle distribution. This derivation predicts $\beta_T \sim \sqrt{N}$, independently of the underlying distribution (see [7, 10]).
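For reference, a heuristic version of this stability criterion (my notation, using equation (3) as written above and ignoring the constraint $\sum_i \phi_i = 1$): at $\phi_i = 1/N$ the Hessian of the free energy is

$$\frac{\partial^2 (\beta F)}{\partial \phi_i \partial \phi_j} = N\,\delta_{ij} + \beta z\, \eta_{ij},$$

which loses positive-definiteness when $\beta z\,|\lambda_{\min}(\eta)| \sim N$. With the semi-circle edge $\lambda_{\min}(\eta) \approx -2\sigma\sqrt{N}$, this gives $\beta_T \sim \sqrt{N}/(2 z \sigma)$, i.e. $\beta_T \sim \sqrt{N}$ regardless of the shape of $\mathcal{D}$.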

3.1. Mixed state

In the high temperature limit, there are no correlations in the system, and we can attempt to factorize the partition function using a Bethe approximation. In this approximation, we replace interactions in the system by a sum over the local neighbors n, i.e. the partition function for site i is given by $\exp(-\beta \sum_n \eta(s_i, s_n))$. Further interactions are assumed to be in a mean-field-like regime dictated by similar partition functions, which implies that distributions over neighbors are independent. This can therefore be viewed as a mean-field regime with local correlations. Using this approximation here replaces individual sites with a z-tuplet of bonds, where the bond energies are i.i.d. according to $\mathcal{D}$. To make the calculation tractable, I disregard any effects due to the granularity of η, i.e. realizations of rows and columns of η are treated as continuous, which implicitly assumes $N\rightarrow\infty$. This could similarly be viewed as computing the partition function for annealed disorder, although here N takes the place normally taken by L in usual disordered systems. Under this set of approximations, the single-site partition function $\mathcal{Z}$ can be written as:

$\mathcal{Z} = \displaystyle\int g(E)\, e^{-\beta E}\, \mathrm{d}E \qquad (4)$

where g(E) is the density of states corresponding to a sum of z random variables i.i.d. according to $\mathcal{D}$. The partition function of the system is then simply $\mathcal{Z}^{L^2/2}$: the summation is performed over z-tuplets of bonds rather than over sites, and since each bond is shared by two sites the exponent carries an extra factor of 1/2. Exact analytical expressions for g(E) are only available in a limited number of cases (see SI, section 'Particular cases'). In principle, the factorization of the partition function is unjustified, except in the limit $\beta\rightarrow0$. However, as N increases, thermodynamic quantities, e.g. the mean energy, converge to the predictions of equation (4) for $\beta \lt \beta_T$ (see figure 1). I conjecture that this convergence stems from the rows and columns of η becoming statistically similar as N increases, and that the approximation becomes exact in the limit $N\rightarrow\infty$. Expected mean energies extracted from equation (4) are given in table 1 for various distributions.
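The following sketch (my own numerical check, not code from the article) estimates the per-site mean energy implied by equation (4) by sampling g(E) as a sum of z i.i.d. bond energies and reweighting by the Boltzmann factor; for Gaussian bonds it should approach the closed-form result $-z\beta/2$ per site quoted in table 1.

```python
import numpy as np

rng = np.random.default_rng(2)
z, beta, samples = 4, 0.8, 1_000_000

# g(E): energy of a z-tuplet of i.i.d. standard-Gaussian bonds
E = rng.standard_normal((samples, z)).sum(axis=1)

# <E> under equation (4): reweight samples of g(E) by exp(-beta * E);
# the factor 1/2 converts the z-bond tuplet into a per-site energy
w = np.exp(-beta * (E - E.min()))            # shift only for numerical stability
mean_energy_per_site = 0.5 * np.sum(w * E) / np.sum(w)

print(mean_energy_per_site, -0.5 * z * beta)  # Monte-Carlo estimate vs -z*beta/2
```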

Figure 1. Mean energy per site for Gaussian (A), Laplace (B) and uniform (C) distributions. Color (inset) indicates the number of available states (N). The 'Theory' label indicates the mean energy computed from the factorized partition function defined in equation (4). Note that the vertical axis is logarithmic in (C).


Table 1. Mean energies and timescales predicted by the partition functions for the mixed phase. Mean energies are exact except in the γ-Bounded case, where the expression is asymptotic as $\beta\rightarrow\infty$. Timescales are asymptotic expressions as $\beta\rightarrow\infty$, except for the Laplace distribution, where the expression is asymptotic as $\beta\rightarrow1$. z indicates the number of neighbors of each lattice site.

Distribution | Mean energy | Timescale
Gaussian | $-\frac{z}{2}\beta$ | $z^{1/2}\beta\,\exp(\frac{z}{2} \beta^2)$
Laplace | $\frac{z\beta}{\beta^2 - 1}$ | $\exp(\frac{z}{1-\beta})$
Uniform | $-\frac{z}{2}(\coth(\beta) - \beta^{-1})$ | $\beta^{z}$
γ-Bounded | $\sim z \beta^{-1}$ | $\beta^{z(\gamma+1)}$

3.2. Demixing transition

The presence of a transition can be clearly observed in figure 1, where the mean energy changes abruptly. The temperature at which this transition takes place depends on N. For Gaussian distributions, the trend of $\beta_{T,\mathcal{N}}$ versus N is compatible with the prediction of [11] ($\beta_{T,\mathcal{N}} \sim \sqrt{\log N}$). This function varies slowly, and verifying the relation would require data spanning multiple orders of magnitude. Due to uncertainties and the narrow range of values taken by $\beta_{T,\mathcal{N}}$, multiple trends can be fitted with reasonable agreement: $\log(N)$, $\sqrt{\log(N)}$ or even a power law $N^{0.10}$ (see figure S1). The simulations here are therefore unable to confirm the scaling prediction of [11]. The scaling exponent of 0.1 is, however, clearly distinct from the mean-field prediction of [7], $\beta_T \sim N^{1/2}$.
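To illustrate why these trends are hard to tell apart, the short sketch below (my own illustration; the range of N is arbitrary and not taken from the simulations) rescales the three candidate trends onto [0, 1] over a 32–1024 window and prints their maximum deviation from the $\log(N)$ trend, which amounts to only a few per cent of the total variation.

```python
import numpy as np

N = np.logspace(np.log10(32), np.log10(1024), 200)   # arbitrary illustrative range
curves = {"log(N)": np.log(N),
          "sqrt(log(N))": np.sqrt(np.log(N)),
          "N**0.10": N**0.10}

ref = (np.log(N) - np.log(N[0])) / (np.log(N[-1]) - np.log(N[0]))
for name, f in curves.items():
    g = (f - f[0]) / (f[-1] - f[0])                  # rescale each trend to [0, 1]
    print(name, float(np.max(np.abs(g - ref))))      # max deviation from the log(N) trend
```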

For bounded distributions, the data show that $\beta_{T,\gamma} \sim N^a$, where a is an apparent characteristic exponent, in contradiction with the results presented in [7, 10, 11] (see figure 2). To understand this discrepancy, I investigate the partition function of the system at very low temperatures. At zero temperature, the system is in its ground state, a crystal comprised of the particles corresponding to the minimum of η. At non-zero but sufficiently low temperatures, a small proportion of these sites is replaced by defects. If the number of defects is sufficiently small, a dilute approximation can be used, which implies that the number of neighboring defects is negligible. A factorized partition function can be trivially written for isolated defects, $\mathcal{Z} = \sum_k \exp(-z \beta U_k)$, where $U_k$ is the energy of the kth order defect ($U_1 \lt U_2 \lt \ldots$). I further approximate the energy of a given defect by its expectation value, such that the partition function can be expressed as $\mathcal{Z} = \sum_k \exp(-z\beta\langle U_k \rangle)$, where $\langle U_k \rangle$ is the mean of the kth order statistic of η. Order statistics are implicitly dependent on N. The functional dependence of the transition temperature is then sought by investigating the functional form of the partition function (see SI section 'Particular cases' for details). This disregards any contribution from the high-temperature state. When doing so, one recovers $\beta_{T,\mathcal{N}} \sim \sqrt{\log N}$ as in [11]. However, for distributions of order γ, the model predicts a power law with a variable exponent, $\beta_T\sim N^{1/(1+\gamma)}$.
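A compact way to see where the variable exponent comes from (a heuristic reading of this defect partition function, with constants dropped and under the assumption that the pool of defect energies relevant to a given crystal grows linearly with N): for a bounded distribution of order γ the cumulative distribution rises as $\epsilon^{\gamma+1}$ above the bound, so the kth order statistic sits at

$$\langle U_k\rangle - \langle U_1\rangle \sim \left(\frac{k}{N}\right)^{\frac{1}{\gamma+1}} .$$

The summed Boltzmann weight of defects, $\sum_k e^{-z\beta(\langle U_k\rangle-\langle U_1\rangle)} \sim N\,(z\beta)^{-(\gamma+1)}$, becomes of order one when $\beta \sim N^{1/(1+\gamma)}/z$, which is the power law quoted above; running the same argument with Gaussian order statistics instead returns $\beta_T \sim \sqrt{\log N}$.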

Figure 2. Behavior of the phase transition temperature, $\beta_T$, versus the number of states N for bounded distributions. (A) Example of the observed power-law behavior, $\beta_T\sim N^a$, for one of the bounded distributions, together with the fitted exponent. (B) Observed apparent exponent for Beta (blue), parabolic (orange) and triangular (yellow) distributions. The prediction of the ground state model is shown in black for reference. Error bars indicate the 95% confidence interval of the fitting procedure.


This derivation of the partition function can be directly linked to a particular picture of the phase transition. At low temperature, the demixed phase contains mostly two components. The other components (defects) are sparsely distributed throughout the phase. As the temperature is increased, the defects increase both in number and in energy. The phase transition is reached when the demixed phase is mostly comprised of defects.

To test the partition function of the ground state model, I investigate the symmetric Beta family of distributions, $\eta\sim\mathrm{Beta}(\alpha, \alpha)$. This family satisfies $\alpha = 1 + \gamma$, and the model therefore predicts $a = 1 / \alpha$. The ratio of the predicted to the measured exponent is shown in figure 2(B). In addition to the Beta family, I also computed the apparent exponent for parabolic (γ = 0) and triangular (γ = 1) distributions. It appears that, while the measured exponent does decrease with γ, the theory does not correctly predict the exponents observed in simulations. In particular, the partition function predicts universal behavior as a function of γ, whereas two different values are observed in simulations for γ = 0. This may be related to the presence of multiple transitions (see the discussion below, and further discussion in the SI).

3.3. Timescale

To derive a characteristic timescale, we compute the inverse of the probability of accepting a Monte-Carlo move from equilibrium. That is, if the system currently sits at an energy $E = \langle E \rangle$ and a new energy is generated according to the distribution g(E), how many trials does it take to change the state? The energy of a proposed move can be directly estimated from g(E), and the mean energy can be estimated from equation (4). We estimate the number of trials required by asking how many proposed moves have an acceptance rate above $\exp(-1)$. Acceptance rates are determined by the Metropolis criterion, $\exp(-\beta \Delta E)$, and the number of trials is therefore determined by the fraction of proposed moves with energy below $\langle E \rangle + \beta^{-1}$. Formally, this defines the mixed-state timescale as:

$\tau_{\mathrm{mixed}} = \left[\displaystyle\int_{-\infty}^{\langle E \rangle + \beta^{-1}} g(E)\, \mathrm{d}E\right]^{-1} \qquad (5)$

which is asymptotically valid as $\tau_{\mathrm{mixed}} \rightarrow \infty$. Expected timescales are given in table 1. The corresponding timescales can be extracted from the energy auto-correlation function
