Modeling stochasticity and variability in gene regulatory networks
 David Murrugarra^{1, 2},
 Alan Veliz-Cuba^{3},
 Boris Aguilar^{4},
 Seda Arat^{1, 2} and
 Reinhard Laubenbacher^{1, 2}
DOI: 10.1186/1687-4153-2012-5
© Murrugarra et al; licensee Springer. 2012
Received: 14 November 2011
Accepted: 6 June 2012
Published: 6 June 2012
Abstract
Modeling stochasticity in gene regulatory networks is an important and complex problem in molecular systems biology. To elucidate intrinsic noise, several modeling strategies such as the Gillespie algorithm have been used successfully. This article contributes an alternative to these classical settings. It works within the discrete paradigm, where genes, proteins, and other molecular components of gene regulatory networks are modeled as discrete variables and are assigned logical rules describing their regulation through interactions with other components. Stochasticity is modeled at the biological function level under the assumption that even if the expression levels of the input nodes of an update rule guarantee activation or degradation, there is a probability that the process will not occur due to stochastic effects. This approach allows a finer analysis of discrete models and provides a natural setup for cell population simulations to study cell-to-cell variability. We applied our methods to two of the most studied regulatory networks, the outcome of lambda phage infection of bacteria and the p53-mdm2 complex.
1 Introduction
Variability at the molecular level, defined as the phenotypic differences within a genetically identical population of cells exposed to the same environmental conditions, has been observed experimentally [1–4]. Understanding mechanisms that drive variability in molecular networks is an important goal of molecular systems biology, for which mathematical modeling can be very helpful. Different modeling strategies have been used for this purpose and, depending on the level of abstraction of the mathematical models, there are several ways to introduce stochasticity. Dynamic mathematical models can be broadly divided into two classes: continuous, such as systems of differential equations (and their stochastic variants) and discrete, such as Boolean networks and their generalizations (and their stochastic variants). This article will focus on stochasticity and discrete models.
A discrete model in the variables x_{1}, ..., x_{ n }, where each x_{ i } takes values in a finite set X_{ i }, can be written as a function f = (f_{1}, ..., f_{ n }): X → X, with X = X_{1} × ⋯ × X_{ n }, where each coordinate function f_{ i } : X → X_{ i } depends on a subset of the variables {x_{1}, ..., x_{ n }}. Dynamics are generated by iteration of f, and different update schemes can be used for this purpose. As an example, if X_{ i } = {0, 1} for all i, then each f_{ i } is a Boolean rule and f is a Boolean network where all the variables are updated simultaneously. We will assume that each X_{ i } comes with a natural total ordering of its elements (corresponding to the concentration levels of the associated molecular species). Examples of this type of dynamical system representation are Boolean networks, logical models and Petri nets [5–7].
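As an illustration of how dynamics arise from iterating f with a synchronous update, the following minimal sketch follows each initial state of a small Boolean network until a state repeats. The two-node network `f` and the helper `trajectory` are hypothetical and chosen only for illustration; they are not a model from this article.

```python
from itertools import product

def f(state):
    """Hypothetical 2-node Boolean network: f1 = x2, f2 = x1 AND NOT x2."""
    x1, x2 = state
    return (x2, int(x1 == 1 and x2 == 0))

def trajectory(update, start, max_steps=20):
    """Iterate `update` synchronously until a state repeats.

    Returns the list of visited states and the attractor (steady state or cycle).
    """
    seen = [start]
    current = start
    for _ in range(max_steps):
        current = update(current)
        if current in seen:
            return seen, seen[seen.index(current):]
        seen.append(current)
    return seen, []

for start in product((0, 1), repeat=2):
    path, attractor = trajectory(f, start)
    print(start, "visits", path, "and ends in", attractor)
```

For this toy network the state 00 is a steady state, while 01 and 10 form a 2-cycle.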
To account for stochasticity in this setting several methods have been considered. Probabilistic Boolean networks (PBNs) [8, 9] introduce stochasticity in the update functions, allowing a different update function to be used at each iteration, chosen from a probability space of such functions for each network node. For other approaches, see [10–12]. These models will be discussed in more detail in the next section. In this article we present a model type related to PBNs, with additional features. We show that this model type is a natural way to simulate gene regulation as a stochastic process and is well suited to simulating experiments with cell populations.
1.1 Modeling stochasticity in gene regulatory networks
Gene regulation processes are inherently stochastic. Accurately modeling this stochasticity is a complex and important goal in molecular systems biology. Depending on the level of knowledge of the biological system and the availability of data for it, one could follow different approaches. For instance, viewing a gene regulatory network as a biochemical reaction network, the Gillespie algorithm can be applied to simulate each biochemical reaction separately, generating a random walk corresponding to a solution of the chemical master equation of the system [13, 14]. At an even more detailed level one could introduce time delays into the Gillespie simulations to account for realistic time delays in activation or degradation, such as in circadian rhythms [15–17]. At a higher level of abstraction, stochastic differential equations [18] contain a deterministic approximation of the system and an additional random white noise term. However, all these schemes require all the kinetic rate constants to be known, which can be a strong constraint because kinetic parameters are difficult to measure, and which limits these approaches to small systems.
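To make the dependence on rate constants concrete, here is a minimal sketch of the Gillespie direct method for a single species that is produced at a constant rate and degraded at a rate proportional to its copy number. The reaction system, the function name `gillespie_birth_death`, and the rate values are assumptions made for illustration; they do not come from this article.

```python
import random

def gillespie_birth_death(k_prod, k_deg, n0, t_end, seed=0):
    """Simulate 0 -> X (rate k_prod) and X -> 0 (rate k_deg * n) up to time t_end."""
    rng = random.Random(seed)
    t, n = 0.0, n0
    times, counts = [t], [n]
    while True:
        a_prod = k_prod              # propensity of the production reaction
        a_deg = k_deg * n            # propensity of the degradation reaction
        a_total = a_prod + a_deg
        if a_total == 0:
            break                    # no reaction can fire any more
        t += rng.expovariate(a_total)    # exponential waiting time to the next reaction
        if t > t_end:
            break
        if rng.random() * a_total < a_prod:
            n += 1                   # production fired
        else:
            n -= 1                   # degradation fired
        times.append(t)
        counts.append(n)
    return times, counts

times, counts = gillespie_birth_death(k_prod=10.0, k_deg=1.0, n0=0, t_end=5.0)
print(len(times), "events, final copy number", counts[-1])
```

Both `k_prod` and `k_deg` must be supplied numerically, which is the constraint the paragraph above refers to.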
As mentioned in the introduction, discrete models, which do not depend on rate constants, are an alternative to continuous models. In this setting, several approaches to introduce stochasticity have been proposed. In particular, for Boolean networks, stochasticity has been introduced by flipping node states from 0 to 1 or vice versa with some flip probability [12, 19–21]. However, it has been argued that this way of introducing stochasticity into the system usually leads to overrepresentation of noise [11]. The main criticism of this approach is that it does not take into consideration the correlation between the expression values of input nodes and the probability of flipping the expression of a node due to noise. In fact, this approach models the stochasticity at a node regardless of the susceptibility to noise of the underlying biological function [11].
Probabilistic Boolean networks [8, 9, 22] are another stochastic method proposed within the discrete strategy. PBNs model the choice among alternate biological functions during the iteration process, rather than modeling the stochasticity of the function failure itself. We have adopted a special case of this setting, in which every node has two functions associated with it: the function that governs its evolution over time and the identity function. If the first is chosen, then the node is updated based on its logical rule. If the identity function is chosen, then the state of the node is not updated. The key difference from a PBN is the assignment of probabilities that govern which update is chosen. In our setting, each function is assigned two probabilities. Precisely, let x_{ i } be a variable. We assign to it a probability ${p}_{i}^{\uparrow}$, which determines the likelihood that x_{ i } will be updated based on its logical rule, if this update leads to an increase/activation of the variable. Likewise, a probability ${p}_{i}^{\downarrow}$ determines this likelihood in case the variable is decreased/inhibited. Two different probabilities are needed because activation and degradation represent different biochemical processes, and even if these two are encoded by the same function, their propensities are in general different. This is very similar to what is considered in differential equations modeling, where, for instance, the kinetic rate parameters for activation and for degradation/decay are, in principle, different.
Note that all these approaches only take account of intrinsic noise which is generated from small fluctuations in concentration levels, small number of reactant molecules, and fast and slow reactions. Another source of stochasticity is related to extrinsic noise such as a noisy cellular environment and temperature. For more about intrinsic vs extrinsic noise see [3, 23].
2 Method
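Our setup, which we refer to as a stochastic discrete dynamical system (SDDS), is defined as follows. Let x_{1}, ..., x_{ n } be variables taking values in finite sets X_{1}, ..., X_{ n }, and let X = X_{1} × ⋯ × X_{ n }. An SDDS in the variables x_{1}, ..., x_{ n } is a collection of n triplets $F={\left\{{f}_{i},{p}_{i}^{\uparrow},{p}_{i}^{\downarrow}\right\}}_{i=1}^{n}$, where

f_{ i } : X → X_{ i } is the update function for x_{ i }, for all i = 1, ..., n.

${p}_{i}^{\uparrow}$ is the activation propensity.

${p}_{i}^{\downarrow}$ is the degradation propensity.

${p}_{i}^{\uparrow}$, ${p}_{i}^{\downarrow}\in \left[0,1\right]$.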
We now proceed to study the dynamics of such systems and two specific models as illustration.
2.1 Dynamics of SDDS
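The dynamics are defined coordinate-wise through a transition probability π_{i, x}(x_{ i } → y_{ i }) from the current value x_{ i } of the ith coordinate to a value y_{ i }, given the current state x. If the possible future value f_{ i }(x) of the ith coordinate is larger (smaller, resp.) than the current value, then the activation (degradation) propensity determines the probability that the ith coordinate will increase (decrease) its current value to f_{ i }(x); otherwise the coordinate keeps its current value. If the ith coordinate and its possible future value are the same, then the ith coordinate of the system will maintain its current value with probability 1. Notice that π_{i, x}(x_{ i } → y_{ i }) = 0 for all y_{ i } ∉ {x_{ i }, f_{ i }(x)}. The probability of a transition from a state x to a state y is the product of these coordinate-wise probabilities, as in the example of Section 2.1.1.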
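Written out, the verbal rule above corresponds to the following case definitions. They are reconstructed here from the description in the text, and the product form for whole-state transitions assumes that the coordinates are updated independently, which agrees with the products in the example of Section 2.1.1.

```latex
\pi_{i,x}\bigl(x_i \to f_i(x)\bigr) =
\begin{cases}
p_i^{\uparrow},   & \text{if } x_i < f_i(x),\\
p_i^{\downarrow}, & \text{if } x_i > f_i(x),\\
1,                & \text{if } x_i = f_i(x),
\end{cases}
\qquad
\pi_{i,x}\bigl(x_i \to x_i\bigr) =
\begin{cases}
1 - p_i^{\uparrow},   & \text{if } x_i < f_i(x),\\
1 - p_i^{\downarrow}, & \text{if } x_i > f_i(x),\\
1,                    & \text{if } x_i = f_i(x),
\end{cases}

\Pr(x \to y) = \prod_{i=1}^{n} \pi_{i,x}(x_i \to y_i).
```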
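The dynamics can be represented as a state space graph with vertex set X, in which an edge from x to y carries the transition probability from x to y as its weight. By convention we omit edges with weight zero. See Additional file 1 for pseudocode of algorithms to compute the dynamics of SDDS. Software to test examples is available at http://dvd.vbi.vt.edu/adam.html [24] as a web tool (choose SDDS in the model type).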
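A minimal simulation sketch of these dynamics follows. It is neither the pseudocode of Additional file 1 nor the implementation behind the web tool; the function names `sdds_step` and `simulate_population`, the toy update rules, and the propensity values are assumptions made for illustration. Each coordinate moves to its target value f_i(x) with the matching propensity and otherwise keeps its value, and a population is a set of independent trajectories whose states are averaged at every step.

```python
import random

def sdds_step(state, functions, p_up, p_down, rng):
    """One stochastic update of all coordinates of `state`."""
    new_state = []
    for i, x_i in enumerate(state):
        target = functions[i](state)
        if target > x_i:
            prob = p_up[i]        # activation propensity applies
        elif target < x_i:
            prob = p_down[i]      # degradation propensity applies
        else:
            prob = 1.0            # nothing to change
        new_state.append(target if rng.random() < prob else x_i)
    return tuple(new_state)

def simulate_population(start, functions, p_up, p_down, steps, cells, seed=0):
    """Average value of every variable over `cells` independent trajectories."""
    rng = random.Random(seed)
    n = len(start)
    states = [tuple(start)] * cells
    averages = [[float(v) for v in start]]
    for _ in range(steps):
        states = [sdds_step(s, functions, p_up, p_down, rng) for s in states]
        averages.append([sum(s[i] for s in states) / cells for i in range(n)])
    return averages

# Hypothetical two-node Boolean system and propensities.
toy_functions = [lambda s: s[1], lambda s: int(s[0] == 1 and s[1] == 0)]
averages = simulate_population((0, 1), toy_functions,
                               p_up=[0.1, 0.5], p_down=[0.2, 0.9],
                               steps=50, cells=1000)
print(averages[-1])
```

With all propensities equal to 1 the step reduces to the deterministic synchronous update, and with all propensities equal to 0 no state ever changes.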
Given an SDDS $F={\left\{{f}_{i},{p}_{i}^{\uparrow},{p}_{i}^{\downarrow}\right\}}_{i=1}^{n}$, it is straightforward to verify that F has the same steady states (fixed points) as the deterministic system $G={\left\{{f}_{i}\right\}}_{i=1}^{n}$ (see Additional file 1). It is also important to note that the dynamics of F include the different trajectories that can be generated from G using other common update mechanisms such as the synchronous and asynchronous schemes (see Additional file 1).
2.1.1 Example
Pr(01 → 10) = (.1)(.9) = .09, Pr(01 → 00) = (1 - .1)(.9) = .81
Pr(01 → 01) = (1 - .1)(1 - .9) = .09, Pr(01 → 11) = (.1)(1 - .9) = .01
Pr(10 → 10) = (1 - .2)(1 - .5) = .4, Pr(10 → 01) = (.2)(.5) = .1
Pr(10 → 00) = (.2)(1 - .5) = .1, Pr(10 → 11) = (1 - .2)(.5) = .4
Pr(11 → 11) = (1)(1 - .9) = .1, Pr(11 → 10) = (1)(.9) = .9
Pr(00 → 00) = (1)(1) = 1.
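The listed values can be reproduced with a short exact computation. The sketch below assumes the update rules f1 = x2 and f2 = x1 AND NOT x2 and the propensities p1↑ = .1, p1↓ = .2, p2↑ = .5, p2↓ = .9. These are inferred from the transition probabilities listed above rather than quoted from a definition of the example, so they should be read as one system consistent with the listed numbers.

```python
from itertools import product

# Assumed reconstruction, inferred from the listed probabilities.
functions = [
    lambda s: s[1],                          # f1(x1, x2) = x2
    lambda s: int(s[0] == 1 and s[1] == 0),  # f2(x1, x2) = x1 AND NOT x2
]
p_up = [0.1, 0.5]
p_down = [0.2, 0.9]

def coordinate_probability(x_i, y_i, target, up, down):
    """Probability that one coordinate moves from x_i to y_i when its rule points to `target`."""
    if target == x_i:
        return 1.0 if y_i == x_i else 0.0
    p = up if target > x_i else down
    if y_i == target:
        return p
    if y_i == x_i:
        return 1.0 - p
    return 0.0

def transition_probability(x, y):
    """Product of the coordinate-wise probabilities for the transition x -> y."""
    prob = 1.0
    for i in range(len(x)):
        prob *= coordinate_probability(x[i], y[i], functions[i](x), p_up[i], p_down[i])
    return prob

states = list(product((0, 1), repeat=2))
for x in states:
    for y in states:
        p = transition_probability(x, y)
        if p > 0:
            print(f"Pr({x[0]}{x[1]} -> {y[0]}{y[1]}) = {p:.2f}")
```

The probabilities leaving each state sum to 1, and 00 is the only state that is left with probability 0.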
3 Applications
We illustrate the advantages of this model type by applying it to two widely studied biological systems, the regulation of the p53-mdm2 network and the control of the outcome of phage lambda infection of bacteria. These regulatory networks were selected because stochasticity plays a key role in their dynamics.
3.1 Regulation in the p53-Mdm2 network
Propensity probabilities for the p53-Mdm2 regulatory network
              P     Mc    Mn    Dam
Activation    .9    .9    .9    1
Degradation   .9    .9    .9    .05
The state space for this model is specified by [0, 2] × [0, 1] × [0, 1] × [0, 1], that is, except for the first variable P which has three levels {0, 1, 2}, all other variables are Boolean. The update functions for this model are provided in Additional file 1 and also in the model repository of our web tool at http://dvd.vbi.vt.edu/adam.html.
Propensity parameters for Figure 7 (top frame)
              CI    CRO   CII   N
Activation    .8    .2    .9    .9
Degradation   .2    .8    .9    .9
Propensity parameters for Figure 7 (bottom frame)
              CI    CRO   CII   N
Activation    .3    .7    .9    .9
Degradation   .7    .3    .9    .9
To highlight the features of our approach we compare our model with the one presented in [25], in which variability has been analyzed. The main difference between these two models is in the way the simulations are performed. In [25], the transition from one state to the next is determined by parameters called "on" and "off" time delays. For instance, to transition from 2001 to 2101 it is required that ${t}_{Mc}<{t}_{\overline{\text{dam}}}$, which means that the "on" delay for Mc (time for activating) is less than the "off" delay (time for degrading) of the damage. Otherwise, if ${t}_{Mc}>{t}_{\overline{\text{dam}}}$, the system will transition from 2001 to 2000. In this article, transitions from one state to others are given as probabilities, which are determined from the propensity probabilities. Therefore, the complexity of the model presented here is at the level of the wiring diagram (i.e., the number of variables), while the complexity of the model in [25] is at the level of the state space (i.e., the number of possible states), which is exponential in the number of variables. Another key difference is the way DNA damage repair is modeled. In [25], a delay parameter ${t}_{\overline{\text{dam}}}$ is associated with the disappearance of the damage, and this is decreased by a certain amount τ at each iteration so that ${t}_{\overline{\text{dam}}}^{\left(n\right)}={t}_{\overline{\text{dam}}}^{\left(0\right)}-n\tau \ge 0$, where n is the number of iterations. In order to simulate DNA damage with this approach it is required to estimate τ, n, and ${t}_{\overline{\text{dam}}}^{\left(0\right)}$. Within our model framework a single parameter, the degradation propensity, is used to model the damage repair, which is a more natural setup.
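To see what a single degradation propensity implies for damage repair, the sketch below computes how long damage persists under a simplifying assumption: whenever Dam is on, its update rule points to 0, so repair succeeds at each step with probability equal to the degradation propensity, independently of earlier steps. The actual update function for Dam is given in Additional file 1 and may differ, and the function names are hypothetical. Under this assumption the repair time follows a geometric distribution.

```python
def damage_survival(p_repair, steps):
    """Probability that the damage is still present after `steps` updates."""
    return (1.0 - p_repair) ** steps

def mean_repair_time(p_repair):
    """Expected number of updates until the damage is repaired (geometric waiting time)."""
    return 1.0 / p_repair

p_dam_down = 0.05   # degradation propensity of Dam in the p53-Mdm2 propensity table
print("mean repair time in steps:", mean_repair_time(p_dam_down))
for t in (0, 10, 20, 40):
    print(t, round(damage_survival(p_dam_down, t), 3))
```

In a population of cells, `damage_survival` is also the expected fraction of cells that still carry damage after the given number of steps, so one parameter sets both the single-cell repair time and its spread across the population.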
3.2 Phage lambda infection of bacteria
The state space for this model is specified by [0, 2] × [0, 3] × [0, 1] × [0, 1], that is, the first variable, CI, has three levels {0, 1, 2}, the second variable, CRO, has four levels {0, 1, 2, 3}, and the third and fourth variables, CII and N, are Boolean. Update functions for this model are available in our supporting material, Additional file 1. This model has a steady state, 2000, and a 2-cycle involving 0200 and 0300. The steady state 2000 represents lysogeny, where CI is fully expressed while the other genes are off. The cycle between 0200 and 0300 represents lysis, where CRO is active and other genes are repressed.
4 Conclusions
Using a discrete modeling strategy, this article introduces a framework to simulate stochasticity in gene regulatory networks at the function level, based on the general concept of PBNs. It accounts for intrinsic noise due to spontaneous differences in timing, small fluctuations in concentration levels, small numbers of reactant molecules, and fast and slow reactions. This framework was tested using two widely studied regulatory networks, the regulation of the p53-Mdm2 network and the control of phage lambda infection of bacteria. It is shown that in both of these examples the use of propensity probabilities for activation and degradation of network nodes provides a natural setup for cell population simulations to study cell-to-cell variability. The new feature of this framework is the introduction of activation and degradation propensities that determine how fast or slow the discrete variables are being updated. This provides the ability to generate more realistic simulations of both single cell and cell population dynamics. In the example of the p53-Mdm2 system, one can see that individual simulations show sustained oscillations when DNA damage is present, while at the cell population level these individual oscillations average to a damped oscillation. This agrees with experimental observations [4]. In the second example, λ-phage infection of bacteria, it is observed that differences in developmental outcome due to intrinsic noise can be captured with this framework. Due to the lack of experimental data we are unable to calibrate the model so that it reproduces the correct difference in percentages due to intrinsic noise. Instead, we present a plot of the difference in developmental outcome as a function of the propensity parameters.
It is worth noting that this article addresses only intrinsic noise generated from small fluctuations in concentration levels, small numbers of reactant molecules, and fast and slow reactions. Extrinsic noise is another source of stochasticity in gene regulation [3, 23], and it would be interesting to see if this framework or a similar setup can be adapted to account for extrinsic stochasticity under the discrete approach. This framework also lends itself to the study of intrinsic noise and it is useful for the study of developmental robustness. For instance, one could ask what the effect of this type of noise is on the dynamics of networks controlled by biologically inspired functions.
Relating the propensity parameters to biologically meaningful information, or having a systematic way of estimating them, is very important. A preliminary analysis shows that it is possible to relate the propensity parameters in this framework to the propensity functions in the Gillespie algorithm under some conditions (see Additional file 1, where, for a simple degradation model, the degradation propensity is related by a linear equation to the decay rate of the species being degraded). More precisely, in the Gillespie algorithm [13, 14], if one discretizes the number of molecules of a chemical species into discrete expression levels such that within these levels the propensity functions for this species do not change significantly, then one obtains the setup of the framework presented here as a discrete model. That is, simulation within the framework presented here can be viewed as a further discretization of the Gillespie algorithm, in a setting that does not require exact knowledge of model parameters. For a similar approach see [10].
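One way to see why a linear relation is plausible is the following heuristic. It is not necessarily the derivation given in Additional file 1, and the symbols k (first-order decay rate) and Δt (the physical time represented by one discrete update) are introduced here only for illustration. For a molecule that decays with rate k, the waiting time to degradation is exponentially distributed, so

```latex
\Pr(\text{degradation within } \Delta t) \;=\; 1 - e^{-k\,\Delta t} \;\approx\; k\,\Delta t
\quad (k\,\Delta t \ll 1),
\qquad\text{so that}\qquad
p^{\downarrow} \;\approx\; k\,\Delta t .
```

Under this reading, for a fixed step length the degradation propensity grows linearly with the decay rate as long as kΔt stays small.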
Acknowledgements
DM and RL were partially supported by NSF grant CMMI-0908201. RL and DM thank Ilya Shmulevich for helpful suggestions. The authors thank the anonymous reviewers for many suggestions that improved the article.
References
1. Eldar A, Elowitz M: Functional roles for noise in genetic circuits. Nature 2010, 467: 167-173. 10.1038/nature09326
2. Acar M, Mettetal J, van Oudenaarden A: Stochastic switching as a survival strategy in fluctuating environments. Nat Genet 2008, 40: 471-475. 10.1038/ng.110
3. St-Pierre F, Endy D: Determination of cell fate selection during phage lambda infection. PNAS 2008, 105: 20705-20710. 10.1073/pnas.0808831105
4. Geva-Zatorsky N, Rosenfeld N, Itzkovitz S, Milo R, Sigal A, Dekel E, Yarnitzky T, Liron Y, Polak P, Lahav G, Alon U: Oscillations and variability in the p53 system. Mol Syst Biol 2006, 2: 2006.0033. 10.1038/msb4100068
5. Irons D: Logical analysis of the budding yeast cell cycle. J Theor Biol 2009, 257: 543-559. 10.1016/j.jtbi.2008.12.028
6. Thomas R, D'Ari R: Biological Feedback. CRC Press, Boca Raton; 1990.
7. Chaouiya C, Remy E, Mossé B, Thieffry D: Qualitative Analysis of Regulatory Graphs: A Computational Tool Based on a Discrete Formal Framework. Volume 294. Lecture Notes in Control and Information Sciences; 2003:830-832.
8. Shmulevich I, Dougherty E, Kim S, Zhang W: Probabilistic Boolean networks: a rule based uncertainty model for gene regulatory networks. Bioinformatics 2002, 18(2):261-274. 10.1093/bioinformatics/18.2.261
9. Shmulevich I, Dougherty E: Probabilistic Boolean Networks: The Modeling and Control of Gene Regulatory Networks. SIAM, Philadelphia; 2010.
10. Teraguchi S, Kumagai Y, Vandenbon A, Akira S, Standley D: Stochastic binary modeling of cells in continuous time as an alternative to biochemical reaction equations. Phys Rev E 2011, 84(4): 062903.
11. Garg A, Mohanram K, Di Cara A, De Micheli G, Xenarios I: Modeling stochasticity and robustness in gene regulatory networks. Bioinformatics 2009, 25(12):i101-i109.
12. Ribeiro AS, Kauffman SA: Noisy attractors and ergodic sets in models of gene regulatory networks. J Theor Biol 2007, 247: 743-755. 10.1016/j.jtbi.2007.04.020
13. Gillespie D: Exact stochastic simulation of coupled chemical reactions. J Phys Chem 1977, 81(25):2340-2361. 10.1021/j100540a008
14. Gillespie D: Stochastic simulation of chemical kinetics. Annu Rev Phys Chem 2007, 58: 35-55. 10.1146/annurev.physchem.58.032806.104637
15. Bratsun D, Volfson D, Tsimring LS, Hasty J: Delay-induced stochastic oscillations in gene regulation. PNAS 2005, 102(41):14593-14598. 10.1073/pnas.0503858102
16. Ribeiro AS: Stochastic and delayed stochastic models of gene expression and regulation. Math Biosci 2010, 223(1):1-11. 10.1016/j.mbs.2009.10.007
17. Ribeiro AS, Zhu R, Kauffman SA: A general modeling strategy for gene regulatory networks with stochastic dynamics. J Comput Biol 2006, 13(9):1630-1639. 10.1089/cmb.2006.13.1630
18. Toulouse T, Ao P, Shmulevich I, Kauffman S: Noise in a small genetic circuit that undergoes bifurcation. Complexity 2005, 11(1):45-51. 10.1002/cplx.20099
19. Álvarez-Buylla ER, Chaos A, Aldana M, Benítez M, Cortes-Poza Y, Espinosa-Soto C, Hartasánchez DA, Lotto RB, Malkin D, Escalera Santos GJ, Padilla-Longoria P: Floral morphogenesis: stochastic explorations of a gene network epigenetic landscape. PLoS ONE 2008, 3(11):e3626. 10.1371/journal.pone.0003626
20. Davidich MI, Bornholdt S: Boolean network model predicts cell cycle sequence of fission yeast. PLoS ONE 2008, 3(2):e1672. 10.1371/journal.pone.0001672
21. Willadsen K, Wiles J: Robustness and state-space structure of Boolean gene regulatory models. J Theor Biol 2008, 249(4):749-765.
22. Layek R, Datta A, Pal R, Dougherty ER: Adaptive intervention in probabilistic Boolean networks. Bioinformatics 2009, 25(16):2042-2048. 10.1093/bioinformatics/btp349
23. Swain PS, Elowitz MB, Siggia ED: Intrinsic and extrinsic contributions to stochasticity in gene expression. PNAS 2002, 99(20):12795-12800. 10.1073/pnas.162041399
24. Hinkelmann F, Brandon M, Guang B, McNeill R, Blekherman G, Veliz-Cuba A, Laubenbacher R: ADAM: analysis of the dynamics of algebraic models of biological systems using computer algebra. BMC Bioinf 2011, 12: 295. 10.1186/1471-2105-12-295
25. Abou-Jaoudé W, Ouattara D, Kaufman M: From structure to dynamics: frequency tuning in the p53-mdm2 network: I. logical approach. J Theor Biol 2009, 258(4):561-577. 10.1016/j.jtbi.2009.02.005
26. Batchelor E, Loewer A, Lahav G: The ups and downs of p53: understanding protein dynamics in single cells. Nat Rev Cancer 2009, 9: 371-377. 10.1038/nrc2604
27. Ptashne M: A Genetic Switch: Phage λ and Higher Organisms. Cell Press and Blackwell Scientific Publications, Cambridge; 1992.
28. Thieffry D, Thomas R: Dynamical behaviour of biological regulatory networks-II. Immunity control in bacteriophage lambda. Bull Math Biol 1995, 57: 277-295.
29. Reichardt L, Kaiser D: Control of λ repressor synthesis. PNAS 1971, 68: 2185-2189. 10.1073/pnas.68.9.2185
30. Kourilsky P: Lysogenization by bacteriophage lambda I. Multiple infection and the lysogenic response. Mol Gen Genet 1973, 122: 183-195. 10.1007/BF00435190
31. Arkin A, Ross J, McAdams H: Stochastic kinetic analysis of developmental pathway bifurcation in phage λ-infected Escherichia coli cells. Genetics 1998, 149: 1633-1648.
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.