Synthetic Gene Circuits - Methods and Protocols (2021)
Filippo Menolascina
Editor
METHODS IN MOLECULAR BIOLOGY
Series Editor
John M. Walker
School of Life and Medical Sciences
University of Hertfordshire
Hatfield, Hertfordshire, UK
Edited by
Filippo Menolascina
School of Engineering, Institute for Bioengineering, University of Edinburgh, Edinburgh, UK
This Humana imprint is published by the registered company Springer Science+Business Media, LLC, part of Springer
Nature.
The registered company address is: 1 New York Plaza, New York, NY 10004, U.S.A.
Preface
automating the design, construction, testing, and modeling of biocircuits. We hope the
result will be met with favor. I personally wish to thank all the authors for their contribu-
tions: editing this book and venturing in their science were a tremendously enjoyable
learning experience for me.
Abstract
Qualitative modeling approaches are promising and still underexploited tools for the analysis and design of
synthetic circuits. They can make predictions of circuit behavior in the absence of precise, quantitative
information. Moreover, they provide direct insight into the relation between the feedback structure and the
dynamical properties of a network. We review qualitative modeling approaches by focusing on two specific
formalisms, Boolean networks and piecewise-linear differential equations, and illustrate their application by
means of three well-known synthetic circuits. We describe various methods for the analysis of state
transition graphs, discrete representations of the network dynamics that are generated in both modeling
frameworks. We also briefly present the problem of controlling synthetic circuits, an emerging topic that
could profit from the capacity of qualitative modeling approaches to rapidly scan a space of design
alternatives.
Key words Qualitative modeling, Gene regulatory networks, Synthetic circuits, Boolean models,
Piecewise-linear differential equation models, Network control
1 Introduction
Filippo Menolascina (ed.), Synthetic Gene Circuits: Methods and Protocols, Methods in Molecular Biology, vol. 2229,
https://doi.org/10.1007/978-1-0716-1032-9_1, © Springer Science+Business Media, LLC, part of Springer Nature 2021
2 Madalena Chaves and Hidde de Jong
[Figure: circuit diagrams; graphic and caption not recovered. Surviving labels: aTc, IPTG, lacI, glnG, Cbf1/CBF1, Ash1/ASH1, Gal4/GAL4, Swi5/SWI5, Gal80/GAL80, galactose.]
3 Boolean Models
the components involved are not measured, and the reaction rates
and other parameters are unknown. In such cases, a Boolean model
of the network provides a global qualitative view of the dynamical
behavior of the network, using all the available information on the
network, but without introducing unknown parameters.
The first application of Boolean models to biological networks
was suggested by Stuart Kauffman in 1969 [11] and by René
Thomas around the same time [13]. Both used the Boolean repre-
sentation to describe genetic regulatory networks, where events
such as mRNA transcription and protein translation may be
thought of as being “turned on” or “turned off” (1 or 0).
The last 20 years have witnessed an increasing availability of
genomic and proteomic data, the discovery of new biological mole-
cules and pathways, and the multiplication of interactions among
biological components. Nevertheless, it is still difficult to obtain
detailed parameter sets to characterize each biological reaction or
interaction. On the mathematical side, several methods have been
proposed to better characterize Boolean models and introduce
quantitative elements: probabilistic and stochastic approaches
[26, 27], complex updating schedules [28–30], model reduction
[31, 32], attractor computation [33–35], characterization of state
transition graphs [36], network interconnections [37], and control
methods [38–40].
All these advances sparked a new wave of interest in Boolean
models for application to a wide range of biological networks, from
the cellular division cycle in various organisms [41–43], to signal
transduction networks [15, 44], cancer-related networks [45], or
pattern formation [46, 47]. A large collection of recent examples
can be found in a special issue of the journal Frontiers in
Physiology [48].
In addition to Boolean models, there are several approaches
using discrete and logical functions to describe biological networks.
Further work by René Thomas and collaborators extends Boolean
models in several ways [23], such as the inclusion of multiple
discrete levels, by assigning parameters to the transition graph
edges to indicate different concentration thresholds. Among
other formal methods, Petri nets have been successfully applied to
model biological systems [49]. A Petri net is defined through a
graph with two types of nodes (places and transitions), connected
by weighted directed edges. Places may be marked by a number of
tokens that enable transitions. Petri nets are especially suitable to
model biochemical and metabolic networks, as the incidence matrix
of the net reflects the stoichiometry matrix [50].
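The link between the incidence matrix and stoichiometry can be made concrete with a toy example. The sketch below assumes a hypothetical single reaction A + 2B → C, which does not come from the chapter; the names PLACES, PRE, INCIDENCE, enabled, and fire are illustrative only.

```python
# Hypothetical toy Petri net for the reaction A + 2B -> C (not from the chapter).
# Rows are places (species), columns are transitions (reactions).
PLACES = ["A", "B", "C"]
PRE = [[1], [2], [0]]          # tokens each place must hold for the transition to fire
INCIDENCE = [[-1], [-2], [1]]  # net token change; same entries as the stoichiometry column


def enabled(marking, t):
    """A transition is enabled when every input place holds enough tokens."""
    return all(marking[p] >= PRE[p][t] for p in range(len(PLACES)))


def fire(marking, t):
    """Fire transition t: add its incidence column to the marking."""
    if not enabled(marking, t):
        raise ValueError("transition is not enabled")
    return [marking[p] + INCIDENCE[p][t] for p in range(len(PLACES))]


if __name__ == "__main__":
    print(fire([1, 2, 0], 0))
```

Firing a transition adds one column of the incidence matrix to the marking, which is the same bookkeeping as applying one reaction event to a vector of molecule counts.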
3.1 The Toggle Switch
As a first example, consider the toggle switch, a network with two components L and T (for LacI and TetR protein expression, respectively), and two inputs A and I (for aTc and IPTG concentration, respectively). Both variables and inputs take values 0 or 1. To write
[Figure: graphic and caption not recovered. Surviving content: truth tables with columns L T and L+ T+ over the states 00, 01, 10, 11, and state transition graphs over the same states, in panels A to D.]
3.2 The Oscillator with Positive Feedback
This network is also composed of two genes lacI and glnG encoding for two proteins, LacI and NRI, both regulated by the phosphorylated form of the transcription factor NRI. The protein LacI represses transcription of glnG and, in turn, the input IPTG lifts LacI repression. In general, the phosphorylated transcription factor NRI will activate genes lacI and glnG at different concentrations or activity thresholds, that is, whenever protein NRI is above a first threshold concentration θ_N^1, transcription of glnG is activated, and when NRI becomes higher than a second threshold concentration θ_N^2, it induces activity of lacI. The experimental system [51] implies that θ_N^1 < θ_N^2. These distinct thresholds of activation for NRI require a variable with at least three discrete concentration levels, while Boolean variables have only two levels. To resolve this problem, a generalized logical model would consider a multi-leveled variable N to describe the concentration of protein NRI (as in the corresponding PLDE model, Subheading 4.2). Alternatively, Boolean models can also be extended as suggested in [52], by creating two different Boolean variables, N1 and N2, to represent N as follows:
N_1 = \begin{cases} 0, & N < \theta_N^1, \\ 1, & N > \theta_N^1, \end{cases} \qquad N_2 = \begin{cases} 0, & N < \theta_N^2, \\ 1, & N > \theta_N^2. \end{cases}
These two variables will evolve according to different Boolean rules, but should always satisfy N1 ≥ N2, by definition of the thresholds. More specifically, if N is a logical variable with three levels {0, 1, 2}, then N1 and N2 allow us to code for those three levels in a Boolean notation, that is, "0 = 00," "1 = 10," and "2 = 11" so that the higher concentration of N corresponds to both N1 and N2 ON, while the intermediate concentration of N corresponds to N1 ON and N2 OFF. Note that the Boolean state (N1, N2) = (0, 1) does not encode for any level of variable N and does not take part in the state transition graph of the Boolean model.
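The three-level coding just described can be written down in a few lines. This is a minimal sketch; the function names encode_level and decode_pair are hypothetical and chosen only for illustration.

```python
def encode_level(n):
    """Map a logical level n in {0, 1, 2} to the Boolean pair (N1, N2)."""
    if n not in (0, 1, 2):
        raise ValueError("level must be 0, 1 or 2")
    return (int(n >= 1), int(n >= 2))


def decode_pair(n1, n2):
    """Map an admissible pair (N1, N2) back to the level of N.

    The pair (0, 1) violates N1 >= N2 and encodes no level.
    """
    if n1 < n2:
        raise ValueError("(N1, N2) = (0, 1) does not encode any level of N")
    return n1 + n2


if __name__ == "__main__":
    for level in (0, 1, 2):
        print(level, encode_level(level))
```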
Therefore, three variables will be considered: L for LacI and N1, N2 for NRI protein expression. The input IPTG is denoted I. To assign the rules for variables N1 and N2, we will consider that NRI transcription is activated in a first stage by the positive feedback loop and in a second stage LacI repression comes into play. Thus N1 is regulated by N2 only, while N2 is regulated both by N1 and L. The Boolean rules for the oscillator with positive feedback become:
L^+ = N_2, \qquad (4)
N_1^+ = N_2, \qquad (5)
N_2^+ = (\neg L \vee I) \wedge N_1. \qquad (6)
The input I = 1 induces the expression of NRI, followed by its phosphorylation, and subsequent expression of LacI, so the system converges to state 111.
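As a hedged illustration, the rules (4)–(6) can be evaluated directly to obtain the synchronous and asynchronous successors of every state. The state is ordered as (L, N1, N2); the function names are illustrative and not taken from any particular tool.

```python
from itertools import product


def rules(state, i):
    """Boolean rules (4)-(6); state = (L, N1, N2), i = IPTG input (0 or 1)."""
    l, n1, n2 = state
    return (n2, n2, int(((not l) or i) and n1))


def synchronous(state, i):
    """All variables are updated at once."""
    return rules(state, i)


def asynchronous(state, i):
    """One variable is updated at a time; each pending change gives a successor."""
    target = rules(state, i)
    successors = []
    for k in range(3):
        if target[k] != state[k]:
            nxt = list(state)
            nxt[k] = target[k]
            successors.append(tuple(nxt))
    return successors or [state]  # no pending change: the state is a fixed point


def fmt(state):
    return "".join(str(v) for v in state)


if __name__ == "__main__":
    for s in product((0, 1), repeat=3):
        sync = fmt(synchronous(s, 0))
        asyn = ", ".join(fmt(t) for t in asynchronous(s, 0))
        print(fmt(s), "|", sync, "|", asyn)
```

Running the loop with i = 0 lists, for each of the eight states, its synchronous image and its asynchronous successors, which is the information a truth table for the two updating schedules contains.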
In the case I = 0, the synchronous and asynchronous updating schedules lead to quite different state transition graphs, but both contain only one attractor, consisting of the origin with all proteins weakly expressed. In the synchronous case, however, the transition 010 → 001 is an artifact of the simultaneous updating of N1 and N2. This problem is resolved in the asynchronous state transition graph (Fig. 3b), where the states 001 and 101 are transient and do
a  Truth table (I = 0)
L N1 N2 | synchronous L+ N1+ N2+ | asynchronous L+ N1+ N2+
000     | 000                    | 000
001     | 110                    | 101, 011, 000
010     | 001                    | 000, 011
011     | 111                    | 111
100     | 000                    | 000
101     | 110                    | 111, 100
110     | 000                    | 010, 100
111     | 110                    | 110
b  [state transition graph over the states 000 to 111; graphic not recovered]
c  [hierarchical graph over the components C1 to C5; graphic not recovered]
Fig. 3 Oscillator with positive feedback and zero input (I = 0). (a) Truth table for synchronous and asynchronous updating schedules. (b) Asynchronous state transition graph. (c) Hierarchical state transition graph after decomposition into strongly connected components (see Subheading 5.2 for the corresponding analysis and definition of the components Ci)
Qualitative Modeling of Synthetic Circuits 11
not have any incoming arrows from other states. In this graph, the effect of the negative feedback loop between LacI and NRI can be observed in the cyclic orbit which is reached whenever NRI is above its intermediate threshold concentration (N1 = 1): 111 → 110 → 010 → 011 → 111. However, this cyclic orbit is not an attractor itself and the Boolean model predicts that all trajectories eventually converge to the point attractor formed by the origin (see the transitions from the states 010 and 110 to 000).
In this example, the global behavior of the Boolean model
differs from that of the corresponding PLDE model in Subheading
4.2, even though both models have the same point attractor at the
origin and the cyclic orbit of the Boolean model corresponds
exactly to the orbit depicted in Fig. 6b. However, in the PLDE
model, the cyclic orbit is also an attractor, and there are trajectories
converging either to the origin or to a (damped) periodic orbit
depending on the initial conditions. In this case, the PLDE model
allows for a more detailed description of the continuous state space,
as discussed below (Subheading 4.2).
3.3 The IRMA Circuit
This circuit is composed of five genes encoding for five proteins, Ash1, Cbf1, Gal4, Gal80, and Swi5, and one input G (galactose). One of the proteins (Swi5) is a transcription factor for three of the genes. In this circuit, the different activity thresholds of Swi5 relative to the three genes will play an important role in determining the dynamical properties of the system. These thresholds define the (distinct) concentrations of Swi5 which trigger the transcription of the three genes. If S denotes the (continuous) concentration of protein Swi5, then transcription of gene ASH1 is activated whenever S > θ_S^a. Similarly, transcription of genes CBF1 and GAL80 is initiated when S > θ_S^c and S > θ_S^g, respectively. From the analysis in [51], the activity threshold for CBF1 should be the lowest, and in this section we will consider that θ_S^c < θ_S^g < θ_S^a. These distinct thresholds for S require a logical variable with at least four discrete concentration levels so, as in the oscillator example, an extended Boolean model will be constructed [52], by creating three different Boolean variables to represent S as follows:
S_c = \begin{cases} 0, & S < \theta_S^c, \\ 1, & S > \theta_S^c, \end{cases} \qquad S_g = \begin{cases} 0, & S < \theta_S^g, \\ 1, & S > \theta_S^g, \end{cases} \qquad S_a = \begin{cases} 0, & S < \theta_S^a, \\ 1, & S > \theta_S^a. \end{cases}
Fig. 4 State transition graph of the IRMA model, for the case G = 1. The yellow nodes represent the two attractors. This graph was constructed in the software platform Cytoscape [53]
they have also been used for modeling actual regulatory networks,
for example, in microbiology [69–71]. Computer tools allowing
the definition of PLDE models of regulatory networks and their
qualitative analysis are available, such as Genetic Network Analyzer
(GNA) [72, 73]. Recent publications present the (qualitative) anal-
ysis of more general classes of PLDE models [74], while other work
presents the related formalism of hybrid automata and their appli-
cation to circuit modeling [75].
4.1 The Toggle Switch
A PLDE model for the toggle switch can be developed, analogously to the Boolean model in Subheading 3.1. We again use the variables L (LacI), T (TetR), I (IPTG), and A (aTc), but now treat them as concentrations taking their values in ℝ_{≥0}. Similarly, we introduce for each of the variables a concentration threshold, labeled θ_L, θ_T, θ_I, and θ_A, respectively. With these definitions, the step function s^+(L, θ_L) evaluates to 1, if L is present at a high concentration, above its threshold θ_L, and to 0, if L is present at a low concentration, below its threshold. Like in the Boolean model, we would like to express that the gene encoding TetR is expressed when the concentration of L is low or that of I high, in other words ¬s^+(L, θ_L) ∨ s^+(I, θ_I). An equivalent formulation is obtained using de Morgan's law: ¬(s^+(L, θ_L) ∧ ¬s^+(I, θ_I)) = ¬(s^+(L, θ_L) ∧ s^-(I, θ_I)), which can be interpreted as saying that TetR is not expressed when LacI is present at a high concentration and not inhibited due to the presence of IPTG. The latter expression can be rewritten in algebraic form as (1 − s^+(L, θ_L) s^-(I, θ_I)). Similarly, the regulation of LacI by TetR and aTc gives rise to the step function expression (1 − s^+(T, θ_T) s^-(A, θ_A)). Boolean expressions of step functions can always be translated into equivalent algebraic expressions [54].
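The equivalence between the Boolean and the algebraic form of the TetR expression can be checked exhaustively, since each step function only takes the values 0 and 1. The sketch assumes the usual convention s^-(x, θ) = 1 − s^+(x, θ); the function names are illustrative.

```python
from itertools import product


def tetr_boolean(s_plus_L, s_plus_I):
    """Boolean form: (not s+(L)) or s+(I)."""
    return int((not s_plus_L) or s_plus_I)


def tetr_algebraic(s_plus_L, s_plus_I):
    """Algebraic form: 1 - s+(L) * s-(I), with s-(I) = 1 - s+(I) (assumed convention)."""
    s_minus_I = 1 - s_plus_I
    return 1 - s_plus_L * s_minus_I


def forms_agree():
    """Compare the two forms on all four combinations of step function values."""
    return all(tetr_boolean(a, b) == tetr_algebraic(a, b)
               for a, b in product((0, 1), repeat=2))


if __name__ == "__main__":
    print(forms_agree())
```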
With the above considerations, the model for the toggle switch reads as
\dot{L} = \kappa_L \, (1 - s^+(T, \theta_T) \, s^-(A, \theta_A)) - \gamma_L L, \qquad (16)
\dot{T} = \kappa_T \, (1 - s^+(L, \theta_L) \, s^-(I, \theta_I)) - \gamma_T T, \qquad (17)
where I and A are considered constant inputs. The dynamics of this model can be analyzed in the plane, where we assume for the time being that IPTG and aTc are absent from the medium, that is, I = A = 0 and therefore s^-(I, θ_I) = s^-(A, θ_A) = 1. The thresholds for T and L divide the phase space into four regions (Fig. 5a), in each of which the model of Eqs. 16–17 reduces to a simple linear model. For example, in the region S1, defined by the inequalities 0 ≤ L < θ_L and 0 ≤ T < θ_T, we have dL/dt = κ_L − γ_L L and dT/dt = κ_T − γ_T T. In this region all solution trajectories (monotonically) converge to the asymptotically stable steady state of the linear system given by (κ_L/γ_L, κ_T/γ_T)′. This so-called focal point is here assumed to lie outside S1, in the region S3, which amounts to assuming that κ_L/γ_L > θ_L and κ_T/γ_T > θ_T (Fig. 5a). In other
[Figure graphic not recovered. Panel a: phase plane with L on the horizontal axis and T on the vertical axis, thresholds θ_L and θ_T, focal values κ_L/γ_L and κ_T/γ_T, and regions S1 to S4. Panel b: state transition graph with states S1 (00), S2 (10), S3 (11), and S4 (01).]
Fig. 5 PLDE model of toggle switch in the absence of inputs (I = A = 0). (a) Phase plane analysis. Some example solutions are shown (solid curves). (b) State transition graph. The names of the states correspond to the names of the regions and the states have been labeled with the values of s^+(L, θ_L) and s^+(T, θ_T)
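A numerical sketch of Eqs. 16–17 may help to connect the phase plane picture with trajectories. The parameter values below are hypothetical and chosen only so that κ/γ exceeds the corresponding threshold; forward Euler integration and the convention that a step function is 0 exactly at its threshold are simplifying assumptions, not part of the model definition.

```python
# Hypothetical parameter values (not from the chapter): kappa/gamma = 2 > theta = 1.
PARAMS = dict(kL=2.0, kT=2.0, gL=1.0, gT=1.0, thL=1.0, thT=1.0, thA=1.0, thI=1.0)


def s_plus(x, theta):
    """Step function s+: 1 above the threshold, 0 otherwise (0 at equality by assumption)."""
    return 1.0 if x > theta else 0.0


def s_minus(x, theta):
    return 1.0 - s_plus(x, theta)


def toggle_rhs(L, T, A=0.0, I=0.0, p=PARAMS):
    """Right-hand side of Eqs. 16-17."""
    dL = p["kL"] * (1.0 - s_plus(T, p["thT"]) * s_minus(A, p["thA"])) - p["gL"] * L
    dT = p["kT"] * (1.0 - s_plus(L, p["thL"]) * s_minus(I, p["thI"])) - p["gT"] * T
    return dL, dT


def simulate(L0, T0, A=0.0, I=0.0, t_end=20.0, dt=0.001, p=PARAMS):
    """Forward Euler integration; returns the final point (L, T)."""
    L, T = L0, T0
    for _ in range(int(round(t_end / dt))):
        dL, dT = toggle_rhs(L, T, A, I, p)
        L, T = L + dt * dL, T + dt * dT
    return L, T


if __name__ == "__main__":
    print(simulate(1.5, 0.2))  # starts with LacI high, TetR low
    print(simulate(0.2, 1.5))  # starts with LacI low, TetR high
```

With these values, a trajectory starting with L above and T below threshold stays in that region and approaches (κ_L/γ_L, 0), while the mirrored initial condition approaches (0, κ_T/γ_T).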
4.2 The Oscillator with Positive Feedback
In developing the PLDE model of the oscillator with positive feedback (Fig. 1b), like in the Boolean model, we will not distinguish between the phosphorylated and non-phosphorylated forms of NRI, but rather build upon the fact that in the strain considered, phosphorylation of NRI is constitutive. Contrary to the Boolean model, however, we introduce only a single variable for the NRI concentration (N), in addition to a variable for the LacI concentration (L) and the input IPTG (I). N has two different threshold concentrations, a first threshold for activation of the promoter driving NRI expression and a second threshold for the promoter driving LacI expression. These two thresholds will be referred to as θ_N^1 and θ_N^2, respectively. The limitation to two state variables makes it possible to display the dynamics of the model in the phase plane, which will be convenient for illustrative purposes.
This results in the following PLDE model of the network:
\dot{L} = \kappa_L \, s^+(N, \theta_N^2) - \gamma_L L, \qquad (18)
\dot{N} = \kappa_N \, (1 - s^+(L, \theta_L) \, s^-(I, \theta_I)) \, s^+(N, \theta_N^1) - \gamma_N N. \qquad (19)
The regulatory logic embedded in the equation for N agrees with the details of the molecular implementation of the regulatory circuit, where for the gene to be expressed, NRI needs to be present and LacI to be absent or inactive. Moreover, the choice of promoters in the circuit guarantees that θ_N^1 < θ_N^2 [24].
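Equations 18–19 can be explored numerically in the same spirit. The sketch below uses hypothetical parameter values that respect θ_N^1 < θ_N^2 and place κ_L/γ_L and κ_N/γ_N above the thresholds; forward Euler integration and the value 0 of the step function at the threshold are assumptions made for simplicity. The helper qualitative_state returns the pair s^+(L, θ_L), s^+(N, θ_N^1) + s^+(N, θ_N^2) that labels the region containing a point.

```python
# Hypothetical parameter values (not from the chapter), with thN1 < thN2.
P = dict(kL=2.0, kN=3.0, gL=1.0, gN=1.0, thL=1.0, thN1=1.0, thN2=2.0, thI=1.0)


def s_plus(x, theta):
    """Step function s+: 1 above the threshold, 0 otherwise (0 at equality by assumption)."""
    return 1.0 if x > theta else 0.0


def s_minus(x, theta):
    return 1.0 - s_plus(x, theta)


def rhs(L, N, I=0.0, p=P):
    """Right-hand side of Eqs. 18-19."""
    dL = p["kL"] * s_plus(N, p["thN2"]) - p["gL"] * L
    dN = (p["kN"] * (1.0 - s_plus(L, p["thL"]) * s_minus(I, p["thI"]))
          * s_plus(N, p["thN1"]) - p["gN"] * N)
    return dL, dN


def simulate(L0, N0, I=0.0, t_end=20.0, dt=0.001, p=P):
    """Forward Euler integration; returns the final point (L, N)."""
    L, N = L0, N0
    for _ in range(int(round(t_end / dt))):
        dL, dN = rhs(L, N, I, p)
        L, N = L + dt * dL, N + dt * dN
    return L, N


def qualitative_state(L, N, p=P):
    """Discrete label of the region containing (L, N)."""
    return (int(s_plus(L, p["thL"])),
            int(s_plus(N, p["thN1"]) + s_plus(N, p["thN2"])))


if __name__ == "__main__":
    print(simulate(0.5, 0.5))           # N starts below its first threshold
    print(qualitative_state(1.5, 2.5))
```

Starting with N below its first threshold switches off both production terms, so both concentrations decay toward the origin; initial conditions with N above θ_N^1 can be tried in the same way to look at the behavior there.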
Figure 6a shows the phase plane analysis of the oscillator model, under the assumption that IPTG is absent (I = 0) and that κ_L/γ_L > θ_L and κ_N/γ_N > θ_N^2. Notice that any other choice of the parameter inequalities would be inconsistent with the implementation of the regulatory circuit, as it would imply that even when the proteins were expressed, the concentrations of NRI and LacI would never rise to a level where they can influence the expression of their target genes. Interestingly, the analysis shows that the system has the potential to generate oscillations in the regions where N > θ_N^1. Below this threshold, however, the system falls back to the trivial stable steady state (0, 0)′. The oscillations and the steady state show
[Figure graphic not recovered. Panel a: phase plane with L on the horizontal axis and N on the vertical axis, thresholds θ_L, θ_N^1, and θ_N^2, focal values κ_L/γ_L and κ_N/γ_N, and regions S1 to S6. Panel b: state transition graph with states S1 (00), S2 (10), S3 (01), S4 (11), S5 (02), and S6 (12). Panels c and d: refined regions Z1, Z2, Z3, and Z5 between the points (0, 0) and (θ_L, 0).]
Fig. 6 PLDE model of oscillator with positive feedback in the absence of input (I = 0). (a) Phase plane analysis. Some example solutions are shown (solid curves). (b) State transition graph. The names of the states correspond to the names of the regions and the states have been labeled with the values of s^+(L, θ_L) and s^+(N, θ_N^1) + s^+(N, θ_N^2). (c) Refined phase plane analysis of the lower-left portion of the phase plane with example solutions. (d) State transition graph corresponding to the analysis in c
these subtle aspects of the dynamics are absent from the Boolean model of Subheading 3.2, both models agree in predicting oscillations and a stable steady state.
4.3 The IRMA Circuit
Whereas the example networks in the previous two sections are small and can be analyzed in the phase plane, this is not the case for the IRMA network. The model has five genes, ASH1, CBF1, GAL4, GAL80, and SWI5, and one input, galactose. The PLDE model previously developed for this network [76] has five state variables, one for each protein concentration (A, C, G4, G80 and S), and one input variable, representing the galactose concentration (G):
\dot{A} = \kappa_A^0 + \kappa_A \, s^+(S, \theta_S^a) - \gamma_A A, \qquad (20)
5.1 Analysis of Attractors and Their Stability
Attractors in a state transition graph are (minimal) sets of states which do not have any outgoing transitions, that is, transitions from a state inside to a state outside the attractor. Usually, two
different types of attractors are distinguished: point attractors,
consisting of a single state, and cyclic attractors, consisting of a set
of states forming one or several cycles. The interest of attractors for
the study of network dynamics is that, starting from an initial state
in the graph, the system reaches an attractor after a finite number of
transitions and then indefinitely remains there. For this reason,
attractors have been associated with end-points of developmental
trajectories in higher organisms [46, 47] or possible responses of
microorganisms to a challenge from their environment [69]. In
synthetic biology, attractors may correspond to different functional
states and thus form an objective of circuit design [6]. Although
new measurement techniques have made it possible to follow the
transient dynamics of networks, for instance, by using fluorescent
reporter proteins, in many cases attractors remain the only reliably
observable states of the system.
Given a state transition graph, the identification of attractors is
straightforward. Point attractors can be found by inspecting all
individual states and cyclic attractors by looking for strongly
connected components (SCCs) of the graph. An SCC is a set of
states which are mutually connected, that is, there exists a directed
pathway from each state to any other in the SCC. An SCC is also a
maximal set, in the sense that it contains every state mutually
connected to any other state in the SCC. An SCC may have incom-
ing edges and outgoing transitions, and for it to be an attractor, it
needs to be a terminal SCC, that is, have no outgoing transitions.
Since the size of the state transition graphs grows exponentially
with the number of variables (genes), however, this enumeration
approach may not be feasible in many situations of practical inter-
est. Several approaches for identifying attractors that do not require
the prior generation of the state transition graph have been devel-
oped. These approaches are based, for example, on the solution of a
constraint satisfaction problem [77], a satisfiability problem
[78, 79], a problem formulated in the answer set programming
framework [80], or a temporal logic query [81].
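For a small graph, the enumeration approach described above fits in a few lines. The sketch below uses the asynchronous state transition graph of the oscillator with positive feedback for I = 0 (states ordered as L, N1, N2) and a deliberately naive SCC computation based on mutual reachability; an algorithm such as Tarjan's would be the natural choice for larger graphs. Function names are illustrative.

```python
# Asynchronous state transition graph of the oscillator (I = 0), states as "L N1 N2".
STG = {
    "000": [],
    "001": ["101", "011", "000"],
    "010": ["000", "011"],
    "011": ["111"],
    "100": ["000"],
    "101": ["111", "100"],
    "110": ["010", "100"],
    "111": ["110"],
}


def reachable(graph, start):
    """All states reachable from start by a directed path (start included)."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in graph[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def strongly_connected_components(graph):
    """Naive SCC computation: two states share an SCC if each reaches the other."""
    reach = {v: reachable(graph, v) for v in graph}
    components, assigned = [], set()
    for v in graph:
        if v not in assigned:
            comp = frozenset(w for w in reach[v] if v in reach[w])
            components.append(comp)
            assigned |= comp
    return components


def attractors(graph):
    """Terminal SCCs: no transition leaves the component."""
    return [c for c in strongly_connected_components(graph)
            if all(t in c for v in c for t in graph[v])]


if __name__ == "__main__":
    print(strongly_connected_components(STG))
    print(attractors(STG))
```

In this graph the four states 111, 110, 010, and 011 form a single SCC, but it is not terminal because transitions lead out of it, so it is not an attractor.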
5.2 Reduction of State Transition Graphs
For high-dimensional systems, state transition graphs are typically handled through a square matrix of size 2^n, which is the number of states in the graph for a model with n Boolean variables. Numerical operations on state transition graphs are thus limited by the memory capacities of current computers, which cannot deal efficiently in real time with networks of n > 25 (approximately). Methods that enable the analysis of large networks are thus critical, for example, when studying the interactions of a synthetic circuit with a host network.
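The growth of the matrix with n can be made concrete with a short calculation. The sketch assumes a dense 2^n × 2^n adjacency matrix stored with one bit per entry, which is a simplifying assumption for illustration; sparse or symbolic encodings need far less memory.

```python
def n_states(n):
    """Number of states of a model with n Boolean variables."""
    return 2 ** n


def dense_matrix_bytes(n, bits_per_entry=1):
    """Memory for a dense 2^n x 2^n matrix (assumed layout: bits_per_entry per entry)."""
    bits = n_states(n) ** 2 * bits_per_entry
    return (bits + 7) // 8


if __name__ == "__main__":
    for n in (10, 20, 25):
        print(n, n_states(n), dense_matrix_bytes(n))
```

For n = 25 this gives 2^25 = 33,554,432 states and 2^47 bytes (128 TiB) for the dense bit matrix.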
5.3 Formal Verification of Network Properties Using Model Checking
Besides the detection and reachability of attractors, other dynamical properties may be of interest for network analysis and design. For example, in order to validate a model, it is important to know if there exist paths in the graph in which the predicted qualitative ordering of events, that is, the temporal sequence of changes in gene activity or protein concentrations, is consistent with experimental observations.
Fig. 7 Verifying reachability properties of the oscillator with positive feedback using model checking. The property AG EF Zero, with Zero equal to (L = 0 ∧ N = 0), is tested for the PLDE model of the oscillator in the absence of IPTG (Fig. 6). GNA and NuSMV show the property to be false and a counterexample is shown in the form of oscillations in the concentrations of LacI and NRI
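To give a feel for what a property such as AG EF Zero means on a state transition graph, here is a hand-rolled explicit-state check. It is only a sketch and does not reproduce the GNA and NuSMV workflow. It is applied to the asynchronous Boolean graph of the oscillator for I = 0 (not to the PLDE model of Fig. 7), with Zero taken to be the state 000 and a self-loop added on the fixed point so that every state has a successor.

```python
# Asynchronous Boolean graph of the oscillator (I = 0), states as "L N1 N2".
# The self-loop on 000 makes the transition relation total, as CTL semantics requires.
STG = {
    "000": ["000"],
    "001": ["101", "011", "000"],
    "010": ["000", "011"],
    "011": ["111"],
    "100": ["000"],
    "101": ["111", "100"],
    "110": ["010", "100"],
    "111": ["110"],
}


def ef(graph, targets):
    """States satisfying EF targets: some path reaches a target state."""
    sat = set(targets)
    changed = True
    while changed:
        changed = False
        for state, successors in graph.items():
            if state not in sat and any(s in sat for s in successors):
                sat.add(state)
                changed = True
    return sat


def reachable(graph, start):
    seen, stack = {start}, [start]
    while stack:
        for nxt in graph[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def ag_ef(graph, targets):
    """States satisfying AG EF targets: every reachable state satisfies EF targets."""
    good = ef(graph, targets)
    return {s for s in graph if reachable(graph, s) <= good}


if __name__ == "__main__":
    print(sorted(ag_ef(STG, {"000"})))
```

On this Boolean graph every state satisfies the property, because the origin can be reached from every state, including those on the cycle. This contrasts with the result reported in Fig. 7 for the PLDE model, in line with the difference between the two models discussed in Subheading 3.2.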
The IRMA model has been analyzed using the above formal
verification tools [76]. The objective of the study was to verify that
the network structure and the observed data are compatible by
(1) expressing the measured RT-qPCR expression patterns of the
genes as temporal logic formulae and (2) testing if there are com-
binations of parameter inequalities for which the model predictions
are compatible with the observations. Surprisingly, among the
almost 5000 possible combinations of parameter inequalities, only
a handful turned out to be consistent with the data. The ordering of
the different activation thresholds of Swi5 inferred from the data
was corroborated by independent measurements of the promoter
activities. This and other examples from the literature [94, 95]
illustrate the interest of using temporal logic and model checking
for supporting the analysis and design of synthetic circuits.
5.4 Modular Analysis of Network Dynamics
Networks in synthetic biology are often constructed by coupling small networks, or modules, through known interactions so as to obtain new dynamical behaviors [96]. To take advantage of this modular approach, a recent method [37, 97] proposes to analyze a Boolean network as the interconnection of two or more smaller modules. In particular, this method calculates the attractors of the full network from the attractors of the modules, thus avoiding the calculation of the full state transition graph.
To illustrate this interconnection method, we will study a
hypothesized synthetic biology coupling between the toggle switch
(module Σ A) and the IRMA circuit (module Σ B).
Input/Output Characterization of the Modules’ Attractors: The
first step is to characterize the asymptotic input/output behavior
(or the attractors) of each module and identify the variable(s) of
each module which will influence some variable(s) in the other
module, in other words, identify the outputs and inputs for each
module. The full network will be obtained by interconnecting the
output of each module to the input of the other. For simplicity, we
assume that each module has a single input and a single output
where the inputs are as given above, with u denoting the aTc
concentration for the toggle switch (but fixing I ¼ 0) and G for
the IRMA circuit. For the outputs we will consider LacI for the
toggle switch and Gal4 for IRMA (see Fig. 8). Next, the attractors
of each module are computed for each input and they are classified
in terms of their output values, so that Auα denotes an attractor of module Σ A subject to input u and whose output is α (both u and α are Boolean values):
Attractors of Σ A: A01 = {10}, A00 = {01}, A11 = {10},
where 10 and 01 are the two attractors of the toggle switch when the input is 0; for each of these, the output L takes the values 1 and 0, respectively. In the case of input u = 1, the toggle switch has only
[Figure graphic not recovered. Surviving labels of the interconnection diagram: A, C, G4, G80, Sa, Sg, Sc, L, T, with signals u and v. Surviving nodes of the asymptotic graph: A01 × B11, A11 × B11, A00 × B11, A00 × B10, A11 × B00, A01 × B00, A01 × B10, A11 × B10, A00 × B00.]
Fig. 9 Asymptotic graph for the interconnection between the toggle switch and the IRMA circuit. Bold arrows denote a cyclic attractor of the interconnected system. There are two other point attractors, A00 × B00 and A01 × B10
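The classification of the attractors of module Σ A by input and output can be reproduced with a short enumeration. The Boolean rules used here, L+ = ¬T ∨ u and T+ = ¬L ∨ I, are an assumption inferred from the verbal description of the toggle switch regulation in Subheading 4.1, and the sketch only looks for fixed points, which is adequate for this module but not for networks with cyclic attractors. Function names are illustrative.

```python
from itertools import product


def toggle_rules(state, u, i=0):
    """Assumed toggle switch rules: L+ = (not T) or u, T+ = (not L) or i."""
    l, t = state
    return (int((not t) or u), int((not l) or i))


def point_attractors(u, i=0):
    """Fixed points of the toggle switch for a given aTc input u."""
    return [s for s in product((0, 1), repeat=2) if toggle_rules(s, u, i) == s]


def classify(i=0):
    """Group attractors by (input u, output L)."""
    table = {}
    for u in (0, 1):
        for state in point_attractors(u, i):
            output = state[0]  # the module output is L
            label = "".join(str(v) for v in state)
            table.setdefault((u, output), []).append(label)
    return table


if __name__ == "__main__":
    for (u, alpha), states in sorted(classify().items()):
        print("input", u, "output", alpha, "attractors", states)
```

Under these assumed rules the enumeration returns the same three classes as listed in the text: state 01 for input 0 and output 0, and state 10 for output 1 under both inputs.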
the response of the circuit to different inputs, and also allow for a
better regulation and control of the system, for which some tech-
niques will be discussed in the next section.
6.1 Control Strategies
There are different ways to answer this question, but a first distinction can be made between open-loop and closed-loop control. In open-loop control, the function U(t) is determined independently of the dynamics of the system (31). As an example, consider the toggle switch and suppose the target state is the one where both LacI and TetR are strongly expressed, which is not a steady state of the system without inputs. With the help of either the Boolean or the PLDE models, we know that the following input will effectively drive the toggle switch to the target state: U(t) = (A(t), I(t)) = (A_high, I_high), i.e., both inputs should be at a constant but sufficiently high concentration.
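The open-loop example can be checked on the Boolean model. The rules below, L+ = ¬T ∨ A and T+ = ¬L ∨ I, are an assumption inferred from the description of the regulation in Subheading 4.1, and synchronous updating is used for brevity; function names are illustrative.

```python
from itertools import product


def toggle_step(state, a, i):
    """One synchronous update with assumed rules L+ = (not T) or A, T+ = (not L) or I."""
    l, t = state
    return (int((not t) or a), int((not l) or i))


def run_open_loop(state, inputs):
    """Apply a predetermined input sequence [(A, I), ...], ignoring the state."""
    for a, i in inputs:
        state = toggle_step(state, a, i)
    return state


if __name__ == "__main__":
    constant_high = [(1, 1)] * 3
    for s0 in product((0, 1), repeat=2):
        print(s0, "->", run_open_loop(s0, constant_high))
```

With both inputs held high, every initial state is sent to 11 and stays there, whereas 11 is not a fixed point when both inputs are 0.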
An attractive open-loop method for practical use is known as
“bang-bang” control. The idea is to use only two constant values
for the input, Ihigh and Ilow, and apply them sequentially, by intervals
of appropriate length. This strategy tends to accelerate convergence
to the target state. This is useful when only a limited number of
input values are available, as is often the case in synthetic biology
experiments.
In contrast, a closed-loop control strategy takes into account
the evolution of the system and uses the current state to “correct”
the input. If all variables are observable, the control function is
6.2 Control for Boolean Models
In Boolean models, a control system still has the form of Eq. 31 where u takes values in a discrete set U ⊆ {0, 1}^p. A control function at time t corresponds to a discrete sequence of input values, U[t] = [u_1, ..., u_t]. To construct a Boolean control function, there are several approaches that take advantage of the discrete nature of the system and are interpreted as a protocol for interventions. The idea is to successively avoid pathways that lead away from the target state.
In [102], two types of control actions are introduced, deletion of a node or deletion of an edge in the regulatory network. The first action corresponds to setting that node at a constant value, while deletion of edge x_i → x_j is encoded in the logical rules by:
f_j(x, u_{i,j}) = f_j(x_1, \ldots, \neg u_{i,j} \wedge x_i, \ldots, x_n), \qquad (32)
where u_{i,j} = 0 implies no control is exerted and u_{i,j} = 1 implies that x_i no longer influences x_j. In its general form, an input u_{i,j} is added for every edge in the network.
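The edge-deletion encoding of Eq. 32 amounts to masking one argument of a Boolean rule. The sketch below wraps an arbitrary rule f_j in this way; the wrapper name with_edge_control and the example rule L+ = ¬T are hypothetical and serve only as an illustration.

```python
def with_edge_control(f_j, i):
    """Wrap rule f_j so that u_ij = 1 removes the influence of x_i (Eq. 32).

    f_j takes a tuple x = (x_1, ..., x_n); i is the 0-based index of x_i.
    """
    def controlled(x, u_ij):
        masked = list(x)
        masked[i] = int((not u_ij) and x[i])  # (not u_ij) and x_i
        return f_j(tuple(masked))
    return controlled


def f_L(x):
    """Hypothetical example rule with x = (L, T): L+ = not T."""
    return int(not x[1])


controlled_f_L = with_edge_control(f_L, 1)  # control on the edge T -> L

if __name__ == "__main__":
    print(controlled_f_L((0, 1), 0))  # no control: T represses L
    print(controlled_f_L((0, 1), 1))  # edge deleted: T no longer represses L
```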
For probabilistic Boolean networks, algorithms were developed
that solve the problems of optimal finite-horizon [38] or infinite-
horizon [103] control. The goal is to drive the system from an
initial state z0 to a desired target state zM in a finite (or infinite)
number of steps while minimizing the cost associated with each
state transition. Finite- (or infinite-) horizon corresponds to the
case of a fixed (or very large) time window available for application
of a given treatment. For the infinite-horizon problem [103], the
cost is of the form
J_\Pi(z_0) = \lim_{M \to \infty} \frac{1}{M} \, E\left\{ \sum_{t=0}^{M-1} \tilde{g}(z_t, \mu_t, w_t) \right\},
6.3 Control of Synthetic Circuits
In synthetic biology the main control question is often related to the robustness of a circuit with respect to perturbations in the environment, maintaining homeostasis [104, 105], or the reliability and the predictability of circuit functioning [16, 17].
Applications of closed-loop control techniques to synthetic
biology circuits may involve a computer interface within the exper-
imental setup [17]. In this in silico approach, real-time measure-
ments are sent to the computer, where a calibrated mathematical
model of the circuit is used for online simulation of the PI control-
ler, which returns the updated input value. This was the methodol-
ogy used in [19] to control the toggle switch. The first objective
was to drive the system to the unstable steady state corresponding
to both LacI and TetR at their threshold concentrations. To do
this, the authors applied a PI controller through a computer inter-
face, computing aTc and IPTG separately, and succeeded in main-
taining the system near the unstable steady state. A second
experiment consisted of forcing the toggle switch with periodic
control, in an open-loop configuration. Independent pulses of
aTc and IPTG were applied to the synthetic circuit with different
periods. The toggle switch responded with periodic oscillations,
but only for carefully chosen periods of forcing.
In the experiments [19], both inputs were used to control the
system, and they were independently computed. However, a recent
theoretical result shows that a single input (in this case aTc) suffices
to control the toggle switch to the unstable steady state, x∗ = (θ_L, θ_T) [106]. The novelty is a feedback control law which is piecewise-constant in regions of the state space: U(L(t)) = u_min < 1 if L(t) < θ_L, that is, LacI is weakly expressed and the control law decreases the influence of TetR on LacI; conversely, U(L(t)) = u_max > 1 if L(t) > θ_L. A similar approach on control of PLDE with
affine controls is discussed in [107], with the goal of either gen-
erating sustained oscillations in a system where they do not occur
naturally or, conversely, suppressing oscillations by damping, with
applications to a bacterial model.
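The piecewise-constant law can be written down directly. The sketch below only encodes the two branches stated in the text; the value returned exactly at the threshold is not specified in the source and is an assumption here, as is the function name.

```cpp
// Piecewise-constant feedback law U(L) as described in the text:
// u_min (< 1) below the LacI threshold theta_L, u_max (> 1) above it.
// The value at L == theta_L is an assumption (the text does not define it);
// 1.0 is used here to mean "no modulation".
double control_law(double L, double theta_L, double u_min, double u_max) {
    if (L < theta_L) return u_min;  // LacI weakly expressed
    if (L > theta_L) return u_max;  // LacI above its threshold
    return 1.0;
}
```

How the value U enters the toggle switch equations is specified in [106] and is not reproduced here.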
Implementation of feedback control laws in a cellular environ-
ment remains one of the challenges in synthetic biology, even
though in silico techniques using PI controllers and optogenetic
devices (where gene transcription is controlled by light signals) are
increasingly used [16, 17].
Two main directions can be identified in current synthetic
biology approaches [16]: first, the design and implementation of
new circuits with biological components help to understand the
fundamental mechanisms guiding and regulating cellular behavior;
second, the design of controllers for natural regulatory and
Qualitative Modeling of Synthetic Circuits 33
7 Concluding Remarks
8 Notes
8.2 Dynamic Properties of Positive and Negative Feedback Loops
One of the first studies spelling out at length the interest of positive
and negative feedback loops for the functioning of regulatory networks
is the book on logical modeling by René Thomas [23]. Here, it was
conjectured that positive feedback loops are a prerequisite for
multistability, that is, the co-occurrence of multiple stable steady
states (point attractors). On the other hand, negative feedback
loops were hypothesized to be necessary for stable oscillations.
Later work has confirmed both conjectures, for positive as well as
negative feedback loops [121–123]. Notice that the criteria have been
proposed for deterministic ODE models, and that the existence of
feedback loops provides necessary but not sufficient conditions. For
example, in Fig. 5, if we choose $\kappa_L/\gamma_L < \theta_L$ and
$\kappa_T/\gamma_T < \theta_T$, then the toggle switch has only a single
stable steady state. Corresponding proofs of the conjectures in the
discrete, logical context have also been developed [124, 125].
8.3 Updating Schedules for Boolean Models
Boolean variables are defined in continuous time, but their state is
allowed to change only at a discrete set of time instants. An updating
schedule essentially determines the order in which the variables
change their state. It may be deterministic, where the same
order is applied at each iteration [30], or non-deterministic, where
the order is given by a random or stochastic process [126].
A deterministic updating schedule $s$ may be defined as a function
$s : \{1, \ldots, n\} \to \{1, \ldots, m\}$, where $m \le n$,
$s(i) < s(j)$ means that variable $i$ is updated before variable $j$, and
$s(i) = s(j)$ indicates that variables $i$ and $j$ are updated
simultaneously. The case $m = 1$ denotes the synchronous updating
schedule and the case $m = n$ denotes an asynchronous sequential
schedule. In the case of random schedules, both $m = m_t$ and
$s(i) = s_t(i)$ depend on the current iteration time $t$. If $x_i[t]$
denotes the state of variable $i$ at time $t$, then the state at the
next iteration, $x_i[t+1]$, is given by:
$$x_i[t+1] = f_i\left(x_1[t + \Delta t_{1i}], \ldots, x_n[t + \Delta t_{ni}]\right), \qquad (33)$$
where $\Delta t_{ji} = 0$ if $s_t(i) \le s_t(j)$ and $\Delta t_{ji} = 1$
if $s_t(i) > s_t(j)$. In general, each realization of an updating
schedule leads to a different trajectory. The dynamic properties of
various updating schedules have been studied in the literature
(see [28, 30, 126] for some examples).
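As an illustration of Eq. 33, the sketch below performs one iteration of a Boolean network under a deterministic schedule s. Variables with the same value of s are updated together, and each block sees the already updated values of all earlier blocks. The type aliases and the function name are hypothetical, and the schedule is assumed to be non-empty and to take values in {1, ..., m}.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

using State = std::vector<bool>;
using Rule = std::function<bool(const State&)>;

// One iteration x[t] -> x[t+1] under a deterministic updating schedule s.
// f[i] is the Boolean rule of variable i; s[i] in {1, ..., m} is its block.
State update_once(const State& x, const std::vector<Rule>& f,
                  const std::vector<int>& s) {
    State current = x;
    const int m = *std::max_element(s.begin(), s.end());
    for (int block = 1; block <= m; ++block) {
        State next = current;
        for (std::size_t i = 0; i < x.size(); ++i) {
            // Variable i reads new values of every j with s(j) < s(i)
            // and old values of every j with s(j) >= s(i).
            if (s[i] == block) next[i] = f[i](current);
        }
        current = next;
    }
    return current;
}
```

For a two-variable mutual inhibition network (each variable is the negation of the other) started at (0, 0), the synchronous schedule s = (1, 1) gives (1, 1), whereas the sequential schedule s = (1, 2) gives (1, 0), which shows how the schedule alone can change the trajectory.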
8.4 Discontinuities in Piecewise-Linear Differential Equation Models
The use of step functions results in PLDE models with favorable
mathematical properties, except at the thresholds where discontinuities
occur. As explained in Subheading 4, these discontinuities
arise from the fact that when a protein concentration crosses a
threshold, it may change the rate at which some genes are
expressed, and thus switch the local vector field in a region. While
the dynamics at the thresholds are often ignored, this is potentially
dangerous as it may cause steady states and other important dynamical
properties of the system to be missed. In order to deal with
the discontinuities in a mathematically rigorous manner, the PL
differential equations have been generalized to differential inclusions
[56]. Several different extensions have been proposed, such as
Filippov extensions [56, 57], Aizerman–Pyatnitskii extensions
[51, 60], and hyper-rectangular overapproximations of the former
[55]. The latter overapproximations can be computed with qualitative
information only, i.e., the parameter inequalities mentioned
in Subheading 4, and have been implemented in the tool GNA
[73]. For relatively mild conditions on the types of regulatory
functions, the three extensions are equivalent in practice
[51]. Other approaches for dealing with discontinuities in
piecewise-linear models have been proposed [58, 59], but are less
amenable to the automated qualitative analysis of higher-dimensional
networks.
36 Madalena Chaves and Hidde de Jong
Acknowledgements
References
exhibiting toggle switch or oscillatory behavior in Escherichia coli. Cell 113(5):597–608
25. Cantone I, Marucci L, Iorio F, Ricci M, Belcastro V, Bansal M, Santini S, di Bernardo M, di Bernardo D, Cosma M (2009) A yeast synthetic network for in vivo assessment of reverse-engineering and modeling approaches. Cell 137:172–181
26. Shmulevich I, Dougherty E, Kim S, Zhang W (2002) Probabilistic Boolean networks: a rule-based uncertainty model for gene regulatory networks. Bioinformatics 18(2):261–274
27. Mori T, Flöttmann M, Krantz M, Akutsu T, Klipp E (2015) Stochastic simulation of Boolean rxncon models: towards quantitative analysis of large signaling networks. BMC Syst Biol 9(45):1–9
28. Chaves M, Albert R, Sontag E (2005) Robustness and fragility of Boolean models for genetic regulatory networks. J Theor Biol 235(3):431–449
29. Gonzalez A, Naldi A, Sánchez L, Thieffry D, Chaouiya C (2006) GINsim: a software suite for the qualitative modelling, simulation and analysis of regulatory networks. BioSystems 84(2):91–100
30. Aracena J, Goles E, Moreira A, Salinas L (2009) On the robustness of update schedules in Boolean networks. BioSystems 97(1):1–8
31. Naldi A, Rémy E, Thieffry D, Chaouiya C (2011) Dynamically consistent reduction of logical regulatory graphs. Theor Comput Sci 412(21):2207–2218
32. Zañudo J, Albert R (2013) An effective network reduction approach to find the dynamical repertoire of discrete dynamic networks. Chaos 23(2):025111
33. Irons D (2006) Improving the efficiency of attractor cycle identification in Boolean networks. Physica D 217:7–21
34. Akutsu T, Melkman A, Tamura T, Yamamoto M (2011) Determining a singleton attractor of a Boolean network with nested canalyzing functions. J Comput Biol 18(10):1275–1290
35. Veliz-Cuba A, Aguilar B, Hinkelmann F, Laubenbacher R (2014) Steady state analysis of Boolean molecular network models via model reduction and computational algebra. BMC Bioinform 15:221
36. Lorenz T, Siebert H, Bockmayr A (2013) Analysis and characterization of asynchronous state transition graphs using extremal states. Bull Math Biol 75(6):920–938
37. Tournier L, Chaves M (2013) Interconnection of asynchronous Boolean networks, asymptotic and transient dynamics. Automatica 49(4):884–893
38. Datta A, Choudhary A, Bittner ML, Dougherty ER (2003) External control in Markovian genetic regulatory networks. Mach Learn 52(1–2):169–181
39. Laschov D, Margaliot M (2012) Controllability of Boolean control networks via the Perron-Frobenius theory. Automatica 48(6):1218–1223
40. Yang JM, Lee CK, Cho KH (2018) Global stabilization of Boolean networks to control the heterogeneity of cellular responses. Front Physiol 9:774
41. Li F, Long T, Lu Y, Ouyang Q, Tang C (2004) The yeast cell-cycle network is robustly designed. Proc Natl Acad Sci USA 101(14):4781–4786
42. Fauré A, Naldi A, Chaouiya C, Thieffry D (2006) Dynamical analysis of a generic boolean model for the control of the mammalian cell cycle. Bioinformatics 22(14):124–131
43. Ortiz-Gutiérrez E, García-Cruz K, Azpeitia E, Castillo A, Sánchez M, Alvarez-Buylla E (2015) A dynamic gene regulatory network model that recovers the cyclic behavior of Arabidopsis thaliana cell cycle. PLoS Comput Biol 11(9):e1004486
44. Calzone L, Tournier L, Fourquet S, Thieffry D, Zhivotovsky B, Barillot E, Zinovyev A (2010) Mathematical modelling of cell-fate decision in response to death receptor engagement. PLoS Comput Biol 6(3):e1000702
45. Zhang R, Shah M, Yang J, Nyland S, Liu X, Yun J, Albert R, Loughran TP Jr (2008) Network model of survival signaling in large granular lymphocyte leukemia. Proc Natl Acad Sci USA 105(42):16308–16313
46. Sánchez L, Thieffry D (2001) A logical analysis of the Drosophila gap-gene system. J Theor Biol 211:115–141
47. Albert R, Othmer HG (2003) The topology of the regulatory interactions predicts the expression pattern of the Drosophila segment polarity genes. J Theor Biol 223:1–18
48. Barberis M, Helikar T (eds) (2019) Logical modeling of cellular processes: from software development to network dynamics. Lausanne: Frontiers Media
49. Chaouiya C (2007) Petri net modelling of biological networks. Brief Bioinform 8(4):210–219
50. Heiner M, Koch I (2004) Petri net based model validation in systems biology. In: Cortadella J, Reisig W (eds) Applications and
theory of Petri nets 2004. Springer, Berlin, pp 216–237
51. Acary V, de Jong H, Brogliato B (2014) Numerical simulation of piecewise-linear models of gene regulatory networks using complementarity systems. Physica D 269:103–119
52. van Ham P (1979) How to deal with variables with more than two levels. In: Thomas R (ed) Kinetic logic: a Boolean approach to the analysis of complex regulatory systems. Lecture notes in biomathematics, vol 29. Springer, Berlin, pp 326–343
53. Shannon P, Markiel A, Ozier O, Baliga N, Wang J, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13(11):2498–2504
54. Mestl T, Plahte E, Omholt S (1995) A mathematical framework for describing and analysing gene regulatory networks. J Theor Biol 176(2):291–300
55. de Jong H, Gouzé JL, Hernandez C, Page M, Sari T, Geiselmann J (2004) Qualitative simulation of genetic regulatory networks using piecewise-linear models. Bull Math Biol 66(2):301–340
56. Gouzé JL, Sari T (2002) A class of piecewise linear differential equations arising in biological models. Dynam Syst 17(4):299–316
57. Casey R, de Jong H, Gouzé JL (2006) Piecewise-linear models of genetic regulatory networks: equilibria and their stability. J Math Biol 52(1):27–56
58. Ironi L, Panzeri L, Plahte E, Simoncini V (2011) Dynamics of actively regulated gene networks. Physica D 240(8):779–794
59. Plahte E, Kjøglum S (2005) Analysis and generic properties of gene regulatory networks with graded response functions. Physica D 201(1):150–176
60. Machina A, Edwards R, van den Driessche P (2013) Singular dynamics in gene network models. SIAM J Appl Math 12(1):95–125
61. Glass L (1975) Classification of biological networks by their qualitative dynamics. J Theor Biol 54(1):85–107
62. Glass L, Pasternack J (1978) Prediction of limit cycles in mathematical models of biological oscillations. Bull Math Biol 40(3):27–44
63. Edwards R (2000) Analysis of continuous-time switching networks. Physica D 146(1–4):165–199
64. Farcot E (2006) Geometric properties of a class of piecewise affine biological network models. J Math Biol 52(3):373–418
65. Batt G, de Jong H, Page M, Geiselmann J (2008) Symbolic reachability analysis of genetic regulatory networks using discrete abstractions. Automatica 44(4):982–989
66. Thomas R, Thieffry D, Kaufman M (1995) Dynamical behaviour of biological regulatory networks: I. Biological role of feedback loops and practical use of the concept of the loop-characteristic state. Bull Math Biol 57(2):247–276
67. Edwards R, Siegelmann H, Aziza K, Glass L (2001) Symbolic dynamics and computation in model gene networks. Chaos 11(1):160–169
68. Mestl T, Lemay C, Glass L (1996) Chaos in high-dimensional neural and gene networks. Physica D 98(1):33–52
69. de Jong H, Geiselmann J, Batt G, Hernandez C, Page M (2004) Qualitative simulation of the initiation of sporulation in B. subtilis. Bull Math Biol 66(2):261–299
70. Monteiro P, Dias P, Ropers D, Oliveira A, Sá-Correia I, Teixeira M, Freitas A (2011) Qualitative modelling and formal verification of the FLR1 gene mancozeb response in Saccharomyces cerevisiae. IET Syst Biol 5(5):308–316
71. Sepulchre JA, Reverchon S, Nasser W (2007) Modeling the onset of virulence in a pectinolytic bacterium. J Theor Biol 44(2):239–257
72. de Jong H, Geiselmann J, Hernandez C, Page M (2003) Genetic network analyzer: qualitative simulation of genetic regulatory networks. Bioinformatics 19(3):336–344
73. Batt G, Besson B, Ciron P, de Jong H, Dumas E, Geiselmann J, Monte R, Monteiro P, Page M, Rechenmann F, Ropers D (2012) Genetic network analyzer: a tool for the qualitative modeling and simulation of bacterial regulatory networks. Methods Mol Biol 804:439–462
74. Huttinga Z, Cummins B, Gedeon T, Mischaikow K (2018) Global dynamics for switching systems and their extensions by linear differential equations. Physica D 367:19–37
75. Ghosh R, Tomlin C (2004) Symbolic reachable set computation of piecewise affine hybrid automata and its application to biological modelling: Delta-Notch protein signalling. Syst Biol 1(1):170–183
76. Batt G, Page M, Cantone I, Goessler G, Monteiro P, de Jong H (2010) Efficient parameter search for qualitative models of
regulatory networks using symbolic model checking. Bioinformatics 26(18):i603–i610
77. Devloo V, Hansen P, Labbé M (2003) Identification of all steady states in large networks by logical analysis. Bull Math Biol 65:1025–1051
78. de Jong H, Page M (2008) Search for steady states of piecewise-linear differential equation models of genetic regulatory networks. IEEE/ACM Trans Comput Biol Bioinform 5(2):208–222
79. Dubrova E, Teslenko M (2011) A SAT-based algorithm for finding attractors in synchronous Boolean networks. IEEE/ACM Trans Comput Biol Bioinform 8(5):1393–1399
80. Abdallah EB, Folschette M, Roux O, Magnin M (2017) ASP-based method for the enumeration of attractors in non-deterministic synchronous and asynchronous multi-valued networks. Algorithms Mol Biol 12:20
81. Klarner H, Siebert H (2015) Approximating attractors of Boolean networks by iterative CTL model checking. Front Bioeng Biotechnol 3:130
82. Chaouiya C, Naldi A, Thieffry D (2012) Logical modelling of gene regulatory networks with GINsim. Methods Mol Biol 804:463–479
83. Cormen T, Leiserson C, Rivest R, Stein C (2001) Introduction to algorithms. MIT Press and McGraw-Hill, Cambridge
84. Paulevé L (2018) Reduction of qualitative models of biological networks for transient dynamics analysis. IEEE/ACM Trans Comput Biol Bioinformatics 15(4):1167–1179
85. Cummins B, Gedeon T, Harker S, Mischaikow K (2018) DSGRN: examining the dynamics of families of logical models. Front Physiol 9:549
86. Veliz-Cuba A (2011) Reduction of Boolean network models. J Theor Biol 289:167–172
87. Clarke E, Grumberg O, Peled D (1999) Model checking. MIT Press, Boston
88. Carrillo M, Góngora P, Rosenblueth D (2012) An overview of existing modeling tools making use of model checking in the analysis of biochemical networks. Front Plant Sci 3:155
89. Bartocci E, Liò P (2016) Computational modeling, formal analysis, and tools for systems biology. PLoS Comput Biol 12(1):e1004591
90. Bernot G, Comet JP, Richard A, Guespin J (2004) Application of formal methods to biological regulatory networks: extending Thomas’ asynchronous logical approach with temporal logic. J Theor Biol 229(3):339–347
91. Calzone L, Fages F, Soliman S (2006) BIOCHAM: an environment for modeling biological systems and formalizing experimental knowledge. Bioinformatics 22(14):1805–1807
92. Kwiatkowska M, Norman G, Parker D (2011) PRISM 4.0: Verification of probabilistic real-time systems. In: Gopalakrishnan G, Qadeer S (eds) Proceedings of 23rd international conference computer aided verification (CAV’11). Lecture notes in computer science, vol 6806. Springer, Berlin, pp 585–591
93. Monteiro P, Dumas E, Besson B, Mateescu R, Page M, Freitas A, de Jong H (2009) A service-oriented architecture for integrating the modeling and formal verification of genetic regulatory networks. BMC Bioinform 10:450
94. Batt G, Belta C, Weiss R (2008) Temporal logic analysis of gene networks under parameter uncertainty. IEEE Trans Autom Control 53:215–229
95. Courbet A, Amar P, Fages F, Renard E, Molina F (2018) Computer-aided biochemical programming of synthetic microreactors as diagnostic devices. Mol Syst Biol 14(6):e7845
96. Perez-Carrasco R, Barnes C, Schaerli Y, Isalan M, Briscoe J, Page K (2018) Combining a toggle switch and a repressilator within the AC-DC circuit generates distinct dynamical behaviors. Cell Syst 6(4):521–530
97. Chaves M, Tournier L (2018) Analysis tools for interconnected Boolean networks with biological applications. Front Physiol 9:586
98. Chaves M, Carta A (2015) Attractor computation using interconnected Boolean networks: testing growth rate models in E. coli. Theor Comput Sci 599:47–63
99. Bourdon J, Eveillard D, Siegel A (2011) Integrating quantitative knowledge into a qualitative gene regulatory network. PLOS Comput Biol 7(9):1–11
100. Chaves M, Farcot E, Gouzé JL (2013) Probabilistic approach for predicting periodic orbits in piecewise affine differential models. Bull Math Biol 75(6):967–987
101. Stoll G, Viara E, Barillot E, Calzone L (2012) Continuous time Boolean modeling for biological signaling: application of Gillespie algorithm. BMC Syst Biol 6(1):116
102. Murrugarra D, Veliz-Cuba A, Aguilar B, Laubenbacher R (2016) Identification of control targets in Boolean molecular network models via computational algebra. BMC Syst Biol 10:94
103. Pal R, Datta A, Dougherty ER (2006) Optimal infinite-horizon control for probabilistic
Boolean networks. IEEE Trans Signal Process 54(6):2375–2387
104. Miller M, Hafner M, Sontag E, Davidsohn N, Subramanian S, Purnick P, Lauffenburger D, Weiss R (2016) Modular design of artificial tissue homeostasis: robust control through synthetic cellular heterogeneity. PLoS Comput Biol 8:e1002579
105. Aoki S, Lillacci G, Gupta A, Baumschlager A, Schweingruber D, Khammash M (2019) A universal biomolecular integral feedback controller for robust perfect adaptation. Nature 570(7762):533–537
106. Chambon L, Gouzé JL (2019) A new qualitative control strategy for the genetic toggle switch. IFAC-PapersOnLine 52(1):532–537
107. Edwards R, Kim S, van den Driessche P (2011) Control design for sustained oscillation in a two-gene regulatory network. J Math Biol 62(4):453–478
108. Liu D, Mannan A, Han Y, Oyarzún D, Zhang F (2018) Dynamic metabolic control: towards precision engineering of metabolism. J Ind Microbiol Biotechnol 45(7):535–543
109. Wittmann D, Krumsiek J, Saez-Rodriguez J, Lauffenburger D, Klamt S, Theis F (2009) Transforming Boolean models to continuous models: methodology and application to T-cell receptor signaling. BMC Syst Biol 3:98
110. Chaouiya C, Bérenguier D, Keating S, Naldi A, Van Iersel M, Rodriguez N, Dräger A, Büchel F, Cokelaer T, Kowal B, Wicks B, Gonçalves E, Dorier J, Page M, Monteiro P, Von Kamp A, Xenarios I, de Jong H, Hucka M, Klamt S, Thieffry D, Le Novère N, Saez-Rodriguez J, Helikar T (2013) SBML qualitative models: a model representation format and infrastructure to foster interactions between qualitative modelling formalisms and tools. BMC Syst Biol 7(1):135
111. de Jong H (2002) Modeling and simulation of genetic regulatory systems: a literature review. J Comput Biol 9(1):67–103
112. Fisher J, Henzinger T (2007) Executable cell biology. Nat Biotechnol 25(11):1239–1250
113. Karlebach G, Shamir R (2008) Modelling and analysis of gene regulatory networks. Nat Rev Mol Cell Biol 9(10):770–780
114. Le Novère N (2015) Quantitative and logic modelling of molecular and gene networks. Nat Rev Genet 16(3):146–158
115. de Jong H, Ropers D (2006) Strategies for dealing with incomplete information in the modeling of molecular interaction networks. Brief Bioinform 7(4):354–63
116. Bornholdt S (2008) Boolean network models of cellular regulation: prospects and limitations. J R Soc Interface 5(Suppl 1):S85–S94
117. Wang RS, Saadatpour A, Albert R (2012) Boolean modeling in systems biology: an overview of methodology and applications. Phys Biol 9(5):055001
118. Abou-Jaoudé W, Traynard P, Monteiro P, Saez-Rodriguez J, Helikar T, Thieffry D, Chaouiya C (2016) Logical modeling and dynamical analysis of cellular networks. Front Genet 7:94
119. Glass L, Edwards R (2018) Hybrid models of genetic networks: mathematical challenges and biological relevance. J Theor Biol 458:111–118
120. Li X, Omotere O, Qian L, Dougherty E (2017) Review of stochastic hybrid systems with applications in biological systems modeling and analysis. EURASIP J Bioinform Syst Biol 2017(1):8
121. Gouzé JL (1998) Positive and negative circuits in dynamical systems. J Biol Syst 6(1):11–15
122. Soulé C (2003) Graphic requirements for multistationarity. ComPlexUs 1(3):123–133
123. Snoussi E (1998) Necessary conditions for multistationarity and stable periodicity. J Biol Syst 6(1):3–9
124. Remy E, Ruet P, Thieffry D (2008) Graphic requirement for multistability and attractive cycles in a Boolean dynamical framework. Adv Appl Math 41(3):335–350
125. Richard A, Comet JP (2007) Necessary conditions for multistationarity in discrete dynamical systems. Discr Appl Math 155(18):2403–2413
126. Deng X, Geng H, Matache M (2006) Dynamics of asynchronous random Boolean networks with asynchrony generated by stochastic processes. BioSystems 88(1–2):16–34
Chapter 2
Abstract
The Chemical Langevin Equation approach allows simple stochastic simulation of gene circuits under many
practical situations where the number of molecules of the species involved is not extremely low. Here, we
describe methods and a computational framework to simulate a population of cells containing gene circuits
of interest. These methods account for both intrinsic and extrinsic noise sources, and allow us to have both
individual cell-related species and population-related ones. The protocol covers the proper
description of the system and the setup of the software tools. It also addresses the optimization of data
storage and the trade-off between simulation precision and computational time. Finally, it gives practical tests to
assess the validity of the underlying technical assumptions.
Key words Synthetic biology, Gene circuits, Stochastic Modeling, Chemical Langevin equation
1 Introduction
Filippo Menolascina (ed.), Synthetic Gene Circuits: Methods and Protocols, Methods in Molecular Biology, vol. 2229,
https://doi.org/10.1007/978-1-0716-1032-9_2, © Springer Science+Business Media, LLC, part of Springer Nature 2021
42 Jesús Picó et al.
2 Materials
Fig. 1 QS/Fb circuit. The gene circuit aims to regulate the mean expression of a
protein of interest while minimizing the noise strength. To this end, it relies on
the combination of a cell-to-cell communication based on quorum sensing
(QS) via exchange of a diffusible molecule, and intracellular negative feedback
(Fb). The Fb subsystem regulates the expression of the protein of interest inside
each cell, minimizing its noise strength. The QS subsystem induces consensus
among the cells thus achieving homogeneous expression across the population
of cells
2.1 Getting the 1. Define a vector containing the number of molecules of the
Model in Proper Form biochemical species for the population of cells. The dynam-
ics of the circuit will later be expressed using this vector con-
taining the number of molecules of the relevant biochemical
species as the model state variables (see Note 1). Set the num-
ber N of cells to be simulated. This protocol assumes N is
constant throughout the simulation. This is consistent with exper-
imental conditions carried out under continuous operation in
turbidostats and microfluidic devices. Refer to Note 2 on how
to obtain an estimation of the population size N so as to get
statistically correct results taking into account the computational
cost. Refer to Note 3 to relate the population size and the optical
density. Consider all N cells have the same set of relevant intracel-
lular biochemical species. Refer to Note 4 on how to deal with
heterogeneous cells. For a system with c common intracellular
species for all N cells and e extracellular species, define the column
vector $n = [n^1, \ldots, n^N, n_{c+1}, \ldots, n_{c+e}]^T$ containing all vectors
$n^i = [n_1, \ldots, n_c]^i$ with the number of molecules for the
c intracellular species in the i-th cell, and the variables $n_{c+1}, \ldots,$
Stochastic Differential Equations for Practical Simulation of Gene Circuits 45
Example 2 For the QS/Fb circuit, consider the set of reactions among the species depicted
below. This set includes some pseudo-reactions: the first reaction with the lumped
functional propensity $f_1(n_3^i)$ resulting from a previous model-order reduction (see
Notes 6 and 7), and the 9th reaction accounting for the diffusion process (see Note 7).
The corresponding vector of propensities is shown alongside. For each cell, the last
propensity $D V_c n_5$ depends on the extracellular species $n_5$ ($A_e$) but it is included as an
intracellular reaction as it affects the dynamics of the intracellular species $n_4^i$ (A). On the
contrary, the propensity function $d_{A_e} n_5$ only affects directly the dynamics of the
extracellular species $n_5$. Therefore, it is included as an extracellular reaction in the vector
of propensities. Refer to Example 12 for the software code implementation.
$$
\begin{aligned}
(R \cdot A)_2^i &\xrightarrow{f_1(n_3^i)} PI^i + (R \cdot A)_2^i \\
PI^i &\xrightarrow{d_I} \emptyset \\
&\xrightarrow{C_{LR}} R^i \\
R^i + A^i &\underset{k_1^-}{\overset{k_1^+}{\rightleftharpoons}} R \cdot A^i \\
R^i &\xrightarrow{d_R} \emptyset \\
R \cdot A^i + R \cdot A^i &\underset{k_2^-}{\overset{k_2^+}{\rightleftharpoons}} (R \cdot A)_2^i \\
(R \cdot A)_2^i &\xrightarrow{d_{RA2}} \emptyset \\
PI^i &\xrightarrow{k_A} PI^i + A^i \\
A^i &\underset{D V_c}{\overset{D}{\rightleftharpoons}} A_e \\
A^i &\xrightarrow{d_A} \emptyset \\
A_e &\xrightarrow{d_{A_e}} \emptyset
\end{aligned}
$$

$$
a(n) = \begin{bmatrix} a(n)^1 \\ a(n)^2 \\ \vdots \\ a(n)^N \\ d_{A_e} n_5 \end{bmatrix},
\qquad
a(n)^i = \begin{bmatrix}
f_1(n_3^i) \\ d_I n_1^i \\ C_{LR} \\ k_1^- n_6^i \\ k_1^+ n_2^i n_4^i \\ d_R n_2^i \\
k_2^+ (n_6^i)^2 \\ k_2^- n_3^i \\ d_{RA2} n_3^i \\ k_A n_1^i \\ d_A n_4^i \\ D n_4^i \\ D V_c n_5
\end{bmatrix}
$$
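As a plain C++ illustration of the per-cell propensity vector of Example 2 (this is not the chapter's OpenFPM implementation, for which see Example 12), the sketch below evaluates the 13 intracellular propensities of one cell. The struct and field names are hypothetical labels for the symbols in the text, and the lumped function f1 is passed in by the caller because its expression is not given here.

```cpp
#include <array>
#include <functional>

// State of one cell, using the species indices of Example 2:
// n1 = PI, n2 = R, n3 = (R.A)2, n4 = A, n6 = R.A (n5 = Ae is extracellular).
struct CellState { double n1, n2, n3, n4, n6; };

// Rate constants; field names are hypothetical labels for the symbols in the text.
struct Rates {
    double dI, CLR, k1m, k1p, dR, k2p, k2m, dRA2, kA, dA, D, Vc;
};

// Intracellular propensity vector a(n)^i of one cell.
// n5 is the extracellular species Ae; f1 is the lumped propensity of the
// first pseudo-reaction, supplied by the caller.
std::array<double, 13> cell_propensities(const CellState& c, double n5, const Rates& p,
                                         const std::function<double(double)>& f1) {
    return {{
        f1(c.n3),             // (R.A)2 -> PI + (R.A)2
        p.dI * c.n1,          // PI degradation
        p.CLR,                // constitutive production of R
        p.k1m * c.n6,         // R.A -> R + A
        p.k1p * c.n2 * c.n4,  // R + A -> R.A
        p.dR * c.n2,          // R degradation
        p.k2p * c.n6 * c.n6,  // R.A + R.A -> (R.A)2
        p.k2m * c.n3,         // (R.A)2 -> R.A + R.A
        p.dRA2 * c.n3,        // (R.A)2 degradation
        p.kA * c.n1,          // PI -> PI + A
        p.dA * c.n4,          // A degradation
        p.D * c.n4,           // diffusion out of the cell
        p.D * p.Vc * n5       // diffusion into the cell
    }};
}
```

The extracellular propensity d_Ae n5 is left out on purpose, since it is appended once for the whole population rather than once per cell.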
$$
S = \begin{bmatrix} I_N \otimes S_{cc} & 0_{cN \times r_e} \\ 1_{1 \times N} \otimes S_{ec} & S_{ee} \end{bmatrix} \qquad (2)
$$
where
– $S_{cc}$ is a $c \times r_c$ matrix formed by the stoichiometric coefficients
for the c intracellular non-algebraic species accounting
only for the $r_c$ intracellular reactions
– $0_{cN \times r_e}$ is a $cN \times r_e$ null matrix
– $S_{ec}$ is an $e \times r_c$ matrix formed by the stoichiometric coefficients
for the e extracellular non-algebraic species accounting
for the interactions with the intracellular ones via the $r_c$
intracellular reactions affecting them
– $S_{ee}$ is an $e \times r_e$ matrix formed by the stoichiometric coefficients
for the e extracellular non-algebraic species accounting
only for the $r_e$ extracellular reactions
– $I_N$ is the $N \times N$ identity matrix
– $1_{1 \times N}$ is a $1 \times N$ row vector of ones
and $\otimes$ is the Kronecker product (see Note 9).
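The block structure of Eq. 2 can be assembled without forming the Kronecker products explicitly. The following is a minimal sketch in plain C++ under the assumptions that the stoichiometric coefficients are integers and that c, e, r_c and r_e are all at least 1; the Matrix alias and the function name are hypothetical.

```cpp
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<int>>;

// Assemble S = [ I_N (x) Scc , 0 ; 1_{1xN} (x) Sec , See ] as in Eq. 2.
// Scc is c x rc, Sec is e x rc, See is e x re;
// the result has (c*N + e) rows and (N*rc + re) columns.
Matrix assemble_stoichiometry(const Matrix& Scc, const Matrix& Sec,
                              const Matrix& See, std::size_t N) {
    const std::size_t c = Scc.size(), rc = Scc.at(0).size();
    const std::size_t e = See.size(), re = See.at(0).size();
    Matrix S(c * N + e, std::vector<int>(N * rc + re, 0));
    for (std::size_t cell = 0; cell < N; ++cell) {
        for (std::size_t j = 0; j < rc; ++j) {
            // Diagonal block: intracellular species of this cell.
            for (std::size_t i = 0; i < c; ++i)
                S[cell * c + i][cell * rc + j] = Scc[i][j];
            // Bottom-left block: effect of this cell's reactions on the
            // extracellular species (Sec repeated once per cell).
            for (std::size_t i = 0; i < e; ++i)
                S[c * N + i][cell * rc + j] = Sec[i][j];
        }
    }
    // Bottom-right block: purely extracellular reactions.
    for (std::size_t i = 0; i < e; ++i)
        for (std::size_t j = 0; j < re; ++j)
            S[c * N + i][N * rc + j] = See[i][j];
    return S;
}
```

The top-right block stays at zero because extracellular reactions do not act directly on intracellular species.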
$n_3^N, n_4^N)^T$. Refer to Example 16 for the software code
implementation.
2.2 Accounting for Noise and Computing Long-Term Statistics
1. Define the intrinsic noise matrix. The diffusion term in the
Euler–Maruyama discrete formulation of the CLE given by
Eq. 5 accounts for the intrinsic noise (see Note 1). For a system
with N cells, $r_c$ intracellular reactions, and $r_e$ extracellular ones,
define the $(N r_c + r_e) \times (N r_c + r_e)$ matrix $N_t$ as a diagonal
matrix with $N r_c + r_e$ continuous independent normal random
variables with zero mean and unit variance
($N_{ii}(\mu, \sigma^2) = \mathcal{N}(0, 1)$). Refer to Example 13 for the software
code implementation, which also shows how to skip some reactions so
they are not affected by intrinsic noise.
2. Define the extrinsic noise characteristics. Time-invariant
dynamics are assumed. That is, the system parameters may
take random values around a nominal one, but they remain
constant in time. The time-variant case, not covered here,
requires setting a stochastic differential equation for the temporal
evolution of each parameter for each cell in the population.
Consider extrinsic noise by randomizing the values of the
model parameters. Each model parameter θ has a nominal value
$\theta_n$. The value of the parameter assigned to the i-th cell is
$\theta_i = \theta_n (1 + CV_\theta \, \mathcal{N}(0, 1))$, where $\mathcal{N}(0, 1)$ is the standard normal
distribution, and $CV_\theta$ is a user-defined coefficient of variation
for the parameter θ. Refer to Example 10 for the software code
implementation of extrinsic noise.
3. Computing long-term first statistics for the time evolution
of the species of interest in the population of cells. To
compute the long-term moments of the species of interest,
such as mean (μ) and standard deviation (σ), and derived statistics
such as the noise strength $\eta^2$ (squared coefficient of variation,
$\eta^2 = \sigma^2/\mu^2$) for a population of N cells, follow the steps:
(a) For a population of N cells, run a simulation (see Subheading 3)
of length T units of time, with discrete-time sampling
δt, ensuring T is large enough so that the steady state is
reached and maintained for $T_s$ units of time. Refer to Note
2 to estimate the appropriate value of N providing representative
statistics. Notice only one realization of the
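$$s^2_{n_j}(t_k) = \frac{1}{N} \sum_{i=1}^{N} \left( n_j^i(t_k) - m_{n_j}(t_k) \right)^2$$
$$\mu_{n_j} = \frac{1}{T_s} \sum_{k=0}^{f} m_{n_j}(t_k)$$
$$\sigma^2_{n_j} = \frac{1}{T_s} \sum_{k=0}^{f} s^2_{n_j}(t_k) + \frac{1}{T_s} \sum_{k=0}^{f} \left( m_{n_j}(t_k) - \mu_{n_j} \right)^2$$
$$\eta^2_{n_j} = \frac{\sigma^2_{n_j}}{\mu^2_{n_j}}$$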
2.3 Software
1. Select a software platform. To implement and run the simulation
algorithm of the CLE-based model, efficient software platforms
alleviate the computational cost. Here we consider the C++
version of the scalable Open Framework for Particle and
Particle-Mesh codes (OpenFPM, available at http://openfpm.mpi-cbg.de/).
OpenFPM allows efficient parallel computation
using full parallel particle and mesh algorithms [18].
2. OpenFPM server installation. To install OpenFPM in Linux
or OSX, clone the repository and use the following lines to
install it in the default location (refer to Note 11 on how to
install other possible OpenFPM configurations and for
troubleshooting):
git
2 cd openfpm_pdata_2.0.0
3 ./install

1 source ~/openfpm_vars

example.mk langevin.mk

1 include langevin.mk
2 CC = mpic++
3 LDIR =
4 OBJ = main.o
5 %.o: %.cpp
7 langevin: $(OBJ)
9 all: langevin
11 clean:
12 rm -f *.o *~ core langevin
3 Methods
3.1 Define the OpenFPM Client Program main.cpp
The dynamic CLE model as defined in Materials 2.1.5 and the
computation of the long-term statistics of the species of interest are
implemented in an OpenFPM client program called main.cpp. It
contains three functions: main() with the main algorithmic steps
to execute the simulation and generate the selected output,
input_data() to read the parameters of the model from a file,
and evolve_time() to compute the system states (that is, the
number of molecules of the species of interest) at each simulation
time step δt. The last two functions are called from main(). Their
main features are given next. For further details refer to the complete
code available at https://github.com/sc2cl/population_cle.
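As a rough, library-free illustration of what a time-stepping function of this kind computes, the sketch below performs one Euler–Maruyama step of a chemical Langevin equation in its standard form. It is assumed, not confirmed by this section, that the chapter's Eq. 5 has this form; the function name and argument layout are hypothetical and do not mirror evolve_time().

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// One Euler-Maruyama step of the chemical Langevin equation (standard form):
//   n <- n + S a dt + S diag(sqrt(a)) xi sqrt(dt)
// S is (species x reactions), a holds the non-negative propensities
// evaluated at n, and xi holds one standard normal draw per reaction.
std::vector<double> cle_step(const std::vector<double>& n,
                             const std::vector<std::vector<int>>& S,
                             const std::vector<double>& a,
                             const std::vector<double>& xi,
                             double dt) {
    std::vector<double> next = n;
    for (std::size_t s = 0; s < n.size(); ++s)
        for (std::size_t r = 0; r < a.size(); ++r)
            next[s] += S[s][r] * (a[r] * dt + std::sqrt(a[r] * dt) * xi[r]);
    return next;
}
```

Passing xi[r] = 0 for a given reaction removes its intrinsic noise contribution while keeping its deterministic drift, which is one simple way to exclude selected reactions from the noise term.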
1. Function main( ) The pseudo-code with the contents of the
main( ) function is given below:
6 Output selection :
12 Initialize variables :
14 Distributed domain
16 Initialize time
step :
24 end loop
379 Box<2,double> domain({0.0,0.0},{1.0,1.0});
Line 394 defines the size of the grid, i.e. the number of
cells, as N = a × b.
distriduted one )
Line 412 creates the distributed grid g1. The first argument
sets the grid as a two-dimensional one. The second
one specifies double precision. The variable
aggregate<> is a vector containing as many double
precision variables as states in the cell and an array of
doubles with size equal to the number of model parameters
of the cell. For the QS/Fb example, there are five
doubles, one for each of the five intracellular species (see
Example 1) and one double array of size 23 (double[23])
for the parameters of each cell. To make access to
these variables more user friendly, the following lines are
defined at the beginning of the code:
56 Jesús Picó et al.
14 const size_t I = 0;
15 const size_t R = 1;
17 const size_t A = 3;
18 const size_t RA = 4;
19
getDecomposition () ,sz , g ) ;
differential equations
465 {
466
2.
468
469 if ( j %2 == 0)
470 {
472 }
473 else
474 {
476 }
the other way round in the odd time steps (read from g2
and write into g1). The other arguments of this function
are: the time steps (T and sT), the vectors of the extracel-
lular species (x5), the engine of the random number
generator (engine), the normal distribution generator
(NDist), the statistics vector (stats_vec), the number
of cells (Ncells), the step counter (j), the degradation
of the extracellular species (dAee), and the initial condition
of the extracellular species (val4).
612 t += T ;
613 }
if required
7 Increment i
Example 10
269 // Iterate
271 {
subdomain
282 {
NDist ( en ) + 1) * parameters [ l ];
284
287
288 }
species
calculation
variance calculation
processors
63
64 // I parameters
[0];
Example 12
94 // Propensities
95 // I
key ) ) ;
99 double X12 = c1 * c2 *( kdLux + alphaI * g_dist_read .
118 // Noises
138 x4_noise_difu = 0;
stoicheometry included
174 double c6 = 2* k_2 * kd1 * g_dist_write . template get < RA2 >(
180 g_dist_write . template get < RA >( key ) = c10 *( sqrt ( std
Line 10 adds the contribution of the i-th cell to the value of the
extracellular species and Line 11 adds the contribution of the i-
th cell to the variable accounting for the mean number of
molecules of the intracellular species (see Example 17).
cell .
184
present cell .
Line 186 implements a partial calculation of the mean of x1 in the following way:
\[ x_1^{\mathrm{mean}} = \frac{1}{N} \sum_{k=1}^{N} x_1^k \]
where N is the total number of cells and $x_1^i$ is the number of molecules of the species x1
(PI) in the i-th cell. Recall that this code is actually executed on several processors at the same
time. For example, consider a hypothetical distribution of N cells over two
processors:
\[ P_1 = \{\mathrm{cell}_i : i = 1, 2, \ldots, M\}, \qquad P_2 = \{\mathrm{cell}_i : i = M+1, M+2, \ldots, N\}, \]
where $P_q$ is the q-th processor, and M < N. Then, the calculation of $x_1^{\mathrm{mean}}$ becomes
\[ x_1^{\mathrm{mean}} = \overbrace{\frac{1}{N} \sum_{k=1}^{M} x_1^k}^{P_1} + \overbrace{\frac{1}{N} \sum_{k=M+1}^{N} x_1^k}^{P_2} \]
When the iterator is in the j-th cell, with j ≤ M, the calculation is executed in
processor $P_1$, so the partial calculation of $x_1^{\mathrm{mean}}$ is
\[ x_1^{\mathrm{mean}}(P_1, \mathrm{cell}_j) = \frac{1}{N} \sum_{k=1}^{j} x_1^k = \frac{1}{N} \sum_{k=1}^{j-1} x_1^k + \frac{1}{N} x_1^j = x_1^{\mathrm{mean}}(P_1, \mathrm{cell}_{j-1}) + \frac{1}{N} x_1^j \]
In contrast, when the iterator is in the j-th cell, with M < j ≤ N, the calculation
is executed in processor $P_2$, and the partial calculation of $x_1^{\mathrm{mean}}$ is
\[ x_1^{\mathrm{mean}}(P_2, \mathrm{cell}_j) = \frac{1}{N} \sum_{k=M+1}^{j} x_1^k = \frac{1}{N} \sum_{k=M+1}^{j-1} x_1^k + \frac{1}{N} x_1^j = x_1^{\mathrm{mean}}(P_2, \mathrm{cell}_{j-1}) + \frac{1}{N} x_1^j \]
After the iterator covers the whole grid of N cells, each processor has finished its partial
calculation of $x_1^{\mathrm{mean}}$:
\[ x_1^{\mathrm{mean}}(P_1) = \frac{1}{N} \sum_{k=1}^{M} x_1^k, \qquad x_1^{\mathrm{mean}}(P_2) = \frac{1}{N} \sum_{k=M+1}^{N} x_1^k \]
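The per-processor accumulation of the mean can be illustrated with a minimal Python sketch. It is not the OpenFPM code: the function name partial_mean, the ten molecule counts, and the split point M = 4 are hypothetical, and the final combination of the two partial results is represented by a plain addition, since the step that merges them across processors is not shown in this excerpt.

```python
def partial_mean(values, n_total):
    """Accumulate x/N over the cells owned by one processor."""
    acc = 0.0
    for x in values:
        acc += x / n_total          # same running update as in the text
    return acc

# Hypothetical molecule counts of x1 in N = 10 cells
cells = [float(v) for v in range(1, 11)]
N, M = len(cells), 4                # processor P1 owns cells 1..M, P2 owns the rest

p1 = partial_mean(cells[:M], N)     # partial result of P1
p2 = partial_mean(cells[M:], N)     # partial result of P2
global_mean = p1 + p2               # combining both partial results gives the mean

print(p1, p2, global_mean)
```

Each partial result is already divided by the total N, not by the number of local cells, which is why the two values can simply be added.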
value of x5
x5_noise ) ;
240
x5_sto + tot_A_partial ;
242
3.2 Compilation

Next, compile the OpenFPM client program main.cpp with the
command make. The result of the compilation is the executable
program langevin. For a successful compilation, both the
langevin.mk and Makefile files must be in the working
directory, with the compiler configuration mentioned in Materials
2.3.
where -np 4 sets four processor cores to run the parallel simulation,
./langevin is the name of the executable program, and param.dat
is the input file. The input file param.dat is a CSV text file with the
nominal values of the parameters, in order and separated by commas.
The first three numeric arguments correspond to the number
of cells to be simulated (240), the user-defined coefficient of variance
for the extrinsic noise (0.1), and the initial number of molecules
of the extracellular species (0). The remaining four
arguments configure:
– Simulation with intrinsic noise (1) or deterministic simulation
(0).
– Long-term population histograms (1) or not (0). See Example
20, right part of the plot.
– Population statistics (mean and variance) at every time step
(1) or not (0). See Example 21.
– Temporal response of all cells at each time step (1) or not (0). See
Example 20.
An execution of langevin returns by default the long-term
population statistics in the output file output.dat (see Example
22). Additionally, the file param.dat can be generated, and the
corresponding execution of the langevin program launched, in different
ways. See Note 15 for a parametric sweep performed in
MATLAB®. See Note 16 for a Python script to start an execution.
[Figure for Example 20: temporal response (left panels, x-axis: Time [min]) and long-term histograms (right panels, x-axis: Frequency) of the PI, R, (R.A)2, and A molecules.]
Example 21 Population statistics at each time step, comparing stochastic and deterministic
results of the QS/Fb CLE model, are depicted below. This is a single realization computed
over 800 min for the four intracellular species, considering a population of N = 240 cells.
The stochastic (solid line) and deterministic (dashed line) results are two independent simulations
under the same initial conditions. The average numbers of molecules of each species
obtained in both simulations closely match.
[Figure for Example 21: number of PoI/LuxI, LuxR, (LuxR.AHL)2, and AHL molecules versus Time (min), from 0 to 800 min.]
Example 22 Long-term population statistics for the QS/Fb system output (n1
(PI)) using different sets of model parameters are illustrated below. Quorum sensing
(orange dots) in the QS/Fb system reduces the PoI/LuxI noise strength. The no
quorum sensing effect (purple dots) inhibits the diffusion of AHL molecules and
increases the PoI/LuxI noise strength.
4 Notes
\[ n(t + \delta t) = n(t) + S\, a(n)\, \delta t + S\, \mathcal{N} \sqrt{a(n)}\, \sqrt{\delta t} \qquad (7) \]
where $\mathcal{N} \equiv \mathcal{N}(0,1)_{J \times J}$ is a diagonal matrix containing J statistically
independent normal random variables, and δt is the discretization
time step.
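The update rule of Eq. 7 can be sketched in a few lines of Python for a single cell. This is only an illustration under stated assumptions, not the OpenFPM implementation: the function name cle_step, the birth-death example, its rate values, and the clamping of negative counts to zero are all hypothetical choices made for this sketch.

```python
import math
import random

def cle_step(n, S, propensities, dt, rng):
    """One Euler-Maruyama step of the CLE, following Eq. 7.

    n: list of species counts; S: stoichiometry matrix (species x reactions);
    propensities: function mapping n to the list of J propensities a(n).
    """
    a = [max(v, 0.0) for v in propensities(n)]   # guard so that sqrt(a) is defined
    xi = [rng.gauss(0.0, 1.0) for _ in a]        # J independent N(0,1) samples
    new_n = []
    for i, row in enumerate(S):
        drift = sum(s * aj for s, aj in zip(row, a)) * dt
        noise = sum(s * x * math.sqrt(aj)
                    for s, x, aj in zip(row, xi, a)) * math.sqrt(dt)
        new_n.append(max(n[i] + drift + noise, 0.0))  # assumption: clamp at zero
    return new_n

# Hypothetical birth-death process: 0 -> X at rate k, X -> 0 at rate d*x
k, d = 50.0, 0.5
S = [[1.0, -1.0]]
props = lambda n: [k, d * n[0]]

rng = random.Random(1)
state, trace = [0.0], []
for _ in range(20000):                           # 200 time units with dt = 0.01
    state = cle_step(state, S, props, 0.01, rng)
    trace.append(state[0])

late = trace[10000:]                             # discard the transient
mean_late = sum(late) / len(late)
print(mean_late)                                 # fluctuates around k/d = 100
```

The same random sample is used for a given reaction in every species it affects, which is what multiplying the noise vector by S expresses.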
2. Selecting the size N of the population of cells. Recall that this
protocol assumes N is constant throughout the simulation. To
estimate the population size N that gives statistically
correct results at minimum computational cost, run a set of
simulations changing the size of the population of cells and the
culture volume while keeping the cell density constant (see Note 3),
and evaluate the effect of these changes on the statistical information
of interest (e.g., noise strength). Simulations at different OD
values can be used to assess the potential effect of cell density on the
system behavior (e.g., by affecting cell-to-cell communication mechanisms) (see
Example Note 1).
Example Note 1
The next figure shows the results obtained for the
QS/Fb circuit when comparing the noise strength of
protein n1 (PI) at the different OD600 values defined in
the table below. (A) Noise strength does not appreciably
change for OD ∈ [0.005, 5], obtained either by
changing the volume Vext and keeping the cell number
N = 240 (blue squares) or by changing both N and
Vext (green squares). (B) Noise strength for different
N and Vext, keeping OD600 = 0.3 constant.
N fixed
OD fixed
Example Note 2
Considering that $N = 8 \times 10^{5}$ is the number of
cells contained in 1 μL of bacterial culture when the
OD is 1 (Source: Agilent, E. coli Cell Culture Concentration
from OD600 Calculator) and that
$V_{ext} = 1 \times 10^{-3}$ μL is a typical culture volume in a
microfluidic device, we need N = 240 cells to simulate
a scenario corresponding to OD = 0.3.
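The arithmetic behind this estimate can be checked with a short Python sketch. The function name cells_for_od is hypothetical; the conversion factor and the volume are the values quoted in the note, and a linear relation between OD600 and cell density is assumed.

```python
CELLS_PER_UL_AT_OD1 = 8e5    # cells in 1 uL of culture at OD600 = 1 (from the note)

def cells_for_od(od600, v_ext_ul):
    """Number of cells in a volume v_ext_ul (in uL) at the given OD600."""
    return round(od600 * CELLS_PER_UL_AT_OD1 * v_ext_ul)

n_cells = cells_for_od(0.3, 1e-3)
print(n_cells)               # 240
```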
Example Note 3
The set of reactions below represents the conversion
of the substrate X2 into the product X4, catalyzed by the
enzyme X1:
\[ X_1 + X_2 \underset{k_1^-}{\overset{k_1^+}{\rightleftharpoons}} X_3, \qquad X_3 \xrightarrow{k_2} X_1 + X_4, \qquad X_4 \xrightarrow{d_4} \emptyset \]
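\[ \dot{x}_4 = \frac{k_2\, c\, x_2}{\frac{k_1^-}{k_1^+} + x_2} - d_4 x_4 \]
\[ X_2 \xrightarrow{f(x_2)} X_2 + X_4, \qquad X_4 \xrightarrow{d_4} \emptyset \]
\[ \dot{n}_4^i = D V_c n_5 - D n_4^i + \ldots \]
\[ \dot{n}_5 = -N D V_c n_5 + D \sum_{i=1}^{N} n_4^i + \ldots \]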
Example Note 5
In the QS/Fb case, there is one lumped propensity:
the Hill-like function f(n3) modeling the repressible pro-
moter PI/(R.A)2. Transcription and degradation of PI
can be described (see Example 2) using the equivalent set
of pseudo-reactions:
\[ (R \cdot A)_2^i \xrightarrow{f_1(n_3^i)} mPI^i + (R \cdot A)_2^i, \qquad mPI^i \xrightarrow{d_{mI}} \emptyset \]
klux
αCI
gPI · (R · A)2 −−→ gPI · (R · A)2 + mPI
dm
mPI −−→
I
∅
[Figure: (A) time courses simulated with the SSA (left) and the CLE (right), x-axis: Time [min]; (B, C) histograms, y-axis: Normalized Counts.]
Example Note 6
In the QS/Fb system, three realizations were performed
with the same set of parameters and conditions
for a population of N cells. The steady-state portion of
each species of interest was selected for every i-th cell.
The time average was obtained over this steady-state
time window, resulting in an averaged number of molecules
of each species per cell. The figure below depicts the
matrix scatter plot for the three realizations (using a
different color for each one). Notice that the distributions
for all four species are unimodal and well shaped. A
MANOVA analysis shows no statistically significant evidence to
reject the hypothesis that the three realizations have the
same mean and variance, with p-value = [0.5961,
0.6730] and Wilks' lambda λ = [0.9910, 0.9978]. In
addition, the Mahalanobis distance between the means
of each realization is close to zero (DM = [0.0132,
0.0320, 0.0363]). This analysis confirms that one realization
of the simulation of a population with
N interconnected cells over a long enough simulation time
provides representative long-term moments of the
population.
Since the MANOVA test assumes normality, the
Kruskal–Wallis test was performed for the three realizations
in each one of the species. The results are: A: [statistic,
p-value] = [0.610148, 0.737069]; I: [statistic, p-value]
= [0.427088, 0.807717]; R: [statistic, p-value]
= [2.22063, 0.309456]; and (R.A)2: [statistic, p-value]
= [0.344232, 0.841881]. Since all the p-values are
greater than or equal to 0.01, there is no statistically
significant difference between the medians of the species
git
cd openfpm_pdata_2.0.0
make install
the processor ID and the user seed with the mixing procedure
of a hash-based random number generator, Saru [2], as shown
in Example Note 7.
Example Note 7
The code that implements the distributed random
number generators for the QS/Fb CLE model reads like:
362
371 seed1 = 0x79dedea3 * (seed1 ^ (((signed int) seed1) >> 14));
372 seed2 = (seed1 + seed2) ^ (((signed int) seed1) >> 8);
374
Example Note 8
For the QS/Fb case, the figure below shows how the
required storage memory decreases when the number of
samples is reduced by the decimation process. Decimation
with D = 32 yields approximately a 95% reduction of
the memory space required to save the data, and keeps
the long-term statistics without significant changes
from the initial ones.
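The effect of decimation on storage and on long-term statistics can be illustrated with a minimal Python sketch. It assumes plain subsampling (keeping one sample out of every D) and a hypothetical steady-state trace of uncorrelated noise around 100 molecules; the decimation procedure actually used in the protocol may differ, and the function name decimate is an assumption.

```python
import random
import statistics

def decimate(samples, factor):
    """Keep every `factor`-th sample (plain subsampling, no filtering)."""
    return samples[::factor]

rng = random.Random(0)
# Hypothetical steady-state trace: 32,000 noisy samples around 100 molecules
trace = [100.0 + rng.gauss(0.0, 10.0) for _ in range(32000)]

kept = decimate(trace, 32)
reduction = 1.0 - len(kept) / len(trace)   # fraction of storage saved

print(len(kept), round(reduction, 3))
print(statistics.mean(trace), statistics.mean(kept))
print(statistics.stdev(trace), statistics.stdev(kept))
```

With D = 32, plain subsampling keeps 1 sample in 32, a saving of the same order as the approximately 95% quoted above, while the mean and standard deviation of the decimated trace stay close to those of the full trace.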
[Figure: two panels plotted against Iteration; recoverable axis label: Time Step (min).]
Example Note 9
For the QS/Fb case, a parametric sweep can be
carried out as follows.
97 D_v = [2 0];
104
, kA_v ) ) ;
106 %%
110 % LuxI
114
115 % LuxR
119
be read by the
123
124
125 % Write a file named param . dat , with the struct param_out
127
128
num2str ( TEMPOT ) ];
This way, Matlab has the location of the C++ lib in its
environment.
Example Note 10
For the QS/Fb case, the Python 3 code
in Simulate_PCLE.py reads:
76 Ncells = 240
77 ruido = 0.15
78 ahl_e_0 = 0/ Vc
state histogram
83
85 # Params list
87
92 wr . writerow ( param_out )
93
+ str ( TEMPOT )
97 print ( command )
98 os . system ( command )
Acknowledgement
References
1. Acar M, Mettetal JT, van Oudenaarden A (2008) Stochastic switching as a survival strategy in fluctuating environments. Nat Genet 40(4):471–475
2. Afshar Y, Schmid F, Pishevar A, Worley S (2013) Exploiting seeding of random number generators for efficient domain decomposition parallelization of dissipative particle dynamics. Comput Phys Commun 184(4):1119–1128
3. Andrews SS, Dinh T, Arkin AP (2009) Stochastic models of biological processes. Springer New York, New York, pp 8730–8749
4. Basak S, Chabakauri G (2010) Dynamic mean-variance asset allocation. Rev Financ Stud 23(8):2970–3016
5. Boada Y, Vignoni A, Picó J (2017) Engineered control of genetic variability reveals interplay among quorum sensing, feedback regulation, and biochemical noise. ACS Synth Biol 6(10):1903–1912
6. Boada Y, Vignoni A, Picó J (2019) Multiobjective identification of a feedback synthetic gene circuit. IEEE Trans Control Syst Technol 1–16
7. Cai L, Friedman N, Xie XS (2006) Stochastic protein expression in individual cells at the single molecule level. Nature 440(7082):358–362
8. Cao Y, Gillespie DT, Petzold LR (2005) The slow-scale stochastic simulation algorithm. J Chem Phys 122(1):014116
9. Chalancon G, Ravarani CN, Balaji S, Martinez-Arias A, Aravind L, Jothi R, Madan Babu M (2012) Interplay between gene expression noise and regulatory network architecture. Trends Genet 28(5):221–232
10. Chellaboina V, Bhat S, Haddad M, Bernstein D (2009) Modeling and analysis of mass-action kinetics. IEEE Control Syst 29(4):60–78
11. Eldar A, Elowitz MB (2010) Functional roles for noise in genetic circuits. Nature 467(7312):167–173
12. Elowitz MB, Levine AJ, Siggia ED, Swain PS (2002) Stochastic gene expression in a single cell. Science 297(5584):1183–1186
13. Gillespie DT (2000) The chemical Langevin equation. J Chem Phys 113:297–306
14. Gillespie DT (2007) Stochastic simulation of chemical kinetics. Annu Rev Phys Chem 58:35–55
15. Higham DJ (2001) An algorithmic introduction to numerical simulation of stochastic differential equations. SIAM Rev 43(3):525–546
16. Higham DJ (2008) Modeling and simulating chemical reactions. SIAM Rev 50(2):347–368
17. Hilfinger A, Paulsson J (2011) Separating intrinsic from extrinsic fluctuations in dynamic biological systems. Proc Natl Acad Sci 108(29):12167–12172
18. Incardona P, Leo A, Zaluzhnyi Y, Ramaswamy R, Sbalzarini IF (2019) OpenFPM: a scalable open framework for particle and particle-mesh codes on parallel computers. Comput Phys Commun 241:155–177
19. Jones DL, Brewster RC, Phillips R (2014) Promoter architecture dictates cell-to-cell variability in gene expression. Science 346(6216):1533–1536
20. Kazeev V, Khammash M, Nip M, Schwab C (2014) Direct solution of the chemical master equation using quantized tensor trains. PLoS Comput Biol 10(3):e1003359
21. Khalil HK (1996) Nonlinear systems, 3rd edn. Prentice-Hall, New Jersey
22. Kokotovic P, Khalil H, O'Reilly J (1986) Singular perturbation methods in control: analysis and design. Academic Press, Orlando
23. Kruskal WH, Wallis WA (1952) Use of ranks in one-criterion variance analysis. J Am Stat Assoc 47(260):583–621
24. Labhsetwar P, Cole JA, Roberts E, Price ND, Luthey-Schulten ZA (2013) Heterogeneity in protein expression induces metabolic variability in a modeled Escherichia coli population. Proc Natl Acad Sci USA 110(34):14006–14011
25. Mélykúti B, Hespanha JaP, Khammash M (2014) Equilibrium distributions of simple biochemical reaction systems for time-scale separation in stochastic reaction networks. J R Soc Interface 11(97):20140054
26. Munsky B, Khammash M (2006) The finite state projection algorithm for the solution of the chemical master equation. J Chem Phys 124(4):044104
27. Murray JD (1989) Mathematical biology. Springer, Berlin
28. Novick A, Weiner M (1957) Enzyme induction as an all-or-none phenomenon. Proc Natl Acad Sci USA 43(7):553
29. Ostrenko O, Incardona P, Ramaswamy R, Brusch L, Sbalzarini IF (2017) pssalib: the partial-propensity stochastic chemical network simulator. PLoS Comput Biol 13(12):e1005865
30. Raj A, van Oudenaarden A (2008) Nature, nurture, or chance: stochastic gene expression and its consequences. Cell 135(2):216–226
31. Rao CV, Arkin AP (2003) Stochastic chemical kinetics and the quasi-steady-state assumption: application to the Gillespie algorithm. J Chem Phys 118(11):4999–5010
32. Raser JM, O'Shea EK (2005) Noise in gene expression: origins, consequences, and control. Science 309(5743):2010–2013
33. Ruess J, Lygeros J (2015) Moment-based methods for parameter inference and experiment design for stochastic biochemical reaction networks. ACM Trans Model Comput Simul 25(2):8
34. Samoilov M, Plyasunov S, Arkin AP (2005) Stochastic amplification and signaling in enzymatic futile cycles through noise-induced bistability with oscillations. Proc Natl Acad Sci USA 102(7):2310–2315
35. Schnoerr D, Sanguinetti G, Grima R (2017) Approximation and inference methods for stochastic biochemical kinetics-a tutorial review. J Phys A: Math Theor 50(9):093001
36. Sutton S (2006) Measurement of cell concentration in suspension by optical density. Microbiology 585:210-8336
37. Swain PS, Elowitz MB, Siggia ED (2002) Intrinsic and extrinsic contributions to stochasticity in gene expression. Proc Natl Acad Sci 99(20):12795–12800
38. Van Kampen N (2011) Stochastic processes in physics and chemistry. North-Holland Personal Library, Elsevier Science
39. Wilkinson DJ (2006) Stochastic modelling for systems biology. Mathematical and computational biology Series, 2nd edn. Chapman and Hall/CRC, London
40. Wilkinson DJ (2009) Stochastic modelling for quantitative description of heterogeneous biological systems. Nat Rev Genet 10(2):122–133
41. Woods ML, Leon M, Perez-Carrasco R, Barnes CP (2016) A statistical approach reveals designs for the most robust stochastic gene oscillators. ACS Synth Biol 5(6):459–470
42. Zagaris A, Kaper HG, Kaper TJ (2004) Analysis of the computational singular perturbation reduction method for chemical kinetics. J Nonlinear Sci 14(1):59–91
Chapter 3
Abstract
Mathematical models play an important role in the design of synthetic gene circuits, by guiding the choice
of biological components and their assembly into novel gene networks. Here, we present a guide for
biologists to build and utilize models of gene networks (synthetic or natural) to analyze dynamical properties
of these networks while considering the low numbers of molecules inside cells that result in stochastic
gene expression. We start by describing how to write down a model and discussing the level of detail to
include. We then briefly demonstrate how to simulate a network’s dynamics using deterministic differential
equations that assume high numbers of molecules. To consider the role of stochastic gene expression in
single cells, we provide a detailed tutorial on running stochastic Gillespie simulations of a network,
including instructions on coding the Gillespie algorithm with example code. Finally, we illustrate how
using a combination of quantitative experimental characterization of a synthetic circuit and mathematical
modeling can guide the iterative redesign of a synthetic circuit to achieve the desired properties. This is
shown using a classic synthetic oscillator, the repressilator, which we recently redesigned into the most
precise and robust synthetic oscillator to date. We thus provide a toolkit for synthetic biologists to build
more precise and robust synthetic circuits, which should lead to a deeper understanding of the dynamics of
gene regulatory networks.
Key words Synthetic gene circuits, Mathematical modeling, Dynamical gene network, Stochastic
simulations, Gillespie algorithm, Synthetic oscillator, Synthetic biology, Biological oscillations
1 Introduction
Filippo Menolascina (ed.), Synthetic Gene Circuits: Methods and Protocols, Methods in Molecular Biology, vol. 2229,
https://doi.org/10.1007/978-1-0716-1032-9_3, © Springer Science+Business Media, LLC, part of Springer Nature 2021
92 Giselle McCallum and Laurent Potvin-Trottier
2 Materials
3 Methods
3.1 Writing Down a Model

3.1.1 Abstracting the Circuit: Sketching Its Diagram

The first step in writing down a model is to record all the interactions
between the molecules in the circuit or impacting it, and the
chemical reactions that create and eliminate them, in a diagram
(sometimes called the network topology). Here, we must consider
the level of detail we want to include in the model. While it is
important to include enough details to accurately reflect the underlying
processes we wish to learn about, too much detail can weigh
down the model and distract from the effects that particular vari-
ables can have on the behavior of the system. For example, consid-
ering relativity while calculating the movement of a ball through
the air will obscure the understanding of the simple system while
adding futile precision. Consider the repressilator circuit: in this
network, three genes encode different repressor proteins (LacI,
TetR, and λ CI), each of which represses the expression of the
next gene in the circuit in a single feedback loop. Because of the
odd number of repressors, this effectively leads to autorepression
with a delay, producing out-of-phase oscillations of the three pro-
teins. The simplest model of this circuit contains only the repressors
as variables and considers that the proteins directly repress each
other’s production (Fig. 1a). While this model can still lead to
oscillations, it ignores many important biological parameters, such
as transcription rates and difference between mRNAs’ and proteins’
half-lives (see Note 1). We could also model fluctuations in gene
copies due to plasmid copy number, the switching of the promoter
between the active and inactive state, the number of RNA poly-
merases and ribosomes, multimerization of the repressors, and
enzymatic degradation of the repressor via proteases—all biological
parameters with an impact on our circuit (Fig. 1b). However, these
details will not necessarily provide valuable insight on the behavior
of our circuit and will add many new variables and parameters to the
model. While they may not be included in the equations, keeping
these details in mind is helpful when analyzing the model and
circuit’s behavior, as they might help explain unexpected results.
Our favorite approach is to start with the simplest model that can
lead to some understanding of the system. Then, complexity can be
progressively added if it is necessary to explain the observed behav-
ior or to test the effects of a particular component of the system. It
is important to consider that in biology there are still many
unknowns (and unknown unknowns), and that adding many para-
meters to the model will not make it a better representation of
reality. Powerful approaches have been developed to rigorously
model systems with many unknown interactions, but they are
beyond the scope of this chapter [16, 17]. In our example, we
will include mRNA (m) and proteins (P) as variables in our net-
work. The transcription of an mRNA is repressed by the previous
Using Models to (Re-)Design Synthetic Circuits 95
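[Fig. 1: schematic diagrams of the repressilator. (a) The three repressors P1, P2, and P3. (b) Additional components: n plasmids, gene, RNAP, RNase, ribosomes, and protease. (c) mRNA and protein species for the 3 genes, with dilution and elimination to Ø.]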
3.1.2 Mass Action Equations

The next step in building our model is to write an expression
describing each reaction in our system according to the law of
mass action, which states that the rate of a reaction is proportional
to the concentration of the reactants. For example, for the reaction
x + y → z, the rate of production of z is calculated as $\frac{d[z]}{dt} = k_1 [x][y]$,
where $k_1$ is a constant known as the mass action rate constant that
indicates the rate per reactant (or proportionality) of the reaction.
Intuitively, this means that if you have twice as many molecules of one
reactant, the reaction rate will be twice as high because collisions between
molecules are twice as likely to happen. The mass action equation is then
written as $x + y \xrightarrow{k_1} z$.
We can thus write the mass action equations for all the chemical
reactions included in our model. For the repressilator, the equations
are:
\[ \emptyset \xrightarrow{f(P_{i-1})} m_i, \qquad m_i \xrightarrow{\beta_m} \emptyset, \qquad m_i \xrightarrow{\lambda_P} P_i + m_i, \qquad P_i \xrightarrow{\beta_P} \emptyset \]
for each repressor (i = 1, 2, 3, where $P_0 = P_3$ by definition). $\lambda_P$
is the rate of translation of repressor ($P_i$) per mRNA per unit time,
and $\beta_m$ and $\beta_P$ are the rates of elimination of mRNA and protein,
respectively (determined by dilution due to cell growth and active
degradation). The rate for the transcription of $m_i$ is the value of the
function $f(P_{i-1})$ describing repression of the promoter by the
previous protein ($P_{i-1}$) in the network. Here, we decide to use
the following function, called a Hill function:
\[ f(P_{i-1}) = \frac{\lambda_m K^h}{K^h + P_{i-1}^h} \]
The Hill equation is classically used to describe cooperative
binding of ligands to a receptor and is useful in describing many
biological processes, as it describes nonlinear switching of a system
between 0 and 1 (a fully “off” and fully “on” state). In our model,
the Hill function is used to approximate the (possibly partial)
cooperative binding of the repressor proteins to their promoters
(see Note 2). Here, h is the Hill coefficient representing this coop-
erative binding. The parameter K in the Hill function is the thresh-
old at which half of a population of a repressor in the cell is bound
to its site and accounts for the affinity that a repressor has for its
binding site in a promoter. λm is the maximal transcription rate
when there is no repression of the gene encoding the mRNA.
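The behavior of the repressive Hill function described above can be checked with a short Python sketch. The function name hill_repression is hypothetical, and the default values are the estimates from Table 1 expressed in units of protein lifetime; they are used here only as an illustration.

```python
def hill_repression(p_prev, lam_m=150.0, K=7.0, h=2.0):
    """Transcription rate of m_i given the copy number of the previous repressor."""
    return lam_m * K**h / (K**h + p_prev**h)

print(hill_repression(0.0))     # 150.0: no repressor, maximal rate lam_m
print(hill_repression(7.0))     # 75.0: at P = K the rate is half-maximal
print(hill_repression(700.0))   # close to 0: strong repression
```

Evaluating the function at P = 0 and P = K makes the roles of λm (maximal rate) and K (half-repression threshold) explicit.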
Using the mass action equations, we can now write the ordi-
nary differential equations (ODEs) that describe the dynamics of
our system and find a deterministic solution of the system (see
Subheading 3.2) assuming that the numbers of molecules are very
high. This may not be an accurate approximation in all situations,
but can provide an intuition about the system’s behavior. In order
to consider the effects of the finite number of molecules, we can
also simulate the reactions stochastically (see Subheading 3.3).
Table 1
Estimated parameter values for the mRNA-protein repressilator model

Parameter | Description | Units | Est. value | Source
λmi | Max transcription rate | mRNA min^-1 | 4.1 | [49]
 | | mRNA τP^-1 (see footnote a) | 150 |
K | Threshold of repression (½ molecules are bound to promoters) | proteins | 7 | [5]
h | Hill coefficient of cooperativity | Unitless | 2 | [50]
βm | mRNA elimination rate (combination of degradation and dilution) | min^-1 | 0.1 | [51] (footnote b)
 | | τP^-1 (see footnote a) | 3.6 |
λPi | Translation rate | proteins mRNA^-1 min^-1 | 1.8 | [21, 51] (footnote b)
 | | proteins mRNA^-1 τP^-1 (see footnote a) | 65 |
βP | Combination of dilution due to cell growth and active degradation of protein | min^-1 | 0.027 |
 | | τP^-1 (see footnote a) | 1 |

a For parameter scans discussed in Subheading 3.3.8, we have set all values for rate constant parameters to units of inverse protein
lifetime, τP^-1 (see Note 1), by scaling rate parameter values assuming a cell division time (and protein half-life) of 25 min,
with no degradation of our proteins by proteases (making τP = 36 min)
b Values were found on BioNumbers [18]
biology numbers from many cell types and species (from peer-reviewed
sources) and Ron Milo and Rob Phillips' book Cell Biology
by the Numbers [19]. For example, in our model (assuming that our
network is being expressed in Escherichia coli), several studies listed
on BioNumbers have shown translation rates of ~8 aa/mRNA/s
[20, 21], which we can adapt to the units required for our model
(see Note 3). Table 1 contains a complete list of parameters in the
example repressilator model, and order-of-magnitude estimates
of these parameters that serve as a starting point for our simulations.
It is easy to vary these parameters by orders of magnitude in
silico, but it is important to remember what they physically represent:
rates of chemical reactions in a cell. As such, they must remain
physically realistic. For example, binding constants cannot be so
high that molecules would need to diffuse faster than they
would in water. Note that although testing a range of parameters
computationally is easier than experimentally, the dimensionality
(and therefore computational load) expands as the number of
parameters increases (if you want to test a range of n values for
each of x parameters, you must run $n^x$ simulations). This provides further
motivation to keep the model as simple as possible.
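The growth of the number of simulations with the number of scanned parameters can be made concrete with a short Python sketch. The parameter names and the scanned values (one order of magnitude below and above the Table 1 estimates in units of protein lifetime) are hypothetical choices for illustration only.

```python
import itertools

# Hypothetical scan: n = 3 values for each of x = 4 parameters
scan = {
    "lam_m": [15.0, 150.0, 1500.0],
    "lam_P": [6.5, 65.0, 650.0],
    "K": [0.7, 7.0, 70.0],
    "beta_m": [0.36, 3.6, 36.0],
}

# Full factorial grid: every combination of the scanned values
combinations = list(itertools.product(*scan.values()))
print(len(combinations))        # 81, i.e., 3**4 simulations

# Each combination can be mapped back to named parameters
first = dict(zip(scan.keys(), combinations[0]))
print(first)
```

Adding a fifth parameter with three values would already require 243 runs, which is why keeping the model small pays off quickly.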
3.2 Deterministic Solution

3.2.1 Writing Ordinary Differential Equations (ODEs)

Let’s start by assuming that our system is evolving in a macroscopic
test tube in which all of our reactants are present in high numbers,
and the system is homogeneously mixed. Under this assumption, we
can write a set of deterministic ordinary differential equations
(ODEs) that describe the dynamics of our system. Here, we write
one equation for each species in the system, describing its rate of
change (i.e., its production rate minus its overall depletion rate).
For example, for a molecule x produced at a constant rate and
eliminated at a constant rate per molecule, $\frac{dx}{dt} = \lambda - \beta x(t)$, where λ
and β are the production and elimination rate constants, respectively.
It is often useful to know the concentration of your components
at equilibrium. To get this value, we can simply set $\frac{dx}{dt} = 0$
(at steady state, concentration does not change) and solve the
equation for x. Here, the steady-state value of x is (intuitively)
determined by its rate of production divided by its degradation
rate: $x_{ss} = \frac{\lambda}{\beta}$.
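The steady-state result can be verified numerically with a minimal Python sketch that integrates the single-species equation with a forward Euler scheme. The function name, the step size, and the rate values (borrowed from the mRNA estimates of Table 1 in units of protein lifetime) are assumptions made for this illustration.

```python
def simulate_birth_death(lam, beta, x0=0.0, dt=0.001, t_end=10.0):
    """Integrate dx/dt = lam - beta * x with forward Euler steps."""
    x = x0
    for _ in range(int(round(t_end / dt))):
        x += (lam - beta * x) * dt
    return x

lam, beta = 150.0, 3.6          # production and elimination rate constants
x_final = simulate_birth_death(lam, beta)
x_ss = lam / beta               # analytical steady state

print(x_final, x_ss)            # both approximately 41.67
```

Starting from any other initial value gives the same final state, since the steady state of this linear equation does not depend on the initial condition.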
We can use the same strategy to build a set of ODEs describing
the repressilator. In the spirit of simplicity, let’s assume for now that
the parameters are roughly equal for each mRNA and repressor. We
can therefore use the same parameter values for all equations in our
symmetrical system:
\[ \frac{dm_i}{dt} = \frac{\lambda_m K^h}{K^h + P_{i-1}^h} - m_i \beta_m \]
\[ \frac{dP_i}{dt} = m_i \lambda_P - P_i \beta_P \]
where i is the gene index as defined above. In total, our model will
consist of six ODEs with two terms each, describing the rate of
change of three mRNAs and three repressors over time.
3.2.2 Solving the System of ODEs

A system of ODEs describing nonlinear biochemical networks
cannot usually be solved analytically. However, for a given set of
parameters and initial values for the components, it can be solved
relatively easily (for well-behaved systems) using a numerical solver;
such solvers are built into most programming languages (e.g.,
MATLAB (see code at https://github.com/potvinlab/MiMB_circuitmodeling.git),
Python). For example, using the ode23 function
built into MATLAB, we can solve our system of equations with
the parameters in Table 1 over a specified time. In Fig. 2a, we can
see that for our estimated parameter values and chosen set of initial
conditions, our system exhibits sustained oscillations. For more
detailed information on ODE models of biological networks, please
refer to the chapter in this book titled Modelling frameworks:
Ordinary Differential Equations.
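As an illustration of numerical integration, the six repressilator ODEs can be solved with a hand-written fourth-order Runge-Kutta scheme in plain Python. This is a sketch under stated assumptions and not the code of the linked repository: it uses a fixed step instead of the adaptive ode23 solver mentioned above, the parameter defaults are the Table 1 estimates in units of protein lifetime, and the function names and the initial condition are hypothetical.

```python
def repressilator_rhs(state, lam_m=150.0, K=7.0, h=2.0, beta_m=3.6,
                      lam_P=65.0, beta_P=1.0):
    """Right-hand side of the six ODEs; state = [m1, m2, m3, P1, P2, P3]."""
    m, P = state[:3], state[3:]
    # P[i - 1] with i = 0 wraps around to P3, which encodes P0 = P3
    dm = [lam_m * K**h / (K**h + P[i - 1]**h) - beta_m * m[i] for i in range(3)]
    dP = [lam_P * m[i] - beta_P * P[i] for i in range(3)]
    return dm + dP

def rk4_step(f, y, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(y)
    k2 = f([a + 0.5 * dt * b for a, b in zip(y, k1)])
    k3 = f([a + 0.5 * dt * b for a, b in zip(y, k2)])
    k4 = f([a + dt * b for a, b in zip(y, k3)])
    return [a + dt / 6.0 * (b + 2 * c + 2 * d + e)
            for a, b, c, d, e in zip(y, k1, k2, k3, k4)]

y = [0.0, 0.0, 0.0, 10.0, 0.0, 0.0]   # hypothetical asymmetric initial condition
dt, trace = 0.001, []
for _ in range(20000):                 # 20 protein lifetimes
    y = rk4_step(repressilator_rhs, y, dt)
    trace.append(y[3])                 # record the copy number of P1

print(min(trace), max(trace))
```

Plotting trace against time (step index times dt) shows the time course of P1; an asymmetric initial condition is used because a perfectly symmetric start keeps the three repressors identical at all times.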
[Fig. 2 image: panels a (Deterministic) and b (Stochastic) plot copy number of P1, P2, and P3 against time (τP); panel c is a zoomed-in view of a stochastic trace]
Fig. 2 Time traces for the proteins of the repressilator. (a) Deterministic
numerical solution to the system of ODEs, solved with the parameter set
shown in Table 1. This set of parameters leads to sustained
oscillations. (b) Time traces for three proteins, simulated stochastically with the
same parameter set as in a. The stochastic system still produces sustained
oscillations, with some noise in period and amplitude of peaks. (c) Zooming in,
we can see copy number changing in discrete steps, with one protein being
produced or degraded in time steps of various lengths
3.2.3 Parameter Space Analysis and Bifurcation Diagrams
Often, we are interested in understanding the behavior of the
system over a range of parameters. For example, you might be
interested in choosing components (and indirectly parameters) for
your circuit that will lead to a specific behavior. For a deterministic
system, it can be possible to analytically determine the parameter
boundary that will give oscillations (or other behaviors like damped
oscillations or stability of equilibria) using a method called linear
stability and bifurcation analysis. While the detailed process of
linear stability analysis is outside the scope of this chapter (see
Note 4), this approach has previously been used to analyze the
parameter space of the repressilator model and find the boundary of
the parameter space that can give rise to oscillations
[1, 22]. Sometimes, combinations of parameters (such as the ratio
between them), rather than individual parameters themselves,
determine a system's behavior. Here, two combinations of parameters,
α and β (where $\alpha = \frac{\lambda_m \lambda_P}{\beta_m \beta_P K}$ and $\beta = \frac{\beta_P}{\beta_m}$), determine whether
the system oscillates and are used in linear stability analysis to
determine the boundary of the oscillation space. A plot of this
boundary, called the bifurcation diagram (Fig. 3), shows that
there are many sets of parameters for which the system can oscillate.
From this diagram, we can infer that increasing cooperativity/
100 Giselle McCallum and Laurent Potvin-Trottier
[Fig. 3 image: log-log plot of β = βP/βm (vertical axis, 10^-2 to 10^3) against α = λmλP/(βPβmK) (horizontal axis, 10^0 to 10^4), with boundary curves for h = 1.35, 1.5, 2, and 3]
Fig. 3 Bifurcation diagram. Plot showing the boundary of the α-β parameter
space that gives rise to oscillations. Thick lines indicate the boundary at various
h values, as determined by linear stability analysis (the parameter combinations
that lead to oscillations lie to the right of each line). Increasing
cooperativity (h), increasing α, and having β = 1 all enlarge the parameter
space that supports oscillations
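To place a candidate parameter set on this diagram, the two combinations α and β can be computed directly from their definitions. The helper below is a small sketch; the function name and the example values are assumptions for illustration only.

```python
def dimensionless_groups(lam_m, lam_p, beta_m, beta_p, K):
    """Return (alpha, beta) as defined for the bifurcation diagram."""
    alpha = lam_m * lam_p / (beta_m * beta_p * K)
    beta = beta_p / beta_m
    return alpha, beta


# Assumed example values (not Table 1)
alpha, beta_ratio = dimensionless_groups(
    lam_m=50.0, lam_p=20.0, beta_m=1.0, beta_p=1.0, K=10.0)
print(alpha, beta_ratio)
```

Because only these two combinations (together with h) enter the diagram, different physical parameter sets with the same α and β map to the same point.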
3.3 Stochastic Simulations
So far, we have been operating under the assumption that molecules
in our system are present in high numbers and therefore
behave according to deterministic dynamics. In cells, this assumption
is not always correct: many molecules such as mRNAs and
proteins are present in low copy numbers [23–28]. Individual
chemical reactions will happen by chance when molecules collide
with each other, such that numbers of molecules will fluctuate over
time (or across cells in a population), making their respective cellular
processes (like gene expression) stochastic in nature. Levels of
molecules can also fluctuate even if they are present in higher
numbers, as this noise can be transmitted from one molecule to
the next (for example, if proteins are translated from a noisy
3.3.1 Stochastic Notation
Similar to the deterministic system, we can write a set of equations
describing the rates for each possible reaction in the system, using
the following notation (see Note 5):
$$x \xrightarrow{\lambda} x + n$$
where n is the change in x resulting from a reaction and λ is
the average rate of the reaction. In our example of simple birth and
death of a molecule, n = 1 and n = −1, respectively, but n can be larger
in magnitude (e.g., 2) in other cases, such as production in a burst or
oligomerization of molecules. In all cases, n should be an integer, as
molecules can only exist in integer numbers. For the repressilator,
the reaction equations in stochastic notation are as follows:
[Fig. 4 image: (a) traces of x against time converging around x_ss; (b) frequency against x with x_ss marked; (c) lattice of x values 0 to 5 showing, from the state at t0, the moves x → x − 1 (rate β·x) and x → x + 1 (rate λ), with P(x=2, t1) and P(x=4, t1)]
Fig. 4 Stochastic simulation of a birth and death process. (a) Single time traces
(thin gray lines) simulated using the Gillespie algorithm for a species x, which is
produced at rate λ and degraded at rate β·x. Although each trace is different,
their statistical properties eventually converge to the correct probability distribution
(colored lines) and its moments (e.g., mean and standard deviation). (b)
Once steady state is reached, the probability distribution does not change. (c)
Random walk on a lattice. Starting at a given x value, the system can move to a
value of x + 1 or x − 1 with probabilities that depend on the production and
degradation rate, respectively
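$$m_i \xrightarrow{\;\frac{\lambda_m K^h}{K^h + P_{i-1}^h}\;} m_i + 1$$
$$m_i \xrightarrow{\;\beta_m m_i\;} m_i - 1$$
$$P_i \xrightarrow{\;\lambda_P m_i\;} P_i + 1$$
$$P_i \xrightarrow{\;\beta_P P_i\;} P_i - 1$$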
We will use the reaction rate expressions from these equations
in our Gillespie algorithm.
3.3.2 Simulating a Time-Trace: The Gillespie Algorithm
Instead of analytically solving the CME, we will simulate one
realization of the stochastic process. The idea behind the Gillespie
algorithm is quite simple: we initialize the system to an (arbitrary)
initial value (number of molecules at time t), and then let chemical
reactions happen randomly. To be exact, we need to pick these
reactions from their proper probability distribution, describing
both when the reaction is going to happen and which one will
happen. For example, consider our molecule x from the previous
section (produced with rate λ and degraded with rate β·x), and
imagine our cell has 3 molecules (x = 3) at a particular time point
(t0). The next chemical reaction will either be the production or
degradation of a molecule, leading to either x = 4 or x = 2. The
time until this next reaction happens is also stochastic and depends
on the current state of the system (Fig. 4c).
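To make the x = 3 example concrete, the sketch below computes the probability that the next reaction is a production or a degradation event, together with the mean waiting time. It uses the standard result that each reaction is chosen in proportion to its rate and that the mean waiting time is the inverse of the total rate; the numerical values of λ and β are assumed for illustration.

```python
def next_reaction_probabilities(x, lam, beta):
    """Return (P(x -> x+1), P(x -> x-1)) for the birth-death process."""
    birth = lam          # production rate, independent of x
    death = beta * x     # total degradation rate with x molecules present
    total = birth + death
    return birth / total, death / total


# Assumed example values: lam = 6 molecules per unit time, beta = 1 per unit time
lam, beta, x = 6.0, 1.0, 3
p_up, p_down = next_reaction_probabilities(x, lam, beta)
mean_wait = 1.0 / (lam + beta * x)  # mean time until the next reaction of any kind

print(p_up, p_down, mean_wait)
```

With these assumed numbers, moving to x = 4 is twice as likely as moving to x = 2, because production (rate 6) is twice as fast as the total degradation (rate 3) in this state.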
3.3.3 Gillespie Algorithm: Time to Next Reaction
After assigning an (arbitrary) initial value to all molecular species at
time zero of our simulation (e.g., x = 3 in Fig. 4c), we will first
calculate the time to the next reaction, given the current state of our
system. We can imagine the system sitting in one state, simply
waiting for the next reaction to occur. We know that the probability
of that chemical reaction happening per time unit is constant over
time, regardless of how long we waited. As an analogy, imagine the
waiting time for rolling black while playing roulette. It does not
matter how long you wait or how many times you have already
rolled red, the probability of falling on black on a given roll is always
the same. Another example is radioactive decay, a stochastic process
where the probability of one nucleus decaying is constant over
time. This property is called memorylessness, because the stochastic
process does not have a "memory" of how long it waited in a state.
The only continuous probability distribution with this property is
the exponential distribution, where T is the time to the first
reaction, λ is the average rate of the reaction, and τ is a given time
interval (see Note 6):
$$P(T > \tau) = e^{-\lambda\tau}$$
$$P(T \le \tau) = 1 - P(T > \tau) = 1 - e^{-\lambda\tau} = F(\tau)$$
This is the cumulative distribution function (CDF) of the time
to the first reaction: if we take the derivative of this function, we get
the exponential probability density function, which gives the probability
density that the time to the first reaction is around τ:
$$p(T = \tau) = \frac{dF(\tau)}{d\tau} = \lambda e^{-\lambda\tau}$$
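The memorylessness property can be checked directly from the survival function $P(T > \tau) = e^{-\lambda\tau}$. In the short derivation below, s denotes an arbitrary time already spent waiting (a symbol introduced here only for illustration):

```latex
P(T > s + \tau \mid T > s)
  = \frac{P(T > s + \tau)}{P(T > s)}
  = \frac{e^{-\lambda (s + \tau)}}{e^{-\lambda s}}
  = e^{-\lambda \tau}
  = P(T > \tau)
```

The probability of waiting a further interval τ therefore does not depend on s, which is exactly the statement that the process keeps no memory of how long it has already waited.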
In our algorithm, we therefore want to sample from this exponential
distribution. Because most programming languages include
a function to generate random numbers uniformly distributed
between 0 and 1, we use the CDF to map the distribution of
interest to a uniform distribution (see Note 7). This process is
referred to as inverting the distribution. In our example, λ is the
total rate of all reactions in the system: because the reactions are
independent, the rate of any one reaction happening is the sum of
the rates of each reaction in our system, $\lambda_{tot} = \sum_{i=1}^{N} \lambda_i$, where N is
the number of reactions in the system. Because we know the
current number of molecules in the system, we can calculate these
[Fig. 5 image: reaction rates λ1, λ2, λ3, ..., λN laid end to end on a line from 0 to $\lambda_{tot} = \sum_{i=1}^{N} \lambda_i$, rescaled to the interval 0 to 1, with r2 marked on it]
Fig. 5 Choosing a reaction. Rates of all reactions are calculated given the current
system state and normalized by the sum of the rates of all possible reactions (λtot),
such that their cumulative sum is 1. Rates are aligned, and a number r2 uniformly
distributed between 0 and 1 is generated, whose value will determine which
of these reactions will occur
3.3.4 Gillespie: Choosing a Reaction
We now know that a reaction happened at time t0 + τ, but we still
don't know which reaction occurred. Using the rate of each reaction
λi (again using the current state of the system), we first normalize
these rates by λtot, such that if we line them up, their
cumulative sum is 1, thus building a cumulative distribution function
(see Note 9). We now generate a second random number
between 0 and 1 (r2), which will fall somewhere on this line of
normalized reaction rates, determining the reaction that will occur
(Fig. 5). Reactions with higher rates take up more of the space in
the vector, and are therefore chosen with higher probability.
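The selection step can be sketched in a few lines of Python: walk along the normalized rates until the running sum exceeds r2. The function name and the example rates are assumptions for illustration.

```python
import random


def choose_reaction(rates, r2):
    """Return the index of the reaction selected by r2, with r2 in [0, 1)."""
    total = sum(rates)
    cumulative = 0.0
    for i, rate in enumerate(rates):
        cumulative += rate / total   # running sum of normalized rates
        if r2 < cumulative:
            return i
    return len(rates) - 1            # guard against floating-point round-off


# Two reactions with assumed rates 1 and 3: the second should be picked
# about three times as often as the first.
rng = random.Random(0)
rates = [1.0, 3.0]
picks = [choose_reaction(rates, rng.random()) for _ in range(10000)]
frac_first = picks.count(0) / len(picks)
print(frac_first)
```

A linear scan is adequate for a dozen reactions such as in the repressilator; for very large reaction sets, a binary search over precomputed cumulative sums is a common alternative.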
3.3.5 Gillespie: Updating the State of the System
Once we know which of the N reactions will occur and the time it
will take, we must update both the time of our simulation and the
quantities of all the species in the system. To update the time,
simply add the randomly sampled τ value to the current time. To
update the quantity of reactants, we add or subtract the appropriate
value to each species involved in the randomly picked reaction (see
Note 10). For example, if we chose the transcription of m1 as our
next reaction, we would update the system by adding +1 to the
current value of m1.
3.3.6 Gillespie: Iterating the Algorithm
The steps of the algorithm above are iterated a chosen number of
times (n chemical reactions), with time and quantity of reactants
being updated at each iteration. The number of iterations should be
large enough to properly characterize the resulting time trace. For
example, if you are interested in the statistical properties of your
system around steady state, you should run your simulation far
enough past the time at which steady state is reached that you have
sufficient points to calculate these statistics. Note that
different species in your model may evolve at different time scales
and it might take many reactions until you can sufficiently sample
your slow species (this may be computationally challenging). After
running the simulation for n iterations at the parameters listed in
Table 1, our system shows regular oscillations, but with some noise
in the period and amplitude of the oscillations (Fig. 2b). Zooming
in, we can see the discrete production and depletion steps of our
proteins, and the different sized τ intervals in time (Fig. 2c).
The steps of the Gillespie algorithm can be summarized as
follows:
1. Initialize the system at t0 to a chosen set of reactant quantities
2. Calculate all reaction rates (λi(x1, x2, ...)) and their sum
$\lambda_{tot} = \sum_{i=1}^{N} \lambda_i$, using the current state of the system (quantity
of each reactant, x1, x2, ..., at t0)
3. Use λtot and a randomly generated r1 in (0,1) to calculate the time τ
to the next reaction using inverse sampling of the exponential
distribution (equivalently, r1 can be replaced by 1 − r1, since both
are uniformly distributed on (0,1)):
$$\tau = -\frac{\ln(r_1)}{\lambda_{tot}}$$
4. Normalize all reaction rates (λi) by λtot and align them. Randomly
generate r2 in (0,1), whose value determines
which reaction happens at time t0 + τ
5. Update system according to chosen reaction, adding or subtracting
the appropriate amount to or from the quantity of each
species involved in the chosen reaction
6. Repeat steps 2–5 n times, updating the state of the system at
each iteration
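The six steps above can be sketched for the simple birth and death process (x produced at rate λ, degraded at rate β·x). This is a minimal standard-library illustration rather than the authors' implementation: the function names, parameter values, seed, and the time-weighted mean used as a sanity check are all assumptions.

```python
import math
import random


def gillespie_birth_death(lam, beta, x0, n_steps, rng):
    """Simulate n_steps reactions; return lists of event times and copy numbers."""
    t, x = 0.0, x0                              # step 1: initialize
    times, counts = [t], [x]
    for _ in range(n_steps):                    # step 6: iterate
        rates = [lam, beta * x]                 # step 2: rates in current state
        lam_tot = sum(rates)
        r1 = 1.0 - rng.random()                 # in (0, 1], avoids log(0)
        tau = -math.log(r1) / lam_tot           # step 3: time to next reaction
        r2 = rng.random()                       # step 4: choose the reaction
        if r2 < rates[0] / lam_tot:
            x += 1                              # step 5: production
        else:
            x -= 1                              # step 5: degradation
        t += tau
        times.append(t)
        counts.append(x)
    return times, counts


def time_weighted_mean(times, counts):
    """Average copy number, weighting each state by the time spent in it."""
    total = 0.0
    for i in range(len(times) - 1):
        total += counts[i] * (times[i + 1] - times[i])
    return total / (times[-1] - times[0])


rng = random.Random(1)
times, counts = gillespie_birth_death(lam=20.0, beta=1.0, x0=20,
                                      n_steps=50000, rng=rng)
mean_x = time_weighted_mean(times, counts)
print(mean_x)  # expected to lie close to lam / beta = 20
```

Note that the mean is weighted by the time spent in each state: because the waiting times differ between states, a plain average over reaction events would be biased.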
3.3.7 Characterization of Results
After simulating the system for many steps, we can then characterize
its properties. For example, using a long time trace, we can
calculate the probability distribution of the number of molecules at
steady state (P(X = x), Fig. 4b), or moments of the distribution,
such as the mean number of molecules or the fluctuations around
the mean (i.e., variance). The specific measure by which a gene
circuit is characterized will depend on its desired behavior. For
oscillators, the autocorrelation of the protein copy number is a
3.3.8 Parameter Scan
As with the deterministic solution to our model, we should assess
how the behavior of the system changes as a function of its parameters.
Here, it is useful to set one parameter equal to 1 and vary
the other parameter values relative to it, which minimizes the number
of parameters to scan; the behavior of the system can then be assessed
over this range of parameters. Typically, we set βP to
1, switching the time units of the simulation to protein lifetimes
(τP) and scaling the other rate values accordingly (Table 1,
see Note 1). Quantifying the autocorrelation for a range of parameters
shows that the stochastic system oscillates over a broader
range of parameters than the deterministic system (Fig. 6a, b),
"smoothing" out the bifurcation transition (Fig. 6b). While the
[Fig. 6 image: (a) Deterministic and Stochastic time traces of copy number against time (τP) for Hill = 2 (top) and Hill = 1.5 (bottom); (b) heatmaps of the correlation after one period over the α-β plane for the same two Hill coefficients]
Fig. 6 Scanning the parameter space of the stochastic system. (a) Comparison of deterministic solution and
stochastic simulation of the system with different Hill coefficients. With the chosen parameters, the stochastic
system still shows sustained oscillations with low cooperativity, whereas the deterministic solution shows damped
oscillations. (b) Heatmap of the autocorrelation of the time traces of the proteins after one period. Here,
$\alpha = \frac{\lambda_P \lambda_m}{\beta_P \beta_m K}$ and $\beta = \frac{\beta_P}{\beta_m}$. The thick black line indicates the deterministic bifurcation boundary, and the pink dot
corresponds to the parameter values used in simulations in a. As in the deterministic bifurcation calculation,
increasing cooperativity increases the size of the oscillation space. However, in the stochastic regime, it is
possible to maintain oscillations outside the predicted bifurcation boundary, with both high and low
cooperativity constants
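One way to compute an autocorrelation value such as the one plotted in Fig. 6b is sketched below. It assumes the time trace has already been resampled onto a uniform time grid (Gillespie output is not uniformly spaced) and uses a common normalization by the variance; neither choice is specified in the text, so both are assumptions, as are the function name and the synthetic test signal.

```python
import math


def autocorrelation(series, lag):
    """Normalized autocorrelation of a uniformly sampled series at a given lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((v - mean) ** 2 for v in series) / n
    cov = sum((series[i] - mean) * (series[i + lag] - mean)
              for i in range(n - lag)) / (n - lag)
    return cov / var


# Synthetic, noise-free oscillation with a period of 50 samples (20 periods)
period = 50
series = [math.sin(2.0 * math.pi * i / period) for i in range(1000)]

ac_one_period = autocorrelation(series, period)        # in phase
ac_half_period = autocorrelation(series, period // 2)  # in antiphase
print(ac_one_period, ac_half_period)
```

For a perfectly periodic signal the correlation after one period is 1; noise in period and amplitude lowers this value, which is why it can serve as a measure of oscillator quality.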
3.4 Using Models to Redesign the Circuit: An Example
Now that we know how to simulate our synthetic circuits, we can
incorporate data from the first design to help us understand and
improve their behavior. This process is obviously very specific to
particular circuits, so here we will use our previous experience
redesigning the repressilator as an example. Some recommendations
are general, such as reducing propagation of stochastic noise
(as this can be transmitted between molecules), and we will emphasize
these. It might also be necessary to iterate the design-build-test
loop multiple times, making small changes to the circuit, then
quantifying its properties and analyzing the results. Initially, the
repressilator was designed using the bifurcation diagram in Fig. 3.
The guidelines were therefore to have strong promoters
(to increase α) and high cooperativity, while ensuring that the
proteins' half-lives were similar to the mRNAs' (β = 1). Therefore,
repressors that multimerize and bind strongly were chosen, and
they were targeted for fast degradation to reduce their half-lives.
The assembled circuit did indeed oscillate, but its performance
appeared much lower than that of natural oscillators or other subsequently
published synthetic oscillators [32–39].
For the redesign of a circuit, it is crucial that the experimental
data accurately represent the circuit’s behavior. Therefore, for
single-cell dynamic properties such as oscillations, we evaluated
the performance of the circuit using a microfluidic device nick-
named the mother machine [6–8], which enables us to track
thousands of single cells under constant growth conditions for
hundreds of cell divisions. Comparing these data to the original
experiments performed on agar pads (where growth conditions
change rapidly as cells start to compete for nutrients) revealed
that the oscillatory properties appeared much improved, suggesting
that the circuit is sensitive to changes in growth conditions. This
illustrates how separating variability from the environment and
intrinsic noise of a circuit can aid in its redesign, as we can then
change or eliminate components that are highly sensitive to environmental
noise (e.g., growth conditions). We also observed high
amplitude noise between the peaks of the oscillations (Fig. 7), and
we thus decided to investigate the fluorescent read-out of the oscillations
as a potential source of noise. The original design included
one plasmid for the repressors and another, noisy plasmid carrying
the fluorescent read-out to track oscillations. Therefore, this amplitude
noise could simply be an artifact of our measurements.
Indeed, transferring the reporter to the repressilator plasmid
[Fig. 7 image: "Redesigning a synthetic circuit: the repressilator". Top: the original repressilator circuit (pSC101 ori plasmid carrying tetR-ssrA, λcI-ssrA, and lacI-ssrA under the PLlacO1, PLtetO1, and PR promoters; colE1 ori reporter plasmid carrying gfp-asv). Step 1, precise characterization: analyze single cells under constant growth conditions in the "Mother Machine"; plot of GFP concentration against time (τP). Step 2, identify and eliminate sources of noise: (a) amplitude noise, addressed by integrating the mVenus reporter to remove noise from the reporter plasmid; (b) noisy decay, addressed by modeling the decay to guide the redesign (repression curve, TetR sponge) and adding a titration sponge to increase the repression threshold and cooperativity; plots of YFP concentration against time (τP)]
Fig. 7 Redesigning the repressilator. Outline of steps taken to redesign the repressilator circuit to achieve high
robustness and precision. In step 1, we characterized the circuit in single cells at constant growth rates, which
improved oscillations compared to the original experimental setup. In step 2, we identified and eliminated
intrinsic sources of noise in the circuit. These included variable copy numbers of reporter plasmids (a), low
peak amplitude due to degradation of repressors and apparent low K values of repressors (b). Integrating the
fluorescent reporter onto the repressilator plasmid, removing degradation, and adding a titration sponge
improved precision of the circuit, leading to the most precise performance of a synthetic oscillator to date
greatly reduced peak amplitude noise (Fig. 7). In doing so, we also
made a serendipitous discovery: the fluorescent reporter originally
targeted for degradation interfered with degradation of the repres-
sors, adding noise to the oscillations. This was an example of the
unknowns in biology and emphasizes the need for both experiment
and modeling.
We also observed that the shape of the oscillations was strongly
non-sinusoidal (Fig. 7), which mathematical modeling of our sys-
tem told us was characteristic of very low repression thresholds (K)
(as expected for these strong repressors). In such a regime, the
promoters operate in a switch-like fashion (they are either
completely on or off), and the period can be decomposed into three
subphases in which each repressor decays from its peak value down to
its repression threshold while its production is completely off
(Fig. 7). After P1 decays below its threshold, production of P2 is
derepressed, which will immediately inhibit production of P3 and
initiate its decay phase. The length of the period is thus determined
by the sum of the three decay times, and we can analyze each decay
independently. While this analysis is specific to this circuit, such
pseudo-steady-state analysis, or time scale separation, is a general
technique that can be useful in analyzing many types of circuits. A
detailed analysis of the decay phase showed that two factors were
necessary for precise timing: (1) high peak amplitude, averaging
the timing of the decay over many steps, and (2) relatively high
repression thresholds, as the elimination time of the last few molecules
(to fall below a low K value) is very noisy (Fig. 2c), which in
turn causes large variation in period. These recommendations were
implemented by (1) removing protein degradation, thus letting
proteins accumulate to higher numbers (and also possibly remov-
ing a source of noise) and (2) adding decoy binding sites for the
repressors (called a “titration sponge”). These decoy sites (present
in much higher copy numbers than the actual sites) soak up free
repressors, effectively increasing the repression threshold (and
increasing effective cooperativity at the same time) (Fig. 7). This
linear molecular titration [40–44] is a very versatile tool that
enables the tuning of repression curves that would be experimentally
difficult to change otherwise and has been used in a variety of
applications, from timers in natural biological systems [7] to
controllers for perfect adaptation [45]. After implementing these
changes, the oscillations of the repressilator were extremely precise,
taking more than 13 periods before accumulating half a period of
phase drift.
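The two recommendations (high peak amplitude and a relatively high repression threshold) can be illustrated with a deliberately simplified model of one decay phase. The sketch assumes a pure death process with no production, in which each molecule is eliminated independently at rate β and the threshold is expressed as a molecule count; under these assumptions the decay time from the peak to the threshold is a sum of independent exponential waiting times with rates β·n, so its mean and variance are simple sums. This is an idealization for illustration, not the detailed analysis referred to in the text, and the function name and numbers are hypothetical.

```python
import math


def decay_time_cv(n_peak, threshold, beta=1.0):
    """Coefficient of variation of the time to decay from n_peak to threshold.

    The step from n to n - 1 molecules takes an exponential time with rate
    beta * n, so the total time has mean sum(1 / (beta * n)) and variance
    sum(1 / (beta * n) ** 2), with n running from threshold + 1 to n_peak.
    """
    ns = range(threshold + 1, n_peak + 1)
    mean = sum(1.0 / (beta * n) for n in ns)
    var = sum(1.0 / (beta * n) ** 2 for n in ns)
    return math.sqrt(var) / mean


cv_low_peak_low_k = decay_time_cv(n_peak=100, threshold=1)
cv_high_peak_low_k = decay_time_cv(n_peak=1000, threshold=1)
cv_low_peak_high_k = decay_time_cv(n_peak=100, threshold=10)
cv_high_peak_high_k = decay_time_cv(n_peak=1000, threshold=10)

print(cv_low_peak_low_k, cv_high_peak_low_k,
      cv_low_peak_high_k, cv_high_peak_high_k)
```

In this toy model the last few elimination steps (small n) contribute the largest terms to the variance, which is why raising the threshold reduces the relative timing noise, while a higher peak spreads the timing over more steps.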
As demonstrated, mathematical modeling and careful experi-
mental characterization are both critical components of designing
and redesigning synthetic genetic circuits. Models can provide
valuable insights into required parameter values and possible sys-
tem behaviors, and should guide the initial engineering of a circuit.
It is important to carefully characterize this initial circuit
4 Notes
[Fig. 8 image: protein number P against time t for exponential decay, $P(t) = P_0 e^{-\beta_P t}$, starting at $P_0$; the half-life $t_{1/2} = \ln(2)/\beta_P = \tau_P \ln(2)$ and the lifetime $\tau_P$ are marked on the time axis]
Fig. 8 Exponential protein decay. For proteins being eliminated at a constant rate
(with no production), the population will decay exponentially. The half-life (t1/2) is
the time at which the population has reached half of its initial value. The lifetime
(τP) of the protein is the average time that a protein will exist in the system. βP is
the elimination rate constant
[Fig. 9 image: repression curves against repressor concentration ([repressors], with K marked) for h = 1, 1.5, 3, and 6]
Fig. 9 Hill function. The Hill function is used in our model to describe binding of repressors
to their promoters. Transcription rate is calculated as a function of the number of
unbound promoters (available for expression). As the cooperativity coefficient
h increases, the transition between a gene being fully unbound (expressed with
rate λm) and fully bound (repressed) becomes sharper
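The effect of h on the sharpness of the transition can be seen by evaluating the repressive Hill term used in the model at a few repressor levels. The function name and the numerical values below are assumptions chosen for illustration.

```python
def hill_repression(p, K, h, lam_m=1.0):
    """Transcription rate lam_m * K^h / (K^h + p^h) at repressor level p."""
    return lam_m * K**h / (K**h + p**h)


K = 10.0  # assumed repression threshold
for h in (1, 1.5, 3, 6):
    below = hill_repression(K / 2, K, h)   # repressor at half the threshold
    at_k = hill_repression(K, K, h)        # repressor exactly at the threshold
    above = hill_repression(2 * K, K, h)   # repressor at twice the threshold
    print(h, round(below, 3), round(at_k, 3), round(above, 3))
```

At p = K the rate is half-maximal for every h; increasing h pushes the values below the threshold toward the maximal rate and the values above it toward zero, which is the switch-like behavior described in the caption.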
[Fig. 10 image: CDF P(T ≤ t) against time t (0 to 100); two 15% intervals are marked on the vertical axis, and the corresponding intervals on the time axis are labeled 4.2% and 27.8%]
Fig. 10 Intuition for the inverse transform sampling. Equal probabilities (15%) on
the uniform distribution are mapped to different probabilities on the CDF, where
the higher slopes (corresponding to the PDF) correspond to higher probabilities
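$$e^{\lambda_{tot}\tau} = \frac{1}{1 - r_1}$$
Remember that if $y = e^x$, then $x = \ln(y)$. Therefore:
$$\lambda_{tot}\tau = \ln\left(\frac{1}{1 - r_1}\right)$$
Also remember that $\ln(x^y) = y \cdot \ln(x)$:
$$\tau = -\frac{\ln(1 - r_1)}{\lambda_{tot}}$$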
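The final expression can be turned into a sampler in a few lines. The sketch below draws many waiting times from uniform random numbers and checks that their average is close to 1/λtot, the mean of the exponential distribution; the function name, the rate, the seed, and the sample size are illustrative assumptions.

```python
import math
import random


def sample_waiting_time(lam_tot, r1):
    """Map a uniform number r1 in [0, 1) to an exponential waiting time."""
    return -math.log(1.0 - r1) / lam_tot


rng = random.Random(0)
lam_tot = 4.0  # assumed total reaction rate
samples = [sample_waiting_time(lam_tot, rng.random()) for _ in range(100000)]
mean_tau = sum(samples) / len(samples)

# r1 = 0.5 should map to the median of the distribution, ln(2) / lam_tot
median_tau = sample_waiting_time(lam_tot, 0.5)
print(mean_tau, median_tau)
```

Using 1 − r1 inside the logarithm has a practical benefit here: Python's random() can return exactly 0 but never 1, so the argument of the logarithm is never zero.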
9. Gillespie in practice: rate vector function. When setting up the
Gillespie algorithm, there are a few tricks that make things
more efficient and cleaner. For example, after writing the entire
set of equations, it is helpful to assemble the rate definitions
for each equation into a vector, and build this vector into a
function that accepts current reactant values as an input (called
the rate vector function, or rvf). This allows us to easily calculate
the individual rates of all reactions for a given system state
and to quickly sum the rates to calculate λtot. In the case of the
repressilator, the rate vector function is defined as:
$$\mathrm{rvf}(m, P) = \left[ \frac{\lambda_m K^h}{K^h + P_3^h},\ \beta_m m_1,\ \lambda_P m_1,\ \beta_P P_1,\ \frac{\lambda_m K^h}{K^h + P_1^h},\ \beta_m m_2,\ \lambda_P m_2,\ \beta_P P_2,\ \frac{\lambda_m K^h}{K^h + P_2^h},\ \beta_m m_3,\ \lambda_P m_3,\ \beta_P P_3 \right]$$
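A direct Python transcription of this rate vector function might look as follows. It is a sketch rather than the authors' code: the argument layout (lists m and P holding the three mRNA and protein copy numbers) and the example values are assumptions.

```python
def rvf(m, P, lam_m, lam_p, beta_m, beta_p, K, h):
    """Rates of the 12 repressilator reactions for the current state.

    Order per gene i: transcription of m_i, degradation of m_i,
    translation of P_i, degradation of P_i. Gene i is repressed by
    protein i-1 (cyclically, so gene 1 is repressed by P3).
    """
    rates = []
    for i in range(3):
        repressor = P[i - 1]  # index -1 wraps around to P3 for gene 1
        rates += [
            lam_m * K**h / (K**h + repressor**h),
            beta_m * m[i],
            lam_p * m[i],
            beta_p * P[i],
        ]
    return rates


# Assumed example state and parameters
rates = rvf(m=[1, 2, 3], P=[0, 10, 20],
            lam_m=50.0, lam_p=20.0, beta_m=1.0, beta_p=1.0, K=10.0, h=2)
lam_tot = sum(rates)
print(rates, lam_tot)
```

Keeping the rates in a fixed order matters because the index of the chosen reaction is later used to look up the corresponding row of the stoichiometry matrix.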
Table 2
Stoichiometry matrix for the repressilator model
Reaction        m1    P1    m2    P2    m3    P3
m1 → m1 + 1      1     0     0     0     0     0
m1 → m1 − 1     −1     0     0     0     0     0
P1 → P1 + 1      0     1     0     0     0     0
P1 → P1 − 1      0    −1     0     0     0     0
m2 → m2 + 1      0     0     1     0     0     0
m2 → m2 − 1      0     0    −1     0     0     0
P2 → P2 + 1      0     0     0     1     0     0
P2 → P2 − 1      0     0     0    −1     0     0
m3 → m3 + 1      0     0     0     0     1     0
m3 → m3 − 1      0     0     0     0    −1     0
P3 → P3 + 1      0     0     0     0     0     1
P3 → P3 − 1      0     0     0     0     0    −1
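In code, a stoichiometry matrix like Table 2 makes the update step a single row lookup: once a reaction index has been chosen, the corresponding row is added to the state. The sketch below builds the matrix programmatically in the same row and column order as the table; the function names and the example state are assumptions.

```python
SPECIES = ["m1", "P1", "m2", "P2", "m3", "P3"]


def build_stoichiometry():
    """12 x 6 matrix: for each species in turn, a +1 row then a -1 row."""
    rows = []
    for col in range(len(SPECIES)):
        for change in (+1, -1):
            row = [0] * len(SPECIES)
            row[col] = change
            rows.append(row)
    return rows


def apply_reaction(state, stoich, reaction_index):
    """Return the new state after the chosen reaction has fired once."""
    return [s + d for s, d in zip(state, stoich[reaction_index])]


stoich = build_stoichiometry()
state = [5, 40, 0, 12, 3, 7]                 # assumed copy numbers, in SPECIES order
after_m1_made = apply_reaction(state, stoich, 0)   # transcription of m1
after_p1_lost = apply_reaction(state, stoich, 3)   # degradation of P1
print(after_m1_made, after_p1_lost)
```

The same update function works unchanged for coupled reactions, where a row has more than one non-zero entry.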
Table 3
Example stoichiometry matrix for a system with coupled reactions
        Production            Dimerization
        y → y + 1             (y, Y) → (y − 2, Y + 1)
        (rate λ_y)            (rate λ_Y y²)
y       1                     −2
Y       0                     1
References
1. Elowitz MB, Leibler S (2000) A synthetic oscillatory network of transcriptional regulators. Nature 403:335–338. https://doi.org/10.1038/35002125
2. Gardner TS, Cantor CR, Collins JJ (2000) Construction of a genetic toggle. Nature 403:339–342. https://doi.org/10.1038/35002131
3. Vilar JMG, Kueh HY, Barkai N, Leibler S (2002) Mechanisms of noise-resistance in genetic oscillators. Proc Natl Acad Sci U S A 99:5988–5992. https://doi.org/10.1073/pnas.092133899
4. McKane AJ, Newman TJ (2005) Predator-prey cycles from resonant amplification of demographic stochasticity. Phys Rev Lett 94:218102. https://doi.org/10.1103/PhysRevLett.94.218102
5. Potvin-Trottier L, Lord ND, Vinnicombe G, Paulsson J (2016) Synchronous long-term oscillations in a synthetic gene circuit. Nature 538:514–517. https://doi.org/10.1038/nature19841
6. Wang P, Robert L, Pelletier J et al (2010) Robust growth of Escherichia coli. Curr Biol 20:1099–1103. https://doi.org/10.1016/j.cub.2010.04.045
7. Norman TM, Lord ND, Paulsson J, Losick R (2013) Memory and modularity in cell-fate decision making. Nature 503:481–486. https://doi.org/10.1038/nature12804
8. Potvin-Trottier L, Luro S, Paulsson J (2018) Microfluidics and single-cell microscopy to study stochastic processes in bacteria. Curr Opin Microbiol 43:186–192. https://doi.org/10.1016/j.mib.2017.12.004
9. Alon U (2007) An introduction to systems biology: design principles of biological circuits. Chapman & Hall/CRC, Boca Raton, FL
10. Phillips R, Kondev J, Theriot J et al (2013) Physical biology of the cell, 2nd edn. Garland Science, New York, NY
11. Munsky B, Hlavacek WS, Tsimring LS (2018) Quantitative biology: theory, computational methods, and models. MIT Press, Cambridge, MA
12. Ingalls B (2013) Mathematical modelling in systems biology: an introduction. MIT Press, Cambridge, MA
13. Bialek WS (2012) Biophysics: searching for principles. Princeton University Press, Princeton, NJ
14. Wikipedia (2019) Gillespie algorithm. https://en.wikipedia.org/wiki/Gillespie_algorithm
15. Kernst OK (2015) Gillespie's stochastic simulation algorithm for chemical reactions. In: Wolfram Alpha Demonstr. https://demonstrations.wolfram.com/GillespiesStochasticSimulationAlgorithmForChemicalReactions/
16. Hilfinger A, Norman TM, Paulsson J (2016) Exploiting natural fluctuations to identify kinetic mechanisms in sparsely characterized systems. Cell Syst 2:251–259. https://doi.org/10.1016/j.cels.2016.04.002
17. Hilfinger A, Norman TM, Vinnicombe G, Paulsson J (2016) Constraints on fluctuations in sparsely characterized biological systems. Phys Rev Lett 116:058101. https://doi.org/10.1103/PhysRevLett.116.058101
18. Milo R, Jorgensen P, Moran U et al (2010) BioNumbers--the database of key numbers in molecular and cell biology. Nucleic Acids Res 38:D750–D753. https://doi.org/10.1093/nar/gkp889
19. Milo R, Phillips R (2016) Cell biology by the numbers. Garland Science, New York, NY
20. Guet CC, Bruneaux L, Min TL et al (2008) Minimally invasive determination of mRNA concentration in single living bacteria. Nucleic Acids Res 36:e73. https://doi.org/10.1093/nar/gkn329
21. Siwiak M, Zielenkiewicz P (2013) Transimulation - protein biosynthesis web service. PLoS One 8:e73943. https://doi.org/10.1371/journal.pone.0073943
22. Elowitz MB (1999) Transport, assembly, and dynamics in systems of interacting proteins. PhD Thesis. Princeton University, Princeton, NJ
23. El Samad H, Khammash M, Petzold L, Gillespie D (2005) Stochastic modelling of gene regulatory networks. Int J Robust Nonlinear Control 15:691–711. https://doi.org/10.1002/rnc.1018
24. Paulsson J (2005) Models of stochastic gene expression. Phys Life Rev 2:157–175. https://doi.org/10.1016/j.plrev.2005.03.003
25. Elowitz MB, Levine AJ, Siggia ED, Swain PS (2002) Stochastic gene expression in a single cell. Science 297:1183–1186. https://doi.org/10.1126/science.1070919
26. Ozbudak EM, Thattai M, Kurtser I et al (2002) Regulation of noise in the expression of a single gene. Nat Genet 31:69–73. https://doi.org/10.1038/ng869
27. Raj A, Van Oudenaarden A (2009) Single-molecule approaches to stochastic gene expression. Annu Rev Biophys 38:255–270. https://doi.org/10.1146/annurev.biophys.37.032807.125928
28. Paulsson J (2004) Summing up the noise in gene networks. Nature 427:415–418. https://doi.org/10.1038/nature02257
29. McQuarrie DA (1967) Stochastic approach to chemical kinetics. J Appl Probab 4:413–478. https://doi.org/10.2307/3212214
30. van Kampen NG (2007) Stochastic processes in physics and chemistry, 3rd edn. Elsevier, Amsterdam
31. Gillespie DT (1977) Exact stochastic simulation of coupled chemical reactions. J Phys Chem 81:2340–2361. https://doi.org/10.1021/j100540a008
32. Mihalcescu I, Hsing W, Leibler S (2004) Resilient circadian oscillator revealed in individual cyanobacteria. Nature 430:81–85. https://doi.org/10.1038/nature02533
33. Teng S-W, Mukherji S, Moffitt JR et al (2013) Robust circadian oscillations in growing cyanobacteria require transcriptional feedback. Science 340:737–740. https://doi.org/10.1126/science.1230996
34. Chabot JR, Pedraza JM, Luitel P, van Oudenaarden A (2007) Stochastic gene expression out-of-steady-state in the cyanobacterial circadian clock. Nature 450:1249–1252. https://doi.org/10.1038/nature06395
35. Stricker J, Cookson S, Bennett MR et al (2008) A fast, robust and tunable synthetic gene oscillator. Nature 456:516–519. https://doi.org/10.1038/nature07389
36. Tigges M, Dénervaud N, Greber D et al (2010) A synthetic low-frequency mammalian oscillator. Nucleic Acids Res 38:2702–2711. https://doi.org/10.1093/nar/gkq121
37. Danino T, Mondragón-Palomino O, Tsimring L, Hasty J (2010) A synchronized quorum of genetic clocks. Nature 463:326–330. https://doi.org/10.1038/nature08753
38. Mondragón-Palomino O, Danino T, Selimkhanov J et al (2011) Entrainment of a population of synthetic genetic oscillators. Science 333:1315–1319. https://doi.org/10.1126/science.1205369
39. Prindle A, Selimkhanov J, Li H et al (2014) Rapid and tunable post-translational coupling of genetic circuits. Nature 508(7496):387–391. https://doi.org/10.1038/nature13238
40. Buchler NE, Louis M (2008) Molecular titration and ultrasensitivity in regulatory networks. J Mol Biol 384:1106–1119. https://doi.org/10.1016/j.jmb.2008.09.079
41. Buchler NE, Cross FR (2009) Protein sequestration generates a flexible ultrasensitive response in a genetic network. Mol Syst Biol 5:272. https://doi.org/10.1038/msb.2009.30
42. Genot AJ, Fujii T, Rondelez Y (2012) Computing with competition in biochemical networks. Phys Rev Lett 109:1–5. https://doi.org/10.1103/PhysRevLett.109.208102
43. Lee T-H, Maheshri N (2012) A regulatory role for repeated decoy transcription factor binding sites in target gene expression. Mol Syst Biol 8:576. https://doi.org/10.1038/msb.2012.7
44. Brewster RC, Weinert FM, Garcia HG et al (2014) The transcription factor titration effect dictates level of gene expression. Cell 156:1312–1323. https://doi.org/10.1016/j.
46. Hill A (1910) The possible effects of the aggregation of the molecules of haemoglobin on its oxygen dissociation curve. J Physiol 40:4–7
47. Strogatz S (2015) Nonlinear dynamics and chaos: with applications to physics, biology, chemistry, and engineering, 2nd edn. Westview Press, Boulder, CO
48. Epstein IR, Irving R, Pojman JA, John A (1998) An introduction to nonlinear chemical dynamics: oscillations, waves, patterns, and chaos. Oxford University Press, New York, NY
49. Weiße AY, Oyarzún DA, Danos V et al (2015) Mechanistic links between cellular trade-offs, gene expression, and growth. Proc Natl Acad Sci U S A 112:E1038–E1047. https://doi.org/10.1073/pnas.1416533112
50. Niederholtmeyer H, Sun ZZ, Hori Y et al (2015) Rapid cell-free forward engineering of novel genetic ring oscillators. elife 4:1–18.
cell.2014.02.022 https://doi.org/10.7554/elife.09771
45. Lillacci G, Aoki SK, Gupta A et al (2019) A 51. Taniguchi Y, Choi PJ, Li G-W et al (2010)
universal rationally-designed biomolecular Quantifying E. coli proteome and transcrip-
integral feedback controller for robust perfect tome with single-molecule sensitivity in single
adaptation. Nature 570:533–537. https://doi. cells. Science 329:533–538. https://doi.org/
org/10.1038/s41586-019-1321-1 10.1126/science.1188308
Chapter 4
Abstract
SYNBADm is a Matlab toolbox for the automated design of biocircuits using a model-based optimization
approach. It enables the design of biocircuits with pre-defined functions starting from libraries of biological
parts. SYNBADm makes use of mixed integer global optimization and allows both single and multi-
objective design problems. Here we describe a basic protocol for the design of synthetic gene regulatory
circuits. We illustrate step-by-step how to solve two different problems: (1) the (single objective) design of a
synthetic oscillator and (2) the (multi-objective) design of a circuit with switch-like behavior upon
induction, with a good compromise between performance and protein production cost.
Key words Automated design, Biological parts, Global optimization, Mixed Integer Nonlinear
Programming, Multi-objective optimization, Trade-offs, Synthetic biology
1 Introduction
Filippo Menolascina (ed.), Synthetic Gene Circuits: Methods and Protocols, Methods in Molecular Biology, vol. 2229,
https://doi.org/10.1007/978-1-0716-1032-9_4, © Springer Science+Business Media, LLC, part of Springer Nature 2021
120 Irene Otero-Muras and Julio R. Banga
2 Methods
Fig. 1 Biological components: Promoters, Ribosome Binding Site (RBS), Protein Coding Sequences (CDS), and
Terminator (Ter); Devices and Systems
2.3 Optimization Solvers
The class of mixed integer dynamic optimization problems considered
above is NP-hard. Although these problems can be solved with purely
stochastic methods (such as simulated annealing, genetic algorithms,
etc.), that would be extremely costly computationally, since these
methods require a very large number of evaluations of the cost function
(and therefore many simulations of the explored biocircuits). To avoid
this, SYNBADm includes four global optimization solvers which are based
on metaheuristics, combining stochastic global search with efficient
local search methods. The main advantage is that we keep the global
character of the search (escaping from local solutions) while increasing
efficiency dramatically thanks to the deterministic local solvers.
Currently, SYNBADm offers the following metaheuristics:
• eSS (enhanced Scatter Search by Egea et al. [7]): for Mixed Integer
  Nonlinear Programming problems, it handles constraints and incorporates
  the local solver MISQP (Mixed Integer Sequential Quadratic Programming
  by Exler et al. [8]).
• MITS (Mixed Integer Tabu Search by Exler et al. [9]): for Mixed Integer
  Nonlinear Programming problems, incorporates the local solver MISQP.
• ACOmi (Ant Colony Optimization for mixed integer domain by Schlueter
  et al. [10]): for Mixed Integer Nonlinear Programming problems,
  incorporates the local solver MISQP.
• VNS (Variable Neighborhood Search by Mladenovic et al. [11]): for
  Integer Nonlinear Programming problems, it does not handle constraints.
  Use this solver for unconstrained single objective problems with
  integer (or binary) variables only.
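The idea shared by these solvers, a stochastic global phase combined with a deterministic local phase, can be illustrated with a minimal Python sketch. It is not the eSS, MITS, ACOmi, or VNS algorithm and it does not call SYNBADm: the toy cost function, the variable bounds, and all function names are hypothetical and were chosen only to make the example self-contained.

```python
import random

def cost(bits, x):
    """Toy mixed-integer cost (hypothetical): minimum at bits=(1,0,1), x=0.3."""
    target = (1, 0, 1)
    mismatch = sum(b != t for b, t in zip(bits, target))
    return mismatch + (x - 0.3) ** 2

def local_search(bits, x, step=0.1, tol=1e-6):
    """Deterministic refinement: single bit flips plus a shrinking step on x."""
    best = cost(bits, x)
    while step > tol:
        improved = False
        # Try flipping each binary variable in turn.
        for i in range(len(bits)):
            trial = bits[:i] + (1 - bits[i],) + bits[i + 1:]
            c = cost(trial, x)
            if c < best:
                bits, best, improved = trial, c, True
        # Try moving the real variable in both directions, kept within [0, 1].
        for dx in (step, -step):
            xt = min(1.0, max(0.0, x + dx))
            c = cost(bits, xt)
            if c < best:
                x, best, improved = xt, c, True
        if not improved:
            step /= 2.0  # refine the step once no move helps
    return bits, x, best

def hybrid_search(n_starts=20, seed=0):
    """Stochastic global phase (random starts), each followed by local search."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_starts):
        bits = tuple(rng.randint(0, 1) for _ in range(3))
        x = rng.random()
        candidate = local_search(bits, x)
        if best is None or candidate[2] < best[2]:
            best = candidate
    return best

bits, x, value = hybrid_search()
print(bits, round(x, 3), round(value, 6))
```

In a real design problem each call to the cost function involves simulating a circuit, which is why limiting the number of evaluations through an efficient local phase matters.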
It should be noted that, due to their stochastic and heuristic
nature, these methods cannot guarantee global optimality. How-
ever, it should also be recalled that deterministic global optimiza-
tion methods, which could in principle offer such guarantees, are in
practice too computationally costly to be applied to problems of
realistic size. In contrast, the metaheuristics considered in
Biocircuit Design with SYNBADm 123
4.1 Definition of the Problem
The goal is to find an endogenous oscillator (i.e., a system that
oscillates without the need for an external inducer) starting from a
library of available components. We consider mass action kinetics,
because we are interested in tracking the concentrations of proteins and
mRNAs (see Note 1).
We have the following components available: two different promoters, Pλ
and Ptet, that we denote, respectively, by P1 and P2, and four protein
coding sequences (cI, tetR, lacI, luxI). In addition, we consider an
extra promoter P3, repressed by lacI, with tunable promoter strength. To
generate the corresponding SYNBADm library we will use the built-in mass
action library as a template and modify it accordingly.
4.2 Preparing the Library of Components
1. In the folder MA_Library (within USR_Libraries), we open the script
   MA_input_library that we are going to use as a template. Before
   making any modification, we save the script with a different name,
   MA_input_library_EX1.
library.name_of_function = 'MAex1';
library.promoters = {'P1','P2','P3'};               % list of promoters
library.transcripts = {'cI','tetR','lacI','luxI'};  % list of protein coding regions
library.prom_tf = {'cI','tetR','lacI'};             % transcript affecting each promoter
library.inducers = {};                              % list of inducers
library.ind_tr = {};                                % transcript affected by each inducer
4.3 Defining the Objective Function
SYNBADm has a number of built-in objective functions included in the
folder USR_ObjFuns. The function OF_Oscil is the objective function
especially suited to designing oscillators. Therefore, in this case we
do not need to define an ad hoc objective function; we can use the
built-in function OF_Oscil directly. We only need to adapt the list of
species to the library that we have defined for our problem. In order to
do this, we open OF_Oscil and replace the default list of species with
the one we are currently using, i.e.: trnsc = {cI,tetR,lacI,luxI}; The
objective
function k = MAex1_parameters

NA = 6.0221415e23;  % Avogadro
V = 1e-14;          % Cell volume
NAV = NA*V/1e9;     % For concentration in nM

kf_pt_1 = NAV;
kf_pt_2 = NAV;
kf_pt_3 = NAV;
kb_pt_1 = 0.5;
kb_pt_2 = 0.5;
kb_pt_3 = 0.5;
kdeg_pt_1 = 0.075;
kdeg_pt_2 = 0.075;
kdeg_pt_3 = 0.075;
ktransc_1 = 0.00005;
ktransc_2 = 0.00005;
ktransc_3 = 0.00005;
kleak_1 = 0.12;
kleak_2 = 0.09;
kleak_3 = 0.01;
ktransl_1 = 0.1;
ktransl_2 = 0.1;
ktransl_3 = 0.1;
ktransl_4 = 0.1;
kdeg_m_1 = 0.001;
kdeg_m_2 = 0.001;
kdeg_m_3 = 0.001;
kdeg_m_4 = 0.001;
kdeg_1 = 0.001;
kdeg_2 = 0.001;
kdeg_3 = 0.001;
kdeg_4 = 0.001;
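The factor NAV in the parameter file converts between molecule counts and nanomolar concentrations. The following Python sketch reproduces that arithmetic. It assumes that the cell volume V is expressed in litres, which the listing does not state; the function names are hypothetical.

```python
AVOGADRO = 6.0221415e23  # molecules per mole, value used in the parameter file
CELL_VOLUME_L = 1e-14    # cell volume; litres is an ASSUMPTION, the unit is not stated

def molecules_per_nanomolar(avogadro=AVOGADRO, volume_l=CELL_VOLUME_L):
    """Molecules per cell that correspond to a concentration of 1 nM (NAV)."""
    return avogadro * volume_l / 1e9

def to_molecules(conc_nm):
    """Convert a concentration in nM to a molecule count per cell."""
    return conc_nm * molecules_per_nanomolar()

def to_nanomolar(n_molecules):
    """Convert a molecule count per cell to a concentration in nM."""
    return n_molecules / molecules_per_nanomolar()

nav = molecules_per_nanomolar()
print(f"NAV = {nav:.4f} molecules per nM")
print(f"100 nM = {to_molecules(100.0):.1f} molecules per cell")
```

Under this assumption, 1 nM corresponds to roughly six molecules per cell.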
4.4 Solving the Single Objective Optimization Problem
In the library USR_inputs, we create the input file (a pre-existing
input file can be used as a template), and we save it as
Oscillator_MAex1.m (see Fig. 4). In this file we indicate:
%================================
% MIXED INTEGER MODEL FRAMEWORK
%================================
inputs.model.libtype = 'MA_Library';  % Choose 'MA_Library' | 'HL_Library'
inputs.model.ode_name = 'MAex1_odefile_c';
inputs.model.n_integer_var = 0;
inputs.model.n_real_var = 1;
inputs.model.n_binary_var = 12;
inputs.model.def_param_function = 'MAex1_parameters_1';
inputs.model.def_state_function = 'MAex1_default_states';
inputs.model.transc_promot_function = 'MAex1_transcripts_and_promoters';
inputs.model.u_values = [];
%============================
% DESIGN PROBLEM OPTIONS
%============================
inputs.design.objective = 'OF_Oscil';
inputs.design.idx = {3};
inputs.design.par_x = [];
inputs.design.var_L = zeros(1,13);
inputs.design.var_U = ones(1,13);
inputs.design.var_0 = zeros(1,13);
inputs.design.D_max = 3;  % only applies in MITS, ESS, ACO
inputs.design.D_min = 3;  % only applies in MITS, ESS, ACO
%====================================
% SIMULATE OPTIONS
%=====================================
inputs.simulate.var_circuit = [];
inputs.simulate.tspan = 0:10:40000;
inputs.simulate.objective = {'OF_Oscil'};
%==================================
% MINLP SOLVER OPTIONS
%==================================
inputs.optsol.optsolver = 'ESS';  % Choose MINLP solver 'ESS' | 'MITS' | 'ACO' | 'VNS'
inputs.optsol.maxtime = 100;
inputs.optsol.maxeval = [];
% ess options
inputs.optsol.ess.local.solver = 'misqp';
%==================================
% IVP SOLVER OPTIONS
%==================================
inputs.ivpsol.rtol = 1.0D-7;  % [] IVP solver integration tolerances
inputs.ivpsol.atol = 1.0D-7;
5. The options for the integration, mainly the tolerances for the
   initial value problem (IVP) solver.
Once the input file is completed, we call (from the main directory) the
function that solves the single objective design problem:
>>SYNBAD_Design_SO(Oscillator_MAex1)
After the computation time (selected in the design input file, in this
case 100 s), the best solution found is stored in the file
RESULTS_DESIGN.mat. Note that the design problem might have more than
one optimal solution and that the global optimization solvers are
stochastic, so the solution obtained by SYNBADm might differ from one
call to SYNBAD_Design_SO to the next. Here we find the following
solution:
results.xbest = [0.013209502186800 0 0 1 1 0 0 0 1 0 0 0 0]
which corresponds to the circuit in Fig. 5. The value of the objective
function for this circuit is results.fbest = -0.739787373366868. We
recommend saving the mat file containing the results under a different
name (RESULTS_DESIGN_EX1_T1) in the USR_Results folder, to avoid
overwriting the results in further calls to the single objective design
function.
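One way to read such a solution vector is sketched below in Python. The split into one real entry followed by twelve binary entries follows the input file (one real and twelve binary variables), but the ordering of the binary entries is an assumption: here they are taken to be grouped by transcript, with one entry per promoter inside each group. The text does not document this ordering, so the decoding should be checked against the SYNBADm output before being relied upon. The function name `decode` is hypothetical.

```python
# Names taken from the library definition of Example 1.
PROMOTERS = ["P1", "P2", "P3"]
TRANSCRIPTS = ["cI", "tetR", "lacI", "luxI"]
REPRESSOR_OF = {"P1": "cI", "P2": "tetR", "P3": "lacI"}  # library.prom_tf

# Solution reported in the text: 1 real entry followed by 12 binary entries.
XBEST = [0.013209502186800, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0]

def decode(x, n_real=1):
    """Return the real entries and the active (promoter, transcript) pairs.

    ASSUMPTION: binary entries are grouped by transcript, with one entry
    per promoter within each group. The ordering is not stated in the text.
    """
    reals, bits = x[:n_real], x[n_real:]
    pairs = []
    for t, transcript in enumerate(TRANSCRIPTS):
        for p, promoter in enumerate(PROMOTERS):
            if bits[t * len(PROMOTERS) + p] == 1:
                pairs.append((promoter, transcript))
    return reals, pairs

reals, pairs = decode(XBEST)
print("real design variable:", reals[0])
for promoter, transcript in pairs:
    print(f"{REPRESSOR_OF[promoter]} represses {transcript} (via {promoter})")
```

Under this assumed ordering the three active pairs form a ring of repressors (cI represses tetR, tetR represses lacI, lacI represses cI), and luxI is not used.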
[Figure (circuit diagram): legible labels are promoters P1, P2, P3; transcripts t2 (tetR), t3 (lacI), t4 (luxI); proteins LacI, cI; kf_pt_3 = 0.0132]
[Figure (time courses): panels P1, P2, P3, P1cI, P2tetR, P3lacI, cI, tetR, lacI, cIm, tetRm, lacIm; horizontal axis t (×10^4)]
Fig. 6 Dynamics of all the species involved in the oscillator obtained by SYNBADm
4.5 Simulating the Dynamics of a Circuit
If we want to simulate the dynamics of a circuit (the optimal circuit
found by SYNBADm or any other combination) we only have to fill in the
simulation options in the input file Oscillator_MAex1.m, including the
vector describing the circuit, inputs.simulate.var_circuit.
Importantly, if we choose a circuit other than the solution found, the
vector needs to preserve the same structure in terms of the number and
class (binary or real) of the entries. The simulation is carried out by
running:
>>SYNBAD_Simulate(Oscillator_MAex1)
Using the solution found, the dynamics of all the species involved are
plotted automatically, as shown in Fig. 6.
5.1 Definition of the Problem
The second example consists of finding circuits that behave as switches
upon stimulation by different inducers (starting from a library
containing different promoters, protein coding sequences, and inducers).
By switch-like performance we mean, as in [14], that the steady state
level of LacI is high upon aTc induction and low upon IPTG induction,
whereas the steady state level of tetR is low upon aTc induction and
high upon IPTG induction.
In order to ensure an optimal use of the cell resources, we take the
protein production cost into account as a second optimization objective.
We consider in this case Hill kinetics (we are interested only in the
dynamics of the proteins involved, see Note 2).
We have the following components available: five different promoters,
Plac1, Plac2, Pλ, Ptet, and ParaC, that we denote, respectively, by
P1 ... P5, and four protein coding sequences (tetR, lacI, cI, araC). To
generate the corresponding SYNBADm library we will use the built-in Hill
kinetics library as a template and modify it accordingly. The first two
promoters are both repressed by lacI but with different affinities.
5.2 Preparing the Library of Components
1. In the folder HL_Library (within USR_Libraries), we open the script
   HL_input_library that we are going to use as a template. Before
   making any modification, we save the script with a different name,
   HL_input_library_EX2.
2. Now, in the file HL_input_library_EX2 we fill in the fields of the
   structure library, as indicated in Fig. 7, where name_of_function
   contains a short name to identify the library files, promoters is a
   row cell array of strings containing the names of the promoters,
   transcripts is a row cell array of strings containing the names of
   the protein coding regions, prom_tf is the list of transcripts
   repressing each promoter, prom_nhill contains the Hill coefficients
   for each repressor-promoter pair, inducers is a row cell array of
   strings containing the names of the inducers, and ind_tr is a row
   cell array of strings containing the name of the transcript bound by
   each inducer.
3. In order to generate the library files we call the SYNBADm Hill
   kinetics library function:
library.name_of_function = 'HLex2';
library.promoters = {'Plac1','Plac2','Plambda','Ptet','ParaC'};  % list of promoters
library.transcripts = {'tetR','lacI','cI','araC'};               % list of protein coding regions
library.prom_tf = {'lacI','lacI','cI','tetR','araC'};            % transcript affecting each promoter
library.prom_nhill = [4,4,2,2,2];                                % Hill coefficients of repressor-promoter pairs
library.inducers = {'IPTG','aTc'};                               % list of inducers
library.ind_tr = {'lacI','tetR'};                                % transcript affected by each inducer
function k = HLex2_parameters_1

K_Plac1 = 10;
K_Plac2 = 0.01;
K_Plambda = 0.33;
K_Ptet = 0.014;
K_ParaC = 2.5;
alpha_tetR = 1.215;
alpha_lacI = 1.215;
alpha_cI = 2.92;
alpha_araC = 1.215;
kdeg_tetR = 0.0346;
kdeg_lacI = 0.0346;
kdeg_cI = 0.0693;
kdeg_araC = 0.0115;
kf_lacIIPTG = 0.05;
kf_tetRaTc = 0.05;
kb_lacIIPTG = 0.1;
kb_tetRaTc = 0.1;
kdeg_lacIIPTG = 0.0693;
kdeg_tetRaTc = 0.0693;

>>SYNBAD_Makelibrary_HL_C(HL_input_library_EX2)
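To give a feel for what the two lacI-repressed promoters with different affinities imply, the Python sketch below evaluates a standard Hill repression function with the values listed in the library (K_Plac1, K_Plac2, alpha_tetR, and a Hill coefficient of 4). The functional form alpha / (1 + (R/K)^n) is an assumption: the exact rate law implemented by SYNBADm is not given in the text, and the function name is hypothetical.

```python
def hill_repression(alpha, k_rep, n_hill, repressor):
    """Production rate from a repressed promoter (ASSUMED standard Hill form)."""
    return alpha / (1.0 + (repressor / k_rep) ** n_hill)

# Values copied from the Example 2 library and parameter listings.
ALPHA_TETR = 1.215
N_HILL_LAC = 4
K_PLAC1 = 10.0
K_PLAC2 = 0.01

# Compare tetR production from Plac1 and Plac2 over a range of lacI levels.
for laci in (0.0, 0.01, 1.0, 10.0):
    r1 = hill_repression(ALPHA_TETR, K_PLAC1, N_HILL_LAC, laci)
    r2 = hill_repression(ALPHA_TETR, K_PLAC2, N_HILL_LAC, laci)
    print(f"lacI = {laci:6.2f}   Plac1: {r1:.4g}   Plac2: {r2:.4g}")
```

With this form, each promoter produces at half its maximal rate when the repressor level equals its K value, so Plac2 (K = 0.01) is shut off at lacI levels that leave Plac1 (K = 10) essentially fully active.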
5.3 Defining the Objective Functions
The first objective function must encode the desired switch-like
behavior: namely, the steady state level of LacI must be high upon aTc
induction and low upon IPTG induction, whereas the steady state level of
tetR must be low upon aTc induction and high upon IPTG induction, as
defined in [14]. To ensure that we achieve the desired functionality
with minimal consumption of cell resources, we consider as a second
objective a proxy of the protein production cost, as defined elsewhere
[15]. Both objective functions can be defined by modifying the templates
OF_Switch and OF_Cost, respectively, which are available in SYNBADm in
the folder USR_ObjFuns. We save the corresponding functions as
OF1_Switch and OF2_Cost in USR_ObjFuns.
[Figure (Pareto front sketch): axis Objective 2, with bounds O2L and O2U; extreme points sol1 and sol2]
5.4 Solving the Multi-Objective Optimization Problem
SYNBADm solves bi-objective optimization problems using the
epsilon-constraint strategy [16]. First, we choose our objective 1 and
objective 2 and solve, for each of them, a single objective optimization
problem. In this way we obtain the extremes of the Pareto front, denoted
in Fig. 9 by sol1 and sol2. In this example, we choose the circuit
performance as our first objective and the protein production cost as
the second objective.
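The epsilon-constraint idea can be sketched in Python on a small, fully enumerable set of candidate designs. This is not the SYNBADm implementation, which solves a mixed integer dynamic optimization problem at each step; the candidate designs, their objective values, and the function names are hypothetical, and both objectives are minimized.

```python
# Hypothetical candidate designs: name -> (objective 1, objective 2).
DESIGNS = {
    "A": (-0.98, 2400.0),
    "B": (-0.95, 1900.0),
    "C": (-0.90, 2000.0),  # dominated by B
    "D": (-0.85, 1400.0),
    "E": (-0.76, 1050.0),
}

def minimize(designs, key, feasible=lambda f: True):
    """Return the name of the feasible design with the smallest key(f), or None."""
    candidates = [(key(f), name) for name, f in designs.items() if feasible(f)]
    return min(candidates)[1] if candidates else None

def epsilon_constraint(designs, n_eps=6):
    # Step 1: two single objective problems give the extremes of the front.
    sol1 = minimize(designs, key=lambda f: (f[0], f[1]))  # best objective 1
    sol2 = minimize(designs, key=lambda f: (f[1], f[0]))  # best objective 2
    o2_upper, o2_lower = designs[sol1][1], designs[sol2][1]
    # Step 2: minimize objective 1 subject to objective 2 <= eps,
    # sweeping eps between the two extreme values of objective 2.
    front = []
    for i in range(n_eps):
        eps = o2_lower + (o2_upper - o2_lower) * i / (n_eps - 1)
        name = minimize(designs, key=lambda f: (f[0], f[1]),
                        feasible=lambda f, eps=eps: f[1] <= eps)
        if name is not None and name not in front:
            front.append(name)
    return sol1, sol2, front

sol1, sol2, front = epsilon_constraint(DESIGNS)
print("extremes:", sol1, sol2)
print("Pareto front:", front)
```

The dominated design never appears in the result, because for every bound on objective 2 that admits it, another admissible design has a better objective 1.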
In order to solve the first single objective optimization problem, we
create the corresponding input file Switch_HLex2_OBJ1.m, as indicated in
Fig. 10, in the library USR_inputs. Once this file is created, we
execute:
>>SYNBAD_Design_SO(Switch_HLex2_OBJ1)
obtaining as a result the first extreme of the Pareto front (sol1 in
Fig. 9). We rename the mat file as sol1.mat for storage purposes. We
proceed in the same manner to solve the second single objective
optimization problem: in this case we create the input file
Switch_HLex2_OBJ2.m, as indicated in Fig. 10, in the library USR_inputs,
modifying only the objective, inputs.design.objective = {OF2_Cost}. Once
this file is created, we execute:
>>SYNBAD_Design_SO(Switch_HLex2_OBJ2)
to obtain the second extreme of the Pareto front (sol2 in Fig. 9), which
we store as sol2.mat. Now we are in a position to solve the bi-objective
optimization problem. In the library USR_inputs, we create the input
file Switch_HLex2.m (see Fig. 11). In this file we indicate the two
objectives to optimize, inputs.modesign.objective1 and
inputs.modesign.objective2. We also indicate the coordinates of the two
extreme points of the Pareto front obtained as solutions of the single
objective optimization problems, respectively,
inputs_modesign_min_objective_1 and inputs_modesign_min_objective_2.
Finally we need to
Fig. 10 SYNBADm design input file for example 2, single objective search for the extreme of the Pareto:
Switch_HLex2_OBJ1.m
6 Notes
[Figure (Pareto front): Objective 1 (Performance), axis from −1 to −0.75, versus Objective 2 (Cost), axis from 1000 to 2500; annotated with circuit diagrams (promoters P1, P2, P3, P5; proteins LacI, tetR; inducers IPTG, aTc) and Promoters (1 to 5) versus Repressors (1 to 4) matrices]
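\[
\begin{aligned}
& P_1 + T_1 \underset{k_u}{\overset{k_b}{\rightleftharpoons}} P_1T_1 \xrightarrow{k_{tb}} P_1 + mT_2 \\
& mT_2 \xrightarrow{k_r} mT_2 + T_2, \qquad mT_2 \xrightarrow{k_{dm}} \emptyset, \qquad T_2 \xrightarrow{k_d} \emptyset
\end{aligned}
\tag{7}
\]
where kbi, kui and kdi are the constants of binding, unbinding,
and degradation of the inducer complex, respectively.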
2. Reactions Associated with the Biological Devices in a SYNBADm
Library of Hill Type. The kinetic formalism is adapted from [14]
and further extended. Within this framework, the device
P1 T2, where P1 is a promoter negatively regulated by a protein
T1, has the following associated reactions:
\[
P_1 \xrightarrow{r_t} P_1 + T_2 \tag{8}
\]
\[
T_2 \xrightarrow{k_d} \emptyset \tag{9}
\]
\[
I + T_1 \underset{k_{ui1}}{\overset{k_{bi1}}{\rightleftharpoons}} IT_1 \xrightarrow{k_{di}} \emptyset
\]
where kbi, kui, and kdi are the constants of binding, unbinding,
and degradation of the inducer complex, respectively.
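As a worked illustration of reactions (8) and (9), suppose the production rate r_t is held constant, which is a simplifying assumption: in the Hill framework it depends on the level of the repressor T1. The protein level then relaxes exponentially to the ratio of production to degradation:

```latex
\[
\frac{dT_2}{dt} = r_t - k_d\,T_2
\quad\Longrightarrow\quad
T_2(t) = \frac{r_t}{k_d}\left(1 - e^{-k_d t}\right) + T_2(0)\,e^{-k_d t},
\qquad
T_2^{*} = \lim_{t\to\infty} T_2(t) = \frac{r_t}{k_d}.
\]
```

For instance, if the unrepressed rate is identified with alpha_tetR = 1.215 and k_d with kdeg_tetR = 0.0346 from the Example 2 parameter file (an identification assumed here for illustration), the steady state is approximately 1.215/0.0346 ≈ 35.1 in the units of that file, and the response half-time is ln 2 / k_d ≈ 20 time units.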
Funding: This research was funded by the Spanish Ministry of Sci-
ence, Innovation and Universities, project SYNBIOCONTROL (ref.
DPI2017-82896-C2-2-R).
References
1. Marchisio MA, Stelling J (2009) Computational design tools for synthetic biology. Curr Opin Biotechnol 20(4):479–485
2. Rodrigo G, Landrain TE, Shen S, Jaramillo A (2013) A new frontier in synthetic biology: automated design of small RNA devices in bacteria. Trends Genet 29(9):529–536
3. Nielsen AAK, Der BS, Shin J, Vaidyanathan P, Paralanov V, Strychalski EA, Ross D, Densmore D, Voigt CA (2016) Genetic circuit design automation. Science 352(6281):aac7341
4. Otero-Muras I, Henriques D, Banga JR (2016) Synbadm: a tool for optimization-based automated design of synthetic gene circuits. Bioinformatics 32(21):3360–3362
5. Xiang Y, Dalchau N, Wang B (2018) Scaling up genetic circuit design for cellular computing: advances and prospects. Nat Comput 17(4):833–853
6. Otero-Muras I, Banga JR (2017) Automated design framework for synthetic biology exploiting Pareto optimality. ACS Synth Biol 6(7):1180–1193
7. Egea JA, Marti R, Banga JR (2010) An evolutionary method for complex-process optimization. Comput Oper Res 37:315–324
8. Exler O, Schittkowski K (2007) A trust region SQP algorithm for mixed-integer nonlinear programming. Optim Lett 1(3):269–280
9. Exler O, Antelo LT, Egea JA, Alonso AA, Banga JR (2008) A Tabu search-based algorithm for mixed-integer nonlinear problems and its application to integrated process and control system design. Comput Chem Eng 32(8):1877–1891
10. Schlueter M, Egea JA, Banga JR (2009) Extended ant colony optimization for non-convex mixed integer nonlinear programming. Comput Oper Res 36(7):2217–2229
11. Hansen P, Mladenovic N, Moreno-Perez JA (2010) Variable neighbourhood search: methods and applications. Ann Oper Res 175(1):367–407
12. Pedersen M, Phillips A (2009) Towards programming languages for genetic engineering of living cells. J R Soc Interface 6:S437–S450
13. Otero-Muras I, Banga JR (2016) Design principles of biological oscillators through optimization: Forward and reverse analysis. PLoS ONE 11(12):e0166867
14. Dasika MS, Maranas CD (2008) Optcircuit: an optimization based method for computational design of genetic circuits. BMC Syst Biol 2:24
15. Szekely P, Sheftel H, Mayo A, Alon U (2013) Evolutionary tradeoffs between economy and effectiveness in biological homeostasis systems. PLoS Comput Biol 9(8):e1003163
16. Otero-Muras I, Banga JR (2014) Multicriteria global optimization for biocircuit design. BMC Syst Biol 8:113
Chapter 5
Abstract
Laboratory automation is a key enabling technology for genetic engineering that can lead to higher
throughput, more efficient and accurate experiments, better data management and analysis, decrease in
the DBT (Design, Build, and Test) cycle turnaround, increase of reproducibility, and savings in lab
resources. Choosing the correct framework among so many options available in terms of software,
hardware, and skills needed to operate them is crucial for the success of any automation project. This
chapter explores the multiple aspects to be considered for the solid development of a biofoundry project
including available software and hardware tools, resources, strategies, partnerships, and collaborations in
the field needed to speed up the translation of research results into solutions to important societal problems.
Key words Laboratory automation, Synthetic biology, Hardware, Software, Throughput, Machine
learning, Liquid handling, Metabolic engineering, Standardization, Reproducibility
1 Introduction
Filippo Menolascina (ed.), Synthetic Gene Circuits: Methods and Protocols, Methods in Molecular Biology, vol. 2229,
https://doi.org/10.1007/978-1-0716-1032-9_5, © Springer Science+Business Media, LLC, part of Springer Nature 2021
138 Marilene Pavan
communication and training are also key to get the team on board.
One should be prepared to explain the benefits in adopting the new
procedures and technologies, have a clear roadmap in hand, and
have the team fully trained and supported during operations.
2.1 Design
Software tools are available today to help with experimental design and
with data management and analysis. They are fundamental in guiding the
choice of strategy at both the abstract level (choosing the best genetic
candidates to be tested) and the practical level (the DNA assembly
strategy to be executed). Automated, well-informed design increases both
the number of designs that can be generated and the speed at which they
are generated, and it narrows down the design space by prioritizing the
best candidates to be built and tested, saving lab resources [20–24].
2.2 Build
Worklist-based instructions for liquid handlers are the most common
approach in automated facilities. Usually a .csv file is generated,
containing the instructions on where to aspirate liquids (DNA parts,
reagents, media, etc.) and where to dispense them. Ideally, the build
instructions should be linked to the design and sample inventory
pipelines. As an example of the advantage of having the build process
automated in the lab, DNA assembly can
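A worklist of the kind described above can be sketched in a few lines of Python. The column names, plate names, and the function `build_worklist` are hypothetical: each liquid handler vendor defines its own worklist format, so a real file must follow the vendor's specification.

```python
import csv
import io

# Hypothetical column layout; real formats are vendor specific.
HEADER = ["source_plate", "source_well", "dest_plate", "dest_well", "volume_ul"]

def build_worklist(transfers):
    """Serialize transfer rows (source, destination, volume in µL) as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(HEADER)
    for row in transfers:
        if row[4] <= 0:
            raise ValueError(f"non-positive volume in transfer {row}")
        writer.writerow(row)
    return buffer.getvalue()

# Example: combine two DNA parts and a master mix in one assembly well.
transfers = [
    ("parts", "A1", "assembly", "B2", 2.5),
    ("parts", "A2", "assembly", "B2", 2.5),
    ("reagents", "A1", "assembly", "B2", 5.0),
]
print(build_worklist(transfers))
```

Generating the rows from the design and inventory records, rather than typing them by hand, is what links the build step to the design and sample inventory pipelines.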
2.3 Test
The test tools and equipment should link the experiments performed in
the lab with the respective constructs created by the design tool and
assembled by the build pipeline. They also collect and store the data
generated while testing the synthetic genetic circuits. Fragment
analysis, sequencing, transcriptomics, fermentation, proteomics, and
metabolomics can usually be automated, though analytical techniques
remain among the most difficult steps to automate to date.
2.4 Learn
Machine learning (ML) [25, 26] is being used to interpret the data
generated by the test cycle. The vast amount of data generated by
automated experiments is better suited to computer algorithms than to
human minds. These algorithms can help to understand the behavior and
predictability of genetic circuits, narrowing down the space of genetic
circuits to be constructed, speeding up the whole process, and reducing
its cost. In a study published in 2019, Opgenorth and colleagues [13]
reported on the implementation of two DBT cycles to optimize 1-dodecanol
production from glucose using 60 engineered Escherichia coli MG1655
strains. They used the data produced in the first DBT cycle to train
several machine-learning algorithms and to suggest protein profiles for
the second DBT cycle that would increase production. These strategies
resulted in a 21% increase in dodecanol titer in Cycle 2.
Finally, together with the data presented above, the 10 questions below
should be asked when considering the need for automation: (1) Does the
lab need more reproducible and accurate results? (2) Would the
laboratory research line benefit from increasing the throughput of
experiments (more data generation leading to faster answers)? (3) Does
the lab staff have the time and resources to train the lab students and
employees, continuously, on automation? (4) Does the lab want, and have
the time, to work with the vendors to co-develop and adapt the protocols
and processes to an automated setup? (5) Is it important for lab
operations that students and employees no longer perform some repetitive
tasks (benefiting from walk-away time and avoiding injuries caused by
repetitive tasks)? (6) How will the costs associated with automation
(purchase, maintenance, consumables, training, co-developments) impact
the lab? (7) Does the lab need better documentation of the protocols
being performed and the results being generated? (8) Are the lab
protocols and experiments standard and frequent enough to benefit from
an automated procedure? (9) Would it be beneficial to achieve financial
stability in lab operations? (10) Does the lab have the required space
for an automated system?
Setting Up an Automated Biomanufacturing Lab 141
3 Strategy
4 Business Plan
Fig. 1 Edinburgh Genome Foundry. The foundry system has a complete, fully integrated automated system for
the design, build and test of hundreds of genetic variants per day
Fig. 2 The enclosed biofoundry at Lanzatech Inc. allows anaerobic microorganisms to be modified and tested
4.3 Partnerships
Though more and more companies are offering low cost, more affordable,
and accessible solutions in lab automation, partnerships with existing
public or private biofoundries [10, 12] might provide a good solution
for first-time users or for those already using automation in their lab
but in need of different protocols and
4.4 Education
Sometimes, the gains of having an automated lab setup are not as obvious
as scaling up and speeding up experiments, but equally important.
Through research being developed in an automated setup, a new generation
of scientists is being trained at the interface of systems biology,
synthetic biology, molecular biology, bioinformatics, strain
engineering, and metabolic engineering, along with hardware and software
engineering. These skill sets are critical in basic and applied R&D but
rarely acquired in traditional research programs [1].
4.5 System Maintenance and Personnel
Oftentimes, the use of grant funds to pay for system maintenance is not
permitted. In this case, check with the finance department of the
company or university for alternatives. Usually, the equipment comes
with a one-year warranty contract, and an extension can be negotiated.
It is important to involve the department responsible for purchases and
partnerships in those negotiations. Preventive maintenance and service
contracts, for the period after the warranty expires, should also be
negotiated. Finally, consider learning the basics of system maintenance,
so you do not have to rely heavily on maintenance services. Be also very
clear in the business plan on how you are planning to implement daily
procedures for cleaning and maintenance.
Personnel are often the most expensive part of an automation project,
because it requires highly specialized scientists in biotechnology and
software-related fields. Usually a scientist working on automation
projects learns a programming language, works well in a diverse team, is
a good problem solver, builds solid, long-term relationships with other
laboratories, vendors, and partners, can understand and adapt scientific
protocols to automated systems, helps with grant writing, and has the
discipline to maintain the system and document the research being
developed there.
5.3 DNA Assembly and Strain Development
Standardization and modularity bring the potential to make engineering
biology more predictable, while enabling miniaturization and automation
of DNA assembly methods. The first attempt in synthetic biology to
introduce some degree of standardization was the implementation of the
BioBrick standard, adopted by the iGEM competition [61]. Today, a number
of modular, highly efficient, automation-friendly tools and
methodologies are available for DNA construction, the most widely used
being Gibson Assembly [62]. This one-step, isothermal, scarless, in
vitro recombination approach utilizes exonuclease activity, DNA
polymerase activity, and DNA ligase activity to join DNA fragments with
appropriate overlaps. Another widely adopted tool, the Golden Gate
assembly [63] (and its variants such as Modular Cloning (MoClo) [64] and
BASIC [65], among others [66–68])
150 Marilene Pavan
5.4 Open Science
The more automation and throughput are introduced into the lab
routine, the more biological material exchange and collaboration
may be required. Exchanging biological material saves the time and
money otherwise spent resynthesizing and retesting existing,
well-characterized DNA parts and strains. Material Transfer
Agreements (MTAs) set out the legal requirements between
researchers: they define the terms and conditions for sharing
biological materials, ensure and respect the rights of the creators,
and promote safe practices and responsible research [70]. However,
the process of getting an agreement can be very bureaucratic and
time-consuming. Fortunately, there are initiatives such as the
OpenMTA (https://www.openplant.org/openmta/), which relaxes
restrictions on the redistribution and commercial use of
biomaterials, while supporting the practical realities of technology
transfer by being flexible enough to accommodate the needs of
different groups worldwide [70]. Widespread adoption of this system
is highly desirable, in order to accelerate and simplify the MTA
process.
Also, community labs such as BioBlaze (https://www.bioblaze.org/),
BioCurious (http://biocurious.org/), and GenSpace
(https://www.genspace.org/); material exchange initiatives such as
the OpenMTA and the Free Genes project
(https://biobricks.org/freegenes/); outreach initiatives such as the
Community Biotechnology Initiative (CBI) [71] and the iGEM
competition; and low-cost robots like the OpenTrons OT2
(https://opentrons.com/) are facilitating, promoting, and
democratizing access to automation.
6 Conclusion
Acknowledgments
References
1. Si T, Zhao H (2016) A brief overview of synthetic biology research programs and roadmap studies in the United States. Synth Syst Biotechnol 1(4):258–264
2. Khalil AS, Collins JJ (2010) Synthetic biology: applications come of age. Nat Rev Genet 11(5):367–379
3. Smanski MJ, Zhou H, Claesen J et al (2016) Synthetic biology to access and expand nature’s chemical diversity. Nat Rev Microbiol 14(3):135–149
4. Yeow JA, Ng PK, Tan KS et al (2014) Effects of stress, repetition, fatigue and work environment on human error in manufacturing industries. J Appl Sci 14(24):3464–3471. https://doi.org/10.3923/jas.2014.3464.3471
5. Chao R, Mishra S, Si T, Zhao H (2017) Engineering biological systems using automated biofoundries. Metab Eng 42:98–108
6. Studies L (2015) Industrialization of biology: a roadmap to accelerate the advanced manufacturing of chemicals
7. Nielsen J, Keasling JD (2016) Engineering cellular metabolism. Cell 164(6):1185–1197
8. Karim AS, Dudley QM, Jewett MC (2016) Cell-free synthetic systems for metabolic engineering and biosynthetic pathway prototyping. In: Wittmann C, Liao JC (eds) Industrial biotechnology. Wiley, Weinheim
9. Groth P, Cox J (2017) Indicators for the use of robotic labs in basic biomedical research: a literature analysis. PeerJ 5:e3997. https://doi.org/10.7717/peerj.3997
10. Hillson N, Caddick M, Cai Y et al (2019) Building a global alliance of biofoundries. Nat Commun 10:2040
11. Carbonell P, Jervis AJ, Robinson CJ et al (2018) An automated design-build-test-learn pipeline for enhanced microbial production of fine chemicals. Commun Biol 1:66. https://doi.org/10.1038/s42003-018-0076-9
12. Hayden EC (2014) The automated lab. Nature 516(7529):131–132
13. Opgenorth P, Costello Z, Okada T et al (2019) Lessons from two design-build-test-learn cycles of Dodecanol production in Escherichia coli aided by machine learning. ACS Synth Biol 8(6):1337–1351. https://doi.org/10.1021/acssynbio.9b00020
14. Olsen K (2012) The first 110 years of laboratory automation: technologies, applications, and the creative scientist. J Lab Autom 17(6):469–480. https://doi.org/10.1177/2211068212455631
15. Chapman T (2003) Lab automation and robotics: automation on the move. Nature 421(6923):661, 663, 665-6. https://doi.org/10.1038/421661a
16. Lundberg K (2012) Increase user adoption rates and realize a higher rate of return on your LIMS investment. GenomeWeb 1–8
17. Phillips P, Lithgow GJ, Driscoll M (2017) A long journey to reproducible results. Nature 548:387–388
18. Teytelman L (2018) No more excuses for non-reproducible methods. Nature 560(7719):411
19. Freedman LP, Cockburn IM, Simcoe TS (2015) The economics of reproducibility in preclinical research. PLoS Biol 13(6):e1002165. https://doi.org/10.1371/journal.pbio.1002165
20. Densmore DM, Bhatia S (2014) Bio-design automation: software + biology + robots. Trends Biotechnol 32:111–113
21. Hillson NJ, Rosengarten RD, Keasling JD (2012) J5 DNA assembly design automation software. ACS Synth Biol 1(1):14–21. https://doi.org/10.1021/sb2000116
22. Morrell WC, Birkel GW, Forrer M et al (2017) The experiment data depot: a web-based software tool for biological experimental data storage, sharing, and visualization. ACS Synth Biol 6(12):2248–2259. https://doi.org/10.1021/acssynbio.7b00204
23. Nielsen AAK, Der BS, Shin J et al (2016) Genetic circuit design automation. Science 352(6281):aac7341. https://doi.org/10.1126/science.aac7341
24. Appleton E, Densmore D, Madsen C, Roehner N (2017) Needs and opportunities in bio-design automation: four areas for focus. Curr Opin Chem Biol 40:111–118
25. Costello Z, Martin HG (2018) A machine learning approach to predict metabolic pathway dynamics from time-series multiomics data. NPJ Syst Biol Appl 4:19. https://doi.org/10.1038/s41540-018-0054-3
26. Jervis AJ, Carbonell P, Vinaixa M et al (2019) Machine learning of designed translational control allows predictive pathway optimization in Escherichia coli. ACS Synth Biol 8(1):127–136. https://doi.org/10.1021/acssynbio.8b00398
27. Hale AN (1999) 5 Building realistic automated production lines for genetic analysis. In: Craig AG, Hoheisel JD (eds) Methods in microbiology. Academic Press, San Diego
28. O’Sullivan B (2019) Points to consider when planning for lab automation projects. In: HighRes Bio. https://highresbio.com/blog/points-to-consider-when-planning-for-lab-automation-projects/
29. Opentrons (2019) Guide to choosing a lab automation platform. In: Opentrons. https://insights.opentrons.com/the-automated-pipetting-revolution-is-here
30. Butler JM (2012) New technologies and automation. In: Advanced topics in forensic DNA typing. Elsevier Academic Press, San Diego
31. Ham TS, Dmytriv Z, Plahar H et al (2012) Design, implementation and practice of JBEI-ICE: an open source biological part registry platform and tools. Nucleic Acids Res 40(18):e141. https://doi.org/10.1093/nar/gks531
32. Oberortner E, Cheng JF, Hillson NJ, Deutsch S (2017) Streamlining the design-to-build transition with build-optimization software tools. ACS Synth Biol 6(3):485–496. https://doi.org/10.1021/acssynbio.6b00200
33. Cohen L (2019) Writing your business plan. Nat Biotechnol 20(Suppl):BE33–BE35. https://doi.org/10.1038/nbt0602supp-BE33
34. Clark DP, Pazdernik NJ (2016) Synthetic biology: report to congress 2013. Biotechnology 419–445. https://doi.org/10.1016/B978-0-12-385015-7.00013-2
35. Chao R, Liang J, Tasan I et al (2017) Fully automated one-step synthesis of single-transcript TALEN pairs using a biological foundry. ACS Synth Biol 6:678–685. https://doi.org/10.1021/acssynbio.6b00293
36. NSF Broader impacts review criterion. https://www.nsf.gov/pubs/2007/nsf07046/nsf07046.jsp
37. Segal M (2019) An operating system for the biology lab. Nature 573(7775):S112–S113
38. Carbonell P, Radivojevic T, García Martín H (2019) Opportunities at the intersection of synthetic biology, machine learning, and automation. ACS Synth Biol 8(7):1474–1477. https://doi.org/10.1021/acssynbio.8b00540
39. Lima AN, Philot EA, Trossini GHG et al (2016) Use of machine learning approaches for novel drug discovery. Expert Opin Drug Discov 11(3):225–239
40. Kan A (2017) Machine learning applications in cell image analysis. Immunol Cell Biol 95(6):525–530
41. Ghosh A, Ando D, Gin J et al (2016) 13C metabolic flux analysis for systematic metabolic engineering of S. cerevisiae for overproduction of fatty acids. Front Bioeng Biotechnol 4:76. https://doi.org/10.3389/fbioe.2016.00076
42. Lawson CE, Harcombe WR, Hatzenpichler R et al (2019) Common principles and best practices for engineering microbiomes. Nat Rev Microbiol 17(12):725–741. https://doi.org/10.1038/s41579-019-0255-9
43. Kwon YC, Jewett MC (2015) High-throughput preparation methods of crude extract for robust cell-free protein synthesis. Sci Rep 5:8663. https://doi.org/10.1038/srep08663
44. Karim AS, Jewett MC (2018) Cell-free synthetic biology for pathway prototyping. Methods Enzymol 608:31–57
45. Perez JG, Stark JC, Jewett MC (2016) Cell-free synthetic biology: engineering beyond the cell. Cold Spring Harb Perspect Biol 8(12):a023853. https://doi.org/10.1101/cshperspect.a023853
46. Garamella J, Marshall R, Rustad M, Noireaux V (2016) The all E. coli TX-TL toolbox 2.0: a platform for cell-free synthetic biology. ACS Synth Biol 5(4):344–355. https://doi.org/10.1021/acssynbio.5b00296
47. Gregorio NE, Levine MZ, Oza JP (2019) A user’s guide to cell-free protein synthesis. Methods Protoc 2:24. https://doi.org/10.3390/mps2010024
48. Kay JE, Jewett MC (2015) Lysate of engineered Escherichia coli supports high-level conversion of glucose to 2,3-butanediol. Metab Eng 32:133–142. https://doi.org/10.1016/j.ymben.2015.09.015
49. Rustad M, Eastlund A, Marshall R et al (2017) Synthesis of infectious bacteriophages in an E. coli-based cell-free expression system. J Vis Exp (126):56144. https://doi.org/10.3791/56144
50. Dudley QM, Nash CJ, Jewett MC (2019) Cell-free biosynthesis of limonene using enzyme-enriched Escherichia coli lysates. Synth Biol 4(1):ysz003. https://doi.org/10.1093/synbio/ysz003
51. Schoborg JA, Clark LG, Choudhury A et al (2016) Yeast knockout library allows for efficient testing of genomic mutations for cell-free protein synthesis. Synth Syst Biotechnol 1:2–6. https://doi.org/10.1016/j.synbio.2016.02.004
52. Karim AS, Dudley QM, Juminaga A et al (2019) In vitro prototyping and rapid optimization of biosynthetic enzymes for cellular design. bioRxiv. https://doi.org/10.1101/685768
53. Moore SJ, MacDonald JT, Wienecke S et al (2018) Rapid acquisition and model-based analysis of cell-free transcription–translation reactions from nonmodel bacteria. Proc Natl Acad Sci U S A 115(19):E4340–E4349. https://doi.org/10.1073/pnas.1715806115
54. Huang A, Nguyen PQ, Stark JC et al (2018) Biobits™ explorer: a modular synthetic biology education kit. Sci Adv 4(8):eaat5105. https://doi.org/10.1126/sciadv.aat5105
55. Jewett MC, Forster AC (2010) Update on designing and building minimal cells. Curr Opin Biotechnol 21(5):697–703
56. Caschera F, Noireaux V (2016) Compartmentalization of an all-E. coli cell-free expression system for the construction of a minimal cell. Artif Life 22(2):185–195
57. Gulati S, Rouilly V, Niu X et al (2009) Opportunities for microfluidic technologies in synthetic biology. J R Soc Interface 6:S493–S506
58. Gach PC, Iwai K, Kim PW et al (2017) Droplet microfluidics for synthetic biology. Lab Chip 17:3388–3400
59. Gach PC, Shih SCC, Sustarich J et al (2016) A droplet microfluidic platform for automating genetic engineering. ACS Synth Biol 5(5):426–433. https://doi.org/10.1021/acssynbio.6b00011
60. Lashkaripour A, Rodriguez C, Ortiz L, Densmore D (2019) Performance tuning of microfluidic flow-focusing droplet generators. Lab Chip 19(6):1041–1053. https://doi.org/10.1039/C8LC01253A
61. Shetty RP, Endy D, Knight TF (2008) Engineering BioBrick vectors from BioBrick parts. J Biol Eng 2:5. https://doi.org/10.1186/1754-1611-2-5
62. Gibson DG, Young L, Chuang RY et al (2009) Enzymatic assembly of DNA molecules up to several hundred kilobases. Nat Methods 6(5):343–345. https://doi.org/10.1038/nmeth.1318
63. Engler C, Kandzia R, Marillonnet S (2008) A one pot, one step, precision cloning method with high throughput capability. PLoS One 3(11):e3647. https://doi.org/10.1371/journal.pone.0003647
64. Weber E, Engler C, Gruetzner R et al (2011) A modular cloning system for standardized assembly of multigene constructs. PLoS One 6(2):e16765. https://doi.org/10.1371/journal.pone.0016765
65. Storch M, Casini A, Mackrow B et al (2015) BASIC: a new biopart assembly standard for idempotent cloning provides accurate, single-tier DNA assembly for synthetic biology. ACS Synth Biol 4(7):781–787. https://doi.org/10.1021/sb500356d
66. Lai HE, Moore S, Polizzi K, Freemont P (2018) EcoFlex: a multifunctional moclo kit for E. coli synthetic biology. Methods Mol Biol 1772:429–444
67. Iverson SV, Haddock TL, Beal J, Densmore DM (2016) CIDAR MoClo: improved MoClo assembly standard and new E. coli part library enable rapid combinatorial design for synthetic and traditional biology. ACS Synth Biol 5(1):99–103. https://doi.org/10.1021/acssynbio.5b00124
68. Sarrion-Perdigones A, Falconi EE, Zandalinas SI et al (2011) GoldenBraid: an iterative cloning system for standardized assembly of reusable genetic modules. PLoS One 6(7):e21622. https://doi.org/10.1371/journal.pone.0021622
69. Casini A, Storch M, Baldwin GS, Ellis T (2015) Bricks and blueprints: methods and standards for DNA assembly. Nat Rev Mol Cell Biol 16(9):568–576
70. Kahl L, Molloy J, Patron N et al (2018) Opening options for material transfer. Nat Biotechnol 36(10):923–927
71. Kong DS, Thorsen TA, Babb J et al (2017) Open-source, community-driven microfluidics with metafluidics. Nat Biotechnol 35(6):523–529
72. Walsh DI, Pavan M, Ortiz L et al (2019) Standardizing automated DNA assembly: best practices, metrics, and protocols using robots. SLAS Technol 24(3):282–290. https://doi.org/10.1177/2472630318825335
73. Ortiz L, Pavan M, McCarthy L et al (2017) Automated robotic liquid handling assembly of modular DNA devices. J Vis Exp (130):54703. https://doi.org/10.3791/54703
74. Jessop-Fabre MM, Sonnenschein N (2019) Improving reproducibility in synthetic biology. Front Bioeng Biotechnol 7:18. https://doi.org/10.3389/fbioe.2019.00018
75. Beal J, Haddock-Angelli T, Baldwin G et al (2018) Quantification of bacterial fluorescence using independent calibrants. PLoS One 13(6):e0199432. https://doi.org/10.1371/journal.pone.0199432
76. Madsen C, McLaughlin JA, Misirli G et al (2016) The SBOL stack: a platform for storing, publishing, and sharing synthetic biology designs. ACS Synth Biol 5(6):487–497. https://doi.org/10.1021/acssynbio.5b00210
77. McLaughlin JA, Myers CJ, Zundel Z et al (2018) SynBioHub: a standards-enabled design repository for synthetic biology. ACS Synth Biol 7(2):682–688. https://doi.org/10.1021/acssynbio.7b00403
78. Bioeconomy G (2019) A research roadmap for the next-generation bioeconomy
79. Clarke LJ, Kitney RI (2016) Synthetic biology in the UK – an outline of plans and progress. Synth Syst Biotechnol 1(4):243–257
Chapter 6
Abstract
Type-2S restriction enzymes allow the routine assembly of large batches of synthetic constructs from
individual genetic parts. However, design flaws in the part sequence can cause assembly failures, incurring
troubleshooting costs and project delays. As a result, the careful design and checking of the assembly plan is
often a bottleneck of large assembly projects, and may require computational support. This chapter
demonstrates the use of two free and open-source web applications accelerating this task by automating
genetic part design and simulating type-2S cloning to detect potential assembly issues.
1 Introduction
Filippo Menolascina (ed.), Synthetic Gene Circuits: Methods and Protocols, Methods in Molecular Biology, vol. 2229,
https://doi.org/10.1007/978-1-0716-1032-9_6, © Springer Science+Business Media, LLC, part of Springer Nature 2021
158 Valentin Zulkower
2.1 Manual Standardization of a Genetic Part (Outline)
Here, we detail the steps involved in standardizing a Green
Fluorescent Protein (GFP) sequence for use at position “p9” of the
EMMA assembly standard. The resulting part makes it possible to
express other proteins (placed at position “p7”) with a downstream
GFP fusion, connected through a peptide chain at position p8.
1. Obtain a GFP-encoding nucleotide sequence, e.g., from the
NCBI website (https://www.ncbi.nlm.nih.gov/nuccore/L29345.1).
2. Open the sequence in the editor of your choice. Sequence files
in text or FASTA format can be opened in any text editor, while
files in Genbank format require specialized software such as
Benchling (https://www.benchling.com/) or Snapgene
(https://www.snapgene.com).
3. Add the two-nucleotide sequence “CA” at the beginning of the
sequence to make it compatible with position p9 of the EMMA
standard. This dimer completes position p9’s left overhang GCGT
to form the assembly scar GCGTCA, which encodes a short
Alanine–Serine peptide chain. Omitting “CA” results in an
out-of-frame GFP sequence and a biologically dysfunctional protein.
4. Add GCGT and TGCT on the left and right sides of the
sequence, respectively. These will be the sequences of the
Computer-Aided Design and Pre-validation of Large Batches of DNA Assemblies 159
Fig. 1 Input and output of the web-based part standardization application. (a) User-created spreadsheet
defining the assembly standard to follow. (b) Screenshot of the web application showing the web form in its
entirety. (c) Sample files from the part standardization report: PDF summary of the report (front) and Fasta file
listing all standardized sequences for ordering from a DNA synthesis company
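The manual edits outlined in Subheading 2.1 (adding the “CA” dimer, then the GCGT and TGCT flanks) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the short placeholder coding sequence are hypothetical, the only values taken from the text are the dimer and the two flanking sequences, and any further elements required by the standard are deliberately left out.

```python
# Values quoted in Subheading 2.1 for EMMA position p9.
LEFT_OVERHANG = "GCGT"
RIGHT_OVERHANG = "TGCT"
FRAME_DIMER = "CA"

# Only the two codons of the expected scar are needed for this check.
SCAR_CODONS = {"GCG": "A", "TCA": "S"}


def standardize_p9(coding_sequence):
    """Prepend the frame-restoring dimer, then add both flanking sequences."""
    core = FRAME_DIMER + coding_sequence.upper()
    return LEFT_OVERHANG + core + RIGHT_OVERHANG


def scar_peptide(standardized):
    """Translate the first six nucleotides (the assembly scar)."""
    scar = standardized[:6]
    return "".join(SCAR_CODONS.get(scar[i:i + 3], "?") for i in (0, 3))


# Hypothetical placeholder: a short in-frame sequence starting with ATG.
part = standardize_p9("atgagtaaaggagaa")
print(part)                # GCGTCAATGAGTAAAGGAGAATGCT
print(scar_peptide(part))  # AS
```

The overhang plus the dimer span six nucleotides, a multiple of three, which is why the downstream coding sequence stays in frame.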
2.3 Protecting some Part Regions against Modifications
1. Before adding the part’s Genbank file to the zip archive, open
it in a sequence editor, for instance, the free software Snapgene
Viewer (snapgene.com) or Benchling (benchling.com).
2. Locate a sequence region that should be protected against
modifications and add an annotation at this location (the
method for doing so varies from one sequence editor to
another). The Genbank type of the annotation should be
“misc_feature,” and the label of the annotation should be either
“@keep” (to forbid any mutation in the region) or “@cds”
(to allow codon juggling only, i.e., mutations that do not
change the translated protein sequence). Note that many
more design constraints are available, as listed in the
documentation of the underlying sequence optimizer, DNA Chisel
(https://edinburgh-genome-foundry.github.io/DnaChisel).
3. Save the resulting Genbank record to a file (e.g., “p9_GFP.
gb”) and add the file to the zip archive.
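For readers who prefer to script the annotation rather than use a sequence editor, the sketch below generates the two feature-table lines that such an annotation corresponds to in a Genbank flat file. It is an illustration only: the function name and the example coordinates are hypothetical, the column layout follows the usual Genbank feature table (feature key at column 6, location and qualifiers at column 22), and writing the label as a quoted qualifier is an assumption about what downstream tools accept.

```python
def protection_feature(start, end, label):
    """Return Genbank feature-table lines for a protected region.

    start and end are 1-based, inclusive positions, as in Genbank files.
    """
    if label not in ("@keep", "@cds"):
        raise ValueError("label must be '@keep' or '@cds'")
    if not 1 <= start <= end:
        raise ValueError("expected 1 <= start <= end")
    # Feature key starts at column 6; the location starts at column 22.
    key_line = " " * 5 + "misc_feature".ljust(16) + f"{start}..{end}"
    # Qualifiers are indented to column 22.
    qualifier_line = " " * 21 + f'/label="{label}"'
    return [key_line, qualifier_line]


# Hypothetical coordinates for a coding region to protect.
for line in protection_feature(7, 723, "@cds"):
    print(line)
```

The returned lines would be placed in the FEATURES block of the record, before the ORIGIN line.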
2.4 Using the Web Application
1. With the web browser of your choice (we recommend a
recent version of Google Chrome or Firefox), connect to the
application at the following address:
https://cuba.genomefoundry.org/domesticate_part_batches.
2. The application consists of a one-page form shown in Fig. 1b.
In the rest of this protocol, letters in parentheses (a), (b), etc.
refer to annotations in this figure.
3. Enter the name of the assembly standard used in (a). This
information is optional and is only used for reference in
the report produced by the application.
4. Drag and drop the Standard Definition File in the upload box
(b).
5. Drag and drop the Sequences Files in the upload box (c).
6. When using Genbank records, if the name of each part
is provided in the file name (e.g., “p9_GFP.gb”) rather than in
the Genbank metadata, set the selection menu in (d) to the
“Use file names as parts IDs” option.
7. Tick the checkbox (e) to allow sequence edits. If the box is left
unticked and some of the provided sequences cannot be
standardized without sequence modifications, the standardization
of these parts will fail, and the failures will be flagged in the
resulting report with an indication for troubleshooting. If the
box is ticked, make sure that sensitive elements have been
protected as explained in the previous section.
8. Click on the “Domesticate” button (f) to start the
standardization of the parts. This will take a few seconds to a
few minutes depending on the number of parts to process (a
progress bar will be displayed).
3.1 Preparing the Necessary Data Files
1. Gather the sequences of all parts involved in the assembly. The
sequences could be spread across different Fasta and Genbank
files, but for practicality, we recommend a single Fasta file or a
zip file containing the sequences as separate files in the
Genbank format, each file named after the part sequence it
provides.
2. Specify the assembly plan by creating a spreadsheet on the
model of Fig. 2a. Save the spreadsheet in Excel format (.xls or
.xlsx) or CSV format (.csv). The resulting file will be referred to
as the Assembly Plan Spreadsheet in the rest of this section.
3. The web application offers the possibility to omit connector
parts in the Assembly Plan Spreadsheet, and instead have the
necessary connectors for each construct automatically selected.
This requires gathering the sequences of all available connector
parts, as a single Fasta file or a zip file containing the
sequences as separate files in the Genbank format, each file
named after the connector part it provides. These file(s) will be
referred to as Connector Sequences in this section.
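When the assembly plan is generated by a script rather than typed by hand, the two inputs described above can be produced with the Python standard library. The sketch below is a minimal example under assumptions: construct and part names are hypothetical, each CSV row holds the construct name followed by its parts (as described in the caption of Fig. 2), and whether the application expects a header row is not stated in the text, so none is written.

```python
import csv
import io
import zipfile


def assembly_plan_csv(plan):
    """Serialize {construct_name: [part_name, ...]} as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for construct, parts in plan.items():
        # One row per construct: its name, then the parts it is made of.
        writer.writerow([construct, *parts])
    return buffer.getvalue()


def parts_archive(records):
    """Zip {part_name: genbank_text} into bytes, one '<part_name>.gb' file each."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for part_name, genbank_text in records.items():
            archive.writestr(f"{part_name}.gb", genbank_text)
    return buffer.getvalue()


# Hypothetical construct and part names, for illustration only.
plan = {
    "construct_1": ["receptor_vector", "promoter_A", "gene_X"],
    "construct_2": ["receptor_vector", "promoter_B", "gene_X"],
}
print(assembly_plan_csv(plan))
```

Naming each archive member after its part keeps file names and part names in the plan consistent, which is what the protocol relies on.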
3.2 Using the Web Application
1. Connect to the following application with the web browser of
your choice: https://cuba.genomefoundry.org/simulate_gg_assemblies.
2. The application consists of a one-page form shown in Fig. 2b.
In the rest of this protocol, letters in parentheses (a), (b), etc.
refer to annotations in this figure.
3. Select the enzyme to be used for the assembly (a). Options are
BsaI, BsmBI (this option is also suitable for other type-2S
BsmBI isoschizomers such as Esp3I), BbsI, or the default
option “Autoselect,” which will attempt to guess the intended
enzyme based on the presence of recognition sites in the part
sequences of each assembly.
4. Drag and drop all sequence files in the upload box (b).
5. Tick the checkbox “Provide a list of assemblies” and drag the
Assembly Plan Spreadsheet into the box that appears (c). Note
that this step can be skipped if the assembly plan consists of a
single assembly.
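To give an intuition for what an “Autoselect” option might do, the sketch below guesses the enzyme from the recognition sites found in the parts. It is not the application's actual algorithm, which the text does not describe: the selection rule (keep only enzymes with a site in every part, then take the one with the most sites) and the function names are assumptions, and the toy part sequences are hypothetical. Sites are searched on both strands of linear sequences only.

```python
# Recognition sites of the three type-2S enzymes offered by the form.
RECOGNITION_SITES = {"BsaI": "GGTCTC", "BsmBI": "CGTCTC", "BbsI": "GAAGAC"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    return sequence.translate(COMPLEMENT)[::-1]


def count_sites(sequence, site):
    """Count occurrences of a site on both strands of a linear sequence."""
    sequence = sequence.upper()
    return sequence.count(site) + sequence.count(reverse_complement(site))


def guess_enzyme(part_sequences):
    """Return the enzyme with most sites among those present in every part."""
    best, best_total = None, 0
    for enzyme, site in RECOGNITION_SITES.items():
        counts = [count_sites(seq, site) for seq in part_sequences]
        if all(counts) and sum(counts) > best_total:
            best, best_total = enzyme, sum(counts)
    return best  # None when no enzyme has a site in every part


# Hypothetical toy parts, each flanked by a forward and a reverse BsaI site.
parts = ["AAGGTCTCATTTTGAGACCAA", "CCGGTCTCTACGTAGAGACCGG"]
print(guess_enzyme(parts))  # BsaI
```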
Fig. 2 Input and output of the web-based cloning simulation application. (a) Screenshot of the web application
showing the web form in its entirety. (b) User-created spreadsheet specifying the assembly plan. Each line
starts with the name of the construct to be assembled, followed by the list of parts in each assembly. (c)
Organization of the Cloning Simulation Report file. (d) Schema of the Genbank record of “Construct 1” as
predicted by the application from the assembly plan of panel B. (e) Part connection schema for “Construct 2.”
The circularity of the schema indicates that the parts will indeed assemble properly into a circular plasmid
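The circularity criterion mentioned in the caption of Fig. 2e can be illustrated with a small graph check: each part is reduced to its left and right overhangs, and the assembly is valid only if following matching overhangs visits every part once and returns to the starting part. This is a simplified sketch, not the simulation performed by the application; the part names and four-nucleotide overhangs below are hypothetical, and strand orientation and palindromic overhangs are ignored.

```python
def assembly_order(parts):
    """Return the circular part order, or None if the parts do not form one ring.

    parts maps each part name to a (left_overhang, right_overhang) tuple.
    """
    if not parts:
        return None
    by_left = {}
    for name, (left, _right) in parts.items():
        if left in by_left:       # two parts compete for the same overhang
            return None
        by_left[left] = name
    start = next(iter(parts))
    order, current = [], start
    while current not in order:
        order.append(current)
        right = parts[current][1]
        if right not in by_left:  # dangling end: no part can ligate here
            return None
        current = by_left[right]
    # A valid plasmid closes on the first part and uses every part.
    if current == start and len(order) == len(parts):
        return order
    return None


# Hypothetical parts and overhangs, for illustration only.
parts = {
    "vector": ("CGCT", "GGAG"),
    "promoter": ("GGAG", "AATG"),
    "gene": ("AATG", "GCTT"),
    "terminator": ("GCTT", "CGCT"),
}
print(assembly_order(parts))
```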
References
1. Kosuri S, Church GM (2014) Large-scale de system for standardized assembly of multigene
novo DNA synthesis: technologies and applica- constructs. PLoS One 6(2):e16765. https://
tions. Nat Methods 11(5):499–507. https:// doi.org/10.1371/journal.pone.0016765
doi.org/10.1038/nmeth.2918 7. Guo Y, Dong J, Zhou T, Auxillos J, Li T,
2. Chao R, Mishra S, Si T, Zhao H (2017) Engi- Zhang W et al (2015) YeastFab: the design
neering biological systems using automated and construction of standard biological parts
biofoundries. Metab Eng 42:98–108. https:// for metabolic engineering in Saccharomyces
doi.org/10.1016/j.ymben.2017.06.003 cerevisiae. Nucleic Acids Res 43(13):e88.
3. Engler C, Kandzia R, Marillonnet S (2008) A https://doi.org/10.1093/nar/gkv464
one pot, one step, precision cloning method 8. Martella A, Matjusaitis M, Auxillos J, Pollard
with high throughput capability. PLoS One 3 SM, Cai Y (2017) EMMA: an extensible mam-
(11):e3647. https://doi.org/10.1371/jour malian modular assembly toolkit for the rapid
nal.pone.0003647 design and production of diverse expression
4. Tsuge K, Sato Y, Kobayashi Y, Gondo M, vectors. ACS Synth Biol 6(7):1380–1392.
Hasebe M, Togashi T et al (2015) Method of https://doi.org/10.1021/acssynbio.7b00016
preparing an equimolar DNA mixture for 9. Richardson SM, Wheelan SJ, Yarrington RM,
one-step DNA assembly of over 50 fragments. Boeke JD (2006) GeneDesign: rapid, auto-
Sci Rep 5:10655. https://doi.org/10.1038/ mated design of multikilobase synthetic genes.
srep10655 Genome Res 16:550–556. https://doi.org/
5. Lin D, O’Callaghan CA (2018) MetClo: 10.1101/gr.4431306
Methylase-assisted hierarchical DNA assembly 10. Pereira F, Azevedo F, Carvalho Â, Ribeiro GF,
using a single type IIS restriction enzyme. Budde MW, Johansson B (2015) Pydna: a sim-
Nucleic Acids Res 46:e113. https://doi.org/ ulation and documentation tool for DNA
10.1093/nar/gky596 assembly strategies using python. BMC Bioin-
6. Weber E, Engler C, Gruetzner R, Werner S, formatics 16(1):142. https://doi.org/10.
Marillonnet S (2011) A modular cloning 1186/s12859-015-0544-x
Chapter 7
Abstract
Restriction digest analysis and Sanger sequencing are among the most commonly used techniques to check
the sequence of synthetic DNA constructs. However, both require careful preparation to select restriction
enzymes or DNA primers adapted to the expected constructs sequences. In projects involving
manufacturing of large batches of synthetic constructs, the task can be tedious and error-prone. This
chapter demonstrates the use of two free and open-source web applications providing fast and automated
selection of enzymes and sequencing primers for DNA construct verification.
Key words Computer-aided manufacturing, DNA assembly, DNA verification, Sanger sequencing,
Restriction digest analysis, Synthetic Biology
1 Introduction
Filippo Menolascina (ed.), Synthetic Gene Circuits: Methods and Protocols, Methods in Molecular Biology, vol. 2229,
https://doi.org/10.1007/978-1-0716-1032-9_7, © Springer Science+Business Media, LLC, part of Springer Nature 2021
168 Valentin Zulkower
would make it possible to prepare a single digestion mix for the
project, greatly simplifying the protocol and saving reagents. If
need be, the objective could be relaxed to finding a pair of digests
such that any construct in the batch can be digested using one of
the two options. This section describes a software solution that
automates the search for such optimal enzymes.
2.1 Using the Web Application
1. With the web browser of your choice (we recommend Google
Chrome or Firefox), connect to the application at the following
address: https://cuba.genomefoundry.org/domesticate_part_batches.
The application consists of a simple one-page form
shown in Fig. 1a. In the rest of this section, letters in
parentheses (a), (b), etc. refer to annotations in this figure.
2. Make sure that the selection box in (a) indicates “Good
patterns for all constructs.”
3. Choose the ideal range for the number of bands in a band
profile (b). Fewer than three bands are generally considered too
generic for a good screen, and more than eight bands
generally result in crowded band patterns.
4. Drag and drop the sequences of all constructs, in Genbank or
Fasta format, in the upload box (c).
5. Tick the checkbox in (d) to indicate that the sequences are
circular plasmids.
6. Indicate which ladder will be used among the options proposed
in (e), or the ladder with the closest range if it does not appear
in the options.
7. Enter all available enzymes as a comma-separated list in the text
box (f). Note that a few pre-set enzyme lists are available in the
selection on top of the text box.
8. Choose the maximum number of enzymes in a given digest, as
well as the maximum number of digests accepted to verify the
batch. For instance, when asking for 3–6 bands via a single
digest with 2 enzymes suitable for a batch of 10 constructs, the
application will return a digestion plan as shown in Fig. 1b,
relying on a mix of enzymes AseI and EcoRI. When asking for
4–6 bands and 2 possible digests for the same constructs, the
application will return a digestion plan consisting of AseI
+EcoRI, complemented this time by a SphI+XhoI digest (as shown
in Fig. 1c). Notice how this second digest provides
4-band patterns for constructs C5, C6, and C8, for which
digest AseI+EcoRI only produced three bands. As a
consequence, each construct in the batch can be verified using
either AseI+EcoRI or SphI+XhoI.
9. Optionally, tick the boxes in (h) for the application to return
detailed plots showing, for each digest, which regions of a
construct correspond to the different bands in the band profile
(Fig. 1d).
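The criterion that the form applies to each construct (a number of bands within the chosen range, for a circular plasmid) can be sketched as follows. This is a simplified illustration and not the application's implementation: the function names are hypothetical, the toy plasmid is artificial, only the four enzymes named in step 8 are included, and the start of each recognition site is used as a proxy for the exact cut position, so band sizes from mixed digests are accurate to within a few nucleotides only.

```python
# Recognition sites of the enzymes named in step 8; all four are palindromic.
SITES = {"AseI": "ATTAAT", "EcoRI": "GAATTC", "SphI": "GCATGC", "XhoI": "CTCGAG"}


def site_positions(sequence, site):
    """Start positions of a site in a circular sequence (sites may span the origin)."""
    sequence = sequence.upper()
    extended = sequence + sequence[: len(site) - 1]
    return [i for i in range(len(sequence)) if extended.startswith(site, i)]


def band_sizes(sequence, enzymes):
    """Approximate fragment sizes for a digest of a circular plasmid, largest first."""
    cuts = sorted({p for e in enzymes for p in site_positions(sequence, SITES[e])})
    if not cuts:
        return [len(sequence)]  # uncut plasmid
    sizes = [b - a for a, b in zip(cuts, cuts[1:])]
    sizes.append(len(sequence) - cuts[-1] + cuts[0])  # fragment spanning the origin
    return sorted(sizes, reverse=True)


def good_pattern(sequence, enzymes, min_bands=3, max_bands=8):
    """True when the digest yields a band count within the requested range."""
    return min_bands <= len(band_sizes(sequence, enzymes)) <= max_bands


# Artificial 600-bp plasmid with EcoRI sites at 0 and 300 and an AseI site at 100.
plasmid = "GAATTC" + "A" * 94 + "ATTAAT" + "C" * 194 + "GAATTC" + "G" * 294
print(band_sizes(plasmid, ["AseI", "EcoRI"]))    # [300, 200, 100]
print(good_pattern(plasmid, ["AseI", "EcoRI"]))  # True
```

Searching for a digest that suits a whole batch then amounts to keeping the enzyme combinations for which this check passes on every construct.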
Fig. 1 Form and output of the web-based enzyme selection application. (a) Screenshot of the web application
showing the web form in its entirety. (b) Example plot returned by the application for a batch of 10 constructs.
The selected enzymes are indicated on the left. (c) Plot returned by the application in complement to the one in
panel B when the user requires two different digests. (d) Example of construct map returned by the application
(here construct C1, with construct features blurred as not relevant for this chapter). Yellow features indicate
the construct regions corresponding to the different bands (labeled a, b, c, d) of the digestion pattern
3.1 Preparing the Necessary Data Files
1. Prepare a zip archive containing the expected sequence of all
constructs in the batch as separate Genbank files (referred to as
the Constructs Sequences Archive in the rest of this section).
The name of each Genbank file should reflect the construct’s
name. If only part of these sequences should be covered, refer
to the instructions below.
2. Optionally, prepare a Fasta file gathering the sequences of all
primers already available to you. This file will be referred to as
the Primers Sequences File in the rest of this section.
3.2 Indicating Regions to Cover and Primer-Free Regions
The full Sanger sequencing of a 10-kb plasmid typically requires
20 reads to ensure 2× coverage (where each nucleotide is read
twice). Consequently, sequencing a hundred plasmids will require
two thousand reads, and possibly hundreds of different primers,
making it costly and logistically challenging. To reduce the
complexity of the sequencing plan, one may want to restrict sequencing
to some regions of interest. For instance, when assembling several
genetic parts into a receptor vector, the sequencing of the receptor
region may be deemed unnecessary. One may also decide to only
sequence regions at the junctions between successive genetic parts,
as these locations may be more prone to assembly artifacts. Moreover,
one may want to avoid using a primer annealing at these
junctions, as the primer may not be able to anneal in case of artifacts
at this location, leading to no read at all. The following steps show
how to specify regions to cover and prevent primers at certain
Fig. 2 Input and output of the web-based primer selection application. (a) Schematic representation of an
assembly’s Genbank record, with part junctions annotated to indicate that these regions in particular should
be covered by sequencing, and should not be an annealing location for primers. (b) Screenshot of the web
application showing the web form in its entirety. (c) Example output schema showing the sequencing plan for
2 constructs. Short red triangles indicate primer annealing locations, blue features indicate Sanger reads from
newly designed primers, and purple features indicate Sanger reads using available primers
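The read-count arithmetic given in Subheading 3.2 can be checked with a short calculation. This is a rough sketch only: the 1000-bp usable read length is an assumption inferred from the figure of 20 reads for a 10-kb plasmid at twofold coverage, and the function name is hypothetical.

```python
import math


def reads_needed(length_bp, coverage=2, read_length_bp=1000):
    """Estimate the number of Sanger reads needed to cover a construct.

    read_length_bp is an assumed usable read length; check the actual
    value with the sequencing laboratory.
    """
    return math.ceil(length_bp * coverage / read_length_bp)


per_plasmid = reads_needed(10_000)   # one 10-kb plasmid, twofold coverage
batch_total = 100 * per_plasmid      # a batch of one hundred such plasmids
print(per_plasmid, batch_total)
```

Restricting the covered regions reduces the effective length passed to such an estimate, which is why region annotations can shrink a sequencing plan considerably.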
3.3 Using the Web Application
1. With the web browser of your choice, connect to the application
at the following address: https://cuba.genomefoundry.org/select_primers.
2. The application consists of a simple one-page form shown in
Fig. 2b. In the rest of this section, letters in parentheses (a), (b),
etc. refer to annotations in this figure.
3. Make sure the validation type is set to “Sanger sequencing” (a).
4. In selection box (b), indicate whether the primers should produce
reads on the 3′–5′ strand, the 5′–3′ strand, or both, the latter
corresponding to 2× coverage where each nucleotide is read
once from each direction.
5. Drag the Constructs Sequences Archive in the upload box (c).
6. Tick the box (d) to indicate that the constructs to validate are
circular.
7. Optionally, drag the Primers Sequences File in the upload box (e).
8. Specify the expected read size (f) and the target annealing
temperature of the primers (g). The default values provided
are typical, but these parameters may slightly vary depending
on protocol details, and must be checked with the sequencing
laboratory.
9. Specify the number of digits used in the name formatting for
new primers (h). For instance, a value of 3 will result in primer
names of the form P001, P002, etc. Name collisions with
existing primers specified in the Primers Sequences File will
be automatically avoided.
10. Click on “Select primers” (i) to launch the primer selection,
which may take a few minutes (progress bars will be displayed).
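The naming scheme described in step 9 can be illustrated as follows. This is a sketch under stated assumptions: the prefix "P", the function name, and the strategy of skipping numbers that are already taken are illustrative guesses, since the text only states that collisions with existing primers are avoided.

```python
def new_primer_names(n_new, existing_names, digits=3, prefix="P"):
    """Generate n_new zero-padded primer names that avoid existing names.

    Assumption: a collision is resolved by moving to the next free number.
    """
    taken = set(existing_names)
    names = []
    index = 1
    while len(names) < n_new:
        candidate = f"{prefix}{index:0{digits}d}"
        if candidate not in taken:
            names.append(candidate)
        index += 1
    return names


# Two primers already exist in the (hypothetical) Primers Sequences File.
print(new_primer_names(3, ["P001", "P003"]))
```

With a value of 3 for the number of digits, up to 999 primers share the same name width, which keeps names sortable in spreadsheets and ordering forms.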
References
1. Casini A, Storch M, Baldwin GS, Ellis T (2015) 4. Sanger F, Coulson AR (1975) A rapid method
Bricks and blueprints: methods and standards for determining sequences in DNA by primed
for DNA assembly. Nat Rev Mol Cell Biol 16 synthesis with DNA polymerase. J Mol Biol 94
(9):568–576. https://doi.org/10.1038/ (3):441–448. https://doi.org/10.1016/0022-
nrm4014 2836(75)90213-2
2. Potapov V, Ong JL, Kucera RB, Langhorst BW, 5. Dharmadi Y, Patel K, Shapland E, Hollis D,
Bilotti K, Pryor JM et al (2018) Comprehensive Slaby T, Klinkner N et al (2014) High-
profiling of four base overhang ligation fidelity throughput, cost-effective verification of struc-
by T4 DNA ligase and application to DNA tural DNA assembly. Nucleic Acids Res 42(4):
assembly. ACS Synth Biol 7(11):2665–2674. e22. https://doi.org/10.1093/nar/gkt1088
https://doi.org/10.1021/acssynbio.8b00333 6. Hancock JM, Zvelebil MJ, Hancock JM (2004).
3. Shapland EB, Holmes V, Reeves CD, Sorokin E, PRIMER3. In: Dictionary of bioinformatics and
Durot M, Platt D et al (2015) Low-cost, high- computational biology. https://doi.org/10.
throughput sequencing of DNA assemblies 1002/9780471650126.dob0560.pub2
using a highly multiplexed Nextera process.
ACS Synth Biol 4(7):860–866. https://doi.
org/10.1021/sb500362n
Chapter 8
Abstract
Synthetic genetic circuits are composed of many parts that must interact and function together to produce a
desired pattern of gene expression. A challenge when assembling circuits is that genetic parts often behave
differently within a circuit, potentially impacting the desired functionality. Existing debugging methods
based on fluorescent reporter proteins allow for only a few internal states to be monitored simultaneously,
making diagnosis of the root cause impossible for large systems. Here, we present a tool called the Genetic
Analyzer which uses RNA sequencing data to simultaneously characterize all transcriptional parts (e.g.,
promoters and terminators) and devices (e.g., sensors and logic gates) in complex genetic circuits. This
provides a complete picture of the inner workings of a genetic circuit enabling faults to be easily identified
and fixed. We construct a complete workflow to coordinate the execution of the various data processing and
analysis steps and explain the options available when adapting these for the characterization of new systems.
Key words Genetic circuits, Genetic parts, Characterization, Biometrology, RNA-seq, Synthetic
biology
1 Introduction
Filippo Menolascina (ed.), Synthetic Gene Circuits: Methods and Protocols, Methods in Molecular Biology, vol. 2229,
https://doi.org/10.1007/978-1-0716-1032-9_8, © Springer Science+Business Media, LLC, part of Springer Nature 2021
176 Deepti Vipin et al.
Fig. 1 Overview of the workflow. Major analyses shown in boxes with dependencies and flows between
analyses shown by arrows. Dashed arrows denote input and output files to each step with “S” denoting a
prefix that would be replaced with a specific sample name. [Flowchart not reproduced. Recoverable labels: input
files data/bed/S.bed, data/fasta/S.fasta, data/fastq/S.fastq, data/gff/S.gff, and data/settings.txt; stages DATA
PRE-PROCESSING and CHARACTERIZATION; scripts shown include 00_setup.sh, 01_map_reads.sh,
03_fragment_distributions.sh, 04_read_analysis.sh, 06_de_analysis.sh, and 09_clean_up.sh]
Fig. 2 Method for characterizing promoter and terminator parts. For both types of part, a small region of the
transcription profile before and after each part is used to estimate changes in RNA polymerase (RNAP) flux
[14, 19]. For promoters, a sharp increase in RNAP flux occurs at the transcription start site (TSS) and the
absolute change in RNAP flux from before to after δJ captures the promoter strength. For terminators, a drop in
RNAP flux occurs at the transcription termination site (TTS) as RNAP physically unbinds from the DNA. As this is
a stochastic process, the fractional drop in RNAP flux across the part is related to the termination efficiency Te
(i.e., percentage of RNAP that terminate). Genetic design shown with Synthetic Biology Open Language Visual
(SBOL Visual) symbols [34] and produced using DNAplotlib [35, 36]
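The two quantities described in the caption of Fig. 2 can be sketched numerically. The code below is an illustrative simplification and not the Genetic Analyzer's implementation: the window size, the indexing convention, the function names, and the formula Te = 1 − J_after/J_before are assumptions made for this example, and the profile is toy data.

```python
def mean_flux(profile, start, end):
    """Average transcription-profile height over positions [start, end)."""
    window = profile[start:end]
    return sum(window) / len(window)


def promoter_strength(profile, tss, window=10):
    """Absolute change in RNAP flux (delta J) across a transcription start site."""
    before = mean_flux(profile, tss - window, tss)
    after = mean_flux(profile, tss, tss + window)
    return after - before


def termination_efficiency(profile, tts, window=10):
    """Fraction of RNAP flux lost across a termination site (assumed formula)."""
    before = mean_flux(profile, tts - window, tts)
    after = mean_flux(profile, tts, tts + window)
    return 1.0 - after / before


# Toy profile: low background, a promoter at position 20, a terminator at 40.
profile = [1.0] * 20 + [5.0] * 20 + [0.5] * 20
print(promoter_strength(profile, tss=20))       # change in flux at the promoter
print(termination_efficiency(profile, tts=40))  # fraction of flux terminated
```

Real profiles are noisy near part boundaries, so in practice the windows may need to be offset from the TSS or TTS rather than placed immediately adjacent to it.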
Fig. 3 Method for quantifying response function of genetic devices. (a) Sensors are characterized by the
activity δJ of an output promoter Pout in the presence (+) and absence (−) of an inducer molecule, or another
environmental factor [19]. (b) Genetic gates, such as a NOT-gate, are characterized by the relationship
between the total RNAP flux acting as input to the gate Jin and the activity δJout of the output promoter Pout
[19]. Steady-state measurements across a range of input combinations of a circuit can then be used to fit a
response function (e.g., Hill equation) for the device [12]. The response functions of each device are shown on
the right of each panel, and the transcription profiles and the RNAP flux measurements used to calculate these
are shown to the left
2 Materials
2.1 Software Dependencies
The Genetic Analyzer requires that the following software tools and
packages are installed and accessible from a command prompt. In
most cases, newer versions of the software should be compatible.
However, if issues are encountered, we recommend using the precise
versions listed below.
(a) Python version 2.7.9 [27]—we recommend using a packaged
Python distribution such as Anaconda (www.continuum.io)
or Enthought (www.enthought.com).
(b) R version 3.2.1 [28].
(c) edgeR version 3.8.6 [29].
(d) BWA version 0.7.4 [30].
(e) SAMtools version 1.4 [31].
(f) HTSeq version 0.9.1 [32].
(g) Git version 2.21.0.
2.2 Installation of the Genetic Analyzer
1. The Genetic Analyzer forms part of a collection of tools for
analyzing sequencing data. A copy of the latest Genetic Analyzer
can be downloaded by running the following command:
2.3 Sequencing Data
The Genetic Analyzer assumes that sequencing data will be
provided in a standardized form to allow for automated processing
[19]. In particular, it requires paired-end RNA-seq data, with
FASTQ files provided for read 1 and read 2 of each fragment. We
recommend preparing strand-specific RNA-seq
libraries [26] to allow for the multiplexing of multiple samples
during a single run, and sequencing these libraries using an Illumina
sequencer (e.g., HiSeq 2500). It is essential that a sufficient
number of reads are generated per sample to allow for accurate
quantification of genetic parts and devices. Although the precise
number is dependent on the size of the host genome and synthetic
genetic constructs present, for Escherichia coli cells, we find that
approximately four million reads per sample are sufficient for accurate
measurements from large genetic circuits [19].
3 Methods
3.1 Initial Workflow Setup
1. The first step is the creation of a new workflow to store all
sequencing data, metadata about the host system and synthetic
genetic circuits being studied, and the generated results. To
create a new workflow, it is advised that a copy of the “circuit
example” directory is made and renamed as appropriate.
Because the workflow relies on the specific location of certain
files, keeping the same directory structure within a workflow is
essential. Once a new workflow directory has been created, a
number of key files must be edited and added within the “data”
directory. We recommend editing and renaming the examples
provided to ensure that correct file formats are maintained.
Table 1
Custom workflow feature types and parameters for GFF files
3.2 Data Preprocessing
1. Once a complete workflow is set up, the raw RNA sequencing
data for each sample need to be mapped to the reference
sequences. This is performed by the “01_map_reads.sh” script,
which calls the “map_reads.py” script to coordinate the SAMtools
[31] and BWA [30] software for each sample. This script
should be edited to include entries for each sample present in
the “data/settings.txt” file.
2. The mapping of sequencing reads is then performed by run-
ning the command:
sh 01_map_reads.sh
3. This creates BAM files [31] for each sample in the “tmp”
directory.
4. The next step is to generate read counts for each gene feature in
the GFF file of the system being studied. This is used when
calculating differential gene expression in Subheading 3.4. This
process is performed by the “02_count_reads.sh” script which
calls the “count_reads.py” script for each sample. This script
should be edited to include entries for each sample present in
the “data/settings.txt” file.
5. Read counts for each gene are then calculated by running the
command:
sh 02_count_reads.sh
9. The script will create output files in the “results” directory for
each sample containing the fragment length distributions.
Genetic Analyzer Tool 183
12. The script will create five output files in the “results” directory:
“norm.factors.matrix.txt” containing TMM between-sample
normalization factors, “mapped.reads.matrix.txt” containing
mapped read counts, “count.matrix.txt” containing
read counts for each gene, “gene.lengths.matrix.txt” containing
the length of each gene, and “fpkm.normed.matrix.txt”
containing normalized FPKM expression values for each gene.
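The relationship between the quantities stored in these output files can be sketched with the standard FPKM definition. This is a hedged illustration rather than the workflow's own code: the function name is hypothetical, and scaling the library size by the TMM factor follows the common edgeR convention of an effective library size, which is assumed here.

```python
def fpkm(count, gene_length_bp, mapped_reads, norm_factor=1.0):
    """Fragments per kilobase of transcript per million mapped fragments.

    count          -- fragments assigned to the gene
    gene_length_bp -- gene length in base pairs
    mapped_reads   -- total mapped fragments in the sample
    norm_factor    -- between-sample (e.g., TMM) factor; 1.0 means no correction
    """
    effective_library = mapped_reads * norm_factor
    return count * 1e9 / (gene_length_bp * effective_library)


# Toy values: 500 fragments on a 1-kb gene in a sample of four million fragments.
print(fpkm(500, 1000, 4_000_000))
print(fpkm(500, 1000, 4_000_000, norm_factor=1.25))
```

A normalization factor above 1 lowers the reported expression, compensating for samples in which a few highly expressed genes would otherwise distort the comparison.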
3.3 Generating Transcription Profiles
1. Once the RNA-seq data have been preprocessed, the next step
is to generate transcription profiles for specified regions of the
host genome, as well as any synthetic genetic constructs that
might be contained on plasmids. This process is performed by
the “05_transcription_profiles.sh” script which should be edited
such that calls to the “transcription_profile.py” script are
made for each sample. Chromosomes for which profiles should
be created are specified with the “-chroms” option.
2. Transcription profiles are then created by running the
command:
sh 05_transcription_profiles.sh
3.4 Analyzing Differential Gene Expression to Understand the Host Response
1. Synthetic circuits can impart a significant burden on a host cell
which is often manifested by changes in gene expression.
Differential gene expression analysis allows for shifts in expression
to be quantified in a robust manner, correcting for potential
between-sample variations due to differences in sequencing
3. The script will create output files in the “results” directory for
each analysis performed. These will be named in the format
“PREFIX.de.analysis.txt” where PREFIX is replaced by the
user provided “-output_prefix” in the “06_de_analysis.sh”
script.
3. The script will create two output files in the “results” directory:
“promoter.profile.perf.txt” containing estimates of promoter
strengths and “terminator.profile.perf.txt” containing termina-
tor efficiencies calculated from the transcription profiles (see
Fig. 2).
devices, it is essential that the samples taken span the full range
of possible inputs the system may be exposed to. This ensures
that inputs vary over their full range and improves the fitting of a
response function. In this workflow, we allow for genetic
devices that have activating and repressing Hill-like response
functions. The fitting of the response function to experimental
data is performed by the “08_promoter_fitting.sh” script. This
calls the “promoter_fitting.py” script for each set of samples
corresponding to a particular condition. The script will need to
be updated for the samples to be processed. If, for example, you
have assayed a circuit in two separate types of growth medium,
then the samples for one medium should be fitted separately from
those for the other. This will, therefore, require two calls to the
“promoter_fitting.py” script with the appropriate samples given as
arguments to the “-samples” option.
2. Genetic device characterization is performed by running the
command:
sh 08_promoter_fitting.sh
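To make the notion of activating and repressing Hill-like response functions concrete, the sketch below evaluates both forms and recovers parameters from noise-free toy data. It does not reproduce the method used by “promoter_fitting.py”: the parameterization (ymin, ymax, K, n), the function names, and the grid search used as a stand-in for a proper fitting routine are all assumptions made for this example.

```python
import itertools
import math


def hill_repressor(x, ymin, ymax, K, n):
    """Output falls from ymax to ymin as the input x increases."""
    return ymin + (ymax - ymin) * K**n / (K**n + x**n)


def hill_activator(x, ymin, ymax, K, n):
    """Output rises from ymin to ymax as the input x increases."""
    return ymin + (ymax - ymin) * x**n / (K**n + x**n)


def fit_hill(xs, ys, model, ymin, ymax, K_grid, n_grid):
    """Pick the (K, n) pair from a grid that minimizes log-space squared error.

    ymin and ymax are taken as known here to keep the sketch short.
    """
    def sse(params):
        K, n = params
        return sum((math.log(model(x, ymin, ymax, K, n)) - math.log(y)) ** 2
                   for x, y in zip(xs, ys))
    return min(itertools.product(K_grid, n_grid), key=sse)


# Noise-free toy NOT-gate data spanning the full input range.
inputs = [0.01, 0.1, 0.3, 1.0, 3.0, 10.0, 100.0]
outputs = [hill_repressor(x, 0.1, 10.0, 1.0, 2) for x in inputs]
best_K, best_n = fit_hill(inputs, outputs, hill_repressor, 0.1, 10.0,
                          K_grid=[0.25, 0.5, 1.0, 2.0, 4.0], n_grid=[1, 2, 3, 4])
print(best_K, best_n)
```

If the toy inputs covered only the saturated part of the curve, many (K, n) pairs would fit almost equally well, which illustrates why samples need to span the full input range.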
3.7 Removing Temporary Files and Logs
1. Once the complete workflow has been run and all required
analysis performed, a clean-up step can be used to remove all
temporary files and logs. This will ensure that any generated
results remain untouched but will not allow for intermediate
steps to be rerun out of order (some of the temporary files are
necessary for many of the analyses).
2. The clean-up step is performed by the “09_clean_up.sh” script.
Before running, this file must be updated to include entries to
delete all contents from the “tmp” and “logs” directories
(including any sub-directories). Once edited, the script can be
executed using:
sh 09_clean_up.sh
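The effect of the clean-up step can be sketched in Python as follows. This is an illustrative equivalent and not the content of “09_clean_up.sh”: the function name is hypothetical, and the sketch assumes the “tmp” and “logs” directories sit directly inside the workflow directory.

```python
import shutil
from pathlib import Path


def clean_workflow(workflow_dir, subdirs=("tmp", "logs")):
    """Delete everything inside the given subdirectories, keeping the
    directories themselves and leaving all other folders untouched."""
    removed = []
    for name in subdirs:
        directory = Path(workflow_dir) / name
        if not directory.is_dir():
            continue  # nothing to clean for this entry
        for entry in directory.iterdir():
            if entry.is_dir() and not entry.is_symlink():
                shutil.rmtree(entry)   # remove sub-directories recursively
            else:
                entry.unlink()         # remove files and symbolic links
            removed.append(entry)
    return removed
```

Calling `clean_workflow("my_workflow")`, where the directory name is a placeholder, empties the temporary and log folders while the “results” and “data” directories remain in place.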
4 Notes
Acknowledgments
References
1. Greco FV, Tarnowski MJ, Gorochowski TE (2019) Living computers powered by biochemistry. Biochemist 41:14–18
2. Brophy JAN, Voigt CA (2014) Principles of genetic circuit design. Nat Methods 11:508
3. Kosuri S et al (2013) Composability of regulatory sequences controlling transcription and translation in Escherichia coli. Proc Natl Acad Sci U S A 110(34):14024
4. Mutalik VK et al (2013) Precise and reliable gene expression via standard transcription and translation initiation elements. Nat Methods 10:354
5. Schmidl SR et al (2019) Rewiring bacterial two-component systems by modular DNA-binding domain swapping. Nat Chem Biol 15(7):690–698
6. Scott SR, Hasty J (2016) Quorum sensing communication modules for microbial consortia. ACS Synth Biol 5(9):969–977
7. Gorochowski TE, Avcilar-Kucukgoze I, Bovenberg RAL, Roubos JA, Ignatova Z (2016) A minimal model of ribosome allocation dynamics captures trade-offs in expression between endogenous and synthetic genes. ACS Synth Biol 5(7):710–720
8. Gyorgy A et al (2015) Isocost lines describe the cellular economy of genetic circuits. Biophys J 109(3):639–646
9. Qian Y, Huang H-H, Jiménez JI, Del Vecchio D (2017) Resource competition shapes the response of genetic circuits. ACS Synth Biol 6(7):1263–1272
10. Cardinale S, Arkin AP (2012) Contextualizing context for synthetic biology – identifying causes of failure of synthetic biological systems. Biotechnol J 7(7):856–866
11. Nielsen AAK et al (2016) Genetic circuit design automation. Science 352(6281):aac7341
12. Stanton BC, Nielsen AAK, Tamsir A, Clancy K, Peterson T, Voigt CA (2014) Genomic mining of prokaryotic repressors for orthogonal logic gates. Nat Chem Biol 10(2):99–105
13. Woodruff LBA et al (2016) Registry in a tube: multiplexed pools of retrievable parts for genetic design space exploration. Nucleic Acids Res 45(3):1553–1565
14. Canton B, Labno A, Endy D (2008) Refinement and standardization of synthetic biological parts and devices. Nat Biotechnol 26:787
15. Kelly JR et al (2009) Measuring the activity of BioBrick promoters using an in vivo reference standard. J Biol Eng 3(1):4
16. Kleeman B et al (2018) A guide to choosing fluorescent protein combinations for flow cytometric analysis based on spectral overlap. Cytometry A 93(5):556–562
17. Goodwin S, McPherson JD, McCombie WR (2016) Coming of age: ten years of next-generation sequencing technologies. Nat Rev Genet 17:333
18. Stark R, Grzelak M, Hadfield J (2019) RNA sequencing: the teenage years. Nat Rev Genet 20(11):631–656
19. Gorochowski TE et al (2017) Genetic circuit characterization and debugging using RNA-seq. Mol Syst Biol 13(11):952
20. Ingolia NT (2014) Ribosome profiling: new views of translation, from single codons to genome scale. Nat Rev Genet 15:205
21. Gorochowski TE, Chelysheva I, Eriksen M, Nair P, Pedersen S, Ignatova Z (2019) Absolute quantification of translational regulation and burden using combined sequencing approaches. Mol Syst Biol 15(5):e8719
22. Park PJ (2009) ChIP–seq: advantages and challenges of a maturing technology. Nat Rev Genet 10(10):669–680
23. Del Campo C, Bartholomäus A, Fedyunin I, Ignatova Z (2015) Secondary structure across the bacterial transcriptome reveals versatile roles in mRNA regulation and function. PLoS Genet 11(10):e1005613
24. Strobel EJ, Yu AM, Lucks JB (2018) High-throughput determination of RNA structures. Nat Rev Genet 19(10):615–634
25. Conway T et al (2014) Unprecedented high-resolution view of bacterial operon architecture revealed by RNA sequencing. MBio 5(4):e01442–e01414
26. Shishkin AA et al (2015) Simultaneous generation of many RNA-seq libraries in a single reaction. Nat Methods 12:323
27. Sanner MF (1999) Python: a programming language for software integration and development. J Mol Graph Model 17(1):57–61
28. R Core Team (2013) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna
29. Robinson MD, McCarthy DJ, Smyth GK (2009) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26(1):139–140
30. Li H, Durbin R (2009) Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25(14):1754–1760
31. Li H et al (2009) The sequence alignment/map format and SAMtools. Bioinformatics 25(16):2078–2079
32. Anders S, Pyl PT, Huber W (2014) HTSeq – a Python framework to work with high-throughput sequencing data. Bioinformatics 31(2):166–169
33. Quinlan AR, Hall IM (2010) BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26(6):841–842
34. Beal J et al (2019) Communicating structure and function in synthetic biology diagrams. ACS Synth Biol 8(8):1818–1825
35. Der BS et al (2017) DNAplotlib: programmable visualization of genetic designs and associated data. ACS Synth Biol 6(7):1115–1119
36. Bartoli V, Dixon DOR, Gorochowski TE (2018) Automated visualization of genetic designs using DNAplotlib. In: Braman JC (ed) Synthetic biology: methods and protocols. Springer New York, New York, NY, pp 399–409
Chapter 9
Abstract
Cell-free synthetic biology offers an approach to building and testing gene circuits in a simplified environ-
ment free from the complexity of a living cell. Recent advances in microfluidic devices allowed cell-free
reactions to run under nonequilibrium, steady-state conditions enabling the implementation of dynamic
gene regulatory circuits in vitro. In this chapter, we present a detailed protocol to fabricate a microfluidic
chemostat device which enables such an operation, detailing essential steps in photolithography, soft
lithography, and hardware setup.
1 Introduction
Filippo Menolascina (ed.), Synthetic Gene Circuits: Methods and Protocols, Methods in Molecular Biology, vol. 2229,
https://doi.org/10.1007/978-1-0716-1032-9_9, © Springer Science+Business Media, LLC, part of Springer Nature 2021
190 Nadanai Laohakunakorn et al.
gene networks. Cell-free systems are thus well suited for rational,
bottom-up engineering of biomolecular systems [3, 4]. Further-
more, the functionality of cell-free systems can be expanded by
inclusion of additional components [5], and provide a system for
quantitative analysis including mRNA and protein concentrations
[6, 7]. A second key benefit is that their ease of preparation and
scalability also accelerate design-build-test cycles, resulting in their
adoption as an efficient rapid prototyping platform. Both lysate
[8, 9] and recombinant [10] cell-free reaction systems can now be
readily generated using standard laboratory equipment at reason-
ably low costs.
Microfluidics have allowed these benefits of cell-free synthetic
biology to be more fully realized [11]. By increasing the through-
put, lowering reagent consumption and providing control and
quantitative monitoring of thousands of reactions in parallel, they
have enabled precise characterization of cell-free gene circuits both
in integrated chips [12, 23] and in encapsulated droplets
[13, 14].
Batch cell-free reactions typically run to chemical equilibrium
as substrates are exhausted, reaction products accumulate, and
enzymatic machinery degrades. To maintain a more life-like non-
equilibrium steady state, large-scale continuous exchange or con-
tinuous flow reactors have been used to feed the reaction with small
molecules and wash away products through ultrafiltration mem-
branes [15]. At the microfluidic level, microchemostat devices have
been developed which replenish not only substrates but also the
enzymatic machinery, while at the same time diluting away reaction
products [16, 17]. These microchemostats enable long-term
steady-state reactions, and also allow for the investigation of bio-
logically relevant dynamical behaviors such as oscillations [16, 17]
and pattern formation [18].
In this chapter, we describe the entire process of designing,
fabricating, and operating a microfluidic chemostat device. The
chip we chose as an example is a revised and simplified version of
the microchemostat presented in Niederholtmeyer et al. 2013 [16],
and is shown in Fig. 1.
The operation of the device first involves selecting an input
solution using the multiplexer unit, which is directed to one of
eight separate reactor rings. Each reactor contains four output
ports, located at specific positions around the ring. Opening these
ports exchanges a fixed fraction of the reactor volume, with the
exact fraction depending on the position of the port. The place-
ment of these ports allows the reactor to be loaded with a reaction
of fixed composition, and importantly also allows a dilution step to
occur which preserves this composition. In between dilution steps,
Steady-State Cell-Free Gene Expression 191
Fig. 1 (a) A two-layer microchemostat design consists of a thin control layer sandwiched between a glass slide
and a thicker flow layer. (b) Applying pressure to channels in the control layer pushes up valves which close off
channels in the flow layer. (c) The chip contains eight individual chemostat reactors. Four control lines serve
as dual-function valves and peristaltic pump. Actuating these lines sequentially mixes the liquid inside the
reactors
the reaction is mixed using a peristaltic pump. Full details are given
in Subheading 3.5.
We describe the photolithographic steps required to print the
chip design on a chrome mask, and subsequently transfer it onto
silicon wafers. Once fabricated, these silicon molds can be used for
multiple rounds of soft lithography where they are used to cast
polydimethylsiloxane (PDMS) devices. Finally, the hardware
required for operating the chip is described, and a standard experi-
ment outlined. Related protocols are available in the literature
[19, 20].
2 Materials
3 Methods
3.1 Design of Microfluidic Devices
1. Design the device (see Note 2) in AutoCAD 2019 or other
software with similar functionality. A specific example is shown
in Fig. 1, and other designs are available on our webpage (see
Note 3). Export the final design as a .dxf file.
2. Using CleWin, convert the designs to a machine-compatible
.cif file ready for photomask fabrication.
3. During curing, the PDMS layers will differentially shrink, with
the thicker flow layer shrinking more than the thinner control
layer, which remains attached to the rigid mold. Thus, it is
crucial to enlarge the entire flow layer design by 1.5%. This
can be done in CleWin during the conversion.
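The effect of the 1.5% enlargement in step 3 on design coordinates can be illustrated with a short calculation. This is only a sketch of the geometry: in practice the scaling is applied in CleWin, and the function name, the choice of scaling origin, and the example coordinates are hypothetical.

```python
def scale_points(points, percent=1.5, origin=(0.0, 0.0)):
    """Enlarge 2-D design coordinates by a percentage about a fixed origin."""
    factor = 1.0 + percent / 100.0
    ox, oy = origin
    return [(ox + (x - ox) * factor, oy + (y - oy) * factor)
            for x, y in points]


# Toy flow-layer feature (coordinates in micrometers, made up for illustration).
feature = [(0.0, 0.0), (1000.0, 0.0), (1000.0, 2000.0)]
for x, y in scale_points(feature):
    print(round(x, 3), round(y, 3))
```

Scaling about the chip center rather than a corner keeps the two layers registered at the middle of the design, so any residual mismatch is split evenly between the edges.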
3.2 Photolithography for Mask and Wafer Fabrication
3.2.1 Mask Fabrication
1. Expose chrome masks with the VPG200 laser writer, using a
20 mm write lens (see Note 4) and 48% intensity. Make sure
that the polarity and mirroring of the mask are correct (see
Note 5).
2. Next, process the exposed masks using the HMR900 mask
processor. This involves the following automated steps:
3. First purge the machine with deionized (DI) water.
4. Then develop for 100 s with a diluted developer mixture
(AZ 351B:DI water in the ratio 1:3.75) and rinse with DI
water.
5. Etch through the chrome layer for 60 s using the Cr01 etchant,
and rinse.
6. Finally, strip the photoresist using the AZ 400K developer for
35 s, followed by a final rinse and drying with CO2. The
completed masks should be completely dry before use.
3.2.2 Flow Mold Fabrication
1. Prime a clean Si wafer with hexamethyldisilazane (HMDS) (see
Note 6) for 10 s in vacuum, using the VB20 hotplate.
2. Transfer the wafer onto the Optispin SB20 spin coater and
dispense a few mL of positive-resist AZ9260 onto the center
of the wafer, taking care to avoid bubbles (see Notes 7 and 8).
3. Spin coat at 920 rpm for 100 s, followed by 60 s relaxation at
0 rpm. This deposits a 14-μm layer of photoresist on the surface
of the wafer.
3.2.3 Control Mold Fabrication
1. Clean the Si wafer with 2.45 GHz O2 plasma in the Tepla
300 Plasma Stripper, using 500 W for 7 min and 400 mL/min
of O2.
2. Transfer the wafer onto the LSM250 spin coater and dispense a
few mL of negative resist GM1070-SU8 onto the center of the
wafer, taking care to avoid bubbles.
3. Spin coat a 40-μm layer of photoresist onto the wafer using the
following program: 5 s/0–500 rpm, 5 s/500 rpm, 21 s/
500–1933 rpm, 40 s/1933 rpm, 1 s/1933–2933 rpm, 1 s/
2933–1933 rpm, 5 s/1933 rpm, and 26 s/1933–0 rpm.
4. When the spin coating has finished, immediately transfer the
wafer to the hotplate and carry out an initial relaxation followed
by a softbake using the following program (see Note 14):
30 min at 30 °C, then 3000 s ramp 30 °C to 130 °C, 300 s at
130 °C, and then 3000 s ramp 130 °C to 30 °C.
5. Load the appropriate chrome mask onto the MJB4 mask
aligner and expose for 1 cycle at 16 s, using the Hg-i line
(365 nm) at 20 mW/cm2. Use the following parameters:
expose type = soft, alignment gap = 30, WEC type = cont,
N2 purge = NO, and WEC-offset = OFF.
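The spin-coating program in step 3 is written as a sequence of duration/speed segments, where a speed range denotes a ramp. The sketch below encodes that sequence to compute the total run time and the speed at any moment. It assumes linear ramps, which the text does not state, and the names used are hypothetical.

```python
# Each segment: (duration in s, start speed in rpm, end speed in rpm),
# transcribed from the program listed in step 3.
SU8_PROGRAM = [
    (5, 0, 500), (5, 500, 500), (21, 500, 1933), (40, 1933, 1933),
    (1, 1933, 2933), (1, 2933, 1933), (5, 1933, 1933), (26, 1933, 0),
]


def total_duration(program):
    """Total run time of the program in seconds."""
    return sum(duration for duration, _, _ in program)


def rpm_at(program, t):
    """Spin speed at time t (s), assuming linear ramps within each segment."""
    elapsed = 0.0
    for duration, start, end in program:
        if t <= elapsed + duration:
            fraction = (t - elapsed) / duration
            return start + (end - start) * fraction
        elapsed += duration
    return float(program[-1][2])  # after the program ends


print(total_duration(SU8_PROGRAM), "s in total")
print(rpm_at(SU8_PROGRAM, 50), "rpm during the main plateau")
```

Writing the program as data in this way makes it straightforward to compare recipes for different resists or target thicknesses before entering them on the spin coater.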
3.3 Soft Lithography for Device Fabrication
3.3.1 Silanization of Wafers
1. Before first use, place wafers inside a sealed box with a few drops
(0.5 mL) of trimethylchlorosilane and incubate for at least
12 h. Before each subsequent use, repeat the silanization for 10 min.
3.3.2 Casting and Curing of PDMS Devices
1. In two plastic cups, weigh out and add PDMS elastomer and
curing agent in a ratio of 5:1 (50 g:10 g) for the flow layer and
20:1 (20 g:1 g) for the control layer.
2. Mix and defoam each mixture using the ARE-250 centrifugal mixer,
mixing at 2000 rpm for 1 min followed by defoaming at
2200 rpm for 2 min. These values correspond to machine
settings specific to the ARE-250, which is not a standard centrifuge
but a “planetary” mixer, i.e., the samples spin on a platform
which itself revolves around a central axis.
3. Clean both flow and control wafers using pressurised N2.
4. Put the flow layer wafer on aluminium foil inside a glass petri
dish. Make sure the foil covers the dish and contains the PDMS
fully. Pour all of the 5:1 PDMS mixture on top of the wafer and
place the dish inside a vacuum desiccator for 40 min to degas
the mixture.
5. Put the control layer wafer in the SCS G3P-8 spin coater, and
carefully pour a few mL of the 20:1 PDMS onto the center of
the wafer. To coat the wafer, run the following program: Step
0, rpm = 0, disp = 2, ramp = 0.0, dwell = 0; Step
1, rpm = 1420, disp = none, ramp = 20.0, dwell = 35; Step
Fig. 2 Pneumatic connections for the setup. The compressed air supply is split into two independently
regulated branches. Pressure in the control branch is switched using electric valves while the flow branch is
controlled manually. Buffers and other input solutions are stored in Tygon tubing, while cell-free (TX-TL)
reagents are stored in FEP–PEEK tubing
3.4 Hardware Setup
Air pressure is supplied to the setup using polyethylene (PE) tubing
connected directly to the laboratory compressed air supply. A schematic
of the setup’s pneumatic connections is shown in Fig. 2.
3.4.1 Regulation of Control Layer Pressure
1. Connect one branch of the input air supply to a regulator, and direct the regulated output supply to the aluminium electric manifold.
2. The electric manifold directs air pressure to the chip's control lines. Attach Tygon tubing (ID 0.02″) to the manifold using appropriate adaptors as shown in Fig. 2. The tubing has a 23 ga luer stub on one end (used for filling and connecting to the manifold) and a stainless steel connector pin on the other (used for connecting to the chip).
3. Plug the electric manifold into the relay board, which links via
USB to a PC running control software written in LabVIEW. An
example of the code and full documentation can be found
online (see Note 22).
3.4.2 Regulation of Flow Layer Pressure
1. Connect the other branch of the input air supply to a regulator, and connect the regulated supply to the manual luer manifold.
2. Adjust the pressure as required (typically ~0.3 bar).
3.5 Device Operation
3.5.1 Filling Control Lines
1. Lower the control manifold pressure to approximately 10 psi.
2. Using the PC software, close all the control line valves.
3. Fill each Tygon line with deionised water (see Note 23) through the connecting pin, using a syringe attached to a luer stub.
3.5.2 Filling Flow Lines
1. Make sure the appropriate manual manifold valve is closed.
2. Basic reagents such as buffers and chemicals are held in ID 0.02″ Tygon tubing. First, assemble the tubing, which consists of a length of Tygon with a 23 ga luer stub on one end and a connector pin on the other.
3. Attach a syringe to the luer stub and carefully draw up the
required reagent into the tubing. Make sure there are no
bubbles.
4. Attach the connector pin to the appropriate flow inlet, before
removing the syringe and attaching the luer stub to the manual
manifold.
5. Make sure valves are in the appropriate configuration on the
chip before opening the flow manifold valve, and allowing the
reagent to fill into the device. Typically, a pressure of ~0.3 bar is
ideal for the flow lines.
6. For the cell-free extract, follow the previous steps, but instead
draw up the solution into the FEP coil through the PEEK
tubing. Attach the PEEK tubing directly into the chip.
7. An important requirement for long-term steady-state reactions
is that the cell-free extract is separated from energy and DNA
solutions. If required, cooling elements can be supplemented
to further prevent degradation of the solutions [16, 20].
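The protocol quotes the control pressure in psi and the flow pressure in bar, so a small conversion helper can prevent regulator-setting mistakes. The following is a minimal sketch; the function names are illustrative and the conversion factor is the standard one (1 bar = 100 kPa, 1 psi ≈ 6.894757 kPa).

```python
# Unit conversion between bar and psi for regulator settings.
# Function names are hypothetical helpers, not part of any control software.
PSI_PER_BAR = 100.0 / 6.894757  # 1 bar = 100 kPa; 1 psi = 6.894757 kPa


def bar_to_psi(pressure_bar):
    """Convert a pressure from bar to psi."""
    return pressure_bar * PSI_PER_BAR


def psi_to_bar(pressure_psi):
    """Convert a pressure from psi to bar."""
    return pressure_psi / PSI_PER_BAR


if __name__ == "__main__":
    # Typical values quoted in the protocol: ~0.3 bar (flow), ~10 psi (control).
    print("flow layer:    0.3 bar = %.2f psi" % bar_to_psi(0.3))
    print("control layer: 10 psi  = %.2f bar" % psi_to_bar(10.0))
```

With these values, the control lines sit at roughly twice the flow-line pressure, which is consistent with valves needing to close against the flow.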
[Fig. 3 panels: (a) Loading and (b) Dilution schematics with solutions A and B; (c) YFP fluorescence [RFU] vs. Time [s]; (d) YFP fluorescence [RFU] vs. Cycle for rings 1–8; (e) Dilution % for rings 1–8; (f) Experimentally determined load % vs. Theoretical load % for chips 1–4, with linear fit y = 1.01x − 0.26, R² = 0.9996]
Fig. 3 Basic operations and characterization of the chip. (a) Initial loading is achieved by flowing an input
solution (solution A, green) first through one side of the reactor, then the other. (b) Dilution takes place by
flushing an input solution (solution B, yellow) through different outlets. The dilution fraction is controlled by the
geometric positioning of the outlets and is fixed for a given design. (c) After loading 20% of a reactor with YFP,
actuating the peristaltic pump at 20 Hz mixes the solution in ~100 s. (d) This shows the fluorescence from all
eight reactor rings, initially loaded with 20% YFP, and repeatedly diluted with buffer. (e) Experimentally
determined dilution fraction for each of the eight reactors. (f) Experimentally determined load fraction vs
theoretical load fraction for four different chips
4 Notes
[Fig. 4 panels: (a) Initial loading (Load A 40%, Load B 40%, Load C 20%); (b) Dilution step (Load A 8%, Load B 8%, Load C 4%); (c) DNA-cy5, CFP, and YFP fluorescence [RFU] vs. Time [hours] for solutions A–C, rings 1–8; (d) Tracer for solution A (mCherry [RFU]) and steady-state expression (deGFP [RFU]) vs. Time [hours], rings 1–8]
Fig. 4 Typical experimental operation of the chip. (a) The chip is initially loaded with three solutions A–C
(green, yellow, and blue) in the ratio 40%, 40%, and 20%, and (b) subsequently diluted with the same
solutions in the ratio 8%, 8%, and 4%. (c) Carrying out this process using aqueous solutions of three different
fluorescent tracers demonstrates that steady-state concentrations are maintained over many hours. (d)
Steady-state cell-free expression can be achieved by adding as the three solutions cell-free lysate (solution
A), energy solution (solution B), and DNA template (solution C). The lysate is labeled with an mCherry tracer to
assess its concentration (left), while the reaction produces deGFP, which reaches a steady-state concentration
when production and dilution rates are equal (right). Here, a dilution step was carried out every 15 min
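The steady state described in Fig. 4 can be illustrated with a simple discrete recurrence. The sketch below is a deliberately simplified model, not the authors' analysis: it assumes instantaneous mixing, a constant amount of product made between dilution steps, and a fixed replaced fraction of 20% per step (8% + 8% + 4%). All function and variable names are hypothetical.

```python
# Simplified periodic-dilution model: each cycle, product accumulates and then
# a fixed fraction of the reactor volume is replaced with fresh input.
def simulate_cycles(c0, c_in, dilution_fraction, production_per_cycle, n_cycles):
    """Return the concentration after each dilution step (arbitrary units)."""
    trace = [c0]
    c = c0
    for _ in range(n_cycles):
        c = c + production_per_cycle                               # synthesis between steps
        c = (1.0 - dilution_fraction) * c + dilution_fraction * c_in  # dilution step
        trace.append(c)
    return trace


def steady_state(c_in, dilution_fraction, production_per_cycle):
    """Fixed point of the recurrence used in simulate_cycles."""
    d = dilution_fraction
    return ((1.0 - d) * production_per_cycle + d * c_in) / d


# Tracer in solution A: A makes up 40% of the initial load and 8% of the 20%
# replaced per step (i.e., 40% of the input), so its level should not change.
tracer = simulate_cycles(c0=0.4, c_in=0.4, dilution_fraction=0.2,
                         production_per_cycle=0.0, n_cycles=40)

# Product (e.g., a fluorescent protein): absent from the input, made at a
# constant assumed rate of 1 unit per cycle.
product = simulate_cycles(c0=0.0, c_in=0.0, dilution_fraction=0.2,
                          production_per_cycle=1.0, n_cycles=100)
```

In this toy model the product level converges to the value where the amount synthesized per cycle equals the amount removed by dilution, which is the balance the caption refers to.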
Acknowledgments
References
1. Purnick P, Weiss R (2009) The second wave of synthetic biology: from modules to systems. Nat Rev Mol Cell Biol 10:410–422
2. Garenne D, Noireaux V (2019) Cell-free transcription-translation: engineering biology from the nanometer to the millimetre scale. Curr Opin Biotechnol 58:19–27
3. Takahashi MK et al (2015) Characterizing and prototyping genetic networks with cell-free transcription-translation reactions. Methods 86:60–72
4. Perez JG et al (2016) Cell-free synthetic biology: engineering beyond the cell. Cold Spring Harb Perspect Biol 8:a023853
5. de Maddalena LL et al (2016) GreA and GreB enhance expression of Escherichia coli RNA polymerase promoters in a reconstituted transcription-translation system. ACS Synth Biol 5:929–935
6. Niederholtmeyer H, Xu L, Maerkl SJ (2013) Real-time mRNA measurement during an in vitro transcription and translation using binary probes. ACS Synth Biol 2:411–417
7. Wick S et al (2019) PERSIA for direct fluorescence measurements of transcription, translation, and enzyme activity in cell-free systems. ACS Synth Biol 8:1010–1025
8. Kwon Y-C, Jewett MC (2015) High-throughput preparation methods of crude extract for robust cell-free protein synthesis. Sci Rep 5:8663
9. Sun ZZ et al (2013) Protocols for implementing an Escherichia coli based TX-TL cell-free expression system for synthetic biology. J Vis Exp 79:1–15
10. Lavickova B, Maerkl SJ (2019) A simple, robust, and low-cost method to produce the PURE cell-free system. ACS Synth Biol 8:455–462
11. Dubuc E et al (2019) Cell-free microcompartmentalised transcription-translation for the prototyping of synthetic communication networks. Curr Opin Biotechnol 58:72–80
12. Niederholtmeyer H et al (2015) Rapid cell-free forward engineering of novel genetic ring oscillators. eLife 4:1–18
13. Hori Y et al (2017) Cell-free extract based optimization of biomolecular circuits with droplet microfluidics. Lab Chip 17:3037–3042
14. Chang J-C et al (2018) Microfluidic device for real-time formulation of reagents and their subsequent encapsulation into double emulsions. Sci Rep 8:8143
15. Spirin A et al (1988) A continuous cell-free translation system capable of producing polypeptides in high yield. Science 242:1162–1164
16. Niederholtmeyer H et al (2013) Implementation of cell-free biological networks at steady state. Proc Natl Acad Sci 110:15985–15990
17. Karzbrun E et al (2014) Programmable on-chip DNA compartments as artificial cells. Science 6198:829–832
18. Tayar A et al (2017) Synchrony and pattern formation of coupled genetic oscillators on a chip of artificial cells. Proc Natl Acad Sci 114:11609–11614
19. Rockel S, Geertz M, Maerkl SJ (2012) MITOMI: a microfluidic platform for in vitro characterization of transcription factor-DNA interaction. Methods Mol Biol 786:97–114
20. van der Linden AJ et al (2019) A multilayer microfluidic platform for the conduction of prolonged cell-free gene expression. J Vis Exp 152:e59655
21. Ferry MS, Razinkov IA, Hasty J (2012) Microfluidics for synthetic biology: from design to execution. Methods Enzymol 497:295–372
22. Chau K et al (2011) Dependence of the quality of adhesion between poly(dimethylsiloxane) and glass surfaces on the composition of the oxidizing plasma. Microfluid Nanofluid 10:907–917
23. Swank Z, Laohakunakorn N, Maerkl SJ (2019) Cell-free gene-regulatory network engineering with synthetic transcription factors. Proc Natl Acad Sci U S A 116:5892–5901
Chapter 10
Abstract
Applications of control engineering to mammalian cell biology have been recently implemented for precise
regulation of gene expression. In this chapter, we report the main experimental and computational
methodologies to implement automatic feedback control of gene expression in mammalian cells using a
microfluidics/microscopy platform.
Key words Feedback control, Mammalian cell, Microfluidics, Cell segmentation, PDMS, Control algorithms
1 Introduction
Mahmoud Khazim, Elisa Pedone, and Lorena Postiglione contributed equally to this work. Diego di Bernardo and Lucia Marucci contributed equally to this work.
Filippo Menolascina (ed.), Synthetic Gene Circuits: Methods and Protocols, Methods in Molecular Biology, vol. 2229,
https://doi.org/10.1007/978-1-0716-1032-9_10, © Springer Science+Business Media, LLC, part of Springer Nature 2021
2 Materials
2.1 Chip Fabrication 1. Master silicon wafer (Silicon Valley Microelectronics, USA).
2. Chlorotrimethylsilane (TCSM).
3. Aluminum foil.
4. Vacuum degassing chamber (Bel-Art).
5. Oven.
6. Sonicator (Camlab).
7. Acetone, methanol, isopropyl alcohol, and distilled water.
8. Pressurized nitrogen.
9. Polydimethylsiloxane (PDMS) Sylgard 184 Elastomer base
(Dow Corning).
10. 0.75-mm biopsy punch (World Precision Instruments; 504529).
11. Cover glasses (Hirschmann, 24 × 60 mm, thickness 0.13–0.17 mm).
12. O2 plasma asher (Diener Zepto).
3 Methods
3.1 Fabrication of PDMS Replica Molding
For brevity, the protocol reported here assumes that a master mold is available; therefore, only the steps to produce device replicas are included. For master mold fabrication, please refer to the original publication [4].
3.1.2 PDMS Microfluidic Device Preparation
PDMS base is mixed with curing agent and placed on a silicon master mold wafer. The mixture is degassed and then cured. The cured PDMS is peeled off and autoclaved, and then ports are punched using a biopsy punch.
• Prepare PDMS by mixing Sylgard 184 Elastomer base and curing agent in a 10:1 ratio. Mix the base and curing agent well using a lab spatula. The amount of PDMS depends on the dimensions of the wafer and dish. For a 4-in. wafer and petri dish, use 50 g of PDMS/curing agent in a 10:1 ratio (approximately 45 g of PDMS base and 5 g of curing agent).
Fig. 1 Steps and equipment for the fabrication of microfluidic devices by PDMS replica molding. (a)
Silanization and surface treatment of the master by placing in a degassing chamber with TCSM. (b) The
master is placed in a glass petri dish covered with aluminum foil. (c) PDMS and curing agent are mixed,
poured over the master, and degassed. (d) After curing, the PDMS replica are cut out and ports are made by
use of a reusable biopsy punch. (e) The punched devices are sonicated in isopropyl alcohol followed by
sonication in H2O. (f) The cleaned replica is bonded to glass coverslips using a plasma asher
• Place master mold into a glass petri dish, with similar area dimensions, covered with aluminum foil (Fig. 1b).
• Pour PDMS mix onto master mold and degas in a vacuum degassing chamber for 30 min or until all bubbles have been removed (Fig. 1c).
• Place master mold with PDMS in an oven to cure for 1 h at 80 °C.
• Gently peel off cured PDMS from the mold end and release the PDMS from the master mold.
• Autoclave cured PDMS for 30 min at 121 °C in an autoclavable paper bag to ensure long-term viability of cells in the device.
• Using a 0.75-mm biopsy punch, punch the ports to create fluidic ports for access of cells and media (Fig. 1d) (see Note 1).
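The base-to-curing-agent arithmetic in the first step above can be sketched as follows. This is an illustrative helper (the function name is hypothetical); it shows that an exact 10:1 split of 50 g is about 45.5 g of base and 4.5 g of curing agent, which the protocol quotes in rounded form as 45 g and 5 g.

```python
# Split a total PDMS mass into base and curing agent for a given mixing ratio.
def pdms_masses(total_g, base_parts, agent_parts=1.0):
    """Return (base_g, curing_agent_g) for a base:agent ratio by weight."""
    base_g = total_g * base_parts / (base_parts + agent_parts)
    return base_g, total_g - base_g


if __name__ == "__main__":
    base_g, agent_g = pdms_masses(50.0, 10.0)
    print("10:1, 50 g total: %.1f g base, %.1f g curing agent" % (base_g, agent_g))
```

The same helper applies to other ratios, for example a 5:1 or 20:1 mixture for multilayer devices.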
3.1.3 Cleaning and Bonding of PDMS Chips to Glass Coverslips
Punched PDMS devices are sonicated to dislodge PDMS shavings from the ports. Coverslips are cleaned and dried. Finally, the PDMS devices and coverslips are placed in a plasma asher and bonded by bringing the surfaces into contact and optionally baking overnight to strengthen the bond between the device and coverslip.
• Place punched PDMS devices in isopropyl alcohol and sonicate for 10 min (Fig. 1e).
• Sonicate in distilled water for 10 min (Fig. 1e).
• Dry using pressurized nitrogen.
• For each molded, punched PDMS device, clean a thin 24 × 60 mm cover glass in acetone, methanol, isopropyl alcohol, and distilled water and then dry with pressurized nitrogen.
• Expose the PDMS devices, with layers facing up, and cover glasses to oxygen plasma in an O2 plasma asher, at 50–70% power for 2 min (Fig. 1f).
• Bring the PDMS device into contact with the cover glass, with layers facing down, to form a strong irreversible bond between the surfaces.
• Use a microscope to check for any faults.
• Optional: bake the bonded devices at 90 °C overnight.
3.2 Chip Loading
3.2.1 Pins Preparation and Wetting of the Device
Prior to trapping the cells in the microfluidic device chambers via on-chip vacuum, the device needs to be prewet such that device channels are filled with fluid while the culture chambers remain filled with air. Also, pins need to be prepared. To release the pins from the syringe adaptors (Fig. 2a), incubate the pins with isopropyl alcohol for 24–48 h (Fig. 2b). The pins can be used to connect fluidic lines to the microfluidic chip (Fig. 3) for all future experiments.
Fig. 2 Release of metallic pins from adaptors. Metallic pins before (a) and after
(b) 24–48-h isopropyl alcohol (isopropanol) incubation
Fig. 3 Microfluidic device wetting, cell loading, and preculture. (a) Microfluidic device, which has been bonded
to a glass coverslip. Media is flushed through the microfluidic channels starting from port 5 (b) followed by
filling through port 2 (c). (d) The wetted device is fastened onto a lab microscope, and a vacuum pump is
attached to ports 3 and 4. Cells are pushed through port 1 and loaded via the vacuum into the cell chambers.
(e) The loaded microfluidic device for preculture has ports 2, 6, and 7 plugged using a pin with a short length of
PTFE tubing tied at the end to stop fluidic flow through the ports (*). A 10-mL syringe with media is attached to
port 1 for overnight perfusion (blue arrow) and media flows out of port 5 (red arrow). (f) A 10-mL syringe with
media is fastened onto a makeshift rig and attached to port 1
3.2.2 Shear-Free Cell Loading Via on-Chip Vacuum
Once the chip has been wetted, the chip is attached to a vacuum pump and cells are loaded into the chip. Cells are visually monitored via a laboratory microscope while being vacuum-loaded. Cells that remain in the main device channels are flushed out using fresh media, while cells in the culture chambers are shielded and resistant to convective flow.
• Connect the on-chip vacuum to ports 3 and 4 and fasten the chip to a laboratory microscope to enable monitoring of cell loading (Fig. 3d).
• Wash mammalian cells (previously kept in complete media and maintained in a tissue culture incubator at 37 °C and 5% CO2) with PBS; detach cells from the culture dish and place into a centrifuge tube.
• Centrifuge cells to form a pellet and resuspend in complete media, at a density of 2 × 10^6 cells per 100 μL of media. If cells are too concentrated, dilute with extra media.
• Aspirate the cell suspension into a fresh 2.5-mL syringe via a needle with attached tubing and metal pin, and wet connect to port 1 of the device.
• Gently apply pressure to the syringe with cell suspension until cells are visible in the main perfusion channel upon inspection via a tissue culture microscope.
• Once the presence of cells in the main channel is confirmed, stop the flow by releasing syringe pressure until cells are apparent at the entrance of the chip chambers.
• To begin cell loading, turn the vacuum on and visually monitor cells entering the chambers; tapping the tube near the port with a finger can enhance cell loading, as it prevents cells from getting stuck on the walls of the device.
• Once loaded, turn the vacuum off and disconnect the vacuum ports.
• Use a new syringe and tubing with fresh media to flush out untrapped cells from the ports by wet connecting to port 1 and applying gentle pressure through the main channel, out of the remaining ports of the device. Care needs to be taken so that the fluid flow through the device is not too strong, as this can cause cells properly trapped in the chambers to be washed out.
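The resuspension and dilution arithmetic for the loading density can be sketched as below. This is only an illustration of the target stated above (2 × 10^6 cells per 100 μL); the function names and the example cell counts are assumptions.

```python
# Volume bookkeeping for resuspending cells at the loading density.
TARGET_PER_100_UL = 2e6  # cells per 100 uL, from the protocol


def resuspension_volume_ul(cell_count, target_per_100ul=TARGET_PER_100_UL):
    """Volume of media (uL) in which to resuspend a pellet of `cell_count` cells."""
    return 100.0 * cell_count / target_per_100ul


def extra_media_ul(current_per_100ul, volume_ul, target_per_100ul=TARGET_PER_100_UL):
    """Media (uL) to add when a suspension is more concentrated than the target."""
    if current_per_100ul <= target_per_100ul:
        return 0.0  # already at or below the target density; nothing to add
    return volume_ul * (current_per_100ul / target_per_100ul - 1.0)


if __name__ == "__main__":
    # Hypothetical pellet of 5 million cells counted before centrifugation.
    print("resuspend in %.0f uL" % resuspension_volume_ul(5e6))
    # Hypothetical 100 uL suspension found to be twice too concentrated.
    print("add %.0f uL of media" % extra_media_ul(4e6, 100.0))
```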
3.2.3 Preculture of Cells in the Microfluidic Device
The device with cells in the culture chambers can now be precultured in the incubator, overnight and up to 48 h, to allow cells to attach and proliferate inside the device prior to undertaking control experiments on the microscope.
• Plug ports 2, 6, and 7 using 90° bent metal pins with a small amount of tubing, which can be tied at the end to stop the flow (Fig. 3e).
• Fill a 10-mL syringe with up to 5 mL of culture media, fasten it onto a makeshift rig, and attach it by needle, tubing, and pin to port 1 (Fig. 3f). The hydrostatic pressure difference between the syringe fluid and the opening in port 5 allows media to flow through the device, establishing a slow perfusion flow.
• Place the rig with culture media and microfluidic device into an incubator and culture overnight.
3.3 Microfluidics/Microscopy-Based Time-lapse
3.3.1 Tubes
• Measure and cut three sections of PTFE #24 AWG tubing for collecting the waste media and two sections of tubing for time-controlled delivery of culture media to cells. The length of the tubing is around 120 cm for the output (waste) and 200 cm for the input (delivered) media.
• Connect the Y-junction to a short tube of about 20 cm and two of the 120-cm waste output tubes (Fig. 4).
• Connect the short section of tubing (Fig. 4a) to a 23-gauge needle and the longer sections of tubing to metal pins (Fig. 4b, c).
• Similarly, connect each of the remaining tubes (one output and two inputs) to a 23-gauge needle on one end and a metal pin on the other end.
• Attach each fluidic line to a 50-mL syringe and slowly fill them with 12 mL of culture media. Note that this amount of media is enough for experiments <72 h; for longer experiments, more media is required.
Fig. 4 Junction and tube connections. Side a is connected to the 50-mL waste
syringe placed at 23 cm from the stage, whereas b and c are connected to two
120-cm output tubes. Two metallic pins are used to attach the tubes to the chip
and collect the waste media from ports 1 and 2
3.3.2 Chip Positioning
Check the chip for the presence of obstructions that might impair correct media flow through the microfluidic channels. If needed, the chip can be flushed using a short section of #24 PTFE tubing connected to a 10-mL syringe with fresh media.
Handle the following steps very carefully to avoid damaging the chip and support (see Fig. 6).
• Connect each fluidic line to the chip and corresponding port, starting from the waste ports. The tubes from the syringe placed at 23 cm from the stage go to ports 1 and 2, while the one at 46 cm from the stage connects to port 5. Finally, connect the syringes for media delivery to ports 6 and 7.
• Center the microfluidic chip, loaded with cells, on the microscope stage and secure with adhesive tape.
• Check that the CO2 valve is open and fasten the atmospheric chamber correctly over the chip.
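Because flow in this setup is driven by syringe height, it can help to translate the distances quoted above into pressure heads. The sketch below uses the hydrostatic relation ΔP = ρ g h and assumes that culture media has roughly the density of water; whether each syringe sits above or below the stage is not restated here, so only magnitudes are computed. The function name is illustrative.

```python
# Hydrostatic pressure head produced by a vertical height difference.
WATER_DENSITY_KG_M3 = 1000.0  # assumption: media density close to water
GRAVITY_M_S2 = 9.81


def hydrostatic_pressure_pa(height_cm, density_kg_m3=WATER_DENSITY_KG_M3):
    """Pressure (Pa) corresponding to a fluid column of `height_cm` centimetres."""
    return density_kg_m3 * GRAVITY_M_S2 * height_cm / 100.0


if __name__ == "__main__":
    for height_cm in (23.0, 46.0):  # syringe distances from the stage
        p = hydrostatic_pressure_pa(height_cm)
        print("%.0f cm -> %.0f Pa (%.1f mbar)" % (height_cm, p, p / 100.0))
```

Doubling the height difference doubles the pressure head, so the relative heights of the syringes set the direction and magnitude of flow between ports.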
3.3.3 Actuation System
The actuation system consists of two motor-controlled syringes (containing media) mounted on linear actuators and connected to ports 6 and 7. Custom scripts in MATLAB need to be written to implement online cell segmentation and control algorithms (see Subheadings 3.4 and 3.5); the latter have to be coupled to the software for automatic syringe movement and actuation.
• Turn on the software for syringe calibration.
• Calibrate the actuation system using the dedicated software and move the syringes to the desired positions. The syringe with media to be delivered to the cells should be at the highest position.
3.3.4 Microscope Specs
The settings in this section refer to the use [2, 3] of a Leica DMi8 inverted microscope equipped with an Andor iXon Ultra 897 back-illuminated EMCCD digital camera (512 × 512 pixels, 16 μm pixel size, 16 bit, 56 fps at full frame) and an environmental control chamber (PeCon) for temperature and CO2 control. Equivalent microscopes can be used, as long as the following are present:
• Digital camera for image acquisition.
• Environmental control chamber (PeCon) for long-term temperature control and CO2 enrichment.
• Adaptive Focus Control (AFC) option to ensure that the focus is maintained during the entire duration of the experiment.
• 20–40× objective.
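When choosing between objectives, the imaged field of view follows from the sensor geometry and the magnification. The following sketch assumes no additional relay magnification or pixel binning, which may not hold for a given setup; the function names are illustrative.

```python
# Field of view and sampled pixel size for a camera behind an objective.
def field_of_view_um(n_pixels, pixel_size_um, magnification):
    """Side length (um) of the imaged area along one sensor axis."""
    return n_pixels * pixel_size_um / magnification


def sample_pixel_um(pixel_size_um, magnification):
    """Size (um) that one camera pixel covers in the sample plane."""
    return pixel_size_um / magnification


if __name__ == "__main__":
    for magnification in (20, 40):
        fov = field_of_view_um(512, 16.0, magnification)
        px = sample_pixel_um(16.0, magnification)
        print("%dx: field of view %.1f um, %.2f um per pixel" % (magnification, fov, px))
```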
3.3.5 Time-lapse Settings
• Turn on the microscope following this order: stage, fluorescent lamp, and microscope.
• Launch the microscope software and set up the time-lapse by:
3.4 Computational Algorithms
3.4.1 Cell Segmentation
For mammalian cell segmentation, the property of cells exhibiting a white halo in phase contrast images can be exploited [1–3]. The main steps for cell segmentation and fluorescence quantification are as follows:
1. Defining a threshold to generate a first binary image selecting only pixels belonging to cell edges.
2. Obtaining a second binary image (mask) in which the cell area is overestimated by using dilation and filling operators.
3. Subtracting the mask obtained at point (1) from the mask obtained at point (2) in order to derive a binary image that selects the portion of the original image covered by cells.
4. Applying the mask obtained at point (3) to the fluorescent field image. In order to calculate the average fluorescence intensity of pixels belonging to cells, the summed intensity of the masked pixels is divided by the area of the mask.
5. Subtracting the background signal (measured in a cell-free portion of the chamber) from the value of the cell fluorescence signal.
Other segmentation algorithms might be used, depending on the cell morphology and/or the microscope used.
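The segmentation steps listed above can be sketched in a few lines. The example below is written in plain Python on nested lists so that it is self-contained; it is not the authors' MATLAB implementation, and the 3 × 3 dilation, the border flood fill used for hole filling, and all function names are illustrative choices.

```python
from collections import deque


def threshold(image, level):
    """Step 1: binary mask of pixels brighter than `level` (the halo)."""
    return [[1 if v > level else 0 for v in row] for row in image]


def dilate(mask):
    """3x3 binary dilation: set a pixel if it or any neighbour is set."""
    h, w = len(mask), len(mask[0])
    return [[1 if any(mask[rr][cc]
                      for rr in range(max(0, r - 1), min(h, r + 2))
                      for cc in range(max(0, c - 1), min(w, c + 2))) else 0
             for c in range(w)] for r in range(h)]


def fill_holes(mask):
    """Fill background regions that are not connected to the image border."""
    h, w = len(mask), len(mask[0])
    outside = [[False] * w for _ in range(h)]
    queue = deque()
    for r in range(h):
        for c in range(w):
            if (r in (0, h - 1) or c in (0, w - 1)) and not mask[r][c]:
                outside[r][c] = True
                queue.append((r, c))
    while queue:  # flood fill of the background, starting from the border
        r, c = queue.popleft()
        for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= rr < h and 0 <= cc < w and not outside[rr][cc] and not mask[rr][cc]:
                outside[rr][cc] = True
                queue.append((rr, cc))
    return [[0 if outside[r][c] else 1 for c in range(w)] for r in range(h)]


def mean_cell_fluorescence(phase, fluo, halo_level, background):
    """Steps 1-5: background-subtracted mean fluorescence over the cell mask."""
    edges = threshold(phase, halo_level)                      # step 1
    overestimate = fill_holes(dilate(edges))                  # step 2
    cells = [[1 if o and not e else 0 for o, e in zip(orow, erow)]
             for orow, erow in zip(overestimate, edges)]      # step 3
    area = sum(map(sum, cells))
    if area == 0:
        raise ValueError("no cell pixels found; check the threshold")
    total = sum(f for frow, crow in zip(fluo, cells)
                for f, c in zip(frow, crow) if c)             # step 4
    return total / area - background                          # step 5
```

In practice an image-processing library would replace the hand-written dilation and filling, and the threshold would be tuned to the contrast of the acquired phase images.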
3.5.3 Model Predictive Control (MPC)
Model predictive control (MPC) is a well-established technique for controlling multivariable systems subject to constraints. Applications of MPC to regulate gene expression and signaling pathway activity in mammalian cells are reported in [2].
• Given a desired control reference, MPC aims at finding the optimal control input that minimizes the difference between the target value and the measured value, by means of a dynamical model of the system being controlled and a cost function.
• To speed up computation, a discretized version of the dynamical model describing the biological system is used, assuming that the input is piecewise constant during the sampling period T (zero-order hold method):
\[ x_{k+1} = A x_k + B u_k, \qquad y_k = C x_k \]
where, for example, in the case of a three-state system with one input, \( x_k = \left( x_1(kT),\, x_2(kT),\, x_3(kT) \right)^{\top} \) are the system states, \( u_k = u(kT) \) is the control input, and \( y_k = y(kT) \) is the system output, with \( k \) being a natural number (\( k \in \{1, 2, \ldots\} \)).
• Starting from the experimental data, at each sampling time kT, the MPC controller uses the discrete model to predict the dynamic behavior of the system to be controlled over a defined prediction horizon and to determine the input such that an open-loop objective function is minimized [7]. An example of a cost function to be minimized is the weighted sum of squared control errors (SSE), defined as follows:
\[ \mathrm{SSE}_k = \sum_{i=k+1}^{k+N} (N + 1 + k - i)\, \varepsilon_i^2 \]
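A receding-horizon controller of this kind can be sketched by brute force for a small model. The example below is an illustration under several assumptions, not the published controller: the input is restricted to a few discrete levels, candidate sequences are enumerated exhaustively, the error is taken as reference minus predicted output, and the weights follow the reading (N + 1 + k − i) of the cost above, so that earlier prediction steps count more. All names and the model values in the usage lines are hypothetical.

```python
from itertools import product


def step(A, B, x, u):
    """One zero-order-hold step of x_{k+1} = A x_k + B u_k (single input)."""
    n = len(x)
    return [sum(A[i][j] * x[j] for j in range(n)) + B[i] * u for i in range(n)]


def output(C, x):
    """System output y_k = C x_k (single output)."""
    return sum(c * xi for c, xi in zip(C, x))


def weighted_sse(A, B, C, x0, inputs, reference):
    """Cost of one candidate input sequence over a horizon N = len(inputs)."""
    horizon = len(inputs)
    x, cost = list(x0), 0.0
    for j, u in enumerate(inputs, start=1):      # j plays the role of i - k
        x = step(A, B, x, u)
        error = reference - output(C, x)
        cost += (horizon + 1 - j) * error ** 2
    return cost


def mpc_input(A, B, C, x0, reference, horizon, levels=(0.0, 1.0)):
    """Return the first input of the cheapest sequence (receding horizon)."""
    best = min(product(levels, repeat=horizon),
               key=lambda seq: weighted_sse(A, B, C, x0, seq, reference))
    return best[0]


# Hypothetical one-state model: output below the target -> inducer on,
# output above the target -> inducer off.
A, B, C = [[0.9]], [0.1], [1.0]
u_low = mpc_input(A, B, C, x0=[0.0], reference=0.5, horizon=3)
u_high = mpc_input(A, B, C, x0=[1.0], reference=0.5, horizon=3)
```

At each sampling time only the first input of the optimal sequence is applied; the optimization is then repeated from the newly measured state. Exhaustive enumeration scales exponentially with the horizon, so a numerical optimizer would normally be used instead.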
4 Notes
Acknowledgments
References
1. Fracassi C, Postiglione L, Fiore G, di Bernardo D (2016) Automatic control of gene expression in mammalian cells. ACS Synth Biol 5(4):296–302. https://doi.org/10.1021/acssynbio.5b00141
2. Postiglione L, Napolitano S, Pedone E, Rocca DL, Aulicino F, Santorelli M, Tumaini B, Marucci L, di Bernardo D (2018) Regulation of gene expression and signaling pathway activity in mammalian cells by automated microfluidics feedback control. ACS Synth Biol 7(11):2558–2565. https://doi.org/10.1021/acssynbio.8b00235
3. Pedone E, Postiglione L, Aulicino F, Rocca DL, Montes-Olivas S, Khazim M, di Bernardo D, Pia Cosma M, Marucci L (2019) A tunable dual-input system for on-demand dynamic gene expression regulation. Nat Commun 10(1):4481. https://doi.org/10.1038/s41467-019-12329-9
4. Kolnik M, Tsimring LS, Hasty J (2012) Vacuum-assisted cell loading enables shear-free mammalian microfluidic culture. Lab Chip 12(22):4732–4737. https://doi.org/10.1039/c2lc40569e
5. Astrom KJ, Murray RM (2010) Feedback systems: an introduction for scientists and engineers. Princeton University Press
6. Utkin V, Lee H (2006) Chattering problem in sliding mode control systems. Paper presented at the international workshop on variable structure systems, Alghero, Sardinia, Italy
7. Morari M, Lee JH (1999) Model predictive control: past, present and future. Comput Chem Eng 23(4):667–682. https://doi.org/10.1016/S0098-1354(98)00301-9
8. Fiore G, Menolascina F, di Bernardo M, di Bernardo D (2013) An experimental approach to identify dynamical models of transcriptional regulation in living cells. Chaos 23(2):025106. https://doi.org/10.1063/1.4808247
9. Menolascina F, Fiore G, Orabona E, De Stefano L, Ferry M, Hasty J, di Bernardo M, di Bernardo D (2014) In-vivo real-time control of protein expression from endogenous and synthetic gene networks. PLoS Comput Biol 10(5):e1003625–e1003625. https://doi.org/10.1371/journal.pcbi.1003625
Chapter 11
Abstract
Dynamic modeling in systems and synthetic biology is still quite a challenge—the complex nature of the
interactions results in nonlinear models, which include unknown parameters (or functions). Ideally, time-
series data support the estimation of model unknowns through data fitting. Goodness-of-fit measures
would lead to the best model among a set of candidates. However, even when state-of-the-art measuring
techniques allow for an unprecedented amount of data, not all data suit dynamic modeling.
Model-based optimal experimental design (OED) is intended to improve model predictive capabilities.
OED can be used to define the set of experiments that would (a) identify the best model or (b) improve the
identifiability of unknown parameters. In this chapter, we present a detailed practical procedure to compute
optimal experiments using the AMIGO2 toolbox.
Key words Biological systems, Dynamic models, Optimal experimental design, Practical identifiability
1 Introduction
Filippo Menolascina (ed.), Synthetic Gene Circuits: Methods and Protocols, Methods in Molecular Biology, vol. 2229,
https://doi.org/10.1007/978-1-0716-1032-9_11, © Springer Science+Business Media, LLC, part of Springer Nature 2021
with each other. Next steps (3 and 4) describe the kinetics and
strengths of biomolecular interactions within the system. If we
consider the cell as a well-stirred reactor, we can explain the behav-
ior of the network using a set of ordinary differential equations
which determine concentration changes as prescribed by kinetic
laws. The model would read as follows:
\[ \frac{dx}{dt} = f(x, u, \theta, t); \qquad x(t_0) = x_0 \tag{1} \]
where x, u, and θ denote the vectors of state variables, inputs, and model parameters, respectively.
The dynamics of the system (1) depend on the initial conditions (x0) and the parameter values. Parameter estimation offers the means to reconcile models with data [1]. The underlying idea is to solve a nonlinear optimization problem to compute unknown and nonmeasurable kinetic constants that maximize the likelihood of the data ỹ.
The experimental data consist of a matrix of values corresponding to individual measurements obtained under the conditions specified by an experimental scheme ε. We encode the experimental data and the model predictions in the following vectors:
\[ \tilde{y} = \left[ \tilde{y}_1, \tilde{y}_2, \ldots, \tilde{y}_d, \ldots, \tilde{y}_{n_d} \right], \qquad y = \left[ y_1, y_2, \ldots, y_d, \ldots, y_{n_d} \right] \tag{2} \]
where ỹ represents the experimental data and y = g(x, u, θ, t) the corresponding model predictions; d represents a specific experimental condition defined by the subindexes ε (for the experiment), o (for the observables in experiment ε), and s (for the sampling times in experiment ε). n_d denotes the total number of such conditions, that is, the number of data. Accordingly, the operators to be defined in the sequel can be easily condensed as follows:
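\[ \sum_{d=1}^{n_d} \equiv \sum_{\varepsilon=1}^{n_\varepsilon} \left( \sum_{o=1}^{n_{o,\varepsilon}} \left( \sum_{s=1}^{n_s^{\varepsilon,o}} \right) \right) \]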
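The idea of reconciling a model of the form (1) with data can be illustrated on a toy problem. The sketch below is not AMIGO2 and does not use its solvers: it assumes a one-parameter decay model dx/dt = −θx, integrates it with a classical Runge-Kutta scheme, and picks θ from a grid by least squares, which coincides with maximum likelihood under the assumption of Gaussian noise with constant variance. The model, the pseudo-data, and all names are hypothetical.

```python
import math


def simulate(theta, x0, times, substeps=20):
    """Integrate dx/dt = -theta * x with classical RK4; return x at `times`."""
    def f(x):
        return -theta * x

    x, t, out = x0, 0.0, []
    for t_next in times:
        h = (t_next - t) / substeps
        for _ in range(substeps):
            k1 = f(x)
            k2 = f(x + 0.5 * h * k1)
            k3 = f(x + 0.5 * h * k2)
            k4 = f(x + h * k3)
            x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        t = t_next
        out.append(x)
    return out


def sum_of_squares(theta, x0, times, data):
    """Least-squares mismatch between model predictions and data."""
    return sum((m - y) ** 2 for m, y in zip(simulate(theta, x0, times), data))


def estimate_theta(x0, times, data, candidates):
    """Pick the candidate parameter value with the smallest mismatch."""
    return min(candidates, key=lambda th: sum_of_squares(th, x0, times, data))


# Noise-free pseudo-data generated with theta = 0.5 (for illustration only).
times = [0.0, 1.0, 2.0, 3.0, 4.0]
pseudo_data = [2.0 * math.exp(-0.5 * t) for t in times]
grid = [0.01 * i for i in range(1, 201)]
theta_hat = estimate_theta(2.0, times, pseudo_data, grid)
```

A grid search is used only to keep the example short; realistic problems with several parameters call for a proper nonlinear programming solver coupled to an initial value problem solver.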
Fig. 1 Concept of the experimental scheme. It includes the number of experiments and/or replicates,
stimulation conditions, measured states, experiment duration, and sampling times
Fig. 2 Iterative solution of the parameter estimation and optimal experimental design problems. The iterative
solution requires an NLP solver to generate candidate solutions at each iteration k, and an IVP solver to solve
the model equations, plus the parametric sensitivities in the case of OED, to evaluate the cost function and
constraints
2 Materials
2.1 Toolbox Download and License
The AMIGO2 toolbox and the corresponding documentation are available at:
https://sites.google.com/site/amigo2toolbox/.
The toolbox is provided as a zip file with a password; it is free of charge for academic purposes under the Creative Commons license. For further details on license conditions, please visit http://creativecommons.org/licenses/by-nc-nd/3.0.
2.2 Toolbox Requirements and Installation Guide
AMIGO2 has been implemented in MATLAB and tested in several MATLAB versions. In addition, it can interface with C code for model simulation and parameter estimation.
For full capabilities, the user will require the following additional software:
• Cytoscape is needed for network visualization.
• MATLAB optimization toolbox is required to use the local optimizers fmincon (SQP method for constrained problems, suitable for dynamic optimization) or lsqnonlin (a least-squares local NLP solver, suited for parameter estimation).
• MATLAB symbolic manipulation toolbox is used to evaluate exact Jacobians and for network visualization.
• C compiler (e.g., gcc) is required to use AMIGO2-enhanced modes with C.
The most computationally demanding step in all tasks in
AMIGO2 is the solution of the system dynamics, that is, the set
of ordinary differential equations (ODE). In this regard, the tool
offers the possibility of automatically generating C code, and this
will be automatically mexed to CVODES (Sundials) as included in
AMIGO2.
The toolbox does not require installation. Once unzipped,
open a MATLAB session and move to the AMIGO2 path. The
code is initialized by typing AMIGO_Startup. The Startup auto-
matically adds AMIGO2 to the path and generates mex options
files. From that moment on, users can access the Help from the
MATLAB help Supplemental Software section.
2.3 Code Structure
AMIGO2 is organized in four main modules: the preprocessor, the numerical kernel, the postprocessor, and the module of main tasks. Figure 4 presents the code structure once unzipped.
• Help folder keeps all toolbox-related documentation.
• Examples folder keeps several implemented examples that the user may consider as templates to address new problems.
228 Eva Balsa-Canto et al.
Fig. 4 Code structure. The code is organized in user-oriented folders (Examples, Inputs, Help, and Results),
code folders (Preprocessor, Postprocessors, Add-ons, Release-info), and tasks (Startup, Prep, SModel, SObs,
SData, LRank, GRank, ContourP, RIdent, PE, REG_PE, PE-PostAnalysis, OED, IOC, and DO)
3 Methods
3.2 Optimal Experimental Design for Model Selection
Currently, AMIGO2 does not offer a specific task to solve the
problem of OED for model selection. Still, it is possible to use
AMIGO_DO for that purpose. Note that using DO in AMIGO2
assumes that measurements can be taken at regular intervals over time.
AMIGO_DO requires the definition of inputs.model, inputs.DOsol,
inputs.ivpsol, inputs.nlpsol, and inputs.plotd.
3.2.1 Definition of the Objective Functional
The first step in the protocol corresponds to the definition of the
objective functional that will characterize the differences between
the models. Several possibilities exist. Here, we include a couple of
examples:
1. The integral of the squared differences of the fluorescent
protein:
$$ J_{OED,MS} = \left[ \int_{t=0}^{t=t_f} \left( CitAU_{A} - CitAU_{B} \right)^{2} \, dt \right]^{1/2} \qquad (24) $$
Equation 24:
'dJ_OED_MS=(CitAUA-CitAUB)^2'
Equation 25:
'dtfinal=1',…
'dJ_OED_MS=(1/tfinal)*(CitAUA-CitAUB)^2'
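As a rough illustration of what Eq. 24 measures, the sketch below evaluates the functional numerically from two sampled model outputs using the trapezoidal rule. It is written in plain Python rather than with AMIGO2; the function names and the sample values are hypothetical and serve only to show the arithmetic.

```python
import math

def trapezoid(y, t):
    """Trapezoidal rule for samples y taken at increasing times t."""
    return sum(0.5 * (y[i] + y[i + 1]) * (t[i + 1] - t[i])
               for i in range(len(t) - 1))

def j_oed_ms(cit_a, cit_b, t):
    """Eq. 24: square root of the time integral of the squared output difference."""
    sq_diff = [(a - b) ** 2 for a, b in zip(cit_a, cit_b)]
    return math.sqrt(trapezoid(sq_diff, t))

# Hypothetical outputs of two candidate models, sampled every hour for 4 h
t = [0.0, 1.0, 2.0, 3.0, 4.0]
cit_a = [0.0, 1.0, 2.0, 3.0, 4.0]
cit_b = [0.0, 1.0, 2.0, 3.0, 2.0]
print(j_oed_ms(cit_a, cit_b, t))
```

An input profile that drives the two model outputs further apart yields a larger value, which is why the functional is maximized when designing for model selection.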
3.2.2 Definition of the Optimization Problem
The definition of the dynamic optimization problem requires the
following elements: the initial conditions for model simulation; the
tentative experiment duration; the type of optimization problem
(minimization or maximization); the definition of the objective;
and the control vector parameterization (type of input interpolation,
number of discretization elements, initial guess and bounds for the
inputs, and bounds for the experiment duration).
The inputs are shown below. For illustrative purposes, we
assume that the experiment may last between 4 and 24 h and that
the input is a step-wise profile with five elements of fixed duration
(i.e., the experiment is split into five segments of equal duration).
The input parameterization can be easily modified in the inputs
structure to consider steps of varying duration, pulse-wise profiles,
or piecewise-linear profiles. It should be noted that the use of step
or piecewise-linear profiles with elements of varying duration
increases the multimodality of the optimization problem. In general,
solving a case with ten constant duration steps
is simpler than solving a case with five steps whose duration is also
Optimal Experimental Design for Systems and Synthetic Biology Using AMIGO2 233
% CVP DETAILS
inputs.DOsol.u_interp='stepf';                          % Stimuli definition:
                                                        % 'sustained'|'step'|'stepf'|'linear'
inputs.DOsol.n_steps=5;
inputs.DOsol.u_guess=500*ones(1,inputs.DOsol.n_steps);  % Guess for the input
inputs.DOsol.u_min=zeros(1,inputs.DOsol.n_steps);
inputs.DOsol.u_max=1000*ones(1,inputs.DOsol.n_steps);   % Min/max for the input
inputs.DOsol.t_con=linspace(0,inputs.DOsol.tf_guess,inputs.DOsol.n_steps+1);
                                                        % Input switching times: initial and final time
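To make the control vector parameterization concrete, the following Python sketch (not AMIGO2 code) builds equally spaced switching times in the same way as the linspace call above and evaluates a piecewise-constant input at a given time. The function name step_input, the 24-h duration, and the input levels are hypothetical choices for illustration.

```python
from bisect import bisect_right

def step_input(t, t_con, u):
    """Value at time t of a piecewise-constant input with switching times t_con."""
    if len(t_con) != len(u) + 1:
        raise ValueError("need one more switching time than steps")
    if not t_con[0] <= t <= t_con[-1]:
        raise ValueError("t lies outside the experiment")
    # Index of the step containing t; the final time belongs to the last step
    k = min(bisect_right(t_con, t) - 1, len(u) - 1)
    return u[k]

n_steps = 5
tf = 24.0  # hypothetical experiment duration in hours
# Analogue of linspace(0, tf, n_steps + 1): six switching times, five steps
t_con = [tf * k / n_steps for k in range(n_steps + 1)]
u = [0.0, 1000.0, 0.0, 500.0, 250.0]  # hypothetical levels within [0, 1000]

print(t_con)
print(step_input(5.0, t_con, u))
```

With n_steps elements there are n_steps + 1 switching times, including the initial and final time, so the optimizer only has to choose the n_steps input levels when the step durations are fixed.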
3.2.3 Definition of the Numerical Methods
The user selects the initial value problem solver and the optimizer.
AMIGO_DO also allows for successive refinements of the input, which
the user may activate.
% SIMULATION
%
inputs.ivpsol.ivpsolver='cvodes';            % IVP solver: 'cvodes'(default, C)|
                                             % 'ode15s'(default, MATLAB, sbml)|'ode113'|'ode45'
inputs.ivpsol.rtol=1.0D-7;                   % [] IVP solver integration tolerances
inputs.ivpsol.atol=1.0D-7;
% OPTIMIZATION
%
inputs.nlpsol.nlpsolver='local_fmincon';     % [] NLP solver:
% LOCAL: 'local_fmincon'|'local_n2fb'|'local_dn2fb'|'local_dhc'|
%        'local_ipopt'|'local_solnp'|'local_nomad'|
% MULTISTART: 'multi_fmincon'|'multi_n2fb'|'multi_dn2fb'|'multi_dhc'|
%        'multi_ipopt'|'multi_solnp'|'multi_nomad'|
% GLOBAL: 'de'|'sres'
% HYBRID: 'hyb_de_fmincon'|'hyb_de_n2fb'|'hyb_de_dn2fb'|'hyb_de_dhc'|
%        'hyp_de_ipopt'|'hyb_de_solnp'|'hyb_de_nomad'|
%        'hyb_sres_fmincon'|'hyb_sres_n2fb'|'hyb_sres_dn2fb'|'hyb_sres_dhc'|
%        'hyp_sres_ipopt'|'hyb_sres_solnp'|'hyb_sres_nomad'
% METAHEURISTICS:
%        'ess' or 'eSS' (default)
% Note that the corresponding defaults are in files:
% OPT_solvers\DE\de_options.m; OPT_solvers\SRES\sres_options.m;
% OPT_solvers\eSS_**\ess_options.m
%
inputs.nlpsol.reopt='off';                   % Reoptimization
inputs.nlpsol.reopt_local_solver='fmincon';  % Optimiser for reoptimization
inputs.nlpsol.n_reOpts=2;                    % Number of reoptimizations
3.2.4 Running the Code
The first step is to preprocess the model to generate the necessary
scripts: C code for model simulation and the objective function.
After preprocessing, AMIGO_DO can be run.
Fig. 5 Optimal experimental design for model selection. (a) The optimal IPTG profile corresponds to a pulse-
wise profile, starting from the absence of stimulation. Response time of model B to IPTG pulses is shorter than
the corresponding response time for model A. (b) The optimal IPTG profile starts from no stimulation for more
than half the experiment, and after that, the IPTG value increases to a final value of 152. The experiment lasts
the minimum allowed of 8 h. In both cases, models respond differently, thus being distinguishable
3.3 Optimal Experimental Design for Parameter Estimation in AMIGO2
AMIGO_OED offers the possibility of solving the optimal experimental
design problem for parameter estimation. The problem is
formulated as a dynamic optimization problem in which the objective
is to find the experimental scheme that minimizes a specific
functional of the Fisher information matrix subject to a given set of
constraints. The model is defined as in Subheading 3.1.
3.3.1 Definition of the Objective Functional
The toolbox predefines various OED problems. The most widely
used are as follows:
• D-optimum design corresponds to the maximization of the
determinant of the Fisher information matrix. This design is
3.3.2 Definition of the Optimization Problem
The user needs to define what is being designed: initial conditions,
stimuli condition, observation function, experiment duration, and
number and location of sampling times.
The inputs are shown below. In this particular example, we
assume that we design a single 24-h experiment and that the input
is a step-wise profile with five elements of fixed
duration.
%===========================================
% DEFINITION OF EXPERIMENT 1: TO BE DESIGNED
%===========================================
inputs.exps.exp_type{1}='od';
inputs.exps.n_obs{1}=1; % Number of observables
inputs.exps.obs_names{1}=char('CitAU'); % Name of observables
inputs.exps.obs{1}=char('CitAU=CitAUB'); % Observation function
3.3.3 Definition of the Numerical Methods
The evaluation of the Fisher information matrix requires the solution
of the model parametric sensitivities. The toolbox implements
several possibilities, including CVODES (for C code), a modification
of ode15s for sensitivity computation (for MATLAB models),
and a couple of finite-difference schemes, which may be used for C,
MATLAB, or black-box models.
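The link between sensitivities and the design criteria can be sketched in a few lines of plain Python. The example below is not the AMIGO2 implementation: it assumes a toy two-parameter model, constant-variance Gaussian noise, and central finite differences, and then compares the determinant (D criterion) and smallest eigenvalue (E criterion) of the Fisher information matrix for two hypothetical sampling schedules.

```python
import math

def model(t, p):
    """Toy output (an assumption, not the chapter's model): a*(1 - exp(-b*t))."""
    a, b = p
    return a * (1.0 - math.exp(-b * t))

def fd_sensitivities(t, p, rel_step=1e-6):
    """Central finite-difference estimate of dy/dp_i at time t."""
    sens = []
    for i in range(len(p)):
        h = rel_step * max(abs(p[i]), 1.0)
        up, dn = list(p), list(p)
        up[i] += h
        dn[i] -= h
        sens.append((model(t, up) - model(t, dn)) / (2.0 * h))
    return sens

def fisher_information(times, p, sigma):
    """FIM = sum over samples of s*s^T / sigma^2 (constant-variance noise)."""
    n = len(p)
    fim = [[0.0] * n for _ in range(n)]
    for t in times:
        s = fd_sensitivities(t, p)
        for i in range(n):
            for j in range(n):
                fim[i][j] += s[i] * s[j] / sigma ** 2
    return fim

def d_and_e_criteria(fim):
    """Determinant and smallest eigenvalue of a symmetric 2x2 FIM."""
    tr = fim[0][0] + fim[1][1]
    det = fim[0][0] * fim[1][1] - fim[0][1] * fim[1][0]
    lam_min = 0.5 * (tr - math.sqrt(max(tr * tr - 4.0 * det, 0.0)))
    return det, lam_min

p = [2.0, 0.5]                    # hypothetical nominal parameters
early = [0.1, 0.2, 0.3, 0.4]      # samples clustered at the start
spread = [0.5, 2.0, 5.0, 10.0]    # samples spread over the response
for name, times in (("early", early), ("spread", spread)):
    det, lam_min = d_and_e_criteria(fisher_information(times, p, sigma=0.1))
    print(name, det, lam_min)
```

In this toy case the clustered schedule gives nearly collinear sensitivities, so both criteria are much smaller than for the spread schedule; an optimal design searches for the experimental scheme that makes such criteria as favorable as possible.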
% SIMULATION
inputs.ivpsol.ivpsolver='cvodes';    % IVP solver: 'cvodes'(default, C)|
                                     % 'ode15s'(default, MATLAB, sbml)|
                                     % 'ode113'|'ode45'
inputs.ivpsol.senssolver='cvodes';   % Sensitivities solver: 'cvodes'(default, C)|
                                     % 'sensmat'(matlab)|
                                     % Finite differences: 'fdsens2'|'fdsens5'
inputs.ivpsol.rtol=1.0D-7;           % Solver integration tolerances
inputs.ivpsol.atol=1.0D-7;
Fig. 6 Optimal input profiles for parameter estimation. (a) Presents the optimal input to achieve maximum
information, that is, to maximize the determinant of the Fisher information matrix (Dopt). (b) Presents the
optimal input to achieve minimum correlation, that is, to maximize the minimum eigenvalue of the Fisher
information matrix (Eopt). The profiles are completely different: for Dopt, the optimum corresponds to a
pulse-wise profile starting from full stimulation, whereas for Eopt, it seems more convenient to use steps of different
magnitudes
3.3.4 Running the Code
The first step is to preprocess the model to generate the necessary
scripts: C code for model simulation and the objective function.
After preprocessing, AMIGO_OED can be run.
Acknowledgments
References
1. Jaqaman K, Danuser G (2006) Linking data to deterministic method. Ind Eng Chem Res 44
models: data regression. Nat Rev Mol Cell Biol (5):1514–1523
7(11):813–819 14. Rodriguez-Fernandez M, Mendes P, Banga JR
2. Balsa-Canto E, Alonso AA, Banga JR (2008) (2006) A hybrid approach for efficient and
Computational procedures for optimal experi- robust parameter estimation in biochemical
mental design in biological systems. IET Syst pathways. Biosystems 83(2–3):248–265
Biol 2(4):163–172 15. Villaverde AF, Fröhlich F, Weindl D,
3. Kreutz C, Timmer J (2009) Systems biology: Hasenauer J, Banga JR (2019) Benchmarking
experimental design. FEBS J 276(4):923–942 optimization methods for parameter estima-
4. Walter E, Pronzato L (1997) Identification of tion in large kinetic models. Bioinformatics 35
parametric models from experimental data. (5):830–838
Springer 16. Egea JA, Balsa-Canto E, Garcı́a M-SG, Banga
5. Quarteroni A, Sacco R, Saleri F (2000) JR (2009) Dynamic optimization of nonlinear
Numerical mathematics. Springer-Verlag, processes with an enhanced scatter search
New York method. Ind Eng Chem Res 48(9):4388–4401
6. Fletcher R (1987) Practical methods of optimi- 17. Egea JA, Martı́ R, Banga JR (2010) An evolu-
zation. Wiley, Chichester tionary method for complex-process optimiza-
7. Seber GAF, Wild CJ (1989) Nonlinear regres- tion. Comp Oper Res 37(2):315–324
sion. Wiley series in probability and mathemat- 18. Balsa-Canto E, Henriques D, Gabor A, Banga
ical statistics. Wiley, New York JR (2016) AMIGO2, a toolbox for dynamic
8. Schittkowski K (2002) Numerical data fitting modeling, optimization and control in systems
in dynamical systems. Kluwer, Dordrecht biology. Bioinformatics 32(21):3357–3359
9. Fröhlich F, Kaltenbacher B, Theis FJ, Hase- 19. Vassiliadis VS, Sargent RWH, Pantelides CC
nauer J (2017) Scalable parameter estimation (1994) Solution of a class of multi-stage
for genome-scale biochemical reaction net- dynamic optimization problems: 1, problems
works. PLoS Comp Biol 13(1):e1005331 without path constraints, 2, problems with
path constraints. Ind Eng Chem Res 33
10. Balsa-Canto E, Banga JR, Alonso AA, Vassilia- (2111–2122):2123–2133
dis VS (2002) Restricted second order infor-
mation for the solution of optimal control 20. Gnugge R, Dharmarajan L, Lang M, Stelling J
problems using control vector parameteriza- (2016) An orthogonal permease–inducer–re-
tion. J Proc Cont 12(2):243–255 pressor feedback loop shows bistability. ACS
Synth Biol 5:1098–1107
11. Lin Y, Stadtherr MA (2006) Deterministic
global optimization for parameter estimation 21. Bandiera L, Hou Z, Kothamachu V, Balsa-
of dynamic systems. Ind Eng Chem Res Canto E, Swain P, Menolascina F (2018)
45:8438–8448 On-line optimal input design increases the effi-
ciency and accuracy of the modelling of an
12. Polisetty P, Voit E, Gatzke E (2006) Identifica- inducible synthetic promoter. Processes 6
tion of metabolic system parameters using (9):148
global optimization methods. Theor Biol Med
Model 3:4 22. Storn R, Price K (1997) Differential evolution –
a simple and efficient heuristic for global opti-
13. Balsa-Canto E, Vassiliadis VS, Banga JR (2005) mization over continuous spaces. J Glob
Dynamic optimization of single- and multi- Optim 11:341–359
stage systems using a hybrid stochastic-
Chapter 12
Abstract
Synthetic biology has so far made limited use of mathematical models, mostly because their inference has
been traditionally perceived as expensive and/or difficult. We have recently demonstrated how in silico
simulations and in vitro/vivo experiments can be integrated to develop a cyber-physical platform that
automates model calibration and leads to saving 60–80% of the effort. In this book chapter, we illustrate the
protocol used to attain such results. By providing a comprehensive list of steps and pointing the reader to
the code we use to operate our platform, we aim at providing synthetic biologists with an additional tool to
accelerate the pace at which the field progresses toward applications.
Key words Synthetic biology, Mathematical modeling, System identification, Optimal experimental
design, Microfluidics
1 Introduction
Filippo Menolascina (ed.), Synthetic Gene Circuits: Methods and Protocols, Methods in Molecular Biology, vol. 2229,
https://doi.org/10.1007/978-1-0716-1032-9_12, © Springer Science+Business Media, LLC, part of Springer Nature 2021
242 Lucia Bandiera et al.
Fig. 1 Cyber-physical platform and test case used to illustrate its implementation. (a) In the cyber-physical
platform, the computer where the OED algorithm is implemented quantifies gene expression and uses
Parameter Estimation/OED to stimulate the cells with inputs that maximize the amount of information
extracted per experiment. Such input is translated in a stimulus for the cells in the microfluidic device
using a Hydrostatic Pressure Modulation System (HPMS, see [7]). A microscope is used to observe cells and
close the OED loop. (b) An inducible promoter in engineered yeast cells, presented in [2], is considered in the
following. (c) Ordinary differential equation model used to mathematically formalize the behavior of the
inducible promoter
Fig. 2 On-line vs off-line OED. In off-line OED (a) the input (red signal) is optimized before the beginning of the
experiment and then applied during the experiment while the output (green) is recorded. The experiment stops
at τH = τS when the data are gathered for a potential new iteration. At this point, the off-line and on-line
modes differ: in on-line OED (b) at τS < τH a new parameter estimation routine is run on the input/output data
acquired up until then (0 < t < τS). The resulting model ℳ(p₁) is used to design a new optimal input u₂* that
maximizes the information content of subexperiment 2 when it is administered to the cells
2 Materials
3 Methods
3.1 Structural Identifiability
1. In Matlab, create a .mat file with information on your model:
list the symbolic variables (syms) and specify the model states
(x), the output variables (h), the unknown parameters (p), the
dynamic equations (f), the vector of initial conditions (ics), the
known initial conditions (known_ics), and the inputs (u) (see
Note 1).
2. Open the file options.m and specify as modelname the string
given to the generated .mat file. If not already existent, create a
directory called results where the Structural Identifiability
results will be saved. If the complexity of the model requires
decomposition for the analysis to be run, additionally specify
the directory path for MEIGO software.
3. Select the desired identifiability options for the computation of
the generalized observability-identifiability matrix (see Note 2).
With the inducible promoter model example, the rank of the
matrix has been computed symbolically (value set to 0), the
states have not been replaced with known initial conditions
(value set to 0), identifiability of initial conditions and input
observability have been checked (value set to 1), finding of
identifiable combinations, checks for unidentifiability and
model decomposition have not been selected (value set to 0),
and maximum time allowed for computing 1 Lie derivative has
been set to 1000 s. To resolve structural identifiability issues,
the basal transcriptional rate α has been fixed (see Note 3). This
parameter is therefore defined in the vector of previously iden-
tified parameters (prev_ident_pars) since its value has been
fixed.
4. Start the Structural Identifiability analysis (see Note 4) by
running the script called STRIKE_GOLDD.m.
Fig. 3 Results of global sensitivity analysis on the inducible promoter model considering a multiexperiment
scheme composed of three random dynamic inputs. (a) Importance factors computed by AMIGO2 to quantify
global sensitivity considering a random initial guess for the parameter vector. Note that, while the output
sensitivity values depend on parameter estimates, the ranking of the kinetic rates remains conserved across
multiple runs. (b) Box plots, overlaid with swarm plots, of the importance factor $\delta_p^{msqr}$, computed for 30 random
initial guesses of the parameter vector, for each parameter of the model (p). Decreasing values of the
importance factor (from left to right) relate to a smaller sensitivity of the model output to the parameter
3.3 Practical Identifiability
1. Follow steps 1 and 2 from Subheading 3.2 to create the Matlab
structure inputs.exps, which will additionally be populated
with the experimental data. To define the type of data used,
specify real or pseudo as a string in .data_type. Then, introduce
the data (.exp_data) and their associated error (.error_data).
2. Follow steps 3 and 4 from Subheading 3.2.
Fig. 4 Results of the practical identifiability analysis. Example of joint plot of the
most (Kr) and least (kf) identifiable parameters in the inducible promoter model,
as selected from a comparison of the coefficient of variation of their estimates.
The marginal distributions were computed from the 95% confidence interval on
parameter estimates inferred on 600 in silico realizations of the experimental
data from a user-specified initial guess. The bivariate plot conveys information
about the correlation between the parameters (e.g., weak correlation in this
example)
until the puncher will break through the PDMS. Use the
puncher plunger to get rid of the PDMS core, lift the PDMS
layer, and carefully pull out the puncher from the hole while
rotating the puncher in a counterclockwise direction. Following
the steps above, punch all ports in all devices. Cover the
PDMS with magic tape again and, using a razor blade, isolate
single microfluidic chips following the grids.
7. Cleaning device ports. Insert a 25G needle in a short length of
tubing and connect the needle adapter to a 5-mL disposable
syringe filled with double distilled water. Insert the free extrem-
ity of the tubing in a port and apply pressure. Water should flow
through the port, removing PDMS debris. Repeat the outlined
procedure on all ports on both sides of the devices.
8. Bonding chips to coverslips. Warm up the plasma cleaner. After
15 min, run two cycles of vacuum (30 s), plasma (45 s), and
pressure release. The plasma cleaner is ready for use when a
bright pink plasma is visible in the chamber. Remove dust from
the device by covering it with magic tape and insert it in the
plasma cleaner, feature-side up. Using kimwipes, gently wipe
both sides of a high precision coverslip until it is completely free
of dust and insert it in the chamber of the plasma cleaner. Turn
on the vacuum (30 s) and apply the plasma (45 s). Turn off the
vacuum and gently release the pressure. Remove the device and
the coverslip from the plasma cleaner and quickly bond them
by letting the device fall on the coverslip from a 45° angle.
Application of a downward pressure, which could cause the
features to collapse, has to be avoided. Transfer the bonded
chip to a 60 °C oven for 15 min. Repeat the above steps for
each chip. Place the devices in a petri dish and store them at
room temperature.
3.4.2 Overnight Culture
1. On the day before the experiment, under the fume hood, pick
an isolated, average-size colony from an SC plate supplemented
with the appropriate sugar (e.g., 2% w/v glucose) and inoculate
it in a 20-mL test tube containing 5 mL of SC media
supplemented with sugar (e.g., 2% w/v glucose) and the highest
concentration of the chemical inducer to be used (e.g.,
1000 μM IPTG).
2. Grow the cell culture overnight in a shaking incubator at 30 °C,
230 rpm.
3. On the day of the experiment, measure the optical density
(OD600) of the cell culture. Dilute the cell culture to an
OD600 of ~0.1 in fresh media with the same composition as
that used for the overnight culture.
4. Grow in a shaking incubator at 30 °C, 230 rpm for 2–3 h or
until the cell culture reaches the middle of the exponential phase
(OD600 ∈ [0.3, 0.5]). In the meantime, proceed with the
following steps.
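The dilution in step 3 follows from the relation C1·V1 = C2·V2. A minimal Python sketch is given below; it assumes that OD600 scales linearly with cell density, and the measured OD600 of 4.0 and the 5-mL final volume are hypothetical values.

```python
def dilution_volumes(od_measured, od_target, final_volume_ml):
    """Culture and fresh-media volumes (mL) from C1*V1 = C2*V2."""
    if od_target > od_measured:
        raise ValueError("target OD600 exceeds the measured OD600")
    culture = od_target * final_volume_ml / od_measured
    return culture, final_volume_ml - culture

# Hypothetical overnight culture at OD600 = 4.0, diluted to OD600 = 0.1 in 5 mL
culture_ml, media_ml = dilution_volumes(od_measured=4.0, od_target=0.1,
                                        final_volume_ml=5.0)
print(f"{culture_ml:.3f} mL culture + {media_ml:.3f} mL fresh media")
```

Dense overnight cultures often fall outside the linear range of the spectrophotometer, so it may be advisable to measure a pre-diluted sample and correct for that dilution before applying the formula.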
A Cyber-Physical Platform for Model Calibration 251
3.4.4 Wetting the Microfluidic Chip
1. Secure the microfluidic device to the lid of a petri dish, acting as
a chip holder, on one side of the cover slip using paper tape.
Examine the quality of the device features at 10× magnification
to verify correct punching of the ports and absence of debris
obstructing the channels.
2. Apply pressure to the 5-mL syringe containing media until the
short length of tubing is free from air bubbles and droplets
appear at its free end.
3. Insert the free end of the tubing in port 5 and apply a gentle
pressure to enable media flow while preventing the chip from
being lifted off the cover slip. When a media droplet appears at
port 4, detach the tubing from port 5 by applying a counter
pressure and connect it to port 4 (see Note 10). Repeat the
above procedure for ports 3, 1, and 2.
4. Under the microscope, verify the absence of air bubbles in the
chip. If air bubbles are present, repeat the procedure above.
5. Using kimwipes, remove the excess of media on top of the
device.
Table 1
Content of the 50 mL syringes for the microfluidic experiment
Syringe identifier    Syringe content
1                     SC media complemented with the appropriate sugar, inducer, and fluorescent dye for a total volume of 10 mL (e.g., 8.73 mL SC media, 1 mL 20% glucose, 100 μL IPTG 0.1 M, and 170 μL Sulforhodamine B 1 mM)
2                     SC media complemented with the appropriate sugar for a total volume of 10 mL
3                     10 mL of SC media
4                     10 mL of SC media
5                     5 mL of SC media
6                     5 mL of cell culture
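The example recipe for syringe 1 can be checked with a short calculation. The Python sketch below uses only the volumes and stock concentrations listed in Table 1 and derives the total volume and the final concentration of each additive; the variable names are illustrative.

```python
# Volumes (mL) of the example recipe for syringe 1 in Table 1
components_ml = {
    "SC media": 8.73,
    "20% glucose": 1.0,
    "0.1 M IPTG": 0.1,
    "1 mM sulforhodamine B": 0.17,
}
total_ml = sum(components_ml.values())

def final_conc(stock_conc, volume_ml):
    """Final concentration after dilution into the total syringe volume."""
    return stock_conc * volume_ml / total_ml

glucose_pct = final_conc(20.0, components_ml["20% glucose"])          # % w/v
iptg_um = final_conc(0.1e6, components_ml["0.1 M IPTG"])              # 0.1 M = 100000 uM
dye_um = final_conc(1000.0, components_ml["1 mM sulforhodamine B"])   # 1 mM = 1000 uM

print(f"total volume: {total_ml:.2f} mL")
print(f"glucose: {glucose_pct:.2f} % w/v, IPTG: {iptg_um:.0f} uM, dye: {dye_um:.1f} uM")
```

The recipe adds up to 10 mL and gives 2% glucose and 1000 μM IPTG, matching the media composition used for the overnight culture, together with 17 μM sulforhodamine B.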
3.4.5 Connecting Syringes to the Chip
1. Remove the device from the petri dish and secure it to the
sample holder using electrical tape. Clean the lower side of the
cover slip using 70% EtOH and kimwipes.
2. At 10× magnification, re-examine the chip for absence of air
bubbles and debris in the ports, the channels, and the chamber.
Using the wetting syringe, cover the ports of the chip with
media.
3. Check for the absence of air bubbles in the lines and, operating
at the height of the stage, connect syringe 5 to its port. Proceed
by connecting syringes 4, 1, 2, and 3.
4. Verify that no air bubbles were introduced in the microfluidic
device during the procedure and secure the tubing to the stage
using paper tape.
3.4.6 Calibration of the Microfluidic Device
1. Select the microscope channels to be used (DIC and the channel
for the fluorescent dye used to track the inducing media,
e.g., sulforhodamine) and set the field of view at the DAW
junction of the microfluidic device (Fig. 5).
2. Specify the minimum (hmin) and maximum (hMax) heights of
the actuators, which generate an approximate mixing ratio of
0% (i.e., absence of fluorescent signal in the channel feeding the
chamber) and 100% (i.e., fluorescent signal detected across the
entire width of the main channel). From these, the average
height (hmean, mixing ratio of approximately 50%) can be
retrieved.
3. To enable an accurate, a posteriori estimate of the 0% and 100%
mixing ratios, heights that correspond to pressures that will
slightly overshoot the central channel of the DAW junction
should be considered. To this aim, we specify a range for the
3.4.7 Loading the Cells
1. Using a spectrophotometer, measure the OD600 of the cell
culture to verify whether it has reached the middle of the exponential phase.
2. Prepare the cell syringe, having ID 6 (see Subheading 3.4.3,
step 4) and attach it at a height above 23 cm from the microscope
stage.
3. Disconnect syringe 5 and connect syringe 6 to port 5. Move
syringe 4 to an upper position while keeping it below the cell
syringe.
4. Monitoring in live DIC at 60× magnification, verify cell flow.
Flicking the tubing of syringe 6 might help to perturb the
flow. To prevent premature clogging of the device, the initial
number of cells in the trap should be low, ideally below 10.
5. When satisfied with the number of cells in the trap, adjust the
syringes to the running position: gently lower syringe 4 to
23 cm above the stage, bring the cell syringe to the same
height, disconnect the cells reservoir from port 5, and plug in
syringe 5.
6. Verify the absence of air bubbles in the ports, channels, and the
chamber.
3.4.8 Microscope Setup
1. Using a kimwipe wetted with 70% EtOH, clean the 40× objective. Add
oil and set the focus.
2. With the help of the stage controller to navigate the device,
mark the position of the chamber and DAW junction.
3. Select the DIC and fluorescence channels (e.g., sulforhoda-
mine, citrine) to be acquired during the experiment. For each
of them, specify the exposure time.
4. Specify the sampling frequency and the number of acquired
images. These two fields determine the duration of the
experiment.
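The relation between the sampling settings and the experiment duration in step 4 can be sketched as follows. This Python snippet assumes that the first image is acquired at t = 0, so N images span N − 1 sampling intervals; the 5-min interval and the function names are hypothetical, and the acquisition software may count differently.

```python
def experiment_duration_h(sampling_interval_min, n_images):
    """Duration in hours, assuming the first image is taken at t = 0."""
    if n_images < 1:
        raise ValueError("at least one image is required")
    return sampling_interval_min * (n_images - 1) / 60.0

def images_needed(sampling_interval_min, duration_h):
    """Number of images covering duration_h, including the one at t = 0."""
    return int(round(duration_h * 60.0 / sampling_interval_min)) + 1

n = images_needed(sampling_interval_min=5.0, duration_h=24.0)
print(n, experiment_duration_h(5.0, n))
```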
3.5.1 Fine-Tuning of the Weights of the Convolutional Neural Network
1. Open manually annotated images in ImageJ (see Note 17).
2. Open the U-Net Job manager (Plugins → U-Net → Job Manager)
and select Fine-tuning.
3. Use pretrained weights, available in the U-Net example
2d_cell_net_v0.caffemodel.h5, as a starting point for transfer
learning (see Note 18).
4. Subdivide the set of annotated images into a training set (67%) and a
test set (33%). Both sets should contain representative samples
of the images to segment.
5. Specify the number of evaluations of the loss function used to
optimize the network weights. While 1.5 × 10^5 iterations are
normally enough, the number should be significantly increased
when the network is trained from scratch. In addition, set the
learning rate (1 × 10^-5) and the validation interval (150).
6. Specify the file name and path where the resultant weights will
be saved (see Note 19).
7. Untick the selection “labels are classes” (see Note 20).
8. Press OK to start the fine-tuning.
9. Statistics plots are generated in real time during training.
Among these, the Loss function and the Intersection Over
Union plots are the most informative (Fig. 6).
10. Once newly optimized weights are available, qualitatively check
the performances of the network on samples in the validation
set and manually annotated images not included in the tuning.
3.5.2 Image Segmentation
1. To isolate from the images the section corresponding to the cell
trap, crop all DIC (and fluorescence) images using a rectangular
region of interest located at the same coordinates.
2. In ImageJ, open the DIC image to be segmented.
3. Open the U-Net Job manager (Plugins → U-Net → Job Manager)
and select Segmentation.
Fig. 6 Statistics plots generated by U-Net [5] during fine-tuning. (a) Plot of the intersection over union metric
as a function of the number of iterations in fine-tuning. The metric, computed as the ratio between the
overlapping cell objects predicted from convolutional neural network and identified in the ground-truth (i.e.,
manually annotated images) and the cell area encompassed by both, quantifies the accuracy of cell detection.
A value above 0.5 suggests a good prediction. (b) Cross-entropy loss, computed on the training (gray line) and
validation (blue line) sets, as refinement of the network weights occurs. In this example, convergence to
optimal weights is achieved after a limited number of iterations
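For readers unfamiliar with the metric in panel (a), the sketch below computes a pixel-wise intersection over union between a predicted and a ground-truth binary mask in plain Python. It is a simplified illustration and not the U-Net plugin's own routine, which may evaluate the metric differently (e.g., per object); the two small masks are hypothetical.

```python
def intersection_over_union(pred, truth):
    """Pixel-wise IoU of two binary masks given as equally sized lists of rows."""
    inter = 0
    union = 0
    for row_p, row_t in zip(pred, truth):
        for p, t in zip(row_p, row_t):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Two empty masks agree perfectly by convention
    return inter / union if union else 1.0

truth = [[0, 1, 1, 0],
         [0, 1, 1, 0],
         [0, 0, 0, 0]]
pred = [[0, 1, 1, 1],
        [0, 1, 0, 0],
        [0, 0, 0, 0]]
print(intersection_over_union(pred, truth))
```

Here three pixels are shared and five are covered by either mask, giving a value of 0.6, above the 0.5 threshold mentioned in the caption.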
3.5.3 Cell-Tracking and Extraction of Fluorescence Time-Series
1. To identify single cells from the population in the binary image,
open each image as a matrix in Matlab and label connected
components by applying the bwlabel function. Then, save the
matrix as an image in TIFF format.
2. Open Lineage Mapper (Plugins → Tracking → Lineage Mapper).
3. Specify the path and file names for the images to be tracked as
well as the identifier of the directory and files in which the
results, i.e., masks with the tracking indexes, will be stored.
4. Populate the fields corresponding to the tracking parameters,
following the instructions provided by the plugin developers
[12] (see Note 22).
5. Press the tab “track.”
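The labelling in step 1 and the subsequent extraction of single-cell fluorescence can be illustrated with a small self-contained Python sketch. It is not the Matlab or Lineage Mapper code: it labels 8-connected regions (the default connectivity of bwlabel) by a breadth-first search and averages a fluorescence image over each labelled cell. Label numbers may be assigned in a different order than in Matlab, and the mask and intensity values are hypothetical.

```python
from collections import deque
from statistics import mean

def label_components(mask):
    """Label 8-connected foreground regions of a binary mask; returns (labels, count)."""
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not labels[r][c]:
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < rows and 0 <= nx < cols
                                    and mask[ny][nx] and not labels[ny][nx]):
                                labels[ny][nx] = count
                                queue.append((ny, nx))
    return labels, count

def per_cell_mean(labels, count, image):
    """Mean intensity of image over each labelled region (labels 1..count)."""
    sums = [0.0] * (count + 1)
    areas = [0] * (count + 1)
    for label_row, image_row in zip(labels, image):
        for lab, val in zip(label_row, image_row):
            sums[lab] += val
            areas[lab] += 1
    return [sums[k] / areas[k] for k in range(1, count + 1)]

mask = [[1, 1, 0, 0, 0],
        [1, 0, 0, 1, 1],
        [0, 0, 0, 1, 1]]
fluo = [[10, 12, 0, 0, 0],
        [14, 0, 0, 30, 30],
        [0, 0, 0, 34, 26]]
labels, n_cells = label_components(mask)
cells = per_cell_mean(labels, n_cells, fluo)
print(n_cells, cells, mean(cells))
```

Repeating this for every time point, with cell identities matched across frames by the tracker, yields the single-cell time series from which the population mean and standard deviation are computed.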
Fig. 7 Visual representation of the outcome of a microfluidic experiment in which the response of cells to a
random stepwise input (blue line) is measured in fluorescence microscopy. DIC images, together with the
associated binary mask, acquired at 0 and 24 h are shown (top panels). The mean fluorescence across the cell
population (black line) and its standard deviation (gray shaded area) are computed from single cell time series.
Representative single cell data are shown as yellow, pink, and purple lines. Note that the bottom panel reports
in silico data
3.6 Parameter Estimation
1. Follow steps 1 and 2 from Subheading 3.2 and step 1 from
Subheading 3.3 to create the inputs.exps Matlab structure.
Populate it with the experimental data (Fig. 8a, b) obtained
from Subheadings 3.4 and 3.5.
Fig. 8 Comparison of pseudo-experiments in which random (a, orange line) or optimally designed (b, cyan line)
inputs are used to gather data for parameter estimation. While aimed at exemplifying the outcome of OED and
PE in the cyber-physical platform, pseudo-data were here obtained by sampling the model output, in response
to the shown input, and adding 5% Gaussian noise. The green line represents the calibrated model response
to the data. (c) Distributions of the estimate of parameter γf, inferred with the two input profiles when
assuming a uniform prior, are compared to the true parameter value. The higher informative content of the
optimally designed input is reflected in the location (i.e., centered on the true value) and width of the
distribution
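The generation of pseudo-data described in the caption can be sketched as below. This Python snippet is an illustration rather than the authors' code: it interprets "5% Gaussian noise" as zero-mean noise with a standard deviation equal to 5% of the noise-free value, which is an assumption, and the model output used here is a hypothetical ramp.

```python
import random

def add_relative_noise(y, rel_sd=0.05, seed=0):
    """Add zero-mean Gaussian noise with standard deviation rel_sd*|y| to each sample."""
    rng = random.Random(seed)  # fixed seed so the pseudo-data are reproducible
    return [v + rng.gauss(0.0, rel_sd * abs(v)) for v in y]

# Hypothetical noise-free model output sampled at 50 time points
model_output = [100.0 + 10.0 * k for k in range(50)]
pseudo_data = add_relative_noise(model_output)
print(pseudo_data[:3])
```

Fixing the seed makes a single realization reproducible; varying it produces the repeated in silico realizations over which distributions of parameter estimates can be built.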
3.7 Optimal Experimental Design for Model Calibration
1. Follow step 1 in Subheading 3.2 to create the inputs Matlab
structure that contains the ODEs.
2. Create the inputs.exps Matlab structure as described in step
2 from Subheading 3.2. To specify the properties of the
4 Notes
References
1. Bandiera L, Hou Z, Kothamachu V, Balsa-Canto E, Swain P, Menolascina F (2018) On-line optimal input design increases the efficiency and accuracy of the modelling of an inducible synthetic promoter. Processes 6(9):148
2. Gnügge R, Dharmarajan L, Lang M, Stelling J (2016) An orthogonal Permease-inducer-repressor feedback loop shows bistability. ACS Synth Biol 5(10):1–29
3. Villaverde AF, Barreiro A, Papachristodoulou A (2016) Structural identifiability of dynamic systems biology models. PLoS Comput Biol 12(10):1–22
4. Balsa-Canto E, Henriques D, Gábor A, Banga JR (2016) AMIGO2, a toolbox for dynamic modeling, optimization and control in systems biology. Bioinformatics 32(21):3357–3359
5. Falk T et al (2019) U-net: deep learning for cell counting, detection, and morphometry. Nat Methods 16(1):67–70
6. Chalfoun J, Majurski M, Dima A, Halter M, Bhadriraju K, Brady M (2016) Lineage mapper: a versatile cell and particle tracker. Sci Rep 6:1–9
7. Ferry MS, Razinkov IA, Hasty J (2011) Microfluidics for synthetic biology, vol 497, 1st edn. Elsevier Inc., San Diego
8. Versari C et al (2017) Long-term tracking of budding yeast cells in brightfield microscopy: CellStar and the Evaluation Platform. J R Soc Interface 14:20160705
9. Dimopoulos S, Mayer CE, Rudolf F, Stelling J (2014) Accurate cell segmentation in microscopy images using membrane patterns. Bioinformatics 30(18):2644–2651
10. Bredies K, Wolinski H (2011) An active-contour based algorithm for the automated segmentation of dense yeast populations on transmission microscopy images. Comput Vis Sci 14(7):341–352
11. Bakker E, Swain PS, Crane MM (2018) Morphologically constrained and data informed cell segmentation of budding yeast. Bioinformatics 34(1):88–96
12. "Lineage Mapper User Guide." [Online]. https://github.com/USNISTGOV/Lineage-Mapper/wiki/User-Guide
13. Egea JA, Henriques D, Cokelaer T, Villaverde AF, Julio R (2014) MEIGOR: a software suite based on metaheuristics for global optimization in systems biology and bioinformatics. Continuous and mixed-integer problems: enhanced scatter search, pp. 1–33
14. Ligon TS, Fröhlich F, Chiş OT, Banga JR, Balsa-Canto E, Hasenauer J (2018) GenSSI 2.0: multi-experiment structural identifiability analysis of SBML models. Bioinformatics 34(8):1421–1423
15. Balsa-Canto E, Alonso AA, Banga JR (2010) An iterative identification procedure for dynamic modeling of biochemical networks. BMC Syst Biol 4:11
16. "AMIGO2 Documentation." [Online]. https://sites.google.com/site/amigo2toolbox/doc
Chapter 13
Abstract
Heterologous gene expression draws resources from host cells. These resources include vital components to
sustain growth and replication, and the resulting cellular burden is a widely recognized bottleneck in the
design of robust circuits. In this tutorial we discuss the use of computational models that integrate gene
circuits and the physiology of host cells. Through various use cases, we illustrate the power of host–circuit
models to predict the impact of design parameters on both burden and circuit functionality. Our approach
relies on a new generation of computational models for microbial growth that can flexibly accommodate
resource bottlenecks encountered in gene circuit design. Adoption of this modeling paradigm can facilitate
fast and robust design cycles in synthetic biology.
Key words Cellular burden, Growth models, Whole-cell modeling, Gene circuit design, Synthetic
biology, Resource allocation
1 Introduction
Filippo Menolascina (ed.), Synthetic Gene Circuits: Methods and Protocols, Methods in Molecular Biology, vol. 2229,
https://doi.org/10.1007/978-1-0716-1032-9_13, © Springer Science+Business Media, LLC, part of Springer Nature 2021
268 Evangelos-Marios Nikolados et al.
[Figure: caption not recovered. Surviving panel labels: promoters, RBS, host-circuit model, growth defects, resource usage, design space, circuit function, doubling time, expression, translation, protein, ribosomes, time, parameter 1, parameter 2]
2.1 Bacterial Growth Laws
Bacterial growth has been an active topic of study for many decades. The celebrated work of Nobel laureate Jacques Monod provided a key quantitative description for growth [43], based on the observation that bacteria in batch cultures exhibit several phases of growth:
• Lag phase: cells do not immediately start to grow after nutrient induction, as they first must adapt to the new environment; RNA and proteins are produced as the cell prepares for division.
• Exponential phase: cells duplicate at a constant rate, so that their number grows exponentially as $N(t) = N_0 2^{t/\tau}$, with $\tau$ being the average doubling time. Equivalently, the number of cells can be expressed as $N(t) = N_0 e^{\lambda t}$, where $\lambda = \log 2/\tau$ is the growth rate.
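The two exponential-phase expressions above are equivalent, which can be checked numerically. The following is a minimal Python sketch; the function names and the 30-minute doubling time are illustrative assumptions, not values from the chapter.

```python
import math

def growth_rate(doubling_time):
    """Growth rate lambda = ln(2) / tau, in inverse units of tau."""
    return math.log(2) / doubling_time

def cells_from_doubling(n0, t, doubling_time):
    """N(t) = N0 * 2**(t / tau)."""
    return n0 * 2 ** (t / doubling_time)

def cells_from_rate(n0, t, lam):
    """N(t) = N0 * exp(lambda * t)."""
    return n0 * math.exp(lam * t)

tau = 30.0  # hypothetical doubling time in minutes (assumption)
lam = growth_rate(tau)
for t in (0.0, 30.0, 60.0, 90.0):
    by_doubling = cells_from_doubling(100, t, tau)
    by_rate = cells_from_rate(100, t, lam)
    print(f"t={t:5.1f} min  doubling form: {by_doubling:8.2f}  rate form: {by_rate:8.2f}")
```

Both columns should agree up to floating-point rounding, since $2^{t/\tau} = e^{(\ln 2/\tau)\,t}$.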
Host-Circuit Modelling 271
Fig. 2 Mechanistic model for bacterial growth. The model predicts growth rate from the allocation of two cellular resources (energy and ribosomes) among the various processes that fuel growth and replication [35]
Table 1
Chemical reactions in the mechanistic growth model [35]
where the sum over $x$ is over all types of protein in the cell. Overall, energy is created by metabolizing $s_i$ and lost through translation and dilution by growth. The positive term in Eq. 8 determines energy yield per molecule of internalized nutrient from Eq. 4. The parameter $n_s$ describes the nutrient efficiency of the growth medium.
In rapidly growing E. coli, it is known that transcription has a minor role in energy consumption [52]. We therefore model transcription as an energy-dependent process, but with a negligible impact on the overall energy pool. If $w_{x,\max}$ denotes the maximal transcription rate, the effective transcription rate has the form
$$w_x = w_{x,\max}\,\frac{a}{\theta_x + a}, \qquad (9)$$
for all proteins except housekeeping ones, i.e., $x \in \{r, t, m\}$. We assume that the transcription of housekeeping mRNAs is subject to negative autoregulation so as to keep constant expression levels in various growth conditions:
$$w_q = w_{q,\max}\,\underbrace{\frac{a}{\theta_q + a}}_{\text{energy-dependent transcription}}\;\underbrace{\frac{1}{1 + (p_q/K_q)^{h_q}}}_{\text{negative autoregulation}}. \qquad (10)$$
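Equations 9 and 10 translate directly into two small functions. The sketch below is in Python; the function names are hypothetical and all numerical values are placeholders chosen for illustration, not the Table 2 parameters.

```python
def transcription_rate(a, w_max, theta):
    """Eq. 9: energy-dependent transcription rate w_x."""
    return w_max * a / (theta + a)

def housekeeping_transcription_rate(a, p_q, w_max, theta, K_q, h_q):
    """Eq. 10: Eq. 9 multiplied by a negative-autoregulation factor."""
    autoregulation = 1.0 / (1.0 + (p_q / K_q) ** h_q)
    return transcription_rate(a, w_max, theta) * autoregulation

# Placeholder values (assumptions, not taken from Table 2).
energy = 10.0
print("w_x:", transcription_rate(energy, w_max=100.0, theta=10.0))
for p_q in (0.0, 1000.0, 5000.0):
    w_q = housekeeping_transcription_rate(
        energy, p_q, w_max=100.0, theta=10.0, K_q=1000.0, h_q=4
    )
    print(f"p_q={p_q:7.1f}  w_q={w_q:.4f}")
```

As the housekeeping protein level $p_q$ rises above $K_q$, the autoregulation factor shrinks and $w_q$ falls, which is the mechanism that keeps housekeeping expression roughly constant.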
In Eqs. 9 and 10, the parameter $\theta_x$ denotes a transcriptional threshold, while $K_q$ and $h_q$ are regulatory parameters. The differential equations for the number of mRNAs ($m_x$) are therefore:
$$\dot{m}_x = w_x - (\lambda + d_m)\,m_x + v_x - k_b\,p_r\,m_x + k_u\,c_x, \qquad (11)$$
where $x \in \{r, t, m, q\}$. In Eq. 11, mRNAs are produced through transcription with rate $w_x$, while mRNAs are lost through dilution with rate $\lambda$ and degradation with rate $d_m$. At the same time, mRNAs bind and unbind with ribosomes, so that the ribosome–mRNA complexes ($c_x$) follow
$$\dot{c}_x = -\lambda\,c_x - v_x + k_b\,p_r\,m_x - k_u\,c_x, \qquad (12)$$
where $k_b$ and $k_u$ are the rate constants of binding and unbinding. Translation contributes with a positive term to Eq. 11 and a negative term to Eq. 12. The differential equations for protein abundance are therefore:
$$\dot{p}_x = v_x - \lambda\,p_x, \quad x \in \{t, m, q\}. \qquad (13)$$
We note that Eq. 13 applies to all proteins except free ribosomes. The equation for free ribosomes $p_r$ includes an additional term:
$$\dot{p}_r = v_r - \lambda\,p_r + \sum_{x \in \{r, t, m, q\}} \left(v_x - k_b\,p_r\,m_x + k_u\,c_x\right). \qquad (14)$$
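To make the bookkeeping in Eqs. 11 to 14 concrete, the right-hand sides can be written as one function. This is a minimal Python sketch under stated assumptions: the translation rates $v_x$ and the growth rate $\lambda$ are defined elsewhere in the chapter and are passed in as given inputs here, the function name is hypothetical, and every numerical value is an arbitrary placeholder.

```python
SPECIES = ("r", "t", "m", "q")  # index set of Eq. 11

def growth_model_rhs(m, c, p, w, v, lam, d_m, k_b, k_u):
    """Right-hand sides of Eqs. 11-14.

    m, c, p, w, v are dicts keyed by species; lam is the growth rate.
    Translation rates v and lam are treated as given inputs.
    """
    p_r = p["r"]  # free ribosomes
    dm, dc, dp = {}, {}, {}
    ribosome_release = 0.0
    for x in SPECIES:
        net_binding = k_b * p_r * m[x] - k_u * c[x]
        dm[x] = w[x] - (lam + d_m) * m[x] + v[x] - net_binding  # Eq. 11
        dc[x] = -lam * c[x] - v[x] + net_binding                # Eq. 12
        dp[x] = v[x] - lam * p[x]                               # Eq. 13
        ribosome_release += v[x] - net_binding
    dp["r"] += ribosome_release                                 # Eq. 14
    return dm, dc, dp

# Hypothetical state and parameter values, for illustration only.
m = {x: 10.0 for x in SPECIES}
c = {x: 10.0 for x in SPECIES}
p = {x: 100.0 for x in SPECIES}
w = {x: 5.0 for x in SPECIES}
v = {x: 2.0 for x in SPECIES}
lam = 0.02
dm, dc, dp = growth_model_rhs(m, c, p, w, v, lam, d_m=0.1, k_b=0.01, k_u=0.01)

# Consistency check: free plus bound ribosomes change only through
# ribosome synthesis (v_r) and dilution by growth.
total_ribosome_rate = dp["r"] + sum(dc.values())
print(total_ribosome_rate, v["r"] - lam * (p["r"] + sum(c.values())))
```

Summing Eq. 14 with Eq. 12 over all species cancels the binding and translation fluxes, so the two printed numbers should coincide; this is a quick way to check the signs of the reconstructed equations.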
Table 2
Model parameters for an Escherichia coli host, taken from [35]
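$$\dot{p}_r = v_r - \lambda\,p_r + \sum_{x} \left(v_x - k_b\,p_r\,m_x + k_u\,c_x\right) + \dots$$
$$\lambda = \frac{\gamma(a)}{M}\,\underbrace{\left(\sum_{x} c_x + \sum_{i} c^c_i\right)}_{\text{ribosomal complexes}}. \qquad (21)$$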
Fig. 3 Simulation of an inducible gene. (a) Steady state translation rates and ribosomal abundance predicted for the wild-type Escherichia coli model, parameterized as in Table 2. (b) Predicted steady state expression (number of molecules) of a heterologous gene for increasing induction strength (mRNAs/min). The pie charts indicate translation rates and ribosomal abundance as in the left panel. The inset shows the predicted growth rate, relative to the wild-type. The induction strength was modeled with the parameter $w_{\max,\mathrm{rep}}$ in Eq. 24. The binding rate constant was set equal to the dissociation rate constant, so that $k_{b,\mathrm{rep}} = 1 \times 10^{-2}$ min$^{-1}$ molecules$^{-1}$ and $k_{u,\mathrm{rep}} = 1 \times 10^{-2}$ min$^{-1}$. Transcript and protein half-lives were set to two and four minutes, respectively [5], so that $d_{m,\mathrm{rep}} = \ln 2/2$ min$^{-1}$ and $d_{p,\mathrm{rep}} = \ln 2/4$ min$^{-1}$
Fig. 4 Logic gates based on transcriptional regulators. (a) The NOT gate contains two genes connected in cascade. Repression of gene 2 inverts the input signal. (b) The AND gate contains three genes, in which two transcriptional activators jointly trigger the expression of a third output gene. (c) The NAND gate contains four genes and is the composition of an AND and a NOT gate. Circuit connectivities are based on the implementation by Wang et al. [61]
4.1 Host-Aware NOT Gate
The NOT gate contains two genes in cascade, where gene 1 codes for a transcriptional repressor that inhibits the expression of gene 2; the circuit diagram is shown in Fig. 4a. We first model the NOT gate in isolation using Eq. 25. We choose the regulatory functions $R_i$ as
$$R_1 = 1, \qquad R_2 = \frac{1}{1 + \left(p^c_1/K^c_1\right)^{h_1}}. \qquad (26)$$
The choice of $R_2$ models the inhibition of gene 2, and different inhibitory strengths and cooperativity effects can be described by suitable choices of the threshold $K^c_1$ and Hill coefficient $h_1$. We fix $K^c_1 = 250$ molecules and $h_1 = 2$.
As shown in Fig. 5a, the isolated model correctly predicts the expected circuit function: stronger induction of the input gene 1 gradually suppresses the expression of the output protein ($p^c_2$), and strong induction results in minimal output yield. In other words, the gate has high output only when the input signal is low, in effect acting as an inverter of the input signal.
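The inverting behavior follows directly from the shape of $R_2$. A minimal Python sketch, assuming the threshold and Hill coefficient quoted above; the function name and the sampled repressor levels are hypothetical.

```python
def not_gate_regulation(p1, K1=250.0, h1=2.0):
    """Eq. 26: R2 = 1 / (1 + (p1 / K1)**h1), with R1 = 1."""
    return 1.0 / (1.0 + (p1 / K1) ** h1)

# Repressor levels in molecules (illustrative sample points).
for p1 in (0.0, 25.0, 250.0, 2500.0):
    print(f"p1={p1:7.1f} molecules  R2={not_gate_regulation(p1):.4f}")
```

The regulatory function equals 1 without repressor, drops to one half when the repressor level reaches the threshold, and approaches zero for large repressor levels.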
To simulate the host-aware NOT gate, we follow the procedure
outlined in Subsection 3.1. The host-aware simulations shown in
Fig. 5b suggest that the function of the NOT gate remains largely
unaffected by host–circuit interactions. For intermediate input
levels, simulations predict an increase in growth rate of up to
50% with respect to a basal case. Such apparent growth benefit
is a consequence of the circuit architecture (Fig. 4a): an increase in
the input causes a stronger repression of gene 2 and thus relieves
the burden on the host. But since the expression of the repressor
coded by gene 1 also burdens the host, for high inputs the expres-
sion of gene 1 counteracts the growth advantages gained by repres-
sion of gene 2, resulting in an overall drop in growth rate.
Fig. 5 Host-aware simulation of a NOT gate. (a) Gate output predicted by a model isolated from the cellular host. Inset shows the Boolean truth table for the NOT gate. (b) Output and growth rate predictions from host-aware model of the NOT gate. Growth rate is normalized to a basal case
4.2 Host-Aware AND Gate
The AND gate comprises two genes that co-activate a third output gene (Fig. 4b). As built in the original implementation [61], the promoter for gene 3 is activated only when both the co-dependent enhancer-binding proteins, encoded by genes 1 and 2, are present in a heteromeric complex. Consequently, the regulatory functions for the AND gate are:
$$R_1 = 1, \qquad R_2 = 1, \qquad R_3 = \frac{\left(p^c_1/K^c_1\right)^{h_1}}{1 + \left(p^c_1/K^c_1\right)^{h_1}} \cdot \frac{\left(p^c_2/K^c_2\right)^{h_2}}{1 + \left(p^c_2/K^c_2\right)^{h_2}}, \qquad (27)$$
with $K^c_1 = 200$ molecules and $h_1 = 2.381$ for the activation by gene 1, and $K^c_2 = 3000$ molecules and $h_2 = 1.835$ for the activation by gene 2; these values are similar to the parameter values estimated in Wang et al. [61].
Simulations of the isolated model (Fig. 6a) show that, as expected, the gate has a high output only when both input signals are high. This agrees with the expected truth table of the AND gate, shown in the inset of Fig. 6a. In contrast, simulations of the host-aware model, shown in Fig. 6b, suggest a strong impact of host–circuit interactions. The host-aware model predicts a bell-shaped response surface, where the output reaches a maximal value for an intermediate level of the inputs, beyond which the output drops monotonically. Such loss-of-function coincides with a drop in growth rate observed for increased levels of either input, as seen in the right panel of Fig. 6b, and thus suggests a link between growth defects and poor circuit function.
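The regulatory function $R_3$ of Eq. 27 is a product of two activating Hill terms, so it is high only when both activators are abundant. A minimal Python sketch, assuming the parameter values quoted above; the function names and the sampled protein levels are hypothetical.

```python
def hill_activation(p, K, h):
    """Activating Hill term (p/K)**h / (1 + (p/K)**h)."""
    ratio = (p / K) ** h
    return ratio / (1.0 + ratio)

def and_gate_regulation(p1, p2, K1=200.0, h1=2.381, K2=3000.0, h2=1.835):
    """Eq. 27: R3 as the product of the two activation terms."""
    return hill_activation(p1, K1, h1) * hill_activation(p2, K2, h2)

# Low/high activator levels in molecules (illustrative sample points).
low, high = 1.0, 1.0e6
for p1 in (low, high):
    for p2 in (low, high):
        print(f"p1={p1:9.1f}  p2={p2:9.1f}  R3={and_gate_regulation(p1, p2):.4f}")
```

Only the high/high combination gives a value of $R_3$ close to 1, reproducing the AND truth table of the isolated model; the host-aware loss-of-function described above arises from resource coupling and is not captured by this static function.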
Fig. 6 Host-aware simulation of an AND gate. (a) Output predicted by a model isolated from the cellular host. Inset shows the Boolean truth table for the AND gate. (b) Output and growth rate predictions from host-aware model of the AND gate across the input space. Growth rate is normalized to the basal case in lower left corner of the heatmap
4.3 Host-Aware NAND Gate
The NAND gate is the negation of an AND gate, and thus produces a low output only when both inputs are high. As shown in Fig. 4c, the gate has four genes connected as the composition of an AND and NOT gate. As with the previous two cases, we simulate the isolated model using Eq. 25. The regulatory functions for the NAND gate are:
$$\begin{aligned}
R_1 &= 1,\\
R_2 &= 1,\\
R_3 &= \frac{\left(p^c_1/K^c_1\right)^{h_1}}{1 + \left(p^c_1/K^c_1\right)^{h_1}} \cdot \frac{\left(p^c_2/K^c_2\right)^{h_2}}{1 + \left(p^c_2/K^c_2\right)^{h_2}},\\
R_4 &= \frac{1}{1 + \left(p^c_3/K^c_3\right)^{h_3}},
\end{aligned} \qquad (28)$$
with parameter values for $R_3$ equal to those for $R_3$ of the AND gate in Eq. 27, and parameter values for $R_4$ equal to those of $R_2$ for the NOT gate in Eq. 26.
As shown in Fig. 7, simulations reveal substantially different
predictions between the isolated and host-aware models of the
NAND gate. The host-aware model predicts a complex relation
between inputs and output that differs from the ideal response
predicted by the isolated model. Host-aware simulations produce
the correct response across a range of the input space (Fig. 7b), but
display significant distortions possibly caused by the loss-of-func-
tion of the AND component shown in Fig. 6b. The impact of host–
circuit interactions can also be observed in the predicted growth
rate, which suggests a growth advantage for intermediate levels of
the inputs. This is a result of the architecture of the NOT gate, akin
to what we observed in Fig. 5b.
Fig. 7 Host-aware simulation of a NAND gate. (a) Output predicted by a model isolated from the cellular host. Inset shows the Boolean truth table for the NAND gate. (b) Output and growth rate predictions from host-aware model of the NAND gate across the input space. Growth rate is normalized to the basal case in lower left corner of the heatmap
4.4 Impact of Design Parameters on Circuit Function
In this final section, we conduct a series of simulations that mimic experiments commonly used in circuit design. These aim to explore the impact of design parameters and growth media on circuit function.
4.4.1 Ribosomal Binding Sites (RBS)
A number of studies have shown that RBS strength is a key modulator of cellular burden [21, 29–31]. Here we examine the impact of RBS strengths on the AND and NAND gates from the previous section. Using the notation in our model, see e.g. Eq. 16, we define the RBS strength as:
$$\mathrm{RBS}_i = \frac{k^c_{b,i}}{k^c_{u,i}}, \qquad (29)$$
where $k^c_{b,i}$ is the mRNA-ribosome binding rate constant (in units of min$^{-1}$ molecules$^{-1}$), and $k^c_{u,i}$ is their dissociation rate constant (in units of min$^{-1}$).
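As a worked instance of Eq. 29, the snippet below evaluates the RBS strength for three pairs of rate constants. It is a minimal Python sketch: the exponents are the three pairs listed in the caption of Fig. 8, read here as negative powers of ten (an assumption, since the signs are not legible in the extracted text), and the function name is hypothetical.

```python
def rbs_strength(k_b, k_u):
    """Eq. 29: ratio of binding to dissociation rate constants."""
    return k_b / k_u

# Exponents assumed negative; paired as (binding, dissociation).
binding_exponents = (-2.0, -1.5, -1.155)
dissociation_exponents = (-2.0, -2.5, -2.855)

strengths = [
    rbs_strength(10.0 ** b, 10.0 ** u)
    for b, u in zip(binding_exponents, dissociation_exponents)
]
for b, u, s in zip(binding_exponents, dissociation_exponents, strengths):
    print(f"k_b=10^{b}, k_u=10^{u}  ->  RBS strength = {s:.1f} molecules^-1")
```

Under this reading the three designs span RBS strengths of roughly 1, 10, and 50 molecules$^{-1}$, consistent with the "RBS X50" label that survives from the figure.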
We simulated the AND and NAND gates with variable RBS strengths and gene induction strengths. As shown in Fig. 8a (left), the AND gate retains its function for increasing RBS strength. We observe that for the same induction, designs with stronger RBS lead to increased circuit yield. At the same time, the simulations predict (Fig. 8a, left) a larger bell-shaped response surface, suggesting that increasing RBS strength yields a slightly larger design space in which the output can reach a larger maximal value for the same range of inputs. In all cases, however, after the output reaches a maximal value, we find a monotonic drop in circuit yield. The loss-of-function coincides with a drop in growth rate observed in all designs (Fig. 8a, right), which becomes more pronounced with stronger RBS.
As shown in Fig. 8b, the impact of RBS is more notable for the NAND gate. For designs with stronger RBS (insets Fig. 8b, left) but weak induction, the gate displays a behavior akin to that of the basal case. For intermediate induction, increasing RBS strength has more detrimental effects on the circuit’s function. Specifically, the NOT component fails to fully repress the AND component, thus distorting the region where the circuit is functional. A further increase in RBS strength greatly impairs the system, leading to near total loss-of-function across the entire response surface (insets Fig. 8b, left). Likewise, for stronger RBS and intermediate levels of the input, we observe loss of the growth advantage gained by the NOT gate component (Fig. 8b, right).
4.4.2 Nutrient Quality
Bacterial growth is known to depend critically on the quality of the growth media. As a final illustration of our approach, we used the host-aware models to explore the impact of media on the function of the transcriptional logic gates. We model the quality of the media
Fig. 8 Impact of ribosomal binding site (RBS) strength. (a) Output and growth rate predictions for the AND gate in Fig. 4b and three RBS strengths. (b) Output and growth rate predictions for the NAND gate in Fig. 4c. RBS strengths were computed from Eq. 29 by simultaneously increasing the binding rate constant $k^c_{b,i} \in \{10^{-2}, 10^{-1.5}, 10^{-1.155}\}$ and decreasing the dissociation rate constant $k^c_{u,i} \in \{10^{-2}, 10^{-2.5}, 10^{-2.855}\}$ in a pairwise manner for $i = 3$ (AND gate) and $i = 4$ (NAND gate). Gene induction strengths were varied in the range $10^0 \le w^c_{\max,i} \le 10^4$ mRNAs/min for $i = 1, 2$ in both gates, and fixed $w^c_{\max,3} = 375$ mRNAs/min for the AND gate, and $w^c_{\max,3} = 375$ mRNAs/min and $w^c_{\max,4} = 250$ mRNAs/min for NAND gate
Fig. 9 Impact of growth media on circuit function. (a) Simulations of the AND gate in Fig. 4b in various growth media. (b) Simulations of the NAND gate in Fig. 4c in various growth media. In both cases the nutrient quality parameter was set to $n_s \in \{0.2, 0.6, 1.0\}$; all other model parameters are identical to the simulations in Figs. 6 and 7b
5 Discussion
References
1. Andrianantoandro E, Basu S, Karig DK, Weiss R (2006) Synthetic biology: new engineering rules for an emerging discipline. Mol Syst Biol 2(1):2006.0028
2. Canton B, Labno A, Endy D (2008) Refinement and standardization of synthetic biological parts and devices. Nat Biotechnol 26(7):787
3. Ninfa AJ, Selinsky S, Perry N, Atkins S, Song QX, Mayo A, Arps D, Woolf P, Atkinson MR (2007) Using two-component systems and other bacterial regulatory factors for the fabrication of synthetic genetic devices. Methods Enzymol 422:488–512
4. Teo JJ, Woo SS, Sarpeshkar R (2015) Synthetic biology: a unifying view and review using analog circuits. IEEE Trans Biomed Circ Syst 9(4):453–474
5. Elowitz MB, Leibler S (2000) A synthetic oscillatory network of transcriptional regulators. Nature 403(6767):335
6. Hasty J, McMillen D, Collins JJ (2002) Engineered gene circuits. Nature 420(6912):224
7. Gardner TS, Cantor CR, Collins JJ (2000) Construction of a genetic toggle switch in Escherichia coli. Nature 403(6767):339
8. Tabor JJ, Salis HM, Simpson ZB, Chevalier AA, Levskaya A, Marcotte EM, Voigt CA, Ellington AD (2009) A synthetic genetic edge detection program. Cell 137(7):1272–1281
9. Mannan AA, Liu D, Zhang F, Oyarzún DA (2017) Fundamental design principles for transcription-factor-based metabolite biosensors. ACS Synth Biol 6:1851–1859
10. Oyarzún DA, Stan G-BV (2013) Synthetic gene circuits for metabolic control: design trade-offs and constraints. J R Soc Interf 10:20120671
11. Nielsen AA, Der BS, Shin J, Vaidyanathan P, Paralanov V, Strychalski EA, Ross D, Densmore D, Voigt CA (2016) Genetic circuit design automation. Science 352(6281):aac7341
12. Chaves M, Oyarzún DA (2019) Dynamics of complex feedback architectures in metabolic pathways. Automatica 99:323–332
13. Carbonell P, Radivojevic T, García Martín H (2019) Opportunities at the intersection of synthetic biology, machine learning, and automation. ACS Synth Biol 8:1474–1477
14. Hughes RA, Ellington AD (2017) Synthetic DNA synthesis and assembly: putting the synthetic in synthetic biology. Cold Spring Harbor Perspect Biol 9:a023812
15. Rondelez Y (2012) Competition for catalytic resources alters biological network dynamics. Phys Rev Lett 108(1):018102
16. Cardinale S, Arkin AP (2012) Contextualizing context for synthetic biology–identifying causes of failure of synthetic biological systems. Biotechnol J 7(7):856–866
17. Gyorgy A, Del Vecchio D (2014) Limitations and trade-offs in gene expression due to competition for shared cellular resources. In: 2014 IEEE 53rd Annual Conference on Decision and Control (CDC), pp. 5431–5436. IEEE, New York (2014)
18. Mather WH, Hasty J, Tsimring LS, Williams RJ (2013) Translational cross talk in gene networks. Biophys J 104(11):2564–2572
19. Scott M, Gunderson CW, Mateescu EM, Zhang Z, Hwa T (2010) Interdependence of cell growth and gene expression: origins and consequences. Science 330(6007):1099–1102
20. Tan C, Marguet P, You L (2009) Emergent bistability by a growth-modulating positive feedback circuit. Nat Chem Biol 5(11):842
21. Ceroni F, Algar R, Stan G-B, Ellis T (2015) Quantifying cellular capacity identifies gene expression designs with reduced burden. Nat Methods 12(5):415
22. An W, Chin JW (2009) Synthesis of orthogonal transcription-translation networks. Proc Natl Acad Sci
23. Segall-Shapiro TH, Meyer AJ, Ellington AD, Sontag ED, Voigt CA (2014) A resource allocator for transcription based on a highly fragmented T7 RNA polymerase. Mol Syst Biol 10(7):742
24. Pasini M, Fernández-Castané A, Jaramillo A, de Mas C, Caminal G, Ferrer P (2016) Using promoter libraries to reduce metabolic burden due to plasmid-encoded proteins in recombinant Escherichia coli. New Biotechnol 33(1):78–90
25. Shopera T, He L, Oyetunde T, Tang YJ, Moon TS (2017) Decoupling resource-coupled gene expression in living cells. ACS Synth Biol 6(8):1596–1604
26. Darlington APS, Kim J, Jiménez JI, Bates DG (2018) Dynamic allocation of orthogonal ribosomes facilitates uncoupling of co-expressed genes. Nat Commun 9:695
27. Rugbjerg P, Sarup-Lytzen K, Nagy M, Sommer MOA (2018) Synthetic addiction extends the productive life time of engineered Escherichia coli populations. Proc Natl Acad Sci 115(10):2347–2352
28. Ceroni F, Boo A, Furini S, Gorochowski TE, Borkowski O, Ladak YN, Awan AR, Gilbert C, Stan G-B, Ellis T (2018) Burden-driven feedback control of gene expression. Nat Methods 15(5):387
29. Gyorgy A, Jiménez JI, Yazbek J, Huang H-H, Chung H, Weiss R, Del Vecchio D (2015) Isocost lines describe the cellular economy of genetic circuits. Biophys J 109(3):639–646
30. Carbonell-Ballestero M, Garcia-Ramallo E, Montañez R, Rodriguez-Caso C, Macía J (2015) Dealing with the genetic load in bacterial synthetic biology circuits: convergences with the Ohm’s law. Nucleic Acids Res 44(1):496–507
31. Gorochowski TE, Avcilar-Kucukgoze I, Bovenberg RA, Roubos JA, Ignatova Z (2016) A minimal model of ribosome allocation dynamics captures trade-offs in expression between endogenous and synthetic genes. ACS Synth Biol 5(7):710–720
32. Karr JR, Sanghvi JC, Macklin DN, Gutschow MV, Jacobs JM, Bolival Jr B, Assad-Garcia N, Glass JI, Covert MW (2012) A whole-cell computational model predicts phenotype from genotype. Cell 150(2):389–401
33. Purcell O, Jain B, Karr JR, Covert MW, Lu TK (2013) Towards a whole-cell modeling approach for synthetic biology. Chaos 23(2):025112
34. Klumpp S, Zhang Z, Hwa T (2009) Growth rate-dependent global effects on gene expression in bacteria. Cell 139:1366–1375
35. Weiße AY, Oyarzún DA, Danos V, Swain PS (2015) Mechanistic links between cellular trade-offs, gene expression, and growth. Proc Natl Acad Sci 112(9):E1038–E1047
36. Liao C, Blanchard AE, Lu T (2017) An integrative circuit–host modelling framework for predicting synthetic gene network behaviours. Nat Microbiol 2(12):1658
37. Thomas P, Terradot G, Danos V, Weiße AY (2018) Sources, propagation and consequences of stochasticity in cellular growth. Nat Commun 9(1):1–11
38. Nikolados E-M, Weiße AY, Ceroni F, Oyarzún DA (2019) Growth defects and loss-of-function in synthetic gene circuits. ACS Synth Biol 8(6):1231–1240
39. O’Brien EJ, Lerman JA, Chang RL, Hyduke DR, Palsson B (2013) Genome-scale models of metabolism and gene expression extend and refine growth phenotype prediction. Mol Syst Biol 9:693
40. Carrera J, Covert MW (2015) Why build whole-cell models? Trends Cell Biol 25(12):719–722
41. Karr JR, Takahashi K, Funahashi A (2015) The principles of whole-cell modeling. Curr Opin Microbiol 27:18–24
42. O’Brien EJ, Monk JM, Palsson BO (2015) Using genome-scale models to predict biological capabilities. Cell 161(5):971–987
43. Monod J (1949) The growth of bacterial cultures. Ann Rev Microbiol 3(1):371–394
44. Schaechter M, Maaløe O, Kjeldgaard NO (1958) Dependency on medium and temperature of cell size and chemical composition during balanced growth of Salmonella typhimurium. Microbiology 19(3):592–606
45. Neidhardt FC, Magasanik B (1960) Studies on the role of ribonucleic acid in the growth of bacteria. Biochim Biophys Acta 42:99–116
46. Dennis PP, Ehrenberg M, Bremer H (2004) Control of rRNA synthesis in Escherichia coli: a systems biology approach. Microbiol Mol Biol Rev 68(4):639–668
47. Maaløe O (1979) Regulation of the protein-synthesizing machinery—ribosomes, tRNA, factors, and so on. In: Biological Regulation and Development, pp. 487–542. Springer, New York (1979)
48. Bremer H, Dennis PP, et al (1996) Modulation of chemical composition and other parameters of the cell by growth rate. EcoSal Cell Mol Biol 2(2):1553–1569
49. Maitra A, Dill KA (2015) Bacterial growth laws reflect the evolutionary importance of energy efficiency. Proc Natl Acad Sci 112(2):406–411
50. Bosdriesz E, Molenaar D, Teusink B, Bruggeman FJ (2015) How fast-growing bacteria robustly tune their ribosome concentration to approximate growth-rate maximization. FEBS J 282(10):2029–2044
51. Molenaar D, Van Berlo R, De Ridder D, Teusink B (2009) Shifts in growth strategies reflect tradeoffs in cellular economics. Mol Syst Biol 5(1):323
52. Russell JB, Cook GM (1995) Energetics of bacterial growth: balance of anabolic and catabolic reactions. Microbiol Mol Biol Rev 59(1):48–62
53. McGinness KE, Baker TA, Sauer RT (2006) Engineering controllable protein degradation. Mol Cell 22(5):701–707
54. Vind J, Sørensen MA, Rasmussen MD, Pedersen S (1993) Synthesis of proteins in Escherichia coli is limited by the concentration of free ribosomes: expression from reporter genes does not always reflect functional mRNA levels. J Mol Biol 231(3):678–688
55. Dong H, Nilsson L, Kurland CG (1995) Gratuitous overexpression of genes in Escherichia coli leads to growth inhibition and ribosome destruction. J Bacteriol 177(6):1497–1504
56. Lim WA (2010) Designing customized cell signalling circuits. Nat Rev Mol Cell Biol 11(6):393
57. Khalil AS, Collins JJ (2010) Synthetic biology: applications come of age. Nat Rev Genet 11(5):367
58. Joshi N, Wang X, Montgomery L, Elfick A, French C (2009) Novel approaches to biosensors for detection of arsenic in drinking water. Desalination 248(1–3):517–523
59. Paitan Y, Biran I, Shechter N, Biran D, Rishpon J, Ron EZ (2004) Monitoring aromatic hydrocarbons by whole cell electrochemical biosensors. Anal Biochem 335(2):175–183
60. Saeidi N, Wong CK, Lo T-M, Nguyen HX, Ling H, Leong SSJ, Poh CL, Chang MW (2011) Engineering microbes to sense and eradicate Pseudomonas aeruginosa, a human pathogen. Mol Syst Biol 7(1):521
61. Wang B, Kitney RI, Joly N, Buck M (2011) Engineering modular and orthogonal genetic logic gates for robust digital-like synthetic biology. Nat Commun 2:508
62. Hartline CJ, Mannan AA, Liu D, Zhang F, Oyarzún DA (2020) Metabolite sequestration enables rapid recovery from fatty acid depletion in Escherichia coli. mBio 11:e03112–e03119
63. Cambray G, Guimaraes JC, Arkin AP (2018) Evaluation of 244,000 synthetic sequences reveals design principles to optimize translation in Escherichia coli. Nat Biotechnol 36(10):1005
64. Borkowski O, Bricio C, Murgiano M, Rothschild-Mancinelli B, Stan GB, Ellis T (2018) Cell-free prediction of protein expression costs for growing cells. Nat Commun 9(1):1457
65. Liu D, Mannan AA, Han Y, Oyarzún DA, Zhang F (2018) Dynamic metabolic control: towards precision engineering of metabolism. J Ind Microbiol Biotechnol 45:535–543
Chapter 14
Abstract
One of the fundamental properties of engineered large-scale complex systems is modularity. In synthetic
biology, genetic parts exhibit context-dependent behavior. Here, we describe and quantify a major source of
such behavior: retroactivity. In particular, we provide a step-by-step guide for characterizing retroactivity to
restore the modular description of genetic modules. Additionally, we also discuss how retroactivity can be
leveraged to quantify and maximize robustness to perturbations due to interconnection of genetic modules.
Key words Retroactivity, Gene transcription networks, Modularity, Synthetic biology, Context-
dependence, Model order reduction, Loading
1 Introduction
Filippo Menolascina (ed.), Synthetic Gene Circuits: Methods and Protocols, Methods in Molecular Biology, vol. 2229,
https://doi.org/10.1007/978-1-0716-1032-9_14, © Springer Science+Business Media, LLC, part of Springer Nature 2021
294 Andras Gyorgy
Fig. 1 Experimental demonstration of retroactivity, adapted from [43]. Upon addition of DOX, rtTa binds to the
promoter pTET, expressing SKN7m, which then triggers GFP production in the output module
2 Materials
3 Methods
3.2 Step 2: Internal Retroactivity
Here we derive the reduced order model of the isolated dynamics of a module when the module has no inputs.
1. The binary matrix $V_i$ has as many columns as the number of TFs in the module, and as many rows as the number of parents of $x_i$, such that its $(j, k)$ element is 1 if the $j$th parent of $x_i$ is $x_k$, otherwise the entry is zero. That is, an entry in the following matrix
$$V_i = \bordermatrix{ & x_1 & x_2 & \cdots \cr p_{i,1} & & & \cr p_{i,2} & & & \cr \vdots & & & }$$
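The construction of the binary matrix $V_i$ can be sketched in a few lines. This is a minimal Python illustration under stated assumptions: the function name, the TF labels, and the example parent list are all hypothetical and only serve to show the indexing rule described in step 1.

```python
def parent_selection_matrix(tf_names, parents):
    """Binary matrix with one row per parent of x_i and one column per TF.

    Entry (j, k) is 1 if the j-th parent of x_i is the k-th TF, else 0.
    """
    return [[1 if parent == tf else 0 for tf in tf_names] for parent in parents]

tfs = ["x1", "x2", "x3"]       # hypothetical module with three TFs
parents_of_xi = ["x3", "x1"]   # hypothetical ordered list of parents of x_i
V_i = parent_selection_matrix(tfs, parents_of_xi)
for parent, row in zip(parents_of_xi, V_i):
    print(parent, row)
```

Each row contains exactly one nonzero entry, located in the column of the TF that acts as the corresponding parent.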
3.3 Step 3: External Retroactivity
Here, we extend the reduced order model in Eq. 13 to the case in which the module has external TFs as inputs.
3.4 Step 4: Scaling and Mixing Retroactivity
Next, consider the interconnection of the module together with its context.
1. The binary matrix $U$ has as many rows as the number of inputs of the module, and as many columns as the number of TFs in the context, such that its $(j, k)$ element is 1 if the $j$th input of the module is the $k$th internal TF of the context ($u_j = \bar{x}_k$), otherwise the entry is zero. That is, an entry in the following matrix
$$U = \bordermatrix{ & \bar{x}_1 & \bar{x}_2 & \cdots \cr u_1 & & & \cr u_2 & & & \cr \vdots & & & }$$
3.5 Step 5: Error Due Here, we provide three distinct ways to quantify the measure of
to Retroactivity disturbance on the module dynamics due to retroactivity from its
context when parameter values are known (see Note 5). For sim-
¼ 0. Let
plicity, we focus on the case when M
x_ ¼ f ðx, u, uÞ
_ ð25Þ
denote the dynamics of the module in isolation from 17. Once the
module is connected to its context, its dynamics change according
to
1
x_ ¼ ½I þ ðI þ RÞ1 S f ðx, u, u_ Þ ð26Þ
from 24. Let x ðt Þ and x~ðt Þ denote the solution of 25 and 26,
respectively, with identical initial conditions.
1. Introduce
$$\mu(x, u) := \left\| \left[I + (I + R)^{-1} S\right]^{-1} - I \right\|_2. \tag{27}$$
2. If they exist, define $\hat{l}$, $\hat{f}$, and $\hat{\mu}$ such that (i) $f(x, u, \dot{u})$ has Lipschitz constant $\hat{l}$, (ii) $\| f(x, u, \dot{u}) \|_2 \le \hat{f}$, and (iii) $\mu(x, u) \le \hat{\mu}$.
3. Let $\sigma_{\min}(I + R)$ denote the smallest singular value of $(I + R)$, and similarly, let $\sigma_{\max}(S)$ stand for the greatest singular value of $S$, and define
$$\hat{\mu} = \max_{x, \bar{x}} \frac{\sigma_{\max}(S)}{\sigma_{\min}(I + R) - \sigma_{\max}(S)},$$
provided that $\sigma_{\max}(S) < \sigma_{\min}(I + R)$.
4. If the system (25) is contracting [40] with rate $\lambda > 0$ and metric transformation $\Theta(x, t)$, then denote by $\kappa(x, t)$ the condition number of $\Theta(x, t)$, and let $\hat{\kappa} \ge 0$ be such that $\hat{\kappa} \ge \kappa(x, t)$.
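Milestone 4: The change in dynamics of a module due to retroactivity from its context is bounded according to
$$\frac{\| f(x, u, \dot{u}) - \tilde{f}(x, u, \dot{u}) \|_2}{\| f(x, u, \dot{u}) \|_2} \le \mu(x, u). \tag{28}$$
Similarly, the difference between trajectories of (25) and (26) is bounded as
$$\| x(t) - \tilde{x}(t) \|_2 \le \frac{\hat{\mu} \hat{f}}{\hat{l}} \left[ e^{\hat{l} t} - 1 \right],$$
and also by
$$\| x(t) - \tilde{x}(t) \|_2 \le \frac{\hat{\mu} \hat{f} \hat{\kappa}}{\lambda}.$$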
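The quantities of Step 5 are straightforward to evaluate numerically once R and S are known at a given state. The following NumPy sketch is illustrative only: the function names and the example matrices are assumptions, and `mu_upper` evaluates the singular-value ratio at one fixed pair (R, S) rather than maximizing it over all states as the definition of the bound requires.

```python
import numpy as np

def mu(R, S):
    """Spectral norm of [I + (I+R)^{-1} S]^{-1} - I, i.e. the quantity in (27)."""
    I = np.eye(R.shape[0])
    A = np.linalg.solve(I + R, S)              # (I + R)^{-1} S without explicit inversion
    return np.linalg.norm(np.linalg.inv(I + A) - I, 2)

def mu_upper(R, S):
    """sigma_max(S) / (sigma_min(I+R) - sigma_max(S)) at one fixed (R, S)."""
    I = np.eye(R.shape[0])
    s_max = np.linalg.svd(S, compute_uv=False).max()
    s_min = np.linalg.svd(I + R, compute_uv=False).min()
    if s_max >= s_min:
        raise ValueError("requires sigma_max(S) < sigma_min(I + R)")
    return s_max / (s_min - s_max)

def lipschitz_bound(mu_hat, f_hat, l_hat, t):
    """Trajectory bound mu_hat * f_hat / l_hat * (exp(l_hat * t) - 1)."""
    return mu_hat * f_hat / l_hat * (np.exp(l_hat * t) - 1.0)

def contraction_bound(mu_hat, f_hat, kappa_hat, lam):
    """Time-independent trajectory bound mu_hat * f_hat * kappa_hat / lambda."""
    return mu_hat * f_hat * kappa_hat / lam

# Hypothetical retroactivity matrices, chosen only to exercise the functions.
R = np.diag([3.0, 1.0])
S = np.array([[1.0, 0.2], [0.0, 0.5]])
print(mu(R, S), mu_upper(R, S))
```

Note that the Lipschitz-based bound grows exponentially in time, whereas the contraction-based bound is uniform in time, which is why the latter is preferable whenever contraction can be established.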
3.6 Illustrating the Effects of Intermodular Connections
To illustrate both the steps detailed above and the effect of intermodular connections on the dynamics of interconnected modules, we consider first a naturally recurring network motif, then a commonly used synthetic genetic module.
Example 1: Single-Input Motif: The single-input motif in Fig. 2a is a recurrent motif in gene transcription networks [31, 57]. Here, we show that the dynamic performance (speed) of the module and its robustness to interconnection with its context are not independent, and that this trade-off can be analyzed by focusing on the interplay between the internal retroactivity $R$ of the module and the scaling retroactivity $S$ of the context. Let $\dot{x}_1 = f(x_1)$ denote the isolated dynamics of the module from (7). Furthermore, we have $D_i = 1$ for $i = 1, 2, \ldots, l$ and $U = 1$, so that $S(x_1) = \sum_{i=1}^{l} R_i(x_1)$ by (20), where $R_i(x_1)$ is the retroactivity of TF $x_i$ in the context. According to (24), the dynamics of the module upon interconnection modify to
$$\dot{x}_1 = \underbrace{\frac{1 + R(x_1)}{1 + R(x_1) + S(x_1)}}_{\text{effect of the context}} f(x_1) = \underbrace{\left[1 - \mu(x_1)\right]}_{\text{effect of the context}} f(x_1),$$
Fig. 2 (a) Single input motif. (b) The response time increases with the load. (c)
High internal retroactivity counteracts the effect of loading
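The trade-off summarized in Fig. 2 can be made concrete with the scalar factor that multiplies the isolated dynamics of the single-input motif. The sketch below treats R and S as constants, which is a simplifying assumption for illustration (in the chapter both depend on the state); the function names are hypothetical.

```python
def context_scaling(R, S):
    """Factor (1 + R) / (1 + R + S) multiplying f(x1) after interconnection."""
    return (1.0 + R) / (1.0 + R + S)

def mu_scalar(R, S):
    """Relative change in the vector field, mu = S / (1 + R + S)."""
    return 1.0 - context_scaling(R, S)

# Larger load S slows the module; larger internal retroactivity R counteracts it.
for R in (0.0, 9.0):
    for S in (0.0, 1.0, 5.0):
        print(f"R={R:4.1f}  S={S:4.1f}  scaling={context_scaling(R, S):.3f}  "
              f"mu={mu_scalar(R, S):.3f}")
```

With R = 0, a load of S = 1 halves the speed of the module, whereas with R = 9 the same load changes the dynamics by less than 10%. The price is that a module with high internal retroactivity is itself slower in isolation.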
Fig. 3 (a) AR-clock. (b) AR-clock with load. (c) Neglecting retroactivity, the isolated AR-clock displays
sustained oscillations. (d) When internal retroactivity is accounted for, oscillations are quenched. (e) Oscilla-
tions can be restored by loading the repressor, thus increasing the scaling retroactivity
A Step-by-Step Guide for Retroactivity in Gene Networks 307
4 Notes
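Table 1
Retroactivity $R_i$ of a node for the most common binding types

Binding type    $R_i$

Single parent    $\eta_i \dfrac{n^2 y^{n-1}/k_y}{\left(1 + y^n/k_y\right)^2}$

Independent    $\begin{bmatrix} \eta_i \dfrac{n^2 y^{n-1}/k_y}{\left(1 + y^n/k_y\right)^2} & 0 \\ 0 & \eta_i \dfrac{m^2 z^{m-1}/k_z}{\left(1 + z^m/k_z\right)^2} \end{bmatrix}$

Competitive    $\dfrac{\eta_i}{\left(1 + \dfrac{y^n}{k_y} + \dfrac{z^m}{k_z}\right)^2} \begin{bmatrix} \dfrac{n^2 y^{n-1}}{k_y} \dfrac{k_z + z^m}{k_z} & -\dfrac{n y^n}{k_y} \dfrac{m z^{m-1}}{k_z} \\ -\dfrac{n y^{n-1}}{k_y} \dfrac{m z^m}{k_z} & \dfrac{m^2 z^{m-1}}{k_z} \dfrac{k_y + y^n}{k_y} \end{bmatrix}$

Cooperative    $\dfrac{\eta_i}{\left(1 + \dfrac{y^n}{k_y} + \dfrac{y^n}{k_y} \dfrac{z^m}{k_z}\right)^2} \begin{bmatrix} \dfrac{n^2 y^{n-1}}{k_y} \dfrac{k_z + z^m}{k_z} & \dfrac{n y^n}{k_y} \dfrac{m z^{m-1}}{k_z} \\ \dfrac{n y^{n-1}}{k_y} \dfrac{m z^m}{k_z} & \dfrac{m^2 z^{m-1}}{k_z} \dfrac{y^n}{k_y} \dfrac{k_y + y^n}{k_y} \end{bmatrix}$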
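A quick numerical sanity check of retroactivity expressions of this kind is to compare them with a finite-difference derivative of the amount of parent sequestered at the promoter. The sketch below does this for competitive binding. It rests on two assumptions that are not stated in this excerpt: that the binding terms take the form y^n/k_y and z^m/k_z, and that each y-bound promoter complex sequesters n molecules of y. All parameter values are arbitrary.

```python
def bound_y_competitive(y, z, eta, n, m, ky, kz):
    """Amount of parent y sequestered at the promoter (n molecules per complex)."""
    a, b = y**n / ky, z**m / kz
    return eta * n * a / (1.0 + a + b)

def R11_competitive(y, z, eta, n, m, ky, kz):
    """Closed-form d(bound y)/dy under competitive binding."""
    a, b = y**n / ky, z**m / kz
    return eta * n**2 * y**(n - 1) / ky * (1.0 + b) / (1.0 + a + b) ** 2

def R12_competitive(y, z, eta, n, m, ky, kz):
    """Closed-form d(bound y)/dz under competitive binding (negative: z displaces y)."""
    a, b = y**n / ky, z**m / kz
    return -eta * n * m * y**n * z**(m - 1) / (ky * kz) / (1.0 + a + b) ** 2

p = dict(eta=1.0, n=2, m=3, ky=4.0, kz=2.0)   # arbitrary illustrative parameters
y0, z0, h = 2.0, 1.5, 1e-6

# Central finite differences of the sequestered amount.
fd_y = (bound_y_competitive(y0 + h, z0, **p) - bound_y_competitive(y0 - h, z0, **p)) / (2 * h)
fd_z = (bound_y_competitive(y0, z0 + h, **p) - bound_y_competitive(y0, z0 - h, **p)) / (2 * h)
print(fd_y, R11_competitive(y0, z0, **p))
print(fd_z, R12_competitive(y0, z0, **p))
```

Setting z = 0 in the competitive expressions recovers the single-parent entry, which provides a second consistency check.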
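Table 2
Hill function $H_i$ for the most common binding types

Binding type    $H_i$

Single parent    $\eta_i \dfrac{\pi_{i,0} + \pi_{i,1} \dfrac{y^n}{k_y}}{1 + \dfrac{y^n}{k_y}}$

Independent    $\eta_i \dfrac{\pi_{i,0} + \pi_{i,1} \dfrac{y^n}{k_y} + \pi_{i,2} \dfrac{z^m}{k_z} + \pi_{i,3} \dfrac{y^n}{k_y} \dfrac{z^m}{k_z}}{1 + \dfrac{y^n}{k_y} + \dfrac{z^m}{k_z} + \dfrac{y^n}{k_y} \dfrac{z^m}{k_z}}$

Competitive    $\eta_i \dfrac{\pi_{i,0} + \pi_{i,1} \dfrac{y^n}{k_y} + \pi_{i,2} \dfrac{z^m}{k_z}}{1 + \dfrac{y^n}{k_y} + \dfrac{z^m}{k_z}}$

Cooperative    $\eta_i \dfrac{\pi_{i,0} + \pi_{i,1} \dfrac{y^n}{k_y} + \pi_{i,3} \dfrac{y^n}{k_y} \dfrac{z^m}{k_z}}{1 + \dfrac{y^n}{k_y} + \dfrac{y^n}{k_y} \dfrac{z^m}{k_z}}$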
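For simulation purposes, the four Hill functions can be collected in a single helper. This is a minimal sketch assuming the binding terms y^n/k_y and z^m/k_z; the function signature and the tuple layout `pi = (pi0, pi1, pi2, pi3)` are conventions invented here for the example.

```python
def hill(kind, y, z=0.0, *, eta, pi, n, m=1.0, ky, kz=1.0):
    """Production rate H_i for one or two parents; pi = (pi0, pi1, pi2, pi3)."""
    a, b = y**n / ky, z**m / kz
    pi0, pi1, pi2, pi3 = pi
    if kind == "single":
        num, den = pi0 + pi1 * a, 1 + a
    elif kind == "independent":
        num = pi0 + pi1 * a + pi2 * b + pi3 * a * b
        den = 1 + a + b + a * b
    elif kind == "competitive":
        num, den = pi0 + pi1 * a + pi2 * b, 1 + a + b
    elif kind == "cooperative":
        num, den = pi0 + pi1 * a + pi3 * a * b, 1 + a + a * b
    else:
        raise ValueError(f"unknown binding type: {kind}")
    return eta * num / den

# An activator y with basal leak pi0 = 0.1 and full activation pi1 = 1.
print(hill("single", 2.0, eta=2.0, pi=(0.1, 1.0, 0.0, 0.0), n=2, ky=4.0))
```

In the absence of the second parent (z = 0), the competitive and cooperative cases both reduce to the single-parent expression.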
References
29. Jayanthi S, Nilgiriwala KS, Del Vecchio D (2013) Retroactivity controls the temporal dynamics of gene transcription. ACS Synth Biol 2(8):431–441
30. Jiang P, Ventura AC, Sontag ED, Merajver SD, Ninfa AJ, Del Vecchio D (2011) Load-induced modulation of signal transduction networks. Sci Signal 4(194):ra67
31. Kalir S, McClure J, Pabbaraju K, Southward C, Ronen M, Leibler S, Surette MG, Alon U (2001) Ordering genes in a flagella pathway by analysis of expression kinetics from living bacteria. Science 292(5524):2080–2083
32. Khalil HK (2002) Nonlinear systems. Prentice Hall, Upper Saddle River
33. Kim Y, Paroush Z, Nairz K, Hafen E, Jiménez G, Shvartsman SY (2011) Substrate-dependent control of MAPK phosphorylation in vivo. Mol Syst Biol 7:467
34. Kirschner MW, Gerhart JC (2006) The plausibility of life: Resolving Darwin’s dilemma. Yale University Press, New Haven
35. Kittleson JT, Cheung S, Anderson JC (2011) Rapid optimization of gene dosage in Escherichia coli using dial strains. J Biol Eng 5:10
36. Klipp E, Liebermeister W, Wierling C, Kowald A, Lehrach H, Herwig R (2009) Systems biology: a textbook. Wiley, Hoboken
37. Kyung KH, Sauro HM (2010) Fan-out in gene regulatory networks. J Biol Eng 4:16
38. Lauffenburger DA (2000) Cell signaling pathways as control modules: complexity for simplicity? Proc Natl Acad Sci 97(10):5031–5033
39. Lee JW, Gyorgy A, Cameron DE, et al. (2016) Creating single-copy genetic circuits. Mol Cell 63(2):329–336. https://doi.org/10.1016/j.molcel.2016.06.00
40. Lohmiller W, Slotine JJE (1998) On contraction analysis for non-linear systems. Automatica 34(6):683–696
41. Lyons SM, Xu W, Medford J, Prasad A (2014) Loads bias genetic and signaling switches in synthetic and natural systems. PLoS Comput Biol 10(3):e1003533
42. Milo R, Shen-Orr SS, Kashtan N, Chklovskii DB, Alon U (2002) Network motifs: simple building blocks of complex networks. Science 298(5594):824–827
43. Mishra D, Rivera PM, Lin A, Vecchio DD, Weiss R (2014) A load driver device for engineering modularity in biological networks. Nat Biotechnol 32(12):1268–1275
44. Moore SJ, MacDonald JT, Wienecke S, Ishwarbhai A, Tsipa A, Aw R, Kylilis N, Bell DJ, McClymont DW, Jensen K, Polizzi KM, Biedendieck R, Freemont PS (2018) Rapid acquisition and model-based analysis of cell-free transcription–translation reactions from nonmodel bacteria. Proc Natl Acad Sci https://doi.org/10.1073/pnas.1715806115. http://www.pnas.org/content/early/2018/04/16/1715806115.full.pdf
45. Mou S, Del Vecchio D (2015) How retroactivity impacts the robustness of genetic networks. In: 2015 54th IEEE Conference on Decision and Control (CDC), pp 1551–1556. https://doi.org/10.1109/CDC.2015.7402431
46. Nagaraj VH, Greene JM, Sengupta AM, Sontag ED (2017) Translation inhibition and resource balance in the TX-TL cell-free gene expression system. Synth Biol 2(1):1–7. https://doi.org/10.1093/synbio/ysx005
47. Neupert J, Karcher D, Bock R (2008) Design of simple synthetic RNA thermometers for temperature-controlled gene expression in Escherichia coli. Nucleic Acids Res 36(19):e124
48. Perez-Martin J, Espinosa M (1994) Correlation between DNA bending and transcriptional activation at a plasmid promoter. J Mol Biol 241(1):7–17
49. Prescott TP, Gyorgy A (2015) Isocost lines describe the cellular economy of genetic circuits. In: Proceedings of the IEEE Conference on Decision and Control
50. Purcell O, di Bernardo M, Grierson CS, Savery NJ (2011) A multi-functional synthetic gene network: a frequency multiplier, oscillator and switch. PLOS One 6(2):1–12. https://doi.org/10.1371/journal.pone.0016140
51. Purnick PEM, Weiss R (2009) The second wave of synthetic biology: from modules to systems. Nat Rev Mol Cell Biol 10(6):410–422
52. Qian Y, Huang HH, Jiménez JI, Del Vecchio D (2017) Resource competition shapes the response of genetic circuits. ACS Synth Biol 6(7):1263–1272. https://doi.org/10.1021/acssynbio.6b00361
53. Ravasz E, Somera AL, Mongru DA, Oltvai ZN, Barabasi AL (2002) Hierarchical organization of modularity in metabolic networks. Science 297(5586):1551–1555
54. Saez-Rodriguez J, Kremling A, Gilles ED (2005) Dissecting the puzzle of life: modularization of signal transduction networks. Comput Chem Eng 29(3):619–629
55. Saez-Rodriguez J, Gayer S, Ginkel M, Gilles ED (2008) Automatic decomposition of kinetic models of signaling networks minimizing the retroactivity among modules. Bioinformatics 24(16):213–219
56. Scott M, Gunderson C, Mateescu E, Zhang Z, Hwa T (2010) Interdependence of cell growth and gene expression: origins and consequences. Science 330:1099–1102
57. Shen-Orr SS, Milo R, Mangan S, Alon U (2002) Network motifs in the transcriptional regulation network of Escherichia coli. Nat Genet 31(1):64–68
58. Siegal-Gaskins D, Tuza ZA, Kim J, Noireaux V, Murray RM (2014) Gene circuit performance characterization and resource usage in a cell-free “Breadboard”. ACS Synth Biol 3:416–425. https://doi.org/10.1021/sb400203p
59. Slusarczyk AL, Lin A, Weiss R (2012) Foundations for the design and implementation of synthetic genetic circuits. Nat Rev Genet 13(6):406–420
60. Smanski MJ, Bhatia S, Zhao D, Park Y, Woodruff LBA, Giannoukos G, Ciulla D, Busby M, Calderon J, Nicol R, Gordon DB, Densmore D, Voigt CA (2014) Functional optimization of gene clusters by combinatorial design and assembly. Nat Biotechnol 32(12):1241–1249
61. Sridharan GV, Hassoun S, Lee K (2011) Identification of biochemical network modules based on shortest retroactive distances. PLoS Comput Biol 7(11):e1002262
62. Stricker J, Cookson S, Bennett MR, Mather WH, Tsimring LS, Hasty J (2008) A fast, robust and tunable synthetic gene oscillator. Nature 456(7221):516–519
63. Tamsir A, Tabor JJ, Voigt CA (2011) Robust multicellular computing using genetically encoded nor gates and chemical ‘wires’. Nature 469(7329):212–215
64. Tan C, Marguet P, You L (2009) Emergent bistability by a growth-modulating positive feedback circuit. Nat Chem Biol 5(11):842–848
65. Weiße AY, Oyarzún DA, Danos V, Swain PS (2015) Mechanistic links between cellular trade-offs, gene expression, and growth. Proc Natl Acad Sci 112(9):E1038–E1047. https://doi.org/10.1073/pnas.1416533112
66. Yates EA, Philipp B, Buckley C, Atkinson S, Chhabra SR, Sockett RE, Goldner M, Dessaux Y, Camara M, Smith H, Williams P (2002) N-acylhomoserine lactones undergo lactonolysis in a pH-, temperature-, and acyl chain length-dependent manner during growth of Yersinia pseudotuberculosis and Pseudomonas aeruginosa. Infect Immun 70(10):5635–5646
67. Yoon J, Blumer A, Lee K (2006) An algorithm for modularity analysis of directed and weighted biological networks based on edge-betweenness centrality. Bioinformatics 22(24):3106–3108
Chapter 15
Abstract
RNA-seq enables the analysis of gene expression profiles across different conditions and organisms. Gene
expression burden slows down growth, which results in poor predictability of gene constructs and product
yields. Here, we describe how we applied RNA-seq to study the transcriptional profiles of Escherichia coli
when burden is elicited during heterologous gene expression. We then present how we selected early
responsive promoters from our RNA-seq results to design sensors for gene expression burden. Finally, we
describe how we used one of these sensors to develop a burden-driven feedback regulator to improve
cellular fitness in engineered E. coli.
Key words Synthetic construct, Gene expression burden, RNA-seq, Sensor, Feedback
Filippo Menolascina (ed.), Synthetic Gene Circuits: Methods and Protocols, Methods in Molecular Biology, vol. 2229,
https://doi.org/10.1007/978-1-0716-1032-9_15, © Springer Science+Business Media, LLC, part of Springer Nature 2021
314 Alice Boo and Francesca Ceroni
1 Introduction
2 Materials
2.1 Strains
We used bacterial strains MG1655 (K-12 F- λ- rph-1) and DH10B (K-12 F- λ- araD139 Δ(araA-leu)7697 Δ(lac)X74 galE15 galK16 galU hsdR2 relA rpsL150(StrR) spoT1 deoR ϕ80dlacZΔM15 endA1 nupG recA1 e14- mcrA Δ(mrr hsdRMS mcrBC)), acquired from the National BioResource Project Japan. Users should select the strain(s) of interest to them and adapt the materials and methods accordingly.
2.2 Molecular Cloning
1. Plasmid DNA isolation, extraction of DNA from agarose gels, and PCR purification were done using Qiagen kits.
2. All our PCR reactions were carried out using the NEB Phusion
High Fidelity Polymerase and oligonucleotide primers synthe-
sized by IDT.
3. The burden-responsive promoters were synthesized as gBlocks
from IDT and inserted into the destination vector below
through restriction cloning using SfiI and PacI. All enzymes
were ordered from NEB.
4. The structure of the plasmids used to test the early burden-responsive promoters is shown in Fig. 1. The gene reporting the activity of the burden-responsive promoter, here sfGFP, can be swapped by restriction cloning at the PacI and BsaI restriction sites.
5. The structure of the burden-driven feedback plasmid used to regulate heterologous protein production is shown in Fig. 2. The actuator, here the sgRNA, can be swapped by restriction cloning at the PacI and AscI restriction sites. The target site on the sgRNA can also be swapped using inverse PCR of the feedback plasmid with insertion-encoding 5′-phosphorylated primers, followed by DpnI digestion and religation before transformation into the strain of interest. This allows replacement of the targeting domain, which in our case was designed to target the araBAD promoter.
Table 1
Protocol to make 400 mL of M9 0.4% fructose supplemented with
Casamino acids
3 Methods
3.1 Identify How the Host Responds to Burden Using RNA-Seq
Here, we describe the workflow we followed to prepare our E. coli strains for measuring the impact of expressing a synthetic construct on the host. We also describe how the samples were prepared for studying the impact of our burden-inducing construct on the host transcriptome via RNA-seq. The workflow is represented in Fig. 3.
Fig. 3 Workflow to measure the burden induced by the expression of a heterologous construct and extract the
RNA for RNA-seq
3.1.2 Time-Course Assay
1. Grow cultures of E. coli cells transformed with the construct and control plasmids overnight at 37 °C with aeration in a shaking incubator in 5 mL of M9 medium (see Materials Subheading 3.3).
2. In the morning, dilute 60 μL of each sample into 3 mL of fresh M9 medium supplemented with the appropriate antibiotics and grow at 37 °C with shaking for another hour (outgrowth).
3. Then, transfer 200 μL of each sample into a 96-well plate (we used clear 96-well Costar plates) at approximately 0.1 OD600.
4. Place the samples in a microplate reader (we used a Biotek Synergy HT plate reader) and incubate them for 1 h at 37 °C with orbital shaking at medium speed. Take measurements of VioB-mCherry (excitation, 590 nm; emission, 645 nm) and OD600 every 15 min (if using GFP, use excitation, 485 nm; emission, 528 nm).
5. Sixty minutes into the incubation, briefly remove the plate to add the inducer to the wells (our final concentrations of inducers were: L-arabinose, 0.2%; L-rhamnose, 2%). Set this time point as your “time 0.”
Burden Sensors for Synthetic Biology 319
6. If you are doing a burden assay: grow the cells in the reader for 4.5 h, taking measurements of VioB-mCherry (excitation, 590 nm; emission, 645 nm) and OD600 every 15 min.
7. If you are performing RNA-seq analysis: remove the samples from the wells at 15 and 60 min after induction for processing:
(a) Take 170 μL from each of four wells per time point and dispense it into a fresh tube to which you have already added 1.360 mL of RNA protection buffer.
(b) Leave the samples for 5 min at room temperature and then centrifuge them at 4 °C at maximum speed.
(c) Discard the supernatant and freeze the pellets at −20 °C.
(d) Repeat the experiment for the three replicates on three different days (our three replicates were repeated independently on three different days for a total of 90 samples used to produce the final data set (7 constructs × 2 strains × 3 replicates × 2 time points = 84 samples; plus control strain DH10B-GFP cells × 3 replicates × 2 time points)).
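The sample bookkeeping in step 7 can be checked by enumerating the design explicitly, which also yields a sample sheet for labeling tubes. The construct labels below are hypothetical placeholders; the counts and volumes are those stated in the protocol.

```python
from itertools import product

constructs = [f"construct_{i}" for i in range(1, 8)]   # 7 constructs (placeholder names)
strains = ["MG1655-GFP", "DH10B-GFP"]
replicates = [1, 2, 3]
timepoints_min = [15, 60]

samples = list(product(constructs, strains, replicates, timepoints_min))
controls = list(product(["DH10B-GFP control"], replicates, timepoints_min))
total_samples = len(samples) + len(controls)

# Each RNA sample pools 170 uL from four wells into 1.360 mL of protection buffer.
pooled_uL = 4 * 170
buffer_to_culture_ratio = 1360 / pooled_uL

print(len(samples), len(controls), total_samples)
print(pooled_uL, buffer_to_culture_ratio)
```

The pooled culture volume is 680 μL per sample, so the 1.360 mL of RNA protection buffer corresponds to two volumes of buffer per volume of culture.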
3.1.3 RNA-Seq Sample Preparation
The library preparation uses a custom protocol adapted from previous Nextera kit methods [7].
1. Extract the RNA from the samples taken in the section “Burden Assay and RNA-seq Time Course” using the Qiagen RNeasy mini kit (Qiagen 74104).
2. Remove possible traces of genomic DNA contamination by treating 2 μg of each sample for a second time with DNase I (Qiagen 79254).
3. Assess the total RNA quality and integrity with an Agilent 2100 Bioanalyzer and Agilent RNA 6000 Nano Kit (5067-1511). The average RNA integrity number should be above 9.
4. Enrich the mRNA with the MicrobExpress rRNA removal kit (Thermo Scientific AM1905).
5. Assess successful rRNA depletion on the Bioanalyzer.
6. Carry out reverse transcription starting from 50 ng of enriched mRNA with the Tetro cDNA synthesis kit (Bioline BIO-65043) and 6 μL of random hexamers (Bioline BIO-38028) per reaction.
7. For second-strand cDNA synthesis, add 5 μL of NEBNext second-strand synthesis buffer (NEB B6117S), 3 μL of dNTPs (NEB N0446S), 2 μL of RNase H (NEB M0297L), 2 μL of polymerase I (Thermo Scientific 18010025), and 18 μL of water per reaction to the first-strand synthesis mix.
8. Incubate the samples at 16 °C for 2.5 h.
9. Purify the cDNA with the MinElute PCR purification kit (Qiagen 28004) and elute in 10 μL of DEPC-treated water.
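When many samples are processed in parallel, the second-strand synthesis reagents of step 7 are conveniently prepared as a master mix. The per-reaction volumes below are taken from the protocol; the 10% pipetting overage and the function name are assumptions made for this sketch.

```python
# Per-reaction volumes (uL) added to the first-strand mix, from step 7.
PER_REACTION_UL = {
    "NEBNext second-strand synthesis buffer": 5.0,
    "dNTPs": 3.0,
    "RNase H": 2.0,
    "polymerase I": 2.0,
    "water": 18.0,
}

def master_mix(n_reactions, overage=0.10):
    """Scale each component to n reactions plus a fractional overage (assumed 10%)."""
    scale = n_reactions * (1.0 + overage)
    return {name: round(vol * scale, 1) for name, vol in PER_REACTION_UL.items()}

added_per_reaction = sum(PER_REACTION_UL.values())   # volume added to each reaction
mix = master_mix(10)
for name, vol in mix.items():
    print(f"{name}: {vol} uL")
```

Each reaction then receives 30 μL of master mix on top of the first-strand synthesis mix.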
3.1.4 RNA-Seq Library Sequencing
We performed the library sequencing at the Imperial College London Genomic Facility. We used two lanes from the HiSeq 2500 sequencer for paired-end sequencing with read length of 100 bp.
3.1.5 Sequencing Quality Control and Alignments
1. Trim and assess the quality of your raw reads for all sequenced samples using Trim Galore v0.4.1 with default settings. Look for potential batch effects by pooling your technical replicates.
2. Obtain the genomic sequences of your organism, for example,
using Ensembl Genomes. (In our case, we created a FASTA
format sequence file corresponding to our DH10B-GFP and
MG1655-GFP strain by merging the composite of strain, plasmid,
and integrated GFP for each sample to use as a reference for read
alignment.)
3. Align the trimmed reads using the BWA mem algorithm
v0.7.12-r1039 with the default settings.
4. Create a sorted BAM file for each sample using SAMtools
v1.3.1 on the alignments obtained at the previous step.
5. Check that your biological replicates do not exhibit any batch
effects before you generate the raw counts with Bioconductor
Rsubread package v1.12.6.
6. Discard all reads identified as unremoved rRNA, and in the one
case where reads could align to either the plasmid or the strain
genome, assign the raw reads appropriately to match those of
flanking sequence.
7. Check the biological replicates to identify any outlier sample.
8. Generate the normalized FPKM counts with the Bioconductor
edgeR package version 3.4.2, accounting for gene length and
library size (by TMM normalization), which will be used for
downstream analysis.
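To make the normalization in step 8 concrete, the following NumPy sketch computes FPKM values from a raw count matrix. It is not the edgeR implementation: it only illustrates the arithmetic, under the assumption that the effective library size is the column sum of counts multiplied by a per-sample normalization factor (such as one obtained by TMM). The function name and the toy numbers are hypothetical.

```python
import numpy as np

def fpkm(counts, gene_lengths_bp, norm_factors=None):
    """Fragments per kilobase of gene per million mapped fragments.

    counts:          genes x samples array of raw counts.
    gene_lengths_bp: length of each gene in base pairs.
    norm_factors:    optional per-sample scaling of the library size
                     (e.g. TMM factors computed elsewhere).
    """
    counts = np.asarray(counts, dtype=float)
    lib_size = counts.sum(axis=0)
    if norm_factors is not None:
        lib_size = lib_size * np.asarray(norm_factors, dtype=float)
    length_kb = np.asarray(gene_lengths_bp, dtype=float)[:, None] / 1e3
    return counts / length_kb / (lib_size / 1e6)

# Two genes, one sample: the longer gene has more reads but the same FPKM.
toy = fpkm([[100], [300]], [1000, 3000])
print(toy)
```

Dividing by gene length removes the bias toward long genes, and dividing by library size makes samples sequenced to different depths comparable.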
3.1.7 Analyze the Plate-Reader Data to Evaluate Burden
To calculate the burden imposed by the constructs, refer to Ceroni et al. [11]:
$$\text{Growth rate}(t_2) = \frac{\ln(\mathrm{OD}(t_3)) - \ln(\mathrm{OD}(t_1))}{t_3 - t_1}$$
$$\text{GFP Capacity}(t_2) = \frac{\text{Total GFP}(t_3) - \text{Total GFP}(t_1)}{\mathrm{OD}(t_2)\,(t_3 - t_1)}$$
$$\text{RFP Production Rate per Cell}(t_2) = \frac{\text{Total RFP}(t_3) - \text{Total RFP}(t_1)}{\mathrm{OD}(t_2)\,(t_3 - t_1)}$$
where t1 = time − 15 min after induction, t2 = time after induction, and t3 = time + 15 min after induction.
Mean rates and their standard errors are calculated from three biological replicates. To account for the background red fluorescence of M9, we added 400 to all RFP output rates per cell, as we measured that red fluorescence decreases at a rate of approximately 400 RFP h⁻¹ as it is consumed by cells during growth.
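The three rate formulas are central differences over neighboring time points and can be applied to a whole time course at once. The sketch below is one possible implementation, not the authors' analysis code; the function name, the `offset` argument (for example, the +400 background correction for RFP), and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def plate_reader_rates(t_h, od, fluor, offset=0.0):
    """Growth rate and per-cell production rate at the interior time points.

    t_h:    measurement times in hours (e.g. every 0.25 h).
    od:     OD600 readings.
    fluor:  total fluorescence readings (GFP or RFP).
    offset: constant added to the per-cell rate (background correction).
    """
    t, od, fluor = (np.asarray(v, dtype=float) for v in (t_h, od, fluor))
    dt = t[2:] - t[:-2]                                   # t3 - t1
    growth = (np.log(od[2:]) - np.log(od[:-2])) / dt
    per_cell = (fluor[2:] - fluor[:-2]) / (od[1:-1] * dt) + offset
    return t[1:-1], growth, per_cell

# Synthetic exponential culture: growth rate 0.5 per hour, fluorescence proportional to OD.
t = np.arange(0.0, 2.0, 0.25)
od = 0.1 * np.exp(0.5 * t)
t2, growth, per_cell = plate_reader_rates(t, od, 1000.0 * od)
print(growth)
print(per_cell)
```

The first and last time points are dropped because they lack a neighbor on one side.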
3.2 Select the Best Burden-Responsive Promoter to Build a Burden Biosensor
The next step is to identify which promoters are upregulated in the presence of burden. We identified early responsive promoters using RNA-seq, then isolated them and cloned them upstream of a fluorescent reporter so as to characterize their response to burden when out of their genomic context on a plasmid. This workflow is presented in Fig. 5. This allowed us to select our burden sensor: the promoter exhibiting the best fold activation when triggered by burden.
Fig. 5 Workflow to identify promoters that are upregulated by burden from RNA-seq results and test them out
of their genomic context in order to select the best candidate to use as a burden biosensor
3.2.1 Interpret the RNA-Seq Results to Identify Promoters Upregulated by Burden
Here, we describe how to interpret the RNA-seq results to identify promoters with an early response to burden. We used DESeq2 for our differential expression analyses [12].
1. Compare gene expression between cells transformed with synthetic constructs and the analogous cells transformed with the corresponding empty plasmid (we excluded the reads mapping to ribosomal genes or to the synthetic constructs).
2. Annotate the differentially expressed genes with data extracted from the EcoCyc database [13] using custom Python code.
3. A volcano plot can help visualize which genes were upregulated or downregulated in the cells experiencing the imposed burden compared to the control cells (Fig. 5). We specifically looked at the differential gene expression at 15 min and 1 h after induction.
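The classification that underlies a volcano plot can be sketched as follows. The input arrays stand for the log2 fold changes and adjusted p-values exported from a differential expression analysis; the cutoffs (|log2 fold change| ≥ 1, adjusted p < 0.05) and the gene values are illustrative assumptions, not the thresholds used by the authors.

```python
import numpy as np

def classify_genes(log2fc, padj, lfc_cut=1.0, alpha=0.05):
    """Label each gene 'up', 'down', or 'ns' (not significant).

    Genes with a missing (NaN) adjusted p-value are labeled 'ns'.
    """
    log2fc = np.asarray(log2fc, dtype=float)
    padj = np.asarray(padj, dtype=float)
    significant = padj < alpha                 # NaN compares as False
    labels = np.full(log2fc.shape, "ns", dtype=object)
    labels[significant & (log2fc >= lfc_cut)] = "up"
    labels[significant & (log2fc <= -lfc_cut)] = "down"
    return labels

def volcano_coordinates(log2fc, padj):
    """x = log2 fold change, y = -log10(adjusted p-value)."""
    return np.asarray(log2fc, dtype=float), -np.log10(np.asarray(padj, dtype=float))

labels = classify_genes([2.0, -1.5, 0.2, 3.0], [0.001, 0.01, 0.001, 0.5])
print(labels.tolist())
```

Promoters of genes labeled "up" at the early time point are the candidates for burden sensors.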
3.2.3 Select the Best Promoter to Use as Burden Biosensor
Analyze the plate-reader data and select the sensor plasmid that exhibits the best ON/OFF properties.
1. Analyze the plate-reader data according to Subheading 3.1.7. Plot bar graphs of the GFP production rate per cell at 1 h post-induction of burden.
Fig. 6 Workflow to build a burden-driven feedback for gene expression based on a burden biosensor
uncovered with RNA-seq
3.3 Build the Burden-Driven Feedback Loop
Once we identified our burden sensor, we used it to drive the expression of an actuator able to regulate gene expression in response to burden. Our workflow for building a burden-driven feedback loop is represented in Fig. 6. In the presence of burden, the actuator should be triggered to decrease heterologous gene expression, thus decreasing the burden imposed on the cell and restoring some of its cellular capacity.
To measure cellular burden, we used the capacity monitor from Ceroni et al. [1]. This assesses the burden of genetic constructs by calculating the changes in GFP production from a “monitor cassette” constitutively expressing GFP from the bacterial genome. A detailed protocol of how to integrate the capacity monitor into a strain of interest can be found in Note 3. GFP capacity, or the GFP production rate per cell, should be maintained above a specific threshold, which means that burden is kept below an upper bound.
3.3.1 Build the Feedback Plasmid
Build the feedback plasmid (Fig. 2) by restriction cloning: the promoter can be inserted using the previously synthesized gBlocks carrying the SfiI and PacI restriction sites. The actuator can also be synthesized with PacI and AscI restriction sites for insertion into the feedback plasmid via restriction cloning. In our case, the sgRNA was placed under the regulation of the htpG1 promoter to promote fast dynamics of our system and so that the levels of sgRNA in the cell are directly related to the host cell capacity. dCas9 is constitutively expressed and binds to the sgRNA present in the cell to inhibit the production of VioB-mCherry, which slows down cell growth when its expression is triggered (Fig. 7).
1. Transform the burden plasmid and the feedback plasmid into a
strain containing the sfGFP capacity monitor integrated into
the genome. Also transform an open-loop version of the feed-
back: the sgRNA should not target anything in the cell.
2. Carry out a time-course assay in the plate reader: take measurements of VioB-mCherry (excitation, 590 nm; emission, 645 nm), sfGFP (excitation, 485 nm; emission, 528 nm), and OD600 every 15 min.
3. Sixty minutes into the incubation, briefly remove the plate to
add the inducer to the wells (0.2% arabinose).
4. Grow the cells for 6 h.
5. Analyze the data by plotting the GFP capacity and the VioB-
mCherry production rate at 1 h post-induction.
Repression of VioB-mCherry production is tunable by controlling the intracellular concentration of dCas9 available to form an inhibiting complex with the guide RNA. dCas9 expression sets the steady-state repression level of the heterologous VioB-mCherry protein, but its production rate should be chosen carefully so that it does not itself impose a large burden on the host cell. The capacity monitor can assess the burden of genetic constructs by calculating the changes in GFP production from a “monitor cassette” constitutively expressing GFP from the bacterial genome (see Note 3).
6. Create a library of feedback constructs with promoters of various strengths driving dCas9 expression to check whether increasing dCas9 levels strengthens repression by the feedback. Randomly
Fig. 8 Tune the feedback gain by changing the expression level of dCas9
(promoter/RBS) or by varying the affinity of the sgRNA with its target promoter
(bp mutation)
4 Notes
1. M9 Medium Recipe
(a) M9 Minimum salts (5×) stock solution: dissolve 56.4 g
of M9 Minimum Salts into 1 L of distilled H2O. Stir to
suspend and sterilize by autoclaving. Store at room
temperature.
(b) Thiamine hydrochloride stock solution: dissolve 10 mg
of thiamine hydrochloride into 1 mL of water. Agitate to
suspend. Filter-sterilize. Cover the sterile container with
aluminum foil to protect it from the light. Store at room
temperature. (DH10B cannot produce thiamine
hydrochloride.)
(c) Fructose stock solution: dissolve 10 g of fructose into
100 mL of distilled H2O. Filter-sterilize. Store at 4 °C.
(We used fructose as the main carbon source to avoid the
strong catabolite repression of AraBAD and RhaBAD pro-
moters known to occur in glucose media.)
(d) Casamino acids stock solution: dissolve 10 g of Casa-
mino Acids into 100 mL of distilled H2O. Stir to suspend
and sterilize by autoclaving. Store at room temperature.
(We tried various Casamino acids brands and found that
Casamino acids from MP Biomedicals gave us consistent
growth for our DH10B and MG1655 cells.)
(e) 1 M Magnesium sulfate (MgSO4) stock solution: dissolve 246 g of MgSO4·7H2O into 1 L of distilled
H2O. Sterilize by autoclaving. Store at room temperature.
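Stock volumes for a given batch of medium follow from the dilution relation C1·V1 = C2·V2. The sketch below works through two components whose stock and final concentrations can be read from this chapter: fructose (10% w/v stock, 0.4% final) and M9 salts (assumed here to be a 5× stock diluted to 1×), for the 400 mL batch mentioned in the Materials. The function name is hypothetical, and the remaining components are omitted because their final concentrations are not given in this excerpt.

```python
def stock_volume_mL(final_conc, stock_conc, final_volume_mL):
    """Volume of stock needed, from C_stock * V_stock = C_final * V_final."""
    return final_conc * final_volume_mL / stock_conc

batch_mL = 400.0

# Fructose: 10 g in 100 mL is a 10% (w/v) stock; the medium contains 0.4%.
fructose_mL = stock_volume_mL(0.4, 10.0, batch_mL)

# M9 salts: 5x stock (assumption) diluted to 1x.
m9_salts_mL = stock_volume_mL(1.0, 5.0, batch_mL)

print(f"fructose stock: {fructose_mL:.1f} mL, M9 salts stock: {m9_salts_mL:.1f} mL")
```

Concentrations must be expressed in the same units for stock and final solution; the units themselves cancel.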
References
1. Ceroni F, Algar R, Stan G-B, Ellis T (2015) Quantifying cellular capacity identifies gene expression designs with reduced burden. Nat Methods 12:415–418. https://doi.org/10.1038/nmeth.3339
2. Borkowski O, Ceroni F, Stan GB, Ellis T (2016) Overloaded and stressed: whole-cell considerations for bacterial synthetic biology. Curr Opin Microbiol 33:123–130. https://doi.org/10.1016/j.mib.2016.07.009
3. Ellis T (2018) Predicting how evolution will beat us. Microb Biotechnol 12(1):41–43. https://doi.org/10.1111/1751-7915.13327
4. Martin VJJ, Pitera DJ, Withers ST et al (2003) Engineering a mevalonate pathway in Escherichia coli for production of terpenoids. Nat Biotechnol 21:796–802. https://doi.org/10.1038/nbt833
5. Gyorgy A, Jiménez JI, Yazbek J et al (2015) Isocost lines describe the cellular economy of genetic circuits. Biophys J 109:639–646. https://doi.org/10.1016/j.bpj.2015.06.034
6. Shachrai I, Zaslaver A, Alon U, Dekel E (2010) Cost of unneeded proteins in E. coli is reduced after several generations in exponential growth. Mol Cell 38:758–767. https://doi.org/10.1016/j.molcel.2010.04.015
7. Gertz J, Varley KE, Davis NS et al (2012) Transposase mediated construction of RNA-seq libraries. Genome Res 22:134–141. https://doi.org/10.1101/gr.127373.111.134
8. Gorochowski TE, Espah Borujeni A, Park Y et al (2017) Genetic circuit characterization and debugging using RNA-seq. Mol Syst Biol 13:952. https://doi.org/10.15252/msb.20167461
9. Der BS, Glassey E, Bartley BA et al (2017) DNAplotlib: programmable visualization of genetic designs and associated data. ACS Synth Biol 6:1115–1119. https://doi.org/10.1021/acssynbio.6b00252
10. Myers CJ, Beal J, Gorochowski TE et al (2017) A standard-enabled workflow for synthetic biology. Biochem Soc Trans 45:793–803. https://doi.org/10.1042/BST20160347
11. Ceroni F, Boo A, Furini S et al (2018) Burden-driven feedback control of gene expression. Nat Methods 15:387–393. https://doi.org/10.1038/nmeth.4635
12. Love MI, Huber W, Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15:1–21. https://doi.org/10.1186/s13059-014-0550-8
13. Keseler IM, Mackie A, Santos-Zavaleta A et al (2017) The EcoCyc database: reflecting new knowledge about Escherichia coli K-12. Nucleic Acids Res 45:D543–D550. https://doi.org/10.1093/nar/gkw1003
14. Farasat I, Salis HM (2016) A biophysical model of CRISPR/Cas9 activity for rational design of genome editing and gene regulation. PLoS Comput Biol 12:1–33. https://doi.org/10.1371/journal.pcbi.1004724
15. Petrova OE, Garcia-Alcalde F, Zampaloni C, Sauer K (2017) Comparative evaluation of rRNA depletion procedures for the improved analysis of bacterial biofilm and mixed pathogen culture transcriptomes. Sci Rep 7:1–15. https://doi.org/10.1038/srep41114
16. Haldimann A, Wanner BL (2001) Conditional-replication, integration, excision, and retrieval plasmid-host systems for gene structure-function studies of bacteria. J Bacteriol 183:6384–6393. https://doi.org/10.1128/JB.183.21.6384
17. Algar RJR (2013) Understanding, characterising and modelling the interactions between synthetic genetic circuits and their host chassis. Rhys James Richmond Algar, MA (Oxon), MRes. Submission for the degree of PhD. Imperial College London
Chapter 16
Abstract
Synthetic biology has been advancing cellular and molecular biology studies through the design of synthetic circuits capable of examining diverse endogenously or exogenously driven regulatory pathways. While early genetic devices were engineered to be insulated from intracellular crosstalk, more recently the need to achieve dynamic control of cellular behavior has led to the development of smart interfaces that connect signal information (sensor) to desired output activation (actuator). Sensor-actuator circuits can respond to diverse inputs, including small molecules, exogenous and endogenous mRNA, noncoding RNA (e.g., miRNA), and proteins, to regulate downstream events transcriptionally, posttranscriptionally, and translationally. These devices require careful engineering to either create complex chimeric proteins or modify protein structures to suit the architecture and/or purpose of a specific circuit.
In this chapter, we describe how to implement two different protein-based devices in mammalian cells: (1) a modular platform that senses and responds to disease-associated proteins and (2) a protein-based system that allows simultaneous regulation of RNA translation and protein activity via RNA–protein and newly engineered protein–protein interactions.
Key words Mammalian synthetic biology, Protein sensor-actuator, Synthetic smart interfaces, Pro-
tein–protein regulation, Protein–RNA regulation, RNA-binding protein
1 Introduction
1.1 Synthetic Devices that Sense Intracellular Protein and Regulate Cellular Fate
Programmable and model-aided synthetic circuits hold the potential to improve our understanding of the rules that govern biological processes [1–4] and to create new tools for biomedical purposes [22]. Genetic biosensors with medical applications focus on rewiring cell function by triggering a therapeutic output via transcriptional or translational regulation [5–9]. Most synthetic biosensors have been designed to respond to extracellular stimuli, either by building input-specific devices or by creating a generalizable framework that adapts to different cues [7, 10, 23].
Here, we describe the first modular platform that can be repurposed to sense and respond to several intracellular proteins that function as disease biomarkers. This synthetic platform couples
Filippo Menolascina (ed.), Synthetic Gene Circuits: Methods and Protocols, Methods in Molecular Biology, vol. 2229,
https://doi.org/10.1007/978-1-0716-1032-9_16, © Springer Science+Business Media, LLC, part of Springer Nature 2021
332 Giuliano Bonfá et al.
[Figure: schematic of the protein sensor-actuator device. Recoverable labels: membrane tag, mKate,
cytoplasm, scFv162, scFv35, TCS, TEVp, Gal4-VP16, nucleus, actuator.]
1.2 A Protein-Based Strategy to Regulate RNA Translation and Protein Activity
RNA-encoded genetic circuits have the potential to limit the immunogenicity and mutagenicity
issues of DNA-based systems and exhibit faster dynamics. They have thus become an appealing
strategy for regulating synthetic systems, with a variety of applications mostly in the
biomedical field [11, 12]. However, the ability to achieve fine control over gene expression at
the posttranscriptional and translational levels is limited by the small toolbox of regulatory
devices available: ribozymes, aptamers, and riboswitches can modulate the translation of the
associated output but cannot be interconnected to create modular and scalable circuits [13, 14].
Recently, RNA-binding proteins (RBPs) have been used to engineer RNA-encoded networks,
enhancing the regulatory features of RNA expression [15]. We envisioned a multilayered system
that adds further regulatory elements via protein–RNA and protein–protein interactions using a
protein engineering strategy.
Proteases recognize specific amino acid sequences and cleave them. In theory, these
protease-responsive sequences can be transferred to other proteins, modifying their structure
without impairing their function. This is feasible because crystal structures are available for
a large number of proteins, and because multiple software tools can be used to study and infer
protein structure and sequence [16, 17], either after in silico modification or via homology
analysis of native protein sequences. Thus, this system is potentially highly modular, and more
levels of regulation can be multiplexed. This study uses TEVp and other proteases from the same
family, but the framework could be extended to endogenous proteins that are activated in
specific cellular states. Here we report how to use protein engineering to create regulatory
cascades that connect proteases to the RBPs L7Ae and Ms2-cNOT7, to tune output expression at the
posttranscriptional and translational levels. L7Ae binds kink-turn motifs in the 5′ UTR of the
target mRNA and blocks output translation [15]. We modified its structure to make it TEV
protease dependent. Ms2-cNOT7 is a fusion protein that binds Ms2 binding motifs in the 3′ UTR
and degrades the poly(A) tail of the target mRNA,
2 Material
2.1.3 Flow Cytometry Staining, Acquisition, and Analyses
1. LSR Fortessa flow cytometer, equipped with 405, 488, and 561 nm lasers and LSR-II system
(BD Biosciences).
2. SpheroTech RCP-30-5A beads (SpheroTech).
3. To determine surface expression of HLA-I molecules, use AlexaFluor® 647 mouse anti-human
HLA A, B, C antibody clone W6/32 (Biolegend® #311414).
4. For apoptosis assays, stain transfected, PBS-washed cells with Pacific Blue-conjugated
Annexin V (Life Technologies) before flow cytometry analysis.
5. FACSDiva8 software.
3 Methods
3.1.2 Transfection of HEK 293FT Cells and Fluorescence Imaging
1. Carry out protein sensor transfections of HEK 293FT cells in 24-well plate format using
Attractene.
2. Prepare a mix of 300 ng of total DNA in DMEM base medium without supplements to a final
volume of 60 μL.
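The arithmetic behind step 2 can be sketched as follows. This is a minimal illustration, not part of the original protocol: the plasmid names, stock concentrations, and the equal 100 ng split are hypothetical, and only the 300 ng total and the 60 μL final volume come from the step above.

```python
from dataclasses import dataclass

TOTAL_DNA_NG = 300.0    # total DNA per well (step 2)
FINAL_VOLUME_UL = 60.0  # final mix volume in DMEM (step 2)


@dataclass
class Plasmid:
    name: str
    stock_ng_per_ul: float  # hypothetical stock concentration
    amount_ng: float        # hypothetical amount used in this well


def dna_mix(plasmids, total_ng=TOTAL_DNA_NG, final_ul=FINAL_VOLUME_UL):
    """Return per-plasmid volumes (uL) and the DMEM volume needed to reach final_ul."""
    total = sum(p.amount_ng for p in plasmids)
    if abs(total - total_ng) > 1e-6:
        raise ValueError(f"plasmid amounts sum to {total} ng, expected {total_ng} ng")
    volumes = {p.name: p.amount_ng / p.stock_ng_per_ul for p in plasmids}
    dmem_ul = final_ul - sum(volumes.values())
    if dmem_ul < 0:
        raise ValueError("DNA volume exceeds the final mix volume")
    return volumes, dmem_ul


# Hypothetical example: three plasmids at 100 ng each.
example = [
    Plasmid("sensor", stock_ng_per_ul=100.0, amount_ng=100.0),
    Plasmid("actuator", stock_ng_per_ul=200.0, amount_ng=100.0),
    Plasmid("transfection_marker", stock_ng_per_ul=50.0, amount_ng=100.0),
]
volumes, dmem_ul = dna_mix(example)
for name, vol in volumes.items():
    print(f"{name}: {vol:.2f} uL")
print(f"DMEM: {dmem_ul:.2f} uL")
```

The check on the summed amounts is there so that a change in one plasmid dose is compensated elsewhere (for example with a filler plasmid) rather than silently changing the total DNA per well.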
3.1.3 Electroporation of TZM-bl and Jurkat Cells
1. Electroporate TZM-bl and Jurkat cells with the Neon Transfection System using a 10 μL Neon Tip.
2. For TZM-bl cells, prepare a total of 2 μg of DNA in a 1.5 mL tube.
3. Harvest 2 × 10⁵ cells by trypsinization and centrifuge in PBS at 150 × g for 5 min at room
temperature.
4. Remove the supernatant with a pipette, suspend the cells in buffer R, and then add the cells
to the DNA tube, mixing gently.
5. Aspirate the DNA and cell mix with the appropriate Neon Tip and transfer to the
electroporator. Apply a pulse (pulse voltage: 1005 V, pulse width: 35 ms, pulse number: 2) and
transfer all the cells to the well.
6. For Jurkat cells, prepare a total of 4 μg of DNA in a 1.5 mL tube.
7. Harvest 3 × 10⁵ cells and centrifuge in PBS at 150 × g for 5 min at room temperature.
8. Remove the supernatant with a pipette, suspend the cells in buffer R, and then add the cells
to the DNA tube, mixing gently.
9. Aspirate the DNA and cell mix with the appropriate Neon Tip and transfer to the
electroporator. Apply a pulse (pulse voltage: 1325 V, pulse width: 10 ms, pulse number: 3) and
transfer all the cells to the well.
10. Infect the TZM-bl and Jurkat cells with HIV strains around 6–12 h post-transfection,
allowing for recovery after electroporation.
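The two sets of electroporation parameters above can be kept side by side in a small lookup table, which makes the per-cell DNA dose easy to compare. This is only an illustrative sketch: the class and function names are hypothetical, and the values are transcribed from the steps above (cell numbers read as 2 × 10⁵ and 3 × 10⁵).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NeonSettings:
    cell_line: str
    dna_ug: float   # total DNA per electroporation
    cells: int      # cells harvested per electroporation
    voltage_v: int  # pulse voltage
    width_ms: int   # pulse width
    pulses: int     # pulse number


# Values taken from the steps above (10 uL Neon Tip).
NEON = {
    "TZM-bl": NeonSettings("TZM-bl", dna_ug=2.0, cells=200_000,
                           voltage_v=1005, width_ms=35, pulses=2),
    "Jurkat": NeonSettings("Jurkat", dna_ug=4.0, cells=300_000,
                           voltage_v=1325, width_ms=10, pulses=3),
}


def dna_per_1e5_cells(settings):
    """DNA dose in ug per 1e5 cells, for comparing conditions."""
    return settings.dna_ug / (settings.cells / 1e5)


for s in NEON.values():
    print(f"{s.cell_line}: {s.voltage_v} V, {s.width_ms} ms x {s.pulses}, "
          f"{dna_per_1e5_cells(s):.2f} ug DNA per 1e5 cells")
```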
Engineering Protein-Based Parts in Mammalian Cells 339
3.1.4 Flow Cytometry and Data Analysis
1. Acquire the cells with an LSR Fortessa flow cytometer equipped with 405, 488, and 561 nm
lasers.
2. Collect 30,000–100,000 events per sample and acquire fluorescence data with the following
cytometer settings: 488 nm laser and 530/30 nm bandpass filter for EYFP/EGFP, 561 nm laser and
610/20 nm filter for mKate, and 405 nm laser and 450/50 nm filter for EBFP.
3. Convert flow cytometry data from arbitrary units to compensated molecules of equivalent
fluorescein (MEFL) using the TASBE characterization method [19, 20]. The TASBE method uses a
strong constitutively expressed fluorophore, which serves as both a transfection marker and an
indicator of relative circuit copy count.
4. Compute an affine compensation matrix from single-positive and blank controls.
5. Calibrate FITC measurements to MEFL using SpheroTech RCP-30-5A beads.
6. Compute mappings from other channels to equivalent FITC from co-transfection of
constitutively expressed EBFP, EYFP, and mKate, each controlled by the hEF1a promoter on its own
otherwise identical plasmid.
7. Segment MEFL data by constitutive fluorescent protein expression into logarithmic bins at
10 bins/decade. Because the data are log-normally distributed, compute the geometric mean and
variance for the data points in each bin.
8. Observe the constitutive fluorescence distributions. For each data set, select a threshold
below which data are excluded as being too close to the non-transfected population (e.g.,
1 × 10⁷ MEFL for NS3 and NEF HEK, 3 × 10⁷ MEFL for HTT and TAT, 2 × 10⁵ MEFL for TZM-bl, and
10⁵ MEFL for Jurkat data sets).
9. Remove high outliers by excluding all bins without at least 100 data points. Compute both
population and per-bin geometric statistics over this filtered set of data.
10. Include at least three biological replicates for all experiments and indicate error bars
using standard deviation. Variance for all groups should be generally similar: any differences
should be reflected in the displayed standard deviation.
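Steps 7–9 can be illustrated with a short NumPy sketch. It is not the TASBE implementation: it assumes the data have already been compensated and converted to MEFL, the function name and the synthetic data are hypothetical, and the spread is reported as the variance of the log10 values (one reasonable reading of "geometric variance"). Only the 10 bins/decade, the threshold, and the 100-point minimum come from the steps above.

```python
import numpy as np

BINS_PER_DECADE = 10  # step 7
MIN_POINTS = 100      # step 9


def binned_geo_stats(constitutive, output, threshold,
                     bins_per_decade=BINS_PER_DECADE, min_points=MIN_POINTS):
    """Per-bin geometric statistics of `output`, binned by `constitutive` (both in MEFL)."""
    constitutive = np.asarray(constitutive, dtype=float)
    output = np.asarray(output, dtype=float)
    # Step 8: drop events too close to the non-transfected population.
    # Non-positive outputs are also dropped because they have no logarithm.
    keep = (constitutive >= threshold) & (output > 0)
    c, y = constitutive[keep], output[keep]
    # Step 7: logarithmic bins of the constitutive channel.
    idx = np.floor(np.log10(c) * bins_per_decade).astype(int)
    stats = []
    for b in np.unique(idx):
        logs = np.log10(y[idx == b])
        if logs.size < min_points:  # step 9: discard sparse bins
            continue
        stats.append({
            "bin_low": 10 ** (b / bins_per_decade),
            "bin_high": 10 ** ((b + 1) / bins_per_decade),
            "n": int(logs.size),
            "geo_mean": float(10 ** logs.mean()),
            "log10_var": float(logs.var(ddof=1)),
        })
    return stats


# Synthetic, log-normally distributed example data (hypothetical).
rng = np.random.default_rng(0)
marker = 10 ** rng.uniform(6, 9, 20_000)
reporter = 0.1 * marker * rng.lognormal(mean=0.0, sigma=0.2, size=marker.size)
stats = binned_geo_stats(marker, reporter, threshold=1e7)
for row in stats[:3]:
    print(row)
```

Working in log space keeps the per-bin averages from being dominated by the few brightest cells, which is the reason the protocol asks for geometric rather than arithmetic statistics.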
3.1.6 HIV Production and Infection
1. Produce HIV-1 strains by transfecting HEK-293T cells with the corresponding infectious
molecular clones (NIH AIDS reagents program) and JetPRIME® reagent.
2. After 40 h, concentrate virus preparations by ultracentrifugation for 1 h at 64,074 × g and
4 °C on 20% sucrose to exclude particle-free viral proteins.
3. Titrate viral stocks by HIV-1 p24 ELISA.
4. For infection of TZM-bl cells, use a viral inoculum of 500 ng of p24 for each strain.
5. Forty hours after infection, harvest, fix, and permeabilize the cells with cytofix/cytoperm
solution for 15 min at room temperature.
6. Determine the percentage of infected cells by intracellular staining of viral protein p24
with a FITC-conjugated antibody (KC57-FITC, dilution 1:50) and flow cytometry.
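Steps 3 and 4 imply a simple conversion from the ELISA titer to the volume of stock that carries 500 ng of p24. A minimal sketch, with hypothetical strain names and titers; only the 500 ng target comes from the protocol.

```python
P24_PER_INOCULUM_NG = 500.0  # step 4


def inoculum_volume_ul(p24_ng_per_ml, p24_ng=P24_PER_INOCULUM_NG):
    """Volume of viral stock (uL) containing p24_ng of p24, given the ELISA titer in ng/mL."""
    if p24_ng_per_ml <= 0:
        raise ValueError("titer must be positive")
    return p24_ng / p24_ng_per_ml * 1000.0


# Hypothetical ELISA titers (ng p24 per mL).
titers = {"strain_A": 2500.0, "strain_B": 10000.0}
inoculum = {strain: inoculum_volume_ul(t) for strain, t in titers.items()}
for strain, vol in inoculum.items():
    print(f"{strain}: {vol:.0f} uL of stock")
```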
3.1.8 RNA Extraction, cDNA Synthesis, and qPCR
1. Perform RNA extraction with the RNeasy Mini Kit. Wash the cells in PBS and add buffer RLT
directly into the wells.
2. Incubate for 2 min at room temperature and collect the lysate with a sterile scraper.
Proceed with the RNA extraction according to the manufacturer's instructions.
3. Elute RNA in 30 μL of RNase-free water to maximize the yield.
Fig. 2 (a) Crystal structure of L7Ae bound to its RNA target with the possible
insertion sites (Ins1, Ins2, Ins3) highlighted. (b) Graphical representation of the RNA-encoded
circuit regulated by a TEV-responsive L7Ae. State 1: In the absence of TEVp,
L7Ae_TCS binds the k-turns and represses EGFP translation. State 2: When TEVp is present, it
cleaves L7Ae, rendering it nonfunctional, and EGFP levels increase
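The idea of placing a protease cleavage site at a candidate insertion position can be sketched at the sequence level. Everything specific here is an assumption for illustration: the site is the canonical TEV recognition sequence ENLYFQG (cut between Q and G), which may differ from the TCS variant used in the chapter; the protein string is an artificial placeholder, not L7Ae; and the insertion index does not correspond to Ins1–Ins3.

```python
TEV_SITE = "ENLYFQG"  # canonical TEV site, cleaved between Q and G (assumed, not from the chapter)


def insert_site(protein, position, site=TEV_SITE):
    """Return the protein sequence with `site` inserted before index `position`."""
    if not 0 <= position <= len(protein):
        raise ValueError("position outside the sequence")
    return protein[:position] + site + protein[position:]


def cleave(protein, site=TEV_SITE):
    """Split at the first occurrence of `site`, before its last residue (Q|G)."""
    start = protein.find(site)
    if start < 0:
        return [protein]  # no site: the protein stays intact
    cut = start + len(site) - 1
    return [protein[:cut], protein[cut:]]


placeholder = "MAAAAKKKKDDDDEEEELLLL"  # artificial sequence, NOT L7Ae
engineered = insert_site(placeholder, 10)  # hypothetical insertion index
print(engineered)
print(cleave(engineered))   # two fragments when the protease is present
print(cleave(placeholder))  # unmodified protein is not cut
```

In practice the choice of position is a structural question (surface loops that tolerate an insertion without disrupting RNA binding), which this string-level sketch does not address.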
Fig. 3 Graphical representation of protease-based cascades. In all cascade variants, at stage 0, EGFP is
expressed and at stage 1 it is downregulated by Ms2-cNOT7. (a) At stage 2, Ms2-TVCS-cNOT7 activity is
disrupted by TVMV-TUCS and EGFP expression is restored; at stage 3, EGFP expression is knocked down again
as TVMV-TUCS is repressed by TUMV. (b) At stage 2, Ms2-TCS-cNOT7 activity is disrupted by TEV-TVCS and
EGFP expression is restored; at stage 3, EGFP expression is knocked down again as TEV-TVCS is repressed by
TVMV. (c) At stage 2, Ms2-TUCS-cNOT7 activity is impaired by TUMV-TCS and EGFP expression increases; at
stage 3, TUMV-TCS is repressed by TEV, and EGFP is downregulated
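The stage-by-stage logic described in the Fig. 3 caption can be written as a small Boolean model. This is a qualitative sketch only, under the simplifying assumption that each component is either fully present and active or absent; the function and dictionary names are hypothetical, while the component names are taken from the caption.

```python
def egfp_expressed(repressor, protease_1=False, protease_2=False):
    """Boolean reading of a three-layer protease cascade.

    repressor:  the Ms2-cNOT7 variant is expressed (stage 1 onward)
    protease_1: the protease that cleaves the repressor is expressed (stage 2 onward)
    protease_2: the protease that cleaves protease_1 is expressed (stage 3)
    """
    protease_1_active = protease_1 and not protease_2
    repressor_active = repressor and not protease_1_active
    return not repressor_active


STAGES = {
    0: {"repressor": False},
    1: {"repressor": True},
    2: {"repressor": True, "protease_1": True},
    3: {"repressor": True, "protease_1": True, "protease_2": True},
}

# (repressor, protease_1, protease_2) for each panel of Fig. 3.
CASCADES = {
    "a": ("Ms2-TVCS-cNOT7", "TVMV-TUCS", "TUMV"),
    "b": ("Ms2-TCS-cNOT7", "TEV-TVCS", "TVMV"),
    "c": ("Ms2-TUCS-cNOT7", "TUMV-TCS", "TEV"),
}


def stage_table():
    """EGFP state (True = expressed) at each stage; identical for all three variants."""
    return {stage: egfp_expressed(**inputs) for stage, inputs in STAGES.items()}


for panel, parts in CASCADES.items():
    states = ", ".join(f"stage {s}: {'ON' if on else 'OFF'}"
                       for s, on in stage_table().items())
    print(f"({panel}) {' <- '.join(parts)} | EGFP {states}")
```

The alternating ON/OFF pattern across stages is what the caption describes; graded (non-Boolean) responses would need a quantitative model with measured cleavage efficiencies.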
4 Notes
References
1. Ausländer D, Eggerschwiler B, Kemmer C, Geering B, Ausländer S, Fussenegger M (2014) A designer cell-based histamine-specific human allergy profiler. Nat Commun 5:4408
2. di Bernardo D, Marucci L, Menolascina F, Siciliano V (2012) Predicting synthetic gene networks. Methods Mol Biol 813:57–81
3. Tigges M, Marquez-Lago TT, Stelling J, Fussenegger M (2009) A tunable synthetic mammalian oscillator. Nature 457:309–312
4. Siciliano V, Garzilli I, Fracassi C, Criscuolo S, Ventre S, di Bernardo D (2013) MiRNAs confer phenotypic robustness to gene networks by suppressing biological noise. Nat Commun 4:2364
5. Kipniss NH, Dingal PCDP, Abbott TR, Gao Y, Wang H, Dominguez AA, Labanieh L, Qi LS (2017) Engineering cell sensing and responses using a GPCR-coupled CRISPR-Cas system. Nat Commun 8:2212
6. Siciliano V, DiAndreth B, Monel B, Beal J, Huh J, Clayton KL, Wroblewska L, McKeon A, Walker BD, Weiss R (2018) Engineering modular intracellular protein sensor-actuator devices. Nat Commun 9:1881
7. Scheller L, Strittmatter T, Fuchs D, Bojar D, Fussenegger M (2018) Generalized extracellular molecule sensor platform for programming cellular behavior. Nat Chem Biol 14:723–729
8. Courbet A, Endy D, Renard E, Molina F, Bonnet J (2015) Detection of pathological biomarkers in human clinical samples via amplifying genetic switches and logic gates. Sci Transl Med 7:289ra83
9. Sedlmayer F, Fussenegger M (2017) Synthetic biology: a probiotic probe for inflammation. Nat Biomed Eng 1:0097
10. Schwarz KA, Daringer NM, Dolberg TB, Leonard JN (2016) Rewiring human cellular input–output using modular extracellular sensors. Nat Chem Biol 13:202
11. McNamara MA, Nair SK, Holl EK (2015) RNA-based vaccines in cancer immunotherapy. J Immunol Res 2015:794528
12. Sahin U, Karikó K, Türeci Ö (2014) mRNA-based therapeutics – developing a new class of drugs. Nat Rev Drug Discov 13:759–780
13. Cella F, Wroblewska L, Weiss R, Siciliano V (2018) Engineering protein-protein devices for multilayered regulation of mRNA translation using orthogonal proteases in mammalian cells. Nat Commun 9:1–9
14. Culler SJ, Hoff KG, Smolke CD (2010) Reprogramming cellular behavior with RNA controllers responsive to endogenous proteins. Science 330:1251–1255
15. Wroblewska L, Kitada T, Endo K, Siciliano V, Stillo B, Saito H, Weiss R (2015) Mammalian synthetic circuits with RNA binding proteins for RNA-only delivery. Nat Biotechnol 33:839–841
16. PyMOL | pymol.org. https://pymol.org/2/. Accessed 30 Oct 2019
17. Waterhouse A, Bertoni M, Bienert S, Studer G, Tauriello G, Gumienny R, Heer FT, de Beer TAP, Rempfer C, Bordoli L, Lepore R, Schwede T (2018) SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic Acids Res 46:W296–W303
18. Engler C, Marillonnet S (2014) Golden Gate cloning. Methods Mol Biol 1116:119–131
19. Beal J, Wagner TE, Kitada T, Azizgolshani O, Parker JM, Densmore D, Weiss R (2015) Model-driven engineering of gene expression from RNA replicons. ACS Synth Biol 4:48–56
20. Beal J, Weiss R, Yaman F, Davidsohn N, Adler A (2012) A method for fast, high-precision characterization of synthetic biology devices. MIT CSAIL Tech Report 2012-008
21. Moore T, Zhang Y, Fenley MO, Li H (2004) Molecular basis of box C/D RNA-protein interactions; cocrystal structure of archaeal L7Ae and a box C/D RNA. Structure 12:807–818
22. Caliendo F, Dukhinova M, Siciliano V (2019) Engineered cell-based therapeutics: synthetic biology meets immunology. Front Bioeng Biotechnol 7:43
23. Cella F, Siciliano V (2019) Protein-based parts and devices that respond to intracellular and extracellular signals in mammalian cells. Curr Opin Chem Biol 52:47–53
INDEX
Filippo Menolascina (ed.), Synthetic Gene Circuits: Methods and Protocols, Methods in Molecular Biology, vol. 2229,
https://doi.org/10.1007/978-1-0716-1032-9, © Springer Science+Business Media, LLC, part of Springer Nature 2021
347
SYNTHETIC GENE CIRCUITS : METHODS AND PROTOCOLS
348 Index
DNA assembly repressilator .............................................................. 91
automated selection RNA-seq (see RNA sequencing (RNA-seq))
enzyme...................................................... 168–170 simulation of an inducible gene ................... 278–282
primer ....................................................... 170–174 stochastic simulations............................................... 42
batch part standardization ............................ 158–162 whole-cell model .................................................... 269
circular plasmids ..................................................... 167 See also Synthetic circuits
EGF................................................................ 157–158 Gene expression burden
NGS ........................................................................ 168 burden-driven feedback loop
and strain development................................. 149–150 cellular burden.......................................... 325–327
synthetic biology projects ...................................... 157 library of promoters ......................................... 325
type-2S assembly ........................................... 162–165 plasmid...................................................... 323–325
verification .............................................................. 168 burden-responsive promoter
DNA verification .......................................................... 168 biosensor................................................... 322–323
Dynamic models........... 67, 74, 119, 120, 123, 223, 226 genomic context............................................... 322
RNA-seq results ............................................... 322
E cell engineering ...................................................... 314
Edinburgh Genome Foundry (EGF)................ 157–158, host responds ................................................ 317–321
medium................................................................... 316
168, 341–343
molecular cloning.......................................... 315–316
F RNA-seq library preparation ........................ 316–317
strains ...................................................................... 315
Feedback Gene network
burden-based biomolecular ................................... 314 retroactivity (see Retroactivity)
burden-driven................................................ 323–327 synthetic.................................................................. 110
control Gene regulatory networks ................................. 3, 14, 93,
algorithms................................................. 215–218 101, 120, 308
laws...................................................................... 32 Genetic parts
negative loop ........................................................ 5, 34 descriptions............................................................. 151
positive................................................. 4, 9–11, 17–19 DNA constructs ..................................................... 157
stability of oscillations ................................................ 2 receptor vector ....................................................... 171
three-gene negative.................................................... 4 RNA-seq (see RNA sequencing (RNA-seq))
Feedback control standardization .............................................. 158–159
controller Gene transcription networks ............... 32, 294, 299, 304
PI............................................................... 216–217 Gillespie algorithm
relay........................................................... 215–216 choosing a reaction ................................................ 104
implementation ........................................................ 34 iterating................................................................... 105
law ............................................................................. 32 Markov process....................................................... 101
MPC............................................................... 217–218 rate vector function................................................ 114
Focal point...................................................................... 15 resampling time trace data..................................... 116
Funding SSA...................................................................... 42, 77
fee-for-service model ............................................. 145 stochastic algorithm ................................................. 29
government ................................................... 143–145 stoichiometry matrix ..................................... 115, 116
project partnerships................................................ 145 system update ......................................................... 104
time to next reaction..................................... 103–104
G time-trace simulating .................................... 102–103
Gene circuits Global optimization ............................................ 120, 226
cell-free ................................................................... 190 Growth models ......................................... 269, 274, 276,
CLE approach .......................................................... 43 280, 282, 288
construction ........................................................... 337
design.................................................... 123, 267, 270 H
heterologous genes ....................................... 277–278 Hardware ................. 138, 141, 142, 146, 149, 191, 193
living cells ............................................................... 175 control layer pressure regulation ........................... 198
modeling................................................................... 92 flow layer pressure .................................................. 198
QS/Fb circuit..................................................... 43, 44 Hill function ................................ 96, 111, 112, 299, 308
SYNTHETIC GENE CIRCUITS : METHODS AND PROTOCOLS
Index 349
Host–circuit models M
bacterial growth ............................................ 269–276
gene circuits Machine learning
heterologous genes .................................. 277–278 algorithms............................................................... 148
inducible gene .......................................... 278–282 automation need .................................................... 140
model complexity................................................... 269 gene circuit design ................................................. 267
transcriptional logic gates ............................. 281–288 scientific experiments design ................................. 148
T7 RNA polymerase .............................................. 268 test cycle ................................................................. 140
trial-and-error approach ........................................ 148
I Mammalian cell
culture and transfection ................................ 334–335
Intracellular protein-sensor electroporation .............................................. 334–335
acquisition .............................................................. 335 microfluidics/microscopy (see Microfluidics)
analyses ................................................................... 335 segmentation .......................................................... 215
apoptosis assays ...................................................... 340 tissue culture........................................................... 211
cDNA synthesis ..................................... 335, 340–341 Mammalian synthetic biology
data analysis ............................................................ 339 intracellular protein-sensor devices ............. 331–335,
DNA cloning ................................................. 334, 337 337–341
electroporation ...................................... 334–335, 338 protein-based
flow cytometry .............................................. 335, 339 devices.............................................. 336, 341–344
fluorescence imaging..................................... 337–338 strategy...................................................... 333–334
of HEK 293FT cells ...................................... 337–338 synthetic devices ....................................... 331–333
HIV production and infection .............................. 340 Mathematical modeling ......................... 2, 109, 110, 267
HLA-I surface expression ............................. 339–340 Metabolic engineering ................................................. 146
mammalian cells culture ............................... 334–335 Metrology ............................................................ 150–152
plasmid construction..................................... 334, 337 Microfluidics
qPCR ..................................................... 335, 340–341 and cell-free systems.............................. 148–149, 190
RNA extraction ..................................... 335, 340–341 chip fabrication....................................................... 206
chip loading ............................................................ 206
L
pins preparation and wetting................... 209–210
Laboratory automation preculture of cells ..................................... 211–212
automation field shear-free cell loading ...................................... 211
cell-free systems........................................ 148–149 computational algorithms...................................... 215
DNA assembly.......................................... 149–150 connectors .............................................................. 193
metrology ................................................. 150–151 device fabrication ................................................... 244
microfluidics ............................................. 148–149 experiments
ML .................................................................... 148 calibration ................................................. 252–254
open science ..................................................... 150 cells loading ...................................................... 254
standardization ......................................... 150–151 connecting syringes to the chip....................... 252
strain development................................... 149–150 fabrication................................................. 249–250
build ............................................................... 139–140 microfluidic chip wetting................................. 251
business plan microscope setup...................................... 254–255
education .................................................. 146–147 overnight culture.............................................. 250
funding ..................................................... 143–145 setup.......................................................... 244–245
partnerships .............................................. 145–146 syringe preparation........................................... 251
system maintenance and personnel ................. 147 feedback control algorithms
design...................................................................... 139 MPC.......................................................... 217–218
ML .......................................................................... 140 PI controller ............................................. 216–217
strategy........................................................... 141–144 relay controller ......................................... 215–216
synthetic biology .................................................... 138 hardware ................................................................. 193
test........................................................................... 140 PDMS ..................................................................... 205
Liquid handling................................................... 149, 152 time-lapse................................................................ 207
SYNTHETIC GENE CIRCUITS : METHODS AND PROTOCOLS
350 Index
actuation system ............................................... 214 dynamics ................................................................. 222
chip positioning....................................... 213, 214 inducible promoter modeling ...................... 229–231
microscope specs .............................................. 214 local methods ......................................................... 225
settings ...................................................... 214–215 model
tubes.......................................................... 212–213 building............................................................. 221
and turbidostats........................................................ 44 calibration ................................................. 258–259
Mixed integer nonlinear programming ...................... 122 selection .................................................... 231–235
Model calibration parameter estimation ................... 224, 225, 235–238
computational tools ............................................... 244 probability density function................................... 223
image processing stochastic global optimization algorithms ............ 226
cell-tracking and extraction ..................... 256–257 toolbox
fine-tuning ........................................................ 255 download and license....................................... 227
segmentation ............................................ 255–256 requirements and installation guide................ 227
microfluidic Ordinary differential equations (ODEs)
device fabrication ............................................. 244 cyber-physical platform .......................................... 242
experimental setup ................................... 244–245 Matlab solvers......................................................... 123
optimal experimental design......................... 258–259 nonlinear deterministic .......................................... 120
parameter estimation .................................... 257–258 solving................................................................. 98–99
practical identifiability ................................... 247–248 writing....................................................................... 98
sensitivity analysis .......................................... 246–247
structural identifiability .......................................... 245 P
test case ................................................................... 242
Parameter space analysis ....................... 22, 99–100, 106,
Model order reduction .......................... 46, 48, 295–296 121, 260, 261
Model predictive control (MPC) ....................... 217–218 PDMS, see Polydimethysiloxane (PDMS)
Modularity ........................ 119, 149, 150, 267, 293, 332
Photolithography
Moieties .......................................................................... 75 consumables ........................................................... 192
MPC, see Model predictive control (MPC) control mold fabrication............................... 195–196
Multi-objective optimization.............................. 131–133
flow mold fabrication.................................... 194–195
machines ................................................................. 192
N
mask fabrication ..................................................... 194
Network control............................................................... 4 PI controller, see Proportional-Integral (PI) controller
Boolean models .................................................. 31–32 Piecewise-linear differential equation (PLDE) models
strategies ............................................................. 30–31 cyclic orbit ................................................................ 11
synthetic circuits................................................. 32–33 discontinuities .......................................................... 35
Network dynamics IRMA circuit ............................................................ 19
analysis oscillator with positive feedback ....................... 17–19
attractors and their stability ......................... 20–21 toggle switch ...................................................... 15–17
formal verification of network Polydimethysiloxane (PDMS)
properties ................................................ 23–26 bonding .................................................................. 197
modular analysis ........................................... 26–28 casting and curing ......................................... 196–197
state transition graphs ..................... 21–23, 28–30 degassing................................................................. 249
control elastomer................................................................. 193
Boolean models ............................................ 31–32 mixing ..................................................................... 249
strategies ....................................................... 30–31 replica molding
synthetic circuits........................................... 32–33 cleaning and bonding ...................................... 209
Next generation sequencing (NGS) .................. 168, 176 microfluidic device preparation ............... 207–208
silanization........................................................ 207
O Practical identifiability 223, 242, 244, 247–248, 261, 264
ODEs, see Ordinary differential equations (ODEs) Proportional-integral (PI) controller .................... 31, 32,
216–217
Optimal experimental design
AMIGO2 ....................................................... 226, 229 Protein-based devices
candidate models.................................................... 223 cell culture .............................................................. 336
code structure................................................ 227–229 flow cytometry ....................................................... 336
PCR......................................................................... 336
plasmid cloning ..................................... 336, 341–342 quality control and alignments........................ 320
protein–protein devices testing .................... 342–344 sample preparation ................................... 319–320
protein structural analysis ............................. 341–342 time-course assay...................................... 318–319
in silico protein engineering .................................. 336 transcription profiles ........................................ 321
transient transfection cell imaging ........................ 336 transformation .......................................... 317–318
Protein-protein regulation......................... 333, 342–344 library preparation
Protein-RNA regulation .............................................. 333 consumables ............................................. 316–317
Protein sensor-actuator...................... 332–334, 337–341 equipment......................................................... 317
materials
Q genetic analyzer installation..................... 178–179
QSSA, see Quasi steady-state approximation (QSSA) sequencing data ................................................ 179
software dependencies ..................................... 178
Qualitative modeling
Boolean models .................................................... 6–13 methods
DNA synthesis............................................................ 1 data preprocessing.................................... 182–183
differential gene expression ..................... 183–184
dynamic properties............................................. 34–35
gene expression dynamics .......................................... 2 initial workflow setup............................... 179–181
network dynamics promoters and terminators .............................. 184
response function ..................................... 184–185
analysis .......................................................... 20–30
control .......................................................... 30–33 temporary files and logs removing .................. 185
PLDE models ................................................ 3, 15–19 transcription profiles ........................................ 183
reviews ...................................................................... 34 in vivo assay ............................................................ 314
synthetic regulatory circuits................................... 4–6
S
Quasi steady-state approximation (QSSA) ................... 75
Sanger sequencing
R necessary data files.................................................. 171
Relay controller ................................................... 215–216 output ............................................................ 173–174
primer-free regions........................................ 171–173
Reproducibility.......................... 137–139, 142, 149–151
Resource allocation ...................................... 33, 269, 272 web application ...................................................... 173
Restriction digest analysis Sensor
DNA assembly verification .................................... 168 gene expression burden ......................................... 314
genetic logic gates .................................................. 184
web application ............................................. 169–170
Retroactivity intracellular protein (see Intracellular protein-sensor)
biochemical reactions............................................. 295 small-molecule........................................................ 176
Soft lithography
contraction theory ................................................. 296
error ............................................................... 303–304 clean/semiclean room ........................................... 249
external .......................................................... 299–301 consumables .................................................. 192–193
device fabrication
intermodular connections............................. 304–307
internal........................................................... 298–299 bonding of PDMS ........................................... 197
mathematical model of modules .................. 296–297 casting and curing, PDMS....................... 196–197
silanization........................................................ 196
model order reduction.................................. 295–296
modularity .............................................................. 293 machines ................................................................. 192
perturbations .......................................................... 294 Software ........................................................................ 193
scaling and mixing......................................... 301–302 automatic syringe movement ................................ 214
code implementation ......................................... 46–49
time-scale separation ..................................... 295–296
RNA-binding protein .................................................. 333 dependencies .......................................................... 178
RNA sequencing (RNA-seq) DSGRN .................................................................... 22
FACSDiva8............................................................. 335
burden-driven feedback ......................................... 323
characterize genetic parts....................................... 176 FlowJo .................................................................... 340
computational tool................................................. 176 and hardware components..................................... 138
MEIGO .................................................................. 245
host responds
library sequencing ............................................ 320 SnapGene Viewer .................................................... 160
plate-reader data............................................... 321 spectrum companies............................................... 146
promoter characterization ............................... 321 SynBioHub ............................................................. 151
tools ........................................................................ 139
Software (cont.) modeling framework..................................... 120–121
web-based ............................................................... 162 optimization
SSA, see Stochastic simulation algorithm (SSA) problem design......................................... 121–122
Standardization .................................. 138, 149–152, 157 solvers ....................................................... 122–123
genetic part.................................................... 158–159 oscillator design
necessary data files......................................... 159, 160 library of components.............................. 124–125
output ............................................................ 161–162 objective function..................................... 125–126
part regions.................................................... 160–161 problem definition ........................................... 124
web application ...................................................... 161 simulating the dynamics, circuit...................... 128
State transition graphs ............................ 3, 8–11, 13, 16, single objective optimization problem ... 126–128
18, 20–23, 28–30 Pareto front of solutions........................................ 134
Steady-state gene expression practical examples................................................... 123
batch cell-free reactions ......................................... 190 switch-like circuit design
device operation definition .......................................................... 129
cell-free expression ................................... 199–201 library of components.............................. 129–130
filling control lines ................................... 198–199 multi-objective optimization problem .... 131–133
flow lines filling ................................................ 199 objective functions ........................................... 130
experimental reagents ............................................ 194 SynBioHub ................................................................... 151
hardware setup ....................................................... 198 Synthetic biology
host cell................................................................... 189 application .................................................................. 2
microfluidic.................................................... 193, 194 batch cell-free reactions ......................................... 190
microscope hardware ............................................. 193 biocircuits ............................................................... 119
photolithography cell-free systems...................................................... 189
consumables ..................................................... 192 cyber-physical platform .......................................... 242
control mold fabrication .......................... 195–196 DBT ........................................................................ 137
flow mold fabrication............................... 194–195 design-build-test-learn cycle........................... 92, 267
machines ........................................................... 192 federal investments................................................. 143
mask fabrication ............................................... 194 genetic design......................................................... 177
soft lithography homeostasis .............................................................. 32
consumables ............................................. 192–193 laboratory automation (see Laboratory automation)
device fabrication ..................................... 196–198 microfluidics ........................................................... 190
machines ........................................................... 192 OED .............................................................. 241, 242
software................................................................... 193 on-line vs. off-line .................................................. 243
Stochastic modeling ....................................................... 92 photolithographic steps ......................................... 191
CME ......................................................................... 42 promoters and regulators ...................................... 189
continuous deterministic approach ......................... 42 sequencing methods .............................................. 176
gene expression noise............................................... 41 stochastic perturbations ........................................... 22
materials.............................................................. 43–52 toggle switch .............................................................. 4
methods .............................................................. 52–71 two-layer microchemostat design ................ 190, 191
spatial ........................................................................ 42 Synthetic circuits
Stochastic simulation algorithm (SSA) ................. 42, 43, characterizing promoter and terminator ..... 176, 177
78, 79, 83 contextual effects.................................................... 175
Stochastic simulations control ................................................................ 32–33
characterization of results ............................. 105–106 dynamic properties..................................................... 3
circuit performance .................................................. 44 genetic parts and devices ....................................... 176
dynamic model ......................................................... 67 Hill function .................................................. 111, 112
gene circuits.............................................................. 42 interactions ............................................................... 21
Gillespie algorithm ........................................ 102–105 inverse transform sampling.................................... 113
parameter scan............................................... 106–107 materials
stochastic notation ........................................ 101–102 built-in/custom-coded functions ..................... 93
time-trace....................................................... 102–103 computing long-term statistics.................... 49–50
SYNBADm model in proper form .................................. 44–49
initialization ................................................... 123–124 noise .............................................................. 49–50
installation ..................................................... 123–124 software......................................................... 50–52
memorylessness property....................................... 113 T
methods
abstracting the circuit .................................. 94–95 Throughput ............................... 137–143, 149–152, 167
compilation......................................................... 67 Trade-offs ........................................ 2, 83, 120, 122, 304
deterministic solution ................................ 98–100 Transcriptional logic gates
mass action equations .................................. 95–96 circuit function
models to redesign ................................... 107–110 nutrient quality......................................... 286–287
OpenFPM client program ........................... 52–67 RBS ................................................................... 286
parameter estimation ................................... 96–97 host-aware gate
simulation ..................................................... 67–71 AND ................................................................. 284
stochastic simulations............................... 100–107 NAND .............................................................. 285
models....................................................................... 91 NOT ................................................................. 283
novel gene circuits.................................................... 92 Type-2S assembly pre-validation
parameter values..................................................... 110 necessary data files.................................................. 163
redesign..................................................................... 92 output ..................................................................... 165
response function .......................................... 176, 178 restriction sites........................................................ 157
structure and behavior ............................................... 2 web application ............................................. 163–165
workflow ........................................................ 176, 177
W
See also Qualitative modeling
Synthetic construct ............................ 314, 317–318, 322 Whole-cell modeling........................................... 269, 270
Synthetic gene circuits, see Synthetic circuits