


pubs.acs.org/synthbio Research Article

Bayesian Optimization for Design of Multiscale Biological Circuits


Published as part of the ACS Synthetic Biology virtual special issue “AI for Synthetic Biology”.
Charlotte Merzbacher, Oisin Mac Aodha, and Diego A. Oyarzún*
Cite This: ACS Synth. Biol. 2023, 12, 2073−2082


ABSTRACT: Recent advances in synthetic biology have enabled the construction of molecular circuits that operate across multiple scales of cellular organization, such as gene regulation, signaling pathways, and cellular metabolism. Computational optimization can effectively aid the design process, but current methods are generally unsuited for systems with multiple temporal or concentration scales, as these are slow to simulate due to their numerical stiffness. Here, we present a machine learning method for the efficient optimization of biological circuits across scales. The method relies on Bayesian optimization, a technique commonly used to fine-tune deep neural networks, to learn the shape of a performance landscape and iteratively navigate the design space toward an optimal circuit. This strategy allows the joint optimization of both circuit architecture and parameters, and provides a feasible approach to solve a highly nonconvex optimization problem in a mixed-integer input space. We illustrate the applicability of the method on several gene circuits for controlling biosynthetic pathways with strong nonlinearities, multiple interacting scales, and using various performance objectives. The method efficiently handles large multiscale problems and enables parametric sweeps to assess circuit robustness to perturbations, serving as an efficient in silico screening method prior to experimental implementation.

KEYWORDS: Bayesian optimization, machine learning, dynamic pathway control, genetic circuit design, multiscale biological systems, metabolic engineering

Received: February 23, 2023
Published: June 20, 2023
© 2023 The Authors. Published by American Chemical Society. https://doi.org/10.1021/acssynbio.3c00120. ACS Synth. Biol. 2023, 12, 2073−2082

1. INTRODUCTION

The design of molecular circuits with prescribed functions is a core task in synthetic biology.1 These circuits can include components that operate across various scales of cellular organization, such as gene expression, signaling pathways,2 or metabolic processes.3 Computational methods are widely employed to discover circuits with specific dynamics,4−6 and, in particular, optimization-based strategies can be employed to search over design space and single out circuits predicted to fulfill a desired function.7−10 However, circuit design requires the specification of circuit architecture, i.e., the circuit “wiring diagram”, as well as the strength of interactions among molecular components. Since circuit architectures are discrete choices and molecular interactions depend on continuous parameters such as binding rate constants, circuit design leads to mixed-integer optimization problems that can be notoriously difficult to solve.11 Moreover, when circuits operate across multiple scales, their computational models become numerically stiff,12 resulting in extremely slow simulations that make their mixed-integer optimization challenging or even impossible to solve.

Previous work on computational circuit design has largely focused on genetic circuits that operate in isolation from other layers of the cellular machinery (Figure 1A). A range of techniques have been employed to identify functional circuits, including exhaustive search,4−6,13 computational optimization,7,8 systems theoretic approaches,14−18 Bayesian design,19,20 and machine learning.9,21 While these methods differ on their specific modeling strategies and assumptions, they all require computational simulations at many locations, typically thousands to millions, in the design space. But since multiscale systems often cannot be simulated at such scale, the computational costs limit the applicability of current optimization methods.

A notable example of this challenge appears in genetic circuits for dynamic control of metabolic pathways.22−26 These systems are receiving substantial attention thanks to several successful implementations that improved yields as compared to classic techniques in metabolic engineering.27,28 The key principle is to put enzymatic genes under the control of metabolite-responsive mechanisms that couple heterologous

Figure 1. Bayesian optimization for the design of circuit architectures and parameters. (A) Previous optimization methods have focused on genetic
circuits in isolation from other cellular processes. For multiscale circuits, optimization approaches become infeasible due to the difficulty of
simulating stiff dynamical systems in many locations of the design space; common examples of such multiscale systems are gene circuits that
control metabolic production.3 We propose the use of Bayesian optimization (BayesOpt) for efficient optimization of architectures and parameters
in multiscale circuits. (B) Schematic of a mixed-integer Bayesian optimization loop; the objective function is regarded as a random variable to be
optimized over an input space comprised of continuous parameters and a set of discrete circuit architectures. At each iteration, the algorithm
computes the value of the objective function from the solution of an ordinary differential equation (ODE) model at a single location in the input
space. The algorithm learns the shape of the objective landscape using a nonparametric statistical model,37 which is employed to propose a new
location in the input space through an acquisition function designed to balance exploration and exploitation of the input space; more details in the
Methods. The algorithm iteratively learns the shape of the performance landscape until convergence to a global optimum. (C) Example metabolic
pathway under gene regulation. We consider three negative feedback architectures plus open loop control; the architectures are named based on the
net effect of the metabolite on gene expression. The intermediate X1 binds a transcription factor (TF) that controls the expression of pathway
enzymes, either as an activator or repressor. Vin is the constant influx to the engineered pathway from native metabolism. The TF dose-response
curve (at right) is described by three parameters, ki, θi, and ni, where i = 1, 2. The aim is to find designs with optimal architecture and dose-response
parameters (ki, θi); for simplicity the Hill coefficient was fixed to ni = 2. (D) Performance landscapes of the four feasible circuit architectures. We
exclude architectures with positive feedback loops as these are prone to multistability.47 The shape of the performance landscape defined in eq 3
shows substantial variation across the four architectures. This leads to a highly nonconvex mixed-integer optimization problem. Heatmaps show the
value of the objective J computed on a regular grid of the indicated parameters. (E) Comparison of BayesOpt against other strategies using the toy
model as a benchmark; lower objective function values are better. Shown are the results for random sampling (N = 1,000 samples), grid search (N
= 40,000), a genetic algorithm55 (N = 100 individuals, N = 1,000 generations), and a gradient-based optimizer to find optimal continuous
parameter values for each architecture.48
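
The acquisition step in panel B can be illustrated with a small sketch. The main text states that each new sample is chosen to maximize the expected improvement on the best sample found so far; the closed form below is the standard expected-improvement expression for a Gaussian posterior under minimization. The function name, the candidate list, and all numerical values are illustrative assumptions and are not taken from the article's implementation.

```python
import math

def expected_improvement(mu, sigma, best):
    """Expected improvement (minimization) at a point whose surrogate
    posterior is Gaussian with mean `mu` and standard deviation `sigma`.
    `best` is the lowest objective value observed so far."""
    if sigma <= 0.0:
        # No posterior uncertainty: the improvement is deterministic.
        return max(best - mu, 0.0)
    z = (best - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (best - mu) * cdf + sigma * pdf

# Hypothetical candidates: (architecture label, posterior mean, posterior std).
candidates = [
    ("dual control", 0.90, 0.05),         # good mean, already well explored
    ("upstream repression", 1.10, 0.40),  # worse mean, very uncertain
    ("open loop", 1.50, 0.10),            # poor mean, fairly certain
]
best_so_far = 1.0
scores = {name: expected_improvement(m, s, best_so_far)
          for name, m, s in candidates}
next_choice = max(scores, key=scores.get)
```

With these made-up numbers the uncertain candidate scores highest even though its posterior mean is worse than the incumbent, which is the exploration/exploitation balance the caption refers to.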

expression to the concentration of a pathway intermediate.3 This creates feedback loops between enzyme expression and pathway intermediates that allow the control of pathway activity in response to upstream changes in growth conditions or precursor availability. Such dual genetic-metabolic systems are particularly challenging to simulate efficiently because metabolites and enzymes vary in different time scales, from milliseconds (enzyme kinetics) to minutes (enzyme expression), and they also appear in vastly different concentrations; in bacteria enzymes are typically expressed in nanomolar concentrations, while metabolites are found typically above the millimolar range.29 Moreover, the implementation of these systems is costly and requires substantial experimental fine-tuning. As a result, a central task prior to implementation is the choice of a suitable feedback control loop between metabolites and enzymatic genes, and the strength of interactions between metabolites and actuators of gene expression such as transcription factors30 or riboregulators.31 The design of control architectures is particularly important, because there are many ways of building similar control loops,32 for example by employing combinations of transcriptional activators and repressors,33,34 that may differ in their performance and cost of implementation.

Here, we present a fast and scalable machine learning approach for optimization of multiscale circuit architectures and parameters (Figure 1A). The method is based on Bayesian optimization coupled with differential equation models, and we highlight its utility in various models of metabolic pathways under genetic feedback control.35 Using a toy example for a simple pathway, we first show that the method converges rapidly and outperforms other optimizers by a substantial margin. We then consider real world models of metabolic pathways in Escherichia coli for the production of several relevant precursors: glucaric acid,36 fatty acids,33 and p-aminostyrene.34 We use these pathways to illustrate how the speed of our method enables screening optimal designs in realistic design tasks that would otherwise be infeasible to compute, including the impact of uncertain enzyme kinetic parameters, the use of layered architectures that combine metabolic and genetic control, and the optimization of a complex model with 23 differential equations, 27 candidate control architectures, and 16 parameters to be optimized. The method can help speed up the design of synthetic biological circuits and presents a novel approach to explore the design space ahead of implementation.

2. RESULTS

2.1. Bayesian Optimization for Joint Optimization of Circuit Architecture and Parameters. In general, a circuit design task can be stated as the following mixed-integer optimization problem:

$$\min_{p_d,\,p_c} \; J(x, p_c, p_d) \quad \text{subject to:} \quad \frac{dx}{dt} = h(x), \quad p_c \ \text{continuous}, \ p_d \ \text{discrete} \tag{1}$$

where J(x, pc, pd) is a performance objective to be optimized over a space of continuous parameters pc and a discrete set of circuit architectures pd. The ordinary differential equation (ODE) in eq 1 describes the temporal dynamics of circuit components and is typically built from mass balance relations comprised in the nonlinear function h(x). Common examples of continuous parameters in applications are binding affinities between DNA and regulatory proteins, or the strength of protein-protein interactions. Conversely, circuit architecture would typically involve various combinations of positive and negative feedback loops among molecular species. We have stated the problem as minimization of J, but similar formulations can be posed as a maximization problem.

In this paper, we propose to solve the design problem in eq 1 with Bayesian optimization (BayesOpt), a class of algorithms designed for problems with objective functions that are expensive to compute. BayesOpt is a global optimization technique that treats the objective function as a random variable with a prior distribution on it. The algorithm creates a statistical model of the objective through subsequent evaluations, which are employed to build a posterior distribution and determine the next set of inputs to evaluate (Figure 1B). A typical application of BayesOpt is in design of experiments35 where the objective function requires measuring data with costly and/or slow experimental work. In deep learning, BayesOpt is widely employed for model selection, as traditional grid search approaches require large compute resources to train many architectures with combinations of various layer sizes and other hyperparameters.37,38

For the circuit design task in eq 1, when the biological system has multiple scales the computation of objective J requires solving a stiff ODE at many locations of the mixed-integer search space, which can rapidly become infeasible. To illustrate the utility of BayesOpt in a range of design problems, we focus on genetic control circuits for metabolic pathways that synthesize high-value products. In this case, the ODE in eq 1 contains two sets of equations:

$$\frac{ds}{dt} = f(s, e) - \lambda s, \qquad \frac{de}{dt} = u(s, p_c, p_d) - \lambda e \tag{2}$$

where s and e are vectors of metabolite and enzyme concentrations, respectively. The term f(s, e) describes the biochemical reactions between pathway intermediates, while the parameter λ models the dilution effect by cell growth. The vector u(s, pc, pd) describes the enzyme expression rates controlled by some pathway intermediates, and typically takes the form of sigmoidal dose-response curves that lump together processes such as metabolite-TF or metabolite-riboregulator interactions.30 The continuous parameters pc model the dose-response curves of the feedback mechanisms, whereas the discrete parameters pd specify the gene control architecture. The number of heterologous enzymes determines the number of genes in the control circuit. In pathways under dynamic control as in eq 2, both sets of species change in different time scales; metabolic reactions operate in the millisecond range or faster,39 while enzyme expression changes in the scale of minutes or longer. Moreover, metabolites and enzymes are also present in different ranges of concentrations, from nM for enzymes to mM and higher for metabolites.29 As a result, simulation of the ODE in eq 2 is computationally expensive, particularly when this has to be done many times as part of an optimization-based search.

The performance objective J can be flexibly used to model common design goals such as production flux, yield or titer, as well as cost-benefit tasks that balance production with the deleterious impact of the pathway on the physiology of the host. To first establish a baseline for the performance of our method, we employed a simple toy pathway model that displays common features found in real metabolic pathways (Figure 1C). The model includes a metabolic branch point through a heterologous pathway with two enzymatic steps. As a performance objective we considered the minimization of

$$J = \alpha_1 J_{\text{prod}} + \alpha_2 J_{\text{cost}} \tag{3}$$

where Jprod was designed so that its minimization is equivalent to maximization of the production flux, and Jcost penalizes the total amount of enzyme expressed during the culture. The parameters α1 and α2 are positive weights used to control the balance between the costs and benefits of expressing the heterologous pathway. Details on the objective function can be found in the Supporting Information.

We considered the four control architectures shown in Figure 1C, which include open loop control as well as three different implementations of negative feedback control using a metabolite-responsive transcription factor. Negative feedback is widely employed in gene circuits as it has substantial benefits in terms of robustness and performance, and its properties have been extensively studied in the literature.40−42 To illustrate the challenge of jointly optimizing circuit architecture and parameters, in Figure 1D we show a schematic of the design space. The four control architectures under consideration reside at different discrete points in the architecture space. Within each architecture, we observe substantial variations in the shape of the performance landscape J as a function of the dose-response parameters pc. There are cases with convex landscapes with a clear optimum (e.g. dual control) and landscapes with flat basins where most optimization algorithms would struggle to find the optimum (e.g. downstream activation). When searching over the space

Figure 2. Robustness of optimal circuits to parameter uncertainty. (A) Schematic of a dynamic pathway for production of glucaric acid in
Escherichia coli.26 The pathway includes allosteric inhibition and export of an intermediate to the extracellular space. The core pathway components
myoinositol (MI) and glucaric acid (GA) are modeled explicitly, as are the enzymes Ino1 and MIOX. The enzyme SuhB is not rate-limiting and is
not modeled explicitly. Vin is the constant influx to the engineered pathway from native metabolism. As in Figure 1C, the architectures are named
based on the net effect of the metabolite on gene expression. (B) Sample run of the BayesOpt algorithm for 1,000 iterations of the loop in Figure
1B. Black line shows the descent on the value of the objective function. Dots show all samples colored by architecture; pie charts show the fraction
of architectures explored by the algorithm, and the fraction of samples taken from the majority architecture (dual control). The first quarter of the
run had the most exploration of architectures other than dual control, with 38.6% of samples coming from nonmajority architectures. This
percentage steadily decreased over the iterations but did not drop below 20%, illustrating the global nature of the optimization routine. (C) To
examine the robustness of the optimal solutions to parameter uncertainty, we computed optimal solutions for many perturbed parameters of the
allosteric activation of MIOX by its substrate myoinositol (MI). Strip plot shows the best objective function values achieved for background and
perturbed kinetic parameters (Vm,MIOX, aMIOX, ka,MIOX) in eq 4. Kinetic parameters were perturbed using Latin Hypercube sampling56 in the range
(−100%, +100%) of the nominal values (Supporting Information). We observed little difference between background and perturbed values; dashed
line denotes the mean value of the objective function. Only one of the N = 100 runs for perturbed parameters failed to converge to the optimum. (D)
Optimal architectures across runs with background and perturbed parameter values. Both background and perturbed systems resulted in over 80%
of runs selecting dual control as the optimal architecture. (E) Average dose-response curves and distribution of optimal parameters for the dual
control architecture with perturbed allosteric parameters. The repressive and activatory loops have substantially different dose-response curves on
average. The distributions of the dose-response parameters (right) show important variations in their mean and dispersion. The parameters ki and θi
determine the maximal enzyme expression rate and regulatory threshold, respectively.
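
The perturbation scheme in panel C, Latin Hypercube sampling of kinetic parameters within (−100%, +100%) of their nominal values, could be sketched as follows. This is a minimal standard-library sketch under stated assumptions: the helper names `latin_hypercube` and `perturb` are hypothetical, and the nominal values are placeholders rather than the values used in the article (those are in its Supporting Information).

```python
import random

def latin_hypercube(n_samples, n_dims, rng):
    """Return n_samples points in the unit hypercube such that, in every
    dimension, each of the n_samples equal-width strata holds one point."""
    columns = []
    for _ in range(n_dims):
        # One uniform draw inside each stratum, then shuffle the strata.
        col = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(col)
        columns.append(col)
    return [tuple(col[i] for col in columns) for i in range(n_samples)]

def perturb(nominal, unit_point, rel_range=1.0):
    """Map a unit-cube point to parameters within +/- rel_range of nominal."""
    return {
        name: value * (1.0 + rel_range * (2.0 * u - 1.0))
        for (name, value), u in zip(nominal.items(), unit_point)
    }

# Placeholder nominal values (NOT from the article), for illustration only.
nominal = {"Vm_MIOX": 1.0, "a_MIOX": 5.0, "ka_MIOX": 2.0}
rng = random.Random(0)
unit_points = latin_hypercube(100, len(nominal), rng)
samples = [perturb(nominal, p) for p in unit_points]
```

Each entry of `samples` would then be passed to the optimizer as one perturbed model instance; stratifying every dimension gives more even coverage of the range than independent uniform draws with the same number of samples.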

of architectures and parameters simultaneously, the problem becomes a mixed-integer, nonconvex optimization that is extremely challenging to solve with traditional approaches.

We implemented a BayesOpt routine to jointly compute the architecture (pd) and dose-response parameters (pc) that minimize the performance objective in eq 3. We benchmarked its performance against several other methods, including a random search, an exhaustive grid search, a gradient-based method, and a genetic algorithm (Figure 1E). The algorithm was able to compute optimal solutions rapidly (average 27 seconds per run across 100 runs) and robustly (standard deviation less than 2.5% of the mean optimal objective function value). BayesOpt runs significantly faster than the other methods, and provides a 30-fold improvement in speed over a genetic algorithm. The accuracy of the optimum, quantified by the minimal value of the objective function, is on average 11.4% worse than the genetic algorithm, but this falls within the variation of the latter across several runs. We also note that the traditional gradient-based optimizer proved unreliable and failed to converge on 14.5% of runs.

A key advantage of Bayesian methods is that they are not gradient-based, and therefore are not constrained to navigate the space smoothly in the direction of steepest descent. Gradient-based methods can get trapped in local minima and struggle to find the global optimum, especially in highly nonconvex landscapes like the ones presented here. In contrast, BayesOpt does not converge by chasing minima directly but rather by modeling the entire objective function landscape, which yields rapid and reliable results. The method can perform multiple “jumps” between distant locations in the discrete-continuous search space, where each subsequent sample is selected to maximize the expected improvement on the best sample found so far.

The speed of our approach enables the computation of large solution ensembles under model perturbations such as sweeps of key model parameters. In addition, our method can search high-dimensional mixed-integer design spaces. We next illustrate the versatility of the approach in a range of relevant real world pathways that require solving the optimization problem for large samples of parameter values.

2.2. Robustness of Control Circuits to Uncertainty in Enzyme Kinetic Parameters. A challenge in building pathway models is the substantial uncertainty on the enzyme kinetic parameters; this is particularly critical for pathways that include regulatory mechanisms such as allostery or product inhibition, which are often poorly characterized. Databases such as BRENDA43 often have insufficient data on enzyme kinetics for a particular host strain or substrate of interest.

Figure 3. Optimization of metabolite dynamics in a fatty acid synthesis pathway. (A) Pathway diagrams with various control architectures
implemented in Escherichia coli.33 The metabolic loop employs a metabolite-responsive transcription factor, whereas the gene loop includes only a
repressor expressed on the same promoter as the enzyme. (B) Representative run of BayesOpt with cost-benefit objective showing the best
objective function value (black line). All samples are colored by their architecture. Pie charts of each quarter of the run show continued exploration
of all architectures despite clear stratification in losses. (C) Optimal trade-off curve between overshoot and rise time. The objective weight α was
swept from α = 0.01 to α = 10,000 and BayesOpt was run for 100 iterations at each α value. The optimal parameter values were used to compute
the rise time and overshoot for visualization. The inset shows three sample trajectories illustrating how different optimal architectures navigate the
trade-off between overshoot and rise time.
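
The two trajectory metrics in panel C can be computed from any simulated time course. In the sketch below, percent overshoot follows the definition given in the main text (the percent difference between the maximum and the steady state concentration). The rise-time rule, the first sampled time at which the trajectory reaches 90% of its steady state value, is an assumption made here for illustration, as is the synthetic trajectory that stands in for an ODE solution.

```python
import math

def percent_overshoot(values, steady_state):
    """Percent difference between the peak and the steady-state value."""
    peak = max(values)
    return max(0.0, 100.0 * (peak - steady_state) / steady_state)

def rise_time(times, values, steady_state, fraction=0.9):
    """First sampled time at which the trajectory reaches `fraction` of the
    steady-state value (an assumed definition); None if never reached."""
    target = fraction * steady_state
    for t, v in zip(times, values):
        if v >= target:
            return t
    return None

# Synthetic damped oscillation standing in for a simulated product trajectory.
times = [0.05 * i for i in range(401)]  # 0 to 20 time units
values = [1.0 - math.exp(-0.5 * t) * math.cos(2.0 * t) for t in times]
steady = values[-1]                     # last sample approximates steady state
overshoot = percent_overshoot(values, steady)
t_rise = rise_time(times, values, steady)
```

A weighted sum of these two quantities, with the weight α swept over several orders of magnitude as described in the caption, would trace out a trade-off curve of the kind shown in panel C.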

Since pathway dynamics can strongly depend on enzyme kinetics, the parametric uncertainty requires extensive sweeps of kinetic parameters to determine the robustness of a specific control architecture deemed to be optimal.

We focused on a pathway for synthesis of glucaric acid in E. coli (Figure 2A), a key precursor for many downstream products.36 The pathway branches from glucose-6-phosphate (g6p) in upper glycolysis and contains three enzymatic steps (Ino1, SuhB, and MIOX). Doong and colleagues implemented a dynamic control circuit using the dual transcriptional regulator IpsA which responds to the intermediate myoinositol (MI).26 The pathway enzyme MIOX is allosterically activated by its own precursor, and one intermediate (MI) can be exported to the extracellular space. We employed a previously developed ODE model10 that was parametrized using a combination of enzyme kinetic data and omics measurements, and considered the same four control architectures as in the previous example, including various alternative implementations of negative feedback control.

The results in Figure 2B show a typical run of the optimizer when using the cost-benefit objective in eq 3 (details in Supporting Information), together with the fraction of samples in which the algorithm explored each control architecture across the successive iterations. The optimal architecture (dual control in this case) was found quickly and the algorithm was able to further decrease the value of the objective function by exploring the space of dose-response parameters of IpsA. We observe that as the iterations progress, the algorithm shows a remarkable ability to explore other architectures despite their larger objective function values, thus highlighting the global nature of the algorithm.

To explore the impact of uncertain enzyme kinetics, we perturbed the parameters of the rate-limiting MIOX allosteric reaction:

$$V_{\text{MIOX}} = \frac{V_{m,\text{eff}}\,\text{MI}}{k_{m,\text{MIOX}} + \text{MI}}, \quad \text{given} \quad V_{m,\text{eff}} = V_{m,\text{MIOX}} \left(1 + \frac{a_{\text{MIOX}}\,\text{MI}}{k_{a,\text{MIOX}} + \text{MI}}\right) \tag{4}$$

where Vm,MIOX is the maximum rate of reaction, km,MIOX is the Michaelis-Menten constant, and ka,MIOX and aMIOX are allosteric activation constants. We solved the optimization problem for 1,000 combinations of these three parameters, which took under 16 hours on a MacBook Air with Apple M1 processor and 8 GB of RAM running macOS Monterey. Perturbing the kinetic parameters of the glucaric acid pathway did not significantly affect the minimum objective function value achieved, indicating that the optimum is robust to uncertainty in the kinetic parameters (Figure 2C). In particular, the mean optimal objective function value was not significantly higher among the perturbed samples. We found that the dual control architecture was chosen as optimal in more than 85% of samples (Figure 2D). We thus sought to examine the optimal dose-response parameters of this architecture in more detail.

The maximal enzyme expression rates (k) and regulatory thresholds (θ) control the shape of the dose-response curves. As shown in Figure 2E, we found that the upstream repressive loop and downstream activatory loop had different optimal dose-response curves, corresponding to different optimal values of the continuous parameters. Optimal values of the upstream repression threshold θ1 are low (mean value 0.64) and compressed into a narrow range as compared to the larger standard deviation of the downstream repression threshold θ2 (mean value 7.24). This is reflected in a larger variation in the shape of the dose-response curve for the downstream loop. Experimental fine-tuning of a dual control circuit might target parameters whose optimal values span a wide range, such as k1, as varying these parameters is less likely to impair circuit function. Overall, these results show the robustness of the glucaric acid dual control system to kinetic parameter uncertainty and demonstrate the possibilities enabled by the speed of BayesOpt.

2.3. Exploration of Alternative Objective Functions. In the previous case studies we employed a cost-benefit objective designed to account for the trade-off between heterologous production and the cost of expressing pathway enzymes, as in eq 3. To demonstrate the flexibility of the method with other objective functions, here we consider the optimization of the temporal trajectories of pathway metabolites.

We focused on the joint optimization of the rise time and overshoot in a model of a fatty acid production pathway considered previously in the literature.33 Fatty acids are an

essential energy source and cellular membrane component. In addition, hydrocarbons derived from fatty acids have attracted attention as a potential biofuel source.24,44 Recent work engineering metabolic and genetic control loops showed that negative feedback control could speed up the rise to steady-state conditions.33 The pathway built in the literature expressed a thioesterase under transcriptional control, shown as the negative metabolic loop (NML) architecture in Figure 3A. In addition to transcription-factor-mediated negative feedback loops, this model also includes individually implemented direct genetic loops where a repressor is expressed on the same promoter as the enzyme. These two different scales of loops interface with different levels of cellular organization. We explore several control architectures previously proposed in the literature33 (Figure 3A).

We first considered a similar objective function as in eq 3 so as to compare convergence against the previous case studies. We implemented a modified production flux objective which takes the reciprocal of the product flux to convert the optimization to a minimization problem. The pathway cost Jcost is measured by summing the expression of all heterologous enzymes and varies across the different architectures. Details on the objective function can be found in the Supporting Information. A representative optimization run for this objective (Figure 3B) shows that the negative gene loop (NGL, green) and negative metabolic loop (NML, orange) architectures perform, on average, better than the other two architectures. BayesOpt samples taken from the open-loop architecture were, on average, 2 orders of magnitude worse than samples taken from NML and NGL architectures. Despite such hierarchy of loss values across the four architectures, the method effectively explores all architectures throughout the optimization run.

We next considered the optimization of percent overshoot and rise time presented in the literature.33 The percent overshoot objective, Jos, measures the maximal deviation of product from its steady state concentration and is defined as the percent difference between the maximum fatty acid concentration and the steady state fatty acid concentration.

NGL giving a low overshoot and LNML a low rise time. The optimal NML circuit has no overshoot but the slowest rise time, while the LNML has a rapid rise time but overshoots the steady-state value by more than double. These opposing trade-offs demonstrate the importance of balancing multiple circuit design objectives.

2.4. Scalability to Large Pathway Models. Our previous case studies have been limited to circuits with a single metabolite controlling gene expression and a relatively small number of control architectures. We now study a large model for the synthesis of p-aminostyrene (p-AS), an industrially relevant vinyl aromatic monomer, in E. coli (Figure 4A)45 using a cost-benefit objective similar to eq 3 tailored to the specific pathway (details in Supporting Information). This model has two possible metabolites that can regulate gene expression, namely p-aminocinnamic acid (p-ACA) and p-aminophenylalanine (p-AF), both of which can act as ligands for aptazyme-regulated expression device (aRED) transcription factors,46 and three genes to be controlled. The aRED transcription factors can also act as dual regulators (activators or repressors) on any of the three promoters involved in the pathway. For simplicity, we limit the design space to control architectures without positive feedback loops, as these are prone to bistability.47 This results in 27 possible control architectures and 16 continuous parameters to be optimized. The model also has a number of additional complexities. It contains operon-based gene expression commonly found in bacterial systems (genes papA, papB, and papC are expressed on the papABC operon), it includes a detailed description of mRNA dynamics and protein folding, which results in a large model with 23 differential equations, and it can also display oscillatory dynamics.

In addition to expression of heterologous enzymes, the accumulation of toxic intermediates is another major source of genetic burden to host organisms. The p-AS model has several sources of toxicity present in the pathway.34,45 The intermediate p-ACA and the efflux pump used to remove p-ACA from cells are both cytotoxic, while another intermediate, p-AF, leaks from cells.34 The pathway enzyme L-Amino Acid
The rise time, Jrt, is a measure of how fast fatty acid production Oxidase (LAAO) depletes key aromatic amino acid metabo-
rises to steady state and is defined as the first time point where lites and creates toxic hydrogen peroxide as a byproduct. The
the fatty acid concentration reaches 50% of the steady state model incorporates these various types of toxicity in the form
value, normalized by the total integration time. We minimized of a toxicity factor τ. This toxicity factor is of the form
the sum of the overshoot and rise time with a scaling weight α:
ki
J = Jrt + Jos = Pefflux
(5) ki +
pACA
+ +
LAAO
ta tp tl (6)
Adjusting α balances the relative importance of the two
optimization criteria; details on the calculation of rise time and where tl, ta, and tp are chemical-specific toxicity factors.
overshoot are in the Supporting Information. Higher values of Enzyme-induced toxicity tl scales the key metabolite depletion
α correspond to optimal circuits with low rise times, while rate driven by the enzyme LAAO. Metabolite-induced toxicity
lower values of α prioritize circuits with low overshoot. Rise ta scales the impact of toxic intermediate p-ACA concentration.
time is a measure of circuit speed, while overshoot is a measure Finally, protein-induced toxicity tp reflects the toxicity caused
of circuit accuracy. We found that when α is varied across by efflux pump expression. The toxicity factor acts as a scaling
several orders of magnitude, the optimal circuits form a coefficient on the pathway synthesis, degradation, and folding
optimal trade-off curve (Figure 3C). Different architectures reaction rates.
occupy different parts of the optimal trade-off curve and Despite the complexity and size of the p-AS model, we
display markedly different dynamics. The NML optima observe that BayesOpt explores many of the 27 possible
occupies a single point in the loss space, indicating that architectures and converges to a low value of the objective
multiple continuous parameter values give the same loss function (Figure 4B); this was also achieved at a reasonable
function value for multiple values of α. The NML also has the computational cost (mean run time under 2 min). The best
lowest absolute loss function value of all the architectures architecture selected in the sample run was a double upstream
considered. The NGL and layered negative metabolic loop repression, single downstream activation loop controlled by p-
(LNML) architectures occupy larger ranges on the curve, with AF (Figure 4B, inset), but there is no clear best architecture

there are combinations of architectures and parameter values
that achieve a similar optimal loss. We also found that several
architectures can display oscillatory solutions, which we chose
to exclude from the search by applying a peak detection
algorithm48 and adding a large regularization term to the loss.
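The filtering step described above can be sketched as follows. This is a minimal stand-in under stated assumptions, not the authors' implementation: the paper cites SciPy for peak detection, whereas this version uses a plain local-maximum test from the standard library, and the names `count_peaks` and `penalized_loss`, the tolerance, the "second half of the trajectory" window, and the penalty size are all hypothetical choices.

```python
import math

def count_peaks(y, tol=1e-6):
    """Count interior samples that exceed both neighbours by more than tol."""
    return sum(
        1
        for i in range(1, len(y) - 1)
        if y[i] - y[i - 1] > tol and y[i] - y[i + 1] > tol
    )

def penalized_loss(base_loss, trajectory, max_peaks=1, penalty=1e6):
    """Add a large constant when the late part of the trajectory keeps peaking."""
    tail = trajectory[len(trajectory) // 2:]  # ignore the initial transient (assumed window)
    if count_peaks(tail) > max_peaks:
        return base_loss + penalty
    return base_loss

# Toy trajectories on a common time grid (illustrative only, not the pathway models)
t = [0.05 * k for k in range(400)]
settling = [1.0 - math.exp(-s) for s in t]                # monotone rise to steady state
oscillating = [1.0 + 0.5 * math.sin(2.0 * s) for s in t]  # sustained oscillation

print(penalized_loss(0.3, settling))
print(penalized_loss(0.3, oscillating))
```

Adding a constant penalty rather than rejecting the sample keeps every evaluation usable by the optimizer, which then learns to avoid the oscillatory regions of the design space.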
To investigate the robustness to chemical toxicity, we
perturbed the metabolite-induced toxicity ta and protein-
induced toxicity tp in eq 6. The optimal loss values were found
to be comparable between perturbed and background systems
(Figure 4C). Additionally, when projected onto a 2-dimen-
sional space using principal component analysis, the distribu-
tion of background parameter values was similar to the
distribution of perturbed solutions, indicating that the
perturbation did not significantly affect the optimal parameters
selected (Figure 4D).49 The p-AS pathway lies at the far end of
what is currently possible to build experimentally and thus
illustrates the broad applicability of BayesOpt to realistic
design tasks in metabolic engineering.
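The perturbation experiment relies on Latin hypercube sampling of the two toxicity parameters. The sketch below shows one way such a sample could be drawn with the standard library only. The bounds are the ones quoted in the Figure 4 caption, ordered low to high; the log-uniform mapping, the seed, and the function names `latin_hypercube` and `to_log_range` are assumptions made for illustration, and the principal component projection is not reproduced here.

```python
import math
import random

def latin_hypercube(n_samples, n_dims, rng):
    """Return n_samples points in [0, 1)^n_dims with one point per stratum in each dimension."""
    columns = []
    for _ in range(n_dims):
        # one uniform draw inside each of the n_samples equal-width strata
        strata = [(k + rng.random()) / n_samples for k in range(n_samples)]
        rng.shuffle(strata)  # decouple the stratum order across dimensions
        columns.append(strata)
    return [tuple(col[i] for col in columns) for i in range(n_samples)]

def to_log_range(u, low, high):
    """Map u in [0, 1) onto [low, high) uniformly in log10 space (assumed scaling)."""
    return 10 ** (math.log10(low) + u * (math.log10(high) - math.log10(low)))

rng = random.Random(0)
unit = latin_hypercube(100, 2, rng)  # N = 100 perturbed points for (t_a, t_p)
samples = [
    (to_log_range(u_a, 1e-4, 1e-3), to_log_range(u_p, 1e-4, 1e1))
    for u_a, u_p in unit
]
```

Each perturbed pair would then be substituted into the model before rerunning the optimization, and the resulting optima compared with those obtained at the nominal values.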

3. DISCUSSION
Progress in synthetic biology allows the construction of circuits
of increased complexity across various levels of biological
organization. However, large design spaces and multiple scales
can become substantial challenges for the design of functional
systems. In this paper, we presented the use of Bayesian
optimization for the design of biological circuits. The method
can rapidly find circuit architectures and parameters that
optimize a performance objective that captures the target
circuit functionality.
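As a concrete instance of such a performance objective, the overshoot and rise-time loss used for the fatty acid pathway can be sketched as below. The sketch reads eq 5 as J = α·Jrt + Jos, which matches the statement that higher α favors low rise times. It further assumes that the final sample approximates the steady state and that the time grid starts at zero; the function names are hypothetical.

```python
def percent_overshoot(y):
    """Percent difference between the peak value and the final (steady-state) value."""
    steady = y[-1]
    return 100.0 * (max(y) - steady) / steady

def rise_time(t, y, fraction=0.5):
    """First time at which y reaches `fraction` of its final value, divided by the total time."""
    target = fraction * y[-1]
    for time, value in zip(t, y):
        if value >= target:
            return time / t[-1]
    return 1.0  # target never reached within the simulated window

def loss(t, y, alpha):
    """Weighted sum alpha * J_rt + J_os."""
    return alpha * rise_time(t, y) + percent_overshoot(y)

# Small hand-made trajectory: peaks at 1.2 before settling at 1.0
t = [0, 1, 2, 3, 4]
y = [0.0, 0.6, 1.2, 1.0, 1.0]
print(rise_time(t, y), percent_overshoot(y), loss(t, y, alpha=10.0))
```

Because the rise time is normalized to the unit interval while the overshoot is a percentage, α has to span several orders of magnitude before the balance between the two terms shifts noticeably.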
The method is particularly well suited for cases in which the
multiple scales prevent efficient simulation of ODE models.
Gene circuits designed to control metabolic pathways are an
excellent example of such multiscale systems, as they combine
fast metabolic time scales with the much slower dynamics of
gene expression. Moreover, the choice of regulators, control
points, and control architectures adds multiple degrees of freedom that are infeasible to explore experimentally. Previously implemented metabolic control systems have been built primarily based on application-specific knowledge of pathway features.27,50 We have shown that Bayesian optimization can aid the design of such systems prior to implementation and serve as a tool for in silico screening of competing designs that may have similar performance but entail different costs of wet-lab implementation. We showed the efficiency and scalability of the method in several real-world case studies from metabolic engineering. In particular, the p-aminostyrene pathway is more complex than systems typically implemented in the literature so far, which suggests that the method is applicable across a range of relevant design tasks.
Figure 4. Bayesian optimization in a complex pathway. (A) Schematic of pathway for production of p-aminostyrene.34 Two intermediates can act as ligands for metabolite-dependent riboregulators, and there are three promoter sites of control. The optimization problem has 16 continuous decision variables and 27 circuit architectures. The substrate S is converted by enzymes A, B, and C to X1, which is then converted by E to X2. The toxic substrate X2 is then pumped out of the cell via an efflux pump to form the product P. Both X1 and X2 can act on the transcription factors TF1 and TF2. Vin is the constant influx to the engineered pathway from native metabolism. (B) Representative run of the BayesOpt algorithm; the method samples many architectures before settling on the optimal one. Pie charts show continued exploration of a large number of architectures. The winning architecture is shown in the inset. (C) The p-aminostyrene pathway has several forms of substrate, protein, and enzyme toxicity expressed via a toxicity factor τ (see eq 6). To explore the effects of protein and metabolite toxicity, we perturbed the toxicity factor. Metabolite-induced toxicity was perturbed on the nominal range (1 × 10^−3, 1 × 10^−4) and protein-induced toxicity on the range (1 × 10^−4, 1 × 10^1), respectively. Both ranges were selected to match the ranges provided in the literature.34 Latin Hypercube sampling was used to generate N = 100 perturbed parameter values, and the optimal solutions were compared to an equal number of background solutions using the nominal parameter values. (D) Visualization of the optimal solutions; scatter plot of principal components of the optimal parameter values for the model with perturbed toxicity parameters (N = 100). Contour plots show the background distribution of parameter values.
when the optimization is run many times. No architecture is optimal for more than 15% of test runs, demonstrating that
We anticipate several novel applications of this work to other problem areas where discovery or tuning of multiscale circuits has been previously infeasible. For instance, this method could be employed to fit temporal circuit dynamics to data or discern which of several discrete circuit mechanisms most closely matches observed behavior. As with other design strategies based on ODE models, a challenge of our approach is the significant domain knowledge required to construct models for a target pathway, both in terms of the enzyme kinetics and the downstream metabolic processes that affect pathway activity. Machine learning has already proved useful in a range of metabolic engineering tasks51 and is gaining substantial interest in other areas of synthetic biology.52,53 In this paper we have shown how such methods can also benefit dynamic pathway

Table 1. Summary of Pathway Models Studied in This Paper. The ODEs in the p-Aminostyrene Pathway Also Include mRNA and Folding Dynamics
product               parameters (pc)   architectures (pd)   metabolites   enzymes   ODEs
toy pathway           4                 4                    2             2         4
glucaric acid10,26    4                 4                    3             2         5
fatty acid33          2                 4                    1             2         3
p-aminostyrene34      16                27                   7             6         23
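The two decision-variable columns of Table 1 define a mixed search space: one categorical choice among pd architectures and pc continuous parameters. A minimal sketch of drawing one random candidate from such a space is given below. The sizes are copied from Table 1, while the function name `sample_design` and the unit bounds on the continuous parameters are placeholders, since the paper takes per-parameter bounds from the literature.

```python
import random

# (continuous parameters p_c, architectures p_d) per model, copied from Table 1
DESIGN_SPACES = {
    "toy pathway": (4, 4),
    "glucaric acid": (4, 4),
    "fatty acid": (2, 4),
    "p-aminostyrene": (16, 27),
}

def sample_design(model, rng, low=0.0, high=1.0):
    """Draw one candidate: a uniformly chosen architecture index plus uniform continuous values.

    The bounds low/high are placeholders, not the literature-derived bounds used in the paper.
    """
    n_continuous, n_architectures = DESIGN_SPACES[model]
    architecture = rng.randrange(n_architectures)
    parameters = [rng.uniform(low, high) for _ in range(n_continuous)]
    return architecture, parameters

rng = random.Random(1)
arch, params = sample_design("p-aminostyrene", rng)
print(arch, len(params))
```

Uniform draws of this kind only describe a random starting point; the subsequent proposals in the paper come from the Bayesian optimization routine rather than from random sampling.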

engineering by using optimization as a means to navigate the design space prior to system prototyping.

4. METHODS
4.1. Bayesian Optimization. We employed the Bayesian optimization routine implemented in the Python HyperOpt package.37 Bayesian optimization is commonly employed for hyperparameter tuning in deep neural networks. We employed Expected Improvement as an acquisition function and a tree-structured Parzen estimator (TPE) as a nonparametric statistical model for the loss landscape. We performed a grid search over the TPE hyperparameter γ which controls the balance between exploration and exploitation but found little impact on the algorithm performance; we thus used the default value of γ = 15 (Supplementary Figure S1).
Constraints on the continuous and discrete decision variables were incorporated directly into the HyperOpt search space. At each run of the Bayesian optimization routine, the initial guesses for the continuous decision variables were sampled from uniform distributions, with upper and lower bounds taken from the literature.10,34,44 Architectures were chosen uniformly from the set of architectures without positive feedback loops.
4.2. Model Pathways. We considered four exemplar pathways modeled via ordinary differential equations (ODEs): the toy system in Figure 1C, the glucaric acid pathway in Figure 2A, the fatty acid pathway in Figure 3A, and the p-aminostyrene pathway in Figure 4A. Table 1 contains a summary of the four considered models. In all cases, pathway models include ODEs for both metabolites and pathway enzymes. In each case, we define the various control architectures and incorporate them as discrete decision variables in the optimization problem, i.e., pd in eq 1; the continuous decision variables, i.e., pc in eq 1, appear in the expression rates of the pathway enzymes. For the toy model and the glucaric acid pathway, enzyme expression was parametrized using a lumped Hill equation model to describe the interaction between a regulatory metabolite and a transcription factor. For the fatty acid and p-aminostyrene pathways, expression rates were parametrized with bespoke nonlinear functions describing specific biochemical processes. The discrete control architectures were defined in two different ways. For the toy, glucaric acid, and p-aminostyrene models, the architectures were defined using a binary matrix to encode the mode of transcriptional control. For the fatty acid model we instead defined each architecture as a categorical choice and switched between model functions correspondingly. We note that the p-aminostyrene pathway also contains ODEs for mRNA abundance and folded/unfolded proteins. All models and their parameters are described in the Supporting Information.
The ODE models were solved with scikit-odes, a Python wrapper for the SUNDIALS suite of solvers.54 In all cases, the initial concentrations of heterologous pathway enzymes were assumed to be zero. Initial concentrations for native metabolites were determined by first solving a model without the heterologous enzymes up to steady state. Simulation times and initial conditions are detailed in the Supporting Information for each model.
4.3. Loss Function. In all cases the loss function J in eq 3 was instantiated for each pathway. Generally, the loss is defined as a linear combination of costs and benefits of pathway activity so as to balance opposing design goals commonly found in applications. Since both components of the loss function have different magnitudes, for each model we first swept the weights α1 and α2 across many model simulations, and chose values that led to similar values for both components; this prevents the optimizer from biasing the search towards low loss values caused by the scaling effects alone. For the fatty acid model in Figure 3 we also optimized the circuit rise time and percent overshoot defined in eq 5. Details on all objective functions can be found in the Supporting Information.

■ ASSOCIATED CONTENT
Data Availability Statement
The Python code for this paper is available on Zenodo at https://doi.org/10.5281/zenodo.7926205.
Supporting Information
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acssynbio.3c00120.
Details of model construction for toy, glucaric acid, fatty acid, and p-aminostyrene models, including parameter values and model equations; additional supporting figures related to hyperparameter tuning (PDF)

■ AUTHOR INFORMATION
Corresponding Author
Diego A. Oyarzún − School of Informatics, University of Edinburgh, Edinburgh EH8 9AB, U.K.; The Alan Turing Institute, London NW1 2DB, U.K.; School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JH, U.K.; orcid.org/0000-0002-0381-5278; Email: [email protected]
Authors
Charlotte Merzbacher − School of Informatics, University of Edinburgh, Edinburgh EH8 9AB, U.K.; orcid.org/0009-0000-2853-1864
Oisin Mac Aodha − School of Informatics, University of Edinburgh, Edinburgh EH8 9AB, U.K.; The Alan Turing Institute, London NW1 2DB, U.K.
Complete contact information is available at: https://pubs.acs.org/10.1021/acssynbio.3c00120

Author Contributions
CM built the optimization pipeline, ran simulations, and produced figures. CM and DAO analyzed results. DAO and OMA supervised the research.
Funding
CM and DAO were supported by the United Kingdom Research and Innovation (grant EP/S02431X/1, UKRI Centre for Doctoral Training in Biomedical AI).
Notes
The authors declare no competing financial interest.

■ REFERENCES
(1) Brophy, J. A. N.; Voigt, C. A. Principles of genetic circuit design. Nat. Methods 2014, 11, 508−520.
(2) Shaw, W. M.; Yamauchi, H.; Mead, J.; Gowers, G. O. F.; Bell, D. J.; Öling, D.; Larsson, N.; Wigglesworth, M.; Ladds, G.; Ellis, T. Engineering a Model Cell for Rational Tuning of GPCR Signaling. Cell 2019, 177, 782−796.
(3) Zhang, F.; Carothers, J. M.; Keasling, J. D. Design of a dynamic sensor-regulator system for production of chemicals and fuels derived from fatty acids. Nat. Biotechnol. 2012, 30, 354−9.
(4) Ma, W.; Trusina, A.; El-Samad, H.; Lim, W. A.; Tang, C. Defining network topologies that can achieve biochemical adaptation. Cell 2009, 138, 760−773.
(5) Li, Z.; Liu, S.; Yang, Q. Incoherent inputs enhance the robustness of biological oscillators. Cell Systems 2017, 5, 72−81.
(6) Qiao, L.; Zhao, W.; Tang, C.; Nie, Q.; Zhang, L. Network topologies that can achieve dual function of adaptation and noise attenuation. Cell Systems 2019, 9, 271−285.
(7) Dasika, M. S.; Maranas, C. D. OptCircuit: An optimization based method for computational design of genetic circuits. BMC Syst. Biol. 2008, 2, 1−19.
(8) Otero-Muras, I.; Banga, J. R. Automated Design Framework for Synthetic Biology Exploiting Pareto Optimality. ACS Synth. Biol. 2017, 6, 1180−1193.
(9) Hiscock, T. W. Adapting machine-learning algorithms to design gene circuits. BMC Bioinformatics 2019, 20, 1−13.
(10) Verma, B. K.; Mannan, A. A.; Zhang, F.; Oyarzún, D. A. Trade-offs in biosensor optimization for dynamic pathway engineering. ACS Synth. Biol. 2022, 11, 228−240.
(11) Banga, J. Optimization in computational systems biology. BMC Syst. Biol. 2008, 2, 47.
(12) Hairer, E.; Wanner, G. Solving Ordinary Differential Equations II: Stiff and Differential-Algebraic Problems; Springer-Verlag, 1996.
(13) Blanchini, F.; Franco, E.; Giordano, G. A structural classification of candidate oscillatory and multistationary biochemical systems. Bulletin of Mathematical Biology 2014, 76, 2542−69.
(14) Briat, C.; Gupta, A.; Khammash, M. Antithetic proportional-integral feedback for reduced variance and improved control performance of stochastic reaction networks. J. R. Soc., Interface 2018, 15, 20180079.
(15) Briat, C.; Gupta, A.; Khammash, M. Antithetic integral feedback ensures robust perfect adaptation in noisy biomolecular networks. Cell Systems 2016, 2, 15−26.
(16) Bhattacharya, P.; Raman, K.; Tangirala, A. K. Discovering adaptation-capable biological network structures using control-theoretic approaches. PLOS Computational Biology 2022, 18, No. e1009769.
(17) Araujo, R. P.; Liotta, L. A. The topological requirements for robust perfect adaptation in networks of any size. Nat. Commun. 2018, 9, 1757.
(18) Drengstig, T.; Ueda, H. R.; Ruoff, P. Predicting perfect adaptation motifs in reaction kinetic networks. J. Phys. Chem. B 2008, 112, 16752−16758.
(19) Woods, M. L.; Leon, M.; Perez-Carrasco, R.; Barnes, C. P. A statistical approach reveals designs for the most robust stochastic gene oscillators. ACS Synth. Biol. 2016, 5, 459−470.
(20) Gonzalez, J.; Longworth, J.; James, D. C.; Lawrence, N. D. Bayesian optimization for synthetic gene design. arXiv, May 7, 2015. DOI: 10.48550/arXiv.1505.01627.
(21) Shen, J.; Liu, F.; Tu, Y.; Tang, C. Finding gene network topologies for given biological function with recurrent neural network. Nat. Commun. 2021, 12, 1−10.
(22) Zhang, F.; Carothers, J. M.; Keasling, J. D. Design of a dynamic sensor-regulator system for production of chemicals and fuels derived from fatty acids. Nat. Biotechnol. 2012, 30, 354−359.
(23) Oyarzún, D. A.; Stan, G.-B. V. Synthetic gene circuits for metabolic control: design trade-offs and constraints. J. R. Soc., Interface 2013, 10, 20120671.
(24) Xu, P.; Li, L.; Zhang, F.; Stephanopoulos, G.; Koffas, M. Improving fatty acids production by engineering dynamic pathway regulation and metabolic control. Proc. Natl. Acad. Sci. U. S. A. 2014, 111, 11299−11304.
(25) Dunlop, M. J.; Keasling, J. D.; Mukhopadhyay, A. A model for improving microbial biofuel production using a synthetic feedback loop. Systems and Synthetic Biology 2010, 4, 95−104.
(26) Doong, S. J.; Gupta, A.; Prather, K. L. Layered dynamic regulation for improving metabolic pathway productivity in Escherichia coli. Proc. Natl. Acad. Sci. U. S. A. 2018, 115, 2964−2969.
(27) Ni, C.; Dinh, C. V.; Prather, K. L. Dynamic Control of Metabolism. Annu. Rev. Chem. Biomol. Eng. 2021, 12, 519.
(28) Hartline, C. J.; Schmitz, A. C.; Han, Y.; Zhang, F. Dynamic control in metabolic engineering: Theories, tools, and applications. Metabolic Engineering 2021, 63, 126−140.
(29) Tonn, M. K.; Thomas, P.; Barahona, M.; Oyarzún, D. A. Stochastic modelling reveals mechanisms of metabolic heterogeneity. Commun. Biol. 2019, 2, 108.
(30) Mannan, A. A.; Liu, D.; Zhang, F.; Oyarzún, D. A. Fundamental Design Principles for Transcription-Factor-Based Metabolite Biosensors. ACS Synth. Biol. 2017, 6, 1851−1859.
(31) Zhou, L.-B.; Zeng, A.-P. Exploring Lysine Riboswitch for Metabolic Flux Control and Improvement of L-Lysine Synthesis in Corynebacterium glutamicum. ACS Synth. Biol. 2015, 4, 729−734.
(32) Chaves, M.; Oyarzún, D. A. Dynamics of complex feedback architectures in metabolic pathways. Automatica 2019, 99, 323−332.
(33) Liu, D.; Zhang, F. Metabolic feedback circuits provide rapid control of metabolite dynamics. ACS Synth. Biol. 2018, 7, 347−356.
(34) Stevens, J. T.; Carothers, J. M. Designing RNA-based genetic control systems for efficient production from engineered metabolic pathways. ACS Synth. Biol. 2015, 4, 107−115.
(35) Frazier, P. I. A tutorial on Bayesian optimization. arXiv, July 8, 2018. DOI: 10.48550/arXiv.1807.02811.
(36) Moon, T. S.; Yoon, S.-H.; Lanza, A. M.; Roy-Mayhew, J. D.; Prather, K. L. J. Production of glucaric acid from a synthetic pathway in recombinant Escherichia coli. Applied and Environmental Microbiology 2009, 75, 589−595.
(37) Bergstra, J.; Yamins, D.; Cox, D. Making a science of model search: Hyperparameter optimization in hundreds of dimensions for vision architectures. In International Conference on Machine Learning; ICML, 2013; pp 115−123.
(38) Snoek, J.; Larochelle, H.; Adams, R. P. Practical bayesian optimization of machine learning algorithms. In Advances in Neural Information Processing Systems; NeurIPS, 2012; Vol. 25.
(39) Bar-Even, A.; Noor, E.; Savir, Y.; Liebermeister, W.; Davidi, D.; Tawfik, D. S.; Milo, R. The moderately efficient enzyme: evolutionary and physicochemical trends shaping enzyme parameters. Biochemistry 2011, 50, 4402−4410.
(40) Venayak, N.; Anesiadis, N.; Cluett, W. R.; Mahadevan, R. Engineering metabolism through dynamic control. Curr. Opin. Biotechnol. 2015, 34, 142−152.
(41) Hartline, C.; Mannan, A.; Liu, D.; Zhang, F.; Oyarzún, D. Metabolite sequestration enables rapid recovery from fatty acid depletion in Escherichia coli. mBio 2020, 11, 590943.
(42) Zhu, Y.; Li, Y.; Xu, Y.; Zhang, J.; Ma, L.; Qi, Q.; Wang, Q. Development of bifunctional biosensors for sensing and dynamic


control of glycolysis flux in metabolic engineering. Metabolic
Engineering 2021, 68, 142−151.
(43) Schomburg, I.; Jeske, L.; Ulbrich, M.; Placzek, S.; Chang, A.;
Schomburg, D. The BRENDA enzyme information system−From a
database to an expert system. Journal of biotechnology 2017, 261, 194−
206.
(44) Zhang, Y.; Nielsen, J.; Liu, Z. Metabolic engineering of
Saccharomyces cerevisiae for production of fatty acid−derived
hydrocarbons. Biotechnology and Bioengineering 2018, 115, 2139−
2147.
(45) Goikhman, M. Y.; Yevlampieva, N.; Kamanina, N.; Podeshvo,
I.; Gofman, I.; Mil’tsov, S.; Khurchak, A.; Yakimanskii, A. New
polyamides with main-chain cyanine chromophores. Polymer Science
Series A 2011, 53, 457−468.
(46) Ellington, A. D.; Szostak, J. W. In vitro selection of RNA
molecules that bind specific ligands. Nature 1990, 346, 818−822.
(47) Oyarzún, D. A.; Chaves, M. Design of a bistable switch to
control cellular uptake. Journal of The Royal Society Interface 2015, 12,
20150618.
(48) Virtanen, P.; Gommers, R.; Oliphant, T. E.; Haberland, M.;
Reddy, T.; Cournapeau, D.; Burovski, E.; Peterson, P.; Weckesser, W.;
Bright, J.; et al. SciPy 1.0: fundamental algorithms for scientific
computing in Python. Nat. Methods 2020, 17, 261−272.
(49) Abdi, H.; Williams, L. J. Principal component analysis. Wiley
Interdisciplinary Reviews: Computational Statistics 2010, 2, 433−459.
(50) Liu, D.; Mannan, A. A.; Han, Y.; Oyarzún, D. A.; Zhang, F.
Dynamic metabolic control: towards precision engineering of
metabolism. Journal of Industrial Microbiology and Biotechnology
2018, 45, 535−543.
(51) Radivojević, T.; Costello, Z.; Workman, K.; García Martín, H.
A machine learning Automated Recommendation Tool for synthetic
biology. Nat. Commun. 2020, 11, 1−14.
(52) Carbonell, P.; Radivojevic, T.; García Martín, H. Opportunities
at the Intersection of Synthetic Biology, Machine Learning, and
Automation. ACS Synth. Biol. 2019, 8, 1474−1477.
(53) Nikolados, E.-M.; Wongprommoon, A.; Aodha, O. M.;
Cambray, G.; Oyarzún, D. A. Accuracy and data efficiency in deep
learning models of protein expression. Nat. Commun. 2022, 13, 7755.
(54) Gardner, D. J.; Reynolds, D. R.; Woodward, C. S.; Balos, C. J.
Enabling new flexibility in the SUNDIALS suite of nonlinear and
differential/algebraic equation solvers. ACM Trans. Math. Softw. 2022,
48, 1.
(55) Solgi, R. M. Geneticalgorithm Package; 2020. https://pypi.org/project/geneticalgorithm/.
(56) Loh, W.-L. On Latin hypercube sampling. Ann. Statist. 1996,
24, 2058−2080.
