Robust Bayesian Estimation of The Kinetics of The Polymorphic Crystallization of L-Glutamic Acid Crystals

Robust Bayesian Estimation of Kinetics for the Polymorphic Transformation of L-Glutamic Acid Crystals
Martin Wijaya Hermanto, Nicholas C. Kee, Reginald B. H. Tan, and Min-Sen Chiu
Dept. of Chemical and Biomolecular Engineering, National University of Singapore, Singapore, Singapore 117576
Richard D. Braatz
Dept. of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801
DOI 10.1002/aic.11623
Published online November 4, 2008 in Wiley InterScience (www.interscience.wiley.com).
Polymorphism, in which different crystal forms exist for the same chemical compound, is an important phenomenon in pharmaceutical manufacturing. In this article, a kinetic model for the crystallization of L-glutamic acid polymorphs is developed from experimental data. This model appears to be the first to include all of the transformation kinetic parameters, including their dependence on temperature. The kinetic parameters are estimated by Bayesian inference from batch data collected from two in situ measurements: ATR-FTIR spectroscopy, which is used to infer the solute concentration, and FBRM, which provides crystal size information. Probability distributions of the estimated parameters, in addition to their point estimates, are obtained by Markov Chain Monte Carlo simulation. The kinetic model can be used to better understand the effects of operating conditions on crystal quality, and the probability distributions can be used to assess the accuracy of model predictions and can be incorporated into robust control strategies for polymorphic crystallization. © 2008 American Institute of Chemical Engineers
AIChE J, 54: 3248–3259, 2008
Keywords: pharmaceutical crystallization modeling, polymorphism, Bayesian inference,
Markov Chain Monte Carlo
Additional Supporting Information may be found in the online version of this article.
N. C. Kee is also affiliated with Dept. of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801.
R. B. H. Tan is also affiliated with Institute of Chemical and Engineering Sciences Singapore, Singapore 627833.
Correspondence concerning this article should be addressed to M.-S. Chiu at [email protected]
3248 December 2008 Vol. 54, No. 12 AIChE Journal

Introduction

Polymorphism, in which multiple crystal forms exist for the same chemical compound, is of significant interest to industry.1–5 The variation in physical properties such as crystal shape, solubility, hardness, color, melting point, and chemical reactivity makes polymorphism an important issue for the food, specialty chemical, and pharmaceutical industries, where products are specified not only by chemical composition but also by their performance.2 Controlling polymorphism to ensure consistent production of the desired polymorph is important in those industries, including in drug manufacturing where safety is paramount. With the ultimate goal being to better understand the effects of process conditions on crystal quality and to control the formation of the desired polymorph, this article considers the estimation
of kinetic parameters in a model for polymorphic crystallization. Such a process model can accelerate the determination of optimal operating conditions and speed process development, when compared with time-consuming and expensive trial-and-error methods for determining the operating conditions.

In this article, a kinetic model of L-glutamic acid polymorphic crystallization is developed from batch experiments with in situ measurements including ATR-FTIR spectroscopy to infer the solute concentration and FBRM to provide crystal size information. Kinetics of polymorphic transformation have been estimated by various procedures.6–9 A commonly used method to estimate model parameters in nonlinear process models is weighted least squares,10–12 which has been applied to polymorphic crystallization.13–16 Although weighted least squares methods are adequate for many problems, Bayesian inference is able to include prior knowledge in the statistical analysis, which can produce models with higher predictive capability. Although Bayesian inference is not within the standard toolkit for chemical engineers, there have been many applications to chemical engineering problems over the years, including the estimation of parameters in chemical reaction,17 heat transfer in packed beds,18 microbial systems,19–21 and microelectronics processes.22

Quantifying uncertainties in the parameter estimates is required for assessing the accuracy of model predictions.23,24 When weighted least squares methods are used for parameter estimation, the widely used approaches to quantify uncertainties in parameter estimates are the linearized statistics and likelihood ratio approaches.25 In the linearized statistics approach, the model is linearized around the optimal parameter estimates and the parameter uncertainty is represented by a χ2 distribution. This model linearization can result in highly inaccurate uncertainty estimates for highly nonlinear models,25 and this approach ignores physical constraints on the model parameters. The likelihood ratio approach, which is the nonlinear analogue to the well-known F statistic, takes nonlinearity into account but approximates the distribution25 and ignores constraints on the model parameters. This article applies a Bayesian inference approach that not only avoids making these approximations but also includes prior information during the estimation of parameter uncertainties.

In this article, the parameters in a kinetic model for the L-glutamic acid polymorphic crystallization process are determined by Bayesian estimation. The probability distribution over process model parameters is defined through the Bayesian posterior density, from which all parameter estimates of interest (e.g., means, modes, and credible intervals) are calculated. However, the conventional approach to calculate these estimates often involves complicated integrals of the Bayesian posterior density, which are analytically intractable. To overcome this drawback, Markov Chain Monte Carlo (MCMC) integration26–28 was applied to compute these integrals in an efficient manner. MCMC does not require approximation of the posterior distribution by a Gaussian distribution.20,29,30 This posterior distribution for the estimated parameters can be used to quantify the accuracy of model predictions and can be incorporated into robust control strategies for crystallization processes.24

This article is organized as follows. The next section describes the experimental procedure to obtain measurement data for parameter estimation. A short review of Bayesian theory and MCMC integration is given next. This is followed by the description of the L-glutamic acid crystallization model and the results of the parameter estimation. Finally, conclusions are given.

Experimental Methods

The crystallization instrument setup used was similar to that described previously.31 A Dipper-210 ATR immersion probe (Axiom Analytical) with ZnSe as the internal reflectance element attached to a Nicolet Protege 460 FTIR spectrophotometer was used to obtain L-glutamic acid spectra in aqueous solution, with a spectral resolution of 4 cm⁻¹. The chord length distribution (CLD) for L-glutamic acid crystals in solution was measured using Lasentec FBRM connected to a Pentium III running version 6.0b12 of the FBRM Control Interface software.

Calibration for solution concentration

Different solution concentrations of L-glutamic acid (99%, Sigma Aldrich) and degassed deionized water were placed in a 500-mL jacketed round-bottom flask and heated until complete dissolution. The solution was then cooled at 0.5°C/min while the IR spectra were being collected, with continuous stirring in the flask using an overhead mixer at 250 rpm. Table 1 lists the five different solution concentrations used to build the calibration model.

Table 1. Glutamic Acid Aqueous Solutions Used for Calibration
Concentration (g/g of water)    Temperature Range (°C)
0.00837                         35–21
0.01301                         48–13
0.01800                         57–32
0.02300                         64–34
0.02800                         64–45

The IR spectra of aqueous L-glutamic acid in the range 1100–1450 cm⁻¹ and the temperature were used to construct the calibration model based on various chemometrics methods such as principal component regression (PCR) and partial least squares regression (PLS).32 The calculations were carried out using in-house MATLAB 5.3 (The Mathworks) code except for PLS, which was from the PLS Toolbox 2.0. The mean width of the prediction interval was used as the criterion to select the most accurate calibration model. The noise level was selected based on the compatibility of the prediction intervals with the accuracy of the solubility data. The chemometrics method forward selection PCR 2 (FPCR 2)33 was selected because it gave the smallest prediction interval; using a noise level of 0.001, the prediction interval (0.73 g/kg) was compatible within the accuracy of this model with respect to solubility data reported in the literature.13

Solubility determination and feedback concentration control experiments

The commercially available L-glutamic acid crystals were verified to be pure β-form using powder X-ray diffraction (XRD) and were used for the determination of the β-form solubility curve. Pure α-form crystals obtained using a rapid
cooling method outlined previously13 were used to determine the α-form solubility curve in similar fashion as the β-form in a separate experiment. For each polymorph, the IR spectra of L-glutamic acid slurries (saturated, and with excess crystals) were collected at different temperatures ranging from 25 to 60°C. The slurry was equilibrated for 45 min to 1 h at a specified temperature before recording the IR spectra. The solution concentration was then calculated using the aforementioned calibration model. The resulting solubility measurements for L-glutamic acid polymorphs are tabulated in Table 2, and Figure 1 compares the measurements to their quadratic polynomial fitting.

Table 2. Solubility Data for L-Glutamic Acid Polymorphs
Temperature (°C)    Solubility of α-Form (g/kg)    Solubility of β-Form (g/kg)
25                  10.5971                        8.5434
30                  13.1599                        9.7362
35                  15.8004                        12.4257
40                  19.1689                        13.7163
45                  23.1385                        17.0729
50                  27.0364                        19.8722
55                  31.7768                        23.3904
60                  36.8028                        27.7567

Figure 1. Solubility curves of L-glutamic acid polymorphs.

In the seeded batch crystallization experiments, appropriate amounts of L-glutamic acid (99%, Sigma Aldrich) in 400 g of water were heated to about 5°C above the β-form saturation temperature in a 500-mL jacketed round-bottom flask with an overhead mixer at 250 rpm to create an undersaturated solution. The crystallizer was then cooled and seed crystals (either pure α- or β-form) were added when the solution was supersaturated with respect to the seeded form. Different supersaturation setpoint profiles were followed during crystallization based on in situ solution concentration measurement as described previously.31 The control algorithm was started shortly after seeding.

Review of Bayesian Inference

Bayesian posterior

Bayesian inference is the process of fitting a probability model to a set of data and summarizing the results by a probability distribution on the parameters of the model and on unobserved quantities such as predictions for new observations.28 The fundamental difference between Bayesian and traditional statistical methods is the interpretation of probability. Classical methods, also known as the frequentist methods, perceive probability as the long-run relative frequency of occurrence determined by the repetition of an event. A Bayesian method perceives probability as a quantitative description of the degree of belief in a given proposition.20,34 With this interpretation of probability, the Bayesian method allows a practitioner to account for prior information in a statistical analysis.

Furthermore, Bayesian inference facilitates a common-sense interpretation of statistical conclusions. For instance, a Bayesian credible interval for an unknown quantity of interest can be directly regarded as having a high probability of containing the unknown quantity, in contrast to a frequentist confidence interval, which may strictly be interpreted only in relation to a sequence of similar inferences that might be made in repeated practice. A brief introduction to Bayesian inference is given below. Interested readers are referred to Refs. 28, 34 and 35 for a thorough discussion.

The main substance of Bayesian inference is Bayes rule:

$$\Pr(\theta \mid y) = \frac{\Pr(y \mid \theta)\,\Pr(\theta)}{\Pr(y)}, \tag{1}$$

where $\theta$ is a vector of unknown parameters of interest and $y$ represents the collected data which is used to infer $\theta$. These data usually consist of observed state variables (e.g., concentration) at different time points. $\Pr(\theta)$ is the prior distribution of $\theta$, and $\Pr(y \mid \theta)$ is referred to as the sampling distribution (or data distribution) for fixed parameters $\theta$. When the data $y$ are known and the parameters $\theta$ are unknown (i.e., as in parameter estimation), the term $\Pr(y \mid \theta)$ is referred to as the likelihood function and denoted as $L(\theta \mid y)$. $\Pr(\theta \mid y)$ is referred to as the Bayesian posterior distribution of $\theta$, and $\Pr(y) = \int \Pr(y \mid \theta)\,\Pr(\theta)\,d\theta$ acts as a normalizing constant to ensure that the Bayesian posterior integrates to unity. This constant is also called the marginal likelihood (or evidence). For the inference of $\theta$, this constant can be omitted since it does not affect the resulting posterior distribution of $\theta$, which yields the unnormalized posterior distribution:

$$\Pr(\theta \mid y) \propto L(\theta \mid y)\,\Pr(\theta). \tag{2}$$

In this article, it is assumed that the model structure is correct, and the measurement noise is distributed normally with zero mean and unknown variance. Then, the likelihood is of the form

$$
\begin{aligned}
L(\theta \mid y) = L(\theta_{sys}, \sigma \mid y)
&= \prod_{j=1}^{N_m} \prod_{k=1}^{N_{d_j}} \Pr\left(y_{jk} \mid \theta_{sys}, \sigma\right) \\
&= \prod_{j=1}^{N_m} \prod_{k=1}^{N_{d_j}} \frac{1}{\sqrt{2\pi}\,\sigma_j} \exp\left(-\frac{\left(y_{jk} - \hat{y}_{jk}(\theta_{sys})\right)^2}{2\sigma_j^2}\right) \\
&= \frac{1}{\prod_{j=1}^{N_m} \left(\sqrt{2\pi}\,\sigma_j\right)^{N_{d_j}}} \exp\left(-\sum_{j=1}^{N_m} \sum_{k=1}^{N_{d_j}} \frac{\left(y_{jk} - \hat{y}_{jk}(\theta_{sys})\right)^2}{2\sigma_j^2}\right),
\end{aligned} \tag{3}
$$
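As an illustration of Eqs. 2 and 3, the following minimal Python sketch evaluates the logarithm of the Gaussian likelihood and of the unnormalized posterior. Working in log space is a choice made here to avoid numerical underflow when many densities are multiplied; it is not taken from the article. The function names, the toy one-variable `model`, and the flat box prior with bounds `lower` and `upper` are assumptions introduced only for this example.

```python
import math


def log_likelihood(y, y_hat, sigma):
    """Log of Eq. 3 for independent zero-mean Gaussian measurement noise.

    y, y_hat: one list of samples per measured variable j.
    sigma:    one noise standard deviation per measured variable j.
    """
    total = 0.0
    for y_j, y_hat_j, s_j in zip(y, y_hat, sigma):
        n_dj = len(y_j)  # number of time samples of variable j
        sse = sum((a - b) ** 2 for a, b in zip(y_j, y_hat_j))
        total += -n_dj * math.log(math.sqrt(2.0 * math.pi) * s_j)
        total += -sse / (2.0 * s_j ** 2)
    return total


def flat_log_prior(theta, lower, upper):
    """Assumed prior for this sketch: constant inside a box, zero outside."""
    inside = all(lo <= t <= hi for t, lo, hi in zip(theta, lower, upper))
    return 0.0 if inside else -math.inf


def log_posterior(theta_sys, sigma, y, model, lower, upper):
    """Log of Eq. 2 up to an additive constant: log-likelihood plus log-prior.

    The lower bounds on sigma must be positive so that log(sigma) is defined.
    """
    theta = list(theta_sys) + list(sigma)
    lp = flat_log_prior(theta, lower, upper)
    if lp == -math.inf:
        return lp
    return lp + log_likelihood(y, model(theta_sys), sigma)


# Hypothetical example: one measured variable, y_hat = theta * t.
times = [0.0, 1.0, 2.0]
model = lambda th: [[th[0] * t for t in times]]
y = [[0.0, 2.0, 4.0]]
value = log_posterior([2.0], [1.0], y, model, lower=[0.0, 1e-6], upper=[10.0, 5.0])
```

With a perfect fit and unit noise, `value` reduces to the normalizing term of Eq. 3 alone; a parameter outside the assumed bounds gives a log-posterior of negative infinity, so such a proposal would never be accepted by a sampler.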
In Eq. 3, $\theta = [\theta_{sys}, \sigma]^T$ is the vector of parameters of interest, which consists of the system/model ($\theta_{sys}$) and noise ($\sigma$) parameters, $y_{jk}$ and $\hat{y}_{jk}$ are the measurement and predicted value of the jth variable at sampling instance k, respectively, $N_m$ is the number of measured variables, $N_{d_j}$ is the number of time samples of the jth variable, and $\sigma_j$ is the standard deviation of the measurement noise in the jth variable.

The prior distribution $\Pr(\theta)$ can be informative or noninformative depending on the prior knowledge of $\theta$. The most commonly used noninformative prior is $\Pr(\theta) \propto 1$. However, this is an improper prior distribution, since its integral is infinity, and may lead to an improper posterior distribution. The use of an informative prior distribution is preferred. For example, a prior distribution which specifies the minimum and maximum possible values of $\theta$ is

$$\Pr(\theta) \propto \begin{cases} 1 & \text{if } \theta_{min} \le \theta \le \theta_{max} \\ 0 & \text{otherwise,} \end{cases} \tag{4}$$

which means that all values of $\theta$ between $\theta_{min}$ and $\theta_{max}$ have equal probability. In cases where the prior distribution is available from past parameter estimation studies, the distribution is not uniform.22 A detailed discussion regarding informative and noninformative priors can be found in the literature.28,35,36

The product of the likelihood and prior distribution defines the Bayesian posterior, which is the joint probability distribution for all parameters after data have been observed. Once the Bayesian posterior is defined, it is desirable to determine the mean, mode, and credible intervals associated with each of the parameters. Markov chain simulation, also called MCMC, is employed for that purpose in this article.

Markov chain simulation

Markov chain simulation draws values of $\theta$ from approximate distributions and then corrects these values to better approximate the target distribution. In this case, the target distribution is the Bayesian posterior. The samples are drawn sequentially, with the distribution of the sampled values depending on the last value drawn. The Markov chain is a sequence of random variables $\theta^0, \theta^1, \ldots$, for which, for any s, the distribution of $\theta^{s+1}$ given all previous $\theta$s depends only on the most recent value, $\theta^s$. The key to the method's success, however, is not the Markov property but rather that the approximate distributions are improved at each step in the simulation, in the sense of converging to the target distribution.*

* For further information on Markov chains, readers are referred to other literature.27,28

In the application of Markov chain simulation, several parallel chains are drawn. Parameters from each chain c, $\theta^{c,s}$, s = 1, 2, 3, ..., are produced by starting at some point $\theta^{c,0}$ and then, for each step s, drawing $\theta^{c,s+1}$ from a jumping distribution, $T_s(\theta^{c,s+1} \mid \theta^{c,s})$, that depends on the previous draw, $\theta^{c,s}$. The jumping probability distributions must be constructed so that the Markov chain converges to the target posterior distribution.

The Metropolis algorithm37 is a simple algorithm to construct a Markov chain that converges to the posterior distribution. The algorithm is an adaptation of a random walk that uses an acceptance/rejection rule to converge to the specified target distribution. In the Metropolis algorithm, the widely used approach to create the next step of the chain c, $\theta^{c,sp}$, is to perturb the current step of the chain $\theta^{c,s}$ by adding some amount of noise ($\theta^{c,sp} = \theta^{c,s} + \varepsilon$), where $\varepsilon$ is distributed normally with zero mean and covariance matrix $\Sigma$. However, specifying the covariance matrix can be challenging. This covariance matrix needs to be chosen so as to balance progress in each step against a reasonable acceptance rate. A poorly chosen covariance matrix may cause slow convergence. Traditionally, the covariance matrix is estimated from a trial run, and much recent research is devoted to ways of doing that efficiently and/or adaptively.38 If parameters $\theta$ are highly correlated, special precautions must be taken to avoid singularity of the estimated covariance matrix.

Recently, there has been a development in combining evolutionary algorithms with MCMC.39–42 Among others, the combination of differential evolution (DE) with MCMC is particularly interesting. DE is an evolutionary algorithm for numerical optimization; its combination with MCMC (shortened as DE-MC42) solves an important problem in MCMC, namely that of choosing an appropriate scale and orientation for the jumping distribution (i.e., related to the covariance matrix $\Sigma$ in the Metropolis algorithm). In DE-MC, the jumps are simply a fixed multiple of the difference of two random parameter vectors that are currently in the population, and the selection process of DE-MC works via the usual Metropolis ratio that defines the probability with which a proposal is accepted. Motivated by its efficiency and effectiveness, DE-MC is utilized to construct the Markov chains of $\theta$ in this article.

Constructing the Markov chains is one step. The next is to monitor the convergence of the chains to decide how many samples need to be collected or when to stop the MCMC simulation. Too few samples will result in an inaccurate distribution of the parameters $\theta$. Here, potential scale reduction factors $\hat{R}_i$ were adopted to monitor the convergence of the Markov chains,28 which estimate the potential improvement in the Markov chain estimation of the respective ith parameter $\theta_i$ if the Markov chain simulation were continued. This potential scale reduction factor is calculated from the following equations:

$$\hat{R}_i = \sqrt{\frac{\widehat{\mathrm{var}}(\theta_i \mid y)}{W_i}}, \tag{5}$$

$$\widehat{\mathrm{var}}(\theta_i \mid y) = \frac{n-1}{n} W_i + \frac{1}{n} B_i, \tag{6}$$

$$B_i = \frac{n}{m-1} \sum_{s=1}^{m} \left(\bar{\theta}_i^{s} - \bar{\theta}_i\right)^2, \tag{7}$$

$$W_i = \frac{1}{m} \sum_{s=1}^{m} \left(d_i^{s}\right)^2, \tag{8}$$

$$\bar{\theta}_i^{s} = \frac{1}{n} \sum_{c=1}^{n} \theta_i^{c,s}, \tag{9}$$

$$\bar{\theta}_i = \frac{1}{m} \sum_{s=1}^{m} \bar{\theta}_i^{s}, \tag{10}$$

$$\left(d_i^{s}\right)^2 = \frac{1}{n-1} \sum_{c=1}^{n} \left(\theta_i^{c,s} - \bar{\theta}_i^{s}\right)^2, \tag{11}$$
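The DE-MC jump, the Metropolis acceptance rule, and the convergence monitor described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the article: `psrf` follows the usual layout of Gelman et al. with m chains of n draws each; `demc_step` picks the two difference vectors from chains other than the one being updated; the default scale `gamma = 2.38 / sqrt(2 * N)` and the small noise variance `b` are values taken from the DE-MC literature; and the standard normal target, chain count, and step count are hypothetical choices for the demonstration.

```python
import math
import random


def psrf(chains):
    """Potential scale reduction factor for one scalar parameter.

    chains: list of m chains, each a list of n draws (assumed layout).
    """
    m, n = len(chains), len(chains[0])
    means = [sum(c) / n for c in chains]
    grand = sum(means) / m
    # Between-sequence variance B and within-sequence variance W.
    between = n / (m - 1) * sum((mu - grand) ** 2 for mu in means)
    within = sum(
        sum((x - mu) ** 2 for x in c) / (n - 1) for c, mu in zip(chains, means)
    ) / m
    var_hat = (n - 1) / n * within + between / n
    return math.sqrt(var_hat / within)


def demc_step(pop, log_post, rng, gamma=None, b=1e-4):
    """One DE-MC update of every chain; pop is a list of parameter vectors."""
    m, n_theta = len(pop), len(pop[0])
    if gamma is None:
        gamma = 2.38 / math.sqrt(2 * n_theta)
    new_pop = []
    for c, theta in enumerate(pop):
        # Two distinct chains other than c, taken from the current step.
        r1, r2 = rng.sample([j for j in range(m) if j != c], 2)
        proposal = [
            theta[i] + gamma * (pop[r1][i] - pop[r2][i]) + rng.gauss(0.0, math.sqrt(b))
            for i in range(n_theta)
        ]
        # Metropolis rule in log form: accept with probability min(r, 1).
        log_r = log_post(proposal) - log_post(theta)
        accept = log_r >= 0 or rng.random() < math.exp(log_r)
        new_pop.append(proposal if accept else list(theta))
    return new_pop


# Hypothetical demonstration on a standard normal target with six chains.
rng = random.Random(1)
log_post = lambda th: -0.5 * th[0] ** 2
pop = [[x] for x in (-5.0, -3.0, -1.0, 1.0, 3.0, 5.0)]
history = [[] for _ in pop]
for _ in range(2000):
    pop = demc_step(pop, log_post, rng)
    for h, th in zip(history, pop):
        h.append(th[0])
second_halves = [h[1000:] for h in history]  # discard the first half of each chain
r_hat = psrf(second_halves)
```

Comparing log-posterior values rather than the ratio of densities avoids overflow for very unlikely proposals. In a real estimation problem `log_post` would be the log of the unnormalized Bayesian posterior and `psrf` would be evaluated for every parameter.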
In Eqs. 5–11, $\theta_i^{c,s}$ is the simulation draw of parameter i from chain c at step s, $B_i$ and $W_i$ are the between- and within-sequence variances of parameter i, respectively, and m is the number of parallel chains, with each chain of length n. The potential scale reduction factor decreases asymptotically to 1 as $n \to \infty$. Once $\hat{R}_i$ is near 1 for all i, it is safe to stop the simulation.

To summarize, the following is the procedure for constructing Markov chains using DE-MC with the potential scale reduction factor as the stopping criterion:

(1) Draw starting parameters for all chains, $\theta^{c,0}$ (c = 1, ..., m), from a starting distribution or choose starting parameters from dispersed values around a crude approximation of the parameters.

(2) At each step, create a proposed value $\theta^{c,sp}$ according to the jumping rule

$$\theta^{c,sp} = \theta^{c,s} + \gamma\left(\theta^{R_1,s} - \theta^{R_2,s}\right) + \varepsilon, \tag{12}$$

where $\varepsilon$ is drawn from a symmetric distribution with a small variance compared to that of the target, but with unbounded support (e.g., $\varepsilon \sim N(0, b)^{N_\theta}$ with b small; $b = 10^{-4}$ is utilized in this article), $N_\theta$ is the number of parameters in $\theta$, $\theta^{R_1,s}$ and $\theta^{R_2,s}$ are randomly selected without replacement from all chains at step s, and $\gamma$ is a scaling constant with typical values between 0.4 and 1. From the guidelines in the literature,42 the optimal choice of $\gamma$ is $2.38/\sqrt{2N_\theta}$. This choice of $\gamma$ is expected to give an acceptance probability of 0.44 for $N_\theta = 1$, 0.28 for $N_\theta = 5$, and 0.23 for large $N_\theta$.

(3) Calculate the ratio of the posterior densities,

$$r = \frac{\Pr(\theta^{c,sp} \mid y)}{\Pr(\theta^{c,s} \mid y)}, \tag{13}$$

and obtain $\theta^{c,s+1}$ from

$$\theta^{c,s+1} = \begin{cases} \theta^{c,sp} & \text{with probability } \min\{r, 1\} \\ \theta^{c,s} & \text{otherwise.} \end{cases} \tag{14}$$

(4) For each parameter i, calculate the potential scale reduction factor $\hat{R}_i$ by (5) to (11). If $\hat{R}_i \le 1.1$† for all i = 1, 2, ..., $N_\theta$, stop the iteration and construct the matrix

$$\Theta = \begin{bmatrix} \theta_1^{1} & \cdots & \theta_{N_\theta}^{1} \\ \vdots & \ddots & \vdots \\ \theta_1^{N_s} & \cdots & \theta_{N_\theta}^{N_s} \end{bmatrix}, \tag{15}$$

where $\Theta$ contains the approximated samples from the target distribution and $N_s$ is the total number of values drawn from the second halves of all the chains. Otherwise, if $\hat{R}_i > 1.1$ for any i, set s = s + 1 and go to Step 2.

† According to Gelman et al,28 a value below 1.1 is acceptable.

Monte Carlo integration

In the previous sections, the Bayes posterior was defined and a method for drawing samples from it was described, from which a matrix $\Theta$ was generated. Here, the significance of this matrix is described through its use for calculating the desired properties of the Bayes posterior.

To calculate any properties of the Bayes posterior, it is necessary to evaluate the integral

$$E[f(\theta)] = \int_{\theta_{min}}^{\theta_{max}} f(\theta)\,\Pr(\theta \mid y)\,d\theta, \tag{16}$$

where $E[\cdot]$ is the expected value and $f(\theta)$ is a function for which the expected value is to be estimated. Conventionally, this integration can be performed analytically if the resulting function inside the integral operator is simple. However, Bayesian posteriors most often have irregular forms such that analytical integration becomes infeasible. In such situations, it is suitable to perform Monte Carlo integration,26–28 which utilizes the matrix $\Theta$ obtained in the previous section:

$$E[f(\theta)] = \lim_{N_s \to \infty} \frac{1}{N_s} \sum_{l=1}^{N_s} f(\theta^l) \approx \frac{1}{N_s} \sum_{l=1}^{N_s} f(\theta^l) \quad \text{for large } N_s, \tag{17}$$

where $\theta^l = [\theta_1^l, \theta_2^l, \ldots, \theta_{N_\theta}^l]$ is a random sample drawn from the Bayesian posterior, which is obtained from the lth row of matrix $\Theta$. For example, the mean of each parameter $\theta_i$ is obtained by setting $f(\theta^l) = \theta^l$ in (17).

It is also desirable to obtain the marginal mode and credible interval for each parameter. Conventionally, this is done by drawing samples from the marginal posterior for each parameter and analyzing their histograms, where the marginal posterior is calculated by integrating the Bayes posterior with respect to all parameters except the desired parameter as follows

$$\Pr(\theta_i \mid y) = \int_{\theta_{1,min}}^{\theta_{1,max}} \cdots \int_{\theta_{j,min}}^{\theta_{j,max}} \cdots \int_{\theta_{N_\theta,min}}^{\theta_{N_\theta,max}} \Pr(\theta_1, \ldots, \theta_j, \ldots, \theta_{N_\theta} \mid y)\,d\theta_1 \cdots d\theta_j \cdots d\theta_{N_\theta}, \tag{18}$$

where $j \ne i$ and $\Pr(\theta_i \mid y)$ is the marginal posterior of $\theta_i$. By taking advantage of the MCMC approach, this integration is not required since the samples from the marginal posterior of $\theta_i$ are given by the ith column of the matrix $\Theta$. The marginal mode of $\theta_i$ was estimated by determining the highest peak in the histograms of the marginal posterior. Finally, the 95% credible interval of $\theta_i$ was estimated by determining the range of $\theta_i$ over which the cumulative marginal distribution lies between 2.5 and 97.5%.

L-Glutamic Acid Crystallization Model

A kinetic model for the crystallization of metastable α-form and stable β-form crystals of L-glutamic acid is developed. This appears to be the first model for polymorphic crystallization that includes all of the kinetic processes and also includes their dependence on the temperature. An earlier model for this system did not include the nucleation and growth kinetics of α-form crystals.13 An improved model that includes those kinetics14 only considered primary heterogeneous nucleation, which only applies when the crystallization is either starved with nuclei or overwhelmed by a burst of new crystals, and hence is not applicable to industrial practice.43 To develop a model amenable to industrial application, secondary nucleation is considered in this article.

Kinetic model

The mass balance on the crystals is described by a population balance equation44

$$\frac{\partial f_i}{\partial t} + \frac{\partial (G_i f_i)}{\partial L} = B_i\,\delta(L - L_0), \quad i = \alpha, \beta \tag{19}$$

where $f_i$ is the crystal size distribution of the i-form crystals (#/m^4) (i.e., α- or β-form crystals), $B_i$ and $G_i$ are the nucleation (#/m^3 s) and growth rate (m/s) of the i-form crystals, respectively, $L$ and $L_0$ are the characteristic size of crystals (m) and nuclei (m), respectively, and $\delta(\cdot)$ is a Dirac delta function.

For parameter estimation, the method of moments‡ was applied to (19) to give

$$\frac{d\mu_{i,0}}{dt} = B_i, \tag{20}$$

$$\frac{d\mu_{i,n}}{dt} = n G_i \mu_{i,n-1} + B_i L_0^n, \quad n = 1, 2, \ldots, \tag{21}$$

where the nth moment of the i-form crystals (# m^(n−3)) is given by

$$\mu_{i,n} = \int_0^\infty L^n f_i\,dL. \tag{22}$$

‡ The approach applies for the experimental conditions in this study in which data were collected during nucleation and growth. The full population balance Equation (19) is used under conditions in which dissolution occurs.

These equations are augmented by the solute mass balance:

$$\frac{dC}{dt} = -3\,\frac{10^3}{\rho_{solv}} \left(\rho_\alpha k_{v\alpha} G_\alpha \mu_{\alpha,2} + \rho_\beta k_{v\beta} G_\beta \mu_{\beta,2}\right), \tag{23}$$

where $C$ is the solute concentration (g/kg), $\rho_{solv}$ is the density of the solvent (kg/m^3), $\rho_i$ is the density of the i-form crystals (kg/m^3), $k_{vi}$ is the volumetric shape factor of the i-form crystals (dimensionless) as defined by $v_i = k_{vi} L^3$, where $v_i$ is the volume of the i-form crystal (m^3), and $10^3$ is a constant (g/kg) to ensure unit consistency. The kinetic expressions are

$$B_\alpha = k_{b\alpha} (S_\alpha - 1)\,\mu_{\alpha,3} \quad \text{(α-form crystal nucleation rate)}, \tag{24}$$

$$G_\alpha = \begin{cases} k_{g\alpha} (S_\alpha - 1)^{g_\alpha} & \text{if } S_\alpha \ge 1 \\ k_{d\alpha} (S_\alpha - 1) & \text{otherwise} \end{cases} \quad \text{(α-form crystal growth/dissolution rate)}, \tag{25}$$

$$B_\beta = k_{b\beta,1} (S_\beta - 1)\,\mu_{\alpha,3} + k_{b\beta,2} (S_\beta - 1)\,\mu_{\beta,3} \quad \text{(β-form crystal nucleation rate)}, \tag{26}$$

$$G_\beta = k_{g\beta,1} (S_\beta - 1)^{g_\beta} \exp\left(-\frac{k_{g\beta,2}}{S_\beta - 1}\right) \quad \text{(β-form crystal growth rate)}, \tag{27}$$

where $S_i = C/C_{sat,i}$ and $C_{sat,i} = a_{i,1} T^2 + a_{i,2} T + a_{i,3}$ are the supersaturation and the saturation concentration (g/kg) of the i-form crystals, respectively, and $T$ is the solution temperature (°C). The kinetic parameters $k_{b\alpha}$, $k_{g\alpha}$, and $k_{d\alpha}$ correspond to the nucleation (#/m^3 s), growth (m/s), and dissolution (m/s) rates of α-form crystals, respectively, whereas $k_{b\beta,j}$ and $k_{g\beta,j}$ correspond to the jth nucleation (#/m^3 s) and growth (m/s for j = 1 and dimensionless for j = 2) rate parameters of β-form crystals, respectively, and $g_i$ is the growth exponent of the i-form crystals, which may have a value between 1 (for diffusion-limited growth) and 2 (for surface integration-limited growth).45 The Arrhenius equation was used to account for the variability of crystal growth rate with temperature:

$$k_{g\alpha} = k_{g\alpha,0} \exp\left(-\frac{E_{g\alpha}}{8.314\,(T + 273)}\right), \tag{28}$$

$$k_{g\beta,1} = k_{g\beta,0} \exp\left(-\frac{E_{g\beta}}{8.314\,(T + 273)}\right), \tag{29}$$

where $k_{gi,0}$ and $E_{gi}$ are the pre-exponential factor (m/s) and activation energy (J/mol) for the growth rate of i-form crystals, respectively. The values for densities, volumetric shape factors, and parameters for the saturation concentration are given in Table 3.

Table 3. Values for Densities, Volume Shape Factors, and Saturation Concentration Parameters
Parameters    Values
ρ_solv        990
ρ_α           1540
ρ_β           1540
k_vα          0.480
k_vβ          0.031
a_α,1         8.437 × 10^-3
a_α,2         0.03032
a_α,3         4.564
a_β,1         7.644 × 10^-3
a_β,2         −0.1165
a_β,3         6.622

Secondary nucleation is assumed for both α- and β-form crystals, since it is the dominant nucleation process in seeded crystallization. Primary nucleation is not included in the model since it is negligible compared to the secondary nucleation. The nucleation rate expression (24) and the second term in (26) were adapted from that reported in the literature for β crystals for L-glutamic acid.13 We have introduced the first term in (26) to model the nucleation of β-form crystals from the surface of α-form crystals. The growth rate expression for the α-form crystals includes both growth (positive supersaturation) and dissolution (undersaturation). Dissolution occurs during the polymorphic transformation of α- to β-form crystals, where α-form crystals dissolve and β-
form crystals nucleate and grow. As reported in the literature,13,14 the dissolution kinetics cannot be estimated accurately from polymorphic transformation experiments, as the growth rate of β-form crystals is limiting. Thus the simple form of dissolution rate with exponential factor of 1 was used with $k_{d\alpha}$ determined by a correlation equation based on mass transfer-limited dissolution, as reported in the literature.14 The growth rate expressions for both α- and β-form crystals are also adopted from the literature,46 except that the exponential term for the α-form crystals is omitted in this article as it had a negligible effect on the model fitness to the data.

Table 4. Seed Crystal Size Distribution Data and α-Form Crystal Purity at the End of Batch (x_α)
No.  Seed  Size (μm)  Mass (g/kg)  k_i             σ_seed,i (×10^6 m)  μ_seed,i (×10^6 m)  x_α
1    α     180–250    0.613        8.227 × 10^7    8.608               214.977             1.000
2    α     75–180     0.613        3.877 × 10^8    12.127              127.269             1.000
3    α     75–180     0.592        3.731 × 10^8    12.115              127.427             0.924
4    β     40–270     4.900        2.483 × 10^10   27.289              155.069             0.000
5    β     40–270     3.225        1.630 × 10^10   27.989              155.017             0.000
6    β     40–270     2.972        1.501 × 10^10   28.131              155.004             0.000

Figure 2. Experimental and model trajectories for (a) temperature, (b) the first-order moment of the α-form crystals, and (c) solute concentration for Experiment 1 of Table 4. The vertical line in plot (a) shows the seeding time.

Figure 3. Experimental and model trajectories for (a) temperature, (b) the first-order moment of the β-form crystals, and (c) solute concentration for Experiment 4 of Table 4. The vertical line in plot (a) shows the seeding time.

Parameter estimation

Before parameter estimation is carried out, the measured variables are discussed first. The various in situ sensors that
Table 5. Definition of Measured Variables y and Interested The CSD can be computed from the CLD under certain
Parameters h for a- and b-Seeded Experiments assumptions.4952 For some systems, the square-weighted
Seed hT yT chord length was found to be comparable to laser diffraction,
sieving, and electrical sensing zone analysis over the range
a [ln(kba), ln(kga,0),ga, ln(Ega), [C, la,1, xa] of 50400 lm.53 Although the aforementioned methods are
ln(kbb,1), ln(rca), ln(rla,1), ln(rxa)]
b [ln(kbb,2), ln(kgb,0), ln(kgb,2), [C, lb,1] able to estimate the CSD from CLD successfully for some
gb, ln(Egb), ln(rcb), ln(rlb,1)] systems, the theory behind these methods require many
assumptions, including that the particles perfectly backscatter
light at all angles and that shape of the crystals is known.
have become available for crystallization processes have Although these assumptions are true for many particulate
removed or reduced sampling of the crystal slurry during systems (such as round polymer beads with a rough surface
crystallization and reduced the amount of pharmaceutical in water at low-to-moderate solids densities52), the assump-
needed for each batch experiment. The two in situ measure- tions are not accurate for other particulate systems including
ments utilized in this study were ATR-FTIR spectroscopy, the system studied here which has crystals with a similar re-
which infers the solute concentration and FBRM that pro- fractive index as the solution (and hence poor backscattering
vides crystal size information throughout the batch. Inferen- properties). Because of the limited time and pharmaceutical
tial modeling was used to construct a calibration curve to quantity available in the early stage of batch crystallization
relate the FTIR spectra to the solute concentration, using pro- design, it is typically not possible to carry out the extensive
cedures described elsewhere.47,48 FBRM measures the chord studies to verify the assumptions and to determine the effects
length distribution (CLD), which is not the same as the crys- of nonideality of the assumptions on the accuracy of the esti-
tal size distribution (CSD) that appears in the models in the mates of the CSD from the CLD. Furthermore, computing
previous section. the CSD from the CLD when assumptions such as perfect
laser backscattering do not hold is still an open problem.51,54
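As an illustration of the chord-length statistics mentioned above, the following minimal Python sketch computes an unweighted and a square-weighted mean chord length from a binned CLD. The definition used here (the ratio of the third to the second raw moment of the chord counts) is an assumption based on a common convention and is not taken from this article. The function names, bin midpoints, and counts are hypothetical and are not FBRM data from this study.

```python
def cld_moment(lengths, counts, order):
    """Raw moment of a binned chord length distribution: sum(n_i * L_i**order)."""
    return sum(n * (L ** order) for L, n in zip(lengths, counts))


def weighted_mean_chord(lengths, counts, weight_power=0):
    """Mean chord length with each count weighted by L**weight_power.

    weight_power=0 gives the unweighted (number) mean;
    weight_power=2 gives the square-weighted mean (assumed definition).
    """
    denominator = cld_moment(lengths, counts, weight_power)
    if denominator == 0:
        raise ValueError("empty chord length distribution")
    return cld_moment(lengths, counts, weight_power + 1) / denominator


# Hypothetical binned CLD: bin midpoints in micrometers and chord counts per bin.
bin_midpoints_um = [10.0, 50.0, 100.0, 200.0]
counts = [400, 300, 200, 100]

unweighted = weighted_mean_chord(bin_midpoints_um, counts, weight_power=0)
square_weighted = weighted_mean_chord(bin_midpoints_um, counts, weight_power=2)
print(f"unweighted mean chord length:      {unweighted:.1f} um")
print(f"square-weighted mean chord length: {square_weighted:.1f} um")
```

The square weighting emphasizes the longer chords, so in this made-up example the square-weighted mean is much larger than the unweighted mean. This sketch does not attempt to recover a CSD from the CLD.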
An alternative approach is to use the low-order moments of the CLD directly,54,55 without first estimating the CSD from the CLD. This approach replaces the first-principles model for the CSD with a gray-box model for the CLD, in which the structure of the first-principles model for the low-order moments of the CSD is used to parametrize the low-order moments for the CLD.54 The reasoning behind this particular gray-box model is that the mapping between the CLD and the CSD is static (most of the aforementioned mapping methods assume that the mapping is actually linear), so the low-order moments of the CLD should follow the same dynamic trends as the low-order moments of the CSD. Because of the limited precision of FBRM, which would undercount the very small crystals, the zeroth moment was not used. On the other hand, it is not advisable to use moments with order higher than two because higher order moments are sensitive to the low sampling statistics of the large crystals.55 In this study, the first-order moment was used. As with any model,25 this article assesses the applicability of this gray-box modeling by quantifying the accuracy of the kinetic parameters and the model's predictions.
The experiments are categorized into two sets, namely, α-seeded and β-seeded experiments. The seed crystal size distribution was approximated as a normal distribution

$$ f_i(L, 0) = f_{\mathrm{seed},i}(L) = \frac{k_i}{\sqrt{2\pi}\,\sigma_{\mathrm{seed},i}} \exp\left( -\frac{(L - \mu_{\mathrm{seed},i})^2}{2\sigma_{\mathrm{seed},i}^2} \right) \qquad (30) $$

with the parameters (ki, σseed,i, and μseed,i) in Table 4. The time series for the temperature, first-order moment of the i-form crystals, and solute concentration for Experiments 1 and 4 are shown by the solid lines in Figures 2 and 3.† For all the β-seeded experiments, there is no apparent formation of α-form crystals at the end of all batches (Table 4).‡ As a result, the kinetic parameters for β-form crystals were independently obtained from the β-seeded experiments, except for kbβ,1, which accounts for the nucleation of β-form crystals from α-form crystals. One α-seeded experiment was operated at a high enough temperature that a measurable quantity of β-form crystals nucleated and grew (Experiment 3 in Table 4), so there would be enough information content in the data for kbβ,1 to be estimated. This experimental design enabled the kinetic parameters for β-form crystals to be obtained before determining the kinetic parameters for α-form crystals.

† The time series for the other moments and other experiments are in Supporting Information.
‡ Samples were taken at the end of all batches and XRD was used to determine the crystal form purity.

The nucleation and growth kinetics of α- and β-form crystals have 10 parameters to be estimated, four (kbα, kgα,0, gα, Egα) corresponding to the kinetics of α-form crystals and six (kbβ,1, kbβ,2, kgβ,0, kgβ,2, gβ, Egβ) corresponding to the kinetics of β-form crystals. In relation to the notation defined in the Review of Bayesian Inference section, the measured variables y and parameters of interest θ for each set of experiments are defined in Table 5, where σci, σμi,1, and σxi are the noise parameters for the i-form crystals. The prior distribution Pr(θ) came from a preliminary parameter estimation that was carried out using maximum likelihood techniques as described in Miller and Rawlings,23 which resulted in a normal distribution for each parameter. These were modified for gα and gβ according to (4) to limit their values between 1 and 2. The resulting marginal probability distributions of θ from α- and β-seeded experiments are in Figures 4 and 5, respectively. While some of the marginal probability distributions could be approximated by a normal distribution, others cannot. These distributions can be directly inserted into model predictive control and other control algorithms that have been designed to ensure robustness to stochastic parameter uncertainties.24 The means, modes, and 95% credible intervals for the model parameters based on their marginal probability distributions are in Table 6. Figures 2 and 3 compare the temperature, first-order moment of the α-form crystals, and solute concentration trajectories obtained from experimental data and those predicted through simulation using the aforementioned mean values as the model parameters.

Figure 4. The marginal distributions of parameters θ obtained from α-seeded experiments (Table 5).

Figure 5. The marginal distributions of parameters θ obtained from β-seeded experiments (Table 5).

Table 6. The Model Parameters Determined from Parameter Estimation
Parameter    Mean     Mode     95% Credible Interval
ln(kbα)      17.233   17.213   17.083 to 17.377
ln(kgα,0)    1.878    1.778    0.801 to 2.912
gα           1.859    1.860    1.775 to 1.944
ln(Egα)      10.671   10.671   10.612 to 10.725
ln(kbβ,1)    15.801   15.796   15.758 to 15.842
ln(kbβ,2)    20.000   20.000   19.961 to 20.036
ln(kgβ,0)    52.002   52.426   50.745 to 53.322
ln(kgβ,2)    −0.251   −0.251   −0.311 to −0.197
gβ           1.047    1.016    1.002 to 1.143
ln(Egβ)      12.078   12.076   12.060 to 12.097

Table 7. Seed Crystal Size Distribution Data and α-Form Crystals Purity at the End of Batch (xα) for Model Validation
No.   Seed   Size (μm)   Mass (g/kg)   ki              σseed,i (×10^6 m)   μseed,i (×10^6 m)   xα
V1    α      75–180      0.613         3.877 × 10^8    12.127              127.269             1.000
V2    β      40–270      3.060         1.547 × 10^10   28.081              154.978             0.000

Figure 6. Experimental and predictive trajectories of (a) temperature, (b) the first-order moment of the α-form crystals, and (c) solute concentration for Experiment V1 of Table 7. The vertical line in plot (a) shows the seeding time.

It is well known that concentration data alone are not sufficient to characterize nucleation.23 The small uncertainties in the nucleation kinetic parameters indicate that the first-order moment of the FBRM provided enough information to characterize the nucleation kinetics. The small range in the uncertainties for the activation energies indicates that the temperature range from 24 to 55°C in the experiments was large enough to enable activation energies to be estimated. The rather large uncertainty in kgα,0 is mainly due to the large correlation coefficient of 0.993 between kgα,0 and Egα, where a small change in Egα necessitates a larger change in kgα,0 to ensure the resulting kgα in (28) is of the same order of magnitude. Similar reasoning explains the large uncertainty in kgβ,0, with the correlation coefficient between kgβ,0 and Egβ equal to 0.997. The growth exponent for the α-form is near 2, which indicates that the α-form growth rate is surface integration-limited, whereas that for the β-form is near 1, suggesting that the β-form growth rate is diffusion-limited. Unlike past studies that quantified uncertainties in the kinetic
parameters for crystallization processes,23,54 the analysis in this article explicitly takes into account hard theoretical bounds on the values for the parameters. In particular, the application of the linearized analyses used in past papers would have resulted in a confidence interval that included values of gβ < 1, whereas the Markov Chain simulation approach takes the lower bound of 1 into account during the statistical analysis (see Figure 5d).
To assess the predictive capability of the resulting model, another pair of experiments (i.e., one α- and one β-seeded experiment) was carried out with the seed distributions in Table 7. The trajectories of the temperature, first-order moment of the i-form crystals, and solute concentration obtained from experimental data and those predicted through simulation are plotted in Figures 6 and 7. As can be seen from these figures, the predictive capability of the model is sufficiently accurate for use in process design and control. The solute concentration predicted by the model is quite close to the measured solute concentration in both validation experiments, with the differences between the predicted and experimental first-order moment being comparable to or smaller than the differences in the model and experimental first-order moments in the experiments used for parameter estimation (compare Figures 6 and 7 with Figures 2 and 3). The biases observed in the model predictions for the first-order moment of the i-form crystals could be due to the FBRM undercounting very small and large crystals, which would cause a different time-varying bias in different experiments.

Figure 7. Experimental and predictive trajectories of (a) temperature, (b) the first-order moment of the β-form crystals, and (c) solute concentration for Experiment V2 of Table 7. The vertical line in plot (a) shows the seeding time.

Conclusions
A model of polymorphic crystallization of L-glutamic acid, which consists of α- and β-form crystallization, has been developed. Unlike past studies on the modeling of L-glutamic acid crystallization,13,14 the detailed kinetic model takes into account the temperature dependence of the crystal growth kinetic parameters. In addition to providing point estimates of the kinetic parameters, a Bayesian inference approach is used to determine a detailed marginal probability distribution for each parameter. The marginal probability distributions of the parameters can give practitioners insight regarding the parameter uncertainties and are of significant value for developing robust control strategies for the crystallization process.24
Although this article considers a specific polymorphic crystallization, the same parameter estimation method can be applied to crystallizations in which many nucleation and growth processes occur simultaneously, or when there are no prior literature data or estimates for the model parameters. The details of the nucleation and growth rate expressions may be different, depending on the particular solute–solvent system. With multiple polymorphs in the crystallizer, improved parameter estimates would be obtained by including polymorph ratio measurements obtained from in situ Raman spectroscopy in (3).56

Literature Cited
1. De Anda JC, Wang XZ, Lai X, Roberts KJ. Classifying organic crystals via in-process image analysis and the use of monitoring charts to follow polymorphic and morphological changes. J Process Control. 2005;15:785–797.
2. Blagden N, Davey R. Polymorphs take shape. Chem Britain. 1999;35:44–47.
3. Fujiwara M, Nagy ZK, Chew JW, Braatz RD. First-principles and direct design approaches for the control of pharmaceutical crystallization. J Process Control. 2005;15:493–504.
4. Rohani S, Horne S, Murthy K. Control of product quality in batch crystallization of pharmaceuticals and fine chemicals, Part 1: Design of the crystallization process and the effect of solvent. Org Process Res Dev. 2005;9:858–872.
5. Yu LX, Lionberger RA, Raw AS, D'Costa R, Wu HQ, Hussain AS. Applications of process analytical technology to crystallization processes. Adv Drug Delivery Rev. 2004;56:349–369.
6. Cardew PT, Davey RJ. The kinetics of solvent-mediated phase transformations. Proc R Soc Lond A. 1985;398:415–428.
7. Sakai H, Hosogai H, Kawakita T, Onuma K, Tsukamoto K. Transformation of α-glycine to γ-glycine. J Cryst Growth. 1992;116:421–426.
8. Saranteas K, Bakale R, Hong YP, Luong H, Foroughi R, Wald S. Process design and scale-up elements for solvent mediated polymorphic controlled tecastemizole crystallization. Org Process Res Dev. 2005;9:911–922.
9. Yu L. Survival of the fittest polymorph: how fast nucleater can lose to fast grower. CrystEngComm. 2007;9:841–851.
10. Bard Y. Nonlinear Parameter Estimation. New York: Academic Press, 1974.
11. Bates DM, Watts DG. Nonlinear Regression Analysis and Its Applications. New York: Wiley, 1988.
12. Mendes P, Kell DB. Non-linear optimization of biochemical pathways: applications to metabolic engineering and parameter estimation. Bioinformatics. 1998;14:869–883.
13. Ono T, Kramer HJM, Ter Horst JH, Jansens PJ. Process modeling of the polymorphic transformation of L-glutamic acid. Cryst Growth Des. 2004;4:1161–1167.
14. Scholl J, Bonalumi D, Vicum L, Mazzotti M. In situ monitoring and modeling of the solvent-mediated polymorphic transformation of L-Glutamic acid. Cryst Growth Des. 2006;6:881–891.
15. Caillet A, Sheibat-Othman N, Fevotte G. Crystallization of monohydrate citric acid. II. Modeling through population balance equations. Cryst Growth Des. 2007;7:2088–2095.
16. Fevotte G, Alexandre C, Nida SO. A population balance model of the solution-mediated phase transition of citric acid. AIChE J. 2007;53:2578–2589.
17. Box GEP, Draper NR. The Bayesian estimation of common parameters from several responses. Biometrika. 1965;52:355–365.
18. Duran MA, White BS. Bayesian estimation applied to effective heat transfer coefficients in a packed bed. Chem Eng Sci. 1995;50:495–510.
19. Bois FY, Fahmy T, Block JC, Gatel D. Dynamic modeling of bacteria in a pilot drinking-water distribution system. Water Res. 1997;31:3146–3456.
20. Coleman MC, Block DE. Bayesian parameter estimation with informative priors for nonlinear systems. AIChE J. 2006;52:651–667.
21. Pouillot R, Albert I, Cornu M, Denis JB. Estimation of uncertainty and variability in bacterial growth using Bayesian inference. Application to Listeria monocytogenes. Int J Food Microbiol. 2003;81:87–104.
22. Gunawan R, Jung MY, Seebauer EG, Braatz RD. Maximum a posteriori estimation of transient enhanced diffusion energetics. AIChE J. 2003;49:2114–2122.
23. Miller SM, Rawlings JB. Model identification and control strategies for batch cooling crystallizers. AIChE J. 1994;40:1312–1327.
24. Nagy ZK, Braatz RD. Worst-case and distributional robustness analysis of finite-time control trajectories for nonlinear distributed parameter systems. IEEE Trans Control Syst Technol. 2003;11:694–704.
25. Beck JV, Arnold KJ. Parameter Estimation in Engineering and Science. New York: Wiley, 1977.
26. Tierney L. Markov Chains for exploring posterior distributions. Ann Stat. 1994;22:1701–1728.
27. Liu JS. Monte Carlo Strategies in Scientific Computing. New York: Springer, 2001.
28. Gelman A, Carlin JB, Stern HS, Rubin DB. Bayesian Data Analysis. New York: Chapman & Hall/CRC, 2004.
29. Chen WS, Bakshi BR, Goel PK, Ungarala S. Bayesian estimation via sequential Monte Carlo sampling: unconstrained nonlinear dynamic systems. Ind Eng Chem Res. 2004;43:4012–4025.
30. Lang L, Chen WS, Bakshi BR, Goel PK, Ungarala S. Bayesian estimation via sequential Monte Carlo sampling - constrained dynamic systems. Automatica. 2007;43:1615–1622.
31. Fujiwara M, Chow PS, Ma DL, Braatz RD. Paracetamol crystallization using laser backscattering and ATR-FTIR spectroscopy: metastability, agglomeration and control. Cryst Growth Des. 2002;2:363–370.
32. Togkalidou T, Fujiwara M, Patel S, Braatz RD. Solute concentration prediction using chemometrics and ATR-FTIR spectroscopy. J Cryst Growth. 2001;231:534–543.
33. Xie YL, Kalivas JH. Evaluation of principal component selection methods to form a global prediction model by principal component regression. Anal Chim Acta. 1997;348:19–27.
34. Bretthorst GL. An introduction to parameter estimation using Bayesian probability theory. In: Fougere PF, editor. Maximum Entropy and Bayesian Methods. Dordrecht: Kluwer Academic Publishers, 1990:53–79.
35. Carlin BP, Louis TA. Bayes and Empirical Bayes Methods for Data Analysis. Boca Raton: Chapman & Hall/CRC, 2000.
36. Box GEP, Tiao GC. Bayesian Inference in Statistical Analysis. Reading, Mass: Addison-Wesley, 1973.
37. Metropolis N, Rosenbluth AW, Rosenbluth MN, Teller AH. Equation of state calculations by fast computing machines. J Chem Phys. 1953;21:1087–1092.
38. Haario H, Saksman E, Tamminen J. An adaptive Metropolis algorithm. Bernoulli. 2001;7:223–242.
39. Liang FM, Wong WH. Real-parameter evolutionary Monte Carlo with applications to Bayesian mixture models. J Am Stat Assoc. 2001;96:653–666.
40. Liang F. Dynamically weighted importance sampling in Monte Carlo computation. J Am Stat Assoc. 2002;97:807–821.
41. Laskey KB, Myers JW. Population Markov Chain Monte Carlo. Machine Learn. 2003;50:175–196.
42. Braak CJ. A Markov Chain Monte Carlo version of the genetic algorithm differential evolution: easy Bayesian computing for real parameter spaces. Stat Comput. 2006;16:239–249.
43. Clontz NA, McCabe WL. Contact nucleation of magnesium sulfate heptahydrate. AIChE Symp Ser. 1971;67:6–17.
44. Hulburt HM, Katz S. Some problems in particle technology: a statistical mechanical formulation. Chem Eng Sci. 1964;19:555–574.
45. Mersmann A. Crystallization Technology Handbook, 2nd ed. Boca Raton: CRC Press, 2001.
46. Kitamura M, Ishizu T. Growth kinetics and morphological change of polymorphs of L-glutamic acid. J Cryst Growth. 2000;209:138–145.
47. Togkalidou T, Braatz RD, Johnson B, Davidson O, Andrews A. Experimental design and inferential modeling in pharmaceutical crystallization. AIChE J. 2001;47:160–168.
48. Togkalidou T, Tung HH, Sun Y, Andrews A, Braatz RD. Solution concentration prediction for pharmaceutical crystallization processes using robust chemometrics and ATR FTIR spectroscopy. Org Process Res Dev. 2002;6:317–322.
49. Tadayyon A, Rohani S. Determination of particle size distribution by Par-Tec 100: modeling and experimental results. Part Part Syst Charact. 1998;15:127–135.
50. Simmons M, Langston P, Burbidge A. Particle and droplet size analysis from chord distributions. Powder Technol. 1999;102:75–83.
51. Ruf A, Worlitschek J, Mazzotti M. Modeling and experimental analysis of PSD measurements through FBRM. Part Part Syst Charact. 2000;17:167–179.
52. Hukkanen EJ, Braatz RD. Measurement of particle size distribution in suspension polymerization using in situ laser backscattering. Sens Actuators B. 2003;96:451–459.
53. Heath AR, Fawell PD, Bahri PA, Swift JD. Estimating average particle size by focused beam reflectance measurement (FBRM). Part Part Syst Charact. 2002;19:84–95.
54. Togkalidou T, Tung HH, Sun Y, Andrews AT, Braatz RD. Parameter estimation and optimization of a loosely-bound aggregating pharmaceutical crystallization using in-situ infrared and laser backscattering measurements. Ind Eng Chem Res. 2004;43:6168–6181.
55. Gunawan R, Ma DL, Fujiwara M, Braatz RD. Identification of kinetic parameters in a multidimensional crystallization process. Int J Mod Phys B. 2002;16:367–374.
56. Starbuck C, Spartalis A, Wai L, Wang J, Fernandez P, Lindemann CM, Zhou GX, Ge ZH. Process optimization of a complex pharmaceutical polymorphic system via in situ Raman spectroscopy. Cryst Growth Des. 2002;2:515–522.

Manuscript received Jan. 14, 2008, and revision received July 8, 2008.

3248 December 2008 Vol. 54, No. 12 AIChE Journal
