Rad 1
MATEMATIČKE ZNANOSTI
Parameter estimation problem in the Box-Cox simple linear model
Darija Marković
1. Introduction
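Suppose we are given the data $(x_i, y_i)$, $i = 1, \dots, n$, such that $y_i > 0$ for all $i = 1, \dots, n$. The Box-Cox simple linear model has the form
\[
y_i^{(\lambda)} = a x_i + b + \varepsilon_i, \quad i = 1, \dots, n,
\]
where
\[
y_i^{(\lambda)} =
\begin{cases}
\dfrac{y_i^{\lambda} - 1}{\lambda}, & \text{for } \lambda \neq 0, \\[1ex]
\ln y_i, & \text{for } \lambda = 0,
\end{cases}
\tag{1.1}
\]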
and where it is assumed that the errors $\varepsilon_i$ are independent and normally distributed with zero mean and some unknown constant variance $\sigma^2 > 0$ (see [2]). The Box-Cox transformation (1.1) was proposed as a modification of the power transformation introduced by Tukey in [11] in order to avoid discontinuity at $\lambda = 0$. The theoretical properties and a variety of applications of the Box-Cox transformation (1.1) as well as other transformations can be found
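As an illustration of the transformation (1.1), the following minimal Python sketch evaluates $y^{(\lambda)}$ for a single positive value and checks numerically that it approaches $\ln y$ as $\lambda \to 0$. The function name `box_cox` is hypothetical, and the use of `math.expm1` is an implementation choice made here for accuracy near $\lambda = 0$; it is not taken from the paper.

```python
import math

def box_cox(y, lam):
    """Box-Cox transform (1.1) of a single positive number y (hypothetical helper)."""
    if y <= 0:
        raise ValueError("the Box-Cox transform requires y > 0")
    if lam == 0:
        return math.log(y)
    # (y**lam - 1) / lam written as expm1(lam * ln y) / lam,
    # which stays accurate when lam is close to zero.
    return math.expm1(lam * math.log(y)) / lam

# Values of the transform for lam shrinking to 0, next to the limit ln(y).
y = 2.0
near_zero = [box_cox(y, lam) for lam in (1e-2, 1e-4, 1e-8)]
limit_value = box_cox(y, 0.0)
```

The list `near_zero` can be compared with `limit_value` to see that the two branches of (1.1) fit together continuously at $\lambda = 0$.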
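By using the well-known fact that the quadratic function $t \mapsto \sum_{i=1}^{r} (t - u_i)^2$ attains its minimum $\sum_{i=1}^{r} (\tau_1 - u_i)^2$ at the point $\tau_1 = \frac{1}{r} \sum_{i=1}^{r} u_i$, as well as the fact that the quadratic function $t \mapsto \sum_{i=1}^{r} (t v_i - u_i)^2$ attains its minimum $\sum_{i=1}^{r} (\tau_2 v_i - u_i)^2$ at the point $\tau_2 = \sum_{i=1}^{r} u_i v_i \big/ \sum_{i=1}^{r} v_i^2$, we obtain
\[
\begin{aligned}
F(\lambda, a, b) &= \sum_{i=1}^{n} \left( \frac{a x_i + b - y_i^{(\lambda)}}{\dot{y}^{\lambda-1}} \right)^2 \\
&\geq \sum_{i=1}^{n} \left( \frac{a (x_i - \bar{x}) + \bar{y}_\lambda - y_i^{(\lambda)}}{\dot{y}^{\lambda-1}} \right)^2 \\
&= F(\lambda, a, \bar{y}_\lambda - a \bar{x}) \\
&\geq F(\lambda, \alpha(\lambda), \bar{y}_\lambda - \alpha(\lambda) \bar{x}) \\
&= F(\lambda, \alpha(\lambda), \beta(\lambda)) \\
&= S(\lambda).
\end{aligned}
\tag{2.3}
\]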
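For completeness, both minimization facts used in (2.3) can be verified by completing the square. The short derivation below is a sketch in the notation of the text; it assumes $\sum_{i=1}^{r} v_i^2 \neq 0$ in the second identity.

```latex
\[
\sum_{i=1}^{r} (t - u_i)^2
  = r\,(t - \tau_1)^2 + \sum_{i=1}^{r} (\tau_1 - u_i)^2,
\qquad \tau_1 = \frac{1}{r} \sum_{i=1}^{r} u_i,
\]
\[
\sum_{i=1}^{r} (t v_i - u_i)^2
  = (t - \tau_2)^2 \sum_{i=1}^{r} v_i^2 + \sum_{i=1}^{r} (\tau_2 v_i - u_i)^2,
\qquad \tau_2 = \frac{\sum_{i=1}^{r} u_i v_i}{\sum_{i=1}^{r} v_i^2}.
\]
```

In both identities the cross term vanishes by the choice of $\tau_1$ and $\tau_2$, so the right-hand side is smallest exactly when the first summand is zero. In (2.3) the first identity is presumably applied with $t = b$ for fixed $a$, and the second with $t = a$, $v_i = x_i - \bar{x}$ and $u_i = y_i^{(\lambda)} - \bar{y}_\lambda$.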
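Furthermore, it is easy to verify that if $\sum_{i=1}^{n} (x_i - \bar{x})^2 \neq 0$, then
\[
S(\lambda) = \frac{\displaystyle \sum_{i=1}^{n} (x_i - \bar{x})^2 \sum_{i=1}^{n} \bigl(y_i^{(\lambda)} - \bar{y}_\lambda\bigr)^2 - \Bigl( \sum_{i=1}^{n} (x_i - \bar{x}) \bigl(y_i^{(\lambda)} - \bar{y}_\lambda\bigr) \Bigr)^2}{\displaystyle \dot{y}^{2(\lambda-1)} \sum_{i=1}^{n} (x_i - \bar{x})^2},
\tag{2.4}
\]
whereas if $\sum_{i=1}^{n} (x_i - \bar{x})^2 = 0$, then
\[
S(\lambda) = \sum_{i=1}^{n} \left( \frac{y_i^{(\lambda)} - \bar{y}_\lambda}{\dot{y}^{\lambda-1}} \right)^2.
\tag{2.5}
\]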
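The relation between $F$ and $S$ in (2.3) to (2.5) can be checked numerically. The sketch below is not the author's code. It assumes that $\dot{y}$ denotes the geometric mean of the $y_i$, that $\bar{y}_\lambda$ is the mean of the transformed values, and that $\alpha(\lambda)$, $\beta(\lambda)$ are the ordinary least-squares slope and intercept for fixed $\lambda$; none of these definitions appears in this excerpt. All function names and the sample data are hypothetical.

```python
import math

def box_cox(y, lam):
    """Box-Cox transform (1.1) of a positive number y."""
    return math.log(y) if lam == 0 else (y**lam - 1.0) / lam

def geometric_mean(ys):
    # Assumption: the symbol written as y-dot in the text is the geometric mean.
    return math.exp(sum(math.log(y) for y in ys) / len(ys))

def F(lam, a, b, xs, ys):
    """Scaled sum of squares, as in the first line of (2.3)."""
    scale = geometric_mean(ys) ** (lam - 1.0)
    return sum(((a * x + b - box_cox(y, lam)) / scale) ** 2 for x, y in zip(xs, ys))

def alpha_beta(lam, xs, ys):
    """Least-squares slope and intercept for fixed lam (assumed form of alpha, beta)."""
    n = len(xs)
    z = [box_cox(y, lam) for y in ys]
    x_bar, z_bar = sum(xs) / n, sum(z) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)  # must be nonzero here
    a = sum((x - x_bar) * (v - z_bar) for x, v in zip(xs, z)) / sxx
    return a, z_bar - a * x_bar

def S(lam, xs, ys):
    """Profile objective via (2.4), or via (2.5) when all x_i coincide."""
    n = len(xs)
    z = [box_cox(y, lam) for y in ys]
    x_bar, z_bar = sum(xs) / n, sum(z) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    szz = sum((v - z_bar) ** 2 for v in z)
    sxz = sum((x - x_bar) * (v - z_bar) for x, v in zip(xs, z))
    scale2 = geometric_mean(ys) ** (2.0 * (lam - 1.0))
    if sxx == 0:
        return szz / scale2
    return (sxx * szz - sxz**2) / (scale2 * sxx)

# Hypothetical sample data, chosen only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.2, 1.9, 3.5, 4.1, 6.8]
lam = 0.5
a, b = alpha_beta(lam, xs, ys)
at_fit = F(lam, a, b, xs, ys)                 # should agree with S(lam)
perturbed = F(lam, a + 0.3, b - 0.1, xs, ys)  # should not be smaller than S(lam)
profile = S(lam, xs, ys)
```

Here `at_fit` and `profile` agree up to rounding, while `perturbed` is larger, which is what the chain of inequalities (2.3) predicts for any other choice of $(a, b)$.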
The next lemma will be used to prove Theorems 2.3, 2.4 and 2.5.
Lemma 2.1. With the notation as above, we have:
(i) $\inf_{(\lambda,a,b) \in \mathbb{R}^3} F(\lambda, a, b) = \inf_{\lambda \in \mathbb{R}} S(\lambda)$.
(ii) If a point $(\lambda_0, a_0, b_0)$ is a global minimizer of $F$, then $\lambda_0$ is a global minimizer of $S$.
(iii) If $\lambda_0$ is a global minimizer of $S$, then $(\lambda_0, \alpha(\lambda_0), \beta(\lambda_0))$ is a global minimizer of $F$.
(iv) If $F(\lambda, a, b) \geq F(\lambda, a_0, b_0)$ for all $a, b \in \mathbb{R}$, then $F(\lambda, a_0, b_0) = S(\lambda)$.
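Proof. (i) By (2.3) and the definition of infimum we obtain
\[
F(\lambda, a, b) \geq S(\lambda) \geq \inf_{\lambda \in \mathbb{R}} S(\lambda) \quad \text{for all } (\lambda, a, b) \in \mathbb{R}^3,
\]
and, consequently, $\inf_{(\lambda,a,b) \in \mathbb{R}^3} F(\lambda, a, b) \geq \inf_{\lambda \in \mathbb{R}} S(\lambda)$. On the other hand, since
\[
\inf_{(\lambda,a,b) \in \mathbb{R}^3} F(\lambda, a, b) \leq F(\lambda, \alpha(\lambda), \beta(\lambda)) = S(\lambda) \quad \text{for all } \lambda \in \mathbb{R},
\]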
The next lemma is also used in the proofs of Theorems 2.3, 2.4 and 2.5. Its proof is omitted because it follows easily from the definition of an infinite limit at infinity and the Extreme Value Theorem, which says that a continuous real-valued function on a closed bounded interval attains its minimum value at some point of that interval.
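Lemma 2.2. Let $f : \mathbb{R} \to [0, \infty)$ be a continuous function such that
\[
\lim_{\lambda \to -\infty} f(\lambda) = \infty \quad \text{and} \quad \lim_{\lambda \to \infty} f(\lambda) = \infty.
\]
Then there exist reals $\lambda_1 < 0$, $\lambda_2 > 0$ and a point $\lambda_0 \in [\lambda_1, \lambda_2]$ such that
\[
\inf_{\lambda \in \mathbb{R}} f(\lambda) = \inf_{\lambda \in [\lambda_1, \lambda_2]} f(\lambda) = f(\lambda_0).
\]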
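The idea behind Lemma 2.2 can be mimicked numerically: once $f$ exceeds $f(0)$ far enough from the origin, the search for a minimizer can be restricted to a bounded interval. The following Python sketch is a heuristic illustration only, not a proof and not part of the paper. It only tests the endpoints of the interval, so it can be fooled by functions that dip again farther out. The names `bracket` and `grid_minimize` and the test function are hypothetical.

```python
def bracket(f, radius=1.0, growth=2.0, max_radius=1e6):
    """Enlarge [-radius, radius] until f at both endpoints exceeds f(0)."""
    f0 = f(0.0)
    while radius <= max_radius:
        if f(-radius) > f0 and f(radius) > f0:
            return -radius, radius
        radius *= growth
    raise RuntimeError("no bracket found; f may not diverge at infinity")

def grid_minimize(f, lo, hi, num=20001):
    """Return the grid point in [lo, hi] with the smallest value of f, and that value."""
    best_x, best_val = lo, f(lo)
    for k in range(1, num):
        x = lo + (hi - lo) * k / (num - 1)
        val = f(x)
        if val < best_val:
            best_x, best_val = x, val
    return best_x, best_val

def example(lam):
    # A continuous, nonnegative function that tends to infinity in both directions.
    return (lam - 3.0) ** 2 + 1.0

lo, hi = bracket(example)
lam0, f_min = grid_minimize(example, lo, hi)
```

A grid search is used instead of a local method because the lemma makes no unimodality assumption; the price is that `lam0` is only accurate up to the grid spacing.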
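Indeed, by Lemma 2.2 this will mean that there exists a point $\lambda_0 \in \mathbb{R}$ such that $S(\lambda_0) = \inf_{\lambda \in \mathbb{R}} S(\lambda)$, and then according to assertion (iii) of Lemma 2.1, we have that $\inf_{(\lambda,a,b) \in \mathbb{R}^3} F(\lambda, a, b) = F(\lambda_0, \alpha(\lambda_0), \beta(\lambda_0))$.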
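The above equality follows easily from (2.4). Let us first show that $\lim_{\lambda \to \infty} S(\lambda) = \infty$. If we take $y_r = \max\{y_i : y_i > \dot{y}\}$ in (2.8), after passing to the limit as $\lambda \to \infty$ we obtain
\[
\lim_{\lambda \to \infty} \left( \frac{y_r^{\lambda}}{\lambda \dot{y}^{\lambda-1}} \right)^2 = \infty
\]
and
\[
\begin{aligned}
\lim_{\lambda \to \infty} H_r(\lambda) &= \sum_{i=1}^{n} (x_i - \bar{x})^2 \Biggl( \sum_{\substack{i=1 \\ y_i = y_r}}^{n} \Bigl(1 - \frac{L}{n}\Bigr)^2 + \sum_{\substack{i=1 \\ y_i \neq y_r}}^{n} \Bigl(-\frac{L}{n}\Bigr)^2 \Biggr) \\
&\quad - \Biggl[ \sum_{\substack{i=1 \\ y_i = y_r}}^{n} (x_i - \bar{x}) \Bigl(1 - \frac{L}{n}\Bigr) + \sum_{\substack{i=1 \\ y_i \neq y_r}}^{n} (x_i - \bar{x}) \Bigl(-\frac{L}{n}\Bigr) \Biggr]^2 \\
&\geq 0,
\end{aligned}
\]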
Let
\[
y_{i_0} := \min_{i=1,\dots,n} y_i \quad \text{and} \quad y_{i_1} := \max_{i=1,\dots,n} y_i.
\]
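Note that for each $r \in \{1, \dots, n\}$ and for all $\lambda \neq 0$, by virtue of (2.5) we have:
\[
\begin{aligned}
S(\lambda) &= \sum_{i=1}^{n} \left( \frac{y_i^{(\lambda)} - \bar{y}_\lambda}{\dot{y}^{\lambda-1}} \right)^2 = \sum_{i=1}^{n} \left( \frac{y_i^{\lambda}}{\lambda \dot{y}^{\lambda-1}} - \frac{1}{n} \sum_{j=1}^{n} \frac{y_j^{\lambda}}{\lambda \dot{y}^{\lambda-1}} \right)^2 \\
&\geq \left( \frac{y_r^{\lambda}}{\lambda \dot{y}^{\lambda-1}} - \frac{1}{n} \sum_{i=1}^{n} \frac{y_i^{\lambda}}{\lambda \dot{y}^{\lambda-1}} \right)^2 \\
&= \dot{y}^2 \left( \frac{1}{\lambda} \Bigl( \frac{y_r}{\dot{y}} \Bigr)^{\lambda} \right)^2 \left( 1 - \frac{1}{n} \sum_{i=1}^{n} \Bigl( \frac{y_i}{y_r} \Bigr)^{\lambda} \right)^2.
\end{aligned}
\tag{2.9}
\]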
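Since
\[
\frac{y_{i_1}}{\dot{y}} > 1 \quad \text{and} \quad \frac{y_{i_0}}{\dot{y}} < 1,
\]
we have
\[
\lim_{\lambda \to \infty} \left( \frac{1}{\lambda} \Bigl( \frac{y_{i_1}}{\dot{y}} \Bigr)^{\lambda} \right)^2 = \infty \quad \text{and} \quad \lim_{\lambda \to -\infty} \left( \frac{1}{\lambda} \Bigl( \frac{y_{i_0}}{\dot{y}} \Bigr)^{\lambda} \right)^2 = \infty.
\]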
Thus, according to Lemma 2.2, there exists a point $\lambda_0 \in \mathbb{R}$ such that $S(\lambda_0) = \inf_{\lambda \in \mathbb{R}} S(\lambda)$. Now, to complete the proof, note that from assertion (iii) of Lemma 2.1 it follows that $F(\lambda_0, \alpha(\lambda_0), \beta(\lambda_0)) = \inf_{(\lambda,a,b) \in \mathbb{R}^3} F(\lambda, a, b)$.
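The divergence of the factor $\bigl( \frac{1}{\lambda} q^{\lambda} \bigr)^2$ used in the proof, with $q = y_{i_1}/\dot{y} > 1$ as $\lambda \to \infty$ and $q = y_{i_0}/\dot{y} < 1$ as $\lambda \to -\infty$, reflects the fact that exponential growth dominates the factor $1/\lambda$. The Python sketch below evaluates the logarithm of this factor, since the factor itself overflows floating-point arithmetic for moderately large $|\lambda|$. The function name and the values of $q$ are hypothetical.

```python
import math

def log_growth_term(q, lam):
    """Natural log of ((1/lam) * q**lam)**2, i.e. 2 * (lam * ln q - ln|lam|)."""
    if q <= 0 or lam == 0:
        raise ValueError("requires q > 0 and lam != 0")
    return 2.0 * (lam * math.log(q) - math.log(abs(lam)))

# q > 1 with lam increasing to +infinity, and q < 1 with lam decreasing to -infinity.
upper = [log_growth_term(1.5, lam) for lam in (10.0, 100.0, 1000.0)]
lower = [log_growth_term(0.5, lam) for lam in (-10.0, -100.0, -1000.0)]
```

Both lists increase without bound along the chosen sequences because the linear term $\lambda \ln q$ eventually outweighs $\ln|\lambda|$.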
Then the NLS problem (1.2) has no solution if and only if exactly one of the sets $Y_{\xi_1}$ and $Y_{\xi_2}$ is a singleton and the other set is contained in $(0, \dot{y}]$ or in $[\dot{y}, \infty)$.
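A small numerical experiment can make this non-existence criterion concrete. The sketch below rests on several assumptions that are not stated in this excerpt: that the $x_i$ take exactly two distinct values $\xi_1 \neq \xi_2$, that $Y_{\xi} = \{y_i : x_i = \xi\}$, that $\dot{y}$ is the geometric mean of all $y_i$, and that for such data $S(\lambda)$ equals the within-group sum of squares appearing in (2.11). The data and the function name are hypothetical.

```python
import math

def profile_S_two_groups(lam, groups):
    """Within-group sum of squares of y**lam / (lam * ydot**(lam - 1)), for lam != 0."""
    ys = [y for group in groups for y in group]
    y_dot = math.exp(sum(math.log(y) for y in ys) / len(ys))  # assumed geometric mean
    total = 0.0
    for group in groups:
        # y**lam / (lam * y_dot**(lam - 1)) rewritten as y_dot * (y / y_dot)**lam / lam
        t = [y_dot * (y / y_dot) ** lam / lam for y in group]
        mean_t = sum(t) / len(t)
        total += sum((mean_t - v) ** 2 for v in t)
    return total

# Y_xi1 = {2, 3} lies in [y_dot, infinity) since y_dot = 3**(1/3), and Y_xi2 = {0.5} is a singleton.
groups = [[2.0, 3.0], [0.5]]
values = [profile_S_two_groups(lam, groups) for lam in (-5.0, -10.0, -20.0, -40.0)]
```

For this configuration the entries of `values` stay positive but shrink toward zero as $\lambda \to -\infty$, which is consistent with an infimum of $0$ that is never attained.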
\[
= \dot{y}^2 \sum_{\substack{i=1 \\ x_i = \xi_1}}^{n} \left( \frac{y_i^{\lambda}}{\lambda \dot{y}^{\lambda}} \right)^2.
\tag{2.10}
\]
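Let us show that $\inf_{(\lambda,a,b) \in \mathbb{R}^3} F(\lambda, a, b) = 0$. Indeed, if $Y_{\xi_1} \subseteq [\dot{y}, \infty)$, i.e., equivalently, if $y_i \geq \dot{y}$ for each $y_i$ such that $x_i = \xi_1$, then, by virtue of (2.10), we obtain
\[
\lim_{\lambda \to -\infty} F(\lambda, \tilde{\alpha}(\lambda), \tilde{\beta}(\lambda)) = 0,
\]
implying that $\inf_{(\lambda,a,b) \in \mathbb{R}^3} F(\lambda, a, b) = 0$. But if $Y_{\xi_1} \subseteq (0, \dot{y}]$, once again by virtue of (2.10), we also get
\[
\lim_{\lambda \to \infty} F(\lambda, \tilde{\alpha}(\lambda), \tilde{\beta}(\lambda)) = 0,
\]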
By using the well-known fact that the quadratic function $t \mapsto \sum_{i=1}^{r} (t - u_i)^2$ attains its minimum $\sum_{i=1}^{r} (\tau_1 - u_i)^2$ at the point $\tau_1 = \frac{1}{r} \sum_{i=1}^{r} u_i$, it is easy to verify that
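\[
\begin{aligned}
F(\lambda, a, b) &= \sum_{\substack{i=1 \\ x_i = \xi_1}}^{n} \left( \frac{a \xi_1 + b - y_i^{(\lambda)}}{\dot{y}^{\lambda-1}} \right)^2 + \sum_{\substack{i=1 \\ x_i = \xi_2}}^{n} \left( \frac{a \xi_2 + b - y_i^{(\lambda)}}{\dot{y}^{\lambda-1}} \right)^2 \\
&\geq \sum_{\substack{i=1 \\ x_i = \xi_1}}^{n} \Biggl( \frac{1}{N_{\xi_1}} \sum_{\substack{j=1 \\ x_j = \xi_1}}^{n} \frac{y_j^{\lambda}}{\lambda \dot{y}^{\lambda-1}} - \frac{y_i^{\lambda}}{\lambda \dot{y}^{\lambda-1}} \Biggr)^2 \\
&\quad + \sum_{\substack{i=1 \\ x_i = \xi_2}}^{n} \Biggl( \frac{1}{N_{\xi_2}} \sum_{\substack{j=1 \\ x_j = \xi_2}}^{n} \frac{y_j^{\lambda}}{\lambda \dot{y}^{\lambda-1}} - \frac{y_i^{\lambda}}{\lambda \dot{y}^{\lambda-1}} \Biggr)^2 \\
&= F(\lambda, a_0, b_0)
\end{aligned}
\tag{2.11}
\]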
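\[
a_0 := \frac{\dfrac{1}{N_{\xi_2}} \displaystyle\sum_{\substack{j=1 \\ x_j = \xi_2}}^{n} y_j^{(\lambda)} - \dfrac{1}{N_{\xi_1}} \displaystyle\sum_{\substack{j=1 \\ x_j = \xi_1}}^{n} y_j^{(\lambda)}}{\xi_2 - \xi_1}
\quad \text{and} \quad
b_0 := \frac{1}{N_{\xi_1}} \sum_{\substack{j=1 \\ x_j = \xi_1}}^{n} y_j^{(\lambda)} - a_0 \xi_1.
\]
\[
F(\lambda, a_0, b_0) = S(\lambda).
\]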
Let
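Without loss of generality, assume that $y_{\min} \in Y_{\xi_1}$ (the case $y_{\min} \in Y_{\xi_2}$ can be handled in a similar way). Then from (2.11) it easily follows that
\[
\begin{aligned}
S(\lambda) &= F(\lambda, a_0, b_0) \\
&\geq \Biggl( \frac{1}{N_{\xi_1}} \sum_{\substack{j=1 \\ x_j = \xi_1}}^{n} \frac{y_j^{\lambda}}{\lambda \dot{y}^{\lambda-1}} - \frac{y_{\min}^{\lambda}}{\lambda \dot{y}^{\lambda-1}} \Biggr)^2 \\
&= \left( \frac{y_{\min}^{\lambda}}{\lambda \dot{y}^{\lambda-1}} \right)^2 \Biggl( 1 - \frac{1}{N_{\xi_1}} \sum_{\substack{j=1 \\ x_j = \xi_1}}^{n} \Bigl( \frac{y_j}{y_{\min}} \Bigr)^{\lambda} \Biggr)^2.
\end{aligned}
\tag{2.12}
\]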
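Without loss of generality, we may also assume that $y_{\max} \in Y_{\xi_1}$ (the case $y_{\max} \in Y_{\xi_2}$ can be handled in a similar way). Once again, arguing as above, by virtue of (2.11), we get
\[
S(\lambda) \geq \left( \frac{y_{\max}^{\lambda}}{\lambda \dot{y}^{\lambda-1}} \right)^2 \Biggl( 1 - \frac{1}{N_{\xi_1}} \sum_{\substack{j=1 \\ x_j = \xi_1}}^{n} \Bigl( \frac{y_j}{y_{\max}} \Bigr)^{\lambda} \Biggr)^2,
\tag{2.13}
\]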
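Since
\[
\frac{y_{\min}}{\dot{y}} < 1 \quad \text{and} \quad \frac{y_{\max}}{\dot{y}} > 1,
\]
we have that
\[
\lim_{\lambda \to -\infty} \left( \frac{y_{\min}^{\lambda}}{\lambda \dot{y}^{\lambda-1}} \right)^2 = \infty \quad \text{and} \quad \lim_{\lambda \to \infty} \left( \frac{y_{\max}^{\lambda}}{\lambda \dot{y}^{\lambda-1}} \right)^2 = \infty,
\]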
Thus, by Lemma 2.2, there exists a point $\lambda_0 \in \mathbb{R}$ such that $S(\lambda_0) = \inf_{\lambda \in \mathbb{R}} S(\lambda)$. Therefore, from assertion (iii) of Lemma 2.1 it follows that $\inf_{(\lambda,a,b) \in \mathbb{R}^3} F(\lambda, a, b) = F(\lambda_0, \alpha(\lambda_0), \beta(\lambda_0))$, contradicting the assumption that problem (1.2) has no solution.
Step 3. Since $n \geq 3$, without loss of generality, we assume that $|Y_{\xi_2}| = 1$ and $|Y_{\xi_1}| \geq 2$. To complete the proof, it remains to show that $Y_{\xi_1} \subseteq [\dot{y}, \infty)$ or $Y_{\xi_1} \subseteq (0, \dot{y}]$. Suppose to the contrary that
\[
y_p := \min_{y_i \in Y_{\xi_1}} y_i < \dot{y} < \max_{y_i \in Y_{\xi_1}} y_i =: y_q.
\]
References
[1] A.C. Atkinson, M. Riani, A. Corbellini, The Box-Cox Transformation: Review and
Extensions, Stat. Sci. 36 (2021), 239–255.
[2] G.E.P. Box, D.R. Cox, An analysis of transformations, J. Roy. Statist. Soc. Ser. B 26
(1964), 211–252.
[3] R.J. Carroll, D. Ruppert, Transformation and Weighting in Regression, Chapman &
Hall, New York, 1988.
[4] E. Demidenko, Criteria for global minimum of sum of squares in nonlinear regression,
Comput. Statist. Data Anal. 51 (2006), 1739–1753.
[5] E. Demidenko, Criteria for unconstrained global optimization, J. Optim. Theory Appl. 136 (2008), 375–395.
[6] J.E. Dennis, R.B. Schnabel, Numerical Methods for Unconstrained Optimization and
Nonlinear Equations, SIAM, Philadelphia, 1996.
[7] N.R. Draper, D.R. Cox, On distributions and their transformation to normality,
J. Roy. Statist. Soc. Ser. B 31 (1969), 472–476.
[8] P.E. Gill, W. Murray, M.H. Wright, Practical Optimization, Academic Press, London,
1981.
[9] D. Jukić, A necessary and sufficient criterion for the existence of the global minima of a continuous lower bounded function on a noncompact set, J. Comput. Appl. Math. 375 (2020), 112791.
[10] Y. Nievergelt, On the existence of best Mitscherlich, Verhulst, and West growth curves
for generalized least-squares regression, J. Comput. Appl. Math. 248 (2013), 31–46.
[11] J.W. Tukey, On the comparative anatomy of transformations, Ann. Math. Statist. 28
(1957), 602–632.
Darija Marković