Introduction To Clusterwise Regression
By
Eman Ismail
Introduction
Multiple regression is widely used in many fields.
Problem
DeSarbo and Cron (1988) noted that in many applications the estimation of a single regression line is inadequate or even misleading.
➢ They illustrated this with a synthetic data set, shown in the figure below.
[Figure 3: Heterogeneous data with a combined R² = 1.]
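The original figure is not reproduced here, but the phenomenon is easy to recreate. The sketch below (with assumed numbers, not DeSarbo and Cron's data) builds two segments that each lie exactly on their own regression line (R² = 1 within each segment), while a single pooled line fits poorly:

```python
import numpy as np

# Two hypothetical segments (illustrative numbers, not the paper's data):
# segment I lies exactly on y = 1 + 2x, segment II exactly on y = 8 - 2x.
x = np.linspace(0.0, 5.0, 20)
y1 = 1 + 2 * x
y2 = 8 - 2 * x
x_all = np.concatenate([x, x])
y_all = np.concatenate([y1, y2])

# Within segment I, a least-squares line fits perfectly (R^2 = 1).
X1 = np.column_stack([np.ones_like(x), x])
b1, *_ = np.linalg.lstsq(X1, y1, rcond=None)
r2_seg = 1 - ((y1 - X1 @ b1) ** 2).sum() / ((y1 - y1.mean()) ** 2).sum()

# A single pooled line over both segments explains almost nothing.
X = np.column_stack([np.ones_like(x_all), x_all])
b, *_ = np.linalg.lstsq(X, y_all, rcond=None)
r2_pooled = 1 - ((y_all - X @ b) ** 2).sum() / ((y_all - y_all.mean()) ** 2).sum()
```

Here the two slopes cancel, so the pooled line is nearly flat and its R² is near zero even though each segment is perfectly linear: exactly the situation where a single regression line misleads.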
Solution
Find the values of C_{i1}, C_{i2}, α_0, β_0, α_j, β_j, ε_{i1} and ε_{i2}, i = 1, …, n, j = 1, …, J, which:

minimize Σ_{i=1}^{n} (C_{i1} ε_{i1}² + C_{i2} ε_{i2}²)   (4)

subject to

y_i = α_0 + Σ_{j=1}^{J} α_j x_{ij} + ε_{i1},  i = 1, …, n,   (5)

y_i = β_0 + Σ_{j=1}^{J} β_j x_{ij} + ε_{i2},  i = 1, …, n,   (6)

C_{i1} + C_{i2} = 1,  i = 1, …, n,   (7)

C_{i1}, C_{i2} ≥ 0;  ε_{i1}, ε_{i2}, α_0, α_j, β_0, β_j unrestricted,  i = 1, …, n, j = 1, …, J.   (8)
Model 1: NLP model
This model is defined by n observations, J explanatory variables x_{ij}, and one response variable y_i.
The indices are j ∈ {1, …, J} for the explanatory variables and i ∈ {1, …, n} for the observations.
Also, this model assumes that a sample of n observations is divided into two
mutually exclusive segments or clusters (I, II).
The model decision variables are:
➢ the regression coefficients of cluster I and cluster II: α_0, α_j and β_0, β_j, respectively;
➢ the deviations of observation i from the regression lines of cluster I and cluster II: ε_{i1} and ε_{i2}, respectively;
➢ binary decision variables indicating whether observation i belongs to cluster I or cluster II: C_{i1} and C_{i2}, respectively.
If observation i belongs to cluster I then C_{i1} = 1; otherwise C_{i1} = 0. Similarly, C_{i2} = 1 if observation i belongs to cluster II and C_{i2} = 0 otherwise.
The objective function (4) minimizes the total sum of squared errors.
Constraints (5) and (6) define and estimate the regression functions of cluster I and cluster II, respectively.
Constraint (7) ensures that each observation i is a member of either cluster I or cluster II, but not both.
We do not restrict C_{i1} and C_{i2} to be binary. The optimization with constraint (7) and C_{i1}, C_{i2} ≥ 0 will force them to be either zero or one: for each i the objective is linear in (C_{i1}, C_{i2}), so at the optimum all weight goes to the cluster with the smaller squared error.
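As a sketch of how Model 1's memberships and coefficients interact, here is a simple alternating heuristic in Python (my own illustration, not the slides' NLP solution method): fix the assignments, fit each cluster's line by least squares, then reassign each observation to the line with the smaller squared error, mirroring the logic that drives C_{i1}, C_{i2} to 0 or 1.

```python
import numpy as np

def fit_two_lines(X, y, n_iter=50, seed=0):
    """Alternating heuristic: fit a line per cluster, then move each point
    to the line with the smaller squared error (a local search, not the
    NLP solver the slides' model would be handed to)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    assign = rng.integers(0, 2, n)            # random initial C_i1 in {0, 1}
    for _ in range(n_iter):
        betas = []
        for k in (0, 1):
            mask = assign == k
            if mask.sum() < X.shape[1]:       # guard: cluster too small to fit
                return None, np.inf, assign
            b, *_ = np.linalg.lstsq(X[mask], y[mask], rcond=None)
            betas.append(b)
        sq = np.column_stack([(y - X @ b) ** 2 for b in betas])
        new_assign = sq.argmin(axis=1)        # reassignment step
        if np.array_equal(new_assign, assign):
            break                             # converged
        assign = new_assign
    sse = sq[np.arange(n), assign].sum()
    return betas, sse, assign

# Hypothetical two-cluster data (values are illustrative assumptions).
rng = np.random.default_rng(42)
x = rng.uniform(0, 5, 120)
z = rng.integers(0, 2, 120)
y = np.where(z == 0, 1 + 2 * x, 8 - 2 * x) + rng.normal(0, 0.2, 120)
X = np.column_stack([np.ones_like(x), x])

# Baseline: one regression line for everything.
b_single, *_ = np.linalg.lstsq(X, y, rcond=None)
sse_single = ((y - X @ b_single) ** 2).sum()

# Several random restarts; keep the partition with the smallest total SSE.
betas, sse, assign = min((fit_two_lines(X, y, seed=s) for s in range(10)),
                         key=lambda r: r[1])
```

Like any local search, a single run can stall in a poor partition, hence the random restarts; the best two-line solution should beat the single pooled line by a wide margin on data like this.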
Model 2: NLP model
If the distribution of the error term is not known, most researchers appeal to robustness by minimizing the sum of absolute errors instead of the sum of squared errors in (4).
Find the values of C_{i1}, C_{i2}, α_0, β_0, α_j, β_j, ε⁺_{i1}, ε⁻_{i1}, ε⁺_{i2} and ε⁻_{i2}, i = 1, …, n, j = 1, …, J, which:

minimize Σ_{i=1}^{n} (C_{i1}(ε⁺_{i1} + ε⁻_{i1}) + C_{i2}(ε⁺_{i2} + ε⁻_{i2}))   (9)

subject to

y_i = α_0 + Σ_{j=1}^{J} α_j x_{ij} + ε⁺_{i1} − ε⁻_{i1},  i = 1, …, n,   (10)

y_i = β_0 + Σ_{j=1}^{J} β_j x_{ij} + ε⁺_{i2} − ε⁻_{i2},  i = 1, …, n,   (11)

C_{i1} + C_{i2} = 1,  i = 1, …, n,   (12)

C_{i1}, C_{i2}, ε⁺_{i1}, ε⁻_{i1}, ε⁺_{i2}, ε⁻_{i2} ≥ 0;  α_0, β_0, α_j, β_j unrestricted,  i = 1, …, n, j = 1, …, J.   (13)
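Splitting each deviation into nonnegative parts ε⁺ and ε⁻, as in (10) and (11), is the standard linear-programming device for absolute errors: at the optimum at most one of the pair is nonzero, so their sum equals |ε|. A minimal sketch (assuming SciPy is available; data and names are illustrative) fits a single cluster's least-absolute-deviations line this way:

```python
import numpy as np
from scipy.optimize import linprog

# Illustrative one-cluster data (assumed values): y = 2 + 3x + Laplace noise.
rng = np.random.default_rng(0)
n = 40
X = np.column_stack([np.ones(n), rng.uniform(0, 10, n)])
y = X @ np.array([2.0, 3.0]) + rng.laplace(0.0, 0.5, n)

p = X.shape[1]
# Variables: [beta (p, free), e_plus (n, >= 0), e_minus (n, >= 0)].
# Equality constraints in the style of (10): X beta + e_plus - e_minus = y.
c = np.concatenate([np.zeros(p), np.ones(2 * n)])   # minimize sum(e+ + e-)
A_eq = np.hstack([X, np.eye(n), -np.eye(n)])
bounds = [(None, None)] * p + [(0, None)] * (2 * n)
res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds)
beta_lad = res.x[:p]                                # LAD coefficient estimates
```

With the memberships C_{i1}, C_{i2} fixed, Model 2 decomposes into two such linear programs, one per cluster; it is the joint choice of memberships and coefficients that makes the full problem nonlinear.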