Linear Regression Models, Analysis, and Applications

Regression analysis

Uploaded by

Werkson Santana

MATHEMATICS RESEARCH DEVELOPMENTS

LINEAR REGRESSION

MODELS, ANALYSIS
AND APPLICATIONS

No part of this digital document may be reproduced, stored in a retrieval system or transmitted in any form or
by any means. The publisher has taken reasonable care in the preparation of this digital document, but makes no
expressed or implied warranty of any kind and assumes no responsibility for any errors or omissions. No
liability is assumed for incidental or consequential damages in connection with or arising out of information
contained herein. This digital document is sold with the clear understanding that the publisher is not engaged in
rendering legal, medical or any other professional services.

VERA L. BECK
EDITOR
Copyright © 2017 by Nova Science Publishers, Inc.

All rights reserved. No part of this book may be reproduced, stored in a retrieval system or transmitted
in any form or by any means: electronic, electrostatic, magnetic, tape, mechanical photocopying,
recording or otherwise without the written permission of the Publisher.

We have partnered with Copyright Clearance Center to make it easy for you to obtain permissions to
reuse content from this publication. Simply navigate to this publication’s page on Nova’s website and
locate the “Get Permission” button below the title description. This button is linked directly to the
title’s permission page on copyright.com. Alternatively, you can visit copyright.com and search by
title, ISBN, or ISSN.

For further questions about using the service on copyright.com, please contact:
Copyright Clearance Center
Phone: +1-(978) 750-8400 Fax: +1-(978) 750-4470 E-mail: [email protected].

NOTICE TO THE READER

The Publisher has taken reasonable care in the preparation of this book, but makes no expressed or
implied warranty of any kind and assumes no responsibility for any errors or omissions. No liability is
assumed for incidental or consequential damages in connection with or arising out of information
contained in this book. The Publisher shall not be liable for any special, consequential, or exemplary
damages resulting, in whole or in part, from the readers’ use of, or reliance upon, this material. Any
parts of this book based on government reports are so indicated and copyright is claimed for those parts
to the extent applicable to compilations of such works.

Independent verification should be sought for any data, advice or recommendations contained in this
book. In addition, no responsibility is assumed by the publisher for any injury and/or damage to
persons or property arising from any methods, products, instructions, ideas or otherwise contained in
this publication.

This publication is designed to provide accurate and authoritative information with regard to the subject
matter covered herein. It is sold with the clear understanding that the Publisher is not engaged in
rendering legal or any other professional services. If legal or any other expert assistance is required, the
services of a competent person should be sought. FROM A DECLARATION OF PARTICIPANTS
JOINTLY ADOPTED BY A COMMITTEE OF THE AMERICAN BAR ASSOCIATION AND A
COMMITTEE OF PUBLISHERS.

Additional color graphics may be available in the e-book version of this book.

Library of Congress Cataloging-in-Publication Data

ISBN: H%RRN

Published by Nova Science Publishers, Inc. † New York

CONTENTS

Preface
Chapter 1. Weighting and Transforming Data in Linear Regression
    Julia Martín, Alberto Romero Gracia and Agustín G. Asuero
Chapter 2. Regression through the Origin
    Julia Martín and Agustín G. Asuero
Chapter 3. Linear Regression for Interval-Valued Data in Kc(R)
    Yan Sun and Chunyang Li
Chapter 4. Linear Regression versus Non-Linear Regression in Mathematical Modeling of Adsorption Processes
    Gabriela-Nicoleta Moroi
Index

PREFACE

Chapter One addresses the importance of weighted linear regression in
fitting straight lines. In Chapter Two, the authors cover the homoscedastic
condition (variance of the y values independent of x), the case of cumulative
errors in y, the heteroscedastic case (variance or standard deviation
proportional to the x values), and orthogonal regression (error in both axes).
The chapter also covers topics such as prediction (using the regression line in
reverse), leverage, goodness of fit, comparison between models with and
without intercept, uncertainty, polynomial regression models without intercept,
and an overview of robust regression through the origin. Chapter Three
focuses on linear regression for interval-valued data within the framework of
random sets, and proposes a new model that generalizes a series of existing
ones. Chapter Four investigates the modeling of the adsorption of
heavy metal ions onto surface-functionalized polymer beads. Linear and
non-linear regressions were employed for each of the isotherm models considered
to describe the equilibrium data. To reliably assess model validity, various
error functions (whose mathematical expressions contain the number of
experimental measurements, the numbers of independent variables and
parameters in the regression equation as well as the measured and predicted
equilibrium adsorption capacities) were used.
Chapter 1 - Improper parameter estimates are obtained when non-constant
variance (heteroscedasticity) is ignored. For this reason, the importance of
weighted linear regression in fitting straight lines is stressed in this chapter. A
number of issues are addressed concerning random error, noise and
variance modelling when precision varies as the values of x (e.g.,
concentration) increase. Data transformation and weighted least
squares regression are the two main solutions to the heteroscedasticity
problem. Non-linear terms may be introduced into the framework of linear
regression by transforming variables. Fitting is improved in this way, and
necessary assumptions of the least squares method, such as
homoscedasticity (constant variance), are thus satisfied. The following topics
concerning transformations are covered in this context: reasons for carrying
them out, simplification of relationships, model linearization, variance
stabilization and weighting of transformed data. The Box-Cox transformation
also receives particular attention. Applications (weighting, transformation and
Box-Cox method) from a variety of fields (analytical, biochemical, clinical,
environmental and pharmaceutical) are summarized in tabular form. The
chapter is based on two previous reviews published by the authors in Critical
Reviews in Analytical Chemistry (2007, 37(3), 143-172 and 2011, 41(1), 36-69).
Chapter 2 - Regression through the origin, a very interesting topic, has
received little attention in the literature. This model is also
known as the no-intercept model. It is applied when subject-matter theory
requires it or when other physical and material considerations must be
taken into account. An intensive bibliographical search has been carried out
with the purpose of gathering the literature on the subject, which is widely
scattered. About one hundred and thirty references have been
compiled, comprising about twenty monographs and fifty scientific journals,
from varying fields, e.g., analytical, biological, clinical, chemometrical,
educational, environmental, pharmaceutical, physico-chemical, and statistical.
The authors deal systematically with the homoscedastic condition (variance
of the y values independent of x), the case of cumulative errors in y, the
heteroscedastic case (variance or standard deviation proportional to the x
values), and orthogonal regression (error in both axes). The chapter also covers topics
such as prediction (using the regression line in reverse), leverage, goodness of
fit, comparison between models with and without intercept, uncertainty,
polynomial regression models without intercept, and an overview of robust
regression through the origin.
Chapter 3 - In recent scientific research, data are increasingly taking
on new formats such as sets, lists, and histograms. Among these, a particular
type that is frequently encountered is interval-valued data, which refers to
a collection of observations in the form of intervals. Examples are daily [min,
max] temperature, spatial [low, high] elevation, and the range of a group of
individual observations, among many others. Linear regression, as a
fundamental tool of statistical analysis, has been increasingly investigated for
extensions to accommodate interval-valued data. Various models and methods
have been proposed and studied in the last decades. However, issues such as
interpretability and computational feasibility still remain. In particular, a
commonly accepted mathematical foundation is largely underdeveloped
compared to the demand of applications. In this chapter, the authors focus on
linear regression for interval-valued data within the framework of random sets,
and propose a new model that generalizes a series of existing ones. With
this model, the authors continue to build up a theoretical
framework that gives a deeper understanding of the existing models and
facilitates future developments. In particular, the authors establish important
properties of the model in the space of compact convex subsets of R, analogous
to those for classical linear regression. Additionally, the authors carry out theoretical
investigations into the least squares estimation that is widely used in the
literature. It is shown that the least squares estimator is asymptotically
unbiased. A simulation study is presented that supports the authors’ theorems,
and an application to a climate data set is demonstrated.
Chapter 4 - In mathematical modeling of adsorption processes, linear
and/or non-linear regression analysis may be employed. In adsorption isotherm
modeling, non-linear regression has lately been reported by some authors to
provide a better fit to experimental data than linear regression. Isotherm
models used in describing the adsorption systems, criteria selected to evaluate
isotherm model validity, and modeling results are comparatively
discussed. In the authors’ investigation into the modeling of the adsorption of heavy
metal ions onto surface-functionalized polymer beads, linear and non-linear
regressions were employed for each of the isotherm models considered to
describe the equilibrium data. To reliably assess model validity, various error
functions (whose mathematical expressions contain the number of
experimental measurements, the numbers of independent variables and
parameters in the regression equation as well as the measured and predicted
equilibrium adsorption capacities) were used. The modeling results obtained
by employing the two regression methods were compared. For the adsorption
of each metal ion species, it was revealed that (a) for a particular isotherm
model, the regression providing the best fit is linear, non-linear or both linear
and non-linear, and (b) the order of isotherm model validities indicated by
linear regression is the same as that shown by non-linear regression.
In: Linear Regression ISBN: 978-1-53611-992-3
Editor: Vera L. Beck © 2017 Nova Science Publishers, Inc.

Chapter 1

WEIGHTING AND TRANSFORMING DATA
IN LINEAR REGRESSION

Julia Martín, Alberto Romero Gracia
and Agustín G. Asuero*
Department of Analytical Chemistry, Faculty of Pharmacy,
The University of Seville, Seville, Spain

ABSTRACT

Improper parameter estimates are obtained when non-constant
variance (heteroscedasticity) is ignored. For this reason, the importance of
weighted linear regression in fitting straight lines is stressed in this
chapter. A number of issues are addressed concerning random error,
noise and variance modelling when precision varies as the values of x
(e.g., concentration) increase. Data transformation and
weighted least squares regression are the two main solutions to the
heteroscedasticity problem. Non-linear terms may be introduced into the
framework of linear regression by transforming variables. Fitting is improved
in this way, and necessary assumptions of the least squares method,
such as homoscedasticity (constant variance), are thus satisfied. The
following topics concerning transformations are covered in this context:
reasons for carrying them out, simplification of relationships, model
linearization, variance stabilization and weighting of transformed data. The
Box-Cox transformation also receives particular attention.
Applications (weighting, transformation and Box-Cox method) from a
variety of fields (analytical, biochemical, clinical, environmental and
pharmaceutical) are summarized in tabular form. The chapter is based on
two previous reviews published by the authors in Critical Reviews in
Analytical Chemistry (2007, 37(3), 143-172 and 2011, 41(1), 36-69).

* Corresponding author: Agustín G. Asuero, Department of Analytical Chemistry,
Faculty of Pharmacy, University of Seville, Seville, Spain.

Keywords: least squares method, weighting, transforming data

INTRODUCTION

Simple linear regression assumes the homoscedasticity property
(constant or uniform variance) and is widely applied in the natural and physical
sciences (Asnin, 2016; Olivieri, 2015; Lavagnini and Magno, 2007;
Sayago et al., 2004; de Levie, 2000; Asuero and González, 1989; Meites,
1979). Plots of residuals against fitted values or against x values allow
the equal-variance assumption to be checked, though it is much better to have
replicates (Sayago et al., 2004). Residuals are the differences, in the
y-direction, between the experimental points and the corresponding fitted
values; the least squares fit minimizes the sum of their squares. A complete
regression diagnosis requires a thorough examination of the residuals.
Residuals from a correctly fitted model should confirm the
assumptions inherent in a regression analysis, or at least fail to contradict them
(Meloun and Militký, 2011; Bates and Watts, 2007; Asuero et al., 2006;
Belloto and Sokolovski, 1995; Phillips et al., 1990; Ellis and Duggleby,
1978). Residuals should be randomly distributed (with equal numbers of
plus and minus signs) when the variables are related (Miller and Miller,
2010) by a linear relationship (with the error symmetrically
distributed).
A plot of residuals allows checking for systematic deviations between
data and model, for example: i) a curvilinear pattern (a higher-order term is
needed to accommodate curvature), ii) a systematic descending or ascending
linear trend (additional terms are needed), iii) a fan-shaped residual pattern
(the constant-variance assumption is inappropriate), or iv) a trend in time
order (time effect). Regression models are used in assay development (Aarons et
al., 1987), in enzyme kinetics and pharmacokinetics, calibration,
recovery studies and method comparison, and many other pharmaceutical,
biological, and chemical applications (Asuero and Bueno, 2011; Davidian,
1990). The assumption of variance homogeneity (homoscedasticity) in
describing a relationship between a dependent (response) variable Y and an
independent (predictor) variable x usually does not hold (Tellinghuisen,
2009b; Tellinghuisen, 2007; Asuero and González, 2007); irregular or
heterogeneous variance (the heteroscedastic condition) is evident instead.
Mass, substrate concentration, or temperature may be the
predictors (Davidian, 1990). Reaction rate, radioactive count, peak area or
another physical property are examples of single responses. Performing a
weighted least squares regression analysis or transforming the data (Asuero and
Bueno, 2011; Asuero and González, 2007) are the two main solutions to the
heteroscedasticity problem.
In chemical analysis, non-linear calibration curves are sometimes
apparent in techniques such as liquid chromatography-mass spectrometry
(matrix-related non-linearity effects) or atomic absorption
spectrophotometry (Asnin, 2016; Mermet, 2010). In fact, in most real
problems the response function departs from linearity as
concentration values become large (end of the calibration curve).
Carrying out a transformation of one or both variables is a means of
simplifying a non-linear relationship. Keeping the model as simple as
possible (the minimum number of parameters that fits the data at hand) is the
best choice, in agreement with Occam’s razor: “Non sunt multiplicanda entia
praeter necessitatem” (Bates and Watts, 2007; Garfinkel and Fegley, 1984).
Transformations to stabilize variance and to achieve normality often go
hand in hand, and it often happens that both assumptions are almost
satisfied after carrying out an appropriate transformation. In any case, as
stated by Acton (1959), “the gods who favour statisticians have frequently
ordained that the world be well behaved, and so we often find that a
transformation to obtain one of these desiderata in fact achieves them all
(well, almost achieve them).” Note that in fitting no model is
perfect; some models simply suit better than others. The aim of this
contribution is to underline the significance of weighting and transforming
data when fitting straight-line models to data. Some selected examples from a
variety of fields are taken from the literature and shown in tabular form in
order to illustrate this book chapter.

WEIGHTING DATA

The underlying idea in weighted least squares is to give the greatest
importance to the most precise data. The weighted least squares
procedure entails minimizing the weighted sum of squared residuals
(Sayago et al., 2004). The benefit obtained by applying the weighted least
squares procedure is greater the greater the departure from homoscedasticity
(Zorn et al., 1977). Table 1 compiles some formulas that allow calculating
statistics for weighted linear regression (WLR) in those cases in which data
are replicated. The number of replicates required by weighted least squares
is greater than that required by ordinary least squares. A number of
factors, such as the cost of calibration, standards and reagents, or the time
required to perform the measurements, make it difficult in practice to obtain
the level of replication required. However, unequal weights can also be
estimated without performing replicates, as we will see later.
In the weighted least squares procedure each observation is characterized
by a weighting value $w_i$ (a measure of the information it contains)
proportional (Deming, 1943) to the inverse of the variance of $y_i$.
By definition (Deming, 1943)

$$w_f = \frac{\sigma_0^2}{\sigma_f^2} \qquad (1)$$

where $\sigma_0^2$, the variance of a function of unit weight (Connors, 1987), is a
proportionality factor. Let f be the mean $\bar{y}_i$ of $n_i$ observations
$y_{i1}, y_{i2}, \ldots, y_{in_i}$ (random variates) from a population of standard
deviation $\sigma_0$. Then we get

$$\sigma_{\bar{y}_i}^2 = \frac{\sigma_0^2}{n_i} \qquad (2)$$

for the variance of $\bar{y}_i$, its weighting factor being, according to Eqn. (1),

$$w_{\bar{y}_i} = \frac{\sigma_0^2}{\sigma_{\bar{y}_i}^2} = \frac{\sigma_0^2}{\sigma_0^2 / n_i} = n_i \qquad (3)$$

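Table 1. Formulae for calculating statistics for weighted linear
regression with replicated data (Asuero and González, 2007)

| Quantity | Formula |
|---|---|
| Equation | $\hat{y}_i = a_0 + a_1 x_i$ |
| Mean responses | $\bar{y}_i = \sum_v y_{iv} / n_i$ |
| Weighted residuals | $w_i^{1/2} (\bar{y}_i - \hat{y}_i)$ |
| Means | $\bar{x} = \sum w_i x_i / \sum w_i$; $\bar{y} = \sum w_i \bar{y}_i / \sum w_i$ |
| Sums of squares about the mean | $S_{XX} = \sum w_i (x_i - \bar{x})^2$; $S_{YY} = \sum w_i (\bar{y}_i - \bar{y})^2$; $S_{XY} = \sum w_i (x_i - \bar{x})(\bar{y}_i - \bar{y})$ |
| Slope | $a_1 = S_{XY} / S_{XX}$ |
| Intercept | $a_0 = \bar{y} - a_1 \bar{x}$ |
| Residual sum of squares | $SSE = \sum w_i (\bar{y}_i - \hat{y}_i)^2$ |
| Correlation coefficient | $r = S_{XY} / \sqrt{S_{XX} S_{YY}}$ |
| Standard errors | $s_{y/x}^2 = \frac{SSE}{n-2} = \frac{S_{YY} - a_1^2 S_{XX}}{n-2}$; $s_{a_0}^2 = s_{y/x}^2 \frac{\sum w_i x_i^2}{S_{XX} \sum w_i}$; $s_{a_1}^2 = s_{y/x}^2 / S_{XX}$; $\mathrm{cov}(a_0, a_1) = -\bar{x}\, s_{y/x}^2 / S_{XX}$ |

Here $w_i$ denotes the weight $w_{\bar{y}_i}$ of the mean response $\bar{y}_i$.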
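
The formulas of Table 1 can be sketched in a few lines of Python. This is only an illustrative sketch, assuming NumPy is available; the function name `weighted_line_fit` and the example data are hypothetical and do not come from the chapter.

```python
import numpy as np

def weighted_line_fit(x, y, w):
    """Weighted straight-line fit y = a0 + a1*x following the Table 1 formulas.

    x: predictor values, y: (mean) responses, w: weights of the responses.
    """
    x, y, w = (np.asarray(v, dtype=float) for v in (x, y, w))
    n = x.size
    x_bar = np.sum(w * x) / np.sum(w)             # weighted mean of x
    y_bar = np.sum(w * y) / np.sum(w)             # weighted mean of y
    s_xx = np.sum(w * (x - x_bar) ** 2)           # weighted sums of squares
    s_yy = np.sum(w * (y - y_bar) ** 2)
    s_xy = np.sum(w * (x - x_bar) * (y - y_bar))
    a1 = s_xy / s_xx                              # slope
    a0 = y_bar - a1 * x_bar                       # intercept
    sse = np.sum(w * (y - (a0 + a1 * x)) ** 2)    # weighted residual sum of squares
    s2_yx = sse / (n - 2)                         # regression variance
    return {
        "a0": a0,
        "a1": a1,
        "r": s_xy / np.sqrt(s_xx * s_yy),
        "s2_yx": s2_yx,
        "var_a0": s2_yx * np.sum(w * x ** 2) / (s_xx * np.sum(w)),
        "var_a1": s2_yx / s_xx,
        "cov_a0_a1": -x_bar * s2_yx / s_xx,
    }

# Hypothetical calibration data with weights 1/x^2 (illustration only).
x = np.array([1.0, 2.0, 4.0, 8.0, 16.0])
y = np.array([2.1, 3.9, 8.2, 15.8, 32.5])
w = 1.0 / x ** 2
fit = weighted_line_fit(x, y, w)
print(fit)
```

The slope and intercept can be cross-checked against `np.polyfit(x, y, 1, w=np.sqrt(w))`, since `np.polyfit` multiplies each residual by its weight before squaring.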
For the weights of the k means we get $w_1 = n_1$, $w_2 = n_2$, ..., $w_k = n_k$ (as
single observations have unit weight). The weights thus depend on the
arbitrary factor $\sigma_0^2$ (they are relative, not absolute, values).

If each of the $n_i$ original observations had weight $w_i$ (instead of unity),
the variance of a single observation would be $\sigma_0^2 / w_i$, and in this case

$$w_{\bar{y}_i} = \frac{\sigma_0^2}{\sigma_0^2 / (n_i w_i)} = n_i w_i \qquad (4)$$

$$\sigma_{\bar{y}_i}^2 = \frac{\sigma_0^2}{n_i w_i} \qquad (5)$$

The weight of the mean $\bar{y}_i$ is, as before, just $n_i$ times the weight of a
single observation.

Now let $\bar{y}_i$ be the mean value of $n_i$ observations taken from a population
of standard deviation $\sigma_i$, so that the precision of a single observation
varies from point to point. The variance of $\bar{y}_i$ is then $\sigma_i^2 / n_i$,
and we get in this case

$$w_{\bar{y}_i} = \frac{\sigma_0^2}{\sigma_i^2 / n_i} = \frac{\sigma_0^2}{\sigma_i^2}\, n_i \qquad (6)$$

Thus, weighting factors must consider both the number of replicates
and the variance of each given point.
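
As a minimal sketch of Eqn. (6), the weights of the mean responses can be estimated from replicate measurements by replacing each $\sigma_i^2$ with the sample variance of the replicates. The function name `replicate_weights`, the default $\sigma_0^2 = 1$ and the example values are assumptions made for illustration.

```python
import numpy as np

def replicate_weights(replicates, sigma0_sq=1.0):
    """Return the mean responses and their weights w = sigma0^2 * n_i / s_i^2.

    Each group needs at least two replicates with non-zero spread,
    otherwise the sample variance s_i^2 is undefined or zero.
    """
    means, weights = [], []
    for group in replicates:
        g = np.asarray(group, dtype=float)
        n_i = g.size
        s2_i = g.var(ddof=1)                      # sample estimate of sigma_i^2
        means.append(g.mean())
        weights.append(sigma0_sq * n_i / s2_i)
    return np.array(means), np.array(weights)

# Hypothetical replicate responses at two calibration levels.
replicates = [[1.0, 1.2, 0.8], [5.1, 4.9, 5.3, 4.7]]
means, weights = replicate_weights(replicates)
print(means, weights)
```

Because the weights are relative, any common factor (here `sigma0_sq`) leaves the fitted line unchanged.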
The influence of the weighting procedure on parameter estimation
depends on the nature of the experimental data set. Note that wrong results
may be obtained if the experimental points are not properly weighted or the
weights are incorrectly calculated.
When dealing with enzyme kinetic data, the role of intuition may be of
vital importance (Reigh et al., 1972).
Several kinds of weighting factors may be envisaged (Asuero and
Bueno, 2011; Asuero and González, 2007; Chow and Liu, 1995; Asuero
and González, 1989; Connors, 1987; Jurs, 1986; Meites, 1979), according
to the characteristics of a given data set:

(a) Absolute Weights. Equal weighting factors are assumed for all the
points; i.e., $w_i = 1$.
(b) Statistical Weights. Replication of each calibration data point is
required to estimate the reciprocal of the variance, which prevents its
application in routine practice (Mullins, 2003). For this reason, empirical
weights based on the x-variable (i.e., concentration) or the y-variable (i.e.,
response) may be used as approximations, i.e., weights such as $1/x^{0.5}$, $1/x$,
$1/x^2$, $1/y^{0.5}$, $1/y^2$ (Almeida et al., 2003). In those cases in which the
variance of the residuals decreases with x, we may also apply:

$$w_i = \frac{1}{x_{\max} - x_i} \qquad (7)$$

(c) Instrumental Weights. Making a small number of replicates allows
individual weights to be assigned to data points (Asuero and González, 1989;
de Levie, 2001; de Levie, 1986). However, we may assume a functional
relationship between the variance and the predictor (independent) variable
when there are not enough replicates (Baumann, 1997). In fact,
heteroscedasticity usually implies that the variance is related to the expected
value of the response by means of a functional relationship (Bayne and
Rubin, 1986).
(d) Transformation-Dependent Weights. A transformation may
sometimes be carried out (in one or both variables) to obtain a straight-line
function from an (intrinsically linear) non-linear relationship (Rawlings et
al., 1998; Tomassone et al., 1983).
Some assumptions theoretically necessary for a regression
analysis may not be satisfied by the transformed data, as the relative
magnitudes of the errors are altered throughout the plot. Thus non-linear
homoscedastic data may be transformed into a straight-line
relationship with heteroscedastic errors. If experimental data values $z_i$ are
turned into transformed, linearized data $y_i$ (de Levie, 2012; Asuero and Bueno,
2011; Asuero and González, 1989; de Levie, 1986), the weighting factor $w_i$
(with $\sigma_0^2 = 1$) is given by

$$w_i = \left( \frac{1}{\partial y / \partial z} \right)^2 \qquad (8)$$

The weighted least squares method is not always problem-free. Even
powers of the measured (untransformed) signal values, i.e., $z^2$ or $z^4$, resulting
from Eqn. (8) are always positive, even in those cases where the
corresponding mean values average zero (de Levie, 2000). This implies
that random errors in small-signal regions can contribute substantially to
the sum of squares, distorting the analysis. The weights must also be
transformed to keep the appropriate relationship (Jurs, 1970) between the
weights and the points being fitted. The random error propagation law
(Tellinghuisen, 2015; Tellinghuisen, 2001; Asuero and González, 1988),
when applied to a function $y = f(z)$, gives (taking $\sigma_0^2 = 1$)

$$\sigma_y^2 = \left( \frac{\partial y}{\partial z} \right)^2 \sigma_z^2 \qquad (9)$$

and then we get

$$w_y = w_z \left( \frac{\partial y}{\partial z} \right)^{-2} \qquad (10)$$
That is, in addition to the weighting corresponding to the individual
data point measurements $z_i$, $w_z = 1/\sigma_z^2$, the transformation-dependent
weighting $w_y$ has to be used. An overview of the distinct kinds of weights may
be seen in Table 2.
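
A short sketch may help to see how Eqn. (10) works for one common case. Assuming the logarithmic transformation $y = \ln z$ (chosen here purely for illustration), $\partial y / \partial z = 1/z$, so that $w_y = w_z z^2$. The function name `log_transform_weights` and the numbers are hypothetical.

```python
import numpy as np

def log_transform_weights(z, sigma_z):
    """Weights of y = ln(z) from Eqn. (10): w_y = w_z * (dy/dz)**-2, with dy/dz = 1/z."""
    z = np.asarray(z, dtype=float)
    sigma_z = np.asarray(sigma_z, dtype=float)
    w_z = 1.0 / sigma_z ** 2          # weights of the untransformed measurements
    dy_dz = 1.0 / z                   # derivative of ln(z)
    return w_z * dy_dz ** -2.0        # equals z**2 / sigma_z**2

# Hypothetical homoscedastic signal (constant sigma_z) before transformation.
z = np.array([1.0, 2.0, 4.0])
w_y = log_transform_weights(z, sigma_z=0.1)
print(w_y)
```

Even though the untransformed data have constant variance, the weights of the transformed data grow with $z^2$, which is the heteroscedasticity introduced by the transformation.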

Table 2. Some kinds of weights* (Asuero and González, 2007)

| Kind | Weight | Authors |
|---|---|---|
| Absolute weights | $1$ | Jurs, 1986 |
| Statistical weights | $1/y_i$ | Johnson, 1980; Jurs, 1986 |
| Assumption of constant percentage error | $1/y_i^2$ | Anderson and Snow, 1967; Smith and Mathews, 1967 |
| Instrumental weights | $1/s_i^2$ | Jurs, 1986 |
| Transformation-dependent weights | $1/(\partial y / \partial z)^2$ | de Levie, 1986; Meites, 1979 |
| Mixed instrumental transformation-dependent weights | $1/[s_z^2 (\partial y / \partial z)^2]$ | de Levie, 1986; Meites, 1979 |

* $s_i^2$ is the estimate of $\sigma_i^2$

RANDOM ERRORS AND NOISE

Noise is a source of random errors, which play a vital role in analytical
chemistry as well as in the analysis of experimental results (Rudnyi, 1996;
Prudnikov and Shapkina, 1984). Noise may depend on i) the signal; ii) the
concentration; iii) other factors (Sun et al., 1994; Garden et al., 1980;
Rothman et al., 1975; Ingle, 1974; Pardue et al., 1974; Ingle and Crouch,
1972; Winefordner et al., 1970). The precision of intensity measurements
in spectrochemical analysis (Klockenkämper and Bubert, 1986; Bubert and
Klockenkämper, 1983) can be affected by three kinds of noise, namely,
shot noise, flicker noise and detector noise. The rate and amount of ions
reaching the detector are the origin of the shot noise, which follows
Poisson statistics. The process of nebulization, as well as fluctuations
related to the source, is the origin of flicker noise, which is proportional
to the signal magnitude. The detector and electronics are responsible for the
detector (dark count) noise. The total error is then given by

$$s_t = \sqrt{s_{shot}^2 + s_{flic}^2 + s_{det}^2} \qquad (11)$$
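
As a rough numerical sketch of Eqn. (11), the three contributions can be combined in quadrature. The helper `noise_at_signal` additionally assumes that the signal is expressed in counts, so that the Poisson shot noise is the square root of the signal, and that the flicker noise is a fixed fraction of the signal; both function names and all numerical values are hypothetical.

```python
import math

def total_noise(s_shot, s_flicker, s_detector):
    """Combine independent noise contributions in quadrature, as in Eqn. (11)."""
    return math.sqrt(s_shot ** 2 + s_flicker ** 2 + s_detector ** 2)

def noise_at_signal(signal_counts, flicker_coeff, s_detector):
    """Total noise for a signal in counts (assumed model, see the lead-in)."""
    s_shot = math.sqrt(signal_counts)             # Poisson: variance equals the mean count
    s_flicker = flicker_coeff * signal_counts     # proportional to the signal magnitude
    return total_noise(s_shot, s_flicker, s_detector)

# Hypothetical values: at low signal the detector term dominates,
# at high signal the flicker term dominates.
for signal in (10.0, 1.0e4, 1.0e6):
    print(signal, noise_at_signal(signal, flicker_coeff=0.01, s_detector=5.0))
```

Under these assumptions the variance is a constant plus a linear plus a quadratic term in the signal, which is one reason why the weights change along the calibration range.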

The sources of the three kinds of errors are summarized in Table 3.

Table 3. Likely analytical causes of the three types of signal
errors (Steliopoulos et al., 2006; Lavagnini et al., 2004;
Kirkup and Mulholland, 2004; van Loco et al., 2003;
de Galán et al., 1985; Kemp, 1985)

| Constant error | Proportional error | Quadratic error |
|---|---|---|
| Sample turbidity | Volumetric error | Decay/dissociation of product |
| Reagent absorbance | Gravimetric error | Reagent depletion |
| Nonspecificity | Incomplete separation or derivation | Instrumental non-linearity: matrix-related non-linearity in GC-MS; purge and trap GC-MS; electron capture detector |
| Zero error/drift | Matrix evaporation | |
| Carryover | Error in time processed | |
| Contamination | | |

MODELLING THE VARIANCES AS A FUNCTION OF THE
DEPENDENT OR INDEPENDENT VARIABLE

The variance can be modelled as a function of the response because it
varies smoothly over the response range (it may be modelled either as a
function of x or as a function of y) (Tellinghuisen,
2010b; Tellinghuisen, 2009d; Baumann, 1997; Davidian, 1990). As a
matter of fact, the variance is related to the mean (or to other parameters or
variables) for many physico-chemical properties (Table 4).
A number of functions have been devised to estimate variances
(Tellinghuisen, 2005a; Sadray et al., 2003; Hwang, 1994; O’Connell et al.,
1993; Davidian, 1990).
A simple approach follows (Rodbard et al., 1976; Rodbard and
Frazier, 1975)

$$\log [\mathrm{Var}(Y_{ij})]^{1/2} = \log \sigma_0 + \theta \log \mu_i \qquad (12)$$

The estimated weights would then be $y^{-2\hat{\theta}}$.
In fact, variance function estimation is challenging. Outliers strongly
affect the estimation of variance (Baumann and Wätzig, 1995). Models
based on the addition of variances from independent sources are closer to
physical reality than those based on the addition of standard
deviations.
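
Eqn. (12) suggests a simple way of estimating $\theta$: regress the logarithm of the replicate standard deviations on the logarithm of the replicate means. The sketch below does this by ordinary least squares with NumPy; the function name `fit_power_variance` and the noise-free example data are assumptions for illustration, and with real data the outlier sensitivity mentioned above applies.

```python
import numpy as np

def fit_power_variance(means, sds):
    """Fit log(sd) = log(sigma0) + theta * log(mean), i.e., Eqn. (12).

    Returns (sigma0, theta). Natural logarithms are used; the base does not
    affect the estimate of theta.
    """
    log_mu = np.log(np.asarray(means, dtype=float))
    log_sd = np.log(np.asarray(sds, dtype=float))
    theta, log_sigma0 = np.polyfit(log_mu, log_sd, 1)   # slope, intercept
    return float(np.exp(log_sigma0)), float(theta)

# Hypothetical replicate means and standard deviations following sd = 0.05 * mean**0.8.
means = np.array([1.0, 10.0, 100.0, 1000.0])
sds = 0.05 * means ** 0.8
sigma0, theta = fit_power_variance(means, sds)

# Estimated weights: mean response raised to the power -2*theta.
weights = means ** (-2.0 * theta)
print(sigma0, theta, weights)
```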

VARIATION OF PRECISION WITH CONCENTRATION

The precision of an analytical system may be expressed as a function of the
analyte concentration (ISO 5725, 1994; Thompson, 1988; Thompson and
Howarth, 1973); various models have been proposed, as compiled in Table 5.

Table 4. Variance function estimation (Asuero and González, 2007)

| Model | Var($Y_{ij}$) | $w_i$ | Comments |
|---|---|---|---|
| I | $\sigma_0^2 \mu_i^2$ | $\mu_i^{-2}$ | Constant CV equal to $\sigma_0$; reasonable approach in, e.g., HPLC, as long as the limits of assay sensitivity are not approached too closely |
| II | $\sigma_0^2 \mu_i$ | $\mu_i^{-1}$ | Quite useful for count data, for which a Poisson assumption implies that Var($Y_{ij}$) = $\mu_i$ |
| III | $\sigma_0^2 (\alpha + \mu_i)^2$ | $(\alpha + \mu_i)^{-2}$ | $\alpha$ known |
| IV | $\sigma_0^2 \mu_i^{2\theta}$ | $\mu_i^{-2\theta}$ | General model to accommodate overdispersion; $\theta$ often falls in the range $0.6 \le \theta \le 0.9$. Poisson model if $\sigma_0 = 1$ and $\theta = 0.5$. Power-of-the-mean variance function, which is likely to be of the most importance in chromatographic and capillary electrophoresis applications. A plot of $\log \lvert r_{ij} \rvert$ versus the log of the predicted value gives a straight line |
| V | $\sigma_0^2 (\theta_1 + \mu_i^{\theta_2})$ | $(\theta_1 + \mu_i^{\theta_2})^{-1}$ | $\theta_1$ describes the imprecision of measurement that dominates at small response values and $\theta_2$ the relationship between mean and variance that dominates at larger response values |
| VI | $\sigma_0^2 \exp(2\theta\mu_i)$ | | The variability increases very quickly with the mean. A plot of $\log \lvert r_{ij} \rvert$ versus the predicted value shows a linear relationship |
| VII | $\sigma_0^2 (1 + \theta_1 \mu_i + \theta_2 \mu_i^2)^2$ | | The standard deviation is thought to be a quadratic function of the dependent variable. A plot of $\log \lvert r_{ij} \rvert$ versus $y_i$ shows a quadratic relationship |
| VIII | $\sigma_0^2 (1 + \theta_1 \mu_i + \theta_2 \mu_i^2)$ | | |
| IX | $\sigma_0^2 g^2(\mu_i, z_i, \theta)$ | $[g^2(\mu_i, z_i, \theta)]^{-1}$ | g is the general variance function model |

Table 5. Models for the variation of precision with concentration

| Model | Functions | Authors |
|---|---|---|
| I | $s = k c$ | ISO 5725; Hughes and Hurley |
| II | $s_c = s_0 + k c$; $\sigma_c = p c + q$ | Thompson and Howarth, 1973; Thompson, 1976; Howarth and Thompson, 1976; Thompson and Howarth, 1978; Thompson, 1978; Thompson, 1988; Lee and Ramsey, 2001 |
| III | $s_x = a_0 + a_1 x + a_2 x^2$ | Oppenheimer et al., 1983; Watters et al., 1987; Zorn et al., 1996 |
| IV | $s_c = A_0 + A_1 c + A_2 c^2 + A_3 c^3$ | Modamio et al., 1996 |
| V | $s_x = a_0 + a_1 x$ | ISO 11843-2, 2000 |
| VI | $s_x = \sqrt{a_0 + a_1 x^2}$; $s_x = \sqrt{s_0^2 + k^2 c^2}$; $\sigma_x = \sqrt{p^2 c^2 + q^2}$; $u_x = \sqrt{s_0^2 + (x s_1)^2}$ | Zitter and God, 1971; Thompson, 1988; Rocke and Lorenzato, 1995; Lee and Ramsey, 2001; Rocke et al., 2003; Wilson et al., 2004; EURACHEM/CITAC Guide, 2002; Heydorn and Anglov, 2002 |
| VII | $\sigma_x = \sqrt{c_0 + c_1 x + c_2 x^2}$; $\sigma_y = \sqrt{c_0 + c_1 y + c_2 y^2}$ | Watters et al., 1987; Schwartz, 1978; Boumans et al., 1981; Bubert and Klockenkämper, 1983; Oppenheimer et al., 1983 |
| VIII | $\sigma = b c^d$; $s_x = a_0 e^{a_1 x}$; $\sigma_c = p c^k + q$; $\sigma_y^2 = A y^b$; $\sigma_x^2 = A (x + 1)^b$ | ISO 5725; Hughes and Hurley, 1987; Zorn et al., 1997; Desimoni, 1999; Prudnikov and Shapkina, 1984; Oppenheimer et al., 1983 |


The linear model, II, is the simplest. When the analytical errors
stem from two independent terms, the most satisfactory option should be to
combine variances. Models V and VI then describe the variation of
precision with concentration more correctly. The standard deviation usually
increases with concentration, whereas the relative standard deviation
(coefficient of variation) remains constant or slightly decreases. Some
empirical models such as III and VIII have found use in radioligand assays
and other general situations; the standard deviation is modelled as a
function of concentration.
The choice of weighting remains an open subject, and
there is no universal solution to this problem; the choice is
often subjective and somewhat arbitrary (Modamio et al., 1996).

APPLICATIONS

Some examples of the application of weighted least squares to data with
heterogeneous variance in analytical chemistry are compiled in Table 6. In
addition, an experimental situation from the area of enzyme kinetics is
studied in this chapter to examine whether the data should be weighted and
how to weight them properly.

Table 6. Some selected applications of weighted least squares in analytical chemistry

Content Reference
Theory of chromatographic detection and modern approaches Asnin, 2016
to data acquisition and processing is given in the context of
the calibration problem
Characterizing nonconstant instrumental variance in emerging Noblitt et al., 2016
miniaturized analytical techniques
Simultaneous determination of 40 novel psychoactive Concheiro et al., 2015
stimulants in urine by liquid chromatography–high resolution
mass spectrometry and library matching
Practical guidelines for reporting results in single- and multi- Olivieri, 2015
component analytical calibration
Method validation using weighted linear regression models Pereira da Silva et al.,
for quantification of UV filters in water samples 2015
Using Least Squares for Error Propagation. Practical Tellinghuisen, 2015
examples.
Analysis and interpretation of enzyme kinetic data Cornish-Bowden, 2014
Selecting the correct weighting factors for linear and quadratic Gu et al., 2014
calibration curves with least-squares regression algorithm in
bioanalytical LC-MS/MS assays and impacts of using
incorrect weighting factors on curve stability, data quality,
and assay performance
Impact of calibrator concentrations and their distribution on Tan et al., 2014
accuracy of quadratic regression for liquid chromatography–
mass spectrometry bioanalysis
Reducing the number of signals needed to perform LW Brasil et al., 2013
calibrations by developing models of weighing factors robust
to daily variations of instrument sensibility: Application to the
identification of explosives by ion chromatography
Comparative study of some robust statistical methods: Korany et al., 2013
weighted, parametric, and nonparametric linear regression of
HPLC convoluted peak responses using internal standard
method in drug bioavailability studies
The quality coefficient as performance assessment parameter de Beer et al., 2012
of straight line calibration curves in relationship with the
number of calibration points.
The approaches for estimation of limit of detection for ICP- Rajakovic et al., 2012
MS trace analysis of arsenic
A comparison in the evaluation of measurement uncertainty in Sousa et al., 2012
analytical chemistry testing between the use of quality control
data and a regression analysis
Application of a special in-house validation procedure Brüggemann and
for environmental–analytical schemes including a comparison Wennrich, 2011
of functions for modelling the repeatability standard deviation
Overall calibration procedure via a statistically based matrix- Lavagnini et al., 2011
comprehensive approach in the stir bar sorptive extraction–
thermal desorption–gas chromatography–mass spectrometry
analysis of pesticide residues in fruit-based soft drinks
Using R2 to compare least square fit models: when it must Tellinghuisen and Bolster,
fail. 2011
Comparison of three weighting schemes in weighted Jain, 2010
regression analysis for use in a chemistry laboratory
Method validation for the endocrine disruptors and pesticides Mansilha et al., 2010
in water by gas chromatography–tandem mass spectrometry
using weighted linear regression schemes
Calibration in atomic spectrometry: A tutorial review dealing Mermet, 2010
with quality criteria, weighting procedures and possible
curvatures
Comparison between ordinary least squares regression and Nascimento et al., 2010
weighted regression in the calibration of metals present in
human milk
Cochran’s test optimized “G test”: Expressions are derived to ’t Lam, 2010
calculate upper limit as well as lower limit critical values for
data sets of equal and unequal size at any significance level.
Least-squares analysis of data with uncertainty in x and y: A Tellinghuisen, 2010a
Monte Carlo methods comparison
Least-Squares Analysis of Phosphorus Soil Sorption Data Tellinghuisen, 2010b
with Weighting from Variance Function Estimation: A
Statistical Case for the Freundlich Isotherm
Least-squares analysis of phosphorus soil sorption data with Tellinghuisen and
weighting from variance function estimation Bolster, 2010
The guiding role of the assumptions for least-squares Brito et al., 2009
regression in practical problem solving: Calibration of 109Cd
KXRF systems
Weighted least-squares regression with different weighting Brito and Chettle, 2009
functions: Calibration of 109Cd KXRF systems
Verifying if alternative approaches are available for getting Desimoni and Brunetti,
acceptably approximate estimates of the limit of detection 2009
Least squares in calibration: weights, nonlinearity, and other Tellinghuisen, 2009a
nuisances
The least-squares analysis of data from binding and enzyme Tellinghuisen, 2009b
kinetics studies: weights, bias, and confidence intervals in
usual and unusual situations
Weighting Formulas for the Least-Squares Analysis of Tellinghuisen, 2009c
Binding Phenomena Data
Variance function estimation by replicate analysis and Tellinghuisen, 2009d
generalized least squares: A Monte Carlo comparison
Weighting formulas for the least-squares analysis of binding Tellinghuisen and
phenomena data Bolster, 2009
Analysis of Flavonoids in Oxytropis kansuensis Bunge by RP- Li et al., 2008
LC–DAD with Weighted Least-Squares Linear Regression
Least squares with non-normal data: estimating experimental Tellinghuisen, 2008a
variance functions
The problem with using “quality coefficients” to select Tellinghuisen, 2008b
weighting formulas
Least-squares variance component estimation. Various Teunissen and Amiri-
examples are given to illustrate the theory Simkooei, 2008
Weighted least squares in calibration: Estimating data Zeng et al., 2008
variance functions in high-performance liquid
chromatography
A statistical overview on univariate calibration, inverse Lavagnini and Magno,
regression, and detection limits: application to gas 2007
chromatography/mass spectrometry technique
A general approach to heteroscedastic linear regression: The Leslie et al., 2007
methodology is applied to a number of simulated and real
examples
Weighted least-squares in calibration: The distinction between Tellinghuisen, 2007
a priori and a posteriori parameter standard errors is
emphasized
Why are we weighting? Recommendations Thompson, 2007
Reviews calibration-, uncertainty-, and recovery-related Vanatta and
documents from 10 consensus-based organizations Coleman, 2007
Determination of lanthanides in international geochemical Santoyo et al., 2006
reference materials by reversed-phase high-performance
liquid chromatography using error propagation theory
to estimate total analysis uncertainties
Understanding Least Squares through Monte Carlo Tellinghuisen, 2005b
Calculations

Enzymatic Kinetics: Lineweaver-Burk, Hanes or Eadie-Hofstee?

An example of a function that can be linearized by transformation is the hyperbola:

$$y = \frac{x}{\alpha x + \beta} \qquad (13)$$

Setting Y = 1/y and X = 1/x gives the linear equation:

$$Y = \alpha + \beta X \qquad (14)$$

Eqn. (13) is important because it is the functional form that
describes the relationship between the rate of an enzymatic reaction, v, and
the substrate concentration, C. The rate of an enzymatic reaction depends
on the affinity between the substrate and the enzyme. If the reaction
follows an ideal model ([substrate] >> [enzyme]), and if:

E + S ↔ ES ↔ E + P (15)

a plot of the reaction rate versus the substrate concentration follows a
rectangular hyperbola through the origin. The relationship between the
reaction rate and the substrate concentration can be expressed (Jurs, 1986;
Noggle, 1993) by the Michaelis-Menten equation:

$$v = \frac{V_{max}\, C}{C + K_m} \qquad (16)$$

where Km and Vmax are the Michaelis constants. Vmax is the reaction rate
when the enzyme is completely saturated with the substrate and the
reaction proceeds at the maximum possible speed, and Km is the substrate
concentration at half the maximum speed. The Michaelis-Menten equation
can be rearranged to produce different linear forms:
Lineweaver-Burk (LB); plotting 1/v versus 1/C:

$$\frac{1}{v} = \frac{1}{V_{max}} + \frac{K_m}{C\, V_{max}} \qquad (17)$$

Hanes (H); plotting C/v versus C:

$$\frac{C}{v} = \frac{C}{V_{max}} + \frac{K_m}{V_{max}} \qquad (18)$$

Eadie-Hofstee (EH); plotting v versus v/C:

$$v = V_{max} - \frac{v\, K_m}{C} \qquad (19)$$

Table 7. Substrate concentration and rate data for several enzymatic reactions

     (a)              (b)              (c)              (d)              (e)
  C        v       C       v        C      v        C      v        C     v
  0.10    1.9     20.0   48.24     0.03   0.14     0.14   0.15      1    0.10
  0.33    4.2     12.0   40.09     0.04   0.17     0.22   0.17      3    0.24
  1.00    6.1      8.0   41.19     0.05   0.18     0.29   0.23      5    0.30
  3.33    6.5      6.4   37.39     0.09   0.26     0.56   0.32      8    0.54
 10.00    7.2      4.8   32.91     0.13   0.31     0.77   0.39     10    0.72
 33.30    7.4      3.2   28.72     0.22   0.35     1.46   0.49     15    0.97
100.00    6.9      1.6   18.94     0.43   0.40                     20    1.15
                                   0.65   0.44                     30    1.52
                                   1.08   0.45                     40    1.72
                                                                   50    1.97
(a) Dehydration of L-malate catalysed by fumarase (Noggle, 1993)
(b) Methyl hippurate/chymotrypsin at pH 7.8 and 25 ºC (Elmore et al., 1963)
(c) Formation of maltose from starch/amylase (Noggle, 1993)
(d) Nicotinamide adenine dinucleotide (Jurs, 1986)
(e) (Crabbe, 1982)

The experimental data for various enzymatic systems are shown in
Table 7, and the results of applying the linearization methods,
unweighted and weighted, are shown in Table 8. The application of
least squares in the EH case is questionable, since v appears in both the
dependent and the independent variable. Some of the plots are shown in
Figure 1. The weighting factors for the LB and H methods are given in
Eqns. 20a, b.

$$w_i(\mathrm{LB}) = \frac{1}{\left[\partial (1/v)/\partial v\right]^2} = v^4, \qquad w_i(\mathrm{H}) = \frac{1}{\left[\partial (C/v)/\partial v\right]^2} = \frac{v^4}{C^2} \qquad (20a, b)$$

With the appropriate weights, the LB and H methods lead to identical
results. Transformations distort the data space and change the
way in which observations affect the parameters (Noggle, 1993). There are
two ways to avoid these distortions: weighted linear regression (WLR) and
nonlinear regression (NLR). Both approaches lead to very similar results.
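
The equivalence of the weighted LB and Hanes fits can be checked numerically. The following is a minimal sketch, not the authors' program: it uses data set (a) of Table 7, assumes a constant standard deviation of v (so that the weights reduce to v^4 and v^4/C^2, as in Eqns. 20a, b), and the helper name wls_line is introduced here for illustration only.

```python
# Data set (a) of Table 7: substrate concentration C and reaction rate v
C = [0.10, 0.33, 1.00, 3.33, 10.00, 33.30, 100.00]
v = [1.9, 4.2, 6.1, 6.5, 7.2, 7.4, 6.9]


def wls_line(x, y, w):
    """Weighted least-squares line y = a0 + a1*x (illustrative helper)."""
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    a1 = (sw * swxy - swx * swy) / (sw * swxx - swx ** 2)
    a0 = (swy - a1 * swx) / sw
    return a0, a1


# Lineweaver-Burk (Eqn. 17): 1/v = a0 + a1*(1/C), weights v^4 (Eqn. 20a)
a0, a1 = wls_line([1 / c for c in C], [1 / r for r in v], [r ** 4 for r in v])
km_lb, vmax_lb = a1 / a0, 1 / a0

# Hanes (Eqn. 18): C/v = a0 + a1*C, weights v^4/C^2 (Eqn. 20b)
a0, a1 = wls_line(C, [c / r for c, r in zip(C, v)],
                  [r ** 4 / c ** 2 for c, r in zip(C, v)])
km_h, vmax_h = a0 / a1, 1 / a1

print(f"LB:    Km = {km_lb:.3f}, Vmax = {vmax_lb:.3f}")
print(f"Hanes: Km = {km_h:.3f}, Vmax = {vmax_h:.3f}")
```

Under these assumptions both weighted fits minimize the same sum of squares, so they return the same pair of constants, which should lie close to the weighted values listed for system I in Table 8.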

Table 8. Michaelis constant values obtained by the three methods described, using simple linear regression (upper result) and weighted linear regression (lower result)

Method       Lineweaver-Burk        Hanes                  Eadie-Hofstee
System       Km        Vmax         Km        Vmax         Km        Vmax
I (*)        0.248     7.435        -0.164    6.933        0.268     7.303
             0.242     7.257        0.242     7.257
II           2.929     5.389        3.036     5.436        2.862     5.352
             2.918     5.409        2.918     5.409
III (**)     0.073     0.470        0.077     0.478        0.075     0.475
             0.076     0.477        0.076     0.477
IV           0.441     0.585        0.582     0.685        0.490     0.626
             0.571     0.680        0.571     0.680
V            16.642    1.766        42.497    3.586        30.271    2.853
             42.253    3.610        42.196    3.806
Km           a1/a0                  a0/a1                  −a1
Vmax         1/a0                   1/a1                   a0
I: L-malate/fumarase (Noggle, 1993)
II: Methyl hippurate/chymotrypsin (Elmore et al., 1963)
III: maltose/amylase (Noggle, 1993)
IV: nicotinamide adenine dinucleotide (Jurs, 1986)
V: Crabbe, 1982
NLR (Noggle, 1993):
(*) 0.245±0.068 and 7.24 ± 0.33;
(**) 0.477±0.009 and 0.076±0.05.

Figure 1. Hanes and Lineweaver-Burk plots for some of the systems studied.

TRANSFORMING DATA

It may seem that the best way of calculating the coefficients of a
nonlinear equation is the direct application of a nonlinear regression
program. However, NLR is not free from problems (Mager, 1991): i) depending
on the structure of the data and the starting values, different final
solutions may be obtained; ii) discrimination between rival models is
difficult; iii) NLR is relatively sensitive to deviations from
homoscedasticity; iv) substantial multicollinearity may appear, leading to
non-robust estimates. Asymptotic NLR estimates may also cause trouble
(Mager, 1991) because the number of observations in real experiments is
too small.
Applying mathematical transformations to experimental data offers some
advantages. Transformations may be successfully applied to reach
homoscedasticity (stabilize variances), to achieve (approximate) normality,
or to test in an approximate way the type of model (Meloun and Militký,
2011; Meloun, 1992; Draper and Smith, 1998; Weisberg, 2005).
Graphical or numerical examination of the data (Lavagnini and Magno,
2007; Barnet, 2004) may be carried out in order to check (separately or
jointly) key assumptions such as linearity of the relationship,
independence of errors, constancy of the residual variance, normally
distributed errors, and absence of outliers (Weisberg, 2005; Belloto and
Sokolovski, 1985). Informal plots may clearly reveal the need for a given
transformation such as ln x or 1/y, leaving the check by a more formal
analysis for later (Draper and Smith, 1998). The log rule and the range
rule are two often-helpful empirical rules (Weisberg, 2005). The log rule
applies when the variable is strictly positive and ranges over more than
one order of magnitude. If the range is less than one order of magnitude,
any transformation is unlikely to help. The greater the quotient
ymax/ymin, the greater the effect of the transformation considered. The
more λ differs from unity, the greater the effect of a power
transformation of the kind Y = y^λ (Box and Draper, 1987).
Logarithms and exponentials are involved in the most common
transformations (EPA, 2000; Daniel and Wood, 1999; Tomassone et al.,
1983). Less common transformations include reciprocals, square roots, and
trigonometric functions. Sometimes two or more of these functions are
combined, as occurs (Bysouth and Tyson, 1986) in some calibration
programs of commercial instruments.
Lacking prior information, trial and error is needed to ascertain the
kind of transformation to be used. Plotting the cumulative frequency of
the transformed variable on normal probability paper allows the
transformation yielding the best straight line to be selected as the most
suitable. Statistical procedures such as comparisons, confidence limits,
etc., are then applied to the transformed data, and the results obtained
are back-transformed to the original curvilinear scale (Meloun et al.,
1992; Acton, 1959), if desired. Finally, if the transformations are not
adequate, the linear model is rejected and a series of alternative
nonlinear models is considered instead.

TRANSFORMATIONS TO SIMPLIFY RELATIONSHIPS

Two situations are mainly considered in this context (Asuero and
Bueno, 2011; Draper and Smith, 1998; Rawlings et al., 1998). If no a priori
knowledge of the model fitting the data is available, an empirical form of
the dependence between the variables is sought so that a straight-line
relationship may be used. The model is linear in the parameters
(Lavagnini and Magno, 2007), and only the form in which the variables are
expressed is considered.
The aim of a transformation is the re-expression, when possible, of a
nonlinear model as a linear one to which ordinary linear regression may be
applied. The functional form of the model dictates the possible
transformation.
Simple power transformations, used to achieve symmetry, are suitable only
for positive values, do not retain the scale, and are not always
continuous (Gad, 1999). The power family of transformations x* = x^k or
y* = y^k, known as the “one bend” transformations (Rawlings et al., 1998),
affords a useful set of transformations for “straightening” a single bend in
the relationship between two variables. The corresponding transformations
may be ordered according to the value of the exponent k, giving a
sequence of power transformations known as the ladder of re-expressions
(Mosteller and Tukey, 1977; Tukey, 1977) (Table 9).
The sharpness of the curvature determines how far one has to move along
the ladder of transformations (Rawlings et al., 1998). When a single
independent variable is involved, several transformations are tried on a
few observations covering the whole range of the data, and the
transformation giving the highest linearity is chosen. If the variable
follows a Poisson distribution (i.e., counts, frequencies), the square root
transformation is used (it is less drastic than taking logarithms)
(Shumway et al., 2002). The reciprocal transformation reverses the order
of the observations and has a much more drastic effect than taking logs;
it is useful when the data have an extremely skewed distribution. However,
the reciprocal transformation is not commonly used, and the log
transformation is preferable to any other if it yields satisfactory
results.
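
The trial-and-error search along the ladder can be sketched in a few lines. This is only an illustration under stated assumptions: the data are synthetic (y = x^2 exactly), linearity is scored by the absolute Pearson correlation coefficient, and the names pearson_r and reexpress are hypothetical helpers, not part of any cited program.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def reexpress(value, k):
    """Power re-expression y* = y^k, with ln y taking the place of k = 0."""
    return math.log(value) if k == 0 else value ** k


# Synthetic single-bend data (assumption): y grows as the square of x
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [xi ** 2 for xi in x]

# Exponents of the ladder of re-expressions
ladder = [-2, -1, 0, 0.5, 1, 2, 3]
scores = {k: abs(pearson_r(x, [reexpress(yi, k) for yi in y])) for k in ladder}
best_k = max(scores, key=scores.get)

for k in ladder:
    print(f"k = {k:>4}: |r| = {scores[k]:.4f}")
print("most linear re-expression: k =", best_k)
```

For these synthetic data the square root of y is exactly proportional to x, so the search settles on k = 0.5; with real data the scores should be read together with residual plots rather than on their own.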

Table 9. Tukey’s Ladder of Transformations (for choosing a function to change a distribution’s shape) (Asuero and Martín Bueno, 2011)

Need to correct     Strength of transformation     Mathematical function y* = y^k     k exponent value
Positive skew       Stronger                       1/y^2                              -2
                    Mild                           1/y                                -1
                    “                              ln y                               0
                    “                              √y                                 1/2
No shape change     ---                            y                                  1
Negative skew       Mild                           y^2                                2
                    “                              y^3                                3
                    Stronger                       exp y

TRANSFORMATIONS TO LINEARIZE THE MODEL

Adequate transformations allow nonlinear models to be presented in
a linear form (Table 10) (de Levie, 2004; Bayne and Rubin, 1986; du Toit
et al., 1986; Tomassone et al., 1983; Daniel and Wood, 1980). It should be
noted that these transformations are certain to accomplish one thing only,
i.e., to yield a straight-line form. Some of the assumptions theoretically
required to apply the least squares method may not be satisfied by the
transformed data (Bates and Watts, 2007; Seber, 2003; Belloto and
Sokolovski, 1985).
Examples of linearizable functions are shown in Figure 2. The power
family of transformations cannot straighten relationships showing more
than one bend (e.g., the classical S-shaped growth curve). Logit, arcsin (or
angular), and probit are commonly used transformations of the two-bend kind.

Table 10. Nonlinear functions that can be written in a linear form by means of a transformation (Asuero and Martín Bueno, 2011)

Function                    Formula                            Transformation                              Linear form
Power function              y = α x^β                          y' = log y, x' = log x                      y' = log α + β x'
Exponential growth model    y = α e^(βx)                       y' = ln y                                   y' = ln α + β x
Logarithmic                 y = α + β log x                    x' = log x                                  y = α + β x'
Hyperbolic                  y = x / (α x + β)                  y' = 1/y, x' = 1/x                          y' = α + β x'
Logit                       y = e^(α+βx) / (1 + e^(α+βx))      y' = ln[y/(1 − y)] = 2 tanh^(-1)(2y − 1)    y' = α + β x

When the data are transformed, the disturbance term is also transformed,
which affects the assumptions made about it (Bates and Watts, 2007; Seber,
2003; Box and Draper, 1987). That is to say, assumptions of additive and
normal disturbance terms that are correct for the model function are no
longer valid for the transformed data. Nonlinear regression on the
original data, or weighted least squares on the transformed data, is then
required. Fitting the transformed model nevertheless provides initial
estimates (Meloun et al., 1992; Mager, 1991).
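
As an illustration of how a linearized fit supplies initial estimates, the sketch below fits the power function of Table 10 by ordinary least squares on the log-log scale and back-transforms the intercept. The data are synthetic (generated from y = 2 x^1.5 without noise) and the helper ols_line is hypothetical; with real, noisy data the values obtained would only serve as starting values for a nonlinear or weighted fit.

```python
import math


def ols_line(x, y):
    """Ordinary least-squares line y = a0 + a1*x; returns (a0, a1)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    a1 = sxy / sxx
    return my - a1 * mx, a1


# Synthetic data (assumption): power function with alpha = 2, beta = 1.5
x = [1.0, 2.0, 4.0, 8.0, 16.0]
y = [2.0 * xi ** 1.5 for xi in x]

# Linear form: log y = log(alpha) + beta * log x
a0, a1 = ols_line([math.log10(xi) for xi in x], [math.log10(yi) for yi in y])

alpha_start = 10 ** a0   # back-transformed intercept
beta_start = a1          # slope is the exponent itself

print(f"initial estimates: alpha = {alpha_start:.4f}, beta = {beta_start:.4f}")
```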

Figure 2. Plots of linearizable curves (Asuero and Martín Bueno, 2011).


TRANSFORMATIONS TO STABILIZE VARIANCES

Nonconstant variance is associated with non-normally distributed data
(Canavos, 1984; Rios, 1977), and data transformation is the most
appropriate means of dealing with such situations (Asuero and Martín
Bueno, 2011). Variance heterogeneity usually appears when the errors
corresponding to some treatments are significantly higher (or lower) than
others, given the nature of the experiments. In a normal distribution, the
variance σy² and the mean μy are independent; a direct relationship
between the mean and the variance is typical of all other common
distributions. Theoretical considerations and/or a preliminary empirical
analysis may suggest the nature of the dependence between the variance
and the mean value (Box and Draper, 1987). If the functional relationship
is known, a transformation exists that makes the variance (approximately)
constant (Draper and Smith, 1998). With certain kinds of data,
heterogeneous (nonuniform) variance and non-normality are to be expected
from the outset. The same experimental situations that lead to non-normal
distributions usually also give heterogeneous variances, as σy = f(μy) in
most non-normal distributions (Brownlee, 1984; Natrella, 1963).
Table 11 summarizes a number of transformations (some from the
power family) used to correct for variance heterogeneity and to
approximate normality. Note that, on stabilizing the variance, the
transformed variable also becomes more nearly normal (Gaussian).
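
The link between a known variance function and its stabilizing transformation can be made explicit with a first-order (delta method) argument. The sketch below is the standard textbook derivation rather than a result quoted from this chapter; g denotes the sought transformation and c an arbitrary constant, both symbols being introduced here for illustration.

```latex
\begin{align*}
\operatorname{Var}[g(y)] &\approx \left[g'(\mu_y)\right]^2 \sigma_y^2,
  \qquad \sigma_y = f(\mu_y) \\
\operatorname{Var}[g(y)] \approx c^2
  &\;\Longrightarrow\; g'(\mu) = \frac{c}{f(\mu)}
  \;\Longrightarrow\; g(y) = c \int \frac{dy}{f(y)} \\
f(\mu) \propto \sqrt{\mu}\ \text{(Poisson counts)}
  &\;\Longrightarrow\; g(y) \propto \sqrt{y} \\
f(\mu) \propto \mu\ \text{(variance proportional to mean}^2\text{)}
  &\;\Longrightarrow\; g(y) \propto \ln y
\end{align*}
```

The two special cases reproduce the square root and logarithmic entries usually tabulated for count data and for data whose standard deviation is proportional to the mean.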

TRANSFORMATION BASED ON SAMPLE DATA OBSERVATIONS: BOX-COX METHOD

There are many situations in which the only information available for
determining the appropriate transformation is the sample data.
Residual plots may indicate the kind of transformation to be applied.
Several transformations may be tried, adopting the one that best meets
the normality criteria. The most appropriate transformation
can also be calculated empirically (Box and Cox, 1964).

Table 11. Transformations to correct for variance heterogeneity and approximate normality (Asuero and Martín Bueno, 2011)

Data type sy  f  y  Variance stabilizing transformation

Poisson (Count)*
y y
y0
Small counts
( y )**
y 1 or y  y 1
Binomial
y 1  y  a sin y
(0  y  1)
Negative 1  y 
3

binomial
1 y  2 y 
1  y  1  y 
1
2

2

 
 3  3
0  y 1 y
Variance = y ln y
(mean)2
y0
0.5 ln 1  y   ln 1  y  
Correlation 1
coefficient
1  y  1 1 y 2

* Modifications for the Poisson and binomial cases have been suggested by Freeman and Tukey
(105).
** It should be noted that the square root transformation overcorrects when very small values and
zeros appear in the original data. In these cases, √(y + 1) is often used as the transformation.

A collection of transformations characterized by one or a few selected
parameters (Weisberg, 2005; Chinn, 1996; Sakia, 1992) is called a family
of transformations.
Logarithmic, square root, and inverse transformations are contained in
the Tukey (1957) family (Table 12):

$$T = \begin{cases} y^{\lambda}, & \text{for } \lambda \neq 0 \\ \ln y, & \text{for } \lambda = 0 \end{cases} \qquad (21)$$

where λ ≤ 1. When λ → 0, y^λ → 1 (discontinuity at λ = 0), and the
transformation becomes meaningless (Draper and Smith, 1998). So that the
transformation runs smoothly as λ approaches zero, Box and Cox (1964)
made the following proposal (Chinn, 1996; Sakia, 1992; Peace, 1988;
Schlesselman, 1971):

$$W = \begin{cases} (y^{\lambda} - 1)/\lambda, & \text{for } \lambda \neq 0 \\ \ln y, & \text{for } \lambda = 0 \end{cases} \qquad (22)$$

which is identical in essence to Eqn. 21 in those cases in which the
regression model contains a constant term b0 (Peace, 1988).
Practically the same transformation was suggested by Kapteyn (1916;
1903) more than a century ago in his work on growth. Sclove (1972) gave a
test of λ = 0 (log y versus x). As the F statistic in the analysis of variance is
invariant under linear transformations (Malaeb, 1997), Eqns. 21 and 22 are
equivalent.
It is more convenient, however, to employ (Table 12)

(23)

being the geometric mean of the corresponding yi values

(24)

is a constant, evaluated at the beginning of the calculations by means

of the formula

(25)
As the value of λ varies, the W values may change drastically
(a disadvantage of Eqn. 22), which poses minor problems, a special program
being required to find the best λ value. It is then better to use Eqn. (23).
The in Eq. (23) is the nth power of the appropriate Jacobian of
the transformation (Mateu, 1997); the set of yi values are converted into the
Wi ones

$$J(\lambda, y) = \prod_{i=1}^{k} \frac{dy_i^{(\lambda)}}{dy_i} = \prod_{i=1}^{k} y_i^{\lambda - 1} \qquad (26)$$

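A more flexible extended power or shifted power transformation
family, which accommodates negative y values, was also proposed in the
Box and Cox (1964) paper:

$$y^{(\lambda)} = \begin{cases} \dfrac{(y + \lambda_2)^{\lambda_1} - 1}{\lambda_1}, & \text{if } \lambda_1 \neq 0 \\ \log(y + \lambda_2), & \text{if } \lambda_1 = 0 \end{cases} \qquad (27)$$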

In practice, λ2 is chosen so that y + λ2 > 0 for any y value. When a
shifted variance-stabilizing Box-Cox transformation is used, the variance
on the untransformed scale must be proportional to
(mean + λ2)^(2(1 − λ1)) (Chinn, 1996). Only approximate normality is to be
expected (Sakia, 1992), because the range of y^(λ) in Eqns. (22), (23) and
(27) is restricted according to whether λ is positive or negative. This
implies that the transformed values do not cover the entire range
(−∞, ∞) (a distribution with bounded support).
The idea behind the Box and Cox method (Draper and Smith, 1998) is
that, if a suitable λ value is found, an additive model with a normal,
independent and homogeneous error structure can then be fitted by the
maximum likelihood method.
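
The search for λ can be sketched as follows. This is a minimal illustration, not the program of any cited author: it assumes the usual geometric-mean-normalized form of the Box-Cox variable (as given in standard regression texts, and taken here to be the form intended for the normalized transformation), for which maximizing the likelihood is equivalent to minimizing the residual sum of squares of the straight-line fit. The data are synthetic, and the names geometric_mean, boxcox_normalized and residual_ss are hypothetical.

```python
import math


def geometric_mean(y):
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(v) for v in y) / len(y))


def boxcox_normalized(y, lam, gm):
    """Box-Cox variable scaled by the geometric mean gm (assumed form)."""
    if abs(lam) < 1e-12:
        return [gm * math.log(v) for v in y]
    return [(v ** lam - 1) / (lam * gm ** (lam - 1)) for v in y]


def residual_ss(x, z):
    """Residual sum of squares of the least-squares line z = a0 + a1*x."""
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxz = sum((a - mx) * (b - mz) for a, b in zip(x, z))
    a1 = sxz / sxx
    a0 = mz - a1 * mx
    return sum((b - a0 - a1 * a) ** 2 for a, b in zip(x, z))


# Synthetic data (assumption): exponential growth, so ln y is linear in x
x = list(range(1, 11))
y = [math.exp(0.3 * xi) for xi in x]

gm = geometric_mean(y)                      # computed once, before the search
grid = [i / 10 for i in range(-20, 21)]     # lambda from -2 to 2 in steps of 0.1
best_lambda = min(grid, key=lambda lam: residual_ss(x, boxcox_normalized(y, lam, gm)))

print("lambda minimizing the residual sum of squares:", best_lambda)
```

Because the normalized variables share a common scale, their residual sums of squares can be compared directly across λ values; for these noise-free data the logarithmic transformation (λ = 0) is recovered.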
Box-Cox transformation calculations are carried out by means of
computer programs (Huang et al., 1978; Chang, 1977). Statistical packages
including graphical facilities allow selecting power transformations
(Weisberg, 2005; Malaeb, 1997) of both the independent and dependent
variables. Multivariate Box-Cox transformation (Rode and Chinchilli,
1988) may be performed by means of the MULTTBXX program (the
univariate transformation being a special case).

Table 12. Values of certain power functions for transformations to

stabilize variances (Asuero and Martín Bueno, 2011)

 y W   y   1 / 

1 y y 1 y 1
0.5 y 2  
y 1

0 1(?) ln y
-0.5 1/ y 
2 1  1/ y 
-1 y 1 2 1  1/ y 

APPLICATIONS

Some examples of the application of data transformations and of the
Box-Cox transformation in analytical chemistry are compiled in Tables 13
and 14. An illustrative example taken from the scientific literature,
dealing with the equilibrium constant of a heterogeneous reaction and the
CO2 vapour pressure as a function of temperature, has been selected to
illustrate this chapter:

Equilibrium Constant for the Heterogeneous Reaction: CoTiO3 = Co + TiO2 + CO2

The dependence of the equilibrium constant on temperature has the
form:

$$\log K = \frac{\Delta S}{4.576} - \frac{\Delta H}{4.576}\,\frac{1}{T} \qquad (28)$$

where ΔS and ΔH are the entropy and enthalpy changes of the reaction.
This is an expression of the type

$$Y = a_0 + a_1 x \qquad (29)$$

with

$$Y = \log K, \quad a_0 = \frac{\Delta S}{4.576}, \quad a_1 = -\frac{\Delta H}{4.576}, \quad x = \frac{1}{T} \qquad (30)$$

Figure 3. Top: Plot of the measured equilibrium constant of the reaction as a
function of temperature. Middle: Plot of the transformed data (log Keq) as a function
of 1/T (T in K) (simple linear regression). Bottom: Residual analysis.

The results obtained on measuring, at five different temperatures, the
equilibrium constant for the reaction CoTiO3 = Co + TiO2 + CO2 are shown
in Table 15 and in Figure 3 (top). Figure 3 (middle) shows the straight
line fitted by simple linear regression to the transformed data, log Keq
as a function of 1/T (T in K), together with the residual analysis
(Figure 3, bottom).
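
The calculation behind Eqns. (28)-(30) can be sketched as below. The temperatures and thermodynamic values are hypothetical and serve only to generate noise-free data (they are not the measurements of Table 15); the factor 4.576 is taken as 2.303 R with R = 1.987 cal/(mol K), which is an assumption about the units implied by Eqn. (28); ols_line is an illustrative helper.

```python
def ols_line(x, y):
    """Ordinary least-squares line y = a0 + a1*x; returns (a0, a1)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    a1 = sxy / sxx
    return my - a1 * mx, a1


FACTOR = 4.576  # 2.303 * R, with R in cal/(mol K) (assumed units)

# Hypothetical values used only to build synthetic data
T = [900.0, 950.0, 1000.0, 1050.0, 1100.0]   # temperatures in K
dH_true, dS_true = 40000.0, 35.0             # cal/mol and cal/(mol K)
log_K = [dS_true / FACTOR - dH_true / (FACTOR * t) for t in T]

# Transformation of Eqn. (30): Y = log K against x = 1/T
a0, a1 = ols_line([1 / t for t in T], log_K)

dS = FACTOR * a0     # entropy change from the intercept
dH = -FACTOR * a1    # enthalpy change from the slope

print(f"dH = {dH:.1f} cal/mol, dS = {dS:.2f} cal/(mol K)")
```

With real measurements the residuals of this straight-line fit should be inspected, as in the bottom panel of Figure 3, before the slope and intercept are interpreted.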

Table 13. Some selected applications of transformations

Content Reference
A bilogarithmic hyperbolic cosine method for the evaluation of Beaumount et al.,
overlapping formation constants at varying (or fixed) ionic 2016
strength
Evaluation of three isotherm models (Langmuir, Freundlich, and Chen, 2015
Dubinin-Radushkevich) to correlate four sets of experimental
adsorption isotherm data, which were obtained by batch tests in
lab
Kinetics of Carbaryl Hydrolysis: An Undergraduate Hawker, 2015
Environmental Chemistry Laboratory
Feasibility study of potentiometric multisensor system of 18 ion- Yaroshenko et al.,
selective and cross-sensitive sensors as an analytical tool for 2015
determination of urine ionic composition
A novel multiple headspace extraction gas chromatographic Zhang and Chai,
method for measuring the diffusion coefficient of methanol in 2015
water and in olive oil
Adsorption Kinetics and Isotherms: A Safe, Simple, and Piergiovanni, 2014
Inexpensive Experiment for Three Levels of Students
Evaluation of Equilibrium Sorption Isotherm Equations: Datasets Chen, 2013
from literatures are selected and three two-parameter and three-
parameter equations were used to evaluate adsorption systems
Statistical Analysis of Linear and Non-linear Regression for the Osmari et al., 2013
Estimation of Adsorption Isotherm Parameters
Equlibrium sorption of the phosphoric acid modified rice husk: Dada et al., 2012
Langmuir, Freundlich, Temkin and Dubinin–Radushkevich
Isotherms Studies
Application of the van’t Hoff dependences in the characterization Denderz and
of molecularly imprinted polymers for some phenolic acids: Lehotay, 2012
Evaluation of the temperature effect on the sorption processes
investigated analytes in methanol and acetonitrile (porogen) as
mobile phases
An alternative analytical method for measuring the kinetic Heinzerling et al.,
parameters of the enzymes invertase and lactase 2012
Study of the pattern of formation of absorption signals for high Katskov et al., 2012
concentrations of analyte atoms in the absorption volume and to
employ the findings for High-resolution continuum source
electrothermal atomic absorption spectrometry data quantification
within a broad concentration range of the analyte
A comprehensive treatment of experimental enzyme kinetics Barton, 2011
strongly coupled to electronic data acquisition and use of
spreadsheets to organize data and perform linear and nonlinear
least-squares analyses
Chemical Dosing and First-Order Kinetics: Examples of multiple- Hladky, 2011
dose problems are presented that are appropriate for students
taking introductory, general, and physical chemistry courses
On the use of linearized pseudo-second-order kinetic equations El-Khaiary et al.,
for modeling adsorption systems 2010
Insights into the modeling of adsorption isotherm systems: Foo and
accuracy and consistency in parameters prediction or estimation Hameed, 2010
Introduce and compare numerical approaches that involve Markovic et al.,
different levels of knowledge about the noise structure of the 2010
analytical method used for initial and equilibrium concentration
determination
Polydimethylsiloxane-based permeation passive air sampler. Part Seethapathy and
II: Effect of temperature and humidity on the calibration constants Górecki, 2010
A simple competitive enzyme-linked immunosorbent assay Wang et al., 2010
(cELISA) was established for rapid measurement of secretory
immunoglobulin A (sIgA) in saliva
An equation relating the absorbance of the solute to the acidity Asuero, 2009
constants (pKa1 and pKa2) and pH is derived for weak diprotic
acids (diprotic bases and zwitterions)
Weighting Formulas for the Least-Squares Analysis of Binding Tellinghuisen, 2009c
Phenomena Data
A comprehensive study on the possibility of applying the nth- Cai et al., 2008
degree polynomial logistic regression model for fitting the kinetic
conversion data of cellulose pyrolysis
The Hill equation: a review of its capabilities in pharmacological Goutelle et al., 2008
modelling
Least-squares regression of adsorption equilibrium data: El-Khaiary, 2008
comparing the options
Evaluation of logistic and polynomial models for calibration Herman et al., 2008
curves spanning the quantitative concentration range for seven
different protein assays based on examination of residuals
Weighting and Transforming Data in Linear Regression 35

Content | Reference
Methods for studying reaction kinetics in gas chromatography, exemplified by using the 1-chloro-2,2-dimethylaziridine interconversion reaction | Krupčík et al., 2008
A bilogarithmic hyperbolic cosine method for the spectrophotometric evaluation of stability constants of 1:1 weak complexes is developed and applied to data found in the literature | Boccio et al., 2007
Examination of the limitations of using linearized Langmuir equations by fitting P sorption data collected on eight different soils with four linearized versions of the Langmuir equation and comparing goodness-of-fit measures and fitted parameter values with those obtained with the nonlinear Langmuir equation | Bolster and Hornberge, 2007
A review of existence criteria for parameter estimation of the Michaelis–Menten regression model | Jukic et al., 2007
Highlights some common errors of data evaluation that are frequently found in the literature | Badertscher and Pretsch, 2006
The general equation resulting from the logistic transformation is discussed considering the stoichiometric factors for monovalent anions, and the linearization of the theoretical fit to experimental data was checked for two real cases | Capitán-Vallvey et al., 2006
Alternative method to the Arrhenius equation for thermogravimetric analysis based on a logistic mixture model | Naya et al., 2006
Log-log transformation without weighting is the simplest model to fit the calibration data for the determination of piperaquine (PC) in urine | Singtoroj et al., 2006
A bilogarithmic hyperbolic cosine method for the spectrophotometric evaluation of stability constants of 1:1 weak complexes from continuous variation data is devised and applied to literature data | Sayago and Asuero, 2006

Table 14. Some selected papers on Box-Cox transformation

Content | Reference
Estimating Box-Cox power transformation parameter via goodness of fit tests. An artificial covariate method is also included for comparative purposes | Asar et al., 2017
Two strategies are proposed to extend and unify residual error modeling: a dynamic transform-both-sides approach combined with a power error model capable of handling skewed and/or heteroscedastic residuals, and a t-distributed residual error model allowing for symmetric heavy tails | Dosne et al., 2016
Models with Transformed Variables. Interpretation and Software | Boef et al., 2015

Table 14. (Continued)

Content | Reference
Overview of state-of-the-art dose-response analysis, both in terms of general concepts that have evolved and matured over the years and by means of concrete examples | Ritz et al., 2015
Optimization of sonochemical degradation of tetracycline in aqueous solution using a central composite design | Safari et al., 2015
New methodology for estimating λ and an alternative method of determining plausible values for it | Vélez et al., 2015
Experimental design and multiple response optimization. Using the desirability function in analytical methods development | Candioti et al., 2014
Design-based development of a stability-indicating RP-HPLC method for the simultaneous determination of parabens in pharmaceutical formulation | Roy and Chakrabarty, 2014
Occurrence of pharmaceuticals in urban wastewater of north Indian cities and risk assessment | Singh et al., 2014
Statistical Evaluation and Validation of Quantitative Methods of Drug Analysis | Komsta, 2013
A calibration-free/minimum approach, iterative optimization technology, which is used to predict (without calibration standards) the composition of a mixture while maintaining a similar predictability to calibration standard models | Muteki et al., 2013
Gaussian Quadrature is an efficient method for the back-transformation in estimating the usual intake distribution when assessing dietary exposure | Dekkers and Slob, 2012
CALUX measurements: Statistical inferences for the dose–response curve. Use of linear calibration functions based on Box–Cox transformations to overcome the issue of uncertainty assessment | Elskens et al., 2011
Statistical Data Analysis in data transformation: A practical guide | Meloun and Militký, 2011
A general equation is presented for modeling retention, using the organic modifier content of the mobile phase. The equation is based on the Box-Cox transform of modifier concentration | Komsta, 2010
Overview of traditional normalizing transformations and how Box-Cox incorporates, extends, and improves on these traditional approaches to normalizing data. Examples of applications are presented, and details of how to automate and use this technique are included | Osborne, 2010
Least-Squares Analysis of Phosphorus Soil Sorption Data with Weighting from Variance Function Estimation: A Statistical Case for the Freundlich Isotherm | Tellinghuisen, 2010

Content | Reference
Evaluation of the environmental contamination at an abandoned mining site using multivariate statistical techniques. The Box–Cox transformation has been used to transform the data set in normal form in order to minimize the non-normal distribution of the geochemical data | Bagur et al., 2009
A method for identifying relevant proteins from SIMCA discriminating powers is proposed, based on the Box-Cox transformation coupled to probability papers | Marengo et al., 2008
Application of differential permeation and Box–Cox transformation in the analysis of di-n-octyl disulfide in a straight oil metalworking fluid | Xu and Que Hee, 2006
The Box-Cox transformation applied to soil data improves sample symmetry and stabilizes spread; the logarithmic plot of a profile likelihood function enables the optimum transformation parameter to be found | Meloun et al., 2005

Table 15. Equilibrium constant for the heterogeneous reaction CoTiO3 = Co + TiO2 + CO2 (Spiridonov and Lopatkin, 1973)

T ºK | 955  | 1018 | 1031 | 1056 | 1083
Keq  | 3.30 | 3.00 | 2.96 | 2.81 | 2.82
     | 3.25 | 2.99 | 3.04 | 2.81 | 2.84
     | 3.29 | 3.00 | 2.99 | 2.84 | 2.78
     | 3.29 | 3.04 | 3.02 | 2.80 | 2.75
     | 3.34 | 3.02 | 3.01 | 2.82 | 2.80
     | 3.28 | 3.04 | 2.99 | 2.84 | 2.74
     | 3.33 | 3.03 | 2.99 | 2.94 | 2.71
     |      | 3.03 | 3.01 | 2.90 | 2.76
     |      | 3.08 | 3.00 | 2.92 | 2.87
     |      | 3.07 | 3.03 | 2.93 | 2.81
     |      | 3.07 | 2.98 | 2.90 | 2.78
     |      | 3.05 | 3.01 | 2.90 | 2.79
     |      | 3.07 |      |      | 2.78
     |      | 3.05 |      |      |
     |      | 3.02 |      |      |
     |      | 3.04 |      |      |
     |      | 3.05 |      |      |

Since the number of measurements differs among the $x_i$, the Bartlett test
is used to assess the homogeneity (homoscedasticity) of the variances:

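$$B_{exp} = \frac{1}{C}\left[\ln \bar{s}^2 \sum_{i=1}^{k} f_i - \sum_{i=1}^{k}\left(f_i \ln s_i^2\right)\right] \tag{31}$$

$$C = 1 + \frac{1}{3(k-1)}\left[\sum_{i=1}^{k}\frac{1}{f_i} - \frac{1}{\sum_{i=1}^{k} f_i}\right] \tag{32}$$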

If the variances are equal, $B_{exp}$ follows the chi-square distribution
($\chi^2$) with $k-1$ degrees of freedom, provided that each $f_i > 2$.
If $B_{exp} < \chi^2(k-1)$, the hypothesis is accepted. If, on the contrary,
$B_{exp} > \chi^2(k-1)$, the hypothesis is considered incompatible with the
experimental data obtained.
Since $B_{exp} = 13.3 > \chi^2_{0.05}(4) = 9.49$, the hypothesis of homogeneity of
the variances is rejected at the 5% level (Table 16), whereas at a significance
level of 1%, $\chi^2_{0.01}(4) = 13.3$, the hypothesis of homogeneity cannot be
rejected with certainty. In this case, the measurements are treated assuming the
worst case, that is, heteroscedasticity (another option is to repeat the
experiment), and WLR is performed (Table 17), with

$$s_{\bar{y}_i}^2 = \frac{s_{y_i}^2}{n_i}, \qquad w_i = \frac{c}{s_{\bar{y}_i}^2} \tag{33}$$

where $c$ ($= s_{PE}^2$) is an arbitrary constant that does not influence the final
results for $a_0$ and $a_1$, $s_{a_0}$ and $s_{a_1}$ (although it does affect the
magnitudes of $s_{y/x}$ and cov($a_0$, $a_1$)). If we follow the criterion of
Spiridonov and Lopatkin (1973) and make the sum of the weights equal to 1, we have

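$$\sum_i w_{y_i} = 1 = c \sum_i \frac{1}{s_{\bar{y}_i}^2}, \qquad c = \frac{1}{\sum_i \dfrac{1}{s_{\bar{y}_i}^2}}, \qquad w_{y_i} = \frac{1/s_{\bar{y}_i}^2}{\sum_i \left(1/s_{\bar{y}_i}^2\right)} \tag{34}$$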

The weighted residuals (Figure 4) form a rectangular band (rather than a
funnel), indicating a random distribution.

Table 16. Bartlett's test for homogeneity of variances

N                      | 7         | 17        | 12        | 12        | 13
Mean                   | 0.5181    | 0.4826    | 0.4775    | 0.4574    | 0.4451
SD                     | 4.003E-03 | 3.794E-03 | 3.214E-03 | 7.912E-03 | 6.647E-03
Variance s2            | 1.603E-05 | 1.439E-05 | 1.033E-05 | 6.260E-05 | 4.419E-05
Degrees of freedom (f) | 6         | 16        | 11        | 11        | 12
Sum f                  | 56
f * s2                 | 9.616E-05 | 2.303E-04 | 1.136E-04 | 6.886E-04 | 5.303E-04
(s2 mean)              | 2.962E-05
ln (s2 mean) * sum f   | -583.9083
ln s2                  | -11.0413  | -11.1489  | -11.4806  | -9.6787   | -10.0271
f * ln s2              | -66.2475  | -178.3821 | -126.2870 | -106.4655 | -120.3247
Sum (f * ln s2)        | -597.7068
1/f                    | 0.1667    | 0.0625    | 0.0909    | 0.0909    | 0.0833
Sum (1/f)              | 0.4943
B                      | 13.7985   | B = ln s2(mean) * sum fi - sum (fi * ln si2)
C                      | 1.0397    | C = 1 + [1/(3(k-1))] * [sum (1/f) - 1/(sum f)]
B/C                    | 13.272    | Chi2 (0.05, 4) = 9.488
B/C = 13.272 > 9.488: The hypothesis of homogeneity of s2 cannot be accepted at the 5%
level. With a significance level of 1% [Chi2(0.01,4) = 13.277]: The hypothesis cannot
be rejected with certainty.
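
The arithmetic of Table 16 can be retraced with a short script. The following is a minimal sketch, not the authors' own implementation: the function name `bartlett_statistic` is hypothetical, the inputs are the rounded variances and degrees of freedom printed in Table 16, and the critical values are copied from the table rather than computed. The pooled variance is assumed to be the f-weighted mean of the individual variances, which reproduces the "(s2 mean)" entry.

```python
import math

# Replicate variances of log Keq and their degrees of freedom (Table 16).
variances = [1.603e-05, 1.439e-05, 1.033e-05, 6.260e-05, 4.419e-05]
dof = [6, 16, 11, 11, 12]


def bartlett_statistic(variances, dof):
    """Return (B/C, B, C) following Eqs. (31) and (32)."""
    k = len(variances)
    f_total = sum(dof)
    # Pooled variance: f-weighted mean of the group variances (assumption).
    pooled = sum(f * s2 for f, s2 in zip(dof, variances)) / f_total
    b = f_total * math.log(pooled) - sum(
        f * math.log(s2) for f, s2 in zip(dof, variances)
    )
    c = 1.0 + (sum(1.0 / f for f in dof) - 1.0 / f_total) / (3.0 * (k - 1))
    return b / c, b, c


b_exp, b, c = bartlett_statistic(variances, dof)

# Critical chi-square values for 4 degrees of freedom, taken from Table 16.
chi2_05, chi2_01 = 9.488, 13.277

print(f"B = {b:.4f}, C = {c:.4f}, B/C = {b_exp:.3f}")
print("homogeneity rejected at 5%:", b_exp > chi2_05)
print("homogeneity rejected at 1%:", b_exp > chi2_01)
```

Because the inputs are rounded, the last digits may differ slightly from those in Table 16.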

Table 17. Equilibrium constant of the reaction as a function of the
temperature: log Keq=a0+a1·(1/T) (WLR)

N 1/T (ºK) log Keq s2 (log Keq) s2/N w

7 8.142E-04 0.5181 1.603E-05 2.290E-06 4.368E+05
17 7.745E-04 0.4826 1.439E-05 8.466E-07 1.181E+06
12 7.668E-04 0.4775 1.033E-05 8.607E-07 1.162E+06
12 7.524E-04 0.4574 6.260E-05 5.217E-06 1.917E+05
13 7.374E-04 0.4451 4.419E-05 3.399E-06 2.942E+05
3.845E-03 2.3807 3.266E+06
w(norm) w(i)*x(i) w(i)*y(i) w(i)*x(i)^2 w(i)*y(i)^2 w(i)*x(i)*y(i)
0.1337 1.089E-04 0.06929 8.867E-08 0.03590 5.642E-05
0.3617 2.801E-04 0.17456 2.170E-07 0.08424 1.352E-04
0.3558 2.728E-04 0.16987 2.092E-07 0.08111 1.303E-04
0.0587 4.416E-05 0.02685 3.322E-08 0.01228 2.020E-05
0.0901 6.643E-05 0.04009 4.898E-08 0.01785 2.957E-05
1.0000 7.724E-04 0.48067146 5.970E-07 0.231383438 3.716E-04
S(XX)= 3.809E-10 a1= 937.24026 s(y/x)= 0.001123407
S(XY)= 3.570E-07 a0= -0.24328 s(a1)= 57.5606
S(YY)= 3.384E-04 R= 0.99439 s(a0)= 0.0445
cov(a0,a1)= -2.5592
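
The weighted fit summarised in Table 17 can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' spreadsheet: it uses the rounded values of 1/T, mean log Keq, variance and N printed in Tables 16 and 17, takes the weights as n_i/s2_i normalised to sum to 1, and uses centred weighted sums for the slope and intercept. Variable names are hypothetical.

```python
import numpy as np

# Data from Tables 16 and 17 (rounded as printed).
n = np.array([7, 17, 12, 12, 13])                                      # replicates
x = np.array([8.142e-04, 7.745e-04, 7.668e-04, 7.524e-04, 7.374e-04])  # 1/T
y = np.array([0.5181, 0.4826, 0.4775, 0.4574, 0.4451])                 # mean log Keq
s2 = np.array([1.603e-05, 1.439e-05, 1.033e-05, 6.260e-05, 4.419e-05])  # var(log Keq)

# Weights: reciprocal variance of each mean, normalised so that sum(w) = 1.
w = n / s2
w = w / w.sum()

# Weighted means and centred weighted sums of squares and cross-products.
x_bar = np.sum(w * x)
y_bar = np.sum(w * y)
s_xx = np.sum(w * (x - x_bar) ** 2)
s_xy = np.sum(w * (x - x_bar) * (y - y_bar))

a1 = s_xy / s_xx          # slope
a0 = y_bar - a1 * x_bar   # intercept

print("normalised weights:", np.round(w, 4))
print(f"a1 = {a1:.2f}, a0 = {a0:.5f}")
```

The centred form is used because S(XX) is a small difference of nearly equal numbers, so the uncentred sums lose precision when the inputs carry only four significant figures.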

$$F_{exp} = \frac{s_{LOF}^2}{s_{PE}^2} = \frac{1.2604 \times 10^{-6}}{3.0621 \times 10^{-7}} = 4.12, \qquad F_{0.05}(3,42) = 2.83, \qquad F_{0.01}(3,42) = 4.29 \tag{35}$$

The linearity hypothesis cannot be accepted for α = 0.05, nor rejected
for α = 0.01. Since non-linearity is unlikely in the studied T interval, this
result is attributed to the insufficient accuracy of the experimental
data.
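The weights that depend on the transformation can also be taken into account:

$$w_t = \frac{1}{\left(\dfrac{\partial \log K}{\partial K}\right)^2} = \frac{1}{\left(\dfrac{1}{\ln 10}\dfrac{\partial \ln K}{\partial K}\right)^2} = (\ln 10)^2 K^2 \tag{36}$$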
which are combined with those estimated from the replicate measurements. In this case:

$$Y = (-0.2276 \pm 0.0365) + (916.8915 \pm 46.8490)\,x, \qquad R = 0.9961 \tag{37}$$

Figure 4. Weighted residuals versus inverse of the temperature; log Keq=f(1/T ºK).

The CO2 Vapour Pressure versus Temperature Case

Physical magnitudes do not always vary linearly, although it is often
possible to find a coordinate transformation (Asuero and Bueno, 2011) that
converts non-linear data into linear ones. The vapour pressure (in
atmospheres) of liquid CO2 as a function of temperature (in kelvin) is not
linear (Figure 5). There is a theoretical justification (the
Clausius-Clapeyron equation) that allows the vapour pressure (P) versus
absolute temperature (T) data to be fitted to an equation of the form:

$$\ln P = A + \frac{B}{T} \tag{38}$$

Setting Y = ln P and X = 1/T gives the linear form:

Y = A + B·X (39)

Figure 5. CO2 vapor pressure data as a function of temperature.


This requires a transformation of the data. After making the
appropriate transformations (Table 18), the resulting graph (Figure 6,
continuous line) is examined. Are the data linear? The statistics and the
graph appear to be fine, and there are no obvious reasons to suspect a
problem with this analysis.

Figure 6. Plot of ln P versus 1/T (Clausius-Clapeyron equation) including residuals
analysis.

However, the residuals and the least-squares line (Y = A + BX) fitted to
the data can be combined in the same plot for checking purposes. The fit
(Figure 6) gives a correlation coefficient of 0.99998876. This almost
perfect fit is in fact very poor if attention is paid to the pattern of the
residuals [+ + - - - - + + + + + + - -]. Systematic deviations can indicate
either a systematic error in the experiment (which cannot be tested, since
the details of the measurements are not known) or, as it turns out in this
case, the use of an incorrect or inadequate model. The Clausius-Clapeyron
equation does not exactly represent the vapor pressure data over a wide
temperature range. Results similar to those
shown in Figure 6 are obtained by using a transformation-dependent weighting
factor (WLR) (Asuero and Bueno, 2011; Asuero and González, 1989, 2007;
de Levie, 1986, 2012):

$$w_i = \frac{1}{\left(\dfrac{\partial Y_i}{\partial y_i}\right)^2} = \frac{1}{\left(\dfrac{\partial \ln P_i}{\partial P_i}\right)^2} = P_i^2 \tag{40}$$

Table 18. CO2 vapor pressure data versus temperature using the Clausius-
Clapeyron equation (ln P = A + B/T) (Nogle, 1993).

T P 1/T ln P ln P est Residual

216.550 5.110 0.005 1.631 1.637 -0.006
222.050 6.444 0.005 1.863 1.864 -0.001
227.606 8.043 0.004 2.085 2.083 0.002
233.161 9.921 0.004 2.295 2.291 0.003
238.717 12.099 0.004 2.493 2.490 0.003
244.272 14.623 0.004 2.683 2.679 0.003
249.828 17.508 0.004 2.863 2.860 0.002
255.383 20.788 0.004 3.034 3.034 0.001
260.939 24.510 0.004 3.199 3.199 0.000
266.494 28.702 0.004 3.357 3.358 -0.001
272.050 33.397 0.004 3.508 3.511 -0.002
277.606 38.636 0.004 3.654 3.657 -0.003
283.161 44.475 0.004 3.795 3.798 -0.003
288.717 50.939 0.003 3.931 3.933 -0.002
294.272 58.070 0.003 4.062 4.063 -0.001
299.828 65.916 0.003 4.188 4.188 0.000
304.161 72.768 0.003 4.287 4.283 0.005
a1 -1988.945 10.822 a0
s(a1) 1.721 0.007 s(a0)
R2 1.000 0.003 s(y/x)
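
The fit in Table 18 and the check on the residual pattern can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' calculation: the T and P values are the four-decimal figures printed in Table 19, `numpy.polyfit` stands in for whatever spreadsheet routine was used, and counting sign runs is only one simple way to expose a systematic residual pattern. The weighted variant applies Eq. (40); since `numpy.polyfit` multiplies the residuals by its `w` argument before squaring, passing `w=P` corresponds to weights of P squared.

```python
import numpy as np

# CO2 vapour pressure data (T in K, P in atm) as printed in Table 19.
T = np.array([216.5500, 222.0500, 227.6056, 233.1611, 238.7167, 244.2722,
              249.8278, 255.3833, 260.9389, 266.4944, 272.0500, 277.6056,
              283.1611, 288.7167, 294.2722, 299.8278, 304.1611])
P = np.array([5.1102, 6.4439, 8.0430, 9.9211, 12.0985, 14.6230,
              17.5082, 20.7880, 24.5101, 28.7017, 33.3968, 38.6364,
              44.4747, 50.9390, 58.0702, 65.9159, 72.7681])

x = 1.0 / T      # X = 1/T
y = np.log(P)    # Y = ln P

# Unweighted straight-line fit, Y = A + B X.
B, A = np.polyfit(x, y, 1)
residuals = y - (A + B * x)

# Count runs of equal sign: few runs point to a systematic pattern.
signs = np.sign(residuals)
runs = 1 + int(np.sum(signs[1:] != signs[:-1]))

# Weighted fit with w_i = P_i**2 (Eq. 40); polyfit expects sqrt(weights).
B_w, A_w = np.polyfit(x, y, 1, w=P)

print(f"unweighted: A = {A:.3f}, B = {B:.3f}, sign runs = {runs}")
print(f"weighted:   A = {A_w:.3f}, B = {B_w:.3f}")
print("residual signs:", "".join("+" if s > 0 else "-" for s in signs))
```

With 17 points, randomly scattered residuals would be expected to change sign far more often than a handful of times, so a small run count supports the conclusion that the model, not the data, is at fault.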

The error lies not in the data, but in the model. We must try to improve
the latter. A more general form of the equation is:

ln P = A + B/T + C ln T + D T (41)

The results obtained in this latter case (analysis by multiple linear
regression) are much better (Table 19, Figure 7) than those obtained with the
linear equation, with the residuals being distributed randomly. The
standard deviation of the regression suggests that ln P can be calculated
to within about 0.001, which corresponds to about 0.1% in P. Another
advantage is that T is used as a variable, rather than its inverse, so
interpolations become somewhat easier to calculate.

Table 19. CO2 vapor pressure data versus temperature using the equation
ln P = A + B/T + C·ln T + D·T (Nogle, 1993)

T P ln P 1/T ln T T ln P est y - yest

216.5500 5.1102 1.6312 0.0046 5.3778 216.5500 1.6311 0.000131641
222.0500 6.4439 1.8631 0.0045 5.4029 222.0500 1.8633 -0.000153221
227.6056 8.0430 2.0848 0.0044 5.4276 227.6056 2.0848 -2.3707E-05
233.1611 9.9211 2.2947 0.0043 5.4517 233.1611 2.2945 0.000162525
238.7167 12.0985 2.4931 0.0042 5.4753 238.7167 2.4934 -0.000306487
244.2722 14.6230 2.6826 0.0041 5.4983 244.2722 2.6825 0.000133521
249.8278 17.5082 2.8627 0.0040 5.5208 249.8278 2.8626 7.06843E-05
255.3833 20.7880 3.0344 0.0039 5.5428 255.3833 3.0346 -0.000193665
260.9389 24.5101 3.1991 0.0038 5.5643 260.9389 3.1991 -1.01222E-05
266.4944 28.7017 3.3570 0.0038 5.5854 266.4944 3.3568 0.000143651
272.0500 33.3968 3.5085 0.0037 5.6060 272.0500 3.5083 0.000149554
277.6056 38.6364 3.6542 0.0036 5.6262 277.6056 3.6541 7.43584E-05
283.1611 44.4747 3.7949 0.0035 5.6460 283.1611 3.7947 0.000204959
288.7167 50.9390 3.9306 0.0035 5.6654 288.7167 3.9305 8.55518E-05
294.2722 58.0702 4.0617 0.0034 5.6845 294.2722 4.0620 -0.00034836
299.8278 65.9159 4.1884 0.0033 5.7032 299.8278 4.1895 -0.001079349
304.1611 72.7681 4.2873 0.0033 5.7176 304.1611 4.2863 0.000965059
a3 = 0.0354 | a2 = -18.3653 | a1 = -4353.3367 | a0 = 112.8273
s(a3) = 0.0014 | s(a2) = 0.7395 | s(a1) = 94.8033 | s(a0) = 4.1040
R2 = 1.0000 | s(y/x) = 0.0004
F = 19138738.4636 | df = 13.0000
SS(REG) = 11.2081 | SS(RES) = 0.0000
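
The multiple linear regression behind Table 19 can be sketched with an ordinary least-squares solve. This is an illustration under stated assumptions rather than the authors' procedure: the data are the rounded values printed in Table 19, `numpy.linalg.lstsq` replaces the spreadsheet routine, and the coefficient order (constant, 1/T, ln T, T) is chosen here for readability. Because the four regressors are nearly collinear over this temperature range, the fitted coefficients are sensitive to rounding, so only the residual standard deviation is compared with the table.

```python
import numpy as np

# CO2 vapour pressure data (T in K, P in atm) as printed in Table 19.
T = np.array([216.5500, 222.0500, 227.6056, 233.1611, 238.7167, 244.2722,
              249.8278, 255.3833, 260.9389, 266.4944, 272.0500, 277.6056,
              283.1611, 288.7167, 294.2722, 299.8278, 304.1611])
P = np.array([5.1102, 6.4439, 8.0430, 9.9211, 12.0985, 14.6230,
              17.5082, 20.7880, 24.5101, 28.7017, 33.3968, 38.6364,
              44.4747, 50.9390, 58.0702, 65.9159, 72.7681])

y = np.log(P)

# Design matrix for ln P = A + B/T + C ln T + D T (Eq. 41).
X = np.column_stack([np.ones_like(T), 1.0 / T, np.log(T), T])

coef, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ coef

dof = len(T) - X.shape[1]                       # 17 points - 4 parameters
s_yx = np.sqrt(np.sum(residuals ** 2) / dof)    # residual standard deviation

for name, value in zip(("A", "B", "C", "D"), coef):
    print(f"{name} = {value:.4f}")
print(f"df = {dof}, s(y/x) = {s_yx:.5f}")
```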

Figure 7. CO2 vapor pressure as a function of temperature according to the expanded
model ln P = A + B/T + C ln T + D T.

CONCLUSION

Linear regression is probably one of the most important problems in
data analysis (Giloni et al., 2006). However, problems arise in applying
regression analysis, usually associated with unfamiliarity with its
mathematical and statistical aspects. Heteroscedasticity (irregular,
heterogeneous or non-constant variance), when ignored, leads to inadequate
estimation and inference. Data transformation and weighted least squares
regression can be used to solve this problem.
Data of varying quality are easily handled with weighted least squares.
Weights stem directly from the least squares criterion, i.e., the likelihood
function, which requires the variance to be known and independent of the
model parameters, and the weights to equal the reciprocal of the variance
(Seber, 2003; Rawlings et al., 1998; Thompson, 1982; Draper and Smith, 1965;
Williams, 1959). In real applications weights are almost never exactly
known and estimates must be used instead (Seber, 2003; Williams, 1959;
Engineering Statistics Handbook), which is a common drawback.
Theoretical models (Danzer and Currie, 1998) or statistical tests (Sayago
and Asuero, 2004; Penninckx et al., 1996) help in deciding whether or not
to weight. As an alternative to replication, we may assume that a smooth
variance function accounts for the heteroscedasticity (Tellinghuisen, 2005a;
Davidian, 1990; Carroll and Ruppert, 1988).
Restoring linearity (linearly transformable models) may be the aim of a
response transformation (Rawlings, 1998; Meloun and Militký, 1994),
though sometimes variance stabilization or normality is pursued.
Heterogeneous variance may be removed (Natrella, 1963) as an additional
benefit besides linearity. In practice, however, it is difficult to find one
simple transformation that simultaneously satisfies different criteria
(Draper and Hunter, 1969; Bartlett, 1947). A general scientific theory,
some possible distributional assumptions, or an empirical plot of the
data (Weisberg, 2005; AMC, 1994) often serves as the basis for these
simple transformations (logarithmic, power, root). The adequacy of the
transformed data model has to be checked before proceeding.
Box and Cox (1964) derived a general method for choosing
transformations of the response, valid in both simple and multiple linear
regression (Li and Moor, 2002; Lee et al., 1999). The original Box-Cox
paper has been a fruitful source of inspiration (Andersen et al., 1999;
Mateu, 1997; Chinn, 1996; Sakia, 1992) that has generated as much theoretical
work as practical applications. The presence of outliers and heteroscedasticity
(Zarembka, 1974) adversely affects the robustness of the Box-Cox method.
However, many advantages (Logothetis, 1990) are ascribed to this method:
i) complete use of the information contained in the data at hand; ii)
assurance of the validity of simple and normality assumptions; iii) shorter
confidence interval for λ; and iv) treatment in the absence of replication
(Box and Meyer, 1986).
Keeping the model as simple as possible (Rawlings et al., 1998; Bates and
Watts, 1988; Garfinkel and Fegley, 1984) is a golden rule in science
because it facilitates the understanding of the system as well as the
communication of results. FDA guidelines for bioanalytical method
validation follow this direction (Singtoroj et al., 2006) when they state that “the
selection of weighting and use of a complex regression equation should be
justified.”

REFERENCES

Aarons, L., Toon, S., Rowland, M., (1987). Validation of assay
methodology used in pharmacokinetic studies. J. Pharmacol. Methods
17, 337-346.
Acton, F.S., (1959). Analysis of Straight Line Data. New York, USA:
Wiley.
Almeida, A.M., Castel-Branco, M.M., Falcao, A.C., (2003). Linear
regression for calibration lines revisited: weighting schemes for
bioanalytical methods. J. Chromatogr. B 774, 215-222.
AMC, (1994). Is my calibration linear? Analyst 119(11), 2363-2366.
Anderson K.P., Snow, R.L., (1967). A relative deviation, least squares
method of data treatment. J. Chem. Educ. 44, 756-757.
Asar, Ö., İlk, Ö., Dağ, O., (2017). Estimating Box-Cox power
transformation parameter via goodness of fit tests. Commun. Stat.
Simul. Comput. 46(1), 91-105.
Asnin, L.D., (2016). Peak measurement and calibration in chromatographic
analysis. Trends Anal. Chem. 81, 51-62.
Asuero, A.G., González, G., de Pablos, F., Gomez Ariza, J.L., (1988).
Determination of the optimum working range in spectrophotometric
procedures. Talanta 35, 531-537.
Asuero A.G., González, A.G., (1989). Some observations of fitting a
straight line to data. Microchem. J. 40, 216-225.
Asuero, A.G., Sayago, A., Gonzalez, A., (2006). The correlation
coefficient: an overview. Crit. Rev. Anal. Chem. 36, 1-19.
Asuero, A.G., González G., (2007). Fitting Straight Lines with Replicated
Observations by Linear Regression. III. Weighting Data. Crit. Rev.
Anal. Chem. 37, 143-172.

Asuero, A.G., (2009). A Hyperbolic Sine Procedure for the
Spectrophotometric Evaluation of Acidity Constants for Two-Step
Overlapping Equilibria. J. Anal. Chem. 64, 1026-1030.
Asuero, A.G., Martín Bueno, J., (2011). Fitting Straight Lines with
Replicated Observations by Linear Regression. IV. Transforming Data.
Crit. Rev. Anal. Chem. 41, 36-69.
Badertscher, M., Pretsch, E., (2006). Bad results from good data. Trends
Anal. Chem. 25, 1131-1138.
Bagur, M.G., Morales, S., López-Chicano, M., (2009). Evaluation of the
environmental contamination at an abandoned mining site using
multivariate statistical techniques—The Rodalquilar (Southern Spain)
mining district. Talanta 80, 377-384.
Barnet, V., (2004). Environmental Statistical Methods and Applications.
New York, USA: Wiley; 161-173.
Bartlett, M.S., (1947). The use of transformations. Biometr. 3(1), 39-52.
Barton, J.S., (2011). A Comprehensive Enzyme Kinetic Exercise for
Biochemistry. J. Chem. Educ. 88, 1336-133.
Bates, D.M., Watts, D.G., (2007). Nonlinear Regression Analysis and its
Applications. New York, USA: Wiley.
Baumann, K., Wätzig, H., (1995). Appropriate calibration functions for
capillary electrophoresis. II. Heterocedasticity and its consequences. J.
Chromatogr. A 700, 9-20.
Baumann, K., (1997). Regression and calibration for analytical separation
techniques. Part II. Validation, weighted and robust regression.
Process Contr. Quality, 10, 75-112.
Bayne, C.K., Rubin, I.B., (1986). Practical Experimental Designs and
Optimization Methods for Chemists. Deerfield Beach, FL, USA: VCH
Publishers, pp. 61-62.
Beaumont, S., Martin, J., Asuero, A.G., (2016). A Potentiometric
Evaluation of Stability Constants of Two-Step Overlapping Equilibria
via a Bilogarithmic Hyperbolic Cosine Method. J. Anal. Sci. Methods
Instrum. 6, 33-43.
Belloto J.R.J., Sokolovski, T.D., (1985). Residual analysis in regression.
Am. J. Pharm. Educ. 49, 295-303.

Boccio, M., Asuero, A.G., Sayago, A., (2007). Spectrophotometric
Evaluation of Stability Constants of 1:1 Weak Complexes from Mole
Ratio Data Using the Bilogarithmic Hyperbolic Cosine Method. J.
Anal. Chem. 62, 840-844.
Boef, A.G.C., le Cessie, S., Dekkers, O.M., (2015). Models with
Transformed Variables. Interpretation and Software. Epidemiol. 26,
16-17.
Bolster, C.H., Hornberge, G.M., (2007). On the Use of Linearized
Langmuir Equations. Nutrient Manag. Soil Plant Anal. 71, 1796-1806.
Boumans, P.W.J.M., McKenna, R.J., Bosveld, M., (1981). Analysis of the
limiting noise and identification of some factors that dictate the
detection limits in a low-power inductively coupled argon plasma
system. Spectrochim. Acta B 36, 1031-1058.
Box, G.E.P., Cox, D.R., (1964). An analysis of transformations. J. Royal
Stat. Soc. Ser. B 26 (2), 211-252.
Box, G.E.P., Draper, N.R., (1987). Empirical-Model-Building and
Response Surfaces. New York, USA: Wiley.
Brasil, B., Bettencourt da Silva, R.J.N., Camões, M.F.G.F.C., Salgueiro,
P.A.S., (2013). Weighted calibration with reduced number of signals
by weighing factor modelling: Application to the identification of
explosives by ion chromatography. Anal. Chim. Acta 804, 287-295.
Brownlee, K.A., (1984). Statistical Theory and Methodology in Science
and Engineering. 2nd ed., Malabar, FL: Robert E. Krieger.
Brüggemann, L., Wennrich, R., (2011). Application of a special in-house
validation procedure for environmental–analytical schemes including
a comparison of functions for modelling the repeatability standard
deviation. Accred. Qual. Assur. 16, 89-97.
Bubert, H., Klockenkämper, R., (1983). Precision-dependent calibration in
instrumental analysis. Fres. Zeitschrif Anal. Chem. 316, 186-193.
Bysouth, S.R., Tyson, J.F., (1986). A comparison of curve fitting
algorithms for flame absorption spectrometry. J. Anal. At. Spectrosc.
1(1), 85-87.

Cai, J., Liu, R., Sun, C., (2008). Logistic Regression Model for
Isoconversional Kinetic Analysis of Cellulose Pyrolysis. Energy Fuels
22, 867-870.
Canavos, G.C., (1984). Applied Probability and Statistical Methods.
Toronto, Canada: Little, Brown and Company.
Candioti, L.V., De Zan, M.M., Cámara, M.S., Goicoechea, H.C., (2014).
Experimental design and multiple response optimization. Using the
desirability function in analytical methods development. Talanta 124,
123-138.
Capitán-Vallvey, L.F., Arroyo-Guerrero, E., Fernández-Ramos, M.D.,
Cuadros-Rodríguez L., (2006). Logit linearization of analytical
response curves in optical disposable sensors based on coextraction for
monovalent anions. Anal. Chim. Acta 561, 156-163.
Carroll, R.J., Ruppert, D., (1988). Transformation and Weighting in
Regression. London, England: Chapman & Hall.
Chang, H.S., (1977). A computer program for Box-Cox transformation and
estimation technique. Econometrica 45(7), 1741.
Chen, C., (2013). Evaluation of Equilibrium Sorption Isotherm Equations.
Open Chem. Eng. J. 7, 24-44.
Chen, X., (2015). Modeling of Experimental Adsorption Isotherm Data.
Information 6, 14-22.
Chinn, S., (1996). Choosing a transformation. J. Appl. Stat. 23(4), 395-404.
Chow S.-C., Liu, J.-P., (1995). In Statistical Design and Analysis in
Pharmaceutical Sciences. New York, USA: Marcel Dekker.
Concheiro, M., Castaneto, M., Kronstrand, R., Huestis, M.A., (2015).
Simultaneous determination of 40 novel psychoactive stimulants in
urine by liquid chromatography-high resolution mass spectrometry and
library matching. J. Chromatogr. A 1397, 32-42.
Connors, K.A., (1987). Binding Constants, the Measurement of Molecular
Complex Stability. New York, USA: Wiley, 115.
Cornish-Bowden, A., (2014). Analysis and interpretation of enzyme kinetic
data. Perspect. Sci. 1, 121-125.
Crabbe, M.J.C., (1982). An enzyme-kinetic program for desk-top
computers. Comput. Biol. Med. 12 (4), 263-283.

Dada, A.O., Olalekan, A.P., Olatunya, A.M., Dada, O., (2012). Langmuir,
Freundlich, Temkin and Dubinin–Radushkevich Isotherms Studies of
Equilibrium Sorption of Zn2+ Unto Phosphoric Acid Modified Rice
Husk. IOSR J. Appl. Chem. 3, 38-45.
Daniel, C., Wood, F.S., (1980). Fitting Equations to Data: Computer
Analysis of Multifactor Data. 2nd ed., New York, USA: Wiley.
Danzer, K., Currie, L.A., (1998). IUPAC Guidelines for calibration in
analytical chemistry. Part 1. Fundamentals and single component
calibration. Pure Appl. Chem. 70, 993-1014.
Davidian, M., Haaland, P.D., (1990). Regression and calibration with non
constant error variance. Chemometr. Intell. Lab. Systems, 9, 231-248.
de Beer, J.O., Naert, C., Deconinck, E., (2012). The quality coefficient as
performance assessment parameter of straight line calibration curves in
relationship with the number of calibration points. Accred. Qual.
Assur. 17 (3), 265-274.
de Brito J.A.A., Chettle, D.R., (2009). Calibration of 109Cd KXRF
systems for in vivo bone lead measurements: weighted least-squares
regression with different weighting functions. Phys. Med. Biol. 54,
L45-L50.
de Brito, J.A.A., de Carvalho, M.L., Chettle, D.R., (2009). Calibration of
109Cd KXRF systems for in vivo bone lead measurements: the
guiding role of the assumptions for least-squares regression in practical
problem solving. Phys. Med. Biol. 54, 919-934.
De Galan L., van Dalen, H.P.J., Kornblum, G.R., (1985). Determination of
strongly curved calibration graphs in flame atomic absorption
spectrometry: comparison of manually drawn and computer calculated
graphs. Analyst 110, 323-329.
de Levie, R., (1986). When, why, and how to use weighted least squares. J.
Chem. Educ. 63, 10-15.
de Levie, R., (2000). Curve fitting least squares. Crit. Rev. Anal. Chem. 30,
59-74.
de Levie, R., (2001). How to Use Excel in Analytical Chemistry and in
General Scientific Data Analysis. Cambridge, England: Cambridge
University Press.

de Levie, R., (2004). Estimating precision in derived quantities. J. Chem.
Educ. 9(2), 80-88.
de Levie, R., (2012). Advanced Excel for Scientific Data Analysis. 3rd ed.,
Brunswick, Maine: Atlantic Academic.
Dekkers, A.L.M., Slob, W., (2012). Gaussian Quadrature is an efficient
method for the back-transformation in estimating the usual intake
distribution when assessing dietary exposure. Food Chem. Toxicol. 50,
3853-3861.
Deming, W.E., (1943). Statistical Adjustment of Data. New York, USA:
Dover.
Denderz, N., Lehotay, J., (2012). Application of the van’t Hoff
dependences in the characterization of molecularly imprinted polymers
for some phenolic acids. J. Chromatogr. A 1268, 44-52.
Desimoni, E., (1999). A program for the weighted linear least squares
regression of unbalanced response arrays. Analyst 124, 1191-1196.
Desimoni, E., Brunetti, B., (2009). About estimating the limit of detection
of heteroscedastic analytical systems. Anal. Chim. Acta 655, 30-37.
Dosne, A-G., Bergstrand, M., Karlsson, M.O., (2016). A strategy for
residual error modeling incorporating scedasticity of variance and
distribution shape. J. Pharmacokinet. Pharmacodyn. 43, 137-151.
Dowd, J.E., Riggs, D.S., (1965). A comparison of estimates of Michaelis-
Menten kinetic constants from various linear transformations. J. Biol.
Chem. 240 (2), 863-869.
Draper, N.R., Smith, H., (1998). Applied Regression Analysis. 3rd ed.,
New York, USA: Wiley.
du Toit, S.H.C., Steyn, A.G.W., Stumf, R.H., (1986). Graphical
Exploratory Data Analysis. New York, USA: Springer Verlag.
El-Khaiary, M.I., (2008). Least-squares regression of adsorption
equilibrium data: comparing the options. J. Haz. Mat. 158, 73-87.
El-Khaiary M.I., Malash G.F., Ho, Y-S., (2010). On the use of linearized
pseudo-second-order kinetic equations for modeling adsorption
systems. Desalination 257, 93-101.
Ellis K.J., Duggleby, R.G., (1978). What happens when data are fitted to
the wrong equation? Biochem. J. 171, 513-517.

Elmore, D.T., Kingston, A.E., Shields, D.B., (1963). The computation of
velocities and kinetic constants of reactions, with particular reference
to enzyme-catalysed processes. J. Chem. Soc. 2070-2078.
Elskens, M., Baston, D.S., Stumpf, C., Haedrich, J., Keupers, I., Croes, K.,
Denison, M.S., Baeyens, W., Goeyens, L., (2011). CALUX
measurements: Statistical inferences for the dose–response curve.
Talanta 85, 1966-1973.
Engineering Statistics Handbook 4.1.4.3. Weighted least squares
regression. http://www.itl.nist.gov/div898/handbook/pmd/section1/pmd143.htm.
EPA, (2000). Guidance for Data Quality Assessment. Practical Methods
for Data Analysis. EPA QA/G-9 QA00 Update, United States
Environmental Protection Agency, EPA/600/R-96/084, 4-42.
EURACHEM/CITAC Guide, (2000). Quantifying Uncertainty in
Analytical Chemistry, 2nd ed., http://www.measurementuncertainty.org/mu/guide/index.html.
FDA Home Page: “Guidance for Industry. Bioanalytical Method
Validation.” http://www.fda.gov/cder/guidance/index.htm.
Foo, K.Y., Hameed, B.H., (2010). Insights into Modeling of Adsorption
Isotherm Systems. Chem. Eng. J. 156, 2-10.
Gad, S.C., (1999). Statistics and Experimental Design for Toxicologists.
3rd ed., Boca Raton, FL: CRC Press, 49-51.
Garden, J.S., Mitchell, D.G., Mills, W.N., (1980). Nonconstant variance
regression techniques for calibration curve based analysis. Anal. Chem.
52, 305-307.
Garfinkel, D., Fegley, K.A., (1984). Fitting physiological models to data.
Am. J. Physiol. 246, R641-R650.
Giloni, A., Simonoff, J.S., Sengupta, B., (2006). Robust weighted LAD
regression. Comp. Stat. Data Anal. 50, 3124-3140.
Goutelle, S., Maurin, M., Rougier, F., Barbaut, X., Bourguignon, L.,
Ducher, M., Maire, P., (2008). The Hill equation: a review of its
capabilities in pharmacological modelling. Fundam. Clin. Pharmacol.
22, 633-648.
Gu, H., Liu, G., Wang, J., Aubry, A., Arnold, M.E., (2014). Selecting the
correct weighting factors for linear and quadratic calibration curves
with least-squares regression algorithm in bioanalytical LC-MS/MS
assays and impacts of using incorrect weighting factors on curve
stability, data quality, and assay performance. Anal. Chem. 86 (18),
8959-8966.
Hawker, D., (2015). Kinetics of Carbaryl Hydrolysis: An Undergraduate
Environmental Chemistry Laboratory. J. Chem. Educ. 92, 1531-1535.
Heinzerling, P., Schrader, F., Schanze, S., (2012). Measurement of
Enzyme Kinetics by Use of a Blood Glucometer: Hydrolysis of
Sucrose and Lactose. J. Chem. Educ. 89, 1582-1586.
Herman, R.A., Scherer, P.N., Shan, G., (2008). Evaluation of logistic and
polynomial models for fitting sandwich-ELISA calibration curves. J.
Immunolog. Methods 339, 245-258.
Heydorn, K., Anglow, T., (2002). Calibration uncertainty. Accred. Qual.
Assur. 7, 153-158.
Hladky, P.W., (2011). Chemical Dosing and First-Order Kinetics. J. Chem.
Educ. 88, 776-781.
Howarth, R., Thompson, M., (1976). Duplicate analysis in geochemical
practice. Part 2. Examination of the proposed method and examples of
its use. Analyst 101, 699-709.
Hoyle, M.H., (1973). Transformations—An introduction and a
bibliography. Int. Stat. Rev. 41(2), 203-223.
Huang, C.-L., Moon, L.C., Chang, H.S., (1978). A computer program
using the Box-Cox transformation technique for the specification of
functional form. Am. Stat. 32(4), 144.
Hughes, H., Hurley, P.W., (1987). Precision and accuracy of test methods
and the concept of K-factor in chemical analysis. Analyst 112, 1445-
1449.
Hwang, L.-J., (1994). Impact of variance function estimation in regression
and calibration. Methods Enzymol. 240, 150-170.
Ingle, Jr. J.D., Crouch, S.R., (1972). Evaluation of precision of quantitative
absorption spectrometric measurements. Anal. Chem. 44, 1375-1386.
Ingle, Jr. J.D., (1974). Precision of atomic absorption spectrometric
measurements. Anal. Chem. 46, 2161-2171.
ISO 5725, (1994). Accuracy (Trueness and Precision) of Measurement
Methods and Results. Part 2: Basic Methods for the Determination of
Repeatability and Reproducibility of a Standard Measurement Method
(ISO, Geneva).
ISO 11843-2, (2000). Capacity of Detection. Part 2. Metrology in the
Linear Calibration Case. ISO, Geneva.
Jain, R.B., (2010). Comparison of three weighting schemes in weighted
regression analysis for use in a chemistry laboratory. Clin. Chim. Acta
411, 270-279.
Johnson, K.J., (1980). Numerical Methods in Chemistry. New York, USA:
Marcel Dekker, 245.
Jukic, D., Sabo, K., Scitovski, R., (2007). A review of existence criteria for
parameter estimation of the Michaelis–Menten regression model. Ann.
Univ. Ferrara 53, 281-291.
Jurs, P., (1970). Weighted least squares curve fitting using functional
transformations. Anal. Chem. 42, 747-750.
Jurs, P.C., (1986). Computer Software Applications in Chemistry. New
York, USA: Wiley, 37-38.
Kapteyn, J.C., (1903). Skew Frequency Curves in Biology and Statistics.
Groningen, The Netherlands: P. Noordhoff.
Kapteyn, J.C., van Uven, M.J., (1916). Skew Frequency Curves in Biology
and Statistics. Groningen, The Netherlands: Hoitsema Brothers.
Katskov, D., Hlongwane, M., Heitmann, U., Florek, S., (2012). High-
resolution continuum source electrothermal atomic absorption
spectrometry: Linearization of the calibration curves within a broad
concentration range. Spectrochim. Acta Part B 71-72, 14-23.
Kemp, G.J., (1985). The susceptibility of calibration methods to errors in
the analytical signal. Anal. Chim. Acta 176, 229-237.
Kirkup, L., Mulholland, M., (2004). Comparison of linear and non-linear
equations for univariate calibration. J. Chromatogr. A 1029, 1-11.
Kleijburg, M.R., Pijpers, F.W., (1985). Calibration graphs in atomic-
absorption spectrophotometry. Analyst 110, 147-150.
Klockenkämper, R., Bubert, H., (1986). Improvement of precision in
spectrochemical analysis by correlation of intensity measurements.
Fres. Zeitschrift Anal. Chem. 323, 112-116.
Komsta, Ł., (2010). A new general equation for retention modeling from
the organic modifier content of the mobile phase. Acta
Chromatographica 22.
Komsta, Ł., (2013). Statistical Evaluation and Validation of Quantitative
Methods of Drug Analysis. Chapter 11. In: Thin Layer
Chromatography in Drug Analysis. Komsta, L., Waksmundzka-
Hajnos, M., Sherma, J., CRC Press, pp. 187-192.
Korany, M.A., Maher, H.M., Galal, S.M., Ragab, A.A., (2013).
Comparative study of some robust statistical methods: weighted,
parametric, and nonparametric linear regression of HPLC convoluted
peak responses using internal standard method in drug bioavailability
studies. Anal. Bioanal. Chem. 405 (14), 4835-4848.
Krupcik, J., Mydlova, J., Majek, P., Simon, P., Armstrong, D.W., (2008).
Methods for studying reaction kinetics in gas chromatography,
exemplified by using the 1-chloro-2,2-dimethylaziridine
interconversion reaction. J. Chromatogr. A 1186, 144-160.
Lavagnini, I., Favaro, G., Magno, F., (2004). Non-linear and nonconstant
variance calibration curves in analysis of volatile organic compounds
for testing of water by the purge-and-trap method coupled with gas
chromatography/mass spectrometry. Rapid Commun. Mass Spectrom.
18, 1383-1391.
Lavagnini, I., Favaro, G., Magno, F., (2005). Non-linear and non-constant
variance calibration curves in analysis of volatile organic compounds
for testing of water by the purge-and-trap method coupled with gas
chromatography/mass spectrometry. Rapid Comm. Mass Spectrom. 18,
1383-1391.
Lavagnini, I., Magno, F., (2007). A statistical overview of univariate
calibration, inverse regression, and detection limits: Application to gas
chromatography/mass spectrometry technique. Mass Spectrom. Rev.
26(1), 1-18.
Lavagnini, I., Urbani, A., Magno, F., (2011). Overall calibration procedure
via a statistically based matrix-comprehensive approach in the stir bar
sorptive extraction–thermal desorption–gas chromatography–mass
spectrometry analysis of pesticide residues in fruit-based soft drinks.
Talanta 83, 1754-1762.
Lee, J.C., Chen, D-T., Hung, H-N., Chen, J.J., (1999). Analysis of drug
dissolution data. Stat. Med. 18(7), 799-814.
Lee, J.-C., Ramsey, M.H., (2001). Modeling measurement uncertainty as a
function of concentration: an example from a contaminated land
investigation. Analyst 126, 1784-1791.
Leslie, D.S., Kohn, R., Nott, D.J., (2007). A general approach to
heteroscedastic linear regression. Stat. Comp. 17, 131-146.
Li, B.B., Moor, B., (2002). The general Box-Cox transformation in
multiple regression analysis. Commun. Stat. Simul. Comput. 31(4),
673-687.
Li, C., Liu, J., Di, D., Jiang, S., (2008). Analysis of Three Flavonoids in
Oxytropis kansuensis Bunge by RP-LC–DAD Coupled with Weighted
Least-Squares Linear Regression. Chromatographia 68, 773-779.
Logothetis, N., (1990). Box-Cox transformations and the Taguchi methods.
Appl. Stat. 39(1), 31-48.
Mager, P.P., (1991). Design Statistics in Pharmacochemistry. New York,
USA: Wiley, pp. 20-44.
Malaeb, Z.A., (1997). A SAS code to correct for non-normality and
nonconstant variance in regression and ANOVA models using the
Box-Cox method of power transformation. Environ. Monit. Assess.
47(3), 255-273.
Mansilha, C., Melo, A., Rebelo, H., Ferreira, I.M.P.L.V.O., Pinho, O.,
Domingues, V., Pinho, C., Gameiro, P., (2010). Quantification of
endocrine disruptors and pesticides in water by gas chromatography-
tandem mass spectrometry. Method validation using weighted linear
regression schemes. J. Chromatogr. A 1217 (43), 6681-6691.
Marengo, E., Robotti, E., Bobba, M., Righetti, P.G., (2008). Evaluation of
the Variables Characterized by Significant Discriminating Power in the
Application of SIMCA Classification Method to Proteomic Studies. J.
Proteom. Res. 7, 2789-2796.
Markovic, D.D., Lekic, B.M., Rajakovic-Ognjanovic, V.N., Onjia, A.E.,
Rajakovic, L.V., (2014). A New Approach in Regression Analysis for
Modeling Adsorption Isotherms. Sci. World J. 1-17.
Mateu, J., (1997). Methods of assessing and achieving normality applied to
environmental data. Environ. Manag. 21(5), 766-777.
McLean, A.M., Ruggirello, D.A., Banfield, C., Gonzalez, M.A., Bialer,
M., (1990). Application of a variance-stabilizing transformation
approach to linear regression of calibration lines. J. Pharm. Sci.
79(11), 1005-1008.
Meites, L., (1979). Some new techniques for the analysis and interpretation
of chemical data. Crit. Rev. Anal. Chem. 8, 1-53.
Meloun, M., Militký, J., Forina, M., (1992). Chemometrics for Analytical
Chemistry. Vol. 1: PC-Aided Statistical Data Analysis. New York,
USA: Ellis Horwood, pp. 71-77.
Meloun, M., Pluharová, M., (2000). Thermodynamic dissociation constants
of codeine, ethylmorphine and homatropine by regression analysis of
potentiometric titration data. Anal. Chim. Acta 416, 55-68.
Meloun, M., Hill, M., Militký, J., Kupka, K., (2003). Assessment of the
mean value of 17-hydroxypregnenolone in the umbilical blood of new-
borns by the exploratory analysis of biochemical data. Comp. Methods
Programs Biomed. 70(3), 187-197.
Meloun, M., Sanka, M., Nemec, P., Kritkova, S., Kupka, K., (2005). The
analysis of soil cores polluted with certain metals using the Box-Cox
transformation. Environ. Pollut. 137, 273-280.
Meloun, M., Militký, J., (2011). Statistical Data Analysis: A Practical
Guide. 1st Edition, Woodhead Publishing India, pp. 57-63.
Mermet, J-M., (2010). Calibration in atomic spectrometry: A tutorial
review dealing with quality criteria, weighting procedures and possible
curvatures. Spectrochim. Acta Part B 65, 509-523.
Miller-Ihli, N.J., O’Haver, T.C., Harnly, J.M., (1984). Calibration and
curve fitting for extended range AAS. Spectrochim. Acta 39B (2-3),
1603-1614.
Miller, J.N., Miller, J.C., (2010). Statistics and Chemometrics for
Analytical Chemistry. 6th ed., Harlow, England: Prentice-Hall.
Modamio, P., Lastra, C.F., Mariño, E.L., (1996). Determination of
analytical error function for β-blockers as a possible weighting method
for the estimation of the regression parameters. J. Pharm. Biomed.
Anal. 14, 401-408.
Mosteller, F., Tukey, J.W., (1977). Data Analysis and Regression: A
Second Course in Statistics. Reading, MA: Addison-Wesley.
Mullins, E., (2003). Statistics for the Quality Control Laboratory.
Cambridge, England: RSC.
Muteki, K., Blackwood, D.O., Maranzano, B., Zhou, Y., Liu, Y.A.,
Leeman, K.R., Reid, G.L., (2013). Mixture Component Prediction
Using Iterative Optimization Technology (Calibration-Free/Minimum
Approach). Ind. Eng. Chem. Res. 52, 12258-12268.
Nascimento, R.S., Froes, R.E.S., Silva, N.O.C., Naveira, R.L.P., Mendes,
D.B.C., Neto, W.B., Silva, J.B.B., (2010). Comparison between
ordinary least squares regression and weighted regression in the
calibration of metals present in human milk determined by ICP-OES.
Talanta 80 (3), 1102-1109.
Natrella, M.G., (1963). The use of transformations, Experimental
Statistics, National Bureau of Standards Handbook 91. Washington,
DC: NBS, Chapter 20, pp. 201-203.
Naya, S., Cao, R., de Ullibarri I.L., Artiaga, R., Barbadillo, F., García, A.,
(2006). Logistic mixture model versus Arrhenius for kinetic study of
material degradation by dynamic thermogravimetric analysis. J.
Chemometr. 20(3-4), 158-163.
Noblitt, S.D., Berg, K.E., Cate, D.M., Henry, C.S., Characterizing
nonconstant instrumental variance in emerging miniaturized analytical
techniques. Anal. Chim. Acta 915, 64-73.
Noggle, J.H., (1993). Practical Curve Fitting and Data Analysis: Software
and Self-Instructions for Scientists and Engineers. Chichester,
England: Horwood.
O’Connell, M.A., Belanger, B.A., Haaland, P.D., (1993). Calibration and
assay development using the four-parameter logistic model.
Chemometr. Intell. Lab. Systems 20, 97-114.
Olivieri, A.C., (2015). Practical guidelines for reporting results in single-
and multi-component analytical calibration: A tutorial. Anal. Chim.
Acta 868, 10-22.
Oppenheimer, L., Capizzi, T.P., Weppelman, R.M., Mehta, H., (1983).
Determining the lowest limit of reliable assay measurement. Anal.
Chem. 55, 638-643.
Osborne, J.W., (2010). Improving your data transformations: applying the
Box-Cox transformation. Pract. Assess. Res. Eval. 15, 1-9.
Osmari, T.A., Gallon, R., Schwaab, M., Barbosa-Coutinho, E., Baptista
Severo Jr. J., Pinto, J.C., (2013). Statistical Analysis of Linear and
Non-linear Regression for the Estimation of Adsorption Isotherm
Parameters. Adsorpt. Sci. Technol. 31, 433-458.
Pardue, H.L., Hewitt, T.E., Milano, J.N., (1974). Photometric errors in
kinetics and equilibrium analysis based on absorption spectroscopy.
Clin. Chem. 20, 1028-1042.
Peace, K.E., (1988). Biopharmaceutical Statistics for Drug Development.
New York, USA: Marcel Dekker, pp. 357-359.
Penninckx, W., Hartmann, D., Massart, D.L., Smeyers-Verbeke, J., (1996).
Validation of the calibration procedure in atomic absorption
spectrometric methods. J. Anal. At. Spectrom. 11, 237-246.
Pereira da Silva, C., Soares Emídio, E., Rodrigues de Marchi, M.R.,
(2015). Method validation using weighted linear regression models for
quantification of UV filters in water samples. Talanta 131, 221-227.
Phillips, L.J., Alexander, J., Hill, H.M., (1990). Quantitative
Characterization of Analytical Methods, in Analysis for Drugs and
Metabolites including Anti-infective Agents, E. Reid & I.D. Wilson,
eds., London, England: RSC, pp. 23-36.
Piergiovanni, P.R., (2014). Adsorption Kinetics and Isotherms: A Safe,
Simple, and Inexpensive Experiment for Three Levels of Students. J.
Chem. Educ. 91, 560-565.
Prudnikov, E.D., Shapkina, Y.S., (1984). Random errors in analytical
methods. Analyst 109, 305-307.
Rajakovic, L.V., Markovic, D.D., Rajakovic-Ognjanovic, V.N.,
Antanasijevic, D.Z., (2012). Review: The approaches for estimation of
limit of detection for ICP-MS trace analysis of arsenic. Talanta 102,
79-87.
Rawlings, J.O., Pantula, S.G., Dickey, D.A., (1998). Applied Regression
Analysis. A Research Tool. 2nd ed., New York, USA: Springer-Verlag.
Reich, J.G., Wangermann, G., Rohde, K., Falck, M., (1972). General
strategy for parameter estimation from isosteric and allosteric kinetic
data and binding measurements. Eur. J. Biochem. 26, 368-379.
Rios, S., (1977). Métodos Estadísticos. 2nd ed., Madrid, España: Ediciones
del Castillo.
Ritz, C., Baty, F., Streibig, J.C., Gerhard, D., (2015). Dose-Response
Analysis Using R. Plos One, 1-13.
Rocke, D.M., Lorenzato, S., (1995). A two-component model for
measurement error in analytical chemistry. Technometr. 37, 176-184.
Rocke, D.M., Durbin, B., Wilson, M., Kahn, H.D., (2003). Modeling
uncertainty in the measurement of low-level analytes in environmental
analysis. Ecotoxicol. Environ. Saf. 56, 78-92.
Rodbard, D., Frazier, G.R., (1975). Statistical Analysis of radioligand
assay. Methods Enzymol. 37, 3-22.
Rodbard, D., Lenox, R.H., Wray, H.L., Ramseth, D., (1976). Statistical
characterization of random errors in the radioimmunoassay dose-
response variable. Clin. Chem. 22, 350-358.
Rode, R.A., Chinchilli, V.M., (1988). The use of Box-Cox transformations
in the development of multivariate tolerance regions with applications
to clinical chemistry. Am. Stat. 42(1), 23-30.
Rothman, L.D., Crouch, S.R., Ingle, Jr, J. D., (1975). Theoretical and
experimental investigation of factors affecting precision in molecular
absorption spectrophotometry. Anal. Chem. 47, 1226-1233.
Roy, C., Chakrabarty, J., (2014). Quality by Design-Based Development of
a Stability-Indicating RP-HPLC Method for the Simultaneous
Determination of Methylparaben, Propylparaben, Diethylamino
Hydroxybenzoyl Hexyl Benzoate, and Octinoxate in Topical
Pharmaceutical Formulation. Sci. Pharm. 82, 519-539.
Rudnyi, E.B., (1996). Statistical model of systematic errors: linear error
model. Chemometr. Intell. Lab. Systems 34, 41-54.
Sadray, S., Rezaee, S., Rezakhah, S., (2003). Non-linear heteroscedastic
regression model for determination of methotrexate in human plasma
by high performance liquid chromatography. J. Chromatogr. B 787,
293-302.
Safari, G.H., Nasseri, S., Mahvi, A.H., Yaghmaeian, K., Nabizadeh, R.,
Alimohammadi, M., (2015). Optimization of sonochemical
degradation of tetracycline in aqueous solution using sono-activated
persulfate process. J. Environ. Health Sci. Eng. 13, 1-15.
Sakia, R.M., (1992). The Box-Cox transformation technique—A review.
Statistician 41(2), 169-178.
Sands, D.E., (1974). Weighting factors in least squares. J. Chem. Educ. 51,
473-474.
Santoyo, E., Guevara, M., Verma, S.P., (2006). Determination of
lanthanides in international geochemical reference materials by
reversed-phase high-performance liquid chromatography using error
propagation theory to estimate total analysis uncertainties. J.
Chromatogr. A 1118, 73-81.
Sayago, A., Asuero, A.G., (2004). Fitting straight lines with replicated
observations by linear regression: Part II. Testing for homogeneity of
variances. Crit. Rev. Anal. Chem. 34, 133-146.
Sayago, A., Boccio, M., Asuero, A.G., (2004). Fitting straight lines with
replicated observations by linear regression: the least squares
postulates. Crit. Rev. Anal. Chem. 34, 39-50.
Sayago, A., Asuero, A.G., (2006). Spectrophotometric evaluation of
stability constants of 1:1 weak complexes from continuous variation
data. Int. J. Pharm. 321, 94-100.
Schlesselman, J., (1971). Power families: A note on the Box and Cox
transformation. J. Royal Stat. Soc. B 33(2), 307-311.
Schwartz, L.M., Gelb, R.I., (1978). Statistical analysis of titration data.
Anal. Chem. 50, 1571-1576.
Sclove, S.L., (1972). (Y vs x) or (Log y vs x). Technometr. 14, 391.
Seber, G.A.F., (2003). Linear Regression Analysis. New York, USA:
Wiley.
Seethapathy, S., Górecki, T., (2010). Polydimethylsiloxane-based
permeation passive air sampler. Part II: Effect of temperature and
humidity on the calibration constants. J. Chromatogr. A 1217, 7907-
7913.
Shumway, R.H., Azari, R.S., Kayhanian, M., (2002). Statistical approaches
to estimating mean water quality concentrations with detection limits.
Environ. Sci. Technol. 36(15), 3345-3353.
Singh, K.P., Rai, P., Singh, A.K., Verma, P., Gupta, S., (2014). Occurrence
of pharmaceuticals in urban wastewater of north Indian cities and risk
assessment. Environ. Monit. Assess. 186, 6663-6682.
Singtoroj, T., Tarning, J., Annerberg, A., Ashton, M., Bergqvist, Y., White,
N. J., Lindegardh, N., Day, N.P.J., (2006). A new approach to evaluate
regression models during validation of bioanalytical assays. J. Pharm.
Biomed. Anal. 41, 219-227.
Smith, E.D., Mathews, D.M., (1967). Least squares regression lines:
Calculations assuming a constant percent error. J. Chem. Educ. 44,
757-759.
Sousa, J.A., Reynolds, A.M., Ribeiro, A.S., (2012). A comparison in the
evaluation of measurement uncertainty in analytical chemistry testing
between the use of quality control data and a regression analysis.
Accred. Qual. Assur. 17, 207-214.
Steliopoulos, P., Stickel, E., Haas, H., Kranz, S., (2006). Method validation
approach on the basis of a quadratic regression model. Anal. Chim.
Acta 572, 121-124.
Sun, X.-Y., Singh, H., Millier, B., Warren, C.H., Aue, W.A., (1994).
Noise, filters and detection limits. J. Chromatogr. A 687, 259-281.
’t Lam, R.U.E., (2010). Scrutiny of variance results for outliers: Cochran’s
test optimized. Anal. Chim. Acta 659, 68-84.
Tan, A., Awaiye, K., Trabelsi, F., (2014). Impact of calibrator
concentrations and their distribution on accuracy of quadratic
regression for liquid chromatography-mass spectrometry bioanalysis.
Anal. Chim. Acta 815, 33-41.
Tellinghuisen, J., (2001). Statistical error propagation. J. Phys. Chem. A
105(15), 3917-3921.
Tellinghuisen, J., (2005a). Statistical error in isothermal titration
calorimetry: variance function estimation from generalized least
squares. Anal. Biochem. 343, 106-115.
Tellinghuisen, J., (2005b). Understanding Least Squares through Monte
Carlo Calculations. J. Chem. Educ. 82, 157-166.
Tellinghuisen, J., (2007). Weighted least-squares in calibration: What
difference does it make? Analyst 132, 536-543.
Tellinghuisen, J., (2008a). Least squares with non-normal data: estimating
experimental variance functions. Analyst 133, 161-166.
Tellinghuisen, J., (2008b). Weighted least squares in calibration: The
problem with using “quality coefficients” to select weighting formulas.
J. Chromatogr. B 872, 162–166.
Tellinghuisen, J., (2009a). Least squares in calibration: weights,
nonlinearity, and other nuisances. Chapter 10. Methods Enzymol. 454,
259-285.
Tellinghuisen, J., (2009b). The least-squares analysis of data from binding
and enzyme kinetics studies: weights, bias, and confidence intervals in
usual and unusual situations. Chapter 10. Methods Enzymol. 467, 499-529.
Tellinghuisen, J., (2009c). Weighting Formulas for the Least-Squares
Analysis of Binding Phenomena Data. J. Phys. Chem. B 113, 6151-
6157.
Tellinghuisen, J., (2009d). Variance function estimation by replicate
analysis and generalized least squares: A Monte Carlo comparison.
Chemometr. Intell. Lab. Systems 99, 138-149.
Tellinghuisen, J., (2010a). Least-squares analysis of data with uncertainty
in x and y: A Monte Carlo methods comparison. Chemometr. Intell.
Lab. Systems 103, 160-169.
Tellinghuisen, J., (2010b). Least-Squares Analysis of Phosphorus Soil
Sorption Data with Weighting from Variance Function Estimation: A
Statistical Case for the Freundlich Isotherm. Environ. Sci. Technol. 44,
5029-5034.
Tellinghuisen, J., Bolster, C.H., (2011). Using R² to compare least-squares
fit models: when it must fail. Chemometr. Intell. Lab. Systems 105 (2),
220-222.
Tellinghuisen, J., (2015). Using Least Squares for Error Propagation. J.
Chem. Educ. 92, 864-870.
Teunissen, P.J.G., Amiri-Simkooei, A.R., (2008). Least-squares variance
component estimation. J. Geodesy, 82, 65-82.
Thompson, M., Howarth, R.J., (1973). Rapid estimation and control of
precision by duplicate determinations. Analyst 98, 153-160.
Thompson, M., (1976). Duplicate analysis in geochemical practice. Part 1.
Theoretical approach and estimation of analytical reproducibility.
Analyst 101, 690-698.
Thompson, M., (1978). Dupan 3, a subroutine for the interpretation of
analytical data in geochemical analysis. Comp. Geosci. 4, 333-340.
Thompson, M., Howarth, R., (1978). New approach to estimation of
analytical precision. J. Geochem. Explor. 9, 23-30.
Thompson, M., (1982). Regression methods in the comparison of accuracy.
Analyst 107, 1169-1180.
Thompson, M., (1988). Variation of precision with concentration in an
analytical system. Analyst 112, 1579-1587.
Thompson, M., (2007). Why are we weighting? Anal. Methods Committee.
AMC technical brief No 27.
Tomassone, R., Lesquoy, E., Miller, C., (1983). La Régression: nouveaux
regards sur une ancienne méthode statistique. Paris, France: Masson,
15, 38.
Tukey, J.W., (1977). Exploratory Data Analysis. Reading, MA: Addison-
Wesley.
van Loco, J., Hanot, V., Huysmans, G., Elskens, M., Degroodt, J.M.,
Beernaert, H., (2003). Estimation of the minimum detectable value for
the determination of PCBs in fatty food samples by GC-ECD: A
curvilinear calibration case. Anal. Chim. Acta 483(1-2), 413-418.
Vanatta, L.E., Coleman, D.E., (2007). Calibration, uncertainty, and
recovery in the chromatographic sciences. J. Chromatogr. A 1158, 47-
60.
Vélez, J.I., Correa, J.C., Marmolejo-Ramos, F., (2015). A new approach to
the Box–Cox transformation. Frontiers Appl. Mathemat. Stat. 1, 1-10.
Wang, D., Zhuo, J-Q., Zhao, M-P., (2010). A simple and rapid competitive
enzyme-linked immunosorbent assay (cELISA) for high-throughput
measurement of secretory immunoglobulin A (sIgA) in saliva. Talanta
82, 432-436.
Watters, R.L., Carroll, R.J., Spiegelman, C.H., (1988). Heteroscedastic
calibration using analyzed reference materials as calibration standards.
J. Res. Natl. Bur. Stand. (U.S.) 93, 264-265.
Weisberg, S., (2005). Applied Linear Regression. 3rd ed., New York, USA:
Wiley.
Williams, E.J., (1959). Regression Analysis. New York, USA: Wiley.
Wilson, M.D., Rocke, D.M., Durbin, B., Kahn, H.D., (2004). Detection
limits and goodness-of-fit measures for the two-component model of
chemical analytical error. Anal. Chim. Acta 509, 197-208.
Winefordner, J.D., Svoboda, V., Cline, L.J., (1970). Sources of noise in
atomic absorption measurements. Crit. Rev. Anal. Chem. 1, 233-239.
Xu, W., Que Hee, S.S., (2006). Gas chromatography–mass spectrometry
analysis of di-n-octyl disulfide in a straight oil metalworking
fluid: Application of differential permeation and Box–Cox
transformation. J. Chromatogr. A 1101, 25-31.
Yamada, K.T., (1992). Standard deviation in weighted least-squares
analysis. J. Mol. Spectrosc. 156, 512-516.
Yaroshenko, I., Kirsanov, D., Kartsova, L., Sidorova, A., Borisova, I.,
Legin, A., (2015). Determination of urine ionic composition with
potentiometric multisensor system. Talanta 131, 556-561.
Zarembka, P., (1974). Transformation of variables in econometrics, in
Frontiers of Econometrics. Zarembka P. New York, USA: Academic
Press, pp. 81-104.
Zeng, Q.C., Zhang, E., Tellinghuisen, J., (2008). Univariate calibration by
reversed regression of heteroscedastic data: a case study. Analyst 133
(12), 1649-1655.
Zeng, Q.C., Zhang, E., Dong, H., Tellinghuisen, J., (2008). Weighted least
squares in calibration: Estimating data variance functions in high-
performance liquid chromatography. J. Chromatogr. A 1206, 147-152.
Zhang, C-Y., Chai, X-S., (2015). A novel multiple headspace extraction
gas chromatographic method for measuring the diffusion coefficient of
methanol in water and in olive oil. J. Chromatogr. A 1385, 124-128.
Zitter, H., God, C., (1971). Ermittlung, Auswertung und Ursachen von
Fehlern bei Betriebsanalysen. Zeitschrift Anal. Chem. 255, 1-9.
Zorn, M.E., Gibbons, R.D., Sonzogni, W.C., (1997). Weighted least
squares approach to calculating limits of detection and quantification
by modeling variability as a function of concentration. Anal. Chem. 69,
3069-3075.
In: Linear Regression ISBN: 978-1-53611-992-3
Editor: Vera L. Beck © 2017 Nova Science Publishers, Inc.

Chapter 2

REGRESSION THROUGH THE ORIGIN

Julia Martín and Agustín G. Asuero*

Department of Analytical Chemistry, Faculty of Pharmacy,
The University of Seville, Seville, Spain

ABSTRACT

Regression through the origin, also known as the no-intercept model, is an
interesting topic that has received scarce attention in the literature. The
model is applied when subject-matter theory requires it or when other
physical and material considerations must be taken into account. An
intensive bibliographical search has been carried out to gather the
literature on the subject, which is widely scattered. About one hundred
and thirty references have been compiled, comprising about twenty
monographs and fifty scientific journals from varied fields, e.g.,
analytical, biological, clinical, chemometrical, educational,
environmental, pharmaceutical, physico-chemical, and statistical. We deal
systematically with the homoscedastic condition, i.e., variance of the y
values independent of x; the case of accumulative errors in the y values;
the heteroscedastic cases, i.e., variance or standard deviation
proportional to the x values, respectively; and orthogonal regression
(error in both axes). The chapter also covers topics such as prediction
(using the regression line in reverse), leverage, goodness of fit,
comparison between models with and without intercept, uncertainty,
polynomial regression models without intercept, and an overview of
robust regression through the origin.

* Corresponding author address: Agustín G. Asuero, Department of Analytical Chemistry,
Faculty of Pharmacy, University of Seville, Seville, Spain.

Keywords: least squares method, regression, origin

INTRODUCTION

Regression and related fitting methods have found wide use (Finney,
1996; Deming, 1968; Howard, 2001) in the natural and social sciences.
Although linear least squares regression is probably the most widely used
statistical modeling method, linear regression through the origin, in spite
of its importance, has not received much attention. There are occasions
when it appears appropriate for a regression line (Bissell, 1992; Brownlee,
1984; Freund et al., 2006; Myers, 1986; Noggle, 1993; Ryan, 2008) to pass
through the origin, i.e., for the true relation to be

  1  (1)

As the true variables  and  are unobservable, the model to be fitted is

Yi  1xi   i (2)

where $(Y_i, x_i)$ is the ith pair of associated values, $i$ indexes the
data points, $\beta_1$ is the model parameter to be estimated, and
$\varepsilon_i$ is the error associated with the measurement $Y_i$. This
model is also called the no-intercept model (Chatterjee et al., 2012; Afifi
and Azen, 1972). When it is known in advance that the intercept term is
zero, this constraint has to be imposed on the model (Rousseeuw, 2001).
Regression through the origin is applied because of underlying theory or
when other circumstances, i.e., physical and material considerations
(Eisenhauer, 2003), must be taken into account. Many practical
applications can be found where the model given by Eqn (2) is more
appropriate than the one with an intercept added

$$Y = \beta_0 + \beta_1 x + \varepsilon \tag{3}$$

An example may be the regression of dose against area under the curve
(AUC) in pharmacokinetic studies (Bonate, 2011).
The error term in Eqn. (2) is assumed to be normally distributed with mean zero and unknown variance

$$\sigma_i^2 = \frac{\sigma^2}{w_i} \qquad (4)$$

where $\sigma$ is a constant (which may be absorbed into the unknown $\sigma_i$) and the weighting factors $w_i$ are known for all i, being inversely proportional to the variances.
The aim of this contribution is to offer a primer on regression through the origin to analytical chemists and other researchers interested in the subject. A number of applications have been compiled in tabular form. Figure 1 shows the number of papers published per year. Some fifty journals are cited from the fields of analytical and physical chemistry, chemometrics, clinical chemistry, ecology, educational chemistry, environmental chemistry, industrial hygiene, pharmacy, biology, and statistics. The authors apologize for any papers they may have overlooked or inadvertently omitted. The most cited journals are shown in Figure 2.

Figure 1. Number of publications cited per year.

Figure 2. Number of papers cited by journal.

WEIGHTED REGRESSION THROUGH THE ORIGIN

We proceed to derive the expression for weighted linear regression through the origin, since it is widely recognized in analytical chemistry (Asuero and Bueno, 2011; Asuero and Gonzalez, 2007; Sayago and Asuero, 2004) that the standard deviation increases with concentration, i.e., the variance is heteroscedastic. The least squares method minimizes the weighted sum of the squared residuals $r_i$

$$Q_{\min} = \sum w_i r_i^2 = \sum w_i (y_i - \hat{y}_i)^2 \qquad (5)$$

of the ith fitted values

$$\hat{y}_i = b_1 x_i \qquad (6)$$

where $b_1$ is the least squares estimate of $\beta_1$, obtained by setting $\partial Q_{\min}/\partial b_1 = 0$, the only normal equation, from which

$$b_1 = \frac{\sum w_i x_i y_i}{\sum w_i x_i^2} \qquad (7)$$

This formula is expressed in terms of deviations from the origin, unlike the intercept case (Draper and Smith, 1977), where the formulas are expressed in terms of deviations from the mean. Note that differentiating $Q_{\min}$ twice with respect to $b_1$ we obtain

$$\frac{\partial^2 Q_{\min}}{\partial b_1^2} = 2 \sum w_i x_i^2 > 0 \qquad (8)$$

which confirms $Q_{\min}$ to be a minimum. The sum of the residuals is not necessarily equal to zero, as it is for a model with intercept.
Several situations may be envisaged (Brownlee, 1984; Cox, 1971; Natrella, 1963; Turner, 1960) concerning weighting, as we will see in what follows.

Variance of y’s independent of x

In those cases we get

$$w_i = 1 \qquad (9)$$

and then

$$b_1 = \frac{\sum x_i y_i}{\sum x_i^2} \qquad (10)$$

Variance proportional to x

Then we have (e.g., Georgian, 2009)

$$w_i = \frac{1}{x_i} \qquad (11)$$

and

$$b_1 = \frac{\sum y_i}{\sum x_i} = \frac{\bar{y}}{\bar{x}} \qquad (12)$$

Standard deviation proportional to x

The weight will be given by

$$w_i = \frac{1}{x_i^2} \qquad (13)$$

which leads to the slope value

$$b_1 = \frac{\sum (y_i / x_i)}{n} \qquad (14)$$

Aston (1959) and Barlow (1989) follow a notation slightly different from the one shown here.
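The three weighting schemes above all follow from Eqn. (7). A minimal sketch is given below, assuming plain Python lists; the function name `slope_through_origin` and the data are illustrative and do not come from the cited sources.

```python
def slope_through_origin(x, y, weights=None):
    """Weighted least squares slope through the origin, Eqn. (7)."""
    if weights is None:
        weights = [1.0] * len(x)  # Eqn. (9): variance independent of x
    num = sum(w * xi * yi for w, xi, yi in zip(weights, x, y))
    den = sum(w * xi * xi for w, xi in zip(weights, x))
    return num / den


# Illustrative data (hypothetical), all x values nonzero
x = [1.0, 2.0, 4.0]
y = [2.1, 3.9, 8.2]

b_const = slope_through_origin(x, y)                            # Eqn. (10)
b_var_x = slope_through_origin(x, y, [1 / xi for xi in x])      # Eqn. (12)
b_sd_x = slope_through_origin(x, y, [1 / xi ** 2 for xi in x])  # Eqn. (14)
```

With $w_i = 1/x_i$ the result reduces to the ratio of the sums (or of the means), and with $w_i = 1/x_i^2$ to the mean of the individual ratios $y_i/x_i$.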

STANDARD ERRORS IN REGRESSION THROUGH THE ORIGIN

Assuming that the values of x are free from error and applying the random error propagation law (Asuero et al., 1988), we get for the estimated variance of the slope

$$s_{b_1}^2 = \frac{s_{y/x}^2}{\sum w_i x_i^2} \qquad (15)$$

The larger the term $\sum w_i x_i^2$, the greater the precision of the slope. Note in addition that large values of x contribute substantially to this sum; increasing the number of such values also increases the denominator in Eqn. (15).
For the estimate of the variance provided by the weighted residuals we get

$$s_{y/x}^2 = \frac{\sum w_i (y_i - \hat{y}_i)^2}{n-1} = \frac{\sum w_i y_i^2 - b_1^2 \sum w_i x_i^2}{n-1} = \frac{\sum w_i y_i^2 - b_1 \sum w_i x_i y_i}{n-1} = \frac{\sum w_i y_i^2 - \left(\sum w_i x_i y_i\right)^2 / \sum w_i x_i^2}{n-1} \qquad (16)$$

In straight line regression through the origin $s_{y/x}^2$ has n-1 degrees of freedom (since only one parameter is estimated), not n-2 as is the case for a model with intercept. The last expression in Eqn. (16) is the most convenient (Green and Margerison, 1977) for computation purposes.
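Eqns. (7), (15) and (16) can be evaluated from three running sums. The sketch below assumes plain Python lists; the name `fit_through_origin` and the data are illustrative. The guard against a slightly negative variance is an addition of this sketch, since the computational form of Eqn. (16) can lose digits by cancellation when the fit is nearly exact.

```python
import math


def fit_through_origin(x, y, w=None):
    """Return (b1, s2_yx, se_b1) for weighted regression through the origin."""
    n = len(x)
    w = [1.0] * n if w is None else list(w)
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swyy = sum(wi * yi * yi for wi, yi in zip(w, y))
    b1 = swxy / swxx                                      # Eqn. (7)
    # Eqn. (16), computational form, with n - 1 degrees of freedom
    s2_yx = max(swyy - swxy ** 2 / swxx, 0.0) / (n - 1)
    se_b1 = math.sqrt(s2_yx / swxx)                       # Eqn. (15)
    return b1, s2_yx, se_b1


# Illustrative data (hypothetical)
x = [1.0, 2.0, 3.0, 4.0]
y = [1.1, 1.9, 3.2, 3.9]
b1, s2_yx, se_b1 = fit_through_origin(x, y)
```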

CONFIDENCE INTERVALS IN REGRESSION THROUGH THE ORIGIN

We get for the confidence interval for $\beta_1$

$$b_1 - t_{n-1,\alpha/2}\, s_{b_1} \le \beta_1 \le b_1 + t_{n-1,\alpha/2}\, s_{b_1} \qquad (17)$$

From Eqn. (6) we have for the variance of a point on the true regression line

$$s_{\hat{y}_i}^2 = x_i^2 s_{b_1}^2 = \frac{x_i^2 s_{y/x}^2}{\sum w_i x_i^2} \qquad (18)$$

Note that $s_{\hat{y}}$ increases linearly with x whereas $s_{\hat{y}}/\hat{y}$ (the coefficient of variation) remains constant.
The confidence interval for a point on the true regression line is then calculated (Bennett and Franklin, 1954) from

$$b_1 x_0 - t_{n-1,\alpha/2} \frac{x_0 s_{y/x}}{\sqrt{\sum w_i x_i^2}} \le \beta_1 x_0 \le b_1 x_0 + t_{n-1,\alpha/2} \frac{x_0 s_{y/x}}{\sqrt{\sum w_i x_i^2}} \qquad (19)$$

The confidence band for the entire regression line is the region between two straight lines passing through the origin (Figure 3), whereas in a model with intercept the limits are curved (hyperbolic). Thus, the interval becomes larger as we move away from the origin.
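The linear growth of the interval in Eqn. (19) can be checked numerically. In the sketch below the critical value `t_crit` ($t_{n-1,\alpha/2}$) is assumed to be supplied by the caller from tables; the function name and the numbers are illustrative.

```python
import math


def mean_response_interval(x0, b1, s2_yx, swxx, t_crit):
    """Eqn. (19): limits for the true line at x0.

    swxx is the sum of w_i * x_i**2 and t_crit is t_{n-1, alpha/2}.
    """
    half_width = t_crit * abs(x0) * math.sqrt(s2_yx / swxx)
    return b1 * x0 - half_width, b1 * x0 + half_width


# Illustrative values (hypothetical): n = 5, so t_{4, 0.025} = 2.776
lo1, hi1 = mean_response_interval(1.0, 2.0, 0.04, 30.0, 2.776)
lo2, hi2 = mean_response_interval(2.0, 2.0, 0.04, 30.0, 2.776)
```

Doubling $x_0$ doubles the width of the interval, which is why the band is bounded by two straight lines through the origin.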

Figure 3. The solid line is the least squares line of slope $b_1$ passing through the origin and the set of points $(x_i, \hat{\eta}_i)$. The co-ordinates of the point P are, therefore, $(x_0, b_1 x_0)$. The dotted lines show the least squares estimates displaced vertically by ± one standard error, $s(\hat{\eta})$. The diagram shows clearly that the uncertainty associated with a least squares estimate of $\eta$ increases rapidly with increasing displacement of $x_0$ from the origin (Green and Margerison, 1977).

Hahn (1977) and Natrella (1963) deal with confidence intervals for the slope or for the true average response at a given x value, as well as with the prediction interval containing a future response at a given x value. Hedayat (1970) and Hedayat et al. (1977) propose a test for detecting a monotonic relationship between the mean and the variance. Iwase (1989) studies the case in which the y values follow an inverse Gaussian distribution, the coefficient of variation being constant and unknown.

INVERSE EXTRAPOLATION: REVERSE USE OF THE REGRESSION LINE THROUGH THE ORIGIN

The mean $\bar{y}_0$ of m new observations allows the corresponding x value to be predicted, and we may apply Fieller’s (1940) theorem to obtain the confidence limits for that prediction. Solving Eqn. (6) in reverse we get the point estimate

$$\hat{x}_0 = \frac{\bar{y}_0}{b_1} \qquad (20)$$

Here $\hat{x}_0$ is a nonlinear function of two normally distributed random variables, $\bar{y}_0$ and $b_1$, and will be approximately normally distributed about the true value of x for which the m observations $y_0$ were made, with variance (Bennett and Franklin, 1954; Lark et al., 1968), for the unweighted case ($w_i = 1$),

$$s_{\hat{x}_0}^2 = \left(\frac{\partial \hat{x}_0}{\partial \bar{y}_0}\right)^2 s_{\bar{y}_0}^2 + \left(\frac{\partial \hat{x}_0}{\partial b_1}\right)^2 s_{b_1}^2 = \frac{s_{y/x}^2}{b_1^2}\left(\frac{1}{m} + \frac{\hat{x}_0^2}{\sum x_i^2}\right) \qquad (21)$$

Although $\bar{y}_0 - \hat{x}_0 b_1 = 0$ by construction, the difference $z = \bar{y}_0 - b_1 x$, evaluated at the true value x, is not necessarily equal to zero, since any particular mean $\bar{y}_0$ of m observations is not necessarily equal to its expected value. The difference, however, has a mean value of zero and is normally distributed about zero, and the ratio $z/s_z$ is distributed as Student’s t. These two premises support Fieller’s theorem (Bánfai, 2012; Bánfai and Kemény, 2012; Fieller, 1940; Schwartz and Gelb, 1984). Then we get

$$\frac{z^2}{s_z^2} = \frac{(\bar{y}_0 - b_1 x)^2}{s_{y/x}^2 \left(\dfrac{1}{m} + \dfrac{x^2}{\sum x_i^2}\right)} \le t_{n-1,\alpha/2}^2 \qquad (22)$$

Confidence limits associated with $\hat{x}_0$ at a prescribed level of significance $\alpha$, valid even if the scatter of the $y_0$ values about the line is not small, can be calculated (Seber and Lee, 2003) by solving the quadratic equation

$$\left(b_1^2 - \frac{t_{n-1,\alpha/2}^2\, s_{y/x}^2}{\sum x_i^2}\right) x^2 - 2 \bar{y}_0 b_1 x + \bar{y}_0^2 - \frac{t_{n-1,\alpha/2}^2\, s_{y/x}^2}{m} = 0 \qquad (23)$$

whose roots give the values of the lower and upper confidence limits of x, $x_L$ and $x_U$, respectively. The difference between the upper and lower limits gives the width of the confidence interval.
We may use a pooled variance (Cox, 1971; Seber and Lee, 2003) instead of $s_{y/x}^2$

$$s_p^2 = \frac{\sum_{i=1}^{n} (y_i - b_1 x_i)^2 + \sum_{j=1}^{m} (y_{0j} - \bar{y}_0)^2}{n + m - 2} \qquad (24)$$

in which case the Student t has n+m-2 degrees of freedom.
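The limits $x_L$ and $x_U$ are the two roots of the quadratic in Eqn. (23). A minimal sketch for the unweighted case follows, with the critical value `t_crit` assumed to be supplied by the caller; the function name and the numbers are illustrative. The check on the leading coefficient is an addition of this sketch: when it is not positive the slope is not significantly different from zero and no finite interval exists.

```python
import math


def fieller_limits(y0_mean, m, b1, s2_yx, sxx, t_crit):
    """Roots of Eqn. (23); sxx is the sum of x_i**2 (unweighted case)."""
    a = b1 ** 2 - t_crit ** 2 * s2_yx / sxx
    b = -2.0 * y0_mean * b1
    c = y0_mean ** 2 - t_crit ** 2 * s2_yx / m
    if a <= 0.0:
        raise ValueError("slope not significant: no finite interval")
    root = math.sqrt(b * b - 4.0 * a * c)
    return (-b - root) / (2.0 * a), (-b + root) / (2.0 * a)


# Illustrative values (hypothetical): mean of m = 3 readings, t_{4, 0.025} = 2.776
x_low, x_high = fieller_limits(5.0, 3, 2.0, 0.04, 30.0, 2.776)
x_hat = 5.0 / 2.0  # point estimate, Eqn. (20)
```

The interval is not symmetric about the point estimate of Eqn. (20), although the asymmetry is small when the slope is well determined.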

GOODNESS OF FIT

We may be interested, for example, in fitting the data as well as possible or in obtaining good predictions. Some difficulties appear in fitting no-intercept models, as the usual statistics such as R2 or F are not comparable with those of the intercept model. A number of authors, including Eisenhauer (2003), Gillingham and Heien (1971), Gordon (1981), Hahn (1977), and Kozak and Kozak (1995), have addressed this issue, which has generated a fruitful discussion, e.g., Beals (1972), Carmer and Walker (1971), D’Agostino (1971), Golsmith (1981), Gordon (1981b), Haws (1981), and Valentine (1971).
In the model with intercept, let

$$y_i - \bar{y} = (y_i - \hat{y}_i) + (\hat{y}_i - \bar{y}) \qquad (25)$$

Squaring and summing over i we get

$$\sum (y_i - \bar{y})^2 = \sum (y_i - \hat{y}_i)^2 + \sum (\hat{y}_i - \bar{y})^2 + 2 \sum (y_i - \hat{y}_i)(\hat{y}_i - \bar{y}) \qquad (26)$$

and as the cross product in Eqn. (26) is equal to zero (Eisenhauer, 2003; Draper and Smith, 1997) we have

$$SST_m = SSE + SSR \qquad (27)$$

The total sum of squares corrected for the mean ($SST_m$) may be partitioned into the residual error sum of squares (SSE) and the sum of squares due to regression (SSR). The coefficient of determination, $R^2$, is given by

$$R^2 = \frac{SSR}{SST_m} = 1 - \frac{SSE}{SST_m} \qquad (28)$$
In the model with an intercept term, $R^2$ measures the proportion of the total variability that the regression explains. It coincides with the square of the correlation coefficient R (-1 ≤ R ≤ +1) between x and y (or between y and the estimated y) (Chatterjee et al., 2012; Draper and Smith, 1997).
When we deal with regression through the origin, the cross product in Eqn. (26) will generally take a nonzero value and therefore this equation is not valid as the basis for an analysis of variance. We may now write (Brownlee, 1984; Eisenhauer, 2003; Rousseeuw, 1987) for the no-intercept model

$$y_i = (y_i - \hat{y}_i) + \hat{y}_i \qquad (29)$$

and then

$$\sum y_i^2 = \sum (y_i - \hat{y}_i)^2 + \sum \hat{y}_i^2 + 2 \sum \hat{y}_i (y_i - \hat{y}_i) \qquad (30)$$

Taking into account that the cross product in Eqn. (30) is equal to zero, the (redefined) total sum of squares is now decomposed as

$$SST = SSE + SSR \qquad (31)$$

(SSR being also redefined), and so

$$R^2 = 1 - \frac{SSE}{SST} \qquad (32)$$

Equation (32) is the correct form of $R^2$ for regression through the origin (no-intercept models). The $R^2$ values for models without an intercept are significantly higher than the ones for models with an intercept (Meloun and Militky, 2012; Meloun et al., 1994). Note, however, that the interpretations of the two formulas for $R^2$ (Eqns. (28) and (32)) are different. Fitting Eqn. (2) (no-intercept model) and using the formula for $R^2$ given by Eqn. (28) may lead to a negative $R^2$ value in some circumstances. For models without an intercept, no adjustment of Y for its mean is made.
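The difference between Eqns. (28) and (32) is easy to reproduce. The sketch below fits the unweighted no-intercept model and evaluates both forms; the function name and the data (chosen to have a large offset and a negative trend) are illustrative.

```python
def r_squared_forms(x, y):
    """Return (R2 about the origin, Eqn. (32); R2 about the mean, Eqn. (28))
    for an unweighted fit through the origin."""
    n = len(x)
    b1 = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
    sse = sum((yi - b1 * xi) ** 2 for xi, yi in zip(x, y))
    sst_origin = sum(yi * yi for yi in y)           # total about the origin
    y_mean = sum(y) / n
    sst_mean = sum((yi - y_mean) ** 2 for yi in y)  # total about the mean
    return 1.0 - sse / sst_origin, 1.0 - sse / sst_mean


# Illustrative data (hypothetical) that a line through the origin fits badly
r2_origin, r2_mean = r_squared_forms([1.0, 2.0, 3.0], [10.0, 9.0, 8.0])
```

For these data the form of Eqn. (32) stays between 0 and 1, whereas the form of Eqn. (28) is negative, because the residual sum of squares exceeds the sum of squares about the mean.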
An analysis of variance table corresponding to Eqn. (29) may be constructed as shown in Table 1. For details Brownlee (1984) should be consulted. Table 1 may be combined with the analysis of variance table for regression with intercept, giving a test of whether or not the model through the origin is adequate (Brownlee, 1984; Lark, 1968).

Table 1. Table of analysis of variance for regression through the origin (straight-line case)*

Source of variation | Sum of squares | Degrees of freedom | E[M.S.]**
Model (due to line) | $\left(\sum w_i x_i y_i\right)^2 / \sum w_i x_i^2$ | 1 | $\sigma^2 + \beta_1^2 \sum w_i x_i^2$
Residual | $\sum w_i (y_i - \hat{y}_i)^2$ | n - 1 | $\sigma^2$
Total about origin | $\sum w_i y_i^2$ | n |

* Adapted from Brownlee (1984); ** M.S. = mean squares.

COMPARISON BETWEEN MODELS WITH AND WITHOUT INTERCEPT

When dealing with straight-line relationships, the question of choosing between a model with and without an intercept sometimes arises. However, no simple solution to this problem (Casella, 1983; Gordon, 1981; Othman, 2014) is found. Applying linear regression with an intercept (Eqn. (3)) to a set of n data points (i = 1, 2, …, n), we get for the estimated values of slope and intercept, $b_1$ and $b_0$, respectively

$$b_1 = \frac{S_{XY}}{S_{XX}} \qquad (33)$$

$$b_0 = \bar{y}_w - b_1 \bar{x}_w \qquad (34)$$

where $S_{XX}$ and $S_{YY}$ are the sums of the squares of the deviations from the mean for the two variables x and y, respectively, and $S_{XY}$ is the corresponding sum of the cross products (Asuero and Gonzalez, 2006; Asuero and Gonzalez, 1989; Martin and Asuero, 2017; Sayago and Asuero, 2004)

$$S_{XY} = \sum w_i x_i y_i - \frac{\left(\sum w_i x_i\right)\left(\sum w_i y_i\right)}{\sum w_i} \qquad (35)$$

$S_{XX}$ and $S_{YY}$ may be easily derived from Eqn. (35) by substituting $y_i$ by $x_i$, and $x_i$ by $y_i$, respectively. The weighted (sample) mean values of x and y are given by

$$\bar{x}_w = \frac{\sum w_i x_i}{\sum w_i} \qquad (36)$$

and

$$\bar{y}_w = \frac{\sum w_i y_i}{\sum w_i} \qquad (37)$$

respectively. Note that when $w_i = 1$ then $\sum w_i = n$.

In a model with intercept, the first normal equation (Eqn. (34)) requires that the fitted line pass through the centroid, and thus the weighted sum of residuals has to be zero. Nevertheless, in the no-intercept case the sum of residuals is not generally equal to zero.
Note that the slope of the regression through the origin given by Eqn. (7) may be expressed as

$$b_1 = \frac{S_{XY} + \dfrac{\left(\sum w_i x_i\right)\left(\sum w_i y_i\right)}{\sum w_i}}{S_{XX} + \dfrac{\left(\sum w_i x_i\right)^2}{\sum w_i}} = \frac{S_{XY} + \left(\sum w_i\right) \bar{x}_w \bar{y}_w}{S_{XX} + \left(\sum w_i\right) \bar{x}_w^2} \qquad (38)$$

Thus, the line will not in general pass through the centroid, so that Eqns. (7) and (33) are not equivalent.
In the remainder of this section we use unweighted regression ($w_i = 1$). Casella (1983) has shown that a new point $(x_{n+1}, y_{n+1})$ may be added to the full set of n points, forcing the straight line to pass through the origin. The model based on Eqn. (2) is then applied, the slope being given by Eqn. (7). The added point satisfies the identity

$$(x_{n+1}, y_{n+1}) = (n^* \bar{x},\, n^* \bar{y}) \qquad (39)$$

where

$$n^* = \frac{n}{\sqrt{n+1} - 1} \qquad (40)$$

The coordinates of the new point (its position with respect to the others) determine its leverage, that is, the amount of influence that it has on each fitted value

$$h_{n+1} = \frac{1}{n+1}\left(1 + \frac{n^2 \bar{x}^2}{\sum_{i=1}^{n} x_i^2}\right) = \frac{1}{n+1}\left(1 + \frac{n \bar{x}^2}{\bar{x}^2 + S_{XX}/n}\right) \qquad (41)$$

Thus, the impact of the new (augmented) n+1 data point increases with $\bar{x}^2 / (S_{XX}/n)$. The maximum discrepancy is expected when $\bar{x} \gg \sqrt{S_{XX}/n}$. In those cases in which this new point is an outlier, regression through the origin (Casella, 1983; Meloun and Militky, 2011; Meloun et al., 1994) is not suitable.
On the other hand, the standardized residual corresponding to the new (augmented) point is given by

$$r_{n+1}^* = \frac{b_0}{s_{y/x} \sqrt{\dfrac{1}{n} + \dfrac{\bar{x}^2}{S_{XX}}}} \qquad (42)$$

where $s_{y/x}^2$ is the residual mean square from the full fit on the original n data points. Note that $r_{n+1}^*$ is identical to the t statistic that tests $H_0$: $\beta_0 = 0$. It can be shown that

$$\left(r_{n+1}^*\right)^2 = (n-1) \frac{s_{0,y/x}^2}{s_{y/x}^2} - (n-2) \qquad (43)$$

where $s_{0,y/x}^2$ is the estimated residual mean square from the regression through the origin and $s_{y/x}^2$, as before, is the estimated residual mean square of the full regression. The $(r_{n+1}^*)^2$ statistic is an exact measure of the relationship between the residual variances. The original paper of Casella (1983) should be consulted for additional details not included here.
For the leverage of models with intercept refer to Meloun et al. (1994) and Meloun and Militky (2012) for details. An excellent introduction to the topic is found in Sheater (2006).
Models through the origin may be used when consistency with the underlying theory or other adequate prior (material and physical) reasons are evident. There are cases, however, where it is not clear which model should be used, and the choice between the no-intercept and intercept models should be made with care. A comparison of the residual mean squares obtained with the two models (Chatterjee et al., 2012) is a measure of the goodness of fit (closeness of observed and predicted values). Residual analysis may also be helpful in making decisions (Noggle, 1993). In addition, if the intercept t test [$t = (b_0 - 0)/s_{b_0}$] on the model given by Eqn. (3) is significant (reproducible nonzero response at zero x value), this model should be used; otherwise, the regression through the origin given by Eqn. (2) should be used.
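The intercept t test and the relation of Eqn. (43) between the two residual mean squares can be checked together. The sketch below uses the standard expression for the standard error of the intercept in unweighted straight-line regression, $s_{b_0} = s_{y/x}\sqrt{1/n + \bar{x}^2/S_{XX}}$; the function name and the data are illustrative.

```python
import math


def compare_fits(x, y):
    """Return (t statistic for b0, right-hand side of Eqn. (43))."""
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b1 = sxy / sxx                      # Eqn. (33), w_i = 1
    b0 = y_mean - b1 * x_mean           # Eqn. (34)
    s2_full = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    b1_origin = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
    s2_origin = sum((yi - b1_origin * xi) ** 2 for xi, yi in zip(x, y)) / (n - 1)
    t_b0 = b0 / math.sqrt(s2_full * (1.0 / n + x_mean ** 2 / sxx))
    rhs_43 = (n - 1) * s2_origin / s2_full - (n - 2)
    return t_b0, rhs_43


# Illustrative data (hypothetical)
t_b0, rhs_43 = compare_fits([1.0, 2.0, 3.0, 4.0, 5.0], [1.2, 2.1, 2.9, 4.2, 4.8])
```

The square of the t statistic agrees with the right-hand side of Eqn. (43) to rounding error, so the t test on the intercept and the comparison of the residual mean squares carry the same information.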

CAUTION CONCERNING R2

The coefficient of determination (R2) is used to assess the goodness of fit of regression models. It is probably the single most extensively used measure of goodness of fit, but it is also widely misused (Asuero et al., 2006; Raposo, 2016; Scott and Wild, 1991), because the several alternative R2 statistics are not generally equivalent (Kvalseth, 1985), except for linear models with an intercept term. Some regression packages compute R2 in an inappropriate way (Hawkins, 1980; Gordon, 1981; Kozak and Kozak, 1995; Okunade et al., 1993). It is therefore possible to obtain different regression summary statistics, e.g., R2, for the same equation specified in two equivalent ways (Uyar and Erdem, 1990). Becker and Kennedy (1992) present a helpful exercise for understanding least squares that also recalls the problems with R2 in those cases in which an intercept is not included in the regression model.

CONSTRAINED (CALIBRATION) EQUATIONS

In some instances, when fitting equations to data, the physical reality of the context dictates that the curve pass through one or more given points. Calibration curves, for example, are often known to pass through points such as (0, 0), (100, 100), (100, 1), or (1, 1) (Leary and Messick, 1985). As a matter of fact, a calibration curve may be forced to pass through one or more independently selected points by using Lagrange multipliers (Draper and Smith, 1997). Simple ways of constraining a given equation (e.g., a calibration) to pass through one or more independently selected points have been described by Meites and Leary (1985). The general case of forcing an equation to pass through a specific point or points is conveniently treated via one or more Lagrange multipliers, which can supplement the more traditional least-squares curve fitting procedures. Some authors hold that constraints are appropriate because the fitted curve in each case is "known" to pass through the theoretical fixed points, appealing to varying arguments. Helpful suggestions for deciding whether or not to impose constraints have been put forward by Schwartz (1986) on the basis of the intended purpose of the regression. However, when the fitted line is constrained to pass through a fixed point $(x_0, y_0)$, we may simply shift the origin of the coordinate system (Green and Margerison, 1977; Hahn, 1977) to the point $(x_0, y_0)$ by means of the transformation

$$y' = y - y_0, \qquad x' = x - x_0 \qquad (44a,b)$$
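The shift of origin in Eqns. (44a,b) reduces the constrained fit to Eqn. (10). A minimal sketch for the unweighted case is given below; the function name and the data are illustrative.

```python
def slope_through_fixed_point(x, y, x0, y0):
    """Slope of the least squares line forced through (x0, y0)."""
    xs = [xi - x0 for xi in x]   # Eqn. (44b)
    ys = [yi - y0 for yi in y]   # Eqn. (44a)
    return sum(a * b for a, b in zip(xs, ys)) / sum(a * a for a in xs)


# Illustrative data (hypothetical) lying exactly on y = 100 + 2 (x - 100)
b1 = slope_through_fixed_point([90.0, 95.0, 105.0], [80.0, 90.0, 110.0],
                               100.0, 100.0)
```

The fitted line in the original coordinates is then $y = y_0 + b_1 (x - x_0)$.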

STRAIGHT LINE IN THE CASE OF ACCUMULATIVE ERRORS

Sometimes, in engineering tests and in research, the characteristics of the experiment are such that the error associated with each point includes the errors associated with all previous points. The successive observed values of the dependent variable y then represent the cumulative magnitude (Mandel, 1964; Mandel, 1957; Natrella, 1963) of some effect at successive values of the independent variable x

$$y_i = \beta_1 x_i + \varepsilon_1 + \varepsilon_2 + \dots + \varepsilon_{i-1} + \varepsilon_i \qquad (45)$$

Then

$$y_i - y_{i-1} = \beta_1 (x_i - x_{i-1}) + \varepsilon_i \qquad (46)$$

which may be put in the form

$$z_i = \beta_1 L_i + \varepsilon_i \qquad (47)$$

Note that the $\varepsilon_i$ are statistically independent. Dividing through by $\sqrt{L_i}$ (the variance of $\varepsilon_i$ being taken as proportional to $L_i$) we get

$$\frac{z_i}{\sqrt{L_i}} = \beta_1 \sqrt{L_i} + \frac{\varepsilon_i}{\sqrt{L_i}} \qquad (48)$$

an equation which satisfies all the requirements of ordinary least squares (errors statistically independent with zero mean and constant variance). Least squares (Eqn. (7)) then gives the estimated value of the slope

$$b_1 = \frac{\sum z_i}{\sum L_i} \qquad (49)$$

from which the estimated value of its variance may be calculated (Mandel, 1964) as

$$s_{b_1}^2 = \frac{1}{(n-2) \sum L_i} \sum \frac{d_i^2}{L_i} \qquad (50)$$

where

$$d_i = z_i - b_1 L_i \qquad (51)$$

Mandel (1964) analyses Boyle’s original data concerning P, V measurements and shows that the conventional treatment leads to a non-random residual pattern, which would suggest that the model PV = constant (i.e., 1/P = V/constant) is incorrect. Nevertheless, a random residual pattern is obtained if cumulative errors are considered, showing the model of Boyle to be correct.
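The cumulative-error estimates of Eqns. (49) to (51) only need the successive increments. The sketch below assumes that x is strictly increasing (so that every $L_i$ is positive) and takes the degrees of freedom as the number of increments minus one, which equals n - 2 when n counts the original points; that reading of n, the function name and the data are assumptions of this sketch.

```python
def cumulative_error_slope(x, y):
    """Slope and its variance when errors accumulate along x."""
    L = [x[i] - x[i - 1] for i in range(1, len(x))]   # increments of x
    z = [y[i] - y[i - 1] for i in range(1, len(y))]   # increments of y
    b1 = sum(z) / sum(L)                              # Eqn. (49)
    d = [zi - b1 * Li for zi, Li in zip(z, L)]        # Eqn. (51)
    dof = len(L) - 1                                  # assumed: increments - 1
    var_b1 = sum(di ** 2 / Li for di, Li in zip(d, L)) / (dof * sum(L))
    return b1, var_b1


# Illustrative data (hypothetical)
b1, var_b1 = cumulative_error_slope([0.0, 1.0, 3.0, 6.0], [0.0, 2.1, 5.8, 12.3])
```

Note that Eqn. (49) depends only on the first and last points, since the increments telescope.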

ERRORS IN VARIABLES METHOD

There are experiments in which the assumption that the variable x is free from error is not correct (Boccio et al., 2006; Duer et al., 2008; Sayago et al., 2004), and a more general treatment is then necessary. We are going to consider three cases in order of increasing complexity.

Adcock’s Regression through the Origin

Adcock, as early as 1878, as indicated by Ripley and Thompson (1987), suggested minimizing the sum of the squares of the perpendicular distances to the fitted line. In the case of regression through the origin we get

$$r_i = \frac{y_i - b_1 x_i}{\sqrt{1 + b_1^2}} \qquad (52)$$

and then

$$Q_{\min} = \frac{1}{1 + b_1^2} \sum (y_i - b_1 x_i)^2 \qquad (53)$$

By differentiating with respect to the slope and equating to zero

$$\frac{\partial Q_{\min}}{\partial b_1} = \frac{1}{1 + b_1^2} \sum 2 (y_i - b_1 x_i)(-x_i) - \frac{2 b_1}{(1 + b_1^2)^2} \sum (y_i - b_1 x_i)^2 = 0 \qquad (54)$$

we get

$$b_1^2 + b_1 \frac{\sum x_i^2 - \sum y_i^2}{\sum x_i y_i} - 1 = 0 \qquad (55)$$

The minimum occurs when $b_1$ has the sign of $\sum x_i y_i$. This kind of regression is known as orthogonal regression (equal errors in both axes).
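Eqn. (55) has two real roots whose product is -1, i.e., two perpendicular lines, and the sign rule selects the one that minimizes the perpendicular distances. A minimal sketch follows; the function name and the data are illustrative.

```python
import math


def orthogonal_slope_through_origin(x, y):
    """Root of Eqn. (55) having the sign of sum(x*y)."""
    sxx = sum(xi * xi for xi in x)
    syy = sum(yi * yi for yi in y)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    if sxy == 0:
        raise ValueError("sum(x*y) is zero: slope undefined by Eqn. (55)")
    p = (sxx - syy) / sxy             # b1**2 + p * b1 - 1 = 0
    root = math.sqrt(p * p + 4.0)
    positive, negative = (-p + root) / 2.0, (-p - root) / 2.0
    return positive if sxy > 0 else negative


# Illustrative data (hypothetical) lying exactly on y = 2x
b1 = orthogonal_slope_through_origin([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```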

Deming Regression

As the residuals are defined as

$$r_i = y_i - \hat{y}_i = y_i - b_1 x_i \qquad (56)$$

their variances are given by

$$\sigma_{r_i}^2 = \sigma_{y_i}^2 + b_1^2 \sigma_{x_i}^2 - 2 b_1 \operatorname{cov}(x_i, y_i) \qquad (57)$$

and then we have the following general expression for the weights

$$w_i = \frac{1}{\sigma_{r_i}^2} = \frac{1}{\sigma_{y_i}^2 + b_1^2 \sigma_{x_i}^2 - 2 b_1 \operatorname{cov}(x_i, y_i)} \qquad (58)$$

If the $(x_i, y_i)$ measurements are independent the covariance is zero and, assuming that the ratio of the variance of $y_i$ to that of $x_i$ is independent of the x values, we obtain

$$w_i = \frac{1}{\sigma_{y_i}^2 + b_1^2 \sigma_{x_i}^2} = \frac{1}{(C + b_1^2)\, \sigma_{x_i}^2} \qquad (59)$$

where

$$C = \frac{\sigma_{y_i}^2}{\sigma_{x_i}^2} \qquad (60)$$

The weighted sum of the squared residuals has to be a minimum

$$Q_{\min} = \frac{1}{C + b_1^2} \sum \frac{(y_i - b_1 x_i)^2}{\sigma_{x_i}^2} \qquad (61)$$

and then

$$\frac{\partial Q_{\min}}{\partial b_1} = \frac{1}{C + b_1^2} \sum \frac{2 (y_i - b_1 x_i)(-x_i)}{\sigma_{x_i}^2} - \frac{2 b_1}{(C + b_1^2)^2} \sum \frac{(y_i - b_1 x_i)^2}{\sigma_{x_i}^2} = 0 \qquad (62)$$

from which we get

$$b_1^2 + b_1 \frac{C \sum \left(x_i^2 / \sigma_{x_i}^2\right) - \sum \left(y_i^2 / \sigma_{x_i}^2\right)}{\sum \left(x_i y_i / \sigma_{x_i}^2\right)} - C = 0 \qquad (63)$$

and, as in the case of Eqn. (55), the minimum occurs when $b_1$ has the sign of the denominator of the coefficient of $b_1$ in Eqn. (63).
Weighted regression with the weights given by Eqn. (59), but applied to models with intercept, is known in the clinical literature (Linnet, 1993) as Deming regression, and it is widely used in method comparison. It is also a kind of orthogonal regression, also named oblique regression.
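Eqn. (63) can be solved in the same way as Eqn. (55). In the sketch below the variance ratio `C` and the x variances `var_x` are assumed to be known in advance; the function name and the data are illustrative. With C = 1 and constant `var_x` the result coincides with the orthogonal slope of Eqn. (55).

```python
import math


def deming_slope_through_origin(x, y, var_x, C):
    """Root of Eqn. (63) having the sign of sum(x*y/var_x)."""
    sxx = sum(xi * xi / v for xi, v in zip(x, var_x))
    syy = sum(yi * yi / v for yi, v in zip(y, var_x))
    sxy = sum(xi * yi / v for xi, yi, v in zip(x, y, var_x))
    if sxy == 0:
        raise ValueError("weighted sum of x*y is zero: slope undefined")
    p = (C * sxx - syy) / sxy         # b1**2 + p * b1 - C = 0
    root = math.sqrt(p * p + 4.0 * C)
    return (-p + root) / 2.0 if sxy > 0 else (-p - root) / 2.0


# Illustrative data (hypothetical) lying exactly on y = 2x
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]
b_equal = deming_slope_through_origin(x, y, [1.0, 1.0, 1.0], 1.0)
b_ratio4 = deming_slope_through_origin(x, y, [0.5, 1.0, 2.0], 4.0)
```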

Orthogonal Generalized Regression

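We have assumed the independence of all the $x_i$ and $y_i$, but when this is not the case we must include cross terms involving the covariance of the correlated variables. In addition, in those cases in which the ratio of the variances of the y to the x values is not constant, the simplification leading to Eqn. (59) no longer applies.
From Eqn. (5) in the most general case we get

$$\frac{\partial Q_{\min}}{\partial b_1} = \sum \left( w_i \frac{\partial r_i^2}{\partial b_1} + r_i^2 \frac{\partial w_i}{\partial b_1} \right) = 0 \qquad (64)$$

and then

$$\sum w_i \frac{\partial r_i^2}{\partial b_1} = - \sum r_i^2 \frac{\partial w_i}{\partial b_1} \qquad (65)$$

By differentiating $r_i^2$ and $w_i$ with respect to $b_1$ we get, respectively

$$\frac{\partial r_i^2}{\partial b_1} = 2 (y_i - b_1 x_i)(-x_i) = 2 b_1 x_i^2 - 2 x_i y_i \qquad (66)$$

and

$$\frac{\partial w_i}{\partial b_1} = \frac{\partial}{\partial b_1} \left( \frac{1}{\sigma_{y_i}^2 + b_1^2 \sigma_{x_i}^2 - 2 b_1 \operatorname{cov}(x_i, y_i)} \right) = - \frac{2 \left( b_1 \sigma_{x_i}^2 - \operatorname{cov}(x_i, y_i) \right)}{\left( \sigma_{y_i}^2 + b_1^2 \sigma_{x_i}^2 - 2 b_1 \operatorname{cov}(x_i, y_i) \right)^2} = - 2 w_i^2 \left( b_1 \sigma_{x_i}^2 - \operatorname{cov}(x_i, y_i) \right) \qquad (67)$$

By substituting Eqns. (67) and (66) into Eqn. (65) we get

$$\sum w_i \left( 2 b_1 x_i^2 - 2 x_i y_i \right) = \sum r_i^2 w_i^2 \left( 2 b_1 \sigma_{x_i}^2 - 2 \operatorname{cov}(x_i, y_i) \right) \qquad (68)$$

which finally leads to

$$b_1 \sum w_i x_i^2 - \sum r_i^2 w_i^2 \left( b_1 \sigma_{x_i}^2 - \operatorname{cov}(x_i, y_i) \right) = \sum w_i x_i y_i \qquad (69)$$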

As the value of $b_1$ depends on the weighting factors and, at the same time, the weighting factors depend on $b_1$, it is necessary to use an iterative algorithm for solving the system. Thus the rigorous computation of the weights may become quite involved (Boccio et al., 2006; Sands, 1974). The starting value of $b_1$ is obtained from Eqn. (7) (setting $w_i = 1$). The new values of $w_i$ are computed from Eqn. (58), and from these an improved value of $b_1$ is calculated by applying Eqn. (69), and so on. A convergence criterion must be selected, e.g., k digits of $b_1$ should not change in the iteration

$$\left| \frac{b_{1,n+1}}{b_{1,n}} - 1 \right| < 10^{-k} \qquad (70)$$

where n is the number of iterations. Once the optimum values of $b_1$ and the $w_i$ are known, $s_{b_1}^2$ is computed by applying the random error propagation law (Duer et al., 2008) to Eqn. (69).
The methodology followed to derive Eqn. (69) is that employed by Lisy et al. (1990), an alternative procedure to the one first derived by York (1969) for the most general case, with intercept (covariance of the correlated variables included). This orthogonal generalized regression is known in the clinical literature (Martin, 2000) as general Deming regression. However, Martin (2000) follows an alternative derivation given by Williamson (1968).
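The iteration described above can be sketched as follows. Eqn. (69) is rearranged here for $b_1$, with the residuals and weights evaluated at the previous value of the slope; this rearrangement, the function name, the default number of digits and the data are assumptions of this sketch, and convergence is not guaranteed for every data set.

```python
def generalized_slope(x, y, var_x, var_y, cov_xy, k=8, max_iter=100):
    """Iterate Eqn. (69), rearranged for b1, until Eqn. (70) is met."""
    # Starting value from Eqn. (7) with w_i = 1
    b1 = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
    for _ in range(max_iter):
        w = [1.0 / (vy + b1 * b1 * vx - 2.0 * b1 * c)
             for vx, vy, c in zip(var_x, var_y, cov_xy)]          # Eqn. (58)
        r2 = [(yi - b1 * xi) ** 2 for xi, yi in zip(x, y)]        # squared residuals
        num = (sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
               - sum(ri * wi * wi * c for ri, wi, c in zip(r2, w, cov_xy)))
        den = (sum(wi * xi * xi for wi, xi in zip(w, x))
               - sum(ri * wi * wi * vx for ri, wi, vx in zip(r2, w, var_x)))
        b_new = num / den
        if abs(b_new / b1 - 1.0) < 10.0 ** (-k):                  # Eqn. (70)
            return b_new
        b1 = b_new
    raise RuntimeError("no convergence within max_iter iterations")


# Illustrative data (hypothetical): equal, uncorrelated errors in x and y
x = [1.0, 2.0, 3.0, 4.0]
y = [2.2, 3.9, 6.1, 7.8]
b1 = generalized_slope(x, y, [0.01] * 4, [0.01] * 4, [0.0] * 4)
```

With equal and uncorrelated variances the iteration should settle on the orthogonal slope of Eqn. (55), which provides a simple consistency check.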

ROBUST REGRESSION THROUGH THE ORIGIN

Departures from the assumptions inherent in simple linear regression appear in practice, often exerting a dramatic influence on the quality of the statistical results. Robust regression leads to estimates that outliers do not influence as strongly as they do the standard least squares estimators. Thus, observations that lead to large residuals are down-weighted, i.e., weighted unequally. Robust regression methods are distribution free but require more computing than conventional least squares.

Least Median of Squares through the Origin (LMS Regression)

The sum of the squared residuals (L2 norm) may be replaced (Rousseeuw, 1984) by the median of the squared residuals (Least Median of Squares, LMS), making use of a (somewhat complex) nonlinear optimization algorithm to carry out the necessary calculations and providing a robust version of least squares regression. The PROGRESS (Program for RObust reGRESSion) program (Rousseeuw and Leroy, 1987; Rousseeuw, 1988) has become popular in this respect, a more recent version being available (Rousseeuw and Hubert, 1997). Given the problems found in the PROGRESS software with the slope estimation when the intercept is absent, Barreto and Maharry (2006) have devised an exact algorithm for the bivariate case, applicable in those circumstances. A new algorithm for a model with the intercept suppressed, including at most two unknown parameters and covering bivariate cases, has been set up by Kayhan and Gunay (2008) for the case of an odd number of data points. The latter group (Atilgan and Gunay, 2011) has also studied the LMS estimate for multiple linear regression models, providing a more general algorithm. The problem may be treated as a convex optimization one. Note that robust methods are insensitive to departures from the normal distribution and to outliers.

Least Absolute Deviation Regression through the Origin (L1 Regression)

Minimizing the sum of absolute errors

$$\sum_{i=1}^{n} |\varepsilon_i| \qquad (71)$$

is called L1 regression (or L1-norm regression) (Draper and Smith, 1997).
An L1 type estimator has been derived (Rieder, 1987) for regression through the origin (for both errors-in-variables and error-free-variables models). It is, among all estimators, minimax at finite sample size and extends Huber's (1964) robust interval estimator of location.

Deepest Regression through the Origin in Analytical Chemistry

The deepest regression method, a linear regression method, is the fit with the best depth relative to the data. Müller (2011) and Müller and Wellmann (2009) should be consulted for details too lengthy to include here. For a line through the origin, deepest regression reduces to the slope estimate

$$b_1 = \operatorname{median}_i \left( \frac{y_i}{x_i} \right) \qquad (72)$$

where observations with $x_i = 0$ are not taken into account. Rousseeuw et al. (2001) showed a calibration example (peak area versus cadmium concentration in ng/ml) from graphite furnace atomic absorption spectrometry. The least squares line through the origin is displaced towards the outlier observed at the highest concentration standard, whereas the deepest regression through the origin is robust and fits the good data points.
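The behaviour described above can be reproduced with a few lines. The sketch below compares the median of the ratios $y_i/x_i$ with the least squares slope of Eqn. (10) on made-up calibration data whose highest standard is an outlier; the function name and the data are illustrative and are not the cadmium data of Rousseeuw et al. (2001).

```python
from statistics import median


def median_ratio_slope(x, y):
    """Median of y_i / x_i, skipping observations with x_i = 0."""
    return median(yi / xi for xi, yi in zip(x, y) if xi != 0)


# Illustrative data (hypothetical); the last point is an outlier
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [0.0, 2.0, 4.1, 5.9, 8.0, 20.0]

b_robust = median_ratio_slope(x, y)
b_ls = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
```

The median of the ratios stays at the slope of the good points, while the least squares slope is pulled towards the outlier.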

POLYNOMIAL REGRESSION

No-intercept models more complex than those seen so far may also be fitted to experimental data, e.g., a parabolic model passing through the origin (Hahn, 1977; Karl and Huber, 1997)

$$y = \beta_1 x + \beta_2 x^2 + \varepsilon \qquad (73)$$

The law of Galileo on bodies in free fall (distance d travelled as a function of time t) is an example, $d = v_0 t + (1/2) g t^2$, where the initial velocity $v_0$ is equal to $\beta_1$ and half of the acceleration of gravity is $\beta_2$.
Solving by the least squares method we get (from the normal equations) the coefficients

$$b_2 = \frac{\sum x_i y_i \sum x_i^3 - \sum x_i^2 y_i \sum x_i^2}{\left(\sum x_i^3\right)^2 - \sum x_i^2 \sum x_i^4} \qquad (74)$$

and

$$b_1 = \frac{\sum x_i y_i - b_2 \sum x_i^3}{\sum x_i^2} \qquad (75)$$

From signal values $y_0$ we may calculate the corresponding values of x, making a reverse use of the regression line

$$x_0 = -\frac{b_1}{2 b_2} \pm \sqrt{\left(\frac{b_1}{2 b_2}\right)^2 + \frac{y_0}{b_2}} \qquad (76)$$

The sign of the root with physical meaning coincides with the sign of the parameter $b_2$, the other root lacking meaning.
The variance of the regression will be given in this case by

$$s_{y/x}^2 = \frac{\sum \left(y_i - b_1 x_i - b_2 x_i^2\right)^2}{n - 2} \qquad (77)$$

Meites and Leary (1985) and Leary and Messick (1985) have treated constrained calibration curves with parabolic examples such as the one shown above. Dalebrou (1974) reports the variance analysis of polynomial regression with no intercept by means of the orthogonal coefficients method. D-optimal designs for polynomial regression models with no intercept have been the subject of statistical consideration (Fang, 2002).
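Eqns. (74) to (76) can be tried on a noise-free free-fall example. In the sketch below the function names are illustrative and the data are generated from $d = 1.0\,t + 4.9\,t^2$, so that the fit should recover $b_1 = 1.0$ and $b_2 = 4.9$.

```python
import math


def quadratic_through_origin(x, y):
    """Coefficients (b1, b2) of y = b1*x + b2*x**2, Eqns. (74) and (75)."""
    s2 = sum(xi ** 2 for xi in x)
    s3 = sum(xi ** 3 for xi in x)
    s4 = sum(xi ** 4 for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sx2y = sum(xi ** 2 * yi for xi, yi in zip(x, y))
    b2 = (sxy * s3 - sx2y * s2) / (s3 ** 2 - s2 * s4)   # Eqn. (74)
    b1 = (sxy - b2 * s3) / s2                           # Eqn. (75)
    return b1, b2


def inverse_predict(y0, b1, b2):
    """Eqn. (76), taking the root whose sign is that of b2."""
    centre = -b1 / (2.0 * b2)
    root = math.sqrt(centre ** 2 + y0 / b2)
    return centre + root if b2 > 0 else centre - root


# Noise-free illustrative data: d = 1.0*t + 4.9*t**2
t = [1.0, 2.0, 3.0, 4.0]
d = [5.9, 21.6, 47.1, 82.4]
b1, b2 = quadratic_through_origin(t, d)
t0 = inverse_predict(21.6, b1, b2)
```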

ANALYTICAL APPLICATIONS OF REGRESSION THROUGH THE ORIGIN

Regression through the origin has found application in a variety of fields such as astronomy (Deming, 1968), computer tomography (Sun et al., 2000), ecology (Iwao, 1968; Waters et al., 2014), fisheries (Bourgeois et al., 1997; Cade and Terrell, 1997; Bourgeois et al., 1996), forestry (Kozak and Kozak, 1995), industrial hygiene (Knight and Moore, 1987), parameter estimation (Cvetanovic et al., 1979) and wood science (Han, 1977). However, perhaps the most numerous applications have occurred within the framework of calibration (x free from error) in the field of analytical chemistry. Linear calibration functions passing through the origin have found use (Bánfai 2012a; Liteanu and Rica, 1980; Meloun and Militky, 2012; Mullins, 2003) in chromatographic, electrochemical, and other methods of analysis. A number of authors dealing with that subject (in non-statistical journals) are included in Table 2.
For non-negligible x-errors the situation is more difficult to deal with. Papers dealing specifically with regression through the origin with errors in the two variables are scarce. Andrews et al. (1996), Austin and Pelzer (1946), Kerrich (1966), Sands (1974), Tan and Jones (1989), Ripley and Thompson (1987), Synek (2001) and Winsor (1946) have treated this topic. It has not been widely applied by chemists. Tan and Jones, for example, have reported the relationship between absorbance and chlorine dioxide concentration (determined by iodometry). It could also be applied in the field of method comparison, but only in the absence of systematic errors (Ripley and Thompson, 1987).
Analytical applications of regression through the origin are compiled in tabular form in Table 3.
Table 2. Authors who have published papers on calibration (linear regression through the origin with x free from error) in non-statistical journals

Alexander et al., 2015 | Francis and Sobel, 1970 | Kim and Burkart, 2008 | Shayanfar and Shayanfar, 2011
Bonate, 2011 | Georgian, 2009 | Leary and Messick, 1985 | Strong III, 1979
Bánfai and Kemény, 2012 | Hubert, 1997 | Raposo et al., 2015 | Synek et al., 2000
Bonate, 1992 | Kemp, 1985 | Ripley and Thompson, 1987 | Van Zoonen et al., 1999
Dolan, 2009 | Kemp, 1984a | Roy and Kar, 2014 |
Ellerton and Strong, 1980 | Kemp, 1984b | Schwartz, 1986 |

Table 3. Some selected papers on regression through the origin

Content | Reference
The usage of R² as a measure of model fit and predictive power in QSAR or QSPR modelling. Suggestion of how to use it appropriately as a measure of model fit. | Alexander, Tropsha and Winkler, 2015
Methodology for developing priors from individual or combined meta-analyses, which implies the assumption that there is variation around the meta-analytical relationships themselves. Examples of application to individual species are provided. | Hamel, 2015
Comparison between models with and without intercept and statement of the best one. Application of the leverage point method when a new point is added to the original data. | Abdulsalam Othman, 2014
The r_m² metrics and regression through origin approach: reliable and useful validation tools for predictive QSAR models (Commentary on ‘Is regression through origin useful in external validation of QSAR models?’). | Roy and Kar, 2014
Comparison study of the proposed criteria using the regression through origin method (calculation with SPSS and Excel) for external validation and prediction capability for models developed using literature data. Prediction capability was evaluated using the statistically significant differences between absolute error values of training and test sets. | Shayanfar and Shayanfar, 2014
100 Julia Martín and Agustín G. Asuero

Iwao’s patchiness regression through the origin: exploration of whether fixing Iwao’s m*–m relation to go through the origin is theoretically justifiable, statistically advantageous given the methods used to estimate its parameters, and reduces the sample size required when used to design sequential sampling plans with no loss of sampling precision. Both analytical methods and resampling methods based on field data are employed. | Waters et al., 2014
Research on the suitability of interval hypotheses for a selection of analytical problems frequently occurring in the pharmaceutical setting. Overview of the statistical intervals and hypothesis tests used in the dissertation. Interval hypothesis testing is discussed for the following topics: the transfer of analytical methods, the evaluation of the accuracy of analytical methods, the applicability of single-point calibration, and the content uniformity assessment. | Bánfai, 2012
Estimation of bias for single-point calibration using a proposed method based on the two one-sided tests (interval hypothesis). The test is performed by comparing a confidence interval for the bias to an allowable limit, defined in concentration units. Fieller’s theorem for the ratio of two normally distributed random variables was used to construct the confidence interval for the bias. | Bánfai and Kemény, 2012
Survey of the development of different r_m² metrics followed by their applications in modeling studies for selection of the best QSAR models in different reports made by several workers. | Roy and Mitra, 2012
Clarification of the statement “one often tends to use the origin point (0,0) in the data. However, whether that is best practice or not is entirely arguable.” The argument that a zeroed instrument is expected to provide a point at (0,0) is specious and misleading. | Burkart and Kim, 2009
Calibration models: how to decide if a calibration curve goes through zero and some problems that can occur if the wrong choices are made. | Dolan, 2009
Evaluating ‘goodness-of-fit’ for linear instrument calibrations through the origin. A weighted regression coefficient is defined to evaluate the ‘goodness-of-fit’ and is expressed as a function of the %RSD. | Georgian, 2009
Properties of weighted least squares regression, particularly with regard to regression through the origin for establishment survey data, for use in periodic publications. | Knaub, 2009
The statistical reasons why regression through the origin should be used to analyze comparative data; supports the recommendation of Garland et al. (1992) through additional geometric reasons. | Legendre and Desdevises, 2009
Discussion of the visualization of statistical concepts and reply to the letter written by de Levie (2008) about including or not including an origin point (0,0) in a regression analysis for building a standard curve. | Kim and Burkart, 2008
The correct use of visualizing statistical concepts, and the shortcomings in this respect of the “Beer’s Law Plot” example described by Kim and Burkart (2006). | de Levie, 2008
Fitting a curve passing through a designated point to data for promoting the reproducibility of peripheral quantitative computed tomography. | Sun et al., 2008
A dynamic method of visual interactive regression in which the sum being minimized is made visible by allowing the individual to adjust heights in a bar graph. The interactive feature of Excel spreadsheet programs is utilized; use of the spinner bar is particularly helpful. | Kim and Burkart, 2006
Properties of the deepest regression and applications in analytical chemistry: regression through the origin, polynomial regression, the Michaelis–Menten model, and censored responses. | Rousseeuw et al., 2001
Linear regression of calibration lines passing through the origin was investigated for three models of y-direction random errors: normally distributed errors with an invariable standard deviation (SD), and log-normally and normally distributed errors with an invariable relative standard deviation (RSD). | Synek, 2001
Uncertainties of mercury determinations in biological materials using an atomic absorption spectrometer. Study of as many potential sources of uncertainty as possible in order to work out a general model for determining uncertainty in trace atomic absorption measurements. | Synek, Subrt and Marecek, 2000
Critical overview of the most conflicting points concerning linear regression. Confidence bands and a discussion about the use of a line through the origin are included. In addition, the simplest expressions for expressing parameters to the appropriate significant figures from built-in calculator programs are also provided. | Giordano, 1999
Validation is put in the context of the process of producing chemical information. Two cases are presented in more detail: the development of a European standard for chlorophenols and its validation by a full-scale collaborative trial, and the intralaboratory validation of a method for ethylene-thiourea using alternative analytical techniques. | Van Zoonen et al., 1999
Response to the comment of Cade and Terrell about cautions on forcing regression equations through the origin: this paper strengthens the caution to anyone considering no-intercept models for improving relations between fish density and weighted usable area. The explanations given by Cade and Terrell (1997) convincingly reinforce this warning. | Bourgeois et al., 1997
Comment on Cautions on Forcing Regression Equations through the Origin (Bourgeois et al., 1997): prediction of biological response still depends largely on a detailed understanding of local biological conditions. The authors urge caution in forcing regression of fish density on weighted usable area through the origin; when such forcing is contemplated, one should verify the calculations used by commercial statistical packages to generate summary statistics. | Cade and Terrell, 1997
Improved calibration for wide measuring ranges and low contents. For some calibrations, a straight line through the origin instead of a general straight line should be determined by regression analysis: advantages and restrictions. | Karl and Huber, 1997
A least-squares-based method for determining the ratio between two measured quantities. | Moreno, 1997
Relationship between principal components analysis and weighted linear regression for bivariate data sets: application to linear, two-dimensional data sets with a zero intercept. | Andrews et al., 1996
The problem of fitting a straight line when both variables are subject to error. A brief review of the literature is undertaken, and one fitting method, the geometric mean functional relationship, is spotlighted and illustrated with two sets of example data. | Draper et al., 1991
Several methods of obtaining the best straight line from data in which the two variables are subject to errors of measurement are proposed and discussed. | Tan and Jones, 1989
Statistical analysis techniques to compare pairs of dust samples: a straight line through the origin, linear with intercepts, logarithmic, a logarithmic (weight + constant), and a fifth forced through the origin. It is suggested that the best estimate of the relation between two dust samplers can be obtained by a least squares determination of the straight line through the origin using transformed variables. | Knight and Moore, 1987
A regression-like technique, maximum-likelihood fitting of a functional relationship (MLFR), is explained and is demonstrated to work well. Under some conditions weighted regression provides a good approximation to MLFR, and so can be used if more convenient. | Ripley and Thompson, 1987
Suggestions that may be helpful to researchers in deciding whether or not to impose constraints. | Schwartz, 1986
The four main calibration methods (single separate or added standard and multiple separate or added standards) and some modifications are described mathematically and subjected to error-propagation analysis, to examine the likely effects of errors in the analytical signal on the overall accuracy and precision of the concentration estimate. | Kemp, 1985
Constrained calibration curves: how the use of Lagrange multipliers can supplement the more traditional least-squares curve fitting procedures. The concept of degrees of freedom when describing the variability of data around a calibration curve is also discussed. | Leary and Messick, 1985
Simple ways are described of constraining a calibration or other equation so that it will pass through one or more independently selected points and also give the “best” representation of any number of experimental data in terms of the model selected. | Meites and Leary, 1985
Theoretical aspects of one-point calibration: causes and effects of some potential errors, and their dependence on concentration. | Kemp, 1984a
New ways of using data from analytical-recovery studies to assess analytical nonlinearity, without access to samples of known concentration. A recovery-based method of assessing constant, proportional, and non-linear errors with use of as little as one sample pool of known concentration is described. In each case, the theoretical basis of the method and an outline of a practical experimental protocol are presented. | Kemp, 1984b
Evaluation, both qualitative and quantitative, of the bias error caused by single-point-ratio calculations from an assumed linear response curve through zero for the case where the true response curve is a straight line with a significant intercept. | Cardone and Palermo, 1980
Comments on the correspondence about regression through the origin (Strong, 1979): precision and accuracy should be considered from a statistical viewpoint and discussed. | Ellerton, 1980
Response of Strong to the comments of Ellerton (1980) on regression through the origin: Strong agrees thoroughly with Ellerton's definitions of precision and accuracy, which apply to a chemist's repetitions of determinations on the same sample. Regarding the use of n-2 to calculate s_e, rather than n-1 as recommended by Ellerton, Strong felt that requiring the best straight line to pass through the origin was a constraint on the system and therefore constituted a reduction in the number of degrees of freedom. | Strong III, 1980
Determination of the precision and accuracy of kinetic data. Suggestions for the presentation of kinetic results and their uncertainties due to random and systematic errors. Regarding random errors, least-squares expressions are summarized, and confidence limits, propagation of errors, and change of variable are discussed. Sources of systematic errors are outlined, along with potential methods for their detection and estimation. | Cvetanovic, Singleton and Paraskevopoulos, 1979
Practical examples of fitting regression models with no intercept term. Caution in the use of the model is advised. | Hahn, 1977
Demonstration of how, in a photometric experiment, if one measures the absorbances, y, of solutions having solute concentrations x, and if the solutions are expected to conform with Beer's law, one should fit a straight line that passes through the point y = 0 at x = 0. Strong proposes to accomplish this by using a single-parameter model equation, y = b1x, rather than the conventional two-parameter model y = a + b2x. The single-parameter model effectively forces the straight line to pass through (0, 0), but in either case the slope, b1 or b2, represents the absorptivity. This will unavoidably reduce the precision slightly, but could increase the accuracy. | Strong III, 1979
Weighting factors in least squares: when there is great variation among the variances, the assumption of constant weights can produce gross errors. A practical pH example. | Sands, 1974
Interval estimate of the ratio of an unknown to a standard. Methods for testing the suitability of the models under discussion are given. | Francis and Sobel, 1970
QSAR or QSPR: Quantitative Structure-Activity or Structure-Property Relationships.
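Several entries in Table 3 concern the use of R² with no-intercept models and the need to check how software computes summary statistics. As a hedged illustration of why this matters, the sketch below computes two common definitions of R² for a fit through the origin: one that uses the uncentered total sum of squares (sum of y²) and one that uses the centered total sum of squares (sum of (y - mean)²). The function name and data are hypothetical, and which definition a given package reports is an assumption the reader should verify for that package.

```python
def r2_through_origin(x, y):
    """Return (uncentered R², centered R²) for the least-squares fit y = b*x."""
    sxx = sum(xi * xi for xi in x)
    b = sum(xi * yi for xi, yi in zip(x, y)) / sxx
    rss = sum((yi - b * xi) ** 2 for xi, yi in zip(x, y))
    y_mean = sum(y) / len(y)
    ss_uncentered = sum(yi * yi for yi in y)           # total about zero
    ss_centered = sum((yi - y_mean) ** 2 for yi in y)  # total about the mean
    return 1.0 - rss / ss_uncentered, 1.0 - rss / ss_centered


# Hypothetical data with a large intercept, for which y = b*x is a poor model.
x_demo = [1.0, 2.0, 3.0]
y_demo = [10.0, 10.5, 11.0]
r2_unc, r2_cen = r2_through_origin(x_demo, y_demo)
print(f"uncentered R2 = {r2_unc:.3f}, centered R2 = {r2_cen:.3f}")
```

For these data the uncentered value is high while the centered value is negative, so the two definitions are not interchangeable and a value from a no-intercept fit should not be compared directly with R² from a model that includes an intercept.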

FINAL COMMENTS

Some automated analytical systems, e.g., chromatographs, carry out
non-intercept fitting on a routine basis (Mullins, 2003), mainly using
single-point calibration (Bánfai and Kemény, 2012), i.e., measuring only one
standard and drawing the line from the origin to this measured point.
Caution is required when working in this way (Raposo, 2016) because the
chance component of that single measurement has a major influence on the
line. In those cases it is essential to investigate the linearity of the
response carefully in the validation step. The regression model through the
origin, when applicable, gives more precise estimates (Hahn, 1977) than
those obtained with the more usual model with an intercept.
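The single-point calibration described above can be sketched in a few lines. The example assumes a hypothetical instrument whose true response is a straight line with a small non-zero intercept (the intercept, the sensitivity and the standard concentration are invented for illustration) and shows how drawing the line from the origin through one standard biases the estimated concentrations away from that standard.

```python
def single_point_slope(x_std, y_std):
    """Slope of the line drawn from the origin through one standard."""
    if x_std == 0:
        raise ValueError("the standard must have a non-zero concentration")
    return y_std / x_std


def estimate_concentration(y_sample, slope):
    """Invert the no-intercept calibration line y = slope * x."""
    return y_sample / slope


def response(x, intercept=0.05, sensitivity=0.10):
    """Hypothetical true instrument response, used only for illustration."""
    return intercept + sensitivity * x


# Calibrate with a single standard at x = 10 (arbitrary concentration units).
slope = single_point_slope(10.0, response(10.0))

for x_true in (2.0, 5.0, 10.0):
    x_est = estimate_concentration(response(x_true), slope)
    print(f"true = {x_true:5.2f}  estimated = {x_est:6.3f}")
```

Only at the concentration of the standard itself does the estimate agree with the true value; with a zero intercept the estimates would agree everywhere, which is why checking the linearity of the response through the origin is part of validation. Any random error in the single standard reading also transfers in full to the slope, since no averaging over several standards takes place.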
It should be noted that special models should be adopted only for
adequate prior reasons. Eisenhauer (2003) has said about regression
through the origin that “it remains a subject of pedagogical neglect,
controversy and confusion,” concluding with regard to the practice of
statistics that it “remains as much an art as it is science,” with “the
development of the statistical judgment as important as the computational
ability.”
This contribution has presented a primer on regression through the origin,
in the hope that it will be useful in teaching and in research.

REFERENCES

Acton, F.S., (1959). Analysis of Straight Line Data: The equation y=b x.
New York, USA: Wiley, pp. 16-17.
Afifi, A.F., Azen, S.P., (1972). Statistical Analysis, A Computer Oriented
Approach. 2nd ed., 1st ed., New York, USA: Academic Press, p.125; pp.
88-89.
Alexander, D.L., Tropsha, A., Winkler, D.A., (2015). Beware of R2: simple,
unambiguous assessment of the prediction accuracy of QSAR and
QSPR models. J. Chem. Inf. Model. 55(7), 1316-1322.
Andrews, D.T., Chen, L., Wentzell, P.D., Hamilton, D.C., (1996).
Comments on the relationship between principal components analysis
and weighted linear regression for bivariate data sets. Chemometr.
Intell. Lab. Systems 34, 231-244.
Asuero, A.G., Bueno, J., (2011). Fitting straight lines with replicated
observations by linear regression. IV. Transforming data. Crit. Rev.
Anal. Chem. 41(1), 36-69.
Asuero, A.G., Gonzalez, G., (1989). Some observations on fitting a straight
line to data. Microchem. J. 40(2), 216-225.
Asuero, A.G., Gonzalez, G., (2007). Fitting straight lines with replicated
observations by linear regression. III. Weighting data. Crit. Rev. Anal.
Chem. 37(3), 143-172.
Asuero, A.G., Gonzalez, G., de Pablos, F., Ariza, J.L.G., (1988).
Determination of the optimum working range in spectrophotometric
procedures. Talanta 35(7), 531-537.
Asuero, A.G., Sayago, A., Gonzalez, A.G., (2006). The correlation
coefficient: an overview. Crit. Rev. Anal. Chem. 36(1), 41-59.
Atilgan, Y.K., Gunay, S., (2011). Least median of squares solution of
multiple linear regression models through the origin. Commun. Stat.
Theory Methods 40(22), 4125-4137.
Austen, A.E.W., Pelzer, H., (1946). Linear curves of best fit. Nature 157,
693-694.
Bánfai, B., (2012). Statistical Problems in the Pharmaceutical Analysis.
PhD. Thesis, Budapest University of Technology and Economics,
Budapest, pp. 43-56.
Bánfai, B., Kemény, S., (2012). Estimation of bias for single-point
calibration. J. Chemometrics 26, 117-124.
Barlow, R., (1989). Statistics. A Guide to the Use of Statistical Methods in
the Physical Sciences. New York, USA: Wiley, pp. 98-99.
Barreto, H., Maharry, D., (2006). Least median of squares and regression
through the origin. Comput. Stat. Data Anal. 50(6), 1391-1397.
Beals, R.E., (1972). Regression through origin –comment. Am. Stat. 26(1),
54.
Becker, W., Kennedy, P., (1992). A lesson in least squares and R Squared.
Am. Stat. 46(4), 282-283.
Bennett, C.A., Franklin, N.L., (1954). Statistical Analysis in Chemistry and
the Chemical Industry. New York, USA: Wiley, pp. 232-234.
Bissell, A.F., (1992). Lines through the origin- is NO INT the answer?. J.
Appl. Stat. 19(2), 193-210.
Boccio, M., Sayago, A., Asuero, A.G., (2006). A bilogarithmic method for
the spectrophotometric evaluation of stability constants of 1:1 weak
complexes from mole ratio data. Int. J. Pharm. 318, 70-77.
Bonate, P.L., (1992). Concepts in calibration theory. 2. Regression through
the origin –when should it be used. LC GC Magazine Sep. Sci. 10(5),
378-379.
Bonate, P.L., (2011). Linear models and regression, In Pharmacokinetic-
Pharmacodynamics Modeling and Simulation. Springer
Science+Business Media, Chapter 2, pp. 61-100.
Bourgeois, G., Cunjak, R.A., Caissie, D., El-Jabi, N., (1996). A spatial and
temporal evaluation of PHAB-SIM in relation to measured density of
juvenile Atlantic salmon in a small stream. N. Am. J. Fish. Manag. 16,
154-166.
Bourgeois, G., Cunjak, R.A., Caissie, D., El-Jabi, N., (1997). Cautions on
forcing regression equations through the origin: response to comments.
N. Am. J. Fish. Manag. 17(1), 227-228.
Brownlee, K.A., (1984). Statistical Methodology in Science and
Engineering. In Regression through the origin. 3rd ed., Malabar, Fla.:
R.G. Krieger, pp. 358-362.
Cade, B.S., Terrell, J.W., (1997). Comment: cautions on forcing regression
equations through the origin. N. Am. J. Fish. Manag. 17 (1), 225-232.
Carmer, S.G., Walker, W.M., (1971). Regression through origin. Am. Stat.
25(5), 57-58.
Casella, G., (1983). Leverage and regression through the origin. Am. Stat.
37(2), 147-152.
Chatterjee, S., Hadi, A.S., Price, B., (2012). Regression Analysis by
Example: Regression through the origin. 5th ed., New York, USA:
Wiley, pp. 46-48.
Cox, C.P., (1971). Interval estimation for X-predictions from linear Y on X
regression lines through the origin. J. Am. Stat. Assoc. 66(336), 749-
751; (1972), 67(337), 252 (Erratum).
Cvetanovic, R.J., Singleton, O.L., Paraskevopoulos, G., (1979). Evaluation
of the mean value, and standard errors of rate constants and their
temperature coefficients. J. Phys. Chem. 83(1), 50-60.
Dalebrou, M.A., (1974). Polynomial regression through the origin –
analysis of variance by method of orthogonal coefficients. Annales de
l’amélioration des plantes 24(1), 71-76.
David, H.A., (1972). Regression through origin –comment. Am. Stat.
26(1), 54.
de Levie, R., (2008). Visualizing statistical concepts. J. Chem. Educ. 85(5),
635.
Deming, T.J., (1968). The analysis of linear correlation in Astronomy.
Vistas Astron. 10, 125-142.
Dolan, J.W., (2009). Calibration curves, Part 1: to b or not to b?. LC GC
Eur. 190, 192-194.
D’Agostini, R.B., (1971). Regression through origin. Am. Stat. 25(5), 59.
Duer, W.C., Ogren, P., Meetze, A., Kitchen, C.J., von Lindern, R.,
Yaworsky, D.C., Boden, C., Gayer, J.A., (2008). Comparison of
ordinary, weighted, and generalized least squares straight line
calibrations for LC-MS-MS, GC-MS, HPLC, GC, and enzymatic
assays. J. Anal. Toxicol., 32(5), 329-338.
Draper, N.R., Smith, H., (1998). Applied Regression Analysis. 3rd ed.,
New York, USA: Wiley, pp. 121-233.
Eisenhauer, J.G., (2003). Regression through the origin. Teach. Stat.
25(3), 76-80.
Ellerton, R.W., Strong III, F.C., (1980). Comments on regression through
the origin. Anal. Chem. 52(7), 1152-1154.
Fang, Z., (2002). D-optimal designs for polynomial regression models
through the origin. Stat. Probabil. Lett. 57 (4), 343-351.
Fieller, E.C., (1940). The biological standardization of insulin. J. Roy. Stat.
Soc. Supp. 7(1), 1-64.
Finney, D.J., (1996). A note on the history of regression. J. Appl. Stat.
23(5), 515-558.
Francis, M., Sobel, E., (1970). Interval estimate of the ratio of an unknown
to a standard. Anal. Chem. 42(3), 314-320.
Freund, J.R., Wilson, W.J., Sa, P., (2006). Regression Analysis, Statistical
Modeling of a Response Variable. 2nd ed., Burlington, MA: Elsevier.
Georgian, T., (2009). Evaluating ‘goodness-of fit’ for linear instrument
calibrations through the origin. Int. J. Enviro. Anal. Chem. 89, 383-
388.
Gillingham, G., Heien, D., (1971). Regression through the origin. Am. Stat.
25(1), 54-55.
Giordano, J.L., (1999). On reporting uncertainties of the straight-line
regression parameters. Eur. J. Phys. 20(5), 345-349.
Goldsmith, P.L., (1981). Letter to the Editor. Stat. 30(3), 234.
Gordon, H.A., (1981a). Errors in computer packages. Least squares
regression through the origin. Stat. 30(1), 23-29.
Gordon, H.A., (1981b). Letter to Editor. Stat. 30(4), 305-308.
Green, J.R., Margerison, D., (1977). Statistical Treatment of Experimental
Data: The straight line through the origin or through some other fixed
point. Amsterdam: Elsevier, Chapter 12, pp. 198-235.
Hahn, G.J., (1977). Fitting regression models with no intercept term. J.
Qual. Technol. 9(2), 56-61.
Hamel, O.S., (2015) A method for calculating a meta- analytical prior for
the natural mortality rate using multiple life history correlates. ICES J.
Mar. Sci. 72(1), 62-69.
Hawkins, D.M., (1980). A note on fitting a regression without an intercept
term. Am. Stat. 34(4), 233.
Haws, A.P., Gordon, H.A., (1981). Letter to Editor. Stat. 30(4), 304-308.
Hedayat, A., (1970). Examination and analysis of residuals, diagnostic
checking of residuals for detecting a special type of heteroscedasticity
in linear regression through the origin. Biometrics 26(3), 603. (Joint
Meeting of ENAR with IMS and ASA, Chapel Hill, North Carolina).
Hedayat, A., Raktoe, B.L., Talwar, P.P., (1977). Examination and analysis
of residuals: a test for detecting a monotonic relation between mean
and variance in regression through the origin. Commun. Stat. Theory
Methods 6(6), 497-506.
Howarth, R.J., (2001). A history of regression and related model-fitting in
the earth sciences. Nat. Resourc. Res. 10(4), 241-286.
Huber, M.K.W., (1997). Improved calibration for wide measuring ranges
and low contents. Accred. Qual. Assur. 2(8), 367-374.
Huber, P.J., (1964). Robust estimation of a location parameter. Ann. Math.
Stat. 35, 73-101.
Iwao, S., (1968). A new regression method for analyzing the aggregation
pattern of animal population. Res. Popul. Ecol. 10(1), 1-20.
Iwase, K., (1989). Linear regression through the origin with constant
coefficient of variation for the inverse Gaussian distribution. Commun.
Stat. Theory Methods 18(10), 3587-3593.
Kayhan, Y., Gunay, S.M., (2008). A new approach to least median of
squares and regression through the origin. Commun. Stat. Theory
Methods 37(5), 773-781.
Kemp, G.J., (1985). The susceptibility of calibration methods to errors in
the analytical signal. Anal. Chim. Acta 176, 229-247.
Kemp, G.J., (1984a). Theoretical aspects of one-point calibration: causes
and effects of some potential errors, and their dependence on
concentration. Clin. Chem. 30(7), 1163-1167.
Kemp, G.J., (1984b). Assessment of analytical bias: four new ways to use
recovery measurements. Clin. Chem. 30(7), 1168-1170.
Kerrich, J.E., (1966). Fitting the line y=ax when errors of observation are
present in both variables. Am. Stat. 20(1), 24.
Kim, M-H., Burkart, M., (2008). The author replies. Including or not
including an original point (0,0) in a regression analysis for building a
standard curve. J. Chem. Educ. 85(5), 635-636.
Kim, M-H., Burkart, M., Kim, M.H., (2006). A method of visual
interactive regression. J. Chem. Educ. 83(12), 1884.
Knaub, J.R., (2009) Properties of weighted least squares regression for
cutoff sampling in establishment surveys. Conference paper Cutoff
Sampling and Establishment Surveys. InterStat J. December.
Knight, G., Moore, E., (1987). Comparison of dust samplers: statistical
analysis techniques. Am. Ind. Hyg. Assoc. J. 48(4), 344-353.
Kozak, A., Kozak, R.A., (1995). Notes on regression through the origin.
Forest. Chron. 7(3), 326-330.
Kvalseth, T.O., (1985). Cautionary note about R2. Am. Stat. 39(4), 279-
285.
Lark, P.D., Craven, B.R., Bosworth, R.C.L., (1969). The Handling of
Chemical Data (pp 159-163). Oxford, England: Pergamon Press.
Leary, J.J., Messick, E.B., (1985). Constrained calibration curves: a novel
application of Lagrange multipliers in analytical chemistry. Anal.
Chem. 57(4), 956-957.
Legendre, P., Desdevises, Y., (2009). Independent contrasts and regression
through the origin. J. Theoret. Biol. 259(4), 727-743.
Linnet, K., (1993). Evaluation of regression procedures for method
comparison studies. Clin. Chem. 39(3), 424-432.
Lisy, J.M., (1990). Multiple straight-line least squares analysis with
uncertainties in all variables. Comp. Chem. 14, 189-192.
Liteanu, C., Rica, I., (1980). Statistical Theory and Methodology of Trace
Analysis. New York, USA: Ellis Horwood, pp. 161-162.
Mandel, J., (1957). Fitting a straight line to certain type of cumulative data.
J. Am. Stat. Assoc. 12(280), 552-566.
Mandel, J., (1964). The Statistical Analysis of Experimental Data. New
York, USA: Dover, pp. 295-303.
Martin, J., Asuero, A.G., (2017). Weighting and transforming data in linear
regression. In Linear Regression: Models, Analysis and Applications.
Nova Science Publishers.
Martin, R.F., (2000). General Deming regression for estimating systematic
bias and its confidence interval in method comparison studies. Clin.
Chem. 46, 100-104.
Meites, L., Leary, J.J., (1985). Simple procedures for obtaining constrained
calibration equations. Anal. Chim. Acta 176, 249-251.
Meloun, M., Militky, J., (2011). Statistical Data Analysis. A Practical
Guide with 1250 exercises and answer key on CD. New Delhi:
Woodhead Publ., pp 483-486.
Meloun, M., Militky, J., Forina, M., (1994). Chemometrics for Analytical
Chemistry. Volume 2: PC-Aided Regression and Related Methods.
New York, USA: Ellis Horwood, pp. 30-33.
Moreno, C., (1996). A least-squares based method for determining the ratio
between two measured quantities. Measur. Sci. Technol. 7(2), 137-141;
(1997), 8(8), 951.
Muller, C.H., (2011). Data depth for simple orthogonal regression with
application to crack orientation. Metrika 74(2), 135-165.
Muller, C.H., Wellmann, R., (2009). Data Depth for Classical and
Orthogonal Regression. Cors09 International Conference on Robust
Statistics, Book of Abstracts (pp. 110-111), Università degli Studi di
Parma, Facoltà di Economia, Riani, M., Ceroli, A., Agostinelli, C.,
Perrotta, D. Eds., Libero Libri- Claudio Agostinelli, Paese (TV), Italy.
Mullins, E., (2003). Statistics for the Quality Control Chemistry
Laboratory. Cambridge, England: Royal Society of Chemistry (RSC),
pp. 270-275.
Myers, R.H., (1986). Classical and Modern Regression with Applications.
Boston, USA: Duxbury Press.
Natrella, M.G., (1963). Experimental Statistics, NBS Handbook 91.
Washington: U.S. Government Printing Office, pp. 5-24 to 5-27.0,
483-485.
Noggle, J.H., (1993). Practical Curve Fitting and Data Analysis. Software
and Self-Instruction for Scientists and Engineers. Chichester, England:
Ellis Horwood.
Okunade, A.A., Chang, C.F., Evans, R.D., (1993). Comparative analysis of
regression output summary statistics in common statistical packages.
Am. Stat. 47(4), 298-303.
Othman, S.A., (2014). Comparison between models with and without
intercept. Gen. Math. Notes 21(1), 118-127.
Raposo, F., (2016). Evaluation of analytical calibration based on least-
squares linear regression for instrumental techniques: a tutorial review.
Trends Anal. Chem. 77, 167-185.
Rieder, H., (1989). A finite-sample minimax regression estimator. Stat.
20(2), 211-221.
Ripley, B.D., Thompson, M., (1987). Regression techniques for the
detection of analytical bias. Analyst 112(4), 377-383.
Rousseeuw, P.J., (1984). Least median of squares regression. J. Am. Stat.
Assoc. 79(12), 871-880.
Rousseeuw, P., (1988). PROGRESS: a program for robust regression.
Trends Anal. Chem. 7(9), 320-321.
Rousseeuw, P.J., Hubert, M., (1997). Recent development in PROGRESS.
In Lecture Notes-Monograph Series. Vol. 31, Institute of Mathematical
Statistics (IMS).
Rousseeuw, P.J., van Aelst, S., Rambali, B., Smeyers-Verbeke. J., (2001).
Deepest regression in analytical chemistry. Anal. Chim. Acta 446, 245-
256.
Rousseeuw, P.J., Leroy, A.M., (1987). Robust Regression & Outlier
Detection: Simple regression through the origin. New York, USA:
Wiley, pp. 62-65.
Roy, K., Kar, S., (2014). The r(m)(2) metrics and regression through origin
approach: reliable and useful validation tools for predictive QSAR
models (commentary on ‘Is regression through the origin useful in
external validation of QSAR models?’). Eur. J. Pharm. Sci. 62, 111-114.
Roy, K., Mitra, I., (2012). On the use of the metric rm2 as an effective tool
for validation of QSAR models in computational drug design and
predictive toxicology. Mini Rev. Med. Chem. 12(6), 491-504.
Ryan, T.P., (2008). Modern Regression Methods. 2nd ed., New York, USA:
Wiley.
Sands, D.E., (1974). Weighting factors in least squares. J. Chem. Educ.
51(7), 473-474.
Sayago, A., Boccio, M., Asuero, A.G., (2004). Fitting straight lines with
replicated observations by linear regression: the least squares
postulates. Crit. Rev. Anal. Chem. 34(1), 39-50.
Sayago, A., Asuero, A.G., (2004). Fitting straight lines with replicated
observations by linear regression. Part II. Testing for homogeneity of
variances. Crit. Rev. Anal. Chem. 34(3-4), 133-146.
Schwartz, L.M., (1986). Effect of constraints on precision of calibration
analyses. Anal. Chem. 58(1), 246-250.
Schwartz, L.M., Gelb, R.I., (1984). Statistical uncertainties of end points at
intersecting straight lines. Anal. Chem. 56(8), 1487-1492.
Scott, A., Wild, C., (1991). Transformations and R2. Am. Stat. 45(2), 127-
129.
Seber, G.A.F., Lee, A.J., (2003). Linear Regression Analysis. 2nd ed., New
York, USA: Wiley, p. 149.
Shayanfar, A., Shayanfar, S., (2011). Is regression through origin useful in
external evaluation of QSAR models?. Eur. J. Pharm. Sci. 87, 271-
273.
Sheather, S.J., (2009). A Modern Approach to Regression with R. New
York: Springer, pp 51-70, pp 115-123.
Strong III, F.C., (1979). Regression line that starts at the origin. Anal.
Chem. 51(2), 298-299.
Sun, L., Xie, T., Fan Y.M., Zhang, C., (2008). Fitting curve passing
through designated point to data for promoting the reproducibility of
peripheral quantitative computed tomography (pQCT). IEEE
Computer Society 2008: Proceedings of the 2008 International
Conference on BioMedical Engineering and Informatics, Sanya,
Hainan, China, Vol. 2, pp. 867-871.
Synek, V., (2001). Calibration lines passing through the origin with errors
in both axes. Accred. Qual. Assur. 6(8), 360-367.
Synek, V., Subrt, P., Marecek, J., (2000). Uncertainties of mercury
determinations in biological materials using an atomic absorption
spectrometer – AMA 254. Accred. Qual. Assur. 5(2), 58-66.
Tan, H.S., Jones, W.E., (1989). Fitting of a straight line when both
variables contain errors. Application to the Beer-Lambert law. J.
Chem. Educ. 66(8), 650-651.
Turner, M.E., (1960). Straight line regression through the origin.
Biometrics 16(3), 483-485.
Uyar, B., Erdem, O., (1990). Regression procedures in SAS problems?.
Am. Stat. 44(4), 296-301.
Valentine, T.J., (1971). Regression through origin. Am. Stat. 25(5), 58-59.
van Zoonen, P., Hoogerbrugge, R., Gort, S.M., van de Wiel, H.J., van’t
Klooster, H.A., (1999). Some practical examples of method validation
in the analytical laboratory. Trends Anal. Chem. 18(9-10), 584-593.
Waters, E.K., Furlon, M.J., Benke, K.K., Grove, J.R., Hamilton, A.J.,
(2014). Iwao’s patchiness regression through the origin: biological
importance and efficiency of sampling applications. Popul. Ecol.
56(2), 393-399.
Willett, J.B., Singer, J.D., (1988). Another cautionary note about R2: its use
in weighted least squares regression. Am. Stat. 42(3), 236-238.
Williamson, J.H., (1968). Least squares fitting of a straight line. Can. J.
Phys. 46 (16), 1845-1847.
Winsor, C.P., (1946). Which regression. Biometr. Bull. 2(6), 101-109.
York, D., (1969). Least squares fitting of a straight line with correlated
errors. Earth Planet. Sci. Lett. 5, 320-324.
In: Linear Regression ISBN: 978-1-53611-992-3
Editor: Vera L. Beck
© 2017 Nova Science Publishers, Inc.

Chapter 3

LINEAR REGRESSION FOR INTERVAL-VALUED DATA IN $\mathcal{K}_C(\mathbb{R})$

Yan Sun∗ and Chunyang Li

Department of Mathematics & Statistics

Utah State University, Logan, UT, US

Abstract

In recent scientific research, data increasingly take on new formats such as sets, lists, and histograms. Among these, a frequently encountered type is interval-valued data, that is, a collection of observations in the form of intervals. Examples include the daily [min, max] temperature, the [low, high] elevation of a geographical region, and the range of a group of individual observations. Linear regression, as a fundamental tool of statistical analysis, has been increasingly investigated for extensions that accommodate interval-valued data. Various models and methods have been proposed and studied in recent decades. However, issues such as interpretability and computational feasibility remain. In particular, a commonly accepted mathematical foundation is still largely underdeveloped relative to the demands of applications.

In this chapter, we focus on linear regression for interval-valued data within the framework of random sets, and propose a new model that generalizes a series of existing ones. With this model, we continue to build a theoretical framework that clarifies the existing models and facilitates future developments. In particular, we establish important properties of the model in the space of compact convex subsets of $\mathbb{R}$, analogous to those of classical linear regression. Additionally, we carry out theoretical investigations into the least squares estimation that is widely used in the literature, and show that the least squares estimator is asymptotically unbiased. A simulation study that supports our theorems is presented, and an application to a climate data set is demonstrated.

∗ Corresponding Author: [email protected]

Keywords: linear regression, random interval, metric space, coefficient of determination, least squares, asymptotic unbiasedness

1. Introduction
Linear regression for interval-valued data has been attracting increasing interest among researchers. See [10], [20], [12, 13], [23], [8], [5], [14], [26, 27], [6], [9] for a partial list of references. However, issues such as interpretability and computational feasibility remain. In particular, a commonly accepted mathematical foundation is still largely underdeveloped relative to the demands of applications. By proposing our new model, we continue to build a theoretical framework that clarifies the existing models and facilitates future developments.
In the statistics literature, interval-valued data analysis is most often studied under the framework of random sets, which includes random intervals as the special (one-dimensional) case. The probability-based theory for random sets has developed since the publication of the seminal book of [24]. See [25] for a relatively complete monograph. To facilitate the presentation of our results, we briefly introduce the basic notation and definitions of random set theory. Let $(\Omega, \mathcal{L}, P)$ be a probability space. Denote by $\mathcal{K}(\mathbb{R}^d)$ or $\mathcal{K}$ the collection of all non-empty compact subsets of $\mathbb{R}^d$. In the space $\mathcal{K}$, a linear structure is defined by Minkowski addition and scalar multiplication, i.e.,

$$A + B = \{a + b : a \in A,\ b \in B\}, \qquad \lambda A = \{\lambda a : a \in A\},$$

$\forall A, B \in \mathcal{K}$ and $\lambda \in \mathbb{R}$. A natural metric for the space $\mathcal{K}$ is the Hausdorff metric $\rho_H$, which is defined as

$$\rho_H(A, B) = \max\left\{ \sup_{a \in A} \rho(a, B),\ \sup_{b \in B} \rho(b, A) \right\}, \quad \forall A, B \in \mathcal{K},$$

where $\rho$ denotes the Euclidean metric. A random compact set is a Borel measurable function $A : \Omega \to \mathcal{K}$, $\mathcal{K}$ being equipped with the Borel σ-algebra induced by the Hausdorff metric. For each $X \in \mathcal{K}(\mathbb{R}^d)$, the function defined on the unit sphere $S^{d-1}$,

$$s_X(u) = \sup_{x \in X} \langle u, x \rangle, \quad \forall u \in S^{d-1},$$

is called the support function of $X$. If $A(\omega)$ is convex almost surely, then $A$ is called a random compact convex set (see [25], p. 21, p. 102). The collection of all compact convex subsets of $\mathbb{R}^d$ is denoted by $\mathcal{K}_C(\mathbb{R}^d)$ or $\mathcal{K}_C$. When $d = 1$, the corresponding $\mathcal{K}_C$ contains all the non-empty bounded closed intervals in $\mathbb{R}$. A measurable function $X : \Omega \to \mathcal{K}_C(\mathbb{R})$ is called a random interval. Much of random set theory has focused on compact convex sets. Let $\mathcal{S}$ be the space of support functions of all non-empty compact convex subsets in $\mathcal{K}_C$. Then, $\mathcal{S}$ is a Banach space equipped with the $L_2$ metric

$$\|s_X(u)\|_2 = \left[ d \int_{S^{d-1}} |s_X(u)|^2 \, \mu(du) \right]^{1/2},$$

where $\mu$ is the normalized Lebesgue measure on $S^{d-1}$. According to the embedding theorems (see [28], [15]), $\mathcal{K}_C$ can be embedded isometrically into the Banach space $C(S)$ of continuous functions on $S^{d-1}$, and $\mathcal{S}$ is the image of $\mathcal{K}_C$ in $C(S)$. Therefore, $\delta(X, Y) := \|s_X - s_Y\|_2$, $\forall X, Y \in \mathcal{K}_C$, defines a metric on $\mathcal{K}_C$. In particular, let

$$X = [\underline{X}, \overline{X}] = [X^c - X^r,\ X^c + X^r]$$

be a bounded closed interval with center $X^c$ and radius $X^r$, or lower bound $\underline{X}$ and upper bound $\overline{X}$, respectively. Then, the δ-metric of $X$ satisfies

$$\|X\|_2^2 = \|s_X(u)\|_2^2 = \frac{1}{2}\left( \underline{X}^2 + \overline{X}^2 \right) = (X^c)^2 + (X^r)^2,$$

and the δ-distance between two intervals $X$ and $Y$ is

$$\delta(X, Y) = \left[ \frac{1}{2}\left(\underline{X} - \underline{Y}\right)^2 + \frac{1}{2}\left(\overline{X} - \overline{Y}\right)^2 \right]^{1/2} = \left[ (X^c - Y^c)^2 + (X^r - Y^r)^2 \right]^{1/2}.$$
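The two expressions for the δ-distance between intervals can be checked numerically. The following is a minimal sketch, not part of the original chapter; the function names and the two example intervals are illustrative assumptions.

```python
import math

def delta_endpoints(x, y):
    """delta-distance computed from (lower, upper) endpoint pairs."""
    (xl, xu), (yl, yu) = x, y
    return math.sqrt(0.5 * (xl - yl) ** 2 + 0.5 * (xu - yu) ** 2)

def delta_center_radius(x, y):
    """The same distance computed from centers and radii."""
    (xl, xu), (yl, yu) = x, y
    xc, xr = (xl + xu) / 2, (xu - xl) / 2
    yc, yr = (yl + yu) / 2, (yu - yl) / 2
    return math.sqrt((xc - yc) ** 2 + (xr - yr) ** 2)

# Hypothetical intervals: X has center 3 and radius 2, Y has center 6 and radius 4.
X = (1.0, 5.0)
Y = (2.0, 10.0)
d1 = delta_endpoints(X, Y)
d2 = delta_center_radius(X, Y)
print(d1, d2)  # both equal sqrt(13)
```

Working in center-radius coordinates is what allows the regression model to be written as two coupled real-valued equations.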

Investigation of linear regression in $\mathcal{K}_C(\mathbb{R})$ began with [10], which developed a least squares fitting of compact set-valued data and considered interval-valued input and output as a special case. Precisely, [10] gave analytical solutions for the real numbers $a$ and $b$, under different circumstances, such that $\delta(Y, aX + b)$ is minimized on the data. The pioneering idea of [10] was further studied in [11, 12], where the δ-metric was extended to a more general metric, called the W-metric, originally proposed by [20]. The advantage of the W-metric lies in its flexibility to assign weights to the radius and midpoint in calculating the distance between intervals. Up to that point, the literature had focused on finding the affine transformation $Y = aX + b$ that best fits the data, without assuming that the data fulfill such a transformation. A probabilistic model along this direction was missing until [13], and simultaneously [14], proposed the same simple linear regression model for the first time. The model essentially takes the form

$$Y_i = aX_i + b + \varepsilon_i, \tag{1}$$

with $a, b \in \mathbb{R}$ and $E(\varepsilon_i) = [-c, c]$, $c \in \mathbb{R}$. This can be written equivalently as

$$Y_i^c = aX_i^c + b + \varepsilon_i^c,$$

$$Y_i^r = |a| X_i^r + c + \varepsilon_i^r.$$

It leads to the following equation that shows linearity in $\mathcal{K}_C$:

$$\delta\left(\hat{Y}_i, \hat{Y}_j\right) = |a|\,\delta(X_i, X_j). \tag{2}$$
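Property (2) can be illustrated with a few lines of interval arithmetic. This is a hedged sketch rather than the authors' code: the helper `predicted` applies the Minkowski operations to a (lower, upper) pair and adds the expected error interval [−c, c], and all parameter values and intervals are made-up examples.

```python
import math

def delta(x, y):
    """delta-distance between intervals given as (lower, upper) pairs."""
    return math.sqrt(0.5 * (x[0] - y[0]) ** 2 + 0.5 * (x[1] - y[1]) ** 2)

def predicted(a, b, c, x):
    """Interval a*X + b + [-c, c]; a negative a swaps the endpoints."""
    lo, hi = sorted((a * x[0], a * x[1]))
    return (lo + b - c, hi + b + c)

a, b, c = -2.0, 5.0, 0.5           # hypothetical parameter values
Xi, Xj = (1.0, 5.0), (0.0, 2.0)    # hypothetical predictor intervals
lhs = delta(predicted(a, b, c, Xi), predicted(a, b, c, Xj))
rhs = abs(a) * delta(Xi, Xj)
print(lhs, rhs)  # both equal 2 * sqrt(5)
```

The shift b and the common error interval cancel in the difference, so only the factor |a| remains.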

Some advances have been made regarding this model and the associated estimators. [13] derived least squares estimators for the model parameters and examined them from a theoretical perspective. [14] established a test of linear independence for interval-valued data. However, many problems remain open, such as biases and asymptotic distributions, as anticipated in [13]. This chapter continues this line of development, addressing some issues and open problems in the direction of model (1).
We point out that, in a separate framework, linear regression models for interval-valued data have been studied in $\mathbb{R}^2$ by treating the intervals essentially as bivariate vectors. Examples in this category include the center method by [3], the MinMax method by [4], the (constrained) center and range method by [26, 27], and the model M by [6]. Although the bivariate representation of an interval can result in a loss of geometric information (e.g., equation (2) no longer holds), models of this type generally offer better flexibility and easier inference, and are therefore preferred in some practical situations. We emphasize that the purpose of the chapter is not to compare models from the two domains, but to focus on and provide insights into model developments in $\mathcal{K}_C(\mathbb{R})$.
Our contributions in this chapter are threefold. First, we relax the restriction of model (1) that the Hukuhara difference $Y \ominus (aX + b)$ must exist (see [16]) and generalize the univariate model to the multiple case. We also give analytical least squares (LS) solutions for the model parameters. Second, we show that our model and LS estimation together accommodate a decomposition of the sums of squares in $\mathcal{K}_C$ analogous to that of classical linear regression. Third, we derive explicit formulas for the LS estimates of the univariate model, which exist with probability going to one. The LS estimates are further shown to be asymptotically unbiased. A simulation study is carried out to validate our theoretical findings. Finally, we apply our model to a climate data set to illustrate its applicability.
The rest of the chapter is organized as follows: Section 2 formally intro-
duces our model and the associated LS estimators. Then, the sums of squares
and coefficient of determination in KC are defined and discussed. Section 3
presents the theoretical properties of the LS estimates for the univariate model.
The simulation study is reported in Section 4, and the real data application is
presented in Section 5. We give concluding remarks in Section 6. Technical
proofs and useful lemmas are deferred to the Appendices.
2. The Proposed Model

2.1 Model Specification
We consider an extension of model (1) to the form

$$\delta(Y_i, aX_i + b) = \|\varepsilon_i\|_2, \tag{3}$$

where $E[\varepsilon_i] = [-c, c]$, $c > 0$. It is equivalently expressed as

$$\begin{cases} Y_i = aX_i + b + \varepsilon_i, & \text{if } Y_i \ominus (aX_i + b) \text{ exists;} \\ Y_i + \varepsilon_i = aX_i + b, & \text{otherwise, if } (aX_i + b) \ominus Y_i \text{ exists.} \end{cases} \tag{4}$$

This leads to the following center-radius specification

$$Y_i^c = aX_i^c + b \pm \varepsilon_i^c,$$

$$Y_i^r = |a| X_i^r \pm \varepsilon_i^r,$$

where $E(\varepsilon_i^c) = 0$, $E(\varepsilon_i^r) = c > 0$, and the signs "±" correspond to the two cases in (4). Define

$$\begin{cases} \lambda_i = \varepsilon_i^c,\ \eta_i = \varepsilon_i^r, & \text{if } Y_i \ominus (aX_i + b) \text{ exists;} \\ \lambda_i = -\varepsilon_i^c,\ \eta_i = -\varepsilon_i^r, & \text{otherwise, if } (aX_i + b) \ominus Y_i \text{ exists.} \end{cases} \tag{5}$$

Our model is specified as

$$Y_i^c = aX_i^c + b + \lambda_i, \tag{6}$$

$$Y_i^r = |a| X_i^r + \eta_i, \tag{7}$$

where $E(\lambda_i) = 0$, $E(\eta_i) = \mu \in [-c, c]$, $\mathrm{Var}(\lambda_i) = \sigma_\lambda^2 > 0$, and $\mathrm{Var}(\eta_i) = \sigma_\eta^2 > 0$.

To model the outcome intervals $Y_i = [\underline{Y}_i, \overline{Y}_i]$ by $p$ interval-valued predictors $X_{j,i} = [\underline{X}_{j,i}, \overline{X}_{j,i}]$, $i = 1, \cdots, n$; $j = 1, \cdots, p$, we consider the multivariate extension of (3):

$$\delta\left( Y_i,\ b + \sum_{j=1}^{p} a_j X_{j,i} \right) = \|\varepsilon_i\|_2, \tag{8}$$

which leads to the following center-radius specification

$$Y_i^c = b + \sum_{j=1}^{p} a_j X_{j,i}^c + \lambda_i, \tag{9}$$

$$Y_i^r = \sum_{j=1}^{p} |a_j| X_{j,i}^r + \eta_i, \tag{10}$$

where $E(\lambda_i) = 0$, $E(\eta_i) = \mu \in [-c, c]$, $\mathrm{Var}(\lambda_i) = \sigma_\lambda^2$, and $\mathrm{Var}(\eta_i) = \sigma_\eta^2$. We assume throughout this chapter that $\lambda_i$ and $\eta_i$ are independent, to simplify the presentation. A model that includes a covariance between $\lambda_i$ and $\eta_i$ can be implemented without much extra difficulty.

2.2 Least Squares Estimate (LSE)

The least squares method is widely used in the literature to estimate the regression coefficients for interval-valued data ([10], [20], [12]). It minimizes $\delta(Y, E(Y|X))$ on the data with respect to the parameters. Denote

$$\hat{Y}_i^c = E(Y_i^c | X_i) = b + \sum_{j=1}^{p} a_j X_{j,i}^c, \tag{11}$$

$$\hat{Y}_i^r = E(Y_i^r | X_i) = \mu + \sum_{j=1}^{p} |a_j| X_{j,i}^r. \tag{12}$$

Then the sum of squared δ-distances between $Y_i$ and $\hat{Y}_i$ is written as

$$L = \sum_{i=1}^{n} \delta^2\left[ E(Y_i | X_i), Y_i \right] = \sum_{i=1}^{n} \left[ \left( b + \sum_{j=1}^{p} a_j X_{j,i}^c - Y_i^c \right)^2 + \left( \sum_{j=1}^{p} |a_j| X_{j,i}^r + \mu - Y_i^r \right)^2 \right].$$

Therefore, the LSE of $\{\mu, b, a_j, j = 1, \cdots, p\}$ is defined as

$$\left\{ \hat{\mu}, \hat{b}, \hat{a}_j, j = 1, \cdots, p \right\} = \arg\min \frac{1}{n} L(\mu, b, a_j, j = 1, \cdots, p). \tag{13}$$

Let

$$S\left(X_j^c, X_k^c\right) = \frac{1}{n} \sum_{i=1}^{n} X_{j,i}^c X_{k,i}^c - \left( \frac{1}{n} \sum_{i=1}^{n} X_{j,i}^c \right)\left( \frac{1}{n} \sum_{i=1}^{n} X_{k,i}^c \right),$$

$$S\left(X_j^r, X_k^r\right) = \frac{1}{n} \sum_{i=1}^{n} X_{j,i}^r X_{k,i}^r - \left( \frac{1}{n} \sum_{i=1}^{n} X_{j,i}^r \right)\left( \frac{1}{n} \sum_{i=1}^{n} X_{k,i}^r \right),$$

be the sample covariances of the centers and radii of $X_j$ and $X_k$, respectively. In particular, when $k = j$, we denote by $S^2(X_j^c)$ and $S^2(X_j^r)$ the corresponding sample variances. In addition, define

$$S\left(X_j^c, Y^c\right) = \frac{1}{n} \sum_{i=1}^{n} X_{j,i}^c Y_i^c - \left( \frac{1}{n} \sum_{i=1}^{n} X_{j,i}^c \right)\left( \frac{1}{n} \sum_{i=1}^{n} Y_i^c \right),$$

$$S\left(X_j^r, Y^r\right) = \frac{1}{n} \sum_{i=1}^{n} X_{j,i}^r Y_i^r - \left( \frac{1}{n} \sum_{i=1}^{n} X_{j,i}^r \right)\left( \frac{1}{n} \sum_{i=1}^{n} Y_i^r \right),$$

as the sample covariances of the centers and radii of $X_j$ and $Y$, respectively. Then, the minimization problem (13) is solved in the following proposition.

Proposition 1. The least squares estimates of the regression coefficients $\{\hat{a}_j\}_{j=1}^{p}$, if they exist, are a solution of the equation system:

$$\sum_{j=1}^{p} a_j S\left(X_j^c, X_k^c\right) + \mathrm{sgn}(a_k) \sum_{j=1}^{p} |a_j| S\left(X_j^r, X_k^r\right) = S\left(X_k^c, Y^c\right) + \mathrm{sgn}(a_k) S\left(X_k^r, Y^r\right), \quad k = 1, \cdots, p. \tag{14}$$

And then, $\hat{b}$ and $\hat{\mu}$ are given by

$$\hat{b} = \overline{Y^c} - \sum_{j=1}^{p} \hat{a}_j \overline{X_j^c}, \tag{15}$$

$$\hat{\mu} = \overline{Y^r} - \sum_{j=1}^{p} |\hat{a}_j| \overline{X_j^r}. \tag{16}$$
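System (14) is nonlinear only through the signs of the coefficients: once a sign pattern is fixed, it is linear in $a_1, \cdots, a_p$. The sketch below uses that observation to enumerate all $2^p$ sign patterns, keep the solutions consistent with their assumed signs, and return the one with the smallest loss L. This enumeration is an illustrative assumption about how (14) could be solved numerically, not a procedure stated in the chapter, and the function name and test data are hypothetical.

```python
import itertools
import numpy as np

def fit_interval_ls(Xc, Xr, Yc, Yr):
    """Solve (14)-(16) by enumerating sign patterns; returns (a, b, mu) or None."""
    n, p = Xc.shape
    Xc0, Xr0 = Xc - Xc.mean(axis=0), Xr - Xr.mean(axis=0)
    Yc0, Yr0 = Yc - Yc.mean(), Yr - Yr.mean()
    Sc, Sr = Xc0.T @ Xc0 / n, Xr0.T @ Xr0 / n      # S(X_j^c, X_k^c), S(X_j^r, X_k^r)
    Scy, Sry = Xc0.T @ Yc0 / n, Xr0.T @ Yr0 / n    # S(X_k^c, Y^c), S(X_k^r, Y^r)
    best = None
    for signs in itertools.product([1.0, -1.0], repeat=p):
        s = np.array(signs)
        # With sgn(a_j) = s_j fixed, |a_j| = s_j * a_j and (14) is linear in a.
        M = Sc + np.outer(s, s) * Sr
        a = np.linalg.solve(M, Scy + s * Sry)
        if np.any(a * s < 0):      # solution contradicts the assumed signs
            continue
        b = Yc.mean() - Xc.mean(axis=0) @ a               # equation (15)
        mu = Yr.mean() - Xr.mean(axis=0) @ np.abs(a)      # equation (16)
        loss = np.sum((b + Xc @ a - Yc) ** 2 + (mu + Xr @ np.abs(a) - Yr) ** 2)
        if best is None or loss < best[0]:
            best = (loss, a, b, mu)
    return None if best is None else best[1:]

# Noise-free synthetic data (an assumption made purely for this check).
rng = np.random.default_rng(0)
Xc = rng.normal(size=(40, 2))
Xr = rng.uniform(0.5, 2.0, size=(40, 2))
a_true = np.array([-3.0, 2.0])
Yc = 5.0 + Xc @ a_true
Yr = 1.0 + Xr @ np.abs(a_true)
a_hat, b_hat, mu_hat = fit_interval_ls(Xc, Xr, Yc, Yr)
print(a_hat, b_hat, mu_hat)
```

The cost grows as $2^p$, so this brute-force approach is only sensible for a small number of predictors.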
2.3 Sums of Squares and Coefficient of Determination

The variance of a compact convex random set $X$ in $\mathbb{R}^d$ is defined via its support function as

$$\mathrm{Var}(X) = E\,\delta^2(X, EX),$$

where the expectation is defined by the Aumann integral (see [2], [1]) as

$$EX = \{E\xi : \xi \in X \text{ almost surely}\}.$$

See [18, 19]. For the case $d = 1$, it is shown by straightforward calculations that

$$EX = [E\underline{X}, E\overline{X}],$$

$$\mathrm{Var}(X) = \mathrm{Var}(X^c) + \mathrm{Var}(X^r).$$

This leads us to define the sums of squares in $\mathcal{K}_C(\mathbb{R})$ to measure the variability of interval-valued data. A definition of the coefficient of determination $R^2$ in $\mathcal{K}_C(\mathbb{R})$ follows immediately, which produces a measure of goodness-of-fit.

Definition 1. The total sum of squares (SST) in $\mathcal{K}_C$ is defined as

$$SST = \sum_{i=1}^{n} \left[ \left( Y_i^c - \overline{Y^c} \right)^2 + \left( Y_i^r - \overline{Y^r} \right)^2 \right]. \tag{17}$$

Definition 2. The explained sum of squares (SSE) in $\mathcal{K}_C$ is defined as

$$SSE = \sum_{i=1}^{n} \left[ \left( \hat{Y}_i^c - \overline{Y^c} \right)^2 + \left( \hat{Y}_i^r - \overline{Y^r} \right)^2 \right]. \tag{18}$$

Definition 3. The residual sum of squares (SSR) in $\mathcal{K}_C$ is defined as

$$SSR = \sum_{i=1}^{n} \left[ \left( Y_i^c - \hat{Y}_i^c \right)^2 + \left( Y_i^r - \hat{Y}_i^r \right)^2 \right]. \tag{19}$$

Definition 4. The coefficient of determination ($R^2$) in $\mathcal{K}_C$ is defined as

$$R^2 = 1 - \frac{SSR}{SST}, \tag{20}$$

where SST and SSR are defined in (17) and (19), respectively.

Analogous to the classical theory of linear regression, our model (9)-(10) together with the LS estimates (13) accommodates the partition of SST into SSE and SSR. As a result, the coefficient of determination ($R^2$) can also be calculated as the ratio of SSE to SST. The partition has a series of important implications for the underlying model, one of which is that the residual $Y \ominus \hat{Y}$ / $\hat{Y} \ominus Y$ and the predictor $\hat{Y}$ are empirically uncorrelated in $(\mathcal{K}_C, \delta)$.

Theorem 1. Assume model (9)-(10). Let $\hat{Y}_i^c$ and $\hat{Y}_i^r$ in (11)-(12) be calculated according to the LS estimates $\{\hat{\mu}, \hat{b}, \hat{a}_j, j = 1, \cdots, p\}$ in (13). Then,

$$SST = SSE + SSR.$$

It follows that the coefficient of determination in $\mathcal{K}_C$ is equivalent to

$$R^2 = SSE/SST.$$
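The partition in Theorem 1 can be observed numerically. The sketch below computes (17)-(20) for a univariate fit on simulated data. The data-generating distributions are assumptions made for illustration, and the fit covers only the half-space a ≥ 0, obtained here by stacking the center and radius equations into one ordinary least squares problem with unknowns (a, b, µ).

```python
import numpy as np

def sums_of_squares(Yc, Yr, Yc_hat, Yr_hat):
    """Return SST, SSE, SSR and R^2 as defined in (17)-(20)."""
    sst = np.sum((Yc - Yc.mean()) ** 2 + (Yr - Yr.mean()) ** 2)
    sse = np.sum((Yc_hat - Yc.mean()) ** 2 + (Yr_hat - Yr.mean()) ** 2)
    ssr = np.sum((Yc - Yc_hat) ** 2 + (Yr - Yr_hat) ** 2)
    return sst, sse, ssr, 1.0 - ssr / sst

# Simulated data; the distributions of X^c, X^r and the errors are assumptions.
rng = np.random.default_rng(1)
n = 50
Xc, Xr = rng.normal(5, 3, n), rng.uniform(0.5, 2.5, n)
Yc = 5 + 2 * Xc + rng.normal(0, 2, n)
Yr = 0.5 + 2 * Xr + rng.normal(0, 0.3, n)

# LS fit on the half-space a >= 0: centers and radii share the slope a.
A = np.block([[Xc[:, None], np.ones((n, 1)), np.zeros((n, 1))],
              [Xr[:, None], np.zeros((n, 1)), np.ones((n, 1))]])
a, b, mu = np.linalg.lstsq(A, np.concatenate([Yc, Yr]), rcond=None)[0]
Yc_hat, Yr_hat = b + a * Xc, mu + abs(a) * Xr

sst, sse, ssr, r2 = sums_of_squares(Yc, Yr, Yc_hat, Yr_hat)
print(sst, sse + ssr, r2)
```

If the fitted slope were negative, this half-space fit would not be the LSE and the identity would not be expected to hold for it.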

2.4 Positive Restriction and Goodness-of-fit

It is possible to get negative values of $\hat{Y}_i^r$ from its definition (12). That is, the model-implied outcome could lie outside $\mathcal{K}_C(\mathbb{R})$. This is an inevitable drawback of forcing a linear model on the nonlinear space $\mathcal{K}_C(\mathbb{R})$ (e.g., there is no inverse of addition). Theoretically, this phenomenon is closely related to the goodness-of-fit of the linear regression model. Theorem 2 gives an upper bound on how often the model predicts outcomes outside of $\mathcal{K}_C(\mathbb{R})$. For a model that largely explains the variability of $Y^r$, $\sigma_\eta^2$ should be very small, and so is this bound. Otherwise, the upper bound could grow large if most of the variability of $Y^r$ lies in the random error. In practice, for model inference, negative values of $\hat{Y}_i^r$ can be rounded up to 0, which always improves the prediction accuracy since $Y_i^r$ is non-negative.

Theorem 2. Consider model (9)-(10). Let $\hat{Y}_i$ be defined in (11)-(12). Then,

$$P\left( \hat{Y}_i^r < 0 \right) \le \frac{E\left( Y_i^r - \hat{Y}_i^r \right)^2}{(Y_i^r)^2} = \frac{\sigma_\eta^2}{(Y_i^r)^2}.$$
3. Properties of LSE
In this section, we study the theoretical properties of the LSE for the univariate model (6)-(7). Applying Proposition 1 to the case $p = 1$, we obtain the two sets of half-space solutions, corresponding to $a \ge 0$ and $a < 0$, respectively, as follows:

$$a^+ = \frac{S(X^c, Y^c) + S(X^r, Y^r)}{S^2(X^c) + S^2(X^r)}, \tag{21}$$

$$b^+ = \overline{Y^c} - a^+ \overline{X^c}, \tag{22}$$

$$\mu^+ = \overline{Y^r} - |a^+| \overline{X^r}; \tag{23}$$

and

$$a^- = \frac{S(X^c, Y^c) - S(X^r, Y^r)}{S^2(X^c) + S^2(X^r)}, \tag{24}$$

$$b^- = \overline{Y^c} - a^- \overline{X^c}, \tag{25}$$

$$\mu^- = \overline{Y^r} - |a^-| \overline{X^r}. \tag{26}$$

The final formula for the LS estimates falls into three categories. In the first, exactly one set of solutions exists, and it is defined as the LSE. In the second, both sets of solutions exist, and the LSE is the one that minimizes $L$. In the third, neither solution exists, but this only happens with probability going to 0. We summarize these findings in the following theorem.

Theorem 3. Assume model (6)-(7). Let $\{\hat{a}, \hat{b}, \hat{\mu}\}$ be the least squares solution defined in (13). If $|S(X^c, Y^c)| > |S(X^r, Y^r)|$, then there exists one and only one half-space solution. More specifically,

i. if in addition $S(X^c, Y^c) > 0$, then the LS solution is given by $\{\hat{a}, \hat{b}, \hat{\mu}\} = \{a^+, b^+, \mu^+\}$;

ii. if instead $S(X^c, Y^c) < 0$, then the LS solution is given by $\{\hat{a}, \hat{b}, \hat{\mu}\} = \{a^-, b^-, \mu^-\}$.

Otherwise, $|S(X^c, Y^c)| < |S(X^r, Y^r)|$, and then either both of the half-space solutions exist, or neither one exists. In particular,

iii. if in addition $S(X^r, Y^r) > 0$, then both of the half-space solutions exist, and $\{\hat{a}, \hat{b}, \hat{\mu}\} = \arg\min_{\{\{a^+, b^+, \mu^+\}, \{a^-, b^-, \mu^-\}\}} \{L(a, b, \mu)\}$;

iv. if instead $S(X^r, Y^r) < 0$, then the LS solution does not exist, but this happens with probability converging to 0.
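The case analysis of Theorem 3 translates directly into a short routine: compute both half-space candidates from (21)-(26), discard any candidate whose sign contradicts its half-space, and pick the remaining one with the smaller loss. The following is a minimal sketch under that reading; the function names and the noise-free test data are assumptions for illustration.

```python
import numpy as np

def cov(u, v):
    """Sample covariance with divisor n, matching S(., .) in the text."""
    return np.mean(u * v) - np.mean(u) * np.mean(v)

def univariate_lse(Xc, Xr, Yc, Yr):
    """Return (a, b, mu) following Theorem 3, or None if no solution exists."""
    denom = cov(Xc, Xc) + cov(Xr, Xr)
    a_plus = (cov(Xc, Yc) + cov(Xr, Yr)) / denom     # equation (21)
    a_minus = (cov(Xc, Yc) - cov(Xr, Yr)) / denom    # equation (24)
    candidates = []
    if a_plus >= 0:      # the a >= 0 half-space solution exists
        candidates.append(a_plus)
    if a_minus < 0:      # the a < 0 half-space solution exists
        candidates.append(a_minus)
    best = None
    for a in candidates:
        b = Yc.mean() - a * Xc.mean()
        mu = Yr.mean() - abs(a) * Xr.mean()
        loss = np.sum((b + a * Xc - Yc) ** 2 + (mu + abs(a) * Xr - Yr) ** 2)
        if best is None or loss < best[0]:
            best = (loss, a, b, mu)
    return None if best is None else best[1:]

# Noise-free check with a negative slope (synthetic data, assumed for the demo).
rng = np.random.default_rng(3)
Xc, Xr = rng.normal(5, 3, 30), rng.uniform(0.5, 2.5, 30)
Yc, Yr = 5 - 2 * Xc, 0.5 + 2 * Xr
a_hat, b_hat, mu_hat = univariate_lse(Xc, Xr, Yc, Yr)
print(a_hat, b_hat, mu_hat)
```

Returning None for case iv leaves it to the caller to decide how to handle a sample with no LS solution.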
Unlike in classical linear regression, the LS estimates for the model (6)-(7) are biased. We calculate the biases explicitly in Proposition 2 and show that they converge to zero as the sample size increases to infinity. Therefore, the LS estimates are asymptotically unbiased.

Proposition 2. Let $\{\hat{a}, \hat{b}, \hat{\mu}\}$ be the least squares solution in Theorem 3. Then,

$$E(\hat{a} - a) = -\frac{2a S^2(X^r)}{S^2(X^c) + S^2(X^r)} \left[ P(\hat{a} = a^-) I_{\{a \ge 0\}} + P(\hat{a} = a^+) I_{\{a < 0\}} \right],$$

$$E(|\hat{a}| - |a|) = -\frac{2|a| S^2(X^c)}{S^2(X^c) + S^2(X^r)} \left[ P(\hat{a} = a^-) I_{\{a \ge 0\}} + P(\hat{a} = a^+) I_{\{a < 0\}} \right].$$

Theorem 4. Consider model (6)-(7). Assume $S^2(X^c) = O(1)$ and $S^2(X^r) = O(1)$. Then, the least squares solution $\{\hat{a}, \hat{b}, \hat{\mu}\}$ in Theorem 3 is asymptotically unbiased, i.e.,

$$E \begin{pmatrix} \hat{a} \\ \hat{b} \\ \hat{\mu} \end{pmatrix} \to \begin{pmatrix} a \\ b \\ \mu \end{pmatrix},$$

as $n \to \infty$.

4. Simulation
We carry out a systematic simulation study to examine the empirical perfor-
mance of the least squares method proposed in this chapter. First, we consider
the following three models:
• Model 1: a = 2, b = 5, µ = 0.5, ση = 0.3, σλ = 2;

• Model 2: a = −2, b = 5, µ = 0.5, ση = 0.3, σλ = 3;

• Model 3: a = 2, b = 5, µ = −0.5, ση = 0.3, σλ = 2;

where the data show a positive correlation, a negative correlation, and a positive correlation with a negative µ, respectively. A simulated dataset from each model is shown in Figure 1, along with its fitted regression line.

[Figure 1 panels, each plotting Y against X: "Model 1: a=2, b=5, µ = 0.5"; "Model 2: a=-2, b=5, µ = 0.5"; "Model 3: a=2, b=5, µ = −0.5".]

Figure 1: Plots of simulated datasets from models 1, 2, and 3, each with sample size n = 50. The solid line denotes the regression line y = âx + b̂, and the two dashed lines denote the two accompanying lines y = âx + b̂ ± µ̂.
To investigate the asymptotic behavior of the LS estimates, we repeat the process of data generation and parameter estimation 1000 times independently, using sample sizes n = 20, 50, 100 for all three models. The resulting 1000 independent sets of parameter estimates for each model and sample size are evaluated by their mean absolute error (MAE) and mean error (ME). The numerical results are summarized in Table 1. Consistent with Proposition 2, â tends to underestimate a when a > 0 and overestimate a when a < 0. This bias also causes a positive and a negative bias in b̂ when a > 0 and a < 0, respectively. Similarly, a positive bias in µ̂ is induced by the negative bias in |â|. All the biases diminish toward 0 as the sample size increases, which confirms our finding in Theorem 4.
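The Monte Carlo procedure described above can be sketched as follows for Model 1. The chapter does not state how the predictor intervals or the errors were generated, so the normal and uniform distributions below are assumptions, as are the function names; the resulting numbers therefore will not reproduce Table 1 exactly. The number of repetitions is reduced to 500 to keep the example quick.

```python
import numpy as np

def cov(u, v):
    return np.mean(u * v) - np.mean(u) * np.mean(v)

def fit(Xc, Xr, Yc, Yr):
    """Univariate LSE following (21)-(26); returns (a, b, mu) or None."""
    denom = cov(Xc, Xc) + cov(Xr, Xr)
    a_plus = (cov(Xc, Yc) + cov(Xr, Yr)) / denom
    a_minus = (cov(Xc, Yc) - cov(Xr, Yr)) / denom
    cands = [a for a, ok in ((a_plus, a_plus >= 0), (a_minus, a_minus < 0)) if ok]
    best = None
    for a in cands:
        b = Yc.mean() - a * Xc.mean()
        mu = Yr.mean() - abs(a) * Xr.mean()
        loss = np.sum((b + a * Xc - Yc) ** 2 + (mu + abs(a) * Xr - Yr) ** 2)
        if best is None or loss < best[0]:
            best = (loss, a, b, mu)
    return None if best is None else best[1:]

def monte_carlo(a, b, mu, s_eta, s_lam, n, reps, rng):
    """Return (MAE, ME) vectors for the estimates of (a, b, mu)."""
    est = []
    for _ in range(reps):
        Xc = rng.normal(5.0, 3.0, n)       # assumed design for the centers
        Xr = rng.uniform(0.5, 2.5, n)      # assumed design for the radii
        Yc = a * Xc + b + rng.normal(0.0, s_lam, n)      # equation (6)
        Yr = abs(a) * Xr + rng.normal(mu, s_eta, n)      # equation (7)
        out = fit(Xc, Xr, Yc, Yr)
        if out is not None:                # skip samples with no LS solution
            est.append(out)
    err = np.array(est) - np.array([a, b, mu])
    return np.abs(err).mean(axis=0), err.mean(axis=0)

rng = np.random.default_rng(2017)
results = {n: monte_carlo(2.0, 5.0, 0.5, 0.3, 2.0, n, 500, rng) for n in (20, 50, 100)}
for n, (mae, me) in results.items():
    print(n, np.round(mae, 4), np.round(me, 4))
```

Changing the arguments of `monte_carlo` to the parameter values of Models 2 and 3 gives the corresponding experiments.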

Table 1. Evaluation of Parameter Estimation

                    a                   b                   µ
          n     MAE      ME         MAE      ME         MAE      ME

Model 1 (a = 2, b = 5, µ = 0.5)
          20    0.1449   -0.0921    0.8083   0.445      0.3655   0.2304
          50    0.0848   -0.0411    0.4899   0.2141     0.214    0.1011
          100   0.0562   -0.0171    0.3151   0.0872     0.142    0.041

Model 2 (a = -2, b = 5, µ = 0.5)
          20    0.2011   0.103      1.1389   -0.5071    0.5067   0.2578
          50    0.1205   0.0336     0.6973   -0.1774    0.3038   0.0807
          100   0.0842   0.0185     0.4814   -0.0865    0.2118   0.0465

Model 3 (a = 2, b = 5, µ = −0.5)
          20    0.1488   -0.1047    0.8143   0.495      0.3785   0.262
          50    0.0836   -0.0412    0.4703   0.2119     0.2108   0.1015
          100   0.0579   -0.0187    0.3321   0.098      0.1453   0.0464

Next, we compare our model to CCRM, a typical bivariate type of model from the literature. As discussed in the introduction, these two models are developed for different purposes and are generally not comparable. We include a comparison in the simulation study to better evaluate the performance of our model, with CCRM providing a baseline for convergence rate and prediction accuracy. From Models 1, 2, and 3, respectively, we simulate 1000 independent samples of size n = 20, 50, 100. Each sample is then randomly split into a training set (80%) and a validation set (20%). The two models are evaluated by their sample-variance-adjusted mean squared errors (AMSEs) on the validation set, which are defined as

$$AMSE(\text{center}) = \frac{\sum_{i=1}^{m} \left( Y_i^c - \hat{Y}_i^c \right)^2}{\sum_{i=1}^{m} \left( Y_i^c - \overline{Y^c} \right)^2},$$

$$AMSE(\text{radius}) = \frac{\sum_{i=1}^{m} \left( Y_i^r - \hat{Y}_i^r \right)^2}{\sum_{i=1}^{m} \left( Y_i^r - \overline{Y^r} \right)^2},$$

and

$$AMSE(\text{average}) = \frac{AMSE(\text{center}) + AMSE(\text{radius})}{2},$$

where m = n/5 is the size of the validation set. We use the R function ccrm in the iRegression package to implement CCRM. The average results of the 1000 repetitions are summarized in Table 2. For Models 1 and 2, the two approaches perform comparably. Model 3 has a negative µ, so CCRM is slightly worse than our model due to its positive restriction on µ. To show this more clearly, we consider the following two univariate models and one multivariate model with a much smaller µ:

• Model 4: a = 3, b = 5, µ = −5, ση = 0.5, σλ = 5;

• Model 5: a = −3, b = 5, µ = −5, ση = 0.5, σλ = 5;

• Model 6: a1 = −3, a2 = 2, b = 5, µ = −5, ση = 0.5, σλ = 5.

A sample of n = 50 from each of Models 4 and 5 is plotted in Figure 2. For all three of these models, our model performs significantly better than CCRM.
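The AMSE criteria used in this comparison are straightforward to compute once validation-set predictions are available. The sketch below is illustrative only: it assumes that the means in the denominators are taken over the validation set, and the function name and small arrays are made up.

```python
import numpy as np

def amse(Yc, Yr, Yc_hat, Yr_hat):
    """Return (AMSE(center), AMSE(radius), AMSE(average)) on a validation set."""
    center = np.sum((Yc - Yc_hat) ** 2) / np.sum((Yc - Yc.mean()) ** 2)
    radius = np.sum((Yr - Yr_hat) ** 2) / np.sum((Yr - Yr.mean()) ** 2)
    return center, radius, (center + radius) / 2

# Hypothetical validation set of size m = 3.
Yc = np.array([1.0, 2.0, 3.0])
Yr = np.array([1.0, 2.0, 3.0])
Yc_hat = np.array([1.0, 2.0, 4.0])   # one center prediction is off by 1
Yr_hat = np.array([1.0, 2.0, 3.0])   # radii are predicted exactly
center, radius, average = amse(Yc, Yr, Yc_hat, Yr_hat)
print(center, radius, average)  # 0.5 0.0 0.25
```

A value near 1 means the model does no better than predicting the sample mean, so smaller is better.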

[Figure 2 panels, each plotting Y against X: "Model 4: a = 3, b = 5, µ = −5"; "Model 5: a = −3, b = 5, µ = −5".]

Figure 2: Plots of simulated datasets from models 4 and 5, each with sample size n = 50.

5. A Real Data Application

In this section, we apply our model to analyze the average temperature data for large US cities, which are provided by the National Oceanic and Atmospheric Administration (NOAA) and are publicly available. The three data sets we obtained are average temperatures for 51 large US cities in January, April, and July. Each observation contains the averages of the minimum and maximum temperatures based on weather data collected from 1981 to 2010 by the NOAA National Climatic Data Center of the United States. July is in general the hottest month in the US. In this analysis, we aim to predict the summer (July) temperatures from those in the winter (January) and spring (April). Figure 3 plots the July temperatures versus those in January and April, respectively. The parameters are estimated according to (14)-(16) as

$$\hat{a}_1 = -0.4831, \quad \hat{a}_2 = 1.1926; \quad \hat{b} = 10.2510, \quad \hat{\mu} = -3.7071.$$

Denote by $T_{Jan}$, $T_{April}$, and $T_{July}$ the average temperatures in a US city in January, April, and July, respectively. The prediction for $T_{July}$ based on $T_{Jan}$ and $T_{April}$ is given by

$$\hat{T}_{July}^c = 10.2510 - 0.4831\, T_{Jan}^c + 1.1926\, T_{April}^c, \tag{27}$$

$$\hat{T}_{July}^r = -3.7071 + 0.4831\, T_{Jan}^r + 1.1926\, T_{April}^r. \tag{28}$$
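Equations (27)-(28) can be wrapped in a small prediction helper. In this sketch the coefficients are the estimates reported above, while the function name and the January and April input intervals are invented for illustration. Clipping a negative predicted radius to 0 follows the suggestion in Section 2.4.

```python
def predict_july(jan, april):
    """Predict the July [min, max] interval from January and April intervals (°C)."""
    jan_c, jan_r = (jan[0] + jan[1]) / 2, (jan[1] - jan[0]) / 2
    apr_c, apr_r = (april[0] + april[1]) / 2, (april[1] - april[0]) / 2
    center = 10.2510 - 0.4831 * jan_c + 1.1926 * apr_c      # equation (27)
    radius = -3.7071 + 0.4831 * jan_r + 1.1926 * apr_r      # equation (28)
    radius = max(radius, 0.0)    # a negative predicted radius is rounded to 0
    return center - radius, center + radius

# Hypothetical city: January averages [-5, 5] °C, April averages [5, 15] °C.
low, high = predict_july((-5.0, 5.0), (5.0, 15.0))
print(round(low, 4), round(high, 4))  # 17.5056 26.8484
```

Note that the radius equation uses |â1| = 0.4831, in line with (12), even though â1 itself is negative.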

The three sums of squares are calculated to be

$$SST = 663.8627; \quad SSE = 495.0874; \quad SSR = 168.7753.$$

Therefore, the coefficient of determination is

$$R^2 = 1 - \frac{SSR}{SST} = \frac{SSE}{SST} = 0.7458.$$

Table 2. Mean results of AMSE on the validation set based on 1000 independent repetitions

                      CCRM                          Our Model
          n     Center   Radius   Average     Center   Radius   Average

Model 1   20    0.1716   0.3134   0.2425      0.1772   0.3374   0.2573
          50    0.1181   0.2368   0.1775      0.1116   0.2313   0.1714
          100   0.1149   0.2241   0.1695      0.1119   0.2219   0.1669

Model 2   20    0.3499   0.3244   0.3372      0.3467   0.3294   0.3380
          50    0.2341   0.2356   0.2348      0.2344   0.2318   0.2331
          100   0.2263   0.2201   0.2232      0.2203   0.2200   0.2201

Model 3   20    0.1708   0.3367   0.2538      0.1687   0.3241   0.2464
          50    0.1192   0.2288   0.1740      0.1192   0.2246   0.1719
          100   0.1128   0.2196   0.1662      0.1101   0.2190   0.1646

Model 4   20    0.3795   0.3499   0.3647      0.1250   0.3691   0.2470
          50    0.2802   0.2738   0.2770      0.0867   0.2734   0.1800
          100   0.2580   0.2727   0.2653      0.0808   0.2605   0.1706

Model 5   20    0.3519   0.3207   0.3363      0.1204   0.3717   0.2461
          50    0.2712   0.2799   0.2756      0.0827   0.2681   0.1754
          100   0.2558   0.2751   0.2655      0.0800   0.2552   0.1676

Model 6   50    0.0622   0.4288   0.2455      0.0661   0.2536   0.1599
          100   0.0596   0.3934   0.2265      0.0606   0.2370   0.1488
          200   0.0565   0.3838   0.2201      0.0593   0.2344   0.1469
[Figure 3 panels, both titled "Average Temperatures for Large US Cities": July (°C) against January (°C), and July (°C) against April (°C).]

Figure 3: Left: plot of July versus January temperatures. Right: plot of July versus April temperatures.

Finally, the variance parameters can be estimated as

$$\hat{\sigma}_\lambda^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left( T_{July,i}^c - \hat{T}_{July,i}^c \right)^2 = 2.1708;$$

$$\hat{\sigma}_\eta^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left( T_{July,i}^r - \hat{T}_{July,i}^r \right)^2 = 1.2047.$$

Thus, by Theorem 2, an upper bound of $P\left( \hat{T}_{July,i}^r < 0 \right)$, on average, is estimated to be

$$\frac{1}{n} \sum_{i=1}^{n} \frac{\hat{\sigma}_\eta^2}{\left( T_{July,i}^r \right)^2} = \frac{1.2047}{n} \sum_{i=1}^{n} \frac{1}{\left( T_{July,i}^r \right)^2} = 0.047,$$

which is very small and reasonably negligible. We calculate $\hat{T}_{July,i}^r$ for the entire sample and all of them are well above 0. So, for these data, although $\hat{\mu} < 0$ and it is possible to get a negative predicted radius, it in fact never happens because the model has captured most of the variability. The empirical distributions of the residuals are shown in Figure 4. Both distributions are centered at 0, with the center residual having a slightly heavier tail.
[Figure 4 panel, titled "Probability Density Plots of Residuals": probability density against residuals, with one curve for the center residuals $T_{July}^c - \hat{T}_{July}^c$ and one for the radius residuals $T_{July}^r - \hat{T}_{July}^r$.]

Figure 4: Empirical probability density plots of the residuals for the center and radius.

Conclusion
We have rigorously studied linear regression for interval-valued data in the metric space $(\mathcal{K}_C, \delta)$. The new model we introduce generalizes previous models in the literature so that the Hukuhara difference $Y_i \ominus (aX_i + b)$ need not exist. Analogous to classical linear regression, our model together with the LS estimation leads to a partition of the total sum of squares (SST) into the explained sum of squares (SSE) and the residual sum of squares (SSR) in $(\mathcal{K}_C, \delta)$, which implies that the residual is uncorrelated with the linear predictor in $(\mathcal{K}_C, \delta)$. In addition, we have carried out theoretical investigations into the least squares estimation for the univariate model. It is shown that the LS estimates in $(\mathcal{K}_C, \delta)$ are biased but the biases reduce to zero as the sample size tends to infinity. Therefore, a bias-correction technique for small-sample estimation could be a good future topic. The simulation study confirms our theoretical findings and shows that the least squares estimators perform satisfactorily for moderate sample sizes.
Appendix: Proofs
Proof of Proposition 1
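Proof. Differentiating $L$ with respect to $\mu$, $b$, and $a_j$, $j = 1, \cdots, p$, respectively, and setting the derivatives to zero, we get

$$\frac{\partial L}{\partial \mu} \propto \sum_{i=1}^{n} \left( \hat{Y}_i^r - Y_i^r \right) = 0, \tag{29}$$

$$\frac{\partial L}{\partial b} \propto \sum_{i=1}^{n} \left( \hat{Y}_i^c - Y_i^c \right) = 0, \tag{30}$$

$$\frac{\partial L}{\partial a_k} \propto \sum_{i=1}^{n} \left( \hat{Y}_i^c - Y_i^c \right) X_{k,i}^c + \sum_{i=1}^{n} \left( \hat{Y}_i^r - Y_i^r \right) \mathrm{sgn}(a_k) X_{k,i}^r = 0, \quad k = 1, \cdots, p. \tag{31}$$

Equations (29)-(30) yield

$$b = \frac{1}{n} \sum_{i=1}^{n} Y_i^c - \frac{1}{n} \sum_{j=1}^{p} a_j \sum_{i=1}^{n} X_{j,i}^c = \overline{Y^c} - \sum_{j=1}^{p} a_j \overline{X_j^c}, \tag{32}$$

$$\mu = \frac{1}{n} \sum_{i=1}^{n} Y_i^r - \frac{1}{n} \sum_{j=1}^{p} |a_j| \sum_{i=1}^{n} X_{j,i}^r = \overline{Y^r} - \sum_{j=1}^{p} |a_j| \overline{X_j^r}. \tag{33}$$

Equations (14) are obtained by plugging (32)-(33) into (31), and equations (15)-(16) follow from (32)-(33). This completes the proof.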

5.1 Proof of Theorem 1

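Proof. According to definitions (17)-(19),

$$\begin{aligned}
SST &= \sum_{i=1}^{n} \left[ \left( Y_i^c - \hat{Y}_i^c + \hat{Y}_i^c - \overline{Y^c} \right)^2 + \left( Y_i^r - \hat{Y}_i^r + \hat{Y}_i^r - \overline{Y^r} \right)^2 \right] \\
&= SSE + SSR + 2 \sum_{i=1}^{n} \left[ \left( Y_i^c - \hat{Y}_i^c \right)\left( \hat{Y}_i^c - \overline{Y^c} \right) + \left( Y_i^r - \hat{Y}_i^r \right)\left( \hat{Y}_i^r - \overline{Y^r} \right) \right] \\
&= SSE + SSR + 2 \sum_{i=1}^{n} \left[ \left( Y_i^c - \hat{Y}_i^c \right) \hat{Y}_i^c + \left( Y_i^r - \hat{Y}_i^r \right) \hat{Y}_i^r \right].
\end{aligned} \tag{34}$$

The last equation is due to (29)-(30). Further, in view of (11)-(12) and (31), we have

$$\begin{aligned}
\sum_{i=1}^{n} \left[ \left( Y_i^c - \hat{Y}_i^c \right) \hat{Y}_i^c + \left( Y_i^r - \hat{Y}_i^r \right) \hat{Y}_i^r \right]
&= \sum_{i=1}^{n} \left[ \left( Y_i^c - \hat{Y}_i^c \right) \sum_{j=1}^{p} a_j X_{j,i}^c + \left( Y_i^r - \hat{Y}_i^r \right) \sum_{j=1}^{p} |a_j| X_{j,i}^r \right] \\
&= \sum_{j=1}^{p} a_j \sum_{i=1}^{n} \left[ \left( Y_i^c - \hat{Y}_i^c \right) X_{j,i}^c + \left( Y_i^r - \hat{Y}_i^r \right) \mathrm{sgn}(a_j) X_{j,i}^r \right] \\
&= 0.
\end{aligned}$$

This together with (34) completes the proof.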

5.2 Proof of Theorem 2

Proof. Notice that

$$P\left( \hat{Y}_i^r < 0 \right) = P\left( \hat{Y}_i^r - Y_i^r < -Y_i^r \right) \le P\left( |\hat{Y}_i^r - Y_i^r| > Y_i^r \right).$$

An application of Markov's inequality completes the proof.

5.3 Proof of Theorem 3

Proof. Parts i, ii and iii are obvious from Proposition 1. Part iv follows from
Lemma 1 in Appendix II.

5.4 Proof of Proposition 2

Proof. We prove the cases a ≥ 0 and a < 0 separately. To simplify notations,
we will use E (·) throughout the proof, but the expectation should be interpreted
as being conditioned on X.

Case I: a ≥ 0.
138 Yan Sun and Chunyang Li

From Lemma 2, we have

\[
\begin{aligned}
a^+ - a &= \frac{\sum_{i<j} (X_i^c - X_j^c)(Y_i^c - Y_j^c) + \sum_{i<j} (X_i^r - X_j^r)(Y_i^r - Y_j^r)}{\sum_{i<j} (X_i^c - X_j^c)^2 + \sum_{i<j} (X_i^r - X_j^r)^2} - a \\
&= \frac{\sum_{i<j} (X_i^c - X_j^c) \left[ (Y_i^c - Y_j^c) - a (X_i^c - X_j^c) \right] + \sum_{i<j} (X_i^r - X_j^r) \left[ (Y_i^r - Y_j^r) - a (X_i^r - X_j^r) \right]}{\sum_{i<j} (X_i^c - X_j^c)^2 + \sum_{i<j} (X_i^r - X_j^r)^2} \\
&= \frac{\sum_{i<j} (X_i^c - X_j^c)(\lambda_i - \lambda_j) + \sum_{i<j} (X_i^r - X_j^r)(\eta_i - \eta_j)}{\sum_{i<j} (X_i^c - X_j^c)^2 + \sum_{i<j} (X_i^r - X_j^r)^2}.
\end{aligned}
\]

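This immediately yields

\[ E\left( a^+ - a \right) = 0. \tag{35} \]

Similarly,

\[ a^- - a = \frac{\sum_{i<j} (X_i^c - X_j^c)(\lambda_i - \lambda_j) - \sum_{i<j} (X_i^r - X_j^r) \left[ 2a (X_i^r - X_j^r) + (\eta_i - \eta_j) \right]}{\sum_{i<j} (X_i^c - X_j^c)^2 + \sum_{i<j} (X_i^r - X_j^r)^2}, \]

and consequently,

\[ E\left( a^- - a \right) = - \frac{2a S^2(X^r)}{S^2(X^c) + S^2(X^r)}. \tag{36} \]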

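Notice now

\[
\begin{aligned}
E(\hat{a} - a) &= E(\hat{a} - a) I_{\{\hat{a} = a^+\}} + E(\hat{a} - a) I_{\{\hat{a} = a^-\}} \\
&= \int_{\{\hat{a} = a^+\}} (\hat{a} - a)\, dP + \int_{\{\hat{a} = a^-\}} (\hat{a} - a)\, dP \\
&= \int_{\{\hat{a} = a^+\}} (a^+ - a)\, dP + \int_{\{\hat{a} = a^-\}} (a^- - a)\, dP \\
&= \int_{\{\hat{a} = a^+\}} (a^+ - a)\, dP + \int_{\{\hat{a} = a^-\}} (a^+ - a)\, dP \\
&\qquad - \int_{\{\hat{a} = a^-\}} (a^+ - a)\, dP + \int_{\{\hat{a} = a^-\}} (a^- - a)\, dP \qquad (37) \\
&= E\left( a^+ - a \right) - \int_{\{\hat{a} = a^-\}} (a^+ - a^-)\, dP \\
&= -E(a^+ - a^-) I_{\{\hat{a} = a^-\}}. \qquad (38)
\end{aligned}
\]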

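Here, equation (38) is due to (35). Recall that

\[
\begin{aligned}
a^+ - a^- &= \frac{2 \sum_{i<j} (X_i^r - X_j^r)(Y_i^r - Y_j^r)}{\sum_{i<j} (X_i^c - X_j^c)^2 + \sum_{i<j} (X_i^r - X_j^r)^2} \\
&= \frac{2 \sum_{i<j} (X_i^r - X_j^r) \left[ a (X_i^r - X_j^r) + (\eta_i - \eta_j) \right]}{\sum_{i<j} (X_i^c - X_j^c)^2 + \sum_{i<j} (X_i^r - X_j^r)^2},
\end{aligned}
\tag{39}
\]

since $a \ge 0$. Therefore,

\[
\begin{aligned}
E(\hat{a} - a) &= -E \left\{ \frac{2 \sum_{i<j} (X_i^r - X_j^r) \left[ a (X_i^r - X_j^r) + (\eta_i - \eta_j) \right]}{\sum_{i<j} (X_i^c - X_j^c)^2 + \sum_{i<j} (X_i^r - X_j^r)^2} I_{\{\hat{a} = a^-\}} \right\} \\
&= - \frac{2 \sum_{i<j} \left[ |a| (X_i^r - X_j^r)^2 P(\hat{a} = a^-) + (X_i^r - X_j^r) E(\eta_i - \eta_j) I_{\{\hat{a} = a^-\}} \right]}{\sum_{i<j} (X_i^c - X_j^c)^2 + \sum_{i<j} (X_i^r - X_j^r)^2} \\
&= - \frac{2a \sum_{i<j} (X_i^r - X_j^r)^2 P(\hat{a} = a^-)}{\sum_{i<j} (X_i^c - X_j^c)^2 + \sum_{i<j} (X_i^r - X_j^r)^2} \\
&= - \frac{2a S^2(X^r)}{S^2(X^c) + S^2(X^r)} P(\hat{a} = a^-).
\end{aligned}
\tag{40}
\]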

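Similar to the preceding arguments,

\[
\begin{aligned}
E(|\hat{a}| - |a|) &= E(|\hat{a}| - a) = E(|\hat{a}| - a) I_{\{\hat{a} = a^+\}} + E(|\hat{a}| - a) I_{\{\hat{a} = a^-\}} \\
&= \int_{\{\hat{a} = a^+\}} (a^+ - a)\, dP + \int_{\{\hat{a} = a^-\}} (-a^- - a)\, dP \\
&= E(a^+ - a) - \int_{\{\hat{a} = a^-\}} (a^+ - a)\, dP - \int_{\{\hat{a} = a^-\}} (a^- + a)\, dP \\
&= -E(a^+ + a^-) I_{\{\hat{a} = a^-\}}.
\end{aligned}
\]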

Recall again that

\[ a^+ + a^- = \frac{2 \sum_{i<j} (X_i^c - X_j^c) \left[ a (X_i^c - X_j^c) + (\lambda_i - \lambda_j) \right]}{S^2(X^c) + S^2(X^r)}. \tag{41} \]

It follows that

\[ E(|\hat{a}| - |a|) = - \frac{2a S^2(X^c)}{S^2(X^c) + S^2(X^r)} P(\hat{a} = a^-). \tag{42} \]

Case II: a < 0

In this case, we have

\[ a^+ - a = \frac{\sum_{i<j} (X_i^c - X_j^c)(\lambda_i - \lambda_j) + \sum_{i<j} (X_i^r - X_j^r) \left[ -2a (X_i^r - X_j^r) + (\eta_i - \eta_j) \right]}{S^2(X^c) + S^2(X^r)}, \]

\[ a^- - a = \frac{\sum_{i<j} (X_i^c - X_j^c)(\lambda_i - \lambda_j) - \sum_{i<j} (X_i^r - X_j^r)(\eta_i - \eta_j)}{S^2(X^c) + S^2(X^r)}. \]

These imply

\[ E(a^+ - a) = - \frac{2a S^2(X^r)}{S^2(X^c) + S^2(X^r)}, \qquad E(a^- - a) = 0. \]

Similar to the case of $a \ge 0$, we obtain

\[ E(\hat{a} - a) = E(a^+ - a^-) I_{\{\hat{a} = a^+\}}, \qquad E(|\hat{a}| - |a|) = E(a^+ + a^-) I_{\{\hat{a} = a^+\}}. \]

These, together with (39) and (41), imply

\[ E(\hat{a} - a) = - \frac{2a S^2(X^r)}{S^2(X^c) + S^2(X^r)} P(\hat{a} = a^+), \tag{43} \]

\[ E(|\hat{a}| - |a|) = \frac{2a S^2(X^c)}{S^2(X^c) + S^2(X^r)} P(\hat{a} = a^+). \tag{44} \]

The desired result follows from (40), (42), (43) and (44).

5.5 Proof of Theorem 4

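Proof. From (22) and (25),

\[ E(\hat{b} \mid X) = E(\bar{Y}^c - \hat{a} \bar{X}^c \mid X) = E(a \bar{X}^c + b + \bar{\lambda} - \hat{a} \bar{X}^c \mid X) = \bar{X}^c E(a - \hat{a} \mid X) + b. \]

Similarly, from (23) and (26),

\[ E(\hat{\mu} \mid X) = E(\bar{Y}^r - |\hat{a}| \bar{X}^r \mid X) = E(|a| \bar{X}^r + \bar{\eta} + \bar{\lambda} - |\hat{a}| \bar{X}^r \mid X) = \bar{X}^r E(|a| - |\hat{a}| \mid X) + \mu. \]

Hence, the desired result follows by Proposition 2 and Lemma 3 in the Appendix.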

6. Appendix II: Lemmas

Lemma 1. Assume model (6)-(7) and $\mathrm{Var}(X^r) < \infty$. Then $\mathrm{Cov}(X^r, Y^r) \ge 0$.
Consequently, $S(X^r, Y^r) \ge 0$ with probability converging to 1.

Proof. According to (7),

\[
\begin{aligned}
\mathrm{Cov}(X^r, Y^r) &= E(X^r Y^r) - E(X^r) E(Y^r) \\
&= E\left[ X^r (|a| X^r + \eta_1) \right] - E(X^r) E(|a| X^r + \eta_1) \\
&= |a| E\left[ (X^r)^2 \right] + \mu E(X^r) - |a| \left[ E(X^r) \right]^2 - \mu E(X^r) \\
&= |a| \mathrm{Var}(X^r) \\
&\ge 0,
\end{aligned}
\tag{45}
\]

provided that $\mathrm{Var}(X^r) < \infty$. By the SLLN,

\[ S(X^r, Y^r) \to \mathrm{Cov}(X^r, Y^r) \quad \text{a.s.} \tag{46} \]

(46) together with (45) completes the proof.

Lemma 2. The following are true for $v \in \{c, r\}$:

\[ S(X^v, Y^v) = \frac{1}{n^2} \sum_{i<j} (X_i^v - X_j^v)(Y_i^v - Y_j^v), \tag{47} \]

\[ S^2(X^v) = \frac{1}{n^2} \sum_{i<j} (X_i^v - X_j^v)^2. \tag{48} \]

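Proof. To prove (47),

\[
\begin{aligned}
\sum_{i<j} (X_i^v - X_j^v)(Y_i^v - Y_j^v) &= \sum_{i<j} \left( X_i^v Y_i^v - X_i^v Y_j^v - X_j^v Y_i^v + X_j^v Y_j^v \right) \\
&= \sum_{i<j} \left( X_i^v Y_i^v + X_j^v Y_j^v \right) - \sum_{i<j} \left( X_i^v Y_j^v + X_j^v Y_i^v \right) \\
&= (n - 1) \sum_{i=1}^{n} X_i^v Y_i^v - \left[ \left( \sum_{i=1}^{n} X_i^v \right) \left( \sum_{i=1}^{n} Y_i^v \right) - \sum_{i=1}^{n} X_i^v Y_i^v \right] \\
&= n \sum_{i=1}^{n} X_i^v Y_i^v - \left( \sum_{i=1}^{n} X_i^v \right) \left( \sum_{i=1}^{n} Y_i^v \right) = n^2 S(X^v, Y^v).
\end{aligned}
\]

(48) follows by replacing $Y_i^v$ with $X_i^v$ and $Y_j^v$ with $X_j^v$ in the above calculations.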
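
The identities (47)-(48) can also be checked numerically. The sketch below is only an illustration: it assumes the 1/n normalisation of the sample covariance, uses simulated data, and the function names are hypothetical.

```python
import random

def sample_cov(x, y):
    """Sample covariance with the 1/n normalisation assumed for S(X, Y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / n

def pairwise_form(x, y):
    """Right-hand side of (47): (1/n^2) * sum over i<j of (x_i-x_j)(y_i-y_j)."""
    n = len(x)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += (x[i] - x[j]) * (y[i] - y[j])
    return total / n**2

# Simulated data, for illustration only.
random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(50)]
y = [2.0 * xi + random.gauss(0.0, 0.5) for xi in x]

gap_cov = abs(sample_cov(x, y) - pairwise_form(x, y))  # identity (47)
gap_var = abs(sample_cov(x, x) - pairwise_form(x, x))  # identity (48)
print(gap_cov, gap_var)
```

Both gaps should be at the level of floating-point rounding.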

Lemma 3. Assume model (6)-(7). Assume in addition that $S^2(X^c) = O(1)$ and
$S^2(X^r) = O(1)$. Let $\hat{a}, \hat{b}, \hat{\mu}$ be the least squares solution defined in (13). Then

\[ P\left( \hat{a} = a^- \mid a \ge 0 \right) \to 0, \qquad P\left( \hat{a} = a^+ \mid a < 0 \right) \to 0, \]

as $n \to \infty$.

Proof. We prove the case $a \ge 0$ only. The case $a < 0$ can be proved similarly.
Under the assumption that $a \ge 0$,

\[ \mathrm{Cov}(X^c, Y^c) = a \mathrm{Var}(X^c) \ge 0, \]

and consequently, $P(S(X^c, Y^c) < 0) \to 0$. According to Theorem 3, the only
other circumstance under which $\hat{a} = a^-$ is when $S(X^r, Y^r) > S(X^c, Y^c) > 0$ and
$L(a^+, b^+, \mu^+) > L(a^-, b^-, \mu^-)$ simultaneously. It is therefore sufficient to show
that

\[ P\left( S(X^r, Y^r) > S(X^c, Y^c) > 0,\; L(a^+, b^+, \mu^+) > L(a^-, b^-, \mu^-) \right) \to 0. \tag{49} \]

Notice

\[
\begin{aligned}
L(a^+, b^+, \mu^+) - L(a^-, b^-, \mu^-) &= \frac{1}{n} \sum_{i=1}^{n} \left[ \left( a^+ X_i^c + b^+ - Y_i^c \right)^2 - \left( a^- X_i^c + b^- - Y_i^c \right)^2 \right] \\
&\quad + \frac{1}{n} \sum_{i=1}^{n} \left[ \left( a^+ X_i^r + \mu^+ - Y_i^r \right)^2 - \left( a^- X_i^r + \mu^- - Y_i^r \right)^2 \right] \\
&:= \frac{1}{n} (I + II).
\end{aligned}
\]
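The first term

\[
\begin{aligned}
I &= \sum_{i=1}^{n} \left[ \left( a^+ X_i^c + b^+ - Y_i^c \right)^2 - \left( a^- X_i^c + b^- - Y_i^c \right)^2 \right] \\
&= \sum_{i=1}^{n} \left[ (a^+ - a)^2 (X_i^c - \bar{X}^c)^2 + (\lambda_i - \bar{\lambda})^2 - 2 (a^+ - a)(X_i^c - \bar{X}^c)(\lambda_i - \bar{\lambda}) \right] \\
&\quad - \sum_{i=1}^{n} \left[ (a^- - a)^2 (X_i^c - \bar{X}^c)^2 + (\lambda_i - \bar{\lambda})^2 - 2 (a^- - a)(X_i^c - \bar{X}^c)(\lambda_i - \bar{\lambda}) \right] \\
&= \left[ (a^+ - a)^2 - (a^- - a)^2 \right] \sum_{i=1}^{n} (X_i^c - \bar{X}^c)^2 - 2 (a^+ - a^-) \sum_{i=1}^{n} (X_i^c - \bar{X}^c)(\lambda_i - \bar{\lambda}) \\
&= (a^+ - a^-) \left[ (a^+ + a^- - 2a) \sum_{i=1}^{n} (X_i^c - \bar{X}^c)^2 - 2 \sum_{i=1}^{n} (X_i^c - \bar{X}^c)(\lambda_i - \bar{\lambda}) \right].
\end{aligned}
\]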

From this, and the assumption that $S(X^r, Y^r) > S(X^c, Y^c) > 0$, we see that $I > 0$
is equivalent to

\[ \left( \frac{a^+ + a^-}{2} - a \right) \sum_{i=1}^{n} (X_i^c - \bar{X}^c)^2 - \sum_{i=1}^{n} (X_i^c - \bar{X}^c)(\lambda_i - \bar{\lambda}) > 0. \tag{50} \]
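On the other hand, the left-hand side of (50) is

\[
\begin{aligned}
&\left[ \frac{S(X^c, Y^c)}{S^2(X^c) + S^2(X^r)} - a \right] \sum_{i=1}^{n} (X_i^c - \bar{X}^c)^2 - \sum_{i=1}^{n} (X_i^c - \bar{X}^c)(\lambda_i - \bar{\lambda}) \\
&= \left[ \frac{\sum_{i<j} (X_i^c - X_j^c)(\lambda_i - \lambda_j)}{\sum_{i<j} (X_i^c - X_j^c)^2 + \sum_{i<j} (X_i^r - X_j^r)^2} - a \frac{S^2(X^r)}{S^2(X^c) + S^2(X^r)} \right] \sum_{i=1}^{n} (X_i^c - \bar{X}^c)^2 \\
&\quad - \sum_{i=1}^{n} (X_i^c - \bar{X}^c)(\lambda_i - \bar{\lambda}) \\
&= \frac{\sum_{i=1}^{n} (X_i^c - \bar{X}^c)^2}{\sum_{i<j} (X_i^c - X_j^c)^2 + \sum_{i<j} (X_i^r - X_j^r)^2} \sum_{i<j} (X_i^c - X_j^c)(\lambda_i - \lambda_j) \\
&\quad - \sum_{i=1}^{n} (X_i^c - \bar{X}^c)(\lambda_i - \bar{\lambda}) - a \frac{S^2(X^r)}{S^2(X^c) + S^2(X^r)} \sum_{i=1}^{n} (X_i^c - \bar{X}^c)^2 \\
&= \frac{\sum_{i=1}^{n} (X_i^c - \bar{X}^c)^2}{\sum_{i<j} (X_i^c - X_j^c)^2 + \sum_{i<j} (X_i^r - X_j^r)^2} \left[ n \sum_{i=1}^{n} (X_i^c - \bar{X}^c)(\lambda_i - \bar{\lambda}) \right] \\
&\quad - \sum_{i=1}^{n} (X_i^c - \bar{X}^c)(\lambda_i - \bar{\lambda}) - a \frac{S^2(X^r)}{S^2(X^c) + S^2(X^r)} \sum_{i=1}^{n} (X_i^c - \bar{X}^c)^2 \\
&= \sum_{i=1}^{n} (X_i^c - \bar{X}^c)(\lambda_i - \bar{\lambda}) \left[ \frac{S^2(X^c)}{S^2(X^c) + S^2(X^r)} - 1 \right] - a \frac{S^2(X^r)}{S^2(X^c) + S^2(X^r)} \sum_{i=1}^{n} (X_i^c - \bar{X}^c)^2 \\
&= - \frac{S^2(X^r)}{S^2(X^c) + S^2(X^r)}\, n \left[ a S^2(X^c) + S(X^c, \lambda) \right],
\end{aligned}
\]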

where $S(X^c, \lambda) = \frac{1}{n} \sum_{i=1}^{n} (X_i^c - \bar{X}^c)(\lambda_i - \bar{\lambda})$ denotes the sample covariance of
the random variables $X^c$ and $\lambda$, which converges to 0 almost surely by the
independence assumption. Therefore,

\[ \frac{1}{n} I = -2 (a^+ - a^-) \frac{S^2(X^r)}{S^2(X^c) + S^2(X^r)} \left[ a S^2(X^c) + S(X^c, \lambda) \right] \to C_1 < 0 \tag{51} \]

almost surely, as $n \to \infty$.

By a similar calculation, we have that the second term

\[ \frac{1}{n} II = -2 \left( |a^+| - |a^-| \right) \frac{S^2(X^c)}{S^2(X^c) + S^2(X^r)} \left[ a S^2(X^r) + S(X^r, \eta) \right] \to C_2 < 0 \tag{52} \]

almost surely, as $n \to \infty$. (51) and (52) together imply that

\[ P\left( \hat{a} = a^- \mid a \ge 0 \right) \to 0. \]

This completes the proof.

References
[1] Artstein, Z, & Vitale, R.A. (1975). A strong law of large numbers for ran-
dom compact sets. Annals of Probability, 5, 879-882.
[2] Aumann, R.J. (1965). Integrals of set-valued functions. J. Math. Anal.
Appl., 12,1-12.
[3] Billard, L., & Diday, E. (2000). Regression analysis for interval-valued
data. In: Data Analysis, Classification and Related Methods, Proceedings
of the Seventh Conference of the International Federation of Classification
Societies (IFCS’00). Springer, Berlin; 369-374.
[4] Billard, L., & Diday, E. (2002). Symbolic regression analysis. In: Classi-
fication, Clustering and Data Analysis, Proceedings of the Eighth Confer-
ence of the International Federation of Classification Societies (IFCS’02).
Springer, Berlin; 281-288.
146 Yan Sun and Chunyang Li

[5] Billard, L. (2007). Dependencies and variation components of symbolic

interval-valued data. In: Selected Contributions in Data Analysis and
Classification. Springer, Berlin Heidelberg; 3-12.

[6] Blanco-Fernández, A., Corral, N., & González-Rodríguez, G. (2011). Es-

timation of a flexible simple linear model for interval data based on set
arithmetic. Computational Statistics & Data Analysis, 55, 2568-2578.

[7] Blanco-Fernández, A., Colubi, A., & González-Rodríguez, G. (2012).

Confidence sets in a linear regression model for interval data. Journal of
Statistical Planning and Inference, 142, 1320-1329.

[8] Carvalho, F.A.T., Lima Neto, E.A., & Tenorio, C.P. (2004). A new method
to fit a linear regression model for interval-valued data. Lecture Notes in
Computer Sciences, 3238, 295-306.

[9] Cattaneo, M.E.G.V., & Wiencierz, A. (2012). Likelihood-based imprecise

regression. International Journal of Approximate Reasoning, 53, 1137-
1154.

[10] Diamond, P. (1990). Least squares fitting of compact set-valued data. J.

Math. Anal. Appl., 147, 531-544.

[11] Gil, M.A., Lopez, M.T., Lubiano, M.A., & Montenegro, M. (2001). Re-
gression and correlation analyses of a linear relation between random in-
tervals. Test,10, 183-201.

[12] Gil, M.A., Lubiano, M.A., Montenegro, M., & Lopez, M.T. (2002). Least
squares fitting of an affine function and strength of association for interval-
valued data. Metrika, 56, 97-111.

[13] Gil, M.A., González-Rodríguez, G., Colubi, A., & Montenegro, M.

(2007). Testing linear independence in linear models with interval-valued
data. Computational Statistics & Data Analysis, 51, 3002-3015.

[14] González-Rodríguez, G., Blanco, A., Corral, N., & Colubi, A. (2007).
Least squares estimation of linear regression models for convex compact
random sets. Advances in Data Analysis and Classification, 1, 67-81.
Linear Regression for Interval-Valued Data in KC (R) 147

[15] Hörmander, H. (1954). Sur la fonction d’appui des ensembles convexes

dans un espace localement convexe. Arkiv för Mat, 3, 181-186.

[16] Hukuhara, M. (1967). Integration des applications mesurables dont la

valeur est un compact convexe. Funkcialaj Ekvacioj, 10, 205-223.

[17] Kendall, D.G. (1974). Foundations of a theory of random sets. In: Harding
EF, & Kendall DG (Eds), Stochastic Geometry. New York: John Wiley &
Sons.

[18] Körner, R. (1995). A variance of compact convex random sets. Institut für
Stochastik, Bernhard-von-Cotta-Str. 2 09599 Freiberg.

[19] Körner, R. (1997). On the variance of fuzzy random variables. Fuzzy Sets
and Systems, 92, 83-93.

[20] Körner, R., & Näther, W. (1998). Linear regression with random fuzzy
variables: extended classical estimates, best linear estimates, least squares
estimates. Information Sciences, 109, 95-118.

[21] Lyashenko, N.N. (1982). Limit theorem for sums of independent compact
random subsets of Euclidean space. Journal of Soviet Mathematics, 20,
2187-2196.

[22] Lyashenko, N.N. (1983). Statistics of random compacts in Euclidean

space. Journal of Soviet Mathematics, 21, 76-92.

[23] Manski, C.F., & Tamer, T. (2002). Inference on regressions with interval
data on a regressor or outcome. Econometrica, 70, 519-546.

[24] Matheron, G. (1975). Random Sets and Integral Geometry. New York:
John Wiley & Sons.

[25] Molchanov, I. (2005). Theory of Random Sets. London: Springer.

[26] Lima Neto, E.A., & Carvalho, F.A.T. (2008). Centre and range method for
fitting a linear regression model to symbolic interval data. Computational
Statistics & Data Analysis, 52, 1500-1515.
148 Yan Sun and Chunyang Li

[27] Lima Neto, E.A., & Carvalho, F.A.T. (2010). Constrained linear regression
models for symbolic interval-valued variables. Computational Statistics &
Data Analysis, 54,333-347.

[28] Rȧdström, H. (1952) An embedding theorem for spaces of convex sets.

Proc. Amer. Math. Soc., 3, 165-169.
In: Linear Regression ISBN: 978-1-53611-992-3
Editor: Vera L. Beck © 2017 Nova Science Publishers, Inc.

Chapter 4

LINEAR REGRESSION VERSUS NON-LINEAR

REGRESSION IN MATHEMATICAL
MODELING OF ADSORPTION PROCESSES

Gabriela-Nicoleta Moroi*, PhD

Laboratory of Polyaddition and Photochemistry
“Petru Poni” Institute of Macromolecular Chemistry
Iaşi, Romania

ABSTRACT

In mathematical modeling of adsorption processes, linear and/or
non-linear regression analysis may be employed. In adsorption isotherm
modeling, non-linear regression has lately been reported by some authors
to provide a better fit to experimental data than linear regression.
Isotherm models used in describing the adsorption systems, criteria
selected to evaluate isotherm model validity as well as modeling results
are comparatively discussed.
In our investigation on modeling of adsorption of heavy metal ions
onto surface-functionalized polymer beads, linear and non-linear
regressions were employed for each of the isotherm models considered to
describe the equilibrium data. To reliably assess model validity, various
error functions (whose mathematical expressions contain the number of
experimental measurements, the numbers of independent variables and
parameters in the regression equation as well as the measured and
predicted equilibrium adsorption capacities) were used. The modeling
results obtained by employing the two regression methods were
compared. For the adsorption of each metal ion species, it was revealed
that (a) for a particular isotherm model, the regression providing the best
fit is linear, non-linear or both linear and non-linear, and (b) the order of
isotherm model validities indicated via linear regression is the same as
that shown by non-linear regression.

* Corresponding Author: [email protected].

Keywords: adsorption isotherm modeling, linear regression, non-linear
regression, heavy metal ions, surface-functionalized polymer beads,
ionic liquid-like functionalities

INTRODUCTION

Heavy metals are highly toxic environmental pollutants with a great
impact on human health and environmental quality. They are usually
introduced into natural water resources by wastewaters resulting from
industrial activities; therefore, removal of heavy metals from contaminated
waters is an absolute necessity for public health protection and
environmental conservation (Sigel et al. 2013; Casas and Sordo 2006).
Adsorption is one of the most popular procedures used in wastewater
treatment for preventing environmental contamination. Materials with
adsorption ability towards metals can be obtained by chemically
immobilizing functional groups onto polymeric supports; e.g., styrene-
divinylbenzene copolymer beads with ionic liquid-like functionalities
(1-methyl-3-methylimidazolium chloride) covalently attached onto their
surface (ILLF-SDVB) were synthesized and employed to remove heavy
metal ions from aqueous solutions (Moroi et al. 2016; Moroi 2012; Bilba
et al. 2007; Moroi et al. 2006; Bilba et al. 2006; Moroi et al. 2004; Bilba
et al. 2004; Moroi et al. 2001).
Generally, the equilibrium state of an adsorption process is
characterized by the relationship between the amount of adsorbate being
adsorbed and the amount of adsorbate remaining in solution. An
experimental equilibrium isotherm, i.e., equilibrium adsorption capacity
(qe) versus equilibrium adsorbate concentration in solution (Ce) plot,
reflects the change of adsorbate distribution between adsorbent and
solution as Ce increases, at constant temperature and pH; such an isotherm
may be analyzed by various isotherm models for determining which model
provides the best mathematical description of experimental data and the
best prediction of adsorption parameters. Finding the best-fitting model is
of great importance since the thermodynamic assumptions and parameter
estimates give information on adsorbent surface properties, adsorbent-
adsorbate affinity and adsorption mechanism that are useful for optimizing
adsorption system design. In mathematical modeling of equilibrium
adsorption isotherms, linear and/or non-linear regression analysis may be
employed. A large variety of modeling approaches are used that differ
from each other as regards the number and type of (a) isotherm models
considered (Langmuir, Freundlich, Dubinin–Radushkevich, Temkin,
Flory–Huggins, Hill, Redlich–Peterson, Sips, Koble–Corrigan, Toth etc.),
(b) error functions minimized/maximized, (c) error functions calculated
and (d) criteria based on the calculated error functions that are employed to
assess isotherm model validity (Foo and Hameed 2010, Han et al. 2009,
Ho et al. 2002).

COMMENTS ON ARTICLES STATING THAT NON-LINEAR

REGRESSION IS BETTER THAN LINEAR REGRESSION
IN ADSORPTION ISOTHERM MODELING

The following observations were made on several articles that compare

the results of linear regression and non-linear regression in modeling of
adsorption isotherms. The names of isotherm parameters and error
functions are used inconsistently, which may cause confusion (Armagan
and Toprak 2013, Kumar 2006). Some mathematical expressions of
isotherm models and error functions are written incorrectly or not shown at
all, making questionable the accuracy of modeling results (Brdar et al.
2012, Kumar et al. 2008). The main observation regards the statement that
non-linear regression provides better results than linear regression, which
however is in disagreement with the presented data; some examples of
such discrepancies are given below using the names and abbreviations
employed in the articles.
In the study of Cu(II) adsorption onto lignin by linear and non-linear
regression analysis of Freundlich, Langmuir and Redlich–Peterson
isotherm models, two different modeling approaches are employed for
each isotherm model: in linear regression, the least squares method is used
to calculate one value of r2 and one value of chi-square test, whereas in
non-linear regression, the values of five error functions, i.e., ERRSQ, HYBRD,
MPSE, ARE and EABS, are minimized to obtain five values of r2 and five
values of chi-square test (Brdar et al. 2012). It is noted that the statement
that non-linear regression is better than linear regression is not supported
by r2 and chi-square test values, these indicating, on the contrary, that
linear regression is comparatively better; e.g., in the case of Redlich–Peterson
isotherm model, on one hand, r2 value for linear regression is
higher than the following r2 values for non-linear regression: each value
corresponding to HYBRD, MPSE, ARE and EABS, the average value of
the three higher values that correspond to ERRSQ, HYBRD and MPSD
and the average value of the five values corresponding to each error
function and, on the other hand, chi-square test value for linear regression
is lower than the following chi-square test values for non-linear regression:
each value corresponding to ERRSQ, MPSE, ARE and EABS, the average
value of the three lower values that correspond to ERRSQ, HYBRD and
MPSD and the average value of the five values corresponding to each error
function. The very good fit to experimental data of linearized Redlich–
Peterson isotherm model is also graphically revealed, whereas such a
figure is not shown for non-linearized Redlich–Peterson isotherm model.
In the comparative investigation of linear and non-linear regressions to
estimate isotherm parameters for adsorption of malachite green onto
activated carbon, it is stated that linear method is inappropriate for
describing adsorption isotherm and it is better to use non-linear method
(“which have a uniform error distribution (irrespective of the linear form)
for the whole range of experimental data”) (Kumar 2006). However, of all
24 values of r2 calculated by using linear regression (the experimental data
obtained at four temperatures being analyzed by Langmuir in four forms,
Freundlich and Redlich-Peterson isotherm models), 15 values, representing
more than half of the total number of values, are higher than the
corresponding values calculated by employing non-linear regression.
In studying the adsorption of methylene blue onto activated carbon by
Langmuir, Freundlich and Redlich–Peterson isotherm models, different
approaches are employed for linear regression (r2 and least squares method)
and non-linear regression (six error functions, i.e., r2, ERRSQ, HYBRID,
MPSD, ARE and EABS, and a trial and error method) (Kumar 2008). It is
stated that non-linear regression is a better way compared with linear
regression to obtain isotherm parameters and select the optimum isotherm
(“as sometime linearization of non-linear experimental data may distort the
error distribution structure of isotherm”). However, the same conclusion is
reached by the two regression methods, i.e., that this adsorption process is
“well represented by both Langmuir and Redlich Peterson isotherm.”
In isotherm modeling of NaCN adsorption onto activated carbon by
using six isotherm models (Langmuir, Freundlich, Dubinin–Radushkevich,
Temkin, Redlich–Peterson and Koble–Corrigan) and three error functions
(R2, MPSD and HYBRID), it is stated that non-linear regression is better
than linear regression for predicting isotherm parameters (Salarirad and
Behnamfard 2011). However, of 18 pairs of values of all error functions
obtained by using linear and non-linear regressions for all isotherm models
considered, in only 8 pairs, i.e., in less than half of the total number of
pairs, the value provided by non-linear regression is better than that given
by linear regression, whereas in 9 pairs, on the contrary, linear regression
value is better than non-linear regression value and, in one pair, the linear
and non-linear regression values are equal.
The above comments highlight the need for reliable, consistent
approaches to assess the performance of linear and non-linear regressions
in accurately describing adsorption processes by using isotherm models.

MODELING BY LINEAR AND NON-LINEAR REGRESSIONS

OF EQUILIBRIUM ISOTHERMS IN ADSORPTION OF
HEAVY METAL IONS ONTO SURFACE-FUNCTIONALIZED
POLYMER BEADS

In performing Cd(II) and Pb(II) (Me) adsorption onto ILLF-SDVB
beads, aqueous solutions of the toxic cadmium nitrate, Cd(NO3)2·4H2O,
and lead nitrate, Pb(NO3)2, were used. Batch experiments were carried out
at 20 ºC, pH of 5, adsorbent dose of 4.100 g L−1, contact time of 24 h and
different initial Me concentrations in solution (C0) varying in the ranges of
0.200–2.810 and 0.106–1.700 mmol L−1 for Cd(II) and Pb(II), respectively;
equilibrium Me concentrations in solution (Ce) were
spectrophotometrically determined (Moroi et al. 2016). Adsorption
performance was evaluated by equilibrium adsorption capacity (qe):

\[ q_e = \frac{(C_0 - C_e)\, V}{m} \quad (\text{mmol g}^{-1}) \tag{1} \]

where C0 and Ce are the initial and equilibrium adsorbate (Me)
concentrations in solution, respectively (mmol L−1); V is the volume of
solution (L); m is the mass of adsorbent (g).
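
As a minimal sketch of equation (1), the function below computes qe from one batch measurement. The function name and the numerical values are hypothetical; only the adsorbent dose of 4.100 g L−1 is taken from the text.

```python
def equilibrium_capacity(c0, ce, volume_l, mass_g):
    """Equation (1): q_e = (C0 - Ce) * V / m, in mmol per g of adsorbent."""
    if mass_g <= 0:
        raise ValueError("adsorbent mass must be positive")
    return (c0 - ce) * volume_l / mass_g

# Hypothetical batch: a dose of 4.100 g/L means 0.1025 g of beads in 0.025 L.
q_e = equilibrium_capacity(c0=1.000, ce=0.600, volume_l=0.025, mass_g=0.1025)
print(round(q_e, 4))
```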
As already shown, in experimental equilibrium isotherm modeling,
non-linear regression has been reported by some authors to provide a better
fit to experimental data than linear regression; to determine whether this
statement is valid for adsorption of Cd(II) and Pb(II) onto ILLF-SDVB
beads, both linear and non-linear regressions were employed in the present
study to analyze the equilibrium data by the two-parameter Langmuir,
Temkin and Freundlich isotherm models. For consistency reasons, the
same approach was used for both regressions: the same nine error
functions were calculated by minimizing the sum of squared errors (SSE)
and the same two criteria based on these error functions were employed to
establish isotherm model validities. The main features of the three isotherm
models used in mathematical modeling of Cd(II) and Pb(II) adsorption are
presented below.
Langmuir isotherm model is based on the assumption of a
homogeneous adsorption with monomolecular layer coverage of a surface
with a finite number of energetically equivalent sites, one site being
occupied by only one adsorbate species; there is no interaction among
adsorbed species and no transmigration of adsorbed species in the surface
plane (Langmuir 1916). Four linear forms, whose plots are Ce/qe versus Ce,
1/qe versus 1/Ce, qe versus qe/Ce and qe/Ce versus qe (Llin1, Llin2, Llin3 and
Llin4, respectively), and non-linear form (Lnonlin) of Langmuir isotherm
are displayed below:

- Llin1:

\[ \frac{C_e}{q_e} = \frac{1}{q_m} C_e + \frac{1}{K_L q_m} \tag{2} \]

- Llin2:

\[ \frac{1}{q_e} = \frac{1}{K_L q_m} \frac{1}{C_e} + \frac{1}{q_m} \tag{3} \]

- Llin3:

\[ q_e = -\frac{1}{K_L} \frac{q_e}{C_e} + q_m \tag{4} \]

- Llin4:

\[ \frac{q_e}{C_e} = -K_L q_e + K_L q_m \tag{5} \]

- Lnonlin:

\[ q_e = \frac{q_m K_L C_e}{1 + K_L C_e} \tag{6} \]

where qe is the measured equilibrium adsorption capacity (mmol g−1); Ce is
the measured equilibrium adsorbate concentration in solution (mmol L−1);
qm is the Langmuir isotherm constant representing the maximum adsorption
capacity (complete monolayer coverage) (mmol g−1) and KL is the
Langmuir isotherm constant (adsorbent-adsorbate affinity parameter)
related to binding energy (L mmol−1).
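
The way qm and KL are read off a linearized plot can be sketched as follows. This is an illustration only, not the procedure used in the study: the helper names are hypothetical, the concentrations are made up, and the responses are generated without noise from form (6) so that both linear forms should return the generating parameters.

```python
def linear_fit(x, y):
    """Ordinary least-squares slope and intercept of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def langmuir(ce, qm, kl):
    """Non-linear form (6)."""
    return qm * kl * ce / (1.0 + kl * ce)

def fit_llin1(ce, qe):
    """Form (2): Ce/qe against Ce; slope = 1/qm, intercept = 1/(KL*qm)."""
    slope, intercept = linear_fit(ce, [c / q for c, q in zip(ce, qe)])
    return 1.0 / slope, slope / intercept

def fit_llin2(ce, qe):
    """Form (3): 1/qe against 1/Ce; slope = 1/(KL*qm), intercept = 1/qm."""
    slope, intercept = linear_fit([1.0 / c for c in ce], [1.0 / q for q in qe])
    return 1.0 / intercept, intercept / slope

# Hypothetical concentrations (mmol/L) with noise-free Langmuir responses.
ce = [0.05, 0.10, 0.30, 0.60, 1.00, 1.50, 2.00]
qe = [langmuir(c, qm=0.111, kl=3.87) for c in ce]
qm1, kl1 = fit_llin1(ce, qe)
qm2, kl2 = fit_llin2(ce, qe)
print(qm1, kl1, qm2, kl2)
```

With measured (noisy) data the linear forms generally return different estimates, because each transformation weights the experimental errors differently.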
By employing Langmuir constant KL, two important adsorption
parameters are assessed:

- the dimensionless separation factor (RL), whose value may be 0,
between 0 and 1, 1 or above 1, indicating whether adsorption nature is
irreversible, favorable, linear or unfavorable, respectively (McKay et al.
1982, Wasewar 2010):

\[ R_L = \frac{1}{1 + K_L C_{0h}} \tag{7} \]

where KL is the Langmuir isotherm constant (L mmol−1) and C0h is the
highest initial concentration of adsorbate in solution (mmol L−1);

- the Gibbs free energy change (ΔG0), which indicates adsorption feasibility
and spontaneous nature (He et al. 2010):

\[ \Delta G^0 = -R\, T \ln K_L \quad (\text{J mol}^{-1}) \tag{8} \]

where KL is the Langmuir isotherm constant (L mol−1); R is the universal
gas constant (8.314 J mol−1 K−1) and T is the absolute temperature (K).
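
Equations (7) and (8) can be evaluated as in the sketch below. The function names are hypothetical, and the inputs are taken from the text only for illustration: KL = 3.60 L mmol−1 is the Llin1 estimate for Cd(II), C0h = 2.810 mmol L−1, and 293.15 K corresponds to 20 ºC. Note the unit conversion of KL from L mmol−1 to L mol−1 required by equation (8).

```python
import math

R_GAS = 8.314  # universal gas constant, J mol^-1 K^-1

def separation_factor(kl_l_per_mmol, c0h_mmol_per_l):
    """Equation (7): R_L = 1 / (1 + K_L * C_0h), dimensionless."""
    return 1.0 / (1.0 + kl_l_per_mmol * c0h_mmol_per_l)

def classify_rl(rl):
    """Map a non-negative R_L onto the four regimes listed in the text."""
    if rl == 0:
        return "irreversible"
    if 0 < rl < 1:
        return "favorable"
    if rl == 1:
        return "linear"
    return "unfavorable"

def gibbs_free_energy(kl_l_per_mmol, temperature_k):
    """Equation (8): dG0 = -R*T*ln(K_L) in J/mol, with K_L converted to L/mol."""
    kl_l_per_mol = kl_l_per_mmol * 1000.0
    return -R_GAS * temperature_k * math.log(kl_l_per_mol)

rl = separation_factor(3.60, 2.810)
dg_kj = gibbs_free_energy(3.60, 293.15) / 1000.0
print(classify_rl(rl), round(dg_kj, 2))
```

A negative ΔG0 is read as a spontaneous, feasible adsorption; any RL strictly between 0 and 1 is classified as favorable.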
Temkin isotherm model assumes that adsorption heat of molecules
decreases linearly with increasing surface coverage due to adsorbent-
adsorbate interactions, adsorption being characterized by a uniform
distribution of binding energies up to a maximum energy value; a good fit
of Temkin isotherm to experimental equilibrium data reveals the
occurrence of chemisorption (Temkin 1941; Foo and Hameed 2010;
Boparai et al. 2011). The linear form, whose plot is qe versus ln(Ce) (Tlin),
and the non-linear form (Tnonlin) of the Temkin isotherm are shown next:

- Tlin:

\[ q_e = \frac{R\,T}{b_T} \ln C_e + \frac{R\,T}{b_T} \ln K_T \tag{9} \]

- Tnonlin:

\[ q_e = \frac{R\,T}{b_T} \ln (K_T C_e) \tag{10} \]

where qe is the measured equilibrium adsorption capacity (mmol g−1);
Ce is the measured equilibrium adsorbate concentration in solution (mmol
L−1); bT is the Temkin isotherm constant related to adsorption heat (kJ
mol−1); KT is the Temkin equilibrium binding constant corresponding to the
maximum binding energy (L mmol−1); R is the universal gas constant
(8.314 J mol−1 K−1) and T is the absolute temperature (K).
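
A minimal sketch of the Tlin fit follows, with hypothetical function names and made-up concentrations. The responses are generated without noise from form (10), with bT entered in J mol−1 so that it is consistent with R, and the fit should recover the generating constants.

```python
import math

R_GAS = 8.314  # universal gas constant, J mol^-1 K^-1

def linear_fit(x, y):
    """Ordinary least-squares slope and intercept of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def temkin(ce, b_t, k_t, temperature_k):
    """Non-linear form (10): q_e = (R*T/b_T) * ln(K_T * Ce)."""
    return R_GAS * temperature_k / b_t * math.log(k_t * ce)

def fit_tlin(ce, qe, temperature_k):
    """Form (9): q_e against ln(Ce); slope = R*T/b_T, intercept = slope*ln(K_T)."""
    slope, intercept = linear_fit([math.log(c) for c in ce], qe)
    return R_GAS * temperature_k / slope, math.exp(intercept / slope)

# Hypothetical concentrations (mmol/L); b_T is given in J/mol here.
T = 293.15
ce = [0.05, 0.10, 0.30, 0.60, 1.00, 1.50, 2.00]
qe = [temkin(c, b_t=110030.0, k_t=45.17, temperature_k=T) for c in ce]
b_t, k_t = fit_tlin(ce, qe, T)
print(b_t, k_t)
```

Because form (9) leaves qe untransformed, a least-squares fit of Tlin and of Tnonlin minimize the same sum of squared errors, so the two are expected to give essentially the same estimates.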
Freundlich isotherm model hypothesizes a multiple layer adsorption on
an energetically heterogeneous surface and a logarithmic decrease in
adsorption energy with increasing surface coverage (Freundlich 1906; Ho
and McKay 1998). The linear form, whose plot is ln(qe) versus ln(Ce) (Flin),
and the non-linear form (Fnonlin) of the Freundlich isotherm are presented
below:

- Flin:

\[ \ln q_e = \frac{1}{n} \ln C_e + \ln K_F \tag{11} \]

- Fnonlin:

\[ q_e = K_F\, C_e^{1/n} \tag{12} \]

where qe is the measured equilibrium adsorption capacity (mmol g−1); Ce is
the measured equilibrium adsorbate concentration in solution (mmol L−1);
KF is the Freundlich isotherm constant indicative of adsorption capacity
(mmol^(1−1/n) L^(1/n) g−1) and 1/n is the Freundlich isotherm constant
related to adsorption intensity, representing a measure of surface energetic
heterogeneity.
In mathematical modeling of experimental equilibrium isotherm data
of Cd(II) and Pb(II) adsorption onto ILLF-SDVB beads, parameter
estimates for Langmuir, Temkin and Freundlich models were calculated by
using linear and non-linear least-squares regression analysis, minimizing
SSE:

\[ SSE = \sum_{i=1}^{n} \left( q_e - \hat{q}_e \right)_i^2 \tag{13} \]

where n is the number of experimental measurements and qe and q̂e are the
measured and predicted equilibrium adsorption capacities, respectively.
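
The difference between fitting a linearized form and minimizing SSE directly can be sketched with the Freundlich model. Everything below is an illustration under stated assumptions: the data are made up, the function names are hypothetical, and the non-linear step is a crude grid search over the exponent m = 1/n (assumed to lie in (0, 2]) rather than the optimizer used in the study. For a fixed m, the SSE-minimizing KF has a closed form, which keeps the search one-dimensional.

```python
import math

def sse(qe, qe_hat):
    """Equation (13): sum of squared errors in q_e units."""
    return sum((q - qh) ** 2 for q, qh in zip(qe, qe_hat))

def freundlich(ce, kf, m):
    """Non-linear form (12) with m = 1/n."""
    return kf * ce ** m

def fit_flin(ce, qe):
    """Form (11): least squares of ln(qe) on ln(Ce); returns (K_F, m)."""
    x = [math.log(c) for c in ce]
    y = [math.log(q) for q in qe]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    m = sxy / sxx
    return math.exp(my - m * mx), m

def best_kf(ce, qe, m):
    """For a fixed exponent m, the SSE-minimising K_F in closed form."""
    return sum(q * c ** m for c, q in zip(ce, qe)) / sum(c ** (2 * m) for c in ce)

def fit_fnonlin(ce, qe, m_start):
    """Grid search over m (including m_start) minimising SSE in q_e units."""
    candidates = [m_start] + [k / 1000.0 for k in range(1, 2001)]

    def loss(m):
        kf = best_kf(ce, qe, m)
        return sse(qe, [freundlich(c, kf, m) for c in ce])

    m_best = min(candidates, key=loss)
    return best_kf(ce, qe, m_best), m_best

# Made-up equilibrium data (mmol/L, mmol/g), for illustration only.
ce = [0.05, 0.10, 0.30, 0.60, 1.00, 1.50, 2.00]
qe = [0.030, 0.042, 0.062, 0.075, 0.083, 0.090, 0.094]
kf_lin, m_lin = fit_flin(ce, qe)
kf_non, m_non = fit_fnonlin(ce, qe, m_lin)
sse_lin = sse(qe, [freundlich(c, kf_lin, m_lin) for c in ce])
sse_non = sse(qe, [freundlich(c, kf_non, m_non) for c in ce])
print(sse_lin, sse_non)
```

By construction the non-linear fit cannot have a larger SSE than the linearized one, since the linearized exponent is among the candidates; this says nothing about the other error functions, which is why several of them are compared in what follows.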
For reliably assessing the validity of each linear and non-linear
isotherm model form for the studied adsorption systems (i.e., the goodness
of fit between q̂e and qe), nine error functions, which are either relative or
absolute, were employed (Ho et al. 2002; Foo and Hameed 2010). The
mathematical expressions of these error functions are presented next.

- average relative error (ARE):

\[ ARE = \frac{100}{n} \sum_{i=1}^{n} \left| \frac{q_e - \hat{q}_e}{q_e} \right|_i \tag{14} \]

- average absolute error (EABS):

\[ EABS = \frac{1}{n} \sum_{i=1}^{n} \left| q_e - \hat{q}_e \right|_i \tag{15} \]

- sum of the absolute errors (SAE):

\[ SAE = \sum_{i=1}^{n} \left| q_e - \hat{q}_e \right|_i \tag{16} \]

- root mean square error (RMSE):

\[ RMSE = \sqrt{ \frac{1}{n} \sum_{i=1}^{n} \left( q_e - \hat{q}_e \right)_i^2 } \tag{17} \]

- Marquardt’s percent standard deviation (MPSD):

\[ MPSD = 100 \sqrt{ \frac{1}{n - p} \sum_{i=1}^{n} \left( \frac{q_e - \hat{q}_e}{q_e} \right)_i^2 } \tag{18} \]

- average relative standard error (ARSE):

\[ ARSE = \sqrt{ \frac{1}{n - 1} \sum_{i=1}^{n} \left( \frac{q_e - \hat{q}_e}{q_e} \right)_i^2 } \tag{19} \]

- hybrid fractional error function (HYBRID):

\[ HYBRID = \frac{100}{n - p} \sum_{i=1}^{n} \left[ \frac{\left( q_e - \hat{q}_e \right)^2}{q_e} \right]_i \tag{20} \]

- chi-square test (CST):

\[ CST = \sum_{i=1}^{n} \left[ \frac{\left( q_e - \hat{q}_e \right)^2}{\hat{q}_e} \right]_i \tag{21} \]

- adjusted coefficient of determination (ADRSQ):

\[ ADRSQ = 1 - (1 - R^2) \left[ \frac{n - 1}{n - (k + 1)} \right] \tag{22} \]

where n is the number of experimental measurements; k and p are the
numbers of independent variables and parameters, respectively, in the
regression equation; qe, q̂e and q̄e are the measured, predicted and average
measured equilibrium adsorption capacities, respectively, and R2 is the
coefficient of determination:

\[ R^2 = \frac{ \sum_{i=1}^{n} \left( \hat{q}_e - \bar{q}_e \right)_i^2 }{ \sum_{i=1}^{n} \left( q_e - \hat{q}_e \right)_i^2 + \sum_{i=1}^{n} \left( \hat{q}_e - \bar{q}_e \right)_i^2 } \tag{23} \]
For all error functions except ADRSQ, the lower the value, the closer
the match between q̂e and qe; for ADRSQ, whose values may vary from 0
to 1, a higher value indicates that q̂e more closely matches qe.
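
The nine error functions can be collected in one routine, as in the sketch below. It is an illustration, not the study's code: the function name and the data are made up, and the square roots in RMSE, MPSD and ARSE are assumed from the names of these functions.

```python
import math

def error_functions(qe, qe_hat, p, k):
    """Return the nine error functions (14)-(22) for measured/predicted q_e.

    p = number of parameters, k = number of independent variables.
    """
    n = len(qe)
    resid = [q - qh for q, qh in zip(qe, qe_hat)]
    rel = [r / q for r, q in zip(resid, qe)]
    mean_q = sum(qe) / n
    ss_res = sum(r ** 2 for r in resid)
    ss_reg = sum((qh - mean_q) ** 2 for qh in qe_hat)
    r2 = ss_reg / (ss_reg + ss_res)  # equation (23)
    return {
        "ARE": 100.0 / n * sum(abs(v) for v in rel),
        "EABS": sum(abs(r) for r in resid) / n,
        "SAE": sum(abs(r) for r in resid),
        "RMSE": math.sqrt(ss_res / n),
        "MPSD": 100.0 * math.sqrt(sum(v ** 2 for v in rel) / (n - p)),
        "ARSE": math.sqrt(sum(v ** 2 for v in rel) / (n - 1)),
        "HYBRID": 100.0 / (n - p) * sum(r ** 2 / q for r, q in zip(resid, qe)),
        "CST": sum(r ** 2 / qh for r, qh in zip(resid, qe_hat)),
        "ADRSQ": 1.0 - (1.0 - r2) * (n - 1) / (n - (k + 1)),
    }

# Made-up measured and predicted capacities (mmol/g), two-parameter model.
qe = [0.030, 0.042, 0.062, 0.075, 0.083]
qe_hat = [0.032, 0.041, 0.060, 0.076, 0.085]
errors = error_functions(qe, qe_hat, p=2, k=1)
perfect = error_functions(qe, qe, p=2, k=1)
for name, value in errors.items():
    print(name, value)
```

A perfect prediction gives zero for the eight error-type functions and ADRSQ = 1, which matches the reading of the values given above.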
After determining the values of all error functions for all linear and
non-linear isotherm model forms, the following calculations were
performed for comparison reasons:

- for every error function, the percent deviation (EPD) of each of its values
(E) with respect to the best of these values (E0, which is the maximum
value for ADRSQ and the minimum value for the other error functions)
was determined:

\[ EPD = 100\, \frac{\left| E - E_0 \right|}{E_0} \quad (\%) \tag{24} \]

- for each isotherm model form, the sum of EPD values of all error
functions (SEPD) was calculated:

\[ SEPD = \sum_{i=1}^{9} EPD_i \quad (\%) \tag{25} \]

The validity of an isotherm model was estimated by employing two

criteria that take into account all nine error functions, knowing that good
criteria of validity are those based on a combination of relative and
absolute error functions (Legates and McCabe 1999). The first criterion is
the number of error functions having the minimum among EPD values of
compared isotherm models (EFmin) and the second criterion is the SEPD
value. Thus, the greater the number of EFmin and the smaller the value of
SEPD, the better the model validity.
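
The two criteria can be sketched as follows. The function names and the error-function values are invented, only three error functions are used to keep the example short, and the deviation in (24) is taken in absolute value, which is an assumption; a best value E0 of zero is not handled.

```python
def epd_table(errors_by_model):
    """Equation (24): percent deviation of each value from the best value E0."""
    names = list(next(iter(errors_by_model.values())).keys())
    table = {model: {} for model in errors_by_model}
    for name in names:
        values = [errs[name] for errs in errors_by_model.values()]
        # E0 is the maximum for ADRSQ and the minimum for the other functions.
        best = max(values) if name == "ADRSQ" else min(values)
        for model, errs in errors_by_model.items():
            table[model][name] = 100.0 * abs(errs[name] - best) / best
    return table

def sepd(table):
    """Equation (25): sum of the EPD values of each model form."""
    return {model: sum(epds.values()) for model, epds in table.items()}

def ef_min(table):
    """Count, per model form, the error functions with the minimum EPD."""
    counts = {model: 0 for model in table}
    names = list(next(iter(table.values())).keys())
    for name in names:
        lowest = min(table[model][name] for model in table)
        for model in table:
            if table[model][name] == lowest:
                counts[model] += 1
    return counts

# Invented error-function values for two model forms, three functions only.
errors_by_model = {
    "A": {"RMSE": 0.0020, "CST": 0.0004, "ADRSQ": 0.990},
    "B": {"RMSE": 0.0025, "CST": 0.0003, "ADRSQ": 0.980},
}
table = epd_table(errors_by_model)
print(sepd(table), ef_min(table))
```

In this invented case form A has the larger EFmin while form B has the smaller SEPD, which illustrates why both criteria are examined together.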
For Cd(II) and Pb(II) adsorption onto ILLF-SDVB beads, linear
(Figures 1 and 2, respectively) and non-linear (Figures 3 and 4,
respectively) forms of Langmuir, Temkin and Freundlich isotherm models
were considered (Moroi et al. 2016). Plots of experimental equilibrium
data for both Me species reveal an increase in qe with increasing Ce
(Figures 3 and 4); these L-type isotherms indicate chemisorption, reflecting
a high adsorbate-adsorbent affinity (Bradl 2004). Isotherm parameter
estimates as well as RL and ΔG0 values were determined for adsorption of
Cd(II) and Pb(II) (Tables 1 and 2, respectively). The values of error
functions for Cd(II) and Pb(II) adsorption differ largely from each other,
being spread over a wide numerical range (Tables 3 and 4, respectively); to
reach reliable conclusions on the validity of isotherm models by analyzing
comparable data, EPD and SEPD values were calculated (considering only
the isotherm forms included in each table/figure) (Tables 5 and 6,
respectively, and Figures 5 and 6, respectively).

Table 1. Values of isotherm model parameters for Cd(II) adsorption

Isotherm model   Adsorption parameters

Langmuir         qm (mmol g⁻¹)   KL (L mmol⁻¹)   RL      ΔG0ᵃ (kJ mol⁻¹)
Llin1            0.112           3.60            0.581   −19.96
Llin2            0.116           3.42            0.594   −19.83
Llin3            0.109           4.04            0.553   −20.24
Llin4            0.112           3.68            0.576   −20.02
Lnonlin          0.111           3.87            0.564   −20.14

Temkin           bT (kJ mol⁻¹)   KT (L mmol⁻¹)
Tlin             110.03          45.17
Tnonlin          110.02          45.16

Freundlich       KF (mmol^(1−1/n) L^(1/n) g⁻¹)   1/n
Flin             0.081                           0.372
Fnonlin          0.081                           0.312

ᵃ For C0h = 2.810 mmol L⁻¹.

Figure 1. Linear Langmuir Llin1 (a), Langmuir Llin2 (b), Langmuir Llin3 (c),
Langmuir Llin4 (d), Temkin (e) and Freundlich (f) isotherms for Cd(II) adsorption.

Figure 2. Linear Langmuir Llin1 (a), Langmuir Llin2 (b), Langmuir Llin3 (c),
Langmuir Llin4 (d), Temkin (e) and Freundlich (f) isotherms for Pb(II) adsorption.

Figure 3. Experimental data and non-linear Langmuir, Temkin and Freundlich
isotherms for Cd(II) adsorption.

Figure 4. Experimental data and non-linear Langmuir, Temkin and Freundlich
isotherms for Pb(II) adsorption.

Table 2. Values of isotherm model parameters for Pb(II) adsorption

Isotherm model   Adsorption parameters

Langmuir         qm (mmol g⁻¹)   KL (L mmol⁻¹)   RL      ΔG0ᵃ (kJ mol⁻¹)
Llin1            0.079           12.96           0.421   −23.08
Llin2            0.070           24.00           0.282   −24.58
Llin3            0.072           22.13           0.299   −24.39
Llin4            0.074           20.52           0.315   −24.20
Lnonlin          0.075           17.00           0.357   −23.74

Temkin           bT (kJ mol⁻¹)   KT (L mmol⁻¹)
Tlin             193.43          336.12
Tnonlin          193.45          336.30

Freundlich       KF (mmol^(1−1/n) L^(1/n) g⁻¹)   1/n
Flin             0.076                           0.280
Fnonlin          0.074                           0.241

ᵃ For C0h = 1.700 mmol L⁻¹.

Before performing any comparative analysis of linear and non-linear
forms of the three isotherm models, the best among the four linear
Langmuir isotherm forms to be used in subsequent comparisons must be
determined for adsorption of each Me species. For Cd(II) adsorption, Llin1
presents four EFmin (ARE, EABS, SAE and ADRSQ), Llin4 also displays four
(RMSE, MPSD, ARSE and HYBRID), Llin3 exhibits one (CST) and
Llin2 none, while the SEPD value of Llin1 (13.65%) is smaller than that of
Llin4 (24.91%); therefore, Llin1 is the selected form (Table 5). For Pb(II)
adsorption, Llin4 is the best since it has six EFmin (ARE, EABS, SAE,
RMSE, HYBRID and CST) compared with two (MPSD and ARSE) for
Llin3, one (ADRSQ) for Llin1 and none for Llin2, as well as the lowest
SEPD value (17.47%) (Table 6).
Subsequently, linear and non-linear forms of the same isotherm model
are compared for adsorption of Cd(II) and Pb(II) (Figures 5 and 6,
respectively). Regarding the Langmuir isotherm, for Cd(II) adsorption, Llin1
is better than Lnonlin as the former has six EFmin (ARE, EABS, SAE,
MPSD, ARSE and ADRSQ), whereas the latter has only three (RMSE,
HYBRID and CST), and the SEPD value of the former (20.42%) is smaller
than that of the latter (36.54%); for Pb(II) adsorption, Llin4 is better than
Lnonlin as indicated by seven EFmin (ARE, EABS, SAE, MPSD, ARSE,
HYBRID and CST) versus two (RMSE and ADRSQ) and a smaller SEPD
value (1153 versus 1377%). Concerning the Freundlich isotherm, for Cd(II)
adsorption, Fnonlin is better than Flin since it has six EFmin (ARE, EABS,
SAE, RMSE, CST and ADRSQ) versus three (MPSD, ARSE and
HYBRID) and a smaller SEPD value (1282 versus 1300%); for Pb(II)
adsorption, Fnonlin compared with Flin presents six EFmin (ARE, EABS,
SAE, RMSE, CST and ADRSQ) versus three (MPSD, ARSE and
HYBRID) and a slightly larger SEPD value (1737 versus 1731%). As
regards the Temkin isotherm, for Cd(II) adsorption, Tlin and Tnonlin have
equal error values and, consequently, identical EPD values and the same
SEPD value (100.9%); for Pb(II) adsorption, Tlin and Tnonlin have
practically the same error values and, as a consequence, EPD and SEPD
values that are very similar to each other (0 or very close to 0). It is
noteworthy that, for adsorption of the two Me species, modeling results
provided by linear regression are better than (Langmuir isotherm) or very
similar to (Temkin isotherm) those offered by non-linear regression, which
is in agreement with previously published data (Ho et al. 2002). The
regression giving the best fit differs from one model to another, being
linear for the Langmuir isotherm, non-linear for the Freundlich isotherm
and both linear and non-linear for the Temkin isotherm.
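
The difference between the two regression routes can be illustrated with a minimal Python sketch for the Freundlich model, written here as qe = KF·Ce^(1/n) with the log-log linearization ln qe = ln KF + (1/n) ln Ce. This is not the fitting procedure used for the results above: the data are hypothetical, the function names are invented for illustration, and the non-linear fit uses a simple golden-section search over the exponent (with KF obtained in closed form for each trial exponent) in place of a general-purpose optimizer.

```python
import math


def fit_freundlich_linear(ce, qe):
    """Ordinary least squares on ln(qe) = ln(KF) + (1/n) * ln(Ce)."""
    x = [math.log(c) for c in ce]
    y = [math.log(q) for q in qe]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    return math.exp(mean_y - slope * mean_x), slope  # (KF, 1/n)


def _best_kf_and_sse(exponent, ce, qe):
    """For a fixed exponent, the SSE-minimising KF has a closed form."""
    basis = [c ** exponent for c in ce]
    kf = sum(q * b for q, b in zip(qe, basis)) / sum(b * b for b in basis)
    sse = sum((q - kf * b) ** 2 for q, b in zip(qe, basis))
    return kf, sse


def fit_freundlich_nonlinear(ce, qe, low=0.01, high=1.0, tol=1e-10):
    """Minimise the SSE of qe = KF * Ce**(1/n) on the original scale by a
    golden-section search over the exponent, assumed to lie in [low, high]."""
    ratio = (math.sqrt(5) - 1) / 2
    while high - low > tol:
        left = high - ratio * (high - low)
        right = low + ratio * (high - low)
        if _best_kf_and_sse(left, ce, qe)[1] < _best_kf_and_sse(right, ce, qe)[1]:
            high = right  # minimum lies in [low, right]
        else:
            low = left    # minimum lies in [left, high]
    exponent = (low + high) / 2
    return _best_kf_and_sse(exponent, ce, qe)[0], exponent  # (KF, 1/n)


# Hypothetical equilibrium data, not the chapter's measurements.
CE = [0.01, 0.02, 0.04, 0.08, 0.12, 0.16, 0.20]
QE_EXACT = [0.08 * c ** 0.3 for c in CE]
NOISE = [1.05, 0.97, 1.02, 0.96, 1.03, 0.99, 1.01]  # arbitrary perturbation
QE_NOISY = [q * f for q, f in zip(QE_EXACT, NOISE)]

for label, qe in (("exact", QE_EXACT), ("noisy", QE_NOISY)):
    print(label, fit_freundlich_linear(CE, qe), fit_freundlich_nonlinear(CE, qe))
```

With error-free data both routes recover the same parameters; once the data are perturbed, the linearized fit minimizes residuals on the logarithmic scale and the non-linear fit on the original scale, so the two estimates generally differ, which is why error functions computed on the original scale can favor one route or the other.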
Then, a comparison is made among linear forms of all isotherms for
adsorption of each Me species. For Cd(II) adsorption, Llin1 is the best with
seven EFmin (ARE, EABS, SAE, MPSD, ARSE, HYBRID and ADRSQ),
while Tlin has two (RMSE and CST) and Flin none, SEPD values for
Llin1, Tlin and Flin being 20.42, 100.9 and 1300%, respectively (Figure
5). For Pb(II) adsorption, the best is Tlin, which has nine EFmin and the
smallest SEPD value (0 versus 1153% for Llin4 and 1731% for Flin)
(Figure 6).

Table 3. Values of error functions for Cd(II) adsorption

Error      Langmuir isotherm                            Temkin isotherm   Freundlich isotherm
functionᵃ  Llin1    Llin2    Llin3    Llin4    Lnonlin  Tlin     Tnonlin  Flin     Fnonlin
ARE        4.67     5.36     5.69     4.90     5.26     5.62     5.62     11.19    11.04
EABS       0.00282  0.00346  0.00336  0.00289  0.00300  0.00285  0.00285  0.00673  0.00529
SAE        0.0197   0.0242   0.0235   0.0203   0.0210   0.0199   0.0199   0.0471   0.0371
RMSE       0.00371  0.00425  0.00369  0.00367  0.00363  0.00350  0.00350  0.00741  0.00612
MPSD       7.77     8.25     8.18     7.76     7.97     9.96     9.96     16.05    21.18
ARSE       0.0710   0.0753   0.0747   0.0708   0.0728   0.0910   0.0910   0.147    0.193
HYBRID     0.0332   0.0403   0.0329   0.0325   0.0322   0.0368   0.0368   0.124    0.138
CST        0.00187  0.00217  0.00170  0.00177  0.00168  0.00178  0.00178  0.00611  0.00529
ADRSQ      0.997    0.971    0.894    0.894    0.976    0.978    0.978    0.896    0.932

ᵃ For each error function, the bold value is E0 for EPD calculation in Figure 5, i.e., the best among linear and non-linear form values of all isotherms
(Llin1 is selected from the four linear Langmuir isotherm forms).

Table 4. Values of error functions for Pb(II) adsorption

Error      Langmuir isotherm                            Temkin isotherm   Freundlich isotherm
functionᵃ  Llin1    Llin2    Llin3    Llin4    Lnonlin  Tlin     Tnonlin  Flin     Fnonlin
ARE        9.15     6.35     6.19     6.00     7.45     3.45     3.45     8.42     8.00
EABS       0.00327  0.00388  0.00353  0.00316  0.00332  0.00171  0.00171  0.00424  0.00326
SAE        0.0229   0.0272   0.0247   0.0221   0.0233   0.0120   0.0120   0.0297   0.0228
RMSE       0.00429  0.00474  0.00423  0.00403  0.00376  0.00198  0.00198  0.00462  0.00392
MPSD       16.21    8.98     8.75     9.13     11.06    4.63     4.64     10.96    14.60
ARSE       0.148    0.0820   0.0798   0.0834   0.101    0.0423   0.0424   0.100    0.133
HYBRID     0.0761   0.0494   0.0427   0.0423   0.0446   0.00997  0.00999  0.0541   0.0589
CST        0.00470  0.00248  0.00201  0.00190  0.00227  0.00050  0.00050  0.00269  0.00251
ADRSQ      0.998    0.984    0.912    0.912    0.950    0.986    0.986    0.942    0.946

ᵃ For each error function, the bold value is E0 for EPD calculation in Figure 6, i.e., the best among linear and non-linear form values of all isotherms
(Llin4 is selected from the four linear Langmuir isotherm forms).

Table 5. Error percent deviations (EPD) and EPD sums (SEPD) of linear Langmuir isotherm forms
for Cd(II) adsorption

Isotherm  EPD (%)                                                                 SEPD
model     ARE     EABS    SAE     RMSE    MPSD    ARSE    HYBRID  CST     ADRSQ   (%)
Llin1     0       0       0       1.09    0.13    0.28    2.15    10.00   0       13.65
Llin2     14.78   22.70   22.84   15.80   6.31    6.36    24.00   27.65   2.61    143.1
Llin3     21.84   19.15   19.29   0.545   5.41    5.51    1.23    0       10.33   83.31
Llin4     4.93    2.48    3.05    0       0       0       0       4.12    10.33   24.91

Table 6. Error percent deviations (EPD) and EPD sums (SEPD) of linear Langmuir isotherm forms
for Pb(II) adsorption

Isotherm  EPD (%)                                                                 SEPD
model     ARE     EABS    SAE     RMSE    MPSD    ARSE    HYBRID  CST     ADRSQ   (%)
Llin1     52.50   3.48    3.62    6.45    85.26   85.46   79.91   147.4   0       464.1
Llin2     5.83    22.78   23.08   17.62   2.63    2.76    16.78   30.53   1.40    123.4
Llin3     3.17    11.71   11.76   4.96    0       0       0.95    5.79    8.62    49.96
Llin4     0       0       0       0       4.34    4.51    0       0       8.62    17.47

Afterwards, non-linear forms of all isotherms for adsorption of the two
Me species are compared. For Cd(II) adsorption, Lnonlin with five EFmin
(ARE, MPSD, ARSE, HYBRID and CST) is better than Tnonlin with four
(EABS, SAE, RMSE and ADRSQ) and Fnonlin with none; the same
ranking of Lnonlin, Tnonlin and Fnonlin is indicated by SEPD values of
36.54, 100.9 and 1282%, respectively (Figure 5). For Pb(II) adsorption, all
EFmin belong to Tnonlin, which also has the smallest SEPD value (0.66
versus 1377 and 1737% for Lnonlin and Fnonlin, respectively) (Figure 6).

Figure 5. Error percent deviations (EPD) (a) and EPD sums (b) of linear and non-linear
Langmuir, Temkin and Freundlich isotherms for Cd(II) adsorption.

Figure 6. Error percent deviations (EPD) (a) and EPD sums (b) of linear and non-linear
Langmuir, Temkin and Freundlich isotherms for Pb(II) adsorption.

The order of isotherm model validities for Cd(II) adsorption, indicated
by both linear and non-linear regressions, is Langmuir > Temkin >
Freundlich; the values of EPD (except those of ADRSQ) for Llin1,
Lnonlin, Tlin and Tnonlin are below 30%, whereas most of those for Flin
and Fnonlin lie within the range of 135–330% (Figure 5). For Pb(II)
adsorption, the isotherm validity order revealed by using linear regression
is the same as that indicated by non-linear regression, i.e., Temkin >
Langmuir > Freundlich; EPD (except ADRSQ) values for Tlin and Tnonlin
are equal or very close to 0, whereas those for Llin4, Lnonlin, Flin and
Fnonlin range mostly from 130 to 490% (Figure 6). It is worth
emphasizing that, for adsorption of both Me species, the descending order
of isotherm model validities established by linear regression is identical
to that determined via non-linear regression. It is noted that, among all
linear and non-linear isotherm model forms considered, i.e., Llin1/Llin4,
Lnonlin, Flin, Fnonlin, Tlin and Tnonlin, the highest validity is presented,
for Cd(II) adsorption, by the linear form Llin1 and, for Pb(II) adsorption, to
practically the same extent by the linear form Tlin and the non-linear form
Tnonlin.
The analysis of isotherm parameter values predicted by mathematical
modeling gives useful information on adsorption of Cd(II) and Pb(II)
(Tables 1 and 2, respectively). The values of qm for Cd(II) and Pb(II)
adsorption (0.112 and 0.074 mmol g⁻¹, respectively) are close to the
highest corresponding qe values (0.100 and 0.075 mmol g⁻¹, respectively);
the qm value for Cd(II) is larger than that for Pb(II), as expected
considering qe values. The binding energy towards Pb(II) is higher than
that towards Cd(II), as indicated by the larger KL value for the former Me
species (20.52 L mmol⁻¹) compared with that for the latter (3.60 L
mmol⁻¹). The Temkin isotherm fits the experimental data well (to similar
extents when using linear and non-linear regressions), confirming that Me
chemisorption takes place. Strong Me–adsorbent interactions consistent
with chemisorption are indicated by the high values of the Temkin
parameters bT and KT (those estimated by linear regression are very close
to the corresponding ones determined by non-linear regression for
adsorption of both Me species); parameter values for Pb(II) are larger than
the corresponding ones for Cd(II), pointing out that the forces binding the
former Me species are stronger than those holding the latter, which is in
agreement with what KL values indicate (Zafar et al. 2007). Of the three
models, the Freundlich
isotherm gives the poorest fit to experimental data for each Me species,
excluding the possibility that multilayer adsorption takes place and further
confirming the occurrence of chemisorption that results in monolayer
coverage of the adsorbent surface (McKay 1995). The values of 1/n (0.312
and 0.241 for Cd(II) and Pb(II), respectively) lie within the range
0–1, showing favorable conditions for Me adsorption and therefore easy
Me removal from aqueous solutions (Subramanyam and Das 2009;
Hamdaoui and Naffrechoux 2007). The KF value for Cd(II) is higher than
that for Pb(II) (0.081 and 0.074 mmol^(1−1/n) L^(1/n) g⁻¹, respectively),
which is in accordance with the larger qm value of the former Me species
compared with that of the latter. The values of ΔG0 are negative for
adsorption of both Me species (−19.96 and −24.20 kJ mol⁻¹ for Cd(II) and
Pb(II), respectively), indicating the feasibility and spontaneous nature of
adsorption (Boparai et al. 2011). The RL values (0.581 and 0.315 for Cd(II)
and Pb(II), respectively), lying within the 0–1 range, point out that
adsorption is favorable, revealing that ILLF-SDVB beads constitute a good
adsorbent for the two Me species.
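
The interpretation of RL, 1/n and ΔG0 above can be sketched in a few lines of Python. The relations used here, RL = 1/(1 + KL·C0) and ΔG0 = −R·T·ln(KL) with KL converted from L mmol⁻¹ to L mol⁻¹, are the commonly used forms and are assumptions of this sketch rather than a restatement of the chapter's own equations; the temperature and initial concentration in the usage lines are likewise hypothetical, and all function names are invented for illustration.

```python
import math

R_GAS = 8.314  # universal gas constant, J mol^-1 K^-1


def separation_factor(k_l, c0):
    """RL = 1 / (1 + KL * C0), with KL in L mmol^-1 and C0 in mmol L^-1
    (assumed common form of the Langmuir separation factor)."""
    return 1.0 / (1.0 + k_l * c0)


def gibbs_free_energy(k_l, temperature_k):
    """Return dG0 = -R * T * ln(KL) in kJ mol^-1 (assumed common form).

    KL is given in L mmol^-1 and converted to L mol^-1 before taking the log.
    """
    return -R_GAS * temperature_k * math.log(k_l * 1000.0) / 1000.0


def is_favorable(value):
    """True when RL or 1/n lies strictly between 0 and 1."""
    return 0.0 < value < 1.0


# KL for Cd(II) from Table 1 (form Llin1); T and C0 are hypothetical inputs.
K_L_CD = 3.60          # L mmol^-1
TEMPERATURE = 293.15   # K, hypothetical
C0 = 0.5               # mmol L^-1, hypothetical

R_L = separation_factor(K_L_CD, C0)
DELTA_G = gibbs_free_energy(K_L_CD, TEMPERATURE)
print(f"RL = {R_L:.3f}, favorable: {is_favorable(R_L)}")
print(f"dG0 = {DELTA_G:.2f} kJ/mol, spontaneous: {DELTA_G < 0}")
print(f"1/n = 0.312 favorable: {is_favorable(0.312)}")
```

Under these assumptions any KL above 1 L mol⁻¹ gives a negative ΔG0, and any positive KL·C0 gives an RL between 0 and 1, which is the pattern the reported values follow.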

CONCLUSION

Both linear and non-linear regressions were used in mathematical
modeling of adsorption of the heavy metal ions Cd(II) and Pb(II) from
aqueous solutions onto surface-functionalized polymer beads for
comparatively analyzing the experimental equilibrium data by three
isotherm models. The validities of Langmuir, Temkin and Freundlich
models were evaluated by employing two criteria based on nine error
functions. Langmuir and Temkin models successfully describe Cd(II) and
Pb(II) adsorption, respectively; by contrast, the Freundlich model gives the
poorest description of adsorption of each Me species. It was shown that
modeling results provided by linear regression may be better than or
similar to those offered by non-linear regression. For adsorption of both
Me species, the best fit to experimental data is obtained by using linear
regression, non-linear regression and both linear and non-linear regressions
for Langmuir, Freundlich and Temkin models, respectively; the descending
order of isotherm model validities determined by employing linear
regression is the same as that established by using non-linear regression.
Modeling results confirm that Me adsorption is a chemisorption process
and reveal its feasibility and spontaneous nature, thereby indicating that
ILLF-SDVB beads have potential applications as an adsorbent in wastewater
treatment.

REFERENCES

Armagan, B.; Toprak, F. Optimum isotherm parameters for reactive azo
dye onto pistachio nut shells: Comparison of linear and non-linear
methods. Polish Journal of Environmental Studies 2013, 22,
1007–1011.
Bilba, N.; Bilba, D.; Moroi, G. Synthesis of a polyacrylamidoxime
chelating fiber and its efficiency in the retention of palladium ions.
Journal of Applied Polymer Science 2004, 92, 3730–3735.
Bilba, D.; Moroi, G.; Bilba, N. Copper (II) and mercury (II) retention
properties of a polyacrylamidoxime chelating fiber. Environmental
Engineering and Management Journal 2006, 5, 297–305.
Bilba, D.; Bilba, N.; Moroi, G. Removal of mercury(II) ions from aqueous
solutions by the polyacrylamidoxime chelating fiber. Separation
Science and Technology 2007, 42, 171–184.
Boparai, H. K.; Joseph, M.; O’Carroll, D. M. Kinetics and thermodynamics
of cadmium ion removal by adsorption onto nanozerovalent iron
particles. Journal of Hazardous Materials 2011, 186, 458–465.
Bradl, H. Adsorption of heavy metal ions on clays. In Encyclopedia of
surface and colloid science update supplement; Editor, P.
Somasundaran; Marcel Dekker Inc.: New York, 2004; Vol. 5, pp.
35–47.
Brdar, M. M.; Takači, A. A.; Šćiban, M. B.; Rakić, D. Z. Isotherms for the
adsorption of Cu(II) onto lignin – comparison of linear and non-linear
methods. Hemijska Industrija 2012, 66, 497–503.
Casas, J. S.; Sordo, J. Lead: chemistry, analytical aspects, environmental
impact and health effects (1st ed.); Elsevier: Amsterdam, 2006.
Foo, K. Y.; Hameed, B. H. Insights into the modeling of adsorption
isotherm systems. Chemical Engineering Journal 2010, 156, 2–10.
Freundlich, H. M. F. Über die Adsorption in Lösungen. Zeitschrift für
Physikalische Chemie 1906, 57A, 385–470. [Adsorption in solution.
Journal of Physical Chemistry 57A: 385–470].
Hamdaoui, O.; Naffrechoux, E. Modeling of adsorption isotherms of
phenol and chlorophenols onto granular activated carbon: Part I. Two-
parameter models and equations allowing determination of
thermodynamic parameters. Journal of Hazardous Materials 2007,
147, 381–394.
Han, R.; Zhang, J.; Han, P.; Wang, Y.; Zhao, Z.; Tang, M. Study of
equilibrium, kinetic and thermodynamic parameters about methylene
blue adsorption onto natural zeolite. Chemical Engineering Journal
2009, 145, 496–504.
He, J.; Hong, S.; Zhang, L.; Gan, F.; Ho, Y. S. Equilibrium and
thermodynamic parameters of adsorption of Methylene Blue onto
rectolite. Fresenius Environmental Bulletin 2010, 19, 2651–2656.
Ho, Y. S.; McKay, G. Sorption of dye from aqueous solution by peat.
Chemical Engineering Journal 1998, 70, 115–124.
Ho, Y. S.; Porter, J. F.; McKay, G. Equilibrium isotherm studies for the
sorption of divalent metal ions onto peat: copper, nickel and lead
single component systems. Water, Air, and Soil Pollution 2002, 141,
1–33.
Kumar, K. V. Comparative analysis of linear and non-linear method of
estimating the sorption isotherm parameters for malachite green onto
activated carbon. Journal of Hazardous Materials 2006, B136, 197–
202.
Kumar, K. V.; Porkodi, K.; Rocha, F. Isotherms and thermodynamics by
linear and non-linear regression analysis for the sorption of methylene
blue onto activated carbon: Comparison of various error functions.
Journal of Hazardous Materials 2008, 151, 794–804.
Langmuir, I. The constitution and fundamental properties of solids and
liquids. Part I. Solids. Journal of the American Chemical Society 1916,
38, 2221–2295.
Legates, D. R.; McCabe, G. J. Jr. Evaluating the use of “goodness-of-fit”
measures in hydrologic and hydro-climatic model validation. Water
Resources Research 1999, 35, 233–241.
McKay, G.; Blair, H. S.; Gardener, J. R. Adsorption of dyes on chitin. I.
Equilibrium studies. Journal of Applied Polymer Science 1982, 27,
3043–3057.
McKay, G. (Ed.), Use of Adsorbents for the Removal of Pollutants from
Wastewaters; CRC Press: Boca Raton, 1995.
Moroi, G.; Bilba, D.; Bilba, N. Thermal behaviour of palladium
complexing polyacrylamidoxime polymer. Polymer Degradation and
Stability 2001, 72, 525–535.
Moroi, G.; Bilba, D.; Bilba, N. Thermal degradation of mercury chelated
polyacrylamidoxime. Polymer Degradation and Stability 2004, 84,
207–214.
Moroi, G.; Bilba, D.; Bilba, N.; Ciobanu, C. Thermal behaviour of
polyacrylamidoxime-copper chelates. Polymer Degradation and
Stability 2006, 91, 535–540.
Moroi, G. N. Investigation on structure and properties of
cobalt(II)/polyesterurethane metallopolymer films. Journal of Polymer
Research 2012, 19, 1–10.
Moroi, G. N.; Avram, E.; Bulgariu, L. Adsorption of heavy metal ions
onto surface-functionalised polymer beads. I. Modelling of equilibrium
isotherms by using non-linear and linear regression analysis. Water,
Air, and Soil Pollution 2016, 227, 1–18. Erratum to: Adsorption of
heavy metal ions onto surface-functionalised polymer beads. I.
Modelling of equilibrium isotherms by using non-linear and linear
regression analysis. Water, Air, and Soil Pollution 2016, 227, 1–2.
Salarirad, M. M.; Behnamfard, A. Modeling of equilibrium data for free
cyanide adsorption onto activated carbon by linear and non-linear
regression methods. 2011 International Conference on Environment
and Industrial Innovation IPCBEE, 2011; 12, 79–84, IACSIT Press,
Singapore.
Sigel, A.; Sigel, H.; Sigel, R. K. O. Cadmium: from toxicity to essentiality;
Springer, Dordrecht, 2013.
Subramanyam, B.; Das, A. Linearized and non-linearized isotherm models
comparative study on adsorption of aqueous phenol solution in soil.
International Journal of Environmental Science and Technology 2009,
6, 633–640.
Temkin, M. I. Adsorption equilibrium and the kinetics of processes on
nonhomogeneous surfaces and in the interaction between adsorbed
molecules. Zhurnal Fizicheskoi Khimii 1941, 15, 296–332.
Wasewar, K. L. Adsorption of metals onto tea factory waste: A review.
International Journal of Research and Reviews in Applied Sciences
2010, 3, 303–322.
Zafar, M. N.; Nadeem, R.; Hanif, M. A. Biosorption of nickel from
protonated rice bran. Journal of Hazardous Materials 2007, 143, 478–
485.