
Chapter 5

Transformation and Weighting to Correct Model Inadequacies

Graphical methods help in detecting violations of the basic assumptions in regression analysis. We now consider methods and procedures for building models through data transformation when some of these assumptions are violated.

Variance-stabilizing transformations


In regression analysis, it is assumed that the variance of the disturbances is constant, i.e., $\text{Var}(\varepsilon_i) = \sigma^2$, $i = 1, 2, \ldots, n$. Suppose this assumption is violated. A common reason for such a violation is that the study variable follows a probability distribution in which the variance is functionally related to the mean.

For example, if the study variable $y$ in a simple linear regression model is a Poisson random variable, then its variance equals its mean. Since the mean of $y$ is related to the explanatory variable $x$, the variance of $y$ will be proportional to $x$. In such cases, variance-stabilizing transformations are useful.

As another example, if $y$ is a proportion, i.e., $0 \le y_i \le 1$, then the variance of $y$ is proportional to $E(y)[1 - E(y)]$, and again a variance-stabilizing transformation is useful.

Some commonly used variance-stabilizing transformations, in increasing order of their strength, are as follows:

Relation of $\sigma^2$ to $E(y)$          Transformation
$\sigma^2 \propto$ constant               $y^* = y$ (no transformation)
$\sigma^2 \propto E(y)$                   $y^* = \sqrt{y}$ (Poisson data)
$\sigma^2 \propto E(y)[1 - E(y)]$         $y^* = \sin^{-1}(\sqrt{y})$ (binomial proportions, $0 \le y_i \le 1$)
$\sigma^2 \propto [E(y)]^2$               $y^* = \ln(y)$
$\sigma^2 \propto [E(y)]^3$               $y^* = y^{-1/2}$
$\sigma^2 \propto [E(y)]^4$               $y^* = y^{-1}$

After making a suitable transformation, use $y^*$ as the study variable in the respective case.
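As a quick illustration, the following is a minimal Python sketch (assuming NumPy; the data vectors are hypothetical) of applying these transformations:

```python
import numpy as np

y = np.array([1.2, 3.5, 0.8, 7.1, 2.4])   # hypothetical positive responses

y_sqrt  = np.sqrt(y)       # sigma^2 proportional to E(y): Poisson-type counts
y_log   = np.log(y)        # sigma^2 proportional to [E(y)]^2
y_isqrt = y**(-0.5)        # sigma^2 proportional to [E(y)]^3
y_recip = 1.0 / y          # sigma^2 proportional to [E(y)]^4

p = np.array([0.05, 0.40, 0.75, 0.98])     # hypothetical binomial proportions
p_star = np.arcsin(np.sqrt(p))             # sigma^2 proportional to E(y)[1 - E(y)]
```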
The strength of a transformation depends on the amount of curvature in the relationship between the study variable and the explanatory variable. The transformations listed here range from relatively mild to relatively strong: the square root transformation is relatively mild, while the reciprocal transformation is relatively strong.

In general, a mild transformation is applied when the minimum and maximum values of $y$ do not differ much (e.g., $y_{\max}/y_{\min} < 2$ or $3$); such a transformation has little effect on the curvature. On the other hand, when the minimum and maximum differ substantially, a strong transformation is needed, and it will have a substantial impact on the analysis.

In the presence of non-constant variance, the OLSE remains unbiased but loses the minimum variance property.

When the study variable has been transformed to $y^*$, the predicted values are on the transformed scale. It is often necessary to convert the predicted values back to the original units of $y$.

When the inverse transformation is applied directly to the predicted values, it gives an estimate of the median of the distribution of the study variable rather than the mean. So one needs to be careful when doing so.

Confidence intervals and prediction intervals may be directly converted from one metric to the other, because the interval endpoints are percentiles of a distribution and percentiles are unaffected by a monotone transformation. Note, however, that the resulting intervals may not remain the shortest possible intervals.

Transformations to linearize the model


The basic assumption in linear regression analysis is that the relationship between the study variable and the explanatory variables is linear. Suppose this assumption is violated. Such a violation can be checked by the scatter plot matrix, scatter diagrams, partial regression plots, lack-of-fit tests, etc.

In some cases, a nonlinear model can be linearized by using a suitable transformation. Such nonlinear models are called intrinsically or transformably linear. The advantage of transforming a nonlinear function into a linear one is that the statistical tools are developed for the case of a linear regression model. For example, exact hypothesis tests, confidence interval estimation, etc., are developed for the linear regression model. Once the nonlinear function is transformed to a linear function, all such tools can be readily applied, and there is no need to develop them separately.

Some linearizable functions are as follows:


1. If the curve between $y$ and $x$ has the shape shown in the corresponding figure (omitted here), then the possible linearizable function is of the form

   $y = \beta_0 x^{\beta_1}.$

   Using the transformation $y^* = \ln y$, $x^* = \ln x$, i.e., by taking logarithms on both sides, the model becomes

   $\ln y = \ln \beta_0 + \beta_1 \ln x$

   or $y^* = \beta_0^* + \beta_1 x^*$

   where $\beta_0^* = \ln \beta_0$, and the model becomes a linear model. Note that the parameter $\beta_0$ changes to $\ln \beta_0$ in the transformed model.

2. If the curve between $y$ and $x$ has the shape shown in the corresponding figure (omitted here), then the possible linearizable function is of the form

   $y = \beta_0 \exp(\beta_1 x).$

   Taking $\log_e$ (ln) on both sides,

   $\ln y = \ln \beta_0 + \beta_1 x$

   or $y^* = \beta_0^* + \beta_1 x$

   where $y^* = \ln y$ and $\beta_0^* = \ln \beta_0$. So $y^* = \ln y$ is the transformation needed in this case. The intercept term $\beta_0$ becomes $\ln \beta_0$ in the transformed model.

3. If the curve between $y$ and $x$ has the shape shown in the corresponding figure (omitted here), then the possible linearizable function is of the form

   $y = \beta_0 + \beta_1 \ln x,$

   which can be written as

   $y = \beta_0 + \beta_1 x^*$

   using the transformation $x^* = \ln x$.

4. If the curve between $y$ and $x$ has the shape shown in the corresponding figure (omitted here), then the possible linearizable function is of the form

   $y = \dfrac{x}{\beta_0 x - \beta_1},$

   which can be written as

   $\dfrac{1}{y} = \beta_0 - \beta_1 \dfrac{1}{x}$

   or $y^* = \beta_0 - \beta_1 x^*,$

   which becomes a linear model using the transformation $y^* = \dfrac{1}{y}$, $x^* = \dfrac{1}{x}$.
- Based on the observed behaviour of the plots, one can choose an appropriate curve and use the linearized form of the function.
- When such transformations are used, the form of the error term $\varepsilon$ often changes as well. For example, in the case of

  $y = \beta_0 \exp(\beta_1 x)\,\varepsilon,$

  $\ln y = \ln \beta_0 + \beta_1 x + \ln \varepsilon$

  or $y^* = \beta_0^* + \beta_1 x + \varepsilon^*.$

  This implies that the multiplicative error in the original model must be log-normally distributed for the errors in the transformed model to be normal. Many times we ignore this aspect and continue to assume that the random errors are still normally distributed; in such cases, the residuals from the transformed model should be checked for the validity of the assumptions.
- When such transformations are used, the OLSE has the desired properties with respect to the transformed data, not the original data.
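To make the first case above concrete, here is a minimal sketch (assuming NumPy) of fitting the power model $y = \beta_0 x^{\beta_1} \varepsilon$ by regressing $\ln y$ on $\ln x$; the data are simulated so the recovered estimates can be checked against the true values:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(1.0, 20.0, 100)
eps = np.exp(rng.normal(0.0, 0.2, x.size))   # multiplicative log-normal error
y = 2.5 * x**0.7 * eps                       # y = beta0 * x^beta1 * eps

# Linearize: ln y = ln(beta0) + beta1 * ln x + ln(eps)
X = np.column_stack([np.ones(x.size), np.log(x)])
coef, *_ = np.linalg.lstsq(X, np.log(y), rcond=None)
beta0_hat = np.exp(coef[0])                  # back-transform the intercept
beta1_hat = coef[1]
print(beta0_hat, beta1_hat)                  # close to 2.5 and 0.7
```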

Analytical methods for selecting a transformation of the study variable


The Box-Cox method
Suppose the normality and/or constant variance of the study variable $y$ can be corrected through a power transformation on $y$. This means $y$ is to be transformed as $y^\lambda$, where $\lambda$ is a parameter to be determined. For example, if $\lambda = 0.5$, then the transformation is the square root, and $\sqrt{y}$ is used as the study variable in place of $y$.

Now the linear regression model has parameters $\beta$, $\sigma^2$ and $\lambda$. The Box-Cox method tells how to estimate $\lambda$ and the other parameters of the model simultaneously using the method of maximum likelihood.

Note that as $\lambda$ approaches zero, $y^\lambda$ approaches 1. So there is a problem at $\lambda = 0$, because this makes all the observations on $y$ equal to unity, and it is meaningless for all observations on the study variable to be constant. So there is a discontinuity at $\lambda = 0$. One approach to resolve this difficulty is to use $(y^\lambda - 1)/\lambda$ as the study variable. Note that as $\lambda \to 0$, $(y^\lambda - 1)/\lambda \to \ln y$. So a possible solution is to use the transformed study variable

$W = \begin{cases} \dfrac{y^\lambda - 1}{\lambda} & \text{for } \lambda \neq 0 \\ \ln y & \text{for } \lambda = 0. \end{cases}$

So the family $W$ is continuous in $\lambda$. Still, it has a drawback: as $\lambda$ changes, the values of $W$ change dramatically, which makes it difficult to obtain the best value of $\lambda$. If different analysts obtain different values of $\lambda$, they will fit different models, and it may then not be appropriate to compare models with different values of $\lambda$. So it is preferable to use the alternative form

$y^{(\lambda)} = V = \begin{cases} \dfrac{y^\lambda - 1}{\lambda\, y_*^{\lambda - 1}} & \text{for } \lambda \neq 0 \\ y_* \ln y & \text{for } \lambda = 0 \end{cases}$

where $y_*$ is the geometric mean of the $y_i$'s, $y_* = (y_1 y_2 \cdots y_n)^{1/n}$, which is constant. For calculation purposes, we can use

$\ln y_* = \dfrac{1}{n} \sum_{i=1}^{n} \ln y_i.$

When $V$ is applied to each $y_i$, we get $V = (V_1, V_2, \ldots, V_n)'$ as the vector of observations on the transformed study variable, and we use it to fit the linear model

$V = X\beta + \varepsilon$

using the least squares or maximum likelihood method.
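A minimal sketch of this scaled transformation, assuming NumPy (the helper name `boxcox_v` is hypothetical):

```python
import numpy as np

def boxcox_v(y, lam):
    """Scaled Box-Cox variable V = y^(lambda) for positive y."""
    y = np.asarray(y, dtype=float)
    gm = np.exp(np.mean(np.log(y)))      # geometric mean via ln y_* = (1/n) sum ln y_i
    if abs(lam) < 1e-8:
        return gm * np.log(y)            # limiting case lambda = 0
    return (y**lam - 1.0) / (lam * gm**(lam - 1.0))

v = boxcox_v([1.0, 2.0, 4.0], 0.5)       # V is then regressed on X as usual
```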

The quantity $y_*^{\lambda - 1}$ in the denominator is related to the Jacobian of the transformation: its $n$th power equals the Jacobian of the full transformation. To see how, we want to convert $y_i$ into $y_i^{(\lambda)}$ as

$y_i^{(\lambda)} = W_i = \dfrac{y_i^\lambda - 1}{\lambda}, \qquad \lambda \neq 0.$

Let $y = (y_1, y_2, \ldots, y_n)'$ and $W = (W_1, W_2, \ldots, W_n)'$.

Note that if $W_1 = \dfrac{y_1^\lambda - 1}{\lambda}$, then

$\dfrac{\partial W_1}{\partial y_1} = y_1^{\lambda - 1}, \qquad \dfrac{\partial W_1}{\partial y_2} = 0.$

In general,

$\dfrac{\partial W_i}{\partial y_j} = \begin{cases} y_i^{\lambda - 1} & \text{if } i = j \\ 0 & \text{if } i \neq j. \end{cases}$

For a single observation, the Jacobian of the transformation is

$J(y_i \to W_i) = \dfrac{\partial y_i}{\partial W_i} = \dfrac{1}{\partial W_i / \partial y_i} = \dfrac{1}{y_i^{\lambda - 1}}.$

W1 W1 W1

y1 y2 yn
y1 1 0 0  0
W2 W2 W2  1
 0 y2 0  0
J (W  y )  y1 y2 yn 
    
     1
0 0 0  yn
Wn Wn Wn

y1 y2 yn
n
  yi 1
i 1
 1
 n 
   yi 
 i 1 
 1
 
1  1 
J(y W)   n  .
J (W  Y )  
  yi 
 i 1 
This is the Jacobian for transforming the whole vector $y$ into the whole vector $W$. If an individual $y_i$ is to be transformed into $W_i$, we take the corresponding geometric-mean factor:

$J(y_i \to W_i) = \left[ \left( \prod_{i=1}^{n} y_i \right)^{1/n} \right]^{-(\lambda - 1)} = \dfrac{1}{y_*^{\lambda - 1}}.$
The quantity $J(y \to W) = \dfrac{1}{\prod_{i=1}^{n} y_i^{\lambda - 1}}$ ensures that unit volume is preserved in moving from the set of the $y_i$ to the set of the $V_i$. This is a scaling factor which ensures that the residual sums of squares obtained for different values of $\lambda$ can be compared.
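The diagonal structure of this Jacobian can be checked numerically; a small sketch, assuming NumPy, with arbitrary sample values:

```python
import numpy as np

rng = np.random.default_rng(2)
y = rng.uniform(0.5, 4.0, 5)
lam = 0.5

# Analytic Jacobian determinant of y -> W: product of y_i^(lambda - 1).
analytic = np.prod(y**(lam - 1.0))

# Numerical check: W_i depends only on y_i, so the Jacobian matrix is diagonal.
h = 1e-6
W = lambda v: (v**lam - 1.0) / lam
diag = (W(y + h) - W(y - h)) / (2.0 * h)   # central finite differences
numeric = np.prod(diag)                    # determinant of a diagonal matrix

print(analytic, numeric)                   # agree to high precision
```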

To find the appropriate value of $\lambda$, consider

$y^{(\lambda)} = V = X\beta + \varepsilon$

where $y^{(\lambda)} = \dfrac{y^\lambda - 1}{\lambda\, y_*^{\lambda - 1}}$ and $\varepsilon \sim N(0, \sigma^2 I)$.
Applying the method of maximum likelihood, the likelihood function for $y^{(\lambda)}$ is

$L\left( y^{(\lambda)} \right) = \left( \dfrac{1}{2\pi\sigma^2} \right)^{n/2} \exp\left( -\dfrac{\sum_{i=1}^{n} \varepsilon_i^2}{2\sigma^2} \right) = \left( \dfrac{1}{2\pi\sigma^2} \right)^{n/2} \exp\left( -\dfrac{\varepsilon'\varepsilon}{2\sigma^2} \right) = \left( \dfrac{1}{2\pi\sigma^2} \right)^{n/2} \exp\left( -\dfrac{\left( y^{(\lambda)} - X\beta \right)'\left( y^{(\lambda)} - X\beta \right)}{2\sigma^2} \right)$

so that, ignoring the constant,

$\ln L\left( y^{(\lambda)} \right) = -\dfrac{n}{2} \ln \sigma^2 - \dfrac{\left( y^{(\lambda)} - X\beta \right)'\left( y^{(\lambda)} - X\beta \right)}{2\sigma^2}.$
Solving

$\dfrac{\partial \ln L\left( y^{(\lambda)} \right)}{\partial \beta} = 0, \qquad \dfrac{\partial \ln L\left( y^{(\lambda)} \right)}{\partial \sigma^2} = 0$

gives the maximum likelihood estimators

$\hat\beta(\lambda) = (X'X)^{-1} X' y^{(\lambda)},$

$\hat\sigma^2(\lambda) = \dfrac{1}{n}\, y^{(\lambda)\prime} \left[ I - X(X'X)^{-1}X' \right] y^{(\lambda)} = \dfrac{y^{(\lambda)\prime} H y^{(\lambda)}}{n}, \qquad H = I - X(X'X)^{-1}X',$

for a given value of $\lambda$.

Substituting these estimates into the log-likelihood function $\ln L\left( y^{(\lambda)} \right)$ gives, up to a constant,

$L(\lambda) = -\dfrac{n}{2} \ln \hat\sigma^2(\lambda) = -\dfrac{n}{2} \ln\left[ SS_{res}(\lambda) \right]$

where $SS_{res}(\lambda)$ is the residual sum of squares, which is a function of $\lambda$. Now maximize $L(\lambda)$ with respect to $\lambda$. It is difficult to obtain any closed form of the estimator of $\lambda$, so we maximize it numerically.

The function $-\dfrac{n}{2} \ln\left[ SS_{res}(\lambda) \right]$ is called the Box-Cox objective function.

Let max be the value of  which minimizes the Box-Cox objective function. Then under fairly general

conditions, for any other 


n ln  SS r e s ( )   n ln  SS r e s (max ) 

has approximately  2 (1) distribution. This result is based on the large sample behaviour of the likelihood
ratio statistic. This is explained as follows:

The likelihood ratio test statistic in our case is

$\Lambda_n = \dfrac{\max_{\omega} L}{\max_{\Omega} L} = \dfrac{\left( \dfrac{1}{\hat\sigma^2(\lambda)} \right)^{n/2}}{\left( \dfrac{1}{\hat\sigma^2(\lambda_{\max})} \right)^{n/2}} = \left( \dfrac{1/SS_{res}(\lambda)}{1/SS_{res}(\lambda_{\max})} \right)^{n/2} = \left( \dfrac{SS_{res}(\lambda_{\max})}{SS_{res}(\lambda)} \right)^{n/2}$

(the restricted and unrestricted maxima, respectively). Then

$\ln \Lambda_n = \dfrac{n}{2} \ln\left( \dfrac{SS_{res}(\lambda_{\max})}{SS_{res}(\lambda)} \right) = -\dfrac{n}{2} \ln\left[ SS_{res}(\lambda) \right] + \dfrac{n}{2} \ln\left[ SS_{res}(\lambda_{\max}) \right] = L(\lambda) - L(\lambda_{\max})$

where

$L(\lambda) = -\dfrac{n}{2} \ln\left[ SS_{res}(\lambda) \right], \qquad L(\lambda_{\max}) = -\dfrac{n}{2} \ln\left[ SS_{res}(\lambda_{\max}) \right].$

Since, under certain regularity conditions, $-2 \ln \Lambda_n$ converges in distribution to $\chi^2(1)$ when the null hypothesis is true, we have

$-2 \ln \Lambda_n \sim \chi^2(1)$

or $-\ln \Lambda_n \sim \dfrac{\chi^2(1)}{2}$

or $L(\lambda_{\max}) - L(\lambda) \sim \dfrac{\chi^2(1)}{2}.$

Computational procedure
The maximum likelihood estimate of $\lambda$ corresponds to the value of $\lambda$ for which the residual sum of squares from the fitted model, $SS_{res}(\lambda)$, is minimum. To determine such a $\lambda$, we proceed computationally as follows (a sketch in code follows this list):

- Fit $y^{(\lambda)}$ for various values of $\lambda$. For example, start with values in $(-1, 1)$, then take values in $(-2, 2)$, and so on. About 15 to 20 values of $\lambda$ are expected to be sufficient for estimating the optimum value.
- Plot $SS_{res}(\lambda)$ versus $\lambda$.
- From the graph, find the value of $\lambda$ which minimizes $SS_{res}(\lambda)$.
- A second iteration can be performed using a finer mesh of values if desired.
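A minimal sketch of this grid search, assuming NumPy; the data are simulated so that the true $\lambda$ is near 0.5:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 80
x = rng.uniform(1.0, 10.0, n)
X = np.column_stack([np.ones(n), x])
y = (1.0 + 0.5 * x + rng.normal(0.0, 0.1, n))**2   # true lambda near 0.5

def ss_res(lam):
    # Scaled transform V so that SS_res is comparable across lambda values.
    gm = np.exp(np.mean(np.log(y)))
    if abs(lam) < 1e-8:
        v = gm * np.log(y)
    else:
        v = (y**lam - 1.0) / (lam * gm**(lam - 1.0))
    resid = v - X @ np.linalg.lstsq(X, v, rcond=None)[0]
    return resid @ resid

# Coarse mesh first, then a finer mesh around the coarse minimizer.
coarse = np.linspace(-2.0, 2.0, 17)
lam0 = coarse[np.argmin([ss_res(l) for l in coarse])]
fine = np.linspace(lam0 - 0.25, lam0 + 0.25, 21)
lam_hat = fine[np.argmin([ss_res(l) for l in fine])]
print(lam_hat)   # should be near 0.5
```

In practice one would also plot $SS_{res}(\lambda)$ versus $\lambda$ and then settle on a simple nearby value of $\lambda$, as noted next.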

Note that the value of $\lambda$ cannot be selected by directly comparing the residual sums of squares from the regressions of $y^\lambda$ on $x$, because for each $\lambda$ the residual sum of squares is measured on a different scale.

It is better to use simple values of $\lambda$. For example, the practical difference between $\lambda = 0.5$ and $\lambda = 0.58$ is likely to be small, but $\lambda = 0.5$ is much easier to interpret.

Once $\lambda$ is selected, use

- $y^\lambda$ as the study variable if $\lambda \neq 0$,
- $\ln y$ as the study variable if $\lambda = 0$.

It is entirely acceptable to use $y^{(\lambda)}$ as the response in the final model. This model will have a scale difference and an origin shift in comparison with the model using $y^\lambda$ (or $\ln y$) as the response.

An approximate confidence interval for $\lambda$
We can find an approximate confidence interval for the transformation parameter $\lambda$. This interval helps in selecting the final value of $\lambda$. For example, suppose $\hat\lambda = 0.58$ is the value of $\lambda$ minimizing the residual sum of squares. If $\lambda = 0.5$ lies in the confidence interval, then one may use the square root transformation because it is easier to explain. Furthermore, if $\lambda = 1$ lies in the confidence interval, then it may be concluded that no transformation is necessary.

In applying the method of maximum likelihood to the regression model, we are essentially maximizing

$L(\lambda) = -\dfrac{n}{2} \ln\left[ SS_{res}(\lambda) \right]$

or, equivalently, minimizing $SS_{res}(\lambda)$.

An approximate $100(1 - \alpha)\%$ confidence interval for $\lambda$ consists of those values of $\lambda$ that satisfy

$L(\hat\lambda) - L(\lambda) \le \dfrac{\chi^2_\alpha(1)}{2}$

where $\chi^2_\alpha(1)$ is the upper $\alpha$ percentage point of the chi-square distribution with one degree of freedom.

The approximate confidence interval is constructed using the following steps:

- Draw a plot of $L(\lambda)$ versus $\lambda$.
- Draw a horizontal line at the height

  $L(\hat\lambda) - \dfrac{\chi^2_\alpha(1)}{2}$

  on the vertical scale.
- This line cuts the curve of $L(\lambda)$ at two points.
- The locations of these two points on the $\lambda$-axis define the two endpoints of the approximate confidence interval.
- If the residual sum of squares is minimized instead and $SS_{res}(\lambda)$ versus $\lambda$ is plotted, then the line must be drawn at the height

  $SS^* = SS_{res}(\hat\lambda)\, \exp\left( \dfrac{\chi^2_\alpha(1)}{n} \right)$

where $\hat\lambda$ is the value of $\lambda$ which minimizes the residual sum of squares. To see why:

 2 (1) n  2 (1)
L(ˆ )     ln  SSr e s (ˆ )   
2 2 2
n  (1) 
 
2
  ln SS Re s (ˆ )   
2 n 
n    2 (1)  
2 
 
  ln SS r e s (ˆ )  ln exp    
  n  
n   ˆ  2 (1)  
  ln  SSr e s ( ).exp   
2    n  
n
  ln SS * .
2
Using the expansion of the exponential function,

$\exp(t) = 1 + t + \dfrac{t^2}{2!} + \cdots \approx 1 + t,$

we can approximate $\exp\left( \dfrac{\chi^2_\alpha(1)}{n} \right)$ by $1 + \dfrac{\chi^2_\alpha(1)}{n}$. So, in place of $\exp\left( \dfrac{\chi^2_\alpha(1)}{n} \right)$ in the confidence interval procedure, we can use any of the following:

$1 + \dfrac{Z^2_{\alpha/2}}{\nu} \quad \text{or} \quad 1 + \dfrac{Z^2_{\alpha/2}}{n},$

$1 + \dfrac{t^2_{\alpha/2,\,\nu}}{\nu} \quad \text{or} \quad 1 + \dfrac{t^2_{\alpha/2,\,\nu}}{n},$

$1 + \dfrac{\chi^2_\alpha(1)}{\nu} \quad \text{or} \quad 1 + \dfrac{\chi^2_\alpha(1)}{n},$

where $\nu$ is the degrees of freedom associated with the residual sum of squares. These expressions are based on the fact that $\chi^2_\alpha(1) = Z^2_{\alpha/2} \approx t^2_{\alpha/2,\,\nu}$ unless $\nu$ is small. It is debatable whether to use $\nu$ or $n$, but practically the difference between the resulting confidence intervals is very little.
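A minimal sketch of the $SS^*$ cutoff method, assuming NumPy and SciPy (it uses the exact exponential factor rather than the $1 + t$ approximation; the data are simulated as in the earlier sketch):

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(3)
n = 80
x = rng.uniform(1.0, 10.0, n)
X = np.column_stack([np.ones(n), x])
y = (1.0 + 0.5 * x + rng.normal(0.0, 0.1, n))**2

def ss_res(lam):
    gm = np.exp(np.mean(np.log(y)))
    if abs(lam) < 1e-8:
        v = gm * np.log(y)
    else:
        v = (y**lam - 1.0) / (lam * gm**(lam - 1.0))
    resid = v - X @ np.linalg.lstsq(X, v, rcond=None)[0]
    return resid @ resid

grid = np.linspace(-1.0, 2.0, 301)
ss = np.array([ss_res(l) for l in grid])
lam_hat = grid[np.argmin(ss)]

# Horizontal cut at SS* = SS_res(lam_hat) * exp(chi2_alpha(1)/n); the lambdas
# with SS_res(lambda) <= SS* form the approximate 95% confidence interval.
ss_star = ss.min() * np.exp(chi2.ppf(0.95, df=1) / n)
inside = grid[ss <= ss_star]
print(lam_hat, inside.min(), inside.max())
```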

The Box-Cox transformation was originally introduced to reduce non-normality in the data; it also helps in reducing nonlinearity. The approach is to find a transformation that attempts to reduce the residuals associated with outliers and also to reduce the problem of non-constant error variance, provided there was no acute nonlinearity to begin with.

Transformation on explanatory variables: the Box-Tidwell procedure
Suppose the relationship between $y$ and one or more of the explanatory variables is nonlinear, while the other usual assumptions (normally and independently distributed study variable with constant variance) are at least approximately satisfied.

We want to select an appropriate transformation on the explanatory variables so that the relationship between $y$ and the transformed explanatory variables is as simple as possible.

Box and Tidwell describe a general analytical procedure for determining the form of the transformation on $x$.

Suppose the study variable $y$ is related to powers of the explanatory variables. The Box-Tidwell procedure chooses the transformed explanatory variables as

$z_{ij} = \begin{cases} \dfrac{x_{ij}^{\alpha_j} - 1}{\alpha_j} & \text{when } \alpha_j \neq 0 \\ \ln x_{ij} & \text{when } \alpha_j = 0, \end{cases} \qquad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, k.$

We need to estimate the $\alpha_j$'s. Since the dependent variable is not being transformed, we need not worry about changes of scale, and we minimize

$\sum_{i=1}^{n} \left( y_i - \beta_0 - \beta_1 z_{i1} - \cdots - \beta_k z_{ik} \right)^2$

using nonlinear least-squares techniques.

We consider this for the simple linear regression model instead of a nonlinear regression model.

Assume $y$ is related to $\xi = x^\alpha$ as

$E(y) = f(\xi, \beta_0, \beta_1) = \beta_0 + \beta_1 \xi$

where

$\xi = \begin{cases} x^\alpha & \text{if } \alpha \neq 0 \\ \ln x & \text{if } \alpha = 0 \end{cases}$

and $\beta_0$, $\beta_1$ and $\alpha$ are the unknown parameters.

Suppose $\alpha_0$ is the initial guess of the constant $\alpha$. Usually, the first guess is $\alpha_0 = 1$, so that $\xi = x$ and no transformation is applied in the first iteration.
Expanding about the initial guess in a Taylor series and ignoring terms of order higher than one gives

$E(y) = f(\alpha_0, \beta_0, \beta_1) + (\alpha - \alpha_0) \left[ \dfrac{df(\xi, \beta_0, \beta_1)}{d\alpha} \right]_{\alpha = \alpha_0} = \beta_0 + \beta_1 x + (\alpha - 1) \left[ \dfrac{df(\xi, \beta_0, \beta_1)}{d\alpha} \right]_{\alpha = \alpha_0 = 1}.$

If the term $\left[ \dfrac{df(\xi, \beta_0, \beta_1)}{d\alpha} \right]_{\alpha = \alpha_0}$ were known, it could be treated just like an additional explanatory variable, and the parameters $\beta_0$, $\beta_1$ and $\alpha$ could then be estimated by the least-squares method.

The estimate of $\alpha$ so obtained can be considered an improved estimate of the transformation parameter.

The derivative term can be written as

$\left[ \dfrac{df(\xi, \beta_0, \beta_1)}{d\alpha} \right]_{\alpha = \alpha_0} = \left[ \dfrac{df(\xi, \beta_0, \beta_1)}{d\xi} \right]_{\alpha = \alpha_0} \left[ \dfrac{d\xi}{d\alpha} \right]_{\alpha = \alpha_0}.$

Since the form of the transformation is known, i.e., $\xi = x^\alpha$, we have

$\dfrac{d\xi}{d\alpha} = x^\alpha \ln x.$

Furthermore, at $\alpha_0 = 1$ (so that $\xi = x$),

$\left[ \dfrac{df(\xi, \beta_0, \beta_1)}{d\xi} \right]_{\alpha = \alpha_0} = \dfrac{d(\beta_0 + \beta_1 x)}{dx} = \beta_1.$

So $\beta_1$ can be estimated by fitting the model

$\hat{y} = \hat\beta_0 + \hat\beta_1 x$

by the least-squares method. Then an "adjustment" to the initial guess $\alpha_0 = 1$ is computed by defining a second regression variable

$w = x \ln x$

and estimating the parameters in

$E(y) = \beta_0^* + \beta_1^* x + (\alpha - 1)\beta_1 w = \beta_0^* + \beta_1^* x + \gamma w$

by least squares.

This gives the following:

$\hat{y} = \hat\beta_0^* + \hat\beta_1^* x + \hat\gamma w$

$\hat\gamma = (\hat\alpha - 1)\hat\beta_1$

or

$\hat\alpha_1 = \dfrac{\hat\gamma}{\hat\beta_1} + 1$

as the revised estimate of $\alpha$.

Note that $\hat\beta_1$ is obtained from $\hat{y} = \hat\beta_0 + \hat\beta_1 x$, while $\hat\gamma$ is obtained from $\hat{y} = \hat\beta_0^* + \hat\beta_1^* x + \hat\gamma w$. Generally, $\hat\beta_1$ and $\hat\beta_1^*$ will differ.

This procedure may be repeated, using a new regressor $x^* = x^{\alpha_1}$ in the calculations. It generally converges rapidly, and usually the first-stage result $\alpha_1$ is a satisfactory estimate of $\alpha$. Round-off error is a potential problem: if enough decimal places are not carried, the successive values of $\alpha$ may oscillate badly. If the standard deviation of the errors ($\sigma$) is large, or if the range of the explanatory variable is very small relative to its mean, the estimator may face convergence problems; this situation implies that the data do not support the need for any transformation. A sketch of the iteration follows.
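A minimal sketch of the full iteration, assuming NumPy; the data are simulated from $E(y) = 2 + 3\sqrt{x}$, so the procedure should converge to $\alpha \approx 0.5$. Repeating the stage with the new regressor $x^{\alpha_1}$ amounts to composing each stage's adjustment multiplicatively into the overall exponent:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100
x = rng.uniform(1.0, 10.0, n)
y = 2.0 + 3.0 * np.sqrt(x) + rng.normal(0.0, 0.1, n)   # true alpha = 0.5

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

alpha = 1.0                                   # initial guess: no transformation
for _ in range(10):
    z = x**alpha                              # current regressor x^alpha
    b1 = ols(np.column_stack([np.ones(n), z]), y)[1]       # fit y on z: beta1-hat
    w = z * np.log(z)                                      # second regressor w = z ln z
    g = ols(np.column_stack([np.ones(n), z, w]), y)[2]     # gamma-hat
    delta = g / b1 + 1.0                      # revised exponent on z
    alpha *= delta                            # compose into the overall exponent on x
    if abs(delta - 1.0) < 1e-6:               # stop when the adjustment is negligible
        break

print(alpha)                                  # close to 0.5
```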

