
Chapter 14

Logistic Regression Models

In the linear regression model $y = X\beta + \varepsilon$, there are two types of variables: the explanatory variables $X_1, X_2, \ldots, X_k$ and the study variable $y$. These variables can be measured on a continuous scale as well as through indicator variables. When the explanatory variables are qualitative, their values are expressed as indicator variables, and dummy variable models are used.

When the study variable is a qualitative variable, its values can be expressed using an indicator variable taking only two possible values, 0 and 1. In such a case, logistic regression is used. For example, $y$ can denote outcomes like success or failure, yes or no, like or dislike, which can be coded by the two values 0 and 1.

Consider the model

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_k x_{ik} + \varepsilon_i = x_i'\beta + \varepsilon_i, \qquad i = 1, 2, \ldots, n,$$

where $x_i' = (1, x_{i1}, x_{i2}, \ldots, x_{ik})$ and $\beta' = (\beta_0, \beta_1, \beta_2, \ldots, \beta_k)$.

The study variable takes the two values $y_i = 0$ or $1$. Assume that $y_i$ follows a Bernoulli distribution with parameter $\pi_i$, so its probability distribution is

$$y_i = \begin{cases} 1 & \text{with } P(y_i = 1) = \pi_i \\ 0 & \text{with } P(y_i = 0) = 1 - \pi_i. \end{cases}$$

Assuming $E(\varepsilon_i) = 0$,

$$E(y_i) = 1 \cdot \pi_i + 0 \cdot (1 - \pi_i) = \pi_i.$$

From the model $y_i = x_i'\beta + \varepsilon_i$, we have

$$E(y_i) = x_i'\beta = \pi_i = P(y_i = 1).$$

Thus the response function $E(y_i)$ is simply the probability that $y_i = 1$.

Note that $\varepsilon_i = y_i - x_i'\beta$, so

- when $y_i = 1$, then $\varepsilon_i = 1 - x_i'\beta$;
- when $y_i = 0$, then $\varepsilon_i = -x_i'\beta$.

Recall that earlier $\varepsilon_i$ was assumed to follow a normal distribution when $y$ was not an indicator variable. When $y$ is an indicator variable, $\varepsilon_i$ takes only two values, so it cannot be assumed to follow a normal distribution.

In the usual regression model the errors are homoskedastic, i.e., $\mathrm{Var}(\varepsilon_i) = \sigma^2$, and so $\mathrm{Var}(y_i) = \sigma^2$. When $y$ is an indicator variable, then

$$\begin{aligned}
\mathrm{Var}(y_i) &= E\left[y_i - E(y_i)\right]^2 \\
&= (1 - \pi_i)^2 \pi_i + (0 - \pi_i)^2 (1 - \pi_i) \\
&= \pi_i (1 - \pi_i)\left[(1 - \pi_i) + \pi_i\right] \\
&= \pi_i (1 - \pi_i) \\
&= E(y_i)\left[1 - E(y_i)\right] \\
&= \sigma^2_{y_i}.
\end{aligned}$$

Thus $\mathrm{Var}(y_i)$ depends on $y_i$ and is a function of its mean. Moreover, since $E(y_i) = \pi_i$ and $\pi_i$ is a probability, $0 \le \pi_i \le 1$, and thus there is a constraint on $E(y_i)$, namely $0 \le E(y_i) \le 1$. This puts a severe constraint on the choice of the linear response function: one cannot fit a model whose predicted values lie outside the interval $[0, 1]$.
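As a quick numerical illustration of these two formulas, the following sketch simulates Bernoulli draws and checks that the sample mean approaches $\pi$ and the sample variance approaches $\pi(1-\pi)$; the value $\pi = 0.3$ and the sample size are arbitrary choices, not from the text.

```python
import numpy as np

# Verify E(y) = pi and Var(y) = pi * (1 - pi) by simulation;
# pi = 0.3 is an arbitrary illustrative value.
rng = np.random.default_rng(0)
pi = 0.3
y = rng.binomial(n=1, p=pi, size=100_000)  # Bernoulli(pi) draws

print(y.mean())   # close to pi = 0.3
print(y.var())    # close to pi * (1 - pi) = 0.21
```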

When $y$ is a dichotomous variable, empirical evidence suggests that a function $E(y)$ defined on the whole real line and mapping into $[0, 1]$ has a sigmoid shape, i.e., a nonlinear S-shaped curve.

[Figure: two panels showing S-shaped curves of $E(y)$ against $x$, rising monotonically from 0 to 1.]

A natural choice for $E(y)$ is the cumulative distribution function of a random variable. In particular, the logistic distribution, whose cumulative distribution function is the simple logistic function $F(t) = \exp(t)/\left[1 + \exp(t)\right]$, yields a good link and gives

$$E(y) = \frac{\exp(x'\beta)}{1 + \exp(x'\beta)} = \frac{1}{1 + \exp(-x'\beta)}.$$
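The S-shape can be seen numerically. Below is a minimal sketch of the logistic response function; the grid of linear-predictor values is illustrative.

```python
import numpy as np

# Logistic response E(y) = 1 / (1 + exp(-eta)), evaluated on an
# illustrative grid of linear-predictor values eta = x'beta.
def logistic(eta):
    return 1.0 / (1.0 + np.exp(-eta))

eta = np.linspace(-6, 6, 7)   # -6, -4, ..., 6
print(logistic(eta))          # rises from near 0 to near 1: the S-shape
```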

Linear predictor and link functions:

The systematic component in $E(y)$ is the linear predictor, denoted as

$$\eta_i = \sum_{j=0}^{k} \beta_j x_{ij} = x_i'\beta, \qquad i = 1, 2, \ldots, n, \quad x_{i0} = 1.$$

The link function in a generalized linear model relates the linear predictor $\eta_i$ to the mean response $\mu_i = E(y_i)$. Thus

$$g(\mu_i) = \eta_i \quad \text{or} \quad \mu_i = g^{-1}(\eta_i).$$

In the usual linear model based on a normally distributed study variable, the link $g(\mu_i) = \mu_i$ is used and is called the identity link. A good link function maps the range of $\mu_i$ onto the whole real line, provides a good empirical approximation, and carries a meaningful interpretation in real applications.

In the case of logistic regression, the link function is defined as

$$\eta = \ln\frac{\pi}{1 - \pi}.$$

This transformation is called the logit transformation of the probability $\pi$, and the ratio $\dfrac{\pi}{1 - \pi}$ is called the odds. The link $\eta$ is therefore also called the log-odds. This link function is obtained as follows:

$$\pi = \frac{1}{1 + \exp(-\eta)}$$

or $\pi\left[1 + \exp(-\eta)\right] = 1$

or $\exp(-\eta) = \dfrac{1 - \pi}{\pi}$

or $\eta = \ln\dfrac{\pi}{1 - \pi}.$
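The derivation above says the logit and the logistic function are inverses of each other; the sketch below checks this numerically for a few illustrative probabilities.

```python
import numpy as np

# The logit link and the logistic function undo each other.
def logit(pi):
    return np.log(pi / (1.0 - pi))     # eta = ln(pi / (1 - pi)), the log-odds

def inv_logit(eta):
    return 1.0 / (1.0 + np.exp(-eta))

pi = np.array([0.1, 0.5, 0.9])         # illustrative probabilities
print(inv_logit(logit(pi)))            # recovers [0.1, 0.5, 0.9]
```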

Note: Similar to the logit function, there are other functions that have the same shape as the logistic function, and $\pi$ can also be transformed through them. Two such popular functions are the probit transformation and the complementary log-log transformation. The probit transformation is based on transforming $\pi$ using the cumulative distribution function of the normal distribution, and the probit regression model is based on it.

The complementary log-log transformation of $\pi$ is $\ln\left[-\ln(1 - \pi)\right]$.
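For comparison, a small sketch evaluating all three transformations at an arbitrary probability; `scipy.stats.norm.ppf` (the inverse normal CDF) plays the role of the probit transformation here.

```python
import numpy as np
from scipy.stats import norm

# Three common transformations of a probability pi; pi = 0.7 is arbitrary.
pi = 0.7
logit   = np.log(pi / (1 - pi))      # log-odds
probit  = norm.ppf(pi)               # inverse normal CDF
cloglog = np.log(-np.log(1 - pi))    # complementary log-log
print(logit, probit, cloglog)
```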

Maximum likelihood estimation of parameters:

Consider the general form of the logistic regression model

$$y_i = E(y_i) + \varepsilon_i$$

where the $y_i$'s are independent Bernoulli random variables with parameter $\pi_i$, so that

$$E(y_i) = \pi_i = \frac{\exp(x_i'\beta)}{1 + \exp(x_i'\beta)}.$$

The probability mass function of $y_i$ is

$$f_i(y_i) = \pi_i^{y_i}(1 - \pi_i)^{1 - y_i}, \qquad i = 1, 2, \ldots, n, \quad y_i = 0 \text{ or } 1.$$

The likelihood function is

$$L(y_1, y_2, \ldots, y_n; \beta_0, \beta_1, \ldots, \beta_k) = L = \prod_{i=1}^{n} f_i(y_i) = \prod_{i=1}^{n} \pi_i^{y_i}(1 - \pi_i)^{1 - y_i}.$$

$$\begin{aligned}
\ln L &= \sum_{i=1}^{n} \left[\ln \pi_i^{y_i} + \ln(1 - \pi_i)^{1 - y_i}\right] \\
&= \sum_{i=1}^{n} \left[y_i \ln \pi_i + (1 - y_i)\ln(1 - \pi_i)\right] \\
&= \sum_{i=1}^{n} y_i \ln\left(\frac{\pi_i}{1 - \pi_i}\right) + \sum_{i=1}^{n} \ln(1 - \pi_i).
\end{aligned}$$

Since

$$\pi_i = \frac{\exp(x_i'\beta)}{1 + \exp(x_i'\beta)}, \qquad
1 - \pi_i = \frac{1}{1 + \exp(x_i'\beta)}, \qquad
\frac{\pi_i}{1 - \pi_i} = \exp(x_i'\beta), \qquad
\ln\frac{\pi_i}{1 - \pi_i} = x_i'\beta,$$

we obtain

$$\ln L = \sum_{i=1}^{n} y_i x_i'\beta - \sum_{i=1}^{n} \ln\left[1 + \exp(x_i'\beta)\right].$$

Suppose repeated observations are available at each level of the $x$-variables. Let $y_i$ be the number of 1's observed at the $i$th level and $n_i$ the number of trials there. Then

$$\ln L = \sum_{i=1}^{n} y_i \eta_i + \sum_{i=1}^{n} n_i \ln(1 - \pi_i).$$

The maximum likelihood estimate $\hat\beta$ of $\beta$ is obtained by numerical maximization. If $V(\varepsilon) = \Delta$, then asymptotically

$$E(\hat\beta) = \beta, \qquad V(\hat\beta) = (X'\Delta^{-1}X)^{-1}.$$
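Since the text leaves the maximization to numerical methods, here is a minimal sketch using a general-purpose optimizer on the negative log-likelihood; the simulated data and the true $\beta$ used to generate them are assumptions for illustration.

```python
import numpy as np
from scipy.optimize import minimize

# Fit logistic regression by numerically maximizing ln L.
rng = np.random.default_rng(2)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([-0.5, 1.2])                       # illustrative truth
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ beta_true)))

def neg_log_lik(beta):
    eta = X @ beta
    # ln L = sum(y_i * eta_i) - sum(ln(1 + exp(eta_i))); negate for minimize()
    return -(y @ eta - np.sum(np.log1p(np.exp(eta))))

beta_hat = minimize(neg_log_lik, x0=np.zeros(2), method="BFGS").x
print(beta_hat)   # close to beta_true for large n
```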

After obtaining $\hat\beta$, the linear predictor is estimated by

$$\hat\eta_i = x_i'\hat\beta.$$

The fitted value is

$$\hat y_i = \hat\pi_i = \frac{\exp(\hat\eta_i)}{1 + \exp(\hat\eta_i)} = \frac{1}{1 + \exp(-\hat\eta_i)} = \frac{1}{1 + \exp(-x_i'\hat\beta)}.$$
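A small sketch of this last step, with an illustrative $\hat\beta$ and a single observation vector:

```python
import numpy as np

# Fitted linear predictor and fitted probability for one observation;
# beta_hat and x_i are illustrative stand-ins.
beta_hat = np.array([-0.5, 1.2])
x_i = np.array([1.0, 0.8])                  # (1, x_i1)
eta_hat = x_i @ beta_hat                    # fitted linear predictor
y_hat = 1.0 / (1.0 + np.exp(-eta_hat))      # fitted probability pi_hat
print(eta_hat, y_hat)
```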

Interpretation of parameters:

To understand the interpretation of the $\beta$'s in the logistic regression model, first consider the simple case with only one explanatory variable:

$$\eta(x) = \beta_0 + \beta_1 x.$$

After fitting the model, $\hat\beta_0$ and $\hat\beta_1$ are obtained as the estimators of $\beta_0$ and $\beta_1$, respectively. The fitted linear predictor at $x = x_i$ is

$$\hat\eta(x_i) = \hat\beta_0 + \hat\beta_1 x_i,$$

which is the log-odds at $x = x_i$. The fitted value at $x = x_i + 1$ is

$$\hat\eta(x_i + 1) = \hat\beta_0 + \hat\beta_1 (x_i + 1),$$

which is the log-odds at $x = x_i + 1$. Thus

$$\begin{aligned}
\hat\beta_1 &= \hat\eta(x_i + 1) - \hat\eta(x_i) \\
&= \ln\left[\mathrm{odds}(x_i + 1)\right] - \ln\left[\mathrm{odds}(x_i)\right] \\
&= \ln\left[\frac{\mathrm{odds}(x_i + 1)}{\mathrm{odds}(x_i)}\right],
\end{aligned}$$

so that

$$\frac{\mathrm{odds}(x_i + 1)}{\mathrm{odds}(x_i)} = \exp(\hat\beta_1).$$

This is termed the odds ratio: the estimated multiplicative change in the odds of success when the explanatory variable increases by one unit.
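The sketch below verifies this identity numerically; the values of $\hat\beta_0$, $\hat\beta_1$, and $x_i$ are arbitrary illustrative numbers.

```python
import numpy as np

# Check that exp(beta1_hat) equals the odds ratio between x and x + 1.
beta0_hat, beta1_hat, x = -0.5, 0.8, 2.0   # illustrative values

def odds(xv):
    pi = 1.0 / (1.0 + np.exp(-(beta0_hat + beta1_hat * xv)))
    return pi / (1.0 - pi)

print(odds(x + 1) / odds(x))   # equals exp(beta1_hat)
print(np.exp(beta1_hat))
```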

When there is more than one explanatory variable in the model, the interpretation of the $\beta_j$'s is similar to the single-variable case: $\exp(\hat\beta_j)$ is the odds ratio associated with the explanatory variable $x_j$, keeping the other explanatory variables constant. This parallels the interpretation of $\beta_j$ in the multiple linear regression model.

If there is an $m$-unit change in the explanatory variable, then the estimated odds ratio is $\exp(m\hat\beta_j)$.

Test of hypothesis:

The test of hypothesis for the parameters in the logistic regression model is based on asymptotic theory. It is a large-sample likelihood ratio test based on a statistic termed the deviance.

A model that fits the sample data perfectly, containing as many parameters as observations, is termed a saturated model.

The statistic that compares the log-likelihoods of the fitted and saturated models is called the model deviance. It is defined as

$$\lambda(\beta) = 2\ln L(\text{saturated model}) - 2\ln L(\hat\beta)$$

where $\ln L(\cdot)$ is the log-likelihood and $\hat\beta$ is the maximum likelihood estimate of $\beta$.

In the case of the logistic regression model, $y_i = 0$ or $1$ and the $\pi_i$'s are completely unrestricted. So the likelihood is maximized at $\pi_i = y_i$, and the maximum value of $L(\text{saturated model})$ is

$$\max L(\text{saturated model}) = 1 \;\Rightarrow\; \ln\left[\max L(\text{saturated model})\right] = 0.$$
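A one-line numerical confirmation of this fact (the response vector is illustrative; Python's convention $0^0 = 1$ matches the one used here):

```python
import numpy as np

# With pi_i = y_i, each factor pi_i**y_i * (1 - pi_i)**(1 - y_i) equals 1,
# so the saturated model's likelihood is 1 and its log-likelihood is 0.
y = np.array([1, 0, 1, 1, 0], dtype=float)   # illustrative responses
pi = y                                        # unrestricted MLE: pi_i = y_i
L = np.prod(pi**y * (1 - pi)**(1 - y))
print(L)   # 1.0, so ln L(saturated model) = 0
```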

Let $\hat\beta$ be the maximum likelihood estimator of $\beta$; then the log-likelihood is maximized at $\beta = \hat\beta$, and

$$\ln L(\hat\beta) = \sum_{i=1}^{n} y_i x_i'\hat\beta - \sum_{i=1}^{n} \ln\left[1 + \exp(x_i'\hat\beta)\right] \le \ln L(\text{saturated model}).$$

Assuming that the logistic regression function is correct, the large-sample distribution of the likelihood ratio test statistic $\lambda(\beta)$ is approximately $\chi^2(n - p)$, where $p$ is the number of parameters in the fitted model.

A large value of $\lambda(\beta)$ implies that the model is incorrect; a small value implies that the model fits well, nearly as well as the saturated model. Note that the fitted model generally has fewer parameters than the saturated model, which uses all $n$ parameters. Thus, at the $\alpha$ level of significance, the fitted model is judged inadequate if $\lambda(\beta)$ exceeds the upper $\alpha$ point $\chi^2_\alpha(n - p)$.
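Putting the pieces together, here is a sketch of the deviance test: since $\ln L(\text{saturated model}) = 0$ here, the deviance reduces to $-2\ln L(\hat\beta)$. The simulated data, the 0.05 significance level, and the optimizer choice are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import chi2

# Deviance test: lambda(beta) = -2 ln L(beta_hat), compared to chi2(n - p).
rng = np.random.default_rng(3)
n, p = 200, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(X @ np.array([0.3, -0.7])))))

def neg_log_lik(beta):
    eta = X @ beta
    return -(y @ eta - np.sum(np.log1p(np.exp(eta))))

fit = minimize(neg_log_lik, x0=np.zeros(p), method="BFGS")
deviance = 2.0 * fit.fun                 # 2[ln L(sat) - ln L(beta_hat)]
critical = chi2.ppf(0.95, df=n - p)      # upper 5% point of chi2(n - p)
print(deviance, critical, deviance > critical)   # True would suggest lack of fit
```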

