Chapter-15: Research Methodology

This document discusses correlation and regression analysis. It defines correlation as the degree of association between two or more variables and describes positive, negative, and zero correlation. Simple linear regression is expressed as Y = α + βX + U and is estimated using ordinary least squares to minimize error sum of squares. The significance of the regression parameters and goodness of fit are tested using t-statistics and F-statistics, respectively.

Uploaded by

sairam

DR NEENA SONDHI

CHAPTER-15
CORRELATION AND REGRESSION ANALYSIS
DR DEEPAK CHAWLA

RESEARCH METHODOLOGY CONCEPTS AND CASES


SLIDE 15-1

What is Correlation?

Correlation measures the degree of association between two or more variables. When we are dealing with two variables, we are talking in terms of simple correlation; when more than two variables are involved, the subject matter of interest is called multiple correlation.

SLIDE 15-2

Types of Correlation
• Positive correlation: When two variables X and Y move in the same direction, the correlation between the two is positive.
• Negative correlation: When two variables X and Y move in opposite directions, the correlation between the two is negative.
• Zero correlation: The correlation between two variables X and Y is zero when the variables move with no connection to each other.


SLIDE 15-3

Graphical Presentation of Positive Correlation
[Figure]


SLIDE 15-4

Graphical Presentation of Negative Correlation
[Figure]


SLIDE 15-5

Graphical Presentation of Zero Correlation
[Figure]


SLIDE 15-6
Quantitative Estimate of a Linear Correlation
• A quantitative estimate of the linear correlation between two variables X and Y is given by Karl Pearson as:
$$ r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \, \sum (Y - \bar{Y})^2}} $$
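As a minimal sketch of how Karl Pearson's coefficient can be computed, the Python function below uses the standard deviation-from-mean form of the formula. The function name `pearson_r` and the data are hypothetical, made up for illustration only.

```python
import math

def pearson_r(x, y):
    """Sample linear correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, made up for illustration only.
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]    # moves in the same direction as x
y_down = [10, 8, 6, 4, 2]  # moves in the opposite direction to x
print(pearson_r(x, y_up))    # close to +1: points lie on a positively sloped line
print(pearson_r(x, y_down))  # close to -1: points lie on a negatively sloped line
```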


SLIDE 15-7
Quantitative Estimate of a Linear Correlation
• The linear correlation coefficient takes a value between –1 and +1 (both values inclusive).
• If the value of the correlation coefficient is equal to 1, the two variables are perfectly positively correlated and the scatter of the points of the variables X and Y will lie on a positively sloped straight line.
• Similarly, if the correlation coefficient between the two variables X and Y is –1, the scatter of the points of these variables will lie on a negatively sloped straight line, and such a correlation is called a perfectly negative correlation.


SLIDE 15-8
Testing the significance of the correlation coefficient
The hypotheses to be tested are:
H0 : ρ = 0
H1 : ρ ≠ 0
The test statistic is given by:
$$ t = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^2}} $$
where
ρ = population correlation coefficient between the variables X and Y
r = sample correlation coefficient between the variables X and Y
n – 2 = the degrees of freedom

For a given level of significance, if the computed |t| is greater than the tabulated t with n – 2 degrees of freedom, the null hypothesis of no correlation between X and Y is rejected.
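The test can be sketched in Python as follows, using the statistic commonly used for this purpose, t = r√(n – 2)/√(1 – r²). The values of r and n are hypothetical; the tabulated value 2.060 is the two-tailed 5% critical value of t for 25 degrees of freedom from a standard t table and is supplied by hand rather than computed.

```python
import math

def correlation_t_statistic(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    if n <= 2:
        raise ValueError("need more than 2 observations")
    if abs(r) >= 1:
        raise ValueError("t is undefined when |r| = 1")
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

# Hypothetical sample: r = 0.6 computed from n = 27 pairs of observations.
r, n = 0.6, 27
t_computed = correlation_t_statistic(r, n)
t_tabulated = 2.060  # assumption: read from a t table (5%, two-tailed, 25 df)

print(round(t_computed, 2))
if abs(t_computed) > t_tabulated:
    print("Reject H0: the correlation is significant")
else:
    print("Do not reject H0")
```

With these numbers the computed t is about 3.75, which exceeds the tabulated value, so the null hypothesis of no correlation would be rejected.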
SLIDE 15-9

Simple Regression Analysis


The simple linear regression equation can be presented as:
Y = α + βX + U
where
U = stochastic error term
α, β = parameters to be estimated
• The equation is estimated using the ordinary least squares (OLS) method of estimation.
• The OLS method of estimation states that the regression line should be drawn so as to minimize the error sum of squares.

SLIDE 15-10

Simple Regression Analysis


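The OLS method of estimation results in the following two normal equations:
$$ \sum Y = n\hat{\alpha} + \hat{\beta} \sum X $$
$$ \sum XY = \hat{\alpha} \sum X + \hat{\beta} \sum X^2 $$
Solving the above normal equations results in:
$$ \hat{\beta} = \frac{n \sum XY - \sum X \sum Y}{n \sum X^2 - \left(\sum X\right)^2} $$
Once β̂ is estimated, the value of α̂ may be computed as:
$$ \hat{\alpha} = \bar{Y} - \hat{\beta}\bar{X} $$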
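A minimal Python sketch of the OLS solution for the simple regression Y = α + βX + U is given below. It solves the two normal equations in their usual raw-sums form; the function name `ols_simple` and the data are hypothetical.

```python
def ols_simple(x, y):
    """Return (alpha_hat, beta_hat) that minimize the error sum of squares."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_xx = sum(xi * xi for xi in x)
    denom = n * sum_xx - sum_x ** 2
    if denom == 0:
        raise ValueError("X has no variation, so the slope is not identified")
    beta_hat = (n * sum_xy - sum_x * sum_y) / denom
    # The fitted line passes through the point of means (X-bar, Y-bar).
    alpha_hat = sum_y / n - beta_hat * sum_x / n
    return alpha_hat, beta_hat

# Hypothetical data lying exactly on the line Y = 1 + 2X.
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]
alpha_hat, beta_hat = ols_simple(x, y)
print(alpha_hat, beta_hat)
```

Because the illustrative points lie exactly on a line, the estimates recover the intercept 1 and the slope 2.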


SLIDE 15-11

Simple Regression Analysis


• The estimate of the error (residual) term is obtained as:
$$ e = Y - \hat{Y} $$
where Ŷ = estimated value of the dependent variable and Y = observed value of the dependent variable.
• The estimate of the variance of the error term is given by:
$$ \hat{\sigma}^2 = \frac{\sum e^2}{n - k} $$
where n and k denote the sample size and the number of parameters to be estimated, respectively.
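The residuals and the estimated error variance can be sketched in Python as below, assuming the usual estimator (residual sum of squares divided by n – k). The data are hypothetical, and the values 2.2 and 0.6 are the OLS intercept and slope for that hypothetical data, taken here as already estimated.

```python
def residuals(x, y, alpha_hat, beta_hat):
    """Residual e = Y - Y_hat for each observation."""
    return [yi - (alpha_hat + beta_hat * xi) for xi, yi in zip(x, y)]

def error_variance(resid, k):
    """Estimated variance of the error term: sum of squared residuals / (n - k)."""
    n = len(resid)
    if n <= k:
        raise ValueError("need more observations than estimated parameters")
    return sum(e * e for e in resid) / (n - k)

# Hypothetical data; alpha_hat and beta_hat are assumed to come from an OLS fit.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
alpha_hat, beta_hat = 2.2, 0.6

e = residuals(x, y, alpha_hat, beta_hat)
sigma2_hat = error_variance(e, k=2)  # k = 2: intercept and slope
print([round(v, 2) for v in e])
print(round(sigma2_hat, 4))
```

For this data the squared residuals sum to about 2.4, so with n = 5 and k = 2 the estimated error variance is about 0.8.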


SLIDE 15-12
Test of Significance of Regression Parameters
The hypotheses to be tested for the slope coefficient are:
H0 : β = 0
H1 : β ≠ 0

The test statistic used to test the significance of the slope coefficient is given by:
$$ t = \frac{\hat{\beta}}{SE(\hat{\beta})} $$
where SE(β̂) is the standard error of the estimated slope coefficient.

At a given level of significance, the computed |t| is compared with the tabulated t to decide whether or not to reject the null hypothesis.
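A sketch of the slope test in Python follows. It assumes the standard textbook standard error for simple regression, SE(β̂) = √(σ̂² / Σ(X – X̄)²), which is not stated in the text above; the data are hypothetical, and the tabulated value 3.182 (5%, two-tailed, 3 degrees of freedom) is read from a standard t table.

```python
import math

def slope_t_statistic(x, y):
    """Return (beta_hat, standard_error, t) for the simple regression of y on x."""
    n = len(x)
    k = 2  # parameters estimated: intercept and slope
    if n != len(y) or n <= k:
        raise ValueError("need equal-length data with more than 2 observations")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("X has no variation")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta_hat = sxy / sxx
    alpha_hat = y_bar - beta_hat * x_bar
    rss = sum((yi - alpha_hat - beta_hat * xi) ** 2 for xi, yi in zip(x, y))
    sigma2_hat = rss / (n - k)               # estimated error variance
    se_beta = math.sqrt(sigma2_hat / sxx)    # assumed standard-error formula
    if se_beta == 0:
        raise ValueError("perfect fit: t is undefined")
    return beta_hat, se_beta, beta_hat / se_beta

# Hypothetical data, for illustration only.
beta_hat, se_beta, t_computed = slope_t_statistic([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
t_tabulated = 3.182  # assumption: from a t table (5%, two-tailed, n - k = 3 df)
print(round(t_computed, 3))
print("Reject H0" if abs(t_computed) > t_tabulated else "Do not reject H0")
```

Here the computed t is about 2.12, below the tabulated value, so the null hypothesis β = 0 would not be rejected for this small illustrative sample.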


SLIDE 15-13
Goodness of fit of regression equation
• r² is used to measure the goodness of fit of a regression equation. This measure is also called the coefficient of determination of a regression equation, and it takes a value between 0 and 1. The higher the value of r², the better the goodness of fit.

SLIDE 15-14

Testing the Significance of r²

The hypotheses to be tested are:
H0 : r² = 0
H1 : r² > 0
The test statistic F is given by the expression:
$$ F = \frac{r^2 / (k - 1)}{(1 - r^2) / (n - k)} $$
For a given level of significance α, the computed value of the F statistic is compared with the tabulated value of F with k – 1 degrees of freedom in the numerator and n – k degrees of freedom in the denominator. If the computed F exceeds the tabulated F, the null hypothesis is rejected in favour of the alternative hypothesis.
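The coefficient of determination and its F test can be sketched in Python as below. The sketch assumes r² is computed as 1 minus the ratio of the residual sum of squares to the total sum of squares (valid for an OLS fit with an intercept) and that F = [r²/(k – 1)] / [(1 – r²)/(n – k)]. The data and fitted values are hypothetical, and the tabulated value 10.13 is the 5% critical value of F with (1, 3) degrees of freedom from a standard F table.

```python
def r_squared(y, y_hat):
    """Coefficient of determination: 1 - residual SS / total SS."""
    y_bar = sum(y) / len(y)
    tss = sum((yi - y_bar) ** 2 for yi in y)
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    if tss == 0:
        raise ValueError("Y has no variation")
    return 1 - rss / tss

def f_statistic(r2, n, k):
    """F statistic with (k - 1, n - k) degrees of freedom."""
    if k < 2 or n <= k:
        raise ValueError("need k >= 2 and n > k")
    if r2 >= 1:
        raise ValueError("F is undefined for a perfect fit")
    return (r2 / (k - 1)) / ((1 - r2) / (n - k))

# Hypothetical observed values and OLS fitted values (Y_hat = 2.2 + 0.6 X).
y = [2, 4, 5, 4, 5]
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]

r2 = r_squared(y, y_hat)
f_computed = f_statistic(r2, n=len(y), k=2)
f_tabulated = 10.13  # assumption: from an F table (5%, 1 and 3 df)
print(round(r2, 3), round(f_computed, 3))
print("Reject H0" if f_computed > f_tabulated else "Do not reject H0")
```

In this illustration r² is about 0.6 and F is about 4.5, so the null hypothesis would not be rejected at the 5% level.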


SLIDE 15-15

Multiple Regression Model


In the multiple regression model, there are at least two independent variables. The linear multiple regression model with two independent variables would look like:
Y = b0 + b1X1 + b2X2 + U
where b0, b1 and b2 are the parameters to be estimated.

The estimation is carried out using the OLS method, which results in the following:
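One standard way to write the OLS estimates for the two-regressor model uses deviations of each variable from its mean. The Python sketch below follows that form; it is an illustration under that assumption, not necessarily the exact expressions shown on the original slide, and the function name and data are hypothetical.

```python
def ols_two_regressors(y, x1, x2):
    """Return (b0, b1, b2) for Y = b0 + b1*X1 + b2*X2 + U, estimated by OLS."""
    n = len(y)
    if not (n == len(x1) == len(x2)) or n < 3:
        raise ValueError("need equal-length data with at least 3 observations")
    y_bar, x1_bar, x2_bar = sum(y) / n, sum(x1) / n, sum(x2) / n
    # Deviations from the sample means.
    dy = [v - y_bar for v in y]
    d1 = [v - x1_bar for v in x1]
    d2 = [v - x2_bar for v in x2]
    s11 = sum(a * a for a in d1)
    s22 = sum(b * b for b in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * c for a, c in zip(d1, dy))
    s2y = sum(b * c for b, c in zip(d2, dy))
    det = s11 * s22 - s12 ** 2
    if det == 0:
        raise ValueError("X1 and X2 are perfectly collinear")
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    b0 = y_bar - b1 * x1_bar - b2 * x2_bar
    return b0, b1, b2

# Hypothetical data generated exactly from Y = 1 + 2*X1 + 3*X2.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 6]
y = [9, 8, 19, 18, 29]
print(ols_two_regressors(y, x1, x2))
```

Because the illustrative data contain no error term, the estimates recover b0 = 1, b1 = 2 and b2 = 3 up to floating-point rounding.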


SLIDE 15-16

Multiple Regression Model

• In the multiple regression model, we have the concept of the squared multiple correlation, denoted $R^2_{Y.X_1X_2}$, which indicates the explanatory power of the model. The various formulae for R² are given as under:

• The test of significance of the individual parameters is conducted using the t statistic. To be able to use the t statistic, we need estimates of the variances of the estimated coefficients of the regression equation.

SLIDE 15-17

Multiple Regression Model

• The estimates of the variances of the estimated coefficients are presented below:

SLIDE 15-18

Multiple Regression Model


Let us assume that we want to test the significance of the slope coefficient of the variable X1. We can write the null and alternative hypotheses as:
H0 : b1 = 0
H1 : b1 ≠ 0
The test statistic may be written as:
$$ t = \frac{\hat{b}_1}{SE(\hat{b}_1)} $$
where SE(b̂1) is the standard error of the estimated coefficient.

The value of the test statistic t is computed and compared with the table value of t for a given level of significance. If the computed value of |t| is greater than the table value of t, we reject H0 in favour of the alternative hypothesis H1.
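The test for b1 can be sketched in Python as below. The sketch assumes the standard two-regressor result Var(b̂1) = σ̂² Σx2² / [Σx1² Σx2² – (Σx1x2)²], where lowercase letters denote deviations from means and σ̂² is the residual sum of squares divided by n – 3. This variance formula, the function name and the data are assumptions for illustration; the tabulated value 4.303 (5%, two-tailed, 2 degrees of freedom) is read from a standard t table.

```python
import math

def t_for_b1(y, x1, x2):
    """Return (b1, se_b1, t) for the coefficient of X1 in a two-regressor OLS fit."""
    n = len(y)
    k = 3  # parameters estimated: b0, b1, b2
    if not (n == len(x1) == len(x2)) or n <= k:
        raise ValueError("need equal-length data with more than 3 observations")
    y_bar, x1_bar, x2_bar = sum(y) / n, sum(x1) / n, sum(x2) / n
    dy = [v - y_bar for v in y]
    d1 = [v - x1_bar for v in x1]
    d2 = [v - x2_bar for v in x2]
    s11 = sum(a * a for a in d1)
    s22 = sum(b * b for b in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * c for a, c in zip(d1, dy))
    s2y = sum(b * c for b, c in zip(d2, dy))
    det = s11 * s22 - s12 ** 2
    if det == 0:
        raise ValueError("X1 and X2 are perfectly collinear")
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    # Residuals in deviation form (the intercept drops out).
    rss = sum((c - b1 * a - b2 * b) ** 2 for a, b, c in zip(d1, d2, dy))
    sigma2_hat = rss / (n - k)
    se_b1 = math.sqrt(sigma2_hat * s22 / det)
    if se_b1 == 0:
        raise ValueError("perfect fit: t is undefined")
    return b1, se_b1, b1 / se_b1

# Hypothetical data, for illustration only.
b1, se_b1, t_computed = t_for_b1([9, 8, 19, 18, 30], [1, 2, 3, 4, 5], [2, 1, 4, 3, 6])
t_tabulated = 4.303  # assumption: from a t table (5%, two-tailed, n - 3 = 2 df)
print(round(b1, 4), round(se_b1, 4), round(t_computed, 2))
print("Reject H0" if abs(t_computed) > t_tabulated else "Do not reject H0")
```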


SLIDE 15-19

Testing the Significance of R²

• The test for the significance of R² is carried out using the F statistic, which was explained for the two-variable linear regression model. The hypotheses to be tested are:
H0 : b1 = b2 = 0 ⇒ R² = 0
H1 : Not all b's are zero ⇒ R² > 0
• If R² is equal to 0, all the slope coefficients are equal to zero, since none of the independent variables would explain any variation in Y.
• The test for the significance of R² is shown through the analysis of variance (ANOVA) table.

SLIDE 15-20

Testing the Significance of R²
[Table: analysis of variance (ANOVA) table]
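The usual layout of a regression ANOVA table can be sketched in Python as below: the total sum of squares is split into a regression part and a residual part, each is divided by its degrees of freedom, and F is the ratio of the two mean squares. The sums of squares, sample size and function name are hypothetical, and the tabulated value 3.49 is the 5% critical value of F with (2, 20) degrees of freedom from a standard F table.

```python
def anova_table(tss, rss, n, k):
    """Build a regression ANOVA table from total and residual sums of squares."""
    if k < 2 or n <= k:
        raise ValueError("need k >= 2 and n > k")
    if not (0 < rss <= tss):
        raise ValueError("need 0 < residual SS <= total SS")
    ess = tss - rss                  # sum of squares explained by the regression
    df_reg, df_res = k - 1, n - k
    ms_reg = ess / df_reg
    ms_res = rss / df_res
    return {
        "regression": {"ss": ess, "df": df_reg, "ms": ms_reg},
        "residual": {"ss": rss, "df": df_res, "ms": ms_res},
        "total": {"ss": tss, "df": n - 1},
        "F": ms_reg / ms_res,
        "R2": ess / tss,
    }

# Hypothetical figures: two independent variables (k = 3) and n = 23 observations.
table = anova_table(tss=200.0, rss=40.0, n=23, k=3)
f_tabulated = 3.49  # assumption: from an F table (5%, 2 and 20 df)
for source in ("regression", "residual", "total"):
    print(source, table[source])
print("F =", table["F"], "R2 =", table["R2"])
print("Reject H0" if table["F"] > f_tabulated else "Do not reject H0")
```

With these figures R² is 0.8 and F is 40, which exceeds the tabulated value, so the null hypothesis R² = 0 would be rejected.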


SLIDE 15-21
Dummy Variables in Regression Analysis
• In regression analysis, the dependent variable is generally metric in nature and is most often influenced by other metric variables.
• However, there could be situations where the dependent variable is influenced by qualitative variables such as gender, marital status, profession, geographical region, colour, or religion.
• The question arises of how to quantify qualitative variables.
• In situations like this, dummy variables come to our rescue. They are used to quantify the qualitative variables.
• The number of dummy variables required in the regression model is one less than the number of categories of the qualitative variable.
• Dummy variables usually assume two values, 0 and 1.
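The "categories less one" rule can be illustrated with a small Python sketch. The helper `make_dummies` is hypothetical: it creates one 0/1 column for every category except a chosen base category, which is represented by all dummies being 0.

```python
def make_dummies(values, base=None):
    """Return {category: 0/1 list} for every category except the base category."""
    categories = sorted(set(values))
    if base is None:
        base = categories[0]  # arbitrary default choice of base category
    if base not in categories:
        raise ValueError("base must be one of the observed categories")
    return {
        c: [1 if v == c else 0 for v in values]
        for c in categories
        if c != base
    }

# Hypothetical observations of two qualitative variables.
gender = ["male", "female", "female", "male"]
region = ["North", "South", "East", "West", "South"]

print(make_dummies(gender, base="female"))  # 2 categories -> 1 dummy
print(make_dummies(region, base="North"))   # 4 categories -> 3 dummies
```

Leaving out one category avoids an exact linear relationship between the dummies and the intercept term of the regression.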


SLIDE 15-22
Example of a Dummy Variable Regression
Suppose the starting salary of a college lecturer is influenced not only by years of teaching experience but also by gender. Therefore, the model could be specified as:
Y = f(X, D)
where
Y = starting salary of a college lecturer in thousands of rupees (₹) per month
X = number of years of work experience
D = a dummy variable which takes the values
D = 1 (if the respondent is a male)
  = 0 (if the respondent is a female)

The model could be written as:
Y = α + βX + γD + U


SLIDE 15-23
Example of a Dummy Variable Regression
This can be estimated by using ordinary least squares (OLS) techniques. Suppose the estimated regression equation looks like:
$$ \hat{Y} = \hat{\alpha} + \hat{\beta}X + \hat{\gamma}D $$
For a female lecturer (D = 0): Ŷ = α̂ + β̂X
For a male lecturer (D = 1): Ŷ = (α̂ + γ̂) + β̂X

The above two equations differ by the amount γ̂, which can be positive or negative. If γ̂ is positive, it would imply that the average salary of a male lecturer is more than that of a female lecturer by the amount γ̂, keeping the number of years of experience constant.
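The interpretation of γ̂ can be checked numerically with a short Python sketch. The coefficient values below are hypothetical, chosen only for illustration; they are not estimates from any real data.

```python
def predicted_salary(x, d, alpha_hat, beta_hat, gamma_hat):
    """Fitted salary for x years of experience and dummy d (1 = male, 0 = female)."""
    if d not in (0, 1):
        raise ValueError("the dummy variable must be 0 or 1")
    return alpha_hat + beta_hat * x + gamma_hat * d

# Hypothetical estimates, in thousands of rupees per month.
alpha_hat, beta_hat, gamma_hat = 20.0, 1.5, 2.0

for years in (0, 5, 10):
    female = predicted_salary(years, 0, alpha_hat, beta_hat, gamma_hat)
    male = predicted_salary(years, 1, alpha_hat, beta_hat, gamma_hat)
    # At every level of experience the gap equals gamma_hat.
    print(years, female, male, male - female)
```

The two fitted lines are parallel: they share the slope β̂ and differ only in their intercepts, by γ̂.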
END OF CHAPTER