Econometrics ch4
By Damodar N. Gujarati
Prof. M. El-Sakka
Dept of Economics Kuwait University
Yi = β̂1 + β̂2Xi + ûi   (2.6.2)
ûi = Yi − Ŷi   (2.6.3)
That is,
ûi = Yi − Ŷi = Yi − β̂1 − β̂2Xi   (3.1.1)
Now given n pairs of observations on Y and X, we would like to determine the
SRF in such a manner that it is as close as possible to the actual Y. To this
end, we may adopt the following criterion:
Choose the SRF in such a way that the sum of the residuals Σûi = Σ(Yi − Ŷi) is
as small as possible.
The least-squares criterion instead chooses β̂1 and β̂2 so as to minimize the sum of squared residuals:
Σûi² = Σ(Yi − Ŷi)² = Σ(Yi − β̂1 − β̂2Xi)²   (3.1.2)
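As a quick numerical check on this criterion, the sketch below uses a small hypothetical sample (the data are illustrative, not from the text's tables): the residual sum of squares evaluated at the closed-form OLS estimates is never beaten by nearby perturbed coefficient pairs.

```python
# Hypothetical data for illustration (not from the text's tables).
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [3.0, 4.5, 5.0, 7.5, 8.0]
n = len(X)
Xbar, Ybar = sum(X) / n, sum(Y) / n

# Closed-form OLS estimates (deviation form of (3.1.6), plus (3.1.7)).
x = [xi - Xbar for xi in X]
y = [yi - Ybar for yi in Y]
b2 = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
b1 = Ybar - b2 * Xbar

def ssr(beta1, beta2):
    """Residual sum of squares: Sum (Yi - beta1 - beta2*Xi)^2."""
    return sum((yi - beta1 - beta2 * xi) ** 2 for xi, yi in zip(X, Y))

ssr_ols = ssr(b1, b2)
# Perturbing the coefficients in any direction should not lower the SSR.
perturbed = [ssr(b1 + d1, b2 + d2)
             for d1 in (-0.5, 0.0, 0.5) for d2 in (-0.5, 0.0, 0.5)]
print(ssr_ols, min(perturbed))
```

The grid of perturbations includes the OLS point itself, so the minimum over the grid coincides with the OLS value of the criterion.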
The last step in (3.1.7) can be obtained directly from (3.1.4) by simple
algebraic manipulations. Incidentally, note that, by making use of simple
algebraic identities, formula (3.1.6) for estimating β̂2 can be alternatively
expressed as:
β̂2 = Σxiyi / Σxi²
where xi = Xi − X̄ and yi = Yi − Ȳ are deviations from the sample mean values.
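The equivalence of the raw-sum formula (3.1.6) and the deviation form can be verified numerically; the sample below is hypothetical, chosen only for illustration.

```python
# Checking numerically that the raw-sum formula (3.1.6) and its
# deviation-form rewrite give the same slope estimate.
X = [1.0, 2.0, 3.0, 4.0, 5.0]   # hypothetical data
Y = [3.0, 4.5, 5.0, 7.5, 8.0]
n = len(X)
Xbar, Ybar = sum(X) / n, sum(Y) / n

# (3.1.6): b2 = (n*Sum(Xi*Yi) - Sum(Xi)*Sum(Yi)) / (n*Sum(Xi^2) - (Sum Xi)^2)
b2_raw = ((n * sum(a * b for a, b in zip(X, Y)) - sum(X) * sum(Y))
          / (n * sum(a * a for a in X) - sum(X) ** 2))

# Deviation form: b2 = Sum(xi*yi) / Sum(xi^2), xi = Xi - Xbar, yi = Yi - Ybar
x = [xi - Xbar for xi in X]
y = [yi - Ybar for yi in Y]
b2_dev = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

b1 = Ybar - b2_dev * Xbar                 # intercept, as in (3.1.7)
print(b2_raw, b2_dev, b1)
```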
II. They are point estimators; that is, given the sample, each estimator will
provide only a single (point, not interval) value of the relevant population
parameter.
III. Once the OLS estimates are obtained from the sample data, the sample
regression line (Figure 3.1) can be easily obtained.
Summing both sides of this last equality over the sample values and
dividing through by the sample size n gives
Ŷ̄ = Ȳ   (3.1.10)
that is, the mean of the fitted Ŷi equals the mean of the actual Yi.
The SRF (2.6.2),
Yi = β̂1 + β̂2Xi + ûi,
can be expressed in an alternative form where both Y and X are expressed as
deviations from their mean values. To see this, sum (2.6.2) on both sides to
give:
ΣYi = nβ̂1 + β̂2ΣXi + Σûi
= nβ̂1 + β̂2ΣXi   since Σûi = 0   (3.1.11)
Dividing Eq. (3.1.11) through by n, we obtain
Ȳ = β̂1 + β̂2X̄   (3.1.12)
which is the same as (3.1.7). Subtracting Eq. (3.1.12) from (2.6.2), we obtain
Yi − Ȳ = β̂2(Xi − X̄) + ûi
or
yi = β̂2xi + ûi   (3.1.13)
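A short sketch, on hypothetical data, of the three facts this derivation relies on: Σûi = 0, the mean of the fitted values equals Ȳ, and the deviation form (3.1.13) holds observation by observation.

```python
# Verifying, on a hypothetical sample: Sum(ui_hat) = 0, mean of fitted Y
# equals mean of actual Y (3.1.10), and yi = b2*xi + ui_hat (3.1.13).
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [3.0, 4.5, 5.0, 7.5, 8.0]
n = len(X)
Xbar, Ybar = sum(X) / n, sum(Y) / n
x = [xi - Xbar for xi in X]        # deviations from means
y = [yi - Ybar for yi in Y]
b2 = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
b1 = Ybar - b2 * Xbar

Yhat = [b1 + b2 * xi for xi in X]               # fitted values
resid = [yi - fi for yi, fi in zip(Y, Yhat)]    # residuals ui_hat

sum_resid = sum(resid)            # should be 0
Yhat_bar = sum(Yhat) / n          # should equal Ybar
# Deviation form (3.1.13), checked term by term:
gaps = [yi - (b2 * xi + ui) for yi, xi, ui in zip(y, x, resid)]
print(sum_resid, Yhat_bar, max(abs(g) for g in gaps))
```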
This is known as the deviation form, where the lowercase letters yi and xi
denote deviations from the respective sample mean values. Note that in the
deviation form, the SRF can be written as
ŷi = β̂2xi   (3.1.14)
Keep in mind that the regressand Y and the regressor X themselves may be
nonlinear.
Look at Table 2.1. Keeping the value of income X fixed, say, at $80, we
draw at random a family and observe its weekly family consumption
expenditure Y as, say, $60. Still keeping X at $80, we draw at random
another family and observe its Y value as $75. In each of these
drawings (i.e., repeated sampling), the value of X is fixed at $80. We
can repeat this process for all the X values shown in Table 2.1.
This means that our regression analysis is conditional regression
analysis, that is, conditional on the given values of the regressor(s) X.
E(ui | Xi) = 0
Suppose in our PRF (Yt = β1 + β2Xt + ut) that ut and ut−1 are positively
correlated. Then Yt depends not only on Xt but also on ut−1, for ut−1 to
some extent determines ut.
In the hypothetical example of Table 3.1, imagine that we had only the first
pair of observations on Y and X (4 and 1). From this single observation there
is no way to estimate the two unknowns, β1 and β2. We need at least two pairs
of observations to estimate the two unknowns.
The least-squares estimates are a function of the sample data. But since the
data change from sample to sample, the estimates will change. Therefore,
what is needed is some measure of reliability or precision of the estimators
β̂1 and β̂2. In statistics the precision of an estimate is measured by its
standard error (se), which can be obtained as follows:
var(β̂2) = σ² / Σxi²,   se(β̂2) = σ / √(Σxi²)
var(β̂1) = (ΣXi² / (n Σxi²)) σ²,   se(β̂1) = √(ΣXi² / (n Σxi²)) σ
where σ² is the constant (homoscedastic) variance of ui, estimated by
σ̂² = Σûi² / (n − 2)
where σ̂² is the OLS estimator of the true but unknown σ², the expression
n − 2 is known as the number of degrees of freedom (df), and Σûi² is the
residual sum of squares (RSS). Once Σûi² is known, σ̂² can be easily
computed.
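The estimator σ̂² = Σûi²/(n − 2) and the resulting standard errors can be computed directly; the sample below is hypothetical, and the formulas are the standard two-variable OLS ones.

```python
import math

# Hypothetical sample; standard two-variable OLS precision formulas:
# sigma2_hat = RSS/(n-2), var(b2) = sigma2/Sum(xi^2),
# var(b1) = Sum(Xi^2) * sigma2 / (n * Sum(xi^2)).
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [3.0, 4.5, 5.0, 7.5, 8.0]
n = len(X)
Xbar, Ybar = sum(X) / n, sum(Y) / n
x = [xi - Xbar for xi in X]
y = [yi - Ybar for yi in Y]
Sxx = sum(a * a for a in x)
b2 = sum(a * b for a, b in zip(x, y)) / Sxx
b1 = Ybar - b2 * Xbar

rss = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(X, Y))
sigma2_hat = rss / (n - 2)        # unbiased estimator of sigma^2
se_b2 = math.sqrt(sigma2_hat / Sxx)
se_b1 = math.sqrt(sigma2_hat * sum(xi * xi for xi in X) / (n * Sxx))
print(sigma2_hat, se_b1, se_b2)
```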
Compared with Eq. (3.1.2), Eq. (3.3.6),
Σûi² = Σyi² − β̂2² Σxi²,
is easy to use, for it does not require computing ûi for each observation.
Since β̂2 = Σxiyi / Σxi², an alternative expression for computing Σûi² is
Σûi² = Σyi² − (Σxiyi)² / Σxi²
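A numerical check, on hypothetical data, that the shortcut expression gives the same residual sum of squares as computing each ûi directly:

```python
# Shortcut: Sum(ui^2) = Sum(yi^2) - (Sum(xi*yi))^2 / Sum(xi^2),
# compared against the direct residual computation.
X = [1.0, 2.0, 3.0, 4.0, 5.0]   # hypothetical data
Y = [3.0, 4.5, 5.0, 7.5, 8.0]
n = len(X)
Xbar, Ybar = sum(X) / n, sum(Y) / n
x = [xi - Xbar for xi in X]
y = [yi - Ybar for yi in Y]
Sxx = sum(a * a for a in x)
Sxy = sum(a * b for a, b in zip(x, y))
Syy = sum(b * b for b in y)
b2 = Sxy / Sxx
b1 = Ybar - b2 * Xbar

rss_direct = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(X, Y))
rss_shortcut = Syy - Sxy ** 2 / Sxx
print(rss_direct, rss_shortcut)
```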
Note the following features of the variances (and therefore the standard
errors) of β̂1 and β̂2.
3. Since β̂1 and β̂2 are estimators, they will not only vary from sample to
sample but in a given sample they are likely to be dependent on each other,
this dependence being measured by the covariance between them:
cov(β̂1, β̂2) = −X̄ var(β̂2)
Since var(β̂2) is always positive, as is the variance of any variable, the nature
of the covariance between β̂1 and β̂2 depends on the sign of X̄. If X̄ is
positive, then as the formula shows, the covariance will be negative. Thus, if
the slope coefficient β̂2 is overestimated (i.e., the slope is too steep), the
intercept coefficient β̂1 will be underestimated (i.e., the intercept will be too
small).
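The negative covariance when X̄ > 0 can be seen in a small Monte Carlo sketch (the setup, true parameters, and sample size are all hypothetical): across repeated samples with X fixed, the empirical covariance between the intercept and slope estimates is negative and close to −X̄ var(β̂2).

```python
import random

# Monte Carlo sketch: with X fixed and Xbar > 0, slope and intercept
# estimates are negatively correlated across repeated samples.
random.seed(0)
X = [1.0, 2.0, 3.0, 4.0, 5.0]     # Xbar = 3 > 0
n = len(X)
Xbar = sum(X) / n
x = [xi - Xbar for xi in X]
Sxx = sum(a * a for a in x)
beta1_true, beta2_true, sigma = 1.0, 2.0, 1.0   # hypothetical truth

b1s, b2s = [], []
for _ in range(5000):
    Y = [beta1_true + beta2_true * xi + random.gauss(0.0, sigma) for xi in X]
    Ybar = sum(Y) / n
    b2 = sum(a * (yi - Ybar) for a, yi in zip(x, Y)) / Sxx
    b1s.append(Ybar - b2 * Xbar)
    b2s.append(b2)

m1, m2 = sum(b1s) / len(b1s), sum(b2s) / len(b2s)
cov_b1b2 = sum((a - m1) * (b - m2) for a, b in zip(b1s, b2s)) / len(b1s)
var_b2 = sum((b - m2) ** 2 for b in b2s) / len(b2s)
print(cov_b1b2, -Xbar * var_b2)   # the two should be close, both negative
```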
What all this means can be explained with the aid of Figure 3.8. In Figure
3.8(a) we have shown the sampling distribution of the OLS estimator β̂2, that
is, the distribution of the values taken by β̂2 in repeated sampling experiments.
For convenience we have assumed β̂2 to be distributed symmetrically. As the
figure shows, the mean of the β̂2 values, E(β̂2), is equal to the true β2. In this
situation we say that β̂2 is an unbiased estimator of β2. In Figure 3.8(b) we
have shown the sampling distribution of β*2, an alternative estimator of β2
obtained by using another (i.e., other than OLS) method.
For convenience, assume that β*2, like β̂2, is unbiased, that is, its average or
expected value is equal to β2. Assume further that both β̂2 and β*2 are linear
estimators, that is, they are linear functions of Y. Which estimator, β̂2 or β*2,
would you choose? To answer this question, superimpose the two figures, as
in Figure 3.8(c). It is obvious that although both β̂2 and β*2 are unbiased,
the distribution of β*2 is more diffused or widespread around the mean
value than the distribution of β̂2. In other words, the variance of β*2 is
larger than the variance of β̂2.
Now given two estimators that are both linear and unbiased, one would
choose the estimator with the smaller variance because it is more likely to
be close to β2 than the alternative estimator. In short, one would choose the
BLUE estimator.
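A Monte Carlo sketch of this comparison (all parameters hypothetical): OLS is set against another linear unbiased slope estimator, the "endpoint" estimator (Yn − Y1)/(Xn − X1). Both center on the true β2, but the endpoint estimator shows the larger sampling variance, which is why the Gauss-Markov reasoning favors OLS.

```python
import random

# Compare OLS with the linear unbiased "endpoint" slope estimator
# (Y_n - Y_1)/(X_n - X_1) across many simulated samples.
random.seed(0)
X = [1.0, 2.0, 3.0, 4.0, 5.0]
n = len(X)
Xbar = sum(X) / n
x = [xi - Xbar for xi in X]
Sxx = sum(a * a for a in x)
beta1_true, beta2_true, sigma = 1.0, 2.0, 1.0   # hypothetical truth

ols, endpoint = [], []
for _ in range(20000):
    Y = [beta1_true + beta2_true * xi + random.gauss(0.0, sigma) for xi in X]
    Ybar = sum(Y) / n
    ols.append(sum(a * (yi - Ybar) for a, yi in zip(x, Y)) / Sxx)
    endpoint.append((Y[-1] - Y[0]) / (X[-1] - X[0]))

def var(v):
    m = sum(v) / len(v)
    return sum((a - m) ** 2 for a in v) / len(v)

# Theory: var(OLS) = sigma^2/Sxx = 0.1; var(endpoint) = 2*sigma^2/16 = 0.125
print(var(ols), var(endpoint))
```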
We now consider the goodness of fit of the fitted regression line to a set of
data; that is, we shall find out how well the sample regression line fits the
data. The coefficient of determination r2 (two-variable case) or R2 (multiple
regression) is a summary measure that tells how well the sample regression
line fits the data.
Consider a heuristic explanation of r2 in terms of a graphical device, known
as the Venn diagram shown in Figure 3.9.
In this figure the circle Y represents variation in the dependent variable Y and
the circle X represents variation in the explanatory variable X. The overlap of
the two circles indicates the extent to which the variation in Y is explained
by the variation in X.
yi = ŷi + ûi   (3.5.1)
where use is made of (3.1.13) and (3.1.14). Squaring (3.5.1) on both sides
and summing over the sample, we obtain
Σyi² = Σŷi² + Σûi² + 2Σŷiûi
= Σŷi² + Σûi²
= β̂2² Σxi² + Σûi²   (3.5.2)
since Σŷiûi = 0 and ŷi = β̂2xi.
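The decomposition and the resulting r² can be illustrated on a hypothetical sample: the total sum of squares splits exactly into explained plus residual parts, and r² = ESS/TSS lies between 0 and 1.

```python
# Hypothetical sample illustrating TSS = ESS + RSS and r^2 = ESS/TSS.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [3.0, 4.5, 5.0, 7.5, 8.0]
n = len(X)
Xbar, Ybar = sum(X) / n, sum(Y) / n
x = [xi - Xbar for xi in X]
y = [yi - Ybar for yi in Y]
Sxx = sum(a * a for a in x)
b2 = sum(a * b for a, b in zip(x, y)) / Sxx
b1 = Ybar - b2 * Xbar

tss = sum(b * b for b in y)        # total sum of squares, Sum(yi^2)
ess = b2 ** 2 * Sxx                # explained sum of squares, b2^2 * Sum(xi^2)
rss = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(X, Y))  # residual SS
r2 = ess / tss
print(tss, ess + rss, r2)
```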
A NUMERICAL EXAMPLE