Week 1

Introducing Regression Analysis

Applied Econometrics
ECO440/ECO640
Niagara University
Lecture Outline
• What is Econometrics and How is it Used?
• Introducing Regression Analysis and the Theoretical Regression Equation
• Using and Interpreting the Estimated Regression Equation
• Two Examples of Applied Econometrics
– The Impact of a Person’s Height on Weight
– Determining the Contributions of Structural Characteristics to Housing Prices
What Is Econometrics?
• Econometrics can mean different things to different people.
• Formally defined:
– Econometrics is the quantitative measurement and
analysis of actual business and economic
phenomena.
• Econometrics attempts to quantitatively bridge the gap between:
– Economic theory
– The real-world phenomena that theory is trying to explain or predict
Functions of Applied Econometrics
• In practice, applied econometrics serves three primary functions:
– Function 1: Describing Economic Reality
– Function 2: Testing Hypotheses About Economic Theory and Policy
– Function 3: Forecasting Future Economic Activity
Econometrics as Measuring
Economic Reality
Function 1: Describing Economic Reality
• Econometrics can quantify and measure marginal effects
and estimate numbers for theoretical equations.
– For example, consumer demand for a product can often be thought of as a relationship between the quantity demanded (Q) and several variables:
▪ Price of the good (P)
▪ Price of a substitute good (Ps)
▪ Disposable income (Yd)
– Econometrics can help us quantitatively measure the relationship between Q and any of these “explanatory variables.”
Econometrics as Measuring Economic
Reality
• Let us consider writing this relationship as a theoretical
equation:
Q = β0 + β1P + β2Ps + β3Yd (1.1)
• There are two components to this equation:
– Variables
▪ Dependent variable (Q)
▪ Independent variables (P, Ps, Yd)
– Regression coefficients
▪ Example: β3
▪ Measures the relationship between values of the independent variable (Yd) and the dependent variable (Q)
– Theory suggests β3 should be positive for most goods
Econometrics as Measuring Economic
Reality
• We can use real-world data to estimate the value of each coefficient (beta).
– What do we think the value of β3 is?
Using Regression to Test Economic
Hypotheses
Function 2: Testing Hypotheses About Economic
Theory and Policy.
• Much of economics involves building theoretical models
and testing them against evidence.
• Hypothesis testing is a vital part of that process.
• You could test the hypothesis that the good or service in
Equation (1.1) is a normal good.
– Normal good = Quantity demanded rises with Yd
– Intuition: How confident are we that the true value of β3
is greater than zero?
Q = β0 + β1P + β2Ps + β3Yd (1.1)
Using Regression to Test Economic
Hypotheses
Q = 27.7 + 0.11P + 0.03Ps + 0.23Yd (1.2)
• Since β3 was estimated to be 0.23, the evidence from our
data seems to support the hypothesis.
– But the “statistical significance” of the estimate would
have to be investigated before such a conclusion
could be justified.
• To understand why we might not be confident in our predictions or estimated regression coefficients, consider:
– Responses to increases in income may differ across groups in the population
Econometrics and Forecasting
Function 3: Forecasting Future Economic Activity
• The most difficult use of econometrics is to forecast or
predict the future using past data.
– Economists use econometrics to forecast a variety of
variables (GDP, sales, inflation, etc.).
– Accuracy of forecasts depends in large measure on the
degree to which the past is a good guide to the future.
– To the extent econometrics can shed light on the
future, leaders will be better equipped to make
decisions.
Varied Approaches to Econometrics
by Researcher’s Objective
• Depending on the situation and the question considered, different approaches make sense in economics.
– For example: Forecasting uses different techniques than econometric modeling for descriptive purposes.
• To put this into context, consider the steps of nonexperimental quantitative research:
1. Specifying the models or relationships to be studied.
2. Collecting the data needed to quantify the models.
3. Quantifying the models with the data.
• While these steps remain constant, the choices made in each step are left to the individual researcher (but should be justifiable!)
Two Primary Uses of Applied
Econometrics
• In applied econometrics, we generally face two primary
goals:
1. Predicting Values of the Dependent Variable
2. Estimating Causal Effects
• Predicting Values of the Dependent Variable
– Objective: Construct an empirical regression model
that can accurately predict the true value of the
dependent variable
• Estimating Causal Effects
– Objective: Estimate values of particular slope
coefficients, such that they accurately measure the sole
influence of an independent variable on the dependent
variable
Regression Aimed at Prediction
• Let’s consider this distinction in the context of the following theoretical and estimated wage regressions:
• Predicting Values of the Dependent Variable
– By plugging in values of EDUC, we can make predictions of a worker’s salary
– Assess the desirability of this model by its ability to generate accurate guesses
– Related to the concept of model “fit”
Regression Aimed at Estimating
Causal Effects
• Let’s consider this distinction in the context of the following theoretical and estimated wage regressions:
• Estimating Causal Effects
– Our estimated coefficient on EDUC implies that an extra year of education increases salary by 2,500
– However, as we will discuss later in the semester, this estimate may reflect factors other than just education (omitted variable bias)
– Omitted variable bias may lead us to overstate or understate the direct importance of a specific independent variable for the dependent variable
Prediction and Causation in Empirical
Housing Price Example
• What does the model predict for a 1,600 sq. ft. house?
– Plug in independent variable values to find the predicted value of the dependent variable:
Estimated PRICEi = 40.0 + 0.138(1600) = 260.8
• Since PRICE is in thousands of dollars, the estimated price is $260,800
– Perhaps the price of a house is influenced by more than just the size of the house?
– What other variables should we include?
▪ Foreshadowing a potential problem: Other structural characteristics are likely related to housing size, so our estimates may overstate or understate the impact of housing size on housing price
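The plug-in prediction for the housing example can be checked with a few lines of Python. This is only a sketch: the coefficients 40.0 and 0.138 are the estimated values quoted on the slide, while the function name `predict_price` is made up for illustration.

```python
# Estimated housing-price equation from the slides:
# PRICE (thousands of dollars) = 40.0 + 0.138 * SIZE (square feet)
INTERCEPT = 40.0   # estimated constant term
SLOPE = 0.138      # estimated coefficient on SIZE


def predict_price(size_sqft: float) -> float:
    """Return the predicted price in thousands of dollars (hypothetical helper)."""
    return INTERCEPT + SLOPE * size_sqft


price_thousands = predict_price(1600)
print(f"Predicted price: {price_thousands:.1f} thousand dollars")
print(f"In dollars: ${price_thousands * 1000:,.0f}")
```

Keeping the units in the variable names makes the final conversion from thousands of dollars to dollars explicit.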
What Is Regression Analysis?
• It is a statistical technique that
– attempts to “explain” differences across observations in one dependent variable ...
▪ Synonyms for dependent variable: outcome variable, regressand
– ... as a function of a set of independent variables ...
▪ Synonyms: explanatory variables, regressors
– ... through the quantification of one or more equations.
• For example, in equation (1.1):
Q = β0 + β1P + β2Ps + β3Yd (1.1)
dependent variable: Q
independent variables: P, Ps, and Yd
Regression Analysis and Causation
• Economists are often interested in cause-and-effect.
– However, don’t be deceived by the words “dependent” and
“independent” variables.
– Regression results often cannot prove causality!
• For example, if variables A and B are related statistically, then:
– Classic “chicken and egg” problem of which comes first:
– A might “cause” B.
– B might “cause” A.
– Alternatively, some third factor might “cause” both.
– The relationship might have happened by chance.
– This is an important topic we will encounter in our discussion of endogeneity.
Theoretical and Estimated
Regression Equations
• Regression analysis consists of two separate steps,
related to different vintages of regression equations:
• Step 1: Constructing a regression equation
– Theoretical regression equation
– Using theory and your specific research question to
determine the form of your regression equation/model
• Step 2: Empirically quantifying the relationships
(parameters) between independent variables and the
dependent variable of your theoretical regression model
– Estimated regression equation or empirical regression
equation
– Having constructed your regression model, use data to
estimate the parameters (β) of your model
Theoretical Regression Equation
• The simplest single-equation linear model is:
Y = β0 + β1X (1.3)
• Y is the dependent variable
• X is the independent variable
• The β’s are coefficients
– β0 is the constant or intercept term
– β1 is the slope coefficient
Graph of Theoretical Regression Line
Figure 1.1 Graphical Representation of the Coefficients of the
Theoretical Regression Line
Slope Coefficients and Regression
• The slope coefficient, β1, indicates the amount Y will change if X increases by one unit:
(Y2 − Y1) / (X2 − X1) = ΔY / ΔX = β1
• If linear regression techniques are going to be applied to an equation, that equation must be linear.
• An equation is linear if plotting the function in terms of X and Y generates a straight line.
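A small sketch may help make the slope formula and the linearity requirement concrete. The coefficient values (β0 = 2.0, β1 = 0.5) and the function names below are hypothetical, chosen only for illustration.

```python
def slope(x1: float, y1: float, x2: float, y2: float) -> float:
    """Return (Y2 - Y1) / (X2 - X1), the change in Y per unit change in X."""
    if x2 == x1:
        raise ValueError("X1 and X2 must differ")
    return (y2 - y1) / (x2 - x1)


def linear(x: float, beta0: float = 2.0, beta1: float = 0.5) -> float:
    """Hypothetical linear equation Y = beta0 + beta1 * X."""
    return beta0 + beta1 * x


def quadratic(x: float) -> float:
    """Hypothetical nonlinear equation Y = X squared, for contrast."""
    return x * x


# On a straight line, every pair of points gives the same slope (beta1).
print(slope(1, linear(1), 5, linear(5)))
print(slope(10, linear(10), 11, linear(11)))

# On a curve, the slope depends on which two points are chosen.
print(slope(0, quadratic(0), 1, quadratic(1)))
print(slope(1, quadratic(1), 2, quadratic(2)))
```

A single slope coefficient can summarize the X-Y relationship only in the linear case, which is why the equation must be linear before linear regression is applied.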
Role of the Error Term in the
Theoretical Regression Equation
• Even if much of the variation in Y is caused by X, there is almost always variation that comes from other sources.
– A stochastic error term (ε)
▪ Is added to a regression equation
▪ Accounts for variation in the dependent variable (Y) that is not explained by the included regressors (X)
▪ Is usually notated by adding an epsilon (ε) to the regression equation:
Y = β0 + β1X + ε (1.4)
Intuition behind the Error Term
• The stochastic term (the error term, ε) “catches” the sources of
variation that the deterministic part does not.
• There are at least four sources of variation in Y not captured by
the included X(s):
1. Influences omitted from the equation
2. Measurement error in the dependent variable
3. The true theoretical equation has a different functional
form than the one chosen for the regression
4. Human behavior can be unpredictable and purely random
Consumption Function Example of
the Value of the Error Term
Example: Aggregate consumption function
Consumption = β0 + β1 Disposable Income + ε
• Possible sources of error?
1. Consumer uncertainty is hard to measure (omitted variable)
2. Observed consumption differs from actual consumption (measurement error, for example from sampling)
3. The consumption function might not be linear (different functional form)
4. Some random event (purely random)
Regression Errors from an Incorrect
Functional Form
Figure 1.2 Errors Caused by Using a Linear Functional Form
to Model a Nonlinear Relationship
Proper Notation of the Theoretical
Regression Equation
• Notation needs to be extended to allow for more than one
independent variable and reference specific observations.
• First, extend notation to reference specific observations:
Yi = β0 + β1Xi + εi (i = 1, 2, ..., N) (1.7)
where:
Yi = the ith observation of the dependent variable
Xi = the ith observation of the independent variable
εi = the ith observation of the stochastic error term
β0, β1 are the regression coefficients
N is the number of observations
Proper Notation of the Theoretical
Regression Equation
• Second, extend notation to allow for more than one independent
variable.
• If we define:
X1i = the ith observation of the first independent variable
X2i = the ith observation of the second independent variable
X3i = the ith observation of the third independent variable
• Then, all three variables can be expressed as determinants (independent variables) of Y.
Multivariate Regression Equation
• These extensions result in a multivariate linear regression model:
Yi = β0 + β1X1i + β2X2i + β3X3i + εi (1.8)
• Each slope coefficient gives the impact on Y of a one-unit increase in its own X
– holding constant the other included X’s.
– For example, if X2 increases by 1 unit
▪ Y changes by β2, holding the values of X1 and X3 constant.
– Important Point: If a variable is not included in an equation, its impact on Y is not held constant!
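The “holding constant” interpretation can be illustrated with a short sketch. The coefficient values below are hypothetical (they do not come from the slides), and `predict` is an illustrative helper that evaluates only the deterministic part of the equation, ignoring the error term.

```python
from typing import Sequence


def predict(betas: Sequence[float], xs: Sequence[float]) -> float:
    """Evaluate beta0 + beta1*X1 + ... + betaK*XK (error term omitted).

    betas[0] is the intercept; betas[1:] pair up with xs in order.
    """
    if len(betas) != len(xs) + 1:
        raise ValueError("need one intercept plus one coefficient per X")
    return betas[0] + sum(b * x for b, x in zip(betas[1:], xs))


# Hypothetical coefficients: beta0, beta1, beta2, beta3
betas = [10.0, 2.0, -1.5, 0.75]

base = predict(betas, [4.0, 3.0, 8.0])    # X1 = 4, X2 = 3, X3 = 8
bumped = predict(betas, [4.0, 4.0, 8.0])  # only X2 rises by one unit

# The change in the predicted Y equals beta2, because X1 and X3 did not move.
print(bumped - base)
```

The same function covers any number of independent variables, so it also works for the general K-variable model.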
Example 1 of Multivariate Regression
Equation: Weight and Height
Example: Weight as a function of height
Weighti = β0 + β1Heighti + εi (1.9)
• Each value of i represents an individual in the sample.
• If you select four individuals (Woody, Lesley, Bruce, and Mary), then you could write out an equation for each:
Weight_woody = β0 + β1 Height_woody + ε_woody
Weight_lesley = β0 + β1 Height_lesley + ε_lesley
Weight_bruce = β0 + β1 Height_bruce + ε_bruce
Weight_mary = β0 + β1 Height_mary + ε_mary

Relating Multivariate Regression
Equation to Individual Observations
• We make two important observations based on knowing
that each individual has their own height and weight.
• Points to Consider:
– 1.) Random events impact people differently.
▪ To account for these random differences, each observation has its own value of the error term (εi).
– 2.) The regression coefficients (the β’s) don’t vary by individual.
▪ Rather, the β’s apply to the whole sample.
Decomposing the Theoretical
Regression Equation
Y = β0 +β1X + ε (1.4)
• Equation (1.4) can be thought of as having two parts:
1. Deterministic Component: β0 + β1X
2. Stochastic (or random) Component: ε
• The deterministic component indicates the value of Y that
is determined by a given value of X.
• The deterministic component can be thought of as the
expected value or (average value) of Y given X:
E(Y | X) = β0 + β1X (1.5)
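The idea that the deterministic component is the average value of Y given X can be illustrated with a simulation. This is a sketch under stated assumptions: the parameter values and the normally distributed error term are hypothetical choices, not something the slides specify.

```python
import random

# Hypothetical "true" parameters, chosen only for illustration.
BETA0 = 5.0
BETA1 = 2.0
SIGMA = 3.0  # assumed standard deviation of the error term


def deterministic(x: float) -> float:
    """Deterministic component: E(Y | X) = beta0 + beta1 * X."""
    return BETA0 + BETA1 * x


def draw_y(x: float, rng: random.Random) -> float:
    """One observation: deterministic component plus a random error."""
    return deterministic(x) + rng.gauss(0.0, SIGMA)


rng = random.Random(0)  # fixed seed so the run is repeatable
x = 4.0
draws = [draw_y(x, rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)

print(f"E(Y | X = {x}) from the equation: {deterministic(x):.2f}")
print(f"Average of simulated Y values:   {sample_mean:.2f}")
```

Individual draws scatter around the line because of ε, but their average settles close to β0 + β1X.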
Example 2 of Multivariate Regression
Equation: Determining Wages
Example: What influences wages?
• The wage (WAGE) of a worker is the dependent variable
• Possible independent variables?
experience (EXP), education (EDU), gender (GEND)
• Redefine variables in Equation (1.8):
Y = WAGE
X1 = EXP
X2 = EDU
X3 = GEND
• Substituting these into Equation (1.8):
WAGEi = β0 + β1EXPi + β2EDUi + β3GENDi + εi (1.10)


Example 2 of Multivariate Regression
Equation: Determining Wages
WAGEi = β0 + β1EXPi + β2EDUi + β3GENDi + εi (1.10)
• What is the meaning of β1 in equation (1.10)?
– It is the impact on wages of an additional year of experience, holding constant education and gender.
• General multivariate linear regression model with K independent variables:
Yi = β0 + β1X1i + β2X2i + ... + βKXKi + εi (1.11)


The Estimated Regression
Equation
• The quantified, sample-specific version of a theoretical regression equation is the estimated regression equation.
Theoretical: Yi = β0 + β1Xi + εi (1.12)
Estimated: Ŷi = 103.40 + 6.38Xi (1.13)
– Ŷi (read “Y-hat”) is the estimated or fitted value of Yi
– Put another way: Ê[Yi | Xi] = Ŷi
• The closer the Ŷi’s are to the Yi’s, the better the “fit” of the estimated regression equation
Estimates, Residuals, and the
Theoretical Error Term
• The difference between Yi and Ŷi is the residual (ei).
• Mathematically, it is the “empirical error term”:
ei = Yi − Ŷi (1.15)
• Note the difference between ei and εi:
εi = Yi − E(Yi | Xi) (1.16)
• The residual (ei) can also be considered an estimate of the “true” error term (εi).
• Figure 1.3 on the next slide graphically displays these concepts and differences
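The distinction between the residual and the error term can be sketched numerically. The fitted line below uses the estimated coefficients quoted in the slides (103.40 and 6.38); the “true” line is purely hypothetical, since in practice the true regression line is never observed.

```python
def fitted(x: float) -> float:
    """Estimated regression line from the slides: Y-hat = 103.40 + 6.38 * X."""
    return 103.40 + 6.38 * x


def true_mean(x: float) -> float:
    """Hypothetical true line E(Y | X); its coefficients are invented here."""
    return 100.0 + 6.0 * x


# One observation: X = 10 inches above 5 ft, Y = 170 pounds.
x_i, y_i = 10.0, 170.0

residual = y_i - fitted(x_i)       # e_i: measured against the estimated line
error_term = y_i - true_mean(x_i)  # epsilon_i: measured against the true line

print(f"Residual e_i:         {residual:.1f}")
print(f"Error term epsilon_i: {error_term:.1f}")
```

Only the residual can be computed from data; the error term would require knowing the true coefficients.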
Graphical Comparison of Theoretical
and Estimated Regression Equations
Figure 1.3 True and Estimated Regression Lines
Empirical vs. Theoretical
Parameter Estimates

• The estimated regression model can be extended by adding additional X’s:
Ŷi = β̂0 + β̂1X1i + β̂2X2i + ... + β̂KXKi (1.17)
A Simple Example of Applied
Econometrics: Weight and Height
• You’ve accepted a job as a weight guesser at Six Flags Darien Lake.
• You hypothesize the following theoretical relationship (the expected sign of β1 is positive):
Yi = β0 + β1Xi + εi (1.18)
where:
Yi = the weight (in pounds) of the ith customer
Xi = the height (in inches above 5 ft) of the ith customer
εi = the value of the stochastic error term for the ith customer
• Using height and weight data, you use regression to arrive at the following estimated regression equation:
EstimatedWeight = 103.40 + 6.38 Height (inches above 5 ft) (1.19)
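How such an estimate might be obtained can be sketched with the textbook formulas for simple ordinary least squares (OLS). The slides do not name the estimator at this point, so OLS is an assumption here; the 20 height and weight pairs are the ones listed in Table 1.1, and `ols_fit` is an illustrative helper name.

```python
from typing import Sequence, Tuple

# Height (inches above 5 ft) and weight (pounds) for the 20 customers in Table 1.1.
HEIGHTS = [5, 9, 13, 12, 10, 11, 8, 9, 10, 12, 11, 9, 10, 12, 8, 9, 10, 15, 13, 11]
WEIGHTS = [140, 157, 205, 198, 162, 174, 150, 165, 170, 180,
           170, 162, 165, 180, 160, 155, 165, 190, 185, 155]


def ols_fit(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Return (intercept, slope) of the least-squares line through (xs, ys)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = sum of cross-deviations / sum of squared deviations of X.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = ols_fit(HEIGHTS, WEIGHTS)
print(f"EstimatedWeight = {intercept:.2f} + {slope:.2f} * Height")
```

Rounded to two decimals, the fitted coefficients agree with the 103.40 and 6.38 shown in the estimated equation.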
Applying Concepts of Estimated
Regression Equation
Table 1.1 Data for and Results of the Weight-Guessing Equation
Observation i   Height above 5’ (Xi)   Weight (Yi)   Predicted weight (Ŷi)   Residual (ei)   $ Gain or loss
(1)             (2)                    (3)           (4)                     (5)             (6)
1               5.0                    140.0         135.3                   4.7             +2.00
2               9.0                    157.0         160.8                   -3.8            +2.00
3               13.0                   205.0         186.3                   18.7            -3.00
4               12.0                   198.0         179.9                   18.1            -3.00
5               10.0                   162.0         167.2                   -5.2            +2.00
6               11.0                   174.0         173.6                   0.4             +2.00
7               8.0                    150.0         154.4                   -4.4            +2.00
8               9.0                    165.0         160.8                   4.2             +2.00
9               10.0                   170.0         167.2                   2.8             +2.00
10              12.0                   180.0         179.9                   0.1             +2.00
11              11.0                   170.0         173.6                   -3.6            +2.00
12              9.0                    162.0         160.8                   1.2             +2.00
13              10.0                   165.0         167.2                   -2.2            +2.00
14              12.0                   180.0         179.9                   0.1             +2.00
15              8.0                    160.0         154.4                   5.6             +2.00
16              9.0                    155.0         160.8                   -5.8            +2.00
17              10.0                   165.0         167.2                   -2.2            +2.00
18              15.0                   190.0         199.1                   -9.1            +2.00
19              13.0                   185.0         186.3                   -1.3            +2.00
20              11.0                   155.0         173.6                   -18.6           -3.00
                                                                                             TOTAL = $25.00
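The last column of Table 1.1 can be reproduced from the residuals. The payoff rule below (gain $2 when the guess is within 10 pounds, lose $3 otherwise) is an assumption inferred from the pattern in the table, not something the slides state; the residuals are copied from column (5).

```python
# Residuals e_i from column (5) of Table 1.1, observations 1 through 20.
RESIDUALS = [4.7, -3.8, 18.7, 18.1, -5.2, 0.4, -4.4, 4.2, 2.8, 0.1,
             -3.6, 1.2, -2.2, 0.1, 5.6, -5.8, -2.2, -9.1, -1.3, -18.6]


def payoff(residual: float, tolerance: float = 10.0) -> float:
    """Assumed rule: +$2 if the guess is within `tolerance` pounds, else -$3."""
    return 2.0 if abs(residual) <= tolerance else -3.0


gains = [payoff(e) for e in RESIDUALS]
misses = sum(1 for g in gains if g < 0)
total = sum(gains)

print(f"Guesses off by more than 10 pounds: {misses}")
print(f"Total gain: ${total:.2f}")
```

Under this assumed rule, three misses and seventeen hits give the same $25.00 total reported at the bottom of the table.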
Applying Concepts of Estimated
Regression Equation
Figure 1.4 A Weight-Guessing Equation
Using Regression to Explain Housing
Prices
• We want to measure the impact of house size on price.
• Construct a simple theoretical model (the expected sign of β1 is positive):
PRICEi = β0 + β1SIZEi + εi (1.20)
where:
PRICEi = the price (in thousands of $) of the ith house
SIZEi = the size (in square feet) of the ith house
εi = the value of the stochastic error term for the ith house
• Using house sales data, we estimate the following regression equation for the impact of structure size on home value:
Estimated PRICEi = 40.0 + 0.138SIZEi (1.21)
Graphical Depiction of Empirical
Housing Price Equation
Figure 1.5 A Cross-Sectional Model of Housing Prices
Interpreting Empirical Housing Price
Equation
• What does β̂0 = 40.0 mean?
– It is the estimate of the constant or intercept term (β0).
– It is not the baseline price of a nonsensical 0 sq. ft. house
• What does β̂1 = 0.138 mean?
– It is the estimate of the coefficient of SIZE (β1).
– Interpretation: If the size of a house increases by 1 square foot, the estimated price of the house will increase by $138.
▪ Note the importance of units here for determining the appropriate interpretation!
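The unit conversion behind the $138 figure can be written out explicitly. This is a minimal sketch: the slope 0.138 comes from the estimated equation, and the helper name `price_change_dollars` is hypothetical.

```python
# PRICE is measured in thousands of dollars and SIZE in square feet,
# so the slope is in thousands of dollars per square foot.
SLOPE_THOUSANDS_PER_SQFT = 0.138
DOLLARS_PER_THOUSAND = 1000


def price_change_dollars(extra_sqft: float) -> float:
    """Estimated change in price, in dollars, for `extra_sqft` more square feet."""
    return SLOPE_THOUSANDS_PER_SQFT * DOLLARS_PER_THOUSAND * extra_sqft


print(f"1 extra sq. ft.:   ${price_change_dollars(1):,.0f}")
print(f"100 extra sq. ft.: ${price_change_dollars(100):,.0f}")
```

Reading 0.138 as dollars rather than thousands of dollars would understate the effect by a factor of 1,000.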
