
ECON4150 - Introductory Econometrics

Lecture 3: Review of Statistics & OLS

Stock and Watson Chapter 3-4

Lecture outline

• Comparing means from different populations

• Ideal randomized experiment

• Using the t-statistic when n is small

• Relationship between two random variables

• California test score data

• scatter plot

• sample covariance

• sample correlation

• Linear regression with 1 regressor

• derivation of the OLS estimators

• measures of fit ($R^2$ and SER)

Comparing means from different populations

• In the previous lecture we tested the hypothesis that the mean wage of individuals with a master's degree equals 60000

• Suppose we would like to test whether the mean wages of men and women with a master's degree differ by an amount $d_0$

$$H_0: \mu_{W_M} - \mu_{W_F} = d_0 \qquad H_1: \mu_{W_M} - \mu_{W_F} \neq d_0$$

• To test the null hypothesis against the two-sided alternative we follow the 4 steps with some adjustments

Step 1: Estimate $(\mu_{W_M} - \mu_{W_F})$ by $\bar{W}_M - \bar{W}_F$

• Because a weighted average of 2 independent normal random variables is itself normally distributed, we have (with $\mathrm{Cov}(\bar{W}_M, \bar{W}_F) = 0$)

$$\bar{W}_M - \bar{W}_F \sim N\left(\mu_{W_M} - \mu_{W_F},\ \frac{\sigma^2_{W_M}}{n_M} + \frac{\sigma^2_{W_F}}{n_F}\right)$$

Comparing means from different populations

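Step 2: Estimate $\sigma^2_{W_M}$ and $\sigma^2_{W_F}$ to obtain $SE(\bar{W}_M - \bar{W}_F)$

$$SE(\bar{W}_M - \bar{W}_F) = \sqrt{\frac{s^2_{W_M}}{n_M} + \frac{s^2_{W_F}}{n_F}}$$

Step 3: Compute the t-statistic

$$t^{act} = \frac{(\bar{W}_M - \bar{W}_F) - d_0}{SE(\bar{W}_M - \bar{W}_F)}$$

Step 4: Reject $H_0$ at a 5% significance level if

• $|t^{act}| > 1.96$
• or if the p-value $< 0.05$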
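The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `two_sample_t` and `two_sided_p_value` are hypothetical, the inputs are summary statistics (sample means, standard deviations and sizes), and the p-value uses the large-sample normal approximation.

```python
import math

def two_sample_t(mean_m, sd_m, n_m, mean_f, sd_f, n_f, d0=0.0):
    """Return (difference, standard error, t-statistic) for H0: mu_M - mu_F = d0."""
    diff = mean_m - mean_f                            # Step 1: estimate the difference
    se = math.sqrt(sd_m**2 / n_m + sd_f**2 / n_f)     # Step 2: standard error
    t_act = (diff - d0) / se                          # Step 3: t-statistic
    return diff, se, t_act

def two_sided_p_value(t_act):
    """Large-sample two-sided p-value 2 * Phi(-|t|) from the standard normal CDF."""
    return math.erfc(abs(t_act) / math.sqrt(2))

# Hypothetical summary statistics, chosen only to make the arithmetic easy to follow.
diff, se, t_act = two_sample_t(10.0, 2.0, 50, 8.0, 2.0, 50)
reject = abs(t_act) > 1.96                            # Step 4: decision at the 5% level
print(diff, round(se, 3), round(t_act, 3), reject)
```

The standard error here does not pool the two sample variances, which matches the formula in Step 2.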

Comparing means from different populations

Suppose we have random samples of 500 men and 500 women with a master's degree and we would like to test whether the mean wages are equal:

$$H_0: \mu_{W_M} - \mu_{W_F} = 0 \qquad H_1: \mu_{W_M} - \mu_{W_F} \neq 0$$

Step 1: $\bar{W}_M - \bar{W}_F = 64159.45 - 53163.41 = 10996.04$

Step 2: $SE(\bar{W}_M - \bar{W}_F) = 1240.709$

Step 3: $t^{act} = \dfrac{(\bar{W}_M - \bar{W}_F) - 0}{SE(\bar{W}_M - \bar{W}_F)} = \dfrac{10996.04}{1240.709} = 8.86$

Step 4: Since we use a 5% significance level, we reject $H_0$ because $|t^{act}| = 8.86 > 1.96$

Comparing means from different populations

Difference in mean wages between men and women with a master's degree

This is how to do the test in Stata:

    . ttest wage, by(female)

    Two-sample t test with equal variances

       Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
    ---------+--------------------------------------------------------------------
           0 |     500    64159.45    847.7946    18957.26    62493.76    65825.13
           1 |     500    53163.41    905.8709    20255.89    51383.62     54943.2
    ---------+--------------------------------------------------------------------
    combined |   1,000    58661.43    643.9819     20364.5    57397.72    59925.14
    ---------+--------------------------------------------------------------------
        diff |            10996.04    1240.709                 8561.34    13430.73

        diff = mean(0) - mean(1)                                      t =   8.8627
    Ho: diff = 0                                     degrees of freedom =      998

        Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
     Pr(T < t) = 1.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 0.0000

Confidence interval for the difference in population means

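• The method for constructing a confidence interval for 1 population mean can easily be extended to the difference between 2 population means

• A hypothesized value of the difference in means $d_0$ will be rejected if $|t| > 1.96$

• and will be in the confidence set if $|t| \le 1.96$

• Thus the 95% confidence interval for $\mu_{W_M} - \mu_{W_F}$ consists of the values of $d_0$ within $\pm 1.96$ standard errors of $\bar{W}_M - \bar{W}_F$

95% confidence interval for $\mu_{W_M} - \mu_{W_F}$

$$(\bar{W}_M - \bar{W}_F) \pm 1.96 \cdot SE(\bar{W}_M - \bar{W}_F)$$

$$10996.04 \pm 1.96 \cdot 1240.709$$

$$(8564.25,\ 13427.83)$$

Stata reports the slightly wider interval (8561.34, 13430.73) because it uses the critical value of the t distribution with 998 degrees of freedom (about 1.9623) instead of 1.96.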
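The interval can be checked with a short Python sketch. The helper name `diff_in_means_ci` is hypothetical; the inputs are the difference and standard error from the wage example, and the default critical value 1.96 is the large-sample normal value.

```python
def diff_in_means_ci(diff, se, critical_value=1.96):
    """Confidence interval: diff +/- critical_value * SE(diff)."""
    half_width = critical_value * se
    return diff - half_width, diff + half_width

# Difference in mean wages and its standard error from the example.
lower, upper = diff_in_means_ci(10996.04, 1240.709)
print(round(lower, 2), round(upper, 2))
```

Passing a different `critical_value` (for example a t critical value) widens or narrows the interval accordingly.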

Comparing means from different populations

Example: An ideal randomized experiment

In this course we will focus on estimating causal effects:

the expected effect on Y of a change in X

A causal effect can be measured by an ideal randomized experiment:

• Subjects are selected by simple random sampling from the population of interest

• Subjects are randomly assigned to a treatment or control group

• Treatment group receives treatment of interest ($X = 1$), control group receives no treatment ($X = 0$).

• The mean causal effect is the difference between the mean outcome when treated and the mean outcome when untreated

$$\text{Mean causal effect} = \mu_{X=1} - \mu_{X=0}$$

Comparing means from different populations

Example: An ideal randomized experiment

If we want to know whether the treatment is effective we can test:

$$H_0: \mu_{X=1} - \mu_{X=0} = 0 \qquad H_1: \mu_{X=1} - \mu_{X=0} \neq 0$$

Step 1: Estimate $(\mu_{X=1} - \mu_{X=0})$ by computing the difference in mean outcomes of individuals in the treatment and control group: $\bar{Y}_{Treated} - \bar{Y}_{Control}$

Step 2: Compute $SE(\bar{Y}_{Treated} - \bar{Y}_{Control})$

Step 3: Compute $t^{act} = \dfrac{(\bar{Y}_{Treated} - \bar{Y}_{Control}) - 0}{SE(\bar{Y}_{Treated} - \bar{Y}_{Control})}$

Step 4: Reject the null hypothesis of no treatment effect at a 5% significance level if $|t^{act}| > 1.96$
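To see the procedure at work, the following Python sketch simulates an ideal randomized experiment and applies the four steps. Everything about the data is an assumption made for illustration: outcomes are drawn from a normal distribution with mean 50 and standard deviation 5, the true treatment effect is set to 2, and the function name `simulate_experiment` is hypothetical.

```python
import math
import random
import statistics

def simulate_experiment(n_per_group=1000, true_effect=2.0, seed=0):
    """Simulate a randomized experiment and return (diff, se, t_act)."""
    rng = random.Random(seed)
    # Random assignment makes the groups comparable, so both groups draw the
    # untreated outcome from the same (hypothetical) distribution N(50, 5^2).
    control = [rng.gauss(50.0, 5.0) for _ in range(n_per_group)]
    treated = [rng.gauss(50.0, 5.0) + true_effect for _ in range(n_per_group)]
    diff = statistics.mean(treated) - statistics.mean(control)   # Step 1
    se = math.sqrt(statistics.variance(treated) / n_per_group    # Step 2
                   + statistics.variance(control) / n_per_group)
    return diff, se, diff / se                                   # Step 3

diff, se, t_act = simulate_experiment()
print(f"estimated effect = {diff:.2f}, SE = {se:.2f}, t = {t_act:.2f}")
print("reject H0 at 5%" if abs(t_act) > 1.96 else "do not reject H0 at 5%")  # Step 4
```

With 1000 subjects per group the large-sample critical value 1.96 is appropriate.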

Using the t-statistic when n is small

• The test on the previous slide is based on the sample size $n$ being large

• Especially in actual randomized experiments $n$ can be small

• If the hypothesis test concerns 1 population mean, the t-statistic

$$t^{act} = \frac{\bar{Y} - \mu_{Y,0}}{SE(\bar{Y})}$$

  • is not normally distributed for small $n$!

  • has the Student-t distribution in the special case that the population distribution of $Y$ is normal.

• If the hypothesis test concerns the difference in 2 population means, the t-statistic

$$t^{act} = \frac{(\bar{Y}_M - \bar{Y}_F) - d_0}{SE(\bar{Y}_M - \bar{Y}_F)}$$

  • is not normally distributed for small $n$!

  • does not have a Student-t distribution even if the population distributions are normal!

Relationship between two random variables

• In general, questions in econometrics involve a relationship between 2 (or more) random variables:

  • What is the relation between education and earnings?

  • What is the relation between interest rates and economic growth?

  • What is the relation between the beer tax and traffic fatalities?

  • What is the relation between class size and student test scores?

• In this and coming lectures we will focus on the last of these questions.

California test score data

• We will use a data set that contains data on test performance, school characteristics and student demographic backgrounds.
• The data are from 420 districts in California.
• Data were obtained from the California Department of Education.
• Main variables of interest:
  • TestScore is the district average of the reading and math scores of 5th grade students
  • ClassSize is defined as the number of students divided by the number of full-time equivalent teachers in the district.

The relation between class size and test scores

• To examine the relation between class size and test scores we can make a scatter plot

A scatter plot is a plot of $n$ observations on $X_i$ and $Y_i$ in which each observation is represented by the point $(X_i, Y_i)$

[Figure: scatter plot of test score (vertical axis, 600 to 700) against class size (horizontal axis, 14 to 26)]

Sample covariance

• The covariance is a measure of the extent to which two random variables $X$ and $Y$ move together,

$$\mathrm{Cov}(X, Y) = \sigma_{XY} = E[(X - \mu_X) \cdot (Y - \mu_Y)]$$

• The population covariance is unobserved but can be estimated by the sample covariance $s_{XY}$

$$s_{XY} = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})$$

• If $(X_i, Y_i)$ are i.i.d. and have finite fourth moments ($E(X^4) < \infty$ and $E(Y^4) < \infty$), then

$$s_{XY} \xrightarrow{p} \sigma_{XY}$$

• The sample covariance between class size and test scores is $s_{CT} = -8.16$
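The sample covariance formula translates directly into code. The sketch below is a plain-Python illustration; the function name `sample_covariance` and the toy data are hypothetical and are not taken from the California data set.

```python
def sample_covariance(x, y):
    """Sample covariance with divisor n - 1."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length of at least 2")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)

# Toy data in which y moves one-for-two with x, so the covariance is positive.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]
print(sample_covariance(x, y))
```

A negative value, as for class size and test scores, would indicate that the two variables tend to move in opposite directions.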

Sample correlation

• What does it mean for the sample covariance between test scores and class size to equal -8.16?

• The units of the covariance are the units of test scores multiplied by the units of class size

• The sample correlation $r_{XY}$ measures the strength of the linear association between $X$ and $Y$; it is unit-free and lies between -1 and 1

$$r_{XY} = \frac{s_{XY}}{s_X s_Y}$$

• The sample correlation between class size and test scores is $r_{CT} = -0.23$

Sample covariance and correlation in Stata

To compute the sample covariance in Stata:

    . corr test_score class_size, covariance
    (obs=420)

                 | test_s~e class_~e
    -------------+------------------
      test_score |   363.03
      class_size | -8.15932  3.57895

To compute the sample correlation in Stata:

    . corr test_score class_size
    (obs=420)

                 | test_s~e class_~e
    -------------+------------------
      test_score |   1.0000
      class_size |  -0.2264   1.0000
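The two Stata outputs are linked by $r_{XY} = s_{XY} / (s_X s_Y)$. The following Python sketch checks this using the covariance matrix reported above; the function name `correlation_from_covariance` is hypothetical.

```python
import math

def correlation_from_covariance(s_xy, var_x, var_y):
    """Sample correlation r_XY = s_XY / (s_X * s_Y), given the two sample variances."""
    return s_xy / math.sqrt(var_x * var_y)

# Entries of the covariance matrix reported by `corr ..., covariance`:
# Var(test_score) = 363.03, Var(class_size) = 3.57895, Cov = -8.15932.
r = correlation_from_covariance(-8.15932, 3.57895, 363.03)
print(round(r, 4))  # approximately -0.2264, in line with the `corr` output
```

Because the correlation divides out both standard deviations, it would be unchanged if test scores or class size were measured in different units.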

Linear regression with one regressor

Suppose we would like to answer the following question:

What is the effect on district test scores if we increase district average class size by 1 student?

We would like to know

$$\beta_{ClassSize} = \frac{\Delta \text{Test score}}{\Delta \text{Class size}}$$

$\beta_{ClassSize}$ is, by definition, the slope of the straight line relating test scores and class size

$$\text{Test score} = \beta_0 + \beta_{ClassSize} \times \text{Class size}$$

where $\beta_0$ is the intercept of the straight line.

Linear regression with one regressor

• The average test score in district $i$ depends not only on the average class size

• It also depends on factors such as

  • Quality of the teachers

  • Student background

  • Quality of text books

  • .....

• The equation describing the linear relation between Test score and Class size is better written as

$$\text{Test score}_i = \beta_0 + \beta_{ClassSize} \times \text{Class size}_i + u_i$$

where $u_i$ lumps together all other district characteristics that affect average test scores.

Terminology for the Linear Regression Model with One Regressor

The linear regression model with one regressor is denoted by

$$Y_i = \beta_0 + \beta_1 X_i + u_i$$

where

• $Y_i$ is the dependent variable

• $X_i$ is the independent variable or regressor

• $\beta_0 + \beta_1 X_i$ is the population regression line

• $\beta_0$ is the intercept of the population regression line (expected value of $Y$ when $X = 0$)

• $\beta_1$ is the slope of the population regression line

• $u_i$ is the error term (all other factors determining $Y_i$)

Linear regression with one regressor

[Figure: population regression line plotted against $X$, with the errors $u_1$ and $u_6$ marked]

Linear regression with one regressor

• In general we don't know $\beta_0$ and $\beta_1$ and we have to estimate them using a random sample of data.
• How to find the line that fits the data best?

[Figure: scatter plot of test score (vertical axis, 600 to 700) against class size (horizontal axis, 14 to 26)]

The Ordinary Least Squares Estimator (OLS)

The OLS estimator chooses the regression coefficients so that the estimated regression line is as close as possible to the observed data, where closeness is measured by the sum of the squared mistakes made in predicting $Y$ given $X$

• Let $b_0$ and $b_1$ be estimators of $\beta_0$ and $\beta_1$

• The predicted value of $Y_i$ given $X_i$ using these estimators is $b_0 + b_1 X_i$

• The prediction mistake is

$$Y_i - (b_0 + b_1 X_i) = Y_i - b_0 - b_1 X_i$$

• The estimators of the slope and intercept that minimize

$$\sum_{i=1}^{n} (Y_i - b_0 - b_1 X_i)^2$$

are called the ordinary least squares (OLS) estimators of $\beta_0$ and $\beta_1$

$\bar{Y}$ is the ordinary least squares estimator of $\mu_Y$

• Suppose there is no $X$, only $Y$

$$Y_i = \mu_Y + u_i$$

• Let $m$ be an estimator of $\mu_Y$

• The least squares estimator minimizes

$$\sum_{i=1}^{n} (Y_i - m)^2$$

• Taking the derivative w.r.t. $m$ and setting it to zero gives

$$\frac{\partial}{\partial m} \sum_{i=1}^{n} (Y_i - m)^2 = -2 \sum_{i=1}^{n} (Y_i - m) = 0$$

$$-2 \sum_{i=1}^{n} Y_i + 2 \cdot n \cdot m = 0$$

$$\frac{1}{n} \sum_{i=1}^{n} Y_i - m = 0$$

• Solving for $m$ gives

$$m = \frac{1}{n} \sum_{i=1}^{n} Y_i = \bar{Y}$$
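The result can be illustrated numerically: the sum of squared deviations is smaller at the sample mean than at nearby values of $m$. The data and the function name `sum_of_squares` in this Python sketch are hypothetical.

```python
def sum_of_squares(y, m):
    """Least squares objective: sum over i of (Y_i - m)^2."""
    return sum((yi - m) ** 2 for yi in y)

y = [2.0, 4.0, 9.0]          # toy data, for illustration only
y_bar = sum(y) / len(y)      # sample mean

# The objective at the sample mean is below its value at neighbouring points.
for m in (y_bar - 0.1, y_bar, y_bar + 0.1):
    print(round(m, 2), round(sum_of_squares(y, m), 4))
```

This is a spot check at three points, not a proof; the derivation above establishes the minimum in general.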

The Simple Linear Regression Model

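$$Y_i = \beta_0 + \beta_1 X_i + u_i$$

• OLS minimizes the sum of squared prediction mistakes:

$$\sum_{i=1}^{n} \hat{u}_i^2 = \sum_{i=1}^{n} (Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i)^2$$

• Step 1:

$$\frac{\partial}{\partial \hat{\beta}_0} \sum_{i=1}^{n} (Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i)^2 = 0$$

• Step 2:

$$\frac{\partial}{\partial \hat{\beta}_1} \sum_{i=1}^{n} (Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i)^2 = 0$$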

Step 1: OLS estimator of β0

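$$\frac{\partial}{\partial \hat{\beta}_0} \sum_{i=1}^{n} \hat{u}_i^2 = -2 \sum_{i=1}^{n} (Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i) = 0$$

Dividing by $-2n$:

$$\frac{1}{n} \left( \sum_{i=1}^{n} Y_i - \sum_{i=1}^{n} \hat{\beta}_0 - \sum_{i=1}^{n} \hat{\beta}_1 X_i \right) = 0$$

$$\frac{1}{n} \sum_{i=1}^{n} Y_i - \frac{1}{n} n \hat{\beta}_0 - \hat{\beta}_1 \frac{1}{n} \sum_{i=1}^{n} X_i = 0$$

$$\bar{Y} - \hat{\beta}_0 - \hat{\beta}_1 \bar{X} = 0$$

• This gives

$$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$$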

Step 2: OLS estimator of β1

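$$\frac{\partial}{\partial \hat{\beta}_1} \sum_{i=1}^{n} \hat{u}_i^2 = -2 \sum_{i=1}^{n} X_i (Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i) = 0$$

Divide by $-2$ and substitute for $\hat{\beta}_0$:

$$\sum_{i=1}^{n} X_i \left( Y_i - (\bar{Y} - \hat{\beta}_1 \bar{X}) - \hat{\beta}_1 X_i \right) = 0$$

Rewrite:

$$\sum_{i=1}^{n} X_i \left[ (Y_i - \bar{Y}) - (\hat{\beta}_1 X_i - \hat{\beta}_1 \bar{X}) \right] = 0$$

Rewrite:

$$\sum_{i=1}^{n} X_i (Y_i - \bar{Y}) - \hat{\beta}_1 \sum_{i=1}^{n} X_i (X_i - \bar{X}) = 0$$

Algebra trick:

$$\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y}) - \hat{\beta}_1 \sum_{i=1}^{n} (X_i - \bar{X})(X_i - \bar{X}) = 0$$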

Step 2: OLS estimator of β1

Algebra trick:

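$$\begin{aligned}
\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y}) &= \sum_{i=1}^{n} X_i Y_i - \sum_{i=1}^{n} X_i \bar{Y} - \sum_{i=1}^{n} \bar{X} Y_i + \sum_{i=1}^{n} \bar{X} \bar{Y} \\
&= \sum_{i=1}^{n} X_i Y_i - \sum_{i=1}^{n} X_i \bar{Y} - n \bar{X} \left( \frac{1}{n} \sum_{i=1}^{n} Y_i \right) + n \bar{X} \bar{Y} \\
&= \sum_{i=1}^{n} X_i Y_i - \sum_{i=1}^{n} X_i \bar{Y} - n \bar{X} \bar{Y} + n \bar{X} \bar{Y} \\
&= \sum_{i=1}^{n} X_i Y_i - \sum_{i=1}^{n} X_i \bar{Y} \\
&= \sum_{i=1}^{n} X_i (Y_i - \bar{Y})
\end{aligned}$$

By a similar reasoning:

$$\sum_{i=1}^{n} X_i (X_i - \bar{X}) = \sum_{i=1}^{n} (X_i - \bar{X})(X_i - \bar{X})$$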

Step 2: OLS estimator of β1

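Setting $\frac{\partial}{\partial \hat{\beta}_1} \sum_{i=1}^{n} \hat{u}_i^2 = 0$ gives

$$\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y}) - \hat{\beta}_1 \sum_{i=1}^{n} (X_i - \bar{X})(X_i - \bar{X}) = 0$$

Solving for $\hat{\beta}_1$ gives the OLS estimator:

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})(X_i - \bar{X})} = \frac{\frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})(X_i - \bar{X})} = \frac{s_{XY}}{s_X^2}$$

The OLS predicted values $\hat{Y}_i$ and residuals $\hat{u}_i$ are:

$$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$$

$$\hat{u}_i = Y_i - \hat{Y}_i$$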
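The two estimator formulas can be sketched in plain Python. The function name `ols_fit` and the toy data are hypothetical; the last lines apply $\hat{\beta}_1 = s_{XY} / s_X^2$ to the sample covariance and the class size variance reported by Stata earlier in these slides.

```python
def ols_fit(x, y):
    """Return (beta0_hat, beta1_hat) for the regression of y on a constant and x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
    s_xx = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
    beta1_hat = s_xy / s_xx                  # slope: s_XY / s_X^2
    beta0_hat = y_bar - beta1_hat * x_bar    # intercept: Y_bar - beta1_hat * X_bar
    return beta0_hat, beta1_hat

# Toy data lying exactly on the line y = 1 + 2x, so OLS recovers (1, 2).
print(ols_fit([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0]))

# Slope implied by the reported covariance (-8.15932) and class size variance (3.57895).
slope_from_slides = -8.15932 / 3.57895
print(round(slope_from_slides, 2))
```

The implied slope agrees, up to rounding, with the class size coefficient in the regression output that follows.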

The Simple Linear Regression Model

Example: Class size and test scores

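$$\widehat{TestScore} = 698.9 - 2.28 \times ClassSize$$

[Figure: scatter plot of test score (vertical axis, 600 to 700) against class size (horizontal axis, 15 to 25) with the estimated regression line]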

The Simple Linear Regression Model

Example: Class size and test scores

$$TestScore_i = \beta_0 + \beta_1 ClassSize_i + u_i$$

    . regress test_score class_size, robust

    Linear regression                               Number of obs     =        420
                                                    F(1, 418)         =      19.26
                                                    Prob > F          =     0.0000
                                                    R-squared         =     0.0512
                                                    Root MSE          =     18.581

                 |               Robust
      test_score |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
      class_size |  -2.279808   .5194892    -4.39   0.000    -3.300945   -1.258671
           _cons |    698.933   10.36436    67.44   0.000     678.5602    719.3057

• $\hat{\beta}_1 = -2.28$: a reduction in class size by 1 student is associated with an increase in test scores of 2.28 points

• $\hat{\beta}_0 = 698.93$: the expected test score when class size is zero equals 698.93 (what does it mean for class size to be zero?)

$\bar{Y}$ is the ordinary least squares estimator of $\mu_Y$

Example: test scores

The sample mean of district average test scores is $\overline{TestScore} = 654.16$

    . mean test_score

    Mean estimation                   Number of obs   =        420

                 |       Mean   Std. Err.     [95% Conf. Interval]
    -------------+------------------------------------------------
      test_score |   654.1565   .9297082      652.3291     655.984

As shown earlier, we can also obtain the sample mean by OLS

    . regress test_score

          Source |       SS           df       MS      Number of obs   =       420
    -------------+----------------------------------   F(0, 419)       =      0.00
           Model |           0         0           .   Prob > F        =         .
        Residual |  152109.594       419  363.030056   R-squared       =    0.0000
    -------------+----------------------------------   Adj R-squared   =    0.0000
           Total |  152109.594       419  363.030056   Root MSE        =    19.053

      test_score |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           _cons |   654.1565   .9297082   703.61   0.000     652.3291     655.984

Measures of fit

How well does the estimated regression line describe the data?

• Does the regressor $X$ account for much or for little variation in $Y$?

• Are the observations in the scatter plot clustered closely around the regression line?

Two measures of how well the OLS line fits the data:

The $R^2$ measures the fraction of the variation in $Y_i$ explained/predicted by $X_i$

The standard error of the regression (SER) measures how far $Y_i$ typically is from its predicted value

The $R^2$

$R^2$ is the fraction of the sample variance of $Y_i$ explained/predicted by $X_i$

We can write

$$Y_i = \hat{Y}_i + \hat{u}_i$$

which implies that the $R^2$ is the ratio of the sample variance of $\hat{Y}_i$ and the sample variance of $Y_i$

$$R^2 = \frac{\text{Explained sum of squares}}{\text{Total sum of squares}} = \frac{ESS}{TSS} = \frac{\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2}{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}$$

The $R^2$ ranges from 0 to 1

• If $R^2 = 0$, $X_i$ explains none of the variation in $Y_i$

• If $R^2 = 1$, $X_i$ explains all of the variation in $Y_i$ ($Y_i = \hat{Y}_i$)

• In practice $0 < R^2 < 1$

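The $R^2$

The total sum of squares TSS can be decomposed into the explained sum of squares ESS and the residual sum of squares SSR:

$$TSS = ESS + SSR$$

$$\sum_{i=1}^{n} (Y_i - \bar{Y})^2 = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2 + \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$$

$$\sum_{i=1}^{n} (Y_i - \bar{Y})^2 = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2 + \sum_{i=1}^{n} \hat{u}_i^2$$

This implies that the $R^2$ can also be written as

$$R^2 = \frac{ESS}{TSS} = \frac{TSS - SSR}{TSS} = 1 - \frac{SSR}{TSS} = 1 - \frac{\sum_{i=1}^{n} \hat{u}_i^2}{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}$$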
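The decomposition and both expressions for $R^2$ can be verified on a small example. The Python sketch below uses hypothetical toy data and a hypothetical function name `fit_statistics`; it fits the OLS line and returns TSS, ESS, SSR and $R^2$.

```python
def fit_statistics(x, y):
    """Fit y on a constant and x by OLS; return (tss, ess, ssr, r_squared)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    beta1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    beta0 = y_bar - beta1 * x_bar
    fitted = [beta0 + beta1 * xi for xi in x]
    tss = sum((yi - y_bar) ** 2 for yi in y)                 # total sum of squares
    ess = sum((fi - y_bar) ** 2 for fi in fitted)            # explained sum of squares
    ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # residual sum of squares
    return tss, ess, ssr, ess / tss

# Toy data, for illustration only.
tss, ess, ssr, r2 = fit_statistics([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 4.0, 5.0, 4.0, 5.0])
print(round(tss, 4), round(ess, 4), round(ssr, 4), round(r2, 4))
print(abs(tss - (ess + ssr)) < 1e-9, abs(r2 - (1 - ssr / tss)) < 1e-9)
```

The decomposition relies on the regression including an intercept, as it does here.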

The $R^2$

Example: Class size and test scores

    . regress test_score class_size, robust

    Linear regression                               Number of obs     =        420
                                                    F(1, 418)         =      19.26
                                                    Prob > F          =     0.0000
                                                    R-squared         =     0.0512
                                                    Root MSE          =     18.581

                 |               Robust
      test_score |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
      class_size |  -2.279808   .5194892    -4.39   0.000    -3.300945   -1.258671
           _cons |    698.933   10.36436    67.44   0.000     678.5602    719.3057

$R^2 = 0.0512$

Note: the $R^2$ is uninformative about whether an increase in class size causes a reduction in test scores!

The standard error of the regression

• Another measure of fit is the SER.

The standard error of the regression (SER) is an estimator of the standard deviation of the regression error $u_i$

$$SER = s_{\hat{u}} = \sqrt{s_{\hat{u}}^2} = \sqrt{\frac{1}{n-2} \sum_{i=1}^{n} \hat{u}_i^2}$$

It measures the spread of the observations around the regression line in the units of the dependent variable

• The divisor $n - 2$ is used because 2 degrees of freedom were lost in estimating the two regression coefficients $\beta_0$ and $\beta_1$.
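A minimal Python sketch of the SER formula follows. The function names and the toy residuals are hypothetical. The second function is a consistency check under the identity $SSR = (1 - R^2) \cdot TSS$, using the total sum of squares (152109.594) and the $R^2$ (0.0512) from the Stata outputs shown earlier in these slides.

```python
import math

def ser_from_residuals(residuals):
    """SER = sqrt( sum of squared residuals / (n - 2) )."""
    n = len(residuals)
    return math.sqrt(sum(u ** 2 for u in residuals) / (n - 2))

def ser_from_r_squared(tss, r_squared, n):
    """SER implied by SSR = (1 - R^2) * TSS."""
    return math.sqrt((1.0 - r_squared) * tss / (n - 2))

# Toy residuals, for illustration only (they sum to zero, as OLS residuals do).
print(round(ser_from_residuals([-0.8, 0.6, 1.0, -0.6, -0.2]), 4))

# Numbers taken from the Stata outputs for the test score data.
print(round(ser_from_r_squared(152109.594, 0.0512, 420), 2))
```

The second value should be close to the Root MSE reported by Stata; small differences can arise because the $R^2$ is rounded to four decimals.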

The standard error of the regression

Example: Class size and test scores

    . regress test_score class_size, robust

    Linear regression                               Number of obs     =        420
                                                    F(1, 418)         =      19.26
                                                    Prob > F          =     0.0000
                                                    R-squared         =     0.0512
                                                    Root MSE          =     18.581

                 |               Robust
      test_score |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
      class_size |  -2.279808   .5194892    -4.39   0.000    -3.300945   -1.258671
           _cons |    698.933   10.36436    67.44   0.000     678.5602    719.3057

In Stata the SER is denoted as Root MSE.

SER = 18.6
