
Hypothesis Testing

The document provides an introduction to hypothesis testing in statistics, detailing the null and alternative hypotheses, significance testing, and methods for evaluating parameter estimates. It discusses the coefficient of determination (R²) as a measure of goodness of fit and outlines the overall significance of regression analysis using F-tests. Additionally, it covers model specification techniques, functional forms in multiple regression, and evaluation metrics for regression models.

Uploaded by

Vishali Suresh

Hypothesis Testing

An Introduction to Statistical Inference


Hypothesis Testing

In the hypothesis testing framework, there are always two hypotheses that go together, known as the null hypothesis (denoted H0 or occasionally HN) and the alternative hypothesis (denoted H1 or occasionally HA). The null hypothesis is the statement or the statistical hypothesis that is actually being tested. The alternative hypothesis represents the remaining outcomes of interest.
Table of Contents
01 Hypothesis Testing: General Approach (t-test), Confidence Interval Approach
02 Goodness of Fit: Coefficient of Determination
03 Test of Overall Significance: F Test
04 Specification: Choosing Explanatory Variables
05 Functional Forms: Model Selection
06 Evaluation: Model Evaluation
01
Hypothesis Testing
Significance of Parameters
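PRECISION OR STANDARD ERRORS
OF LEAST-SQUARES ESTIMATES
● In statistics the precision of an estimate is measured by its
standard error (se). Given the Gaussian assumptions, the
standard errors of the OLS estimates can be obtained as follows:

$\hat{\sigma}^2 = \dfrac{\sum \hat{u}_i^2}{n - 2}$

$\mathrm{SE}(\hat{\beta}_0) = \sqrt{\dfrac{\hat{\sigma}^2 \sum X_i^2}{n \sum (X_i - \bar{X})^2}}$

$\mathrm{SE}(\hat{\beta}_1) = \sqrt{\dfrac{\hat{\sigma}^2}{\sum (X_i - \bar{X})^2}}$

where $\hat{u}_i$ are the OLS residuals and n is the number of observations.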
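The standard-error calculation for the two-variable model can be sketched in Python as follows. This is a minimal illustration that assumes the usual textbook OLS formulas; the function name and the five data points are made up for the example and do not come from the slides.

```python
import math

def ols_with_standard_errors(x, y):
    """Fit Y = b0 + b1*X by OLS and return the estimates with their standard errors."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    # Residual variance uses n - 2 degrees of freedom (two estimated parameters).
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    sigma2_hat = rss / (n - 2)
    se_b1 = math.sqrt(sigma2_hat / sxx)
    se_b0 = math.sqrt(sigma2_hat * sum(xi ** 2 for xi in x) / (n * sxx))
    return {"b0": b0, "b1": b1, "sigma2": sigma2_hat, "se_b0": se_b0, "se_b1": se_b1}

# Hypothetical data, used only to exercise the function.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
fit = ols_with_standard_errors(x, y)
print(fit)
```

A smaller standard error means the estimate would vary less from sample to sample, which is what the t-tests that follow rely on.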
Test of Hypothesis
for the Parameter Estimates
● A test of hypothesis is a procedure by which sample
results are used to verify the truth or falsity of a
null hypothesis, using a test statistic obtained from
the data at hand.
● Let $H_0: \beta = k$ and $H_1: \beta \neq k$
● Test statistic: $t = \dfrac{\hat{\beta} - k}{\mathrm{SE}(\hat{\beta})}$
follows the t distribution with n − 2 degrees of
freedom.
● Accept the null hypothesis if $|t| \le t_{\alpha/2}$
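Test of Significance
for the Parameter Estimates
● Let $H_0: \beta = 0$ (parameter is not significant) &
$H_1: \beta \neq 0$ (parameter is significant)
● Test statistic: $t = \dfrac{\hat{\beta} - 0}{\mathrm{SE}(\hat{\beta})}$ or $t = \dfrac{\hat{\beta}}{\mathrm{SE}(\hat{\beta})}$
follows the t distribution with n − 2 degrees of
freedom.
● Accept the null hypothesis if $|t| \le t_{\alpha/2}$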
Testing the statistical significance of β1
$H_0: \beta_1 = 0$
$H_1: \beta_1 \neq 0$
H0 can be tested using the test statistic:

$t_1 = \dfrac{\hat{\beta}_1}{\mathrm{SE}(\hat{\beta}_1)}$

[Figure: sampling distribution of $\hat{\beta}_1$, showing the true value of the parameter β1 and the estimated values of β1 from different samples from the population.]

[Figure: t distribution at level of significance α (here 5%), with a 95% acceptance region between $-t_{\alpha/2}$ and $t_{\alpha/2}$ and a 2.5% rejection region in each tail.]

Accept the null hypothesis if $|t_1| \le t_{\alpha/2}$
Example

$\hat{Y}_i = 55.72 + 0.291 X_i$    n = 15
SE    (5.31)    (0.018)    $R^2 = 0.95$
Where Y is per capita household expenditure on food and X is
per capita total household expenditure
● Interpret the results
● Test the significance of parameter estimates at
5% level of significance.
Example
○ $H_0: \beta_0 = 0$ (parameter is not significant) &
$H_1: \beta_0 \neq 0$ (parameter is significant)
$t_0 = \dfrac{\hat{\beta}_0}{\mathrm{SE}(\hat{\beta}_0)} = \dfrac{55.72}{5.31} = 10.49$
$t_{\alpha/2} = 2.16$ (n − 2 = 13 degrees of freedom)
$t_0 > t_{\alpha/2}$, so reject H0
$\hat{\beta}_0$ is significant
Example
○ $H_0: \beta_1 = 0$ (parameter is not significant) &
$H_1: \beta_1 \neq 0$ (parameter is significant)
$t_1 = \dfrac{\hat{\beta}_1}{\mathrm{SE}(\hat{\beta}_1)} = \dfrac{0.291}{0.018} = 16.167$
$t_{\alpha/2} = 2.16$ (n − 2 = 13 degrees of freedom)
$t_1 > t_{\alpha/2}$, so reject H0
$\hat{\beta}_1$ is significant
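The two significance tests in the food expenditure example can be reproduced with a few lines of Python. This is only a sketch: the helper names are hypothetical, and the critical value 2.16 is taken from the slide rather than computed.

```python
def t_ratio(estimate, std_error, hypothesized=0.0):
    """t statistic for H0: beta = hypothesized."""
    return (estimate - hypothesized) / std_error

def reject_null(t_stat, t_critical):
    """Two-tailed decision rule: reject H0 when |t| exceeds the critical value."""
    return abs(t_stat) > t_critical

T_CRITICAL = 2.16  # t(alpha/2) for 13 degrees of freedom at the 5% level, from the slide

t0 = t_ratio(55.72, 5.31)    # intercept
t1 = t_ratio(0.291, 0.018)   # slope

for name, t in (("intercept", t0), ("slope", t1)):
    verdict = "significant" if reject_null(t, T_CRITICAL) else "not significant"
    print(f"{name}: t = {t:.3f}, {verdict}")
```

Passing a non-zero `hypothesized` value turns the same function into the general test of H0: β = k.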
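The confidence interval approach to hypothesis
testing

$[\, \hat{\beta}_1 - t_{\alpha/2}\,\mathrm{SE}(\hat{\beta}_1),\; \hat{\beta}_1 + t_{\alpha/2}\,\mathrm{SE}(\hat{\beta}_1) \,]$

[Figure: sampling distribution of $\hat{\beta}_1$ for α = 5%, with 95% of the area between the lower limit L = $\hat{\beta}_1 - t_{\alpha/2}\,\mathrm{SE}(\hat{\beta}_1)$ and the upper limit U = $\hat{\beta}_1 + t_{\alpha/2}\,\mathrm{SE}(\hat{\beta}_1)$, and 2.5% in each tail.]

If the hypothesized value of β1 lies inside this interval, accept the null hypothesis; otherwise reject it.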
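As an illustration of the confidence interval approach, the sketch below builds the 95% interval for the slope in the food expenditure example (estimate 0.291, standard error 0.018, critical value 2.16). The function name is an assumption made for this example.

```python
def confidence_interval(estimate, std_error, t_critical):
    """Return (lower, upper) limits: estimate -/+ t_critical * standard error."""
    margin = t_critical * std_error
    return estimate - margin, estimate + margin

lower, upper = confidence_interval(0.291, 0.018, 2.16)
print(f"95% CI for the slope: [{lower:.5f}, {upper:.5f}]")

# H0: beta1 = 0 is rejected when 0 falls outside the interval.
zero_inside = lower <= 0.0 <= upper
print("reject H0" if not zero_inside else "do not reject H0")
```

Because zero lies outside the interval, this approach reaches the same conclusion as the t-test.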
02
Goodness of Fit
THE COEFFICIENT OF DETERMINATION R²:
A MEASURE OF “GOODNESS OF FIT”
● The coefficient of determination R² is a summary measure that tells how well the
sample regression line fits the data.

$0 \le R^2 \le 1$

$R^2 = 1 - \dfrac{\mathrm{RSS}}{\mathrm{TSS}} = 1 - \dfrac{\sum \hat{u}_i^2}{\sum (Y_i - \bar{Y})^2}$

$R^2 = \dfrac{\mathrm{ESS}}{\mathrm{TSS}} = \dfrac{\sum (\hat{Y}_i - \bar{Y})^2}{\sum (Y_i - \bar{Y})^2}$

where TSS is the total, ESS the explained and RSS the residual sum of squares.
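A short Python sketch can confirm that the two expressions for R² agree for an OLS fit with an intercept. The data points and the function name are invented for illustration only.

```python
def r_squared_both_ways(x, y):
    """Return R² computed as 1 - RSS/TSS and as ESS/TSS for a simple OLS fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
          / sum((xi - x_bar) ** 2 for xi in x))
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * xi for xi in x]
    tss = sum((yi - y_bar) ** 2 for yi in y)                 # total variation
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # unexplained variation
    ess = sum((fi - y_bar) ** 2 for fi in fitted)            # explained variation
    return 1 - rss / tss, ess / tss

# Hypothetical data, used only to exercise the function.
r2_from_rss, r2_from_ess = r_squared_both_ways([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(r2_from_rss, r2_from_ess)
```

The equality of the two values depends on TSS = ESS + RSS, which holds when the regression includes an intercept.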
03
Test of Overall Significance
TEST OF THE OVERALL SIGNIFICANCE OF THE
REGRESSION
● The overall significance of the regression can be
tested with the ratio of the explained to the
unexplained variance. This follows an F
distribution with k – 1 and n – k degrees of
freedom, where n is number of observations and k
is number of parameters estimated:
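● $H_0: \beta_1 = \beta_2 = \dots = \beta_{k-1} = 0$ (all slope coefficients are zero)
$H_1$: not all $\beta_i$ values are 0
● $F_{k-1;\,n-k} = \dfrac{\sum (\hat{Y}_i - \bar{Y})^2 / (k-1)}{\sum \hat{u}_i^2 / (n-k)} = \dfrac{R^2/(k-1)}{(1-R^2)/(n-k)}$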
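Testing the Overall significance of regression
(food expenditure example: n = 15, k = 2)
$H_0: \beta_1 = 0$
$H_1: \beta_1 \neq 0$
$F_{k-1;\,n-k} = F_{1;\,13} = \dfrac{\sum (\hat{Y}_i - \bar{Y})^2 / (k-1)}{\sum \hat{u}_i^2 / (n-k)} = 246.75$
$F_\alpha = 4.67$
$F_{stat} > F_\alpha$, so reject the null hypothesis:
the regression is significant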
TEST OF THE OVERALL SIGNIFICANCE OF THE
REGRESSION
● If the calculated F ratio exceeds the tabular value of F at
the specified level of significance and degrees of
freedom, the hypothesis is accepted that the regression
parameters are not all equal to zero and that R2 is
significantly different from zero.
● A ‘‘high’’ value for the F statistic suggests a significant
relationship between the dependent and independent
variables, leading to the rejection of the null hypothesis
that the coefficients of all explanatory variables are
jointly zero.
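The F ratio can also be computed directly from R², n and k. The sketch below assumes the standard identity F = [R²/(k − 1)] / [(1 − R²)/(n − k)] and applies it to the food expenditure example (R² = 0.95, n = 15, k = 2); the function name is hypothetical. With the rounded R² the result is about 247, close to the 246.75 reported on the slide, and the small gap presumably reflects rounding of R².

```python
def f_statistic(r_squared, n_obs, n_params):
    """Overall-significance F ratio with (k - 1, n - k) degrees of freedom."""
    explained = r_squared / (n_params - 1)
    unexplained = (1 - r_squared) / (n_obs - n_params)
    return explained / unexplained

F_CRITICAL = 4.67  # F(alpha) for (1, 13) degrees of freedom at the 5% level, from the slide

f_stat = f_statistic(0.95, 15, 2)
print(f"F = {f_stat:.2f}")
print("regression is significant" if f_stat > F_CRITICAL else "regression is not significant")
```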
04
Specification
Model specification

● Model specification refers to the determination of which independent variables should be included in or excluded from a regression equation.
● In general, the specification of a regression model should be based primarily on theoretical considerations rather than empirical or methodological ones.

But there is a useful procedure, known as stepwise regression, which can aid you in the development of your model.
● “Forward” stepwise regression
● “Backwards” stepwise regression
● “General” stepwise regression
Model specification
● In a “forward” stepwise regression analysis, the computer will begin by examining every possible
simple linear regression model, and will show you the one with the highest coefficient of
determination. Keeping the independent variable just selected for the first model, the computer
will next examine every two-independent-variables model which includes the already-selected
variable and one other, and will show you the one with the highest coefficient of determination.
● And so on, adding one variable at a time, the computer will eventually hand you a sequence of
models it deems worthy of consideration.
● “Backwards” stepwise regression begins with the model including all of the potential
independent variables, and successively throws out those which cost the least in terms of
reduction of the coefficient of determination. The sequence of models so generated may be
different from that generated using forward stepwise regression.
● “General” stepwise regression goes merrily on its way, including variables which look good, and
throwing them away later if their contribution to the explanatory power of some later model is
not too great, and eventually yields a single model.
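The forward procedure described above can be sketched in plain Python. This is a simplified illustration, not the routine of any particular statistics package: it adds, at each step, the candidate that gives the highest R², and returns the whole sequence of models without a stopping rule. The helper names and the small data set are assumptions made for the example.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def r_squared(columns, y):
    """R² of an OLS fit of y on an intercept plus the given regressor columns."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(p)] for a in range(p)]
    xty = [sum(row[a] * yi for row, yi in zip(design, y)) for a in range(p)]
    beta = solve(xtx, xty)  # normal equations (X'X) beta = X'y
    fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in design]
    y_bar = sum(y) / n
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    tss = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - rss / tss

def forward_stepwise(candidates, y):
    """Add one variable at a time, always choosing the one with the highest R²."""
    selected, path = [], []
    remaining = list(candidates)
    while remaining:
        scored = [(r_squared([candidates[v] for v in selected + [name]], y), name)
                  for name in remaining]
        best_r2, best = max(scored)
        selected.append(best)
        remaining.remove(best)
        path.append((list(selected), best_r2))
    return path

# Hypothetical data: y is built as 2*x1 - 0.1 + 0.2*x3, so x2 adds nothing new.
candidates = {
    "x1": [1, 2, 3, 4, 5, 6],
    "x2": [5, 3, 6, 2, 4, 1],
    "x3": [1, 0, 1, 0, 1, 0],
}
y = [2.1, 3.9, 6.1, 7.9, 10.1, 11.9]
path = forward_stepwise(candidates, y)
for variables, r2 in path:
    print(variables, round(r2, 4))
```

Since R² never falls when a variable is added, the sequence alone does not say where to stop; that choice still rests on theory or on a criterion such as adjusted R².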
05
Functional forms
Multiple Regression -Example
● ri = β0 + β1Si + β2MBi + β3PEi + β4BETAi + ui

○ where: ri is the percentage annual return for the stock

○ Si is the size of firm i measured in terms of sales revenue

○ MBi is the market to book ratio of the firm

○ PEi is the price/earnings (P/E) ratio of the firm

○ BETAi is the stock’s CAPM beta coefficient


● For the above model, you obtain the following results from a cross-sectional regression with
200 firms (with standard errors in parentheses):
● r̂i = 0.080 + 0.801Si + 0.321MBi + 0.164PEi − 0.084BETAi

(0.064) (0.147) (0.136) (0.420) (0.120)


● Calculate the t-ratios. What do you conclude about the effect of each variable on the
returns of the security?
Each t-ratio is the coefficient estimate divided by its standard error:

t0 = 0.080 / 0.064 = 1.25
t1 = 0.801 / 0.147 = 5.449
t2 = 0.321 / 0.136 = 2.36
t3 = 0.164 / 0.420 = 0.390
t4 = −0.084 / 0.120 = −0.70

tα/2 = 1.972

Since |t| > tα/2 only for t1 and t2, only the partial slope coefficients of S and MB are statistically significant.
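The same t-ratios can be checked in a few lines of Python. The dictionary layout and variable names are assumptions for this sketch; the estimates, standard errors and the critical value 1.972 come from the slide.

```python
# (estimate, standard error) for each term of the stock-return regression
estimates = {
    "intercept": (0.080, 0.064),
    "S": (0.801, 0.147),
    "MB": (0.321, 0.136),
    "PE": (0.164, 0.420),
    "BETA": (-0.084, 0.120),
}
T_CRITICAL = 1.972  # two-tailed 5% critical value, from the slide

t_ratios = {name: est / se for name, (est, se) in estimates.items()}
significant = [name for name, t in t_ratios.items() if abs(t) > T_CRITICAL]

for name, t in t_ratios.items():
    print(f"{name}: t = {t:.3f}")
print("significant:", significant)
```

The sign of the BETA ratio is negative; only its absolute value matters for the two-tailed comparison.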


TEST OF THE OVERALL SIGNIFICANCE OF THE REGRESSION

$\widehat{Earnings}_i = -28.008 + 2.723 S_i + 0.608 Exp_i$    n = 439
SE    (4.643)    (0.26)    (0.14)    $R^2 = 0.2095$, $\bar{R}^2 = 0.2058$
Where S – years of schooling, Exp – years of experience
● Interpret the results
● Test the overall significance
● Test the significance of parameters
● Construct 95% confidence intervals

             Coefficients   SE      t Stat   Lower 95%   Upper 95%
Intercept    -28.008        4.643   -6.03    -37.134     -18.882
Schooling      2.723        0.26    10.473     2.21        3.234
Experience     0.608        0.14     4.343     0.332       0.883

$F_{2;\,436} = \dfrac{R^2/(k-1)}{(1-R^2)/(n-k)} = \dfrac{0.2095/2}{0.7905/436} = 57.77$
$F_\alpha = 3.02$
$t_{\alpha/2} = 1.96$
06
Evaluation
Model Evaluation
Different metrics are used to evaluate regression
models. Regression performance metrics quantify how
close model predictions are to actual (true) values.

● Mean Absolute Error (MAE)
● Mean Absolute Percent Error (MAPE)
● Root Mean Square Error (RMSE)
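The three metrics can be sketched in Python using their common textbook definitions, which the slide itself does not spell out. The function names and the sample values are illustrative assumptions.

```python
import math

def mae(actual, predicted):
    """Mean Absolute Error: average size of the prediction errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean Absolute Percent Error, in percent; actual values must be non-zero."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root Mean Square Error: penalizes large errors more heavily than MAE."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Hypothetical actual and predicted values.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
print(mae(actual, predicted), mape(actual, predicted), rmse(actual, predicted))
```

MAE and RMSE are in the units of the dependent variable, while MAPE is unit-free, which makes it easier to compare across data sets.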
