
R18: LINEAR REGRESSION

REGRESSION ANALYSIS

 Regression analysis seeks to measure how changes in one variable, called the
dependent (or explained) variable, can be explained by changes in one or more
other variables, called the independent (or explanatory) variables.
 This relationship is captured by estimating a linear equation.
 Error may be reduced by using more independent variables or by using different,
more appropriate independent variables.
LINEAR REGRESSION CONDITIONS

 To use linear regression, three conditions need to be satisfied:


 The relationship between Y and X should be linear
 The error term must be additive (i.e., the variance of the error term is
independent of the observed data)
 All X variables should be observable (i.e., the model is not appropriate when
there is missing data)
 Appropriate transformations of the independent variable(s) can make a
nonlinear relationship amenable to being fitted with a linear model (see the
sketch at the end of this section)
 Independent variable data are transformed first, then the transformed values are
used in the linear equation as X
 The dependent variable must be a linear function of the coefficients.
 For example, consider an unknown parameter, p, in the function Y = α + βX^p + ε. Here,
βX^p contains two unknown parameters (β and p), and p enters through the exponent
rather than multiplicatively, so the model is not linear in its coefficients and it would not
be appropriate to apply linear regression in such a case.
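 As an illustration of the transformation point above, a minimal sketch (the data,
variable names, and use of numpy are illustrative assumptions, not from the original notes):

import numpy as np

rng = np.random.default_rng(42)
x = rng.uniform(1, 10, size=200)
y = 2.0 + 0.5 * x**2 + rng.normal(0, 1, size=200)   # nonlinear in x, but linear in the coefficients

x_transformed = x**2                                 # transform the independent variable first
X = np.column_stack([np.ones_like(x_transformed), x_transformed])
alpha_hat, beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]   # ordinary least squares fit
print(alpha_hat, beta_hat)                           # estimates should be close to 2.0 and 0.5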
QUIZ

1. Generally, if the value of the independent variable is zero, then the


expected value of the dependent variable would be equal to the:
A. slope coefficient.
B. intercept coefficient.
C. error term.
D. residual.

2. The error term represents the portion of the:


A. dependent variable that is not explained by the independent variable(s)
but could possibly be explained by adding additional independent variables.
B. dependent variable that is explained by the independent variable(s).
C. independent variables that are explained by the dependent variable.
D. dependent variable that is explained by the error in the independent
variable(s).
ORDINARY LEAST SQUARES ESTIMATION

 Ordinary least squares (OLS) estimation is a process that estimates the
parameters α and β so as to minimize the sum of the squared residuals (i.e., error
terms)
 Rewriting our regression equation, Yi = α + βXi + εi, as εi = Yi − α − βXi
→ the OLS sample coefficients are those that minimize Σεi² = Σ(Yi − α − βXi)²
 The estimated slope coefficient (β) for the regression line describes the
change in Y for a one-unit change in X:
β = Cov(X, Y) / Var(X) = Σ(Xi − X̄)(Yi − Ȳ) / Σ(Xi − X̄)²
 The intercept term (α) is the line’s intersection with the Y-axis at X = 0:
α = Ȳ − βX̄
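 A minimal numerical sketch of these slope and intercept formulas (hypothetical data;
numpy assumed):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

beta_hat = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)   # slope = Cov(X, Y) / Var(X)
alpha_hat = y.mean() - beta_hat * x.mean()                  # intercept = Y-bar - slope * X-bar
print(alpha_hat, beta_hat)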
INTERPRETING REGRESSION RESULTS

 Example:
DUMMY VARIABLES

 Observations for most independent variables (e.g., firm size, level of GDP,
and interest rates) can take on a wide range of values.
 However, there are occasions when the independent variable is binary in
nature—it is either on or off.
 Independent variables that fall into this category are called dummy variables
and are often used to quantify the impact of qualitative variables.
 Dummy variables are assigned a value of 0 or 1.
 For example, in a time series regression of monthly stock returns, you could
employ a January dummy variable that would take on the value of 1 if a stock
return occurred in January, and 0 if it occurred in any other month.
 The purpose of including the January dummy variable would be to see if stock
returns in January were significantly different from stock returns in all other months
of the year.
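 A sketch of how such a January dummy could be constructed and used (the data, dates,
and use of pandas/numpy are illustrative assumptions):

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
dates = pd.date_range("2015-01-31", periods=60, freq="M")      # 60 monthly observations
returns = pd.Series(rng.normal(0.01, 0.04, size=60), index=dates)

jan_dummy = (returns.index.month == 1).astype(int)             # 1 in January, 0 in all other months
X = np.column_stack([np.ones(len(returns)), jan_dummy])
coeffs = np.linalg.lstsq(X, returns.values, rcond=None)[0]
print(coeffs[1])   # difference between average January return and average non-January return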
Coefficient of Determination of a Regression (R²)

 The R² of a regression model captures the fit of the model; it represents the
proportion of variation in the dependent variable that is explained by the
independent variable(s).
 For a regression model with a single independent variable, R² is the square of the
correlation between the independent and dependent variable.
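 A quick numerical check of this property for a single-variable regression (hypothetical
data; numpy assumed):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.2, 1.9, 3.2, 3.8, 5.1, 5.8])

beta = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
alpha = y.mean() - beta * x.mean()
y_hat = alpha + beta * x

r_squared = np.sum((y_hat - y.mean())**2) / np.sum((y - y.mean())**2)   # ESS / TSS
corr = np.corrcoef(x, y)[0, 1]
print(r_squared, corr**2)   # the two values match for a single-variable regression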
ASSUMPTIONS UNDERLYING LINEAR REGRESSION

 The expected value of the error term, conditional on the independent


variable, is zero [E(εi|Xi) = 0]
 This means X has no information about the location of ε
 This assumption is not directly testable
 Evaluation of whether this assumption is reasonable requires an examination of
the data generating process. Generally, a violation would be evidenced by the
following:
 Survivorship bias: occurs when the observations are collected after-the-fact (companies
that get dropped from an index are not included in the sample)
 Sample selection bias: occurs when occurrence of an event (i.e., an observation) is
contingent on specific outcomes.
 Simultaneity bias: happens when the values of X and Y are simultaneously determined
 Omitted variables: important explanatory (i.e., X) variables are excluded from the
model, and the errors will capture the influence of the omitted variables. Omitting
important variables causes the coefficients to be biased and may indicate nonexistent
(i.e., misleading) relationships
 Attenuation bias: occurs when X variables are measured with error and leads to
underestimation of the regression coefficients
ASSUMPTIONS UNDERLYING LINEAR REGRESSION

 All (X, Y) observations are independent and identically distributed (i.i.d.)


 Variance of X is positive (otherwise estimation of β would not be possible)
 Variance of the errors is constant (i.e., homoskedasticity)
 It is unlikely that large outliers will be observed in the data. OLS estimates
are sensitive to outliers, and large outliers have the potential to create
misleading regression results
→ Collectively, these assumptions ensure that the regression estimators are
unbiased.
They also ensure that the estimators are normally distributed and, as a
result, allow for hypothesis testing
PROPERTIES OF OLS ESTIMATORS

 Because OLS estimators are derived from random samples, these estimators
are also random variables because they vary from one sample to the next
 OLS estimators will have their own probability distributions (i.e., sampling
distributions)
 These sampling distributions allow us to estimate population parameters, such as
the population mean, the population regression intercept term, and the
population regression slope coefficient
 Drawing multiple samples from a population will produce multiple sample
means.
 The distribution of these sample means is referred to as the sampling distribution
of the sample mean.
 The mean of this sampling distribution is used as an estimator of the population
mean and is said to be an unbiased estimator of the population mean.
 An unbiased estimator is one for which the expected value of the estimator is
equal to the parameter you are trying to estimate.
PROPERTIES OF OLS ESTIMATORS

 Given the central limit theorem (CLT), for large sample sizes, it is reasonable to
assume that the sampling distribution will approach the normal distribution.
 This means that the estimator is also a consistent estimator. A consistent estimator is
one for which the accuracy of the parameter estimate increases as the sample size
increases
 Like the sampling distribution of the sample mean, OLS estimators for the
population intercept term and slope coefficient also have sampling
distributions.
 The OLS estimators α and β are unbiased and consistent estimators of the
respective population parameters.
 Being able to assume that α and β are normally distributed is a key property in
allowing us to make statistical inferences about population coefficients
 The variance of the slope (β) increases with variance of the error and decreases
with the variance of the explanatory variable
 The variance of the slope indicates the reliability of the sample estimate of the
coefficient, and the higher the variance of the error, the lower the reliability of the
coefficient estimate
 Higher variance of the explanatory (X) variable(s) indicates that there is sufficient
diversity in observations (i.e., the sample is representative of the population) and,
hence, lower variability (and higher confidence) of the slope estimate
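 For a single-variable regression, this relationship follows from the standard result
Var(β) = σε² / Σ(Xi − X̄)², where σε² is the variance of the error term: a larger error
variance raises the variance of the slope estimate, while greater variation in X lowers it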
QUIZ
2. What is the most appropriate interpretation of a slope coefficient estimate equal to 10.0?
A. The predicted value of the dependent variable when the independent variable is zero is
10.0.
B. The predicted value of the independent variable when the dependent variable is zero is 0.1.
C. For every one unit change in the independent variable, the model predicts that the
dependent variable will change by 10 units.
D. For every one unit change in the independent variable, the model predicts that the
dependent variable will change by 0.1 units.

3. The reliability of the estimate of the slope coefficient in a regression model is most likely:
A. positively affected by the variance of the residuals and negatively affected by the
variance of the independent variables.
B. negatively affected by the variance of the residuals and negatively affected by the
variance of the independent variables.
C. positively affected by the variance of the residuals and positively affected by the variance
of the independent variables.
D. negatively affected by the variance of the residuals and positively affected by
the variance of the independent variables.
HYPOTHESIS TESTING

 The steps in the hypothesis testing procedure for regression coefficients are
as follows:
 Specify the hypothesis to be tested.
 Calculate the test statistic.
 Reject or fail to reject the null hypothesis after comparing the test statistic to its
critical value
 The estimated slope coefficient, β, will be normally distributed with a
standard deviation known as the standard error of the regression
coefficient (Sb) → Hypothesis testing can be conducted with the sample value
of the coefficient and its standard error
 Suppose we want to test the hypothesis that the value of the slope
coefficient is equal to β0

 We would use the t-statistic: t = (β − β0) / Sb, with n − 2 degrees of freedom


→ If the absolute value of the test statistic > critical t-value → Reject the null hypothesis
CONFIDENCE INTERVALS

 The confidence interval of the slope coefficient = β ± (tc × Sb)


 tc is the critical value for a given level of significance and degrees of freedom
(n − 2)
 If we correctly rejected the null hypothesis of β = 0 in our hypothesis test,
zero does not fall in the confidence interval
 In other words, if the hypothesized value of the slope coefficient falls outside of
the confidence interval, we can reject the null.
 If it falls inside the confidence interval, we fail to reject the null hypothesis
CONFIDENCE INTERVALS

 Example: A regression model estimated using 46 observations has β = 0.76


and a coefficient standard error Sb = 0.33. Determine if the slope coefficient is
statistically different from zero at the 5% level of significance. The critical t-value
for a sample size of 46 and a 5% level of significance is 2.02.
 Also calculate the confidence interval of the slope coefficient.
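 Solution: t = 0.76 / 0.33 ≈ 2.30 > 2.02 → reject the null hypothesis that β = 0; the slope
is statistically different from zero at the 5% level of significance.
 Confidence interval: 0.76 ± (2.02 × 0.33) ≈ 0.09 to 1.43, which does not contain zero,
consistent with rejecting the null.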
THE P-VALUE

 The p-value is the smallest level of significance for which the null hypothesis
can be rejected.
 An alternative method of doing hypothesis testing of regression coefficients
is to compare the p-value to the significance level:
 If the p-value is less than the significance level, the null hypothesis can be
rejected.
 If the p-value is greater than the significance level, the null hypothesis cannot be
rejected.
 In general, regression outputs will provide the p-value for the standard
hypothesis (Ho: β = 0 versus Ha: β ≠ 0).
 Consider again the example where β = 0.76, Sb = 0.33, and the level of
significance is 5%. The regression output provides a p-value = 0.026.
Because the p-value < level of significance, we reject the null hypothesis
that β = 0, which is the same result as the one we got when performing the
t-test.
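 The same p-value can be recovered from the t-statistic; a minimal check (use of scipy
here is an assumption, not part of the original notes):

from scipy import stats

t_stat = 0.76 / 0.33            # estimated coefficient divided by its standard error
df = 46 - 2                     # n - 2 degrees of freedom for a single regression
p_value = 2 * stats.t.sf(abs(t_stat), df)   # two-tailed p-value
print(round(p_value, 3))        # approximately 0.026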
QUIZ
Bob Shepperd is trying to forecast the 10-year T-bond yield. Shepperd tries a variety of explanatory variables in several
iterations of a single-variable model. Partial results are provided below (note that these represent three separate
one-variable regressions):

The critical t-value at 5% level of significance is equal to 2.02.


1. For the regression model involving inflation as the explanatory variable, the confidence interval for the slope
coefficient is closest to:
A. −0.27 to 2.43.
B. 0.26 to 2.43.
C. −2.27 to 2.43.
D. 0.22 to 1.88.
2. For the regression model involving unemployment rate as the explanatory variable, what are the results of a
hypothesis test that the slope coefficient is equal to 0.20 (vs. not equal to 0.20) at 5% level of significance?
A. The coefficient is not significantly different from 0.20 because the p-value is <0.001.
B. The coefficient is significantly different from 0.20 because the t-value is 2.33, which is greater than the critical t-value
of 2.02.
C. The coefficient is significantly different from 0.20 because the t-value is −5.67.
D. The coefficient is not significantly different from 0.20 because the t-value is −2.33.
R19: REGRESSION WITH MULTIPLE
EXPLANATORY VARIABLES
MULTIPLE REGRESSION

 General form of the multiple regression model: Y = α + β1X1 + β2X2 + … + βkXk + ε

 Assumptions of multiple regression:


 The expected value of the error term, conditional on the independent variables,
is zero: [E(εi|Xi’s) = 0].
 All (Xs and Y) observations are i.i.d.
 The variance of X is positive (otherwise estimation of β would not be possible).
 The variance of the errors is constant (i.e., homoskedasticity).
 There are no outliers observed in the data
 X variables are not perfectly correlated (i.e., they are not perfectly linearly
dependent). In other words, each X variable in the model should have some
variation that is not fully explained by the other X variables (unique to multiple
regression)
MULTIPLE REGRESSION

 The interpretation of the slope coefficient is that it captures the change in


the dependent variable for a one-unit change in the independent
variable, holding the other independent variables constant
 As a result, the slope coefficients in a multiple regression are sometimes called
partial slope coefficients
 The ordinary least squares (OLS) estimation process for multiple regression
differs from single regression.
 In a stepwise fashion, each explanatory variable is first regressed against the other
explanatory variables, and the residuals from these regressions then serve as
explanatory variables in a regression involving the dependent variable
 Consider a simple, two-independent-variable model: Y = α + β1X1 + β2X2 + ε

 First, we estimate the residuals in the following model using OLS estimation techniques for
a single regression: X1 = γ0 + γ1X2 + u → residuals û
 We then do the same, but this time estimate the residuals in the model
Y = δ0 + δ1X2 + v → residuals v̂
 The residuals v̂ are regressed against the residuals û from the first step to estimate the
slope coefficient β1: v̂ = β1û + error
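 A minimal numerical sketch of this partialling-out procedure (hypothetical data; numpy
assumed), showing that the residual-on-residual slope matches the coefficient from the
full multiple regression:

import numpy as np

rng = np.random.default_rng(1)
n = 500
x2 = rng.normal(size=n)
x1 = 0.6 * x2 + rng.normal(size=n)                 # x1 is correlated with x2
y = 1.0 + 2.5 * x1 + 6.0 * x2 + rng.normal(size=n)

def ols_residuals(target, regressor):
    # Residuals from a single regression of target on a constant and one regressor
    X = np.column_stack([np.ones(len(regressor)), regressor])
    coef = np.linalg.lstsq(X, target, rcond=None)[0]
    return target - X @ coef

u = ols_residuals(x1, x2)                          # step 1: residuals of x1 regressed on x2
v = ols_residuals(y, x2)                           # step 2: residuals of y regressed on x2
beta1_partial = (u @ v) / (u @ u)                  # step 3: regress v on u
print(beta1_partial)                               # close to 2.5, the partial slope on x1

X_full = np.column_stack([np.ones(n), x1, x2])
print(np.linalg.lstsq(X_full, y, rcond=None)[0])   # full regression: intercept, beta1, beta2 agree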
INTERPRETING MULTIPLE REGRESSION RESULTS

 Suppose we run a regression of the dependent variable Y on a single


independent variable X1 and get the following result: Y = 2.0 + 4.5X1
 Interpretation of the estimated slope coefficient: if X1 increases by 1 unit, we
would expect Y to increase by 4.5 units

 Now suppose we add a second independent variable X2 to the regression


and get the following result: Y = 1.0 + 2.5X1 + 6.0X2
 The estimated slope coefficient for X1 changed from 4.5 to 2.5 when we added X2 to
the regression → we should expect this to happen most of the time when a second
variable is added (unless X2 is uncorrelated with X1), because if X1 increases by 1 unit,
then we would expect X2 to change as well
 Interpretation of the estimated slope coefficient for X1: if X1 increases by 1
unit, we would expect Y to increase by 2.5 units, holding X2 constant
INTERPRETING MULTIPLE REGRESSION RESULTS
QUIZ
MEASURES OF FIT IN LINEAR REGRESSION

 Standard error of the regression (SER) measures the uncertainty about the
accuracy of the predicted values of the dependent variable.
 Graphically, the relationship is stronger when the actual (X, Y) data points lie closer
to the regression line
 OLS estimation minimizes the sum of the squared differences
between the predicted value and the actual value for each observation:
RSS = Σ(Yi − Ŷi)², and SER = √[RSS / (n − k − 1)]
 The total variation in the dependent variable can be decomposed as:
Σ(Yi − Ȳ)² = Σ(Ŷi − Ȳ)² + Σ(Yi − Ŷi)²

 Or: TSS = ESS + RSS, where TSS is the total sum of squares, ESS is the explained sum
of squares, and RSS is the residual sum of squares
COEFFICIENT OF DETERMINATION

 Dividing both sides by TSS, we see that 1 = (ESS/TSS) + (RSS/TSS)


 The first term on the right side captures the proportion of variation in Y that is
explained. This proportion is the coefficient of determination (R²) of a multiple
regression and is a goodness-of-fit measure
→ R² = ESS/TSS = % of variation explained by the regression model
→ For a multiple regression: R² = ESS/TSS = 1 − (RSS/TSS)

 While it is a goodness-of-fit measure, R² by itself may not be a reliable


measure of the explanatory power of the multiple regression model because:
 R² almost always increases as independent variables are added to the model,
even if the marginal contribution of the new variables is not statistically significant
 A relatively high R² may reflect the impact of a large set of independent
variables rather than how well the set explains the dependent variable →
overstating the explanatory power of the regression
 R² is not comparable across models with different dependent variables
 There are no clear predefined values of R² that indicate whether the model is
good or not → For some noisy variables, models with low R² may provide
valuable insight
ADJUSTED R²

 To overcome the problem of overestimating the impact of additional


variables on the explanatory power of a regression model, many
researchers recommend adjusting R² for the number of independent
variables: adjusted R² = 1 − [(n − 1) / (n − k − 1)] × (1 − R²)

 Adjusted R² will be less than or equal to R² → Adding a new independent variable


to the model will increase R² and may increase or decrease adjusted R²
 Example: An analyst runs a regression of monthly value-stock returns on five
independent variables over 60 months. The total sum of squares for the
regression is 460, and the residual sum of squares is 170. Calculate the R²
and adjusted R².
 Example: Suppose the analyst now adds four more independent variables
to the previous regression, and the R² increases to 65.0%. Identify which
model the analyst would most likely prefer.
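 Solution (first example): R² = 1 − RSS/TSS = 1 − 170/460 ≈ 0.630, or 63.0%.
Adjusted R² = 1 − [(60 − 1) / (60 − 5 − 1)] × (1 − 0.630) ≈ 1 − (59/54)(0.370) ≈ 0.596, or 59.6%.
 Solution (second example): with nine independent variables, adjusted R² =
1 − [(60 − 1) / (60 − 9 − 1)] × (1 − 0.650) = 1 − (59/50)(0.350) ≈ 0.587, or 58.7%. Because
adjusted R² falls from roughly 59.6% to 58.7%, the analyst would most likely prefer the
original five-variable model.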
JOINT HYPOTHESIS TESTS AND CONFIDENCE INTERVALS

 As with single regression, the magnitude of the coefficients in a multiple


regression tells us nothing about the importance of the independent
variable in explaining the dependent variable.
 Thus, we must conduct hypothesis testing on the estimated slope coefficients to
determine if the independent variables make a significant contribution to
explaining the variation in the dependent variable
 The t-statistic used to test the significance of the individual coefficients in a
multiple regression is calculated using the same formula that is used with single
regression: t = (estimated bj − hypothesized bj) / Sbj

→ For multiple regression, the t-statistic has (n − k − 1) degrees of freedom


DETERMINING STATISTICAL SIGNIFICANCE

 The most common hypothesis test done on the regression coefficients is to


test statistical significance, which means testing the null hypothesis that the
coefficient is zero versus the alternative that it is not:
 Testing statistical significance ⇒ H0: bj = 0 versus Ha: bj ≠ 0

 Confidence interval for a regression coefficient in a multiple regression: bj ± (tc × Sbj)
 Example: Consider the hypothesis that future 10-year real earnings growth
in the S&P 500 (EG10) can be explained by the trailing dividend payout
ratio of the stocks in the index (PR) and the yield curve slope (YCS). Test the
statistical significance of the independent variable PR in the real earnings
growth example at the 10% significance level. Assume that the number of
observations is 46 and the critical t-value for 10% level of significance is 1.68.
The results of the regression are produced in the following table.
F-TEST

 For models with multiple variables, the univariate t-test is not applicable
when testing complex hypotheses involving the impact of more than one
variable. Instead, we use the F-test
 F-test is useful to evaluate a model against other competing partial models
 For example, a model with three independent variables (X1, X2, and X3) can be
compared against a model with only one independent variable (X1). We are
trying to see if the two additional variables (X2 and X3) in the full model
contribute meaningfully to explain the variation in Y:

 The F-statistic for testing multiple regression coefficients, which is always a one-tailed


test: F = [(RSS_restricted − RSS_full) / q] / [RSS_full / (n − k_full − 1)], where q is the number
of restrictions (the number of extra variables in the full model) and k_full is the number of
explanatory variables in the full model
F-TEST

 Example: A researcher is seeking to explain returns on a stock using the


market returns as an explanatory variable (CAPM formulation). The
researcher wants to determine whether two additional explanatory
variables contribute meaningfully to variation in the stock’s return. Using a
sample consisting of 64 observations, the researcher found that RSS in the
model with three explanatory variables is 6,650 while the RSS in the single-
variable model is 7,140. Evaluate the model with extra variables relative to
the standard CAPM formulation. 5% significance level
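 Solution: the single-variable model is the restricted model (RSS = 7,140) and the
three-variable model is the full model (RSS = 6,650), with q = 2 restrictions and
n − k_full − 1 = 64 − 3 − 1 = 60. F = [(7,140 − 6,650) / 2] / (6,650 / 60) = 245 / 110.8 ≈ 2.21.
The 5% critical value for F with (2, 60) degrees of freedom is approximately 3.15, so we fail
to reject the null; the two additional variables do not contribute meaningfully beyond the
standard CAPM formulation.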
F-TEST

 A more generic F-test is used to test the null hypothesis that none of the variables
included in the model contributes meaningfully to explaining the variation in Y,
versus the alternative that at least one of the variables does contribute in a
statistically significant way

 We calculate the F-statistic as follows: F = (ESS / k) / [RSS / (n − k − 1)]

→ Compare against the critical F → If F-stat > critical F → Reject the null hypothesis

 Example: An analyst runs a regression of monthly value-stock returns on five


independent variables over 46 months. The total sum of squares is 460, and
the residual sum of squares is 170. Test the null hypothesis at the 5%
significance level (95% confidence) that the coefficients on all five independent
variables are equal to zero.
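 Solution: ESS = TSS − RSS = 460 − 170 = 290, k = 5, and n − k − 1 = 46 − 5 − 1 = 40.
F = (290 / 5) / (170 / 40) = 58 / 4.25 ≈ 13.6. The 5% critical value for F with (5, 40) degrees
of freedom is approximately 2.45, so we reject the null hypothesis; at least one of the slope
coefficients is statistically different from zero.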
