
Simple & Multiple Linear Regression Lecture 4 & 5 (Week3)

Outline
I. Introduction to the “regression” problem
II. Simple linear regression
• Introducing Equations and Terminology
• Different Types of Regression Lines
• Estimating the Regression Coefficients
• Centring
• Inference
• Model Fit
III. Conclusions & Questions
IV. Multiple linear regression
• Introduction
• Centring
• Explained Variance
• Model Comparison
• Deriving the F-Statistic
• F-Statistic Application
• Model Building
• Continuing Model Comparison
• Continuing Model Building
• Thinking About Partialing
V. Conclusions & Questions

I. Introduction to the “regression” problem


Regression Problem
Some of the most common and useful statistical models are regression models.
 Regression problems involve modelling a quantitative outcome. (As opposed to
classification problems, which involve modelling categorical outcomes)
o The regression problem begins with a random outcome variable, Y.
o We hypothesize that the mean of Y is dependent on some set of fixed
covariates, X.

Flavours of Probability Distribution


The distributions we’ve considered thus far imply a constant mean.
 Each observation is expected to have the same value of Y, regardless of their
individual characteristics.
In the example shown, the constant mean is 0: all observations have the same expected value of Y, regardless of their individual characteristics.

 This type of distribution is called “marginal” or “unconditional.”

The distributions we consider in regression problems have conditional means.
 The value of Y that we expect for each observation is defined by the observation’s
individual characteristics.
 This type of distribution is called “conditional”:

In practice, we only interact with the X-Y plane of the 3D figure.


 On the Y-axis, we plot our outcome variable.
 The X-axis represents the predictor variable upon which we condition the mean of Y.

We want to explain the relationship between Y and X by finding the line that passes through
the scatterplot as “closely” as possible to each point.
 This line is called the “best fit line.”
 For any value of X, the corresponding point on the best fit line is the model’s best
guess for the value of Y.

II. Simple Linear Regression


1. Introducing Equations and Terminology
 The best fit line is defined by a simple equation: $\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X$
$\hat{\beta}_0$ is the intercept.
• The $\hat{Y}$ value when X = 0.
• That is, the expected value of Y when X = 0.
$\hat{\beta}_1$ is the slope.
• The change in $\hat{Y}$ for a unit change in X.
• That is, the expected change in Y for a unit change in X.
 The equation only describes the best fit line. It does not fully quantify the relationship between Y and X.
 We still need to account for the estimation error: $Y = \hat{\beta}_0 + \hat{\beta}_1 X + \hat{\varepsilon}$

Residuals vs Errors:

In the population regression equation $Y = \beta_0 + \beta_1 X + \varepsilon$, the $\varepsilon$ term represents a vector of errors.
 The errors, $\varepsilon$, are the differences between Y and the true regression line, $\beta_0 + \beta_1 X$.
 The errors are unknown parameters, so we must estimate them.

In the estimated regression model $Y = \hat{\beta}_0 + \hat{\beta}_1 X + \hat{\varepsilon}$, the $\hat{\varepsilon}$ term represents a vector of residuals.
 The residuals, $\hat{\varepsilon}$, are the differences between Y and the estimated best fit line, $\hat{\beta}_0 + \hat{\beta}_1 X$.
 The residuals are sample estimates of the errors, $\varepsilon$, computed from the sample.

2. Different Types of Regression Lines


We can think about our regression model in four ways:

In the context of data science and statistics, the notation N(0, 1) typically represents a probability distribution,
specifically the standard normal distribution. A normal distribution with a mean (μ) of 0 and a standard
deviation (σ) of 1 is known as the standard normal distribution.

3. Estimating the Regression Coefficients


The purpose of regression analysis is to use a sample of N observed $\{Y_n, X_n\}$ pairs to find the best fit line defined by $\hat{\beta}_0$ and $\hat{\beta}_1$. Different methods exist to do this estimation. The most popular involves minimizing the sum of the squared residuals (i.e., the estimated errors).
 We want the sum of the squared residuals to be as small as possible.
 The $\hat{\varepsilon}_n$ are defined as the deviations between each observed $Y_n$ value and the corresponding $\hat{Y}_n$.
 Each $\hat{\varepsilon}_n$ is squared before summing to remove negative values and produce a quadratic objective function.
$RSS = \sum_{n=1}^{N} \hat{\varepsilon}_n^2 = \sum_{n=1}^{N} (Y_n - \hat{Y}_n)^2 = \sum_{n=1}^{N} (Y_n - \hat{\beta}_0 - \hat{\beta}_1 X_n)^2$
 The RSS is a very well-behaved objective function that admits closed-form solutions for the minimizing values of $\hat{\beta}_0$ and $\hat{\beta}_1$. (This means we do not need any numerical optimization methods; we can simply derive the results with algebra.)
 The values of $\hat{\beta}_0$ and $\hat{\beta}_1$ that minimize the RSS are:

$\hat{\beta}_1 = \dfrac{\sum_{n=1}^{N} (X_n - \bar{X})(Y_n - \bar{Y})}{\sum_{n=1}^{N} (X_n - \bar{X})^2}$

$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$

 These are the ordinary least squares (OLS) estimates of $\beta_1$ and $\beta_0$.

Example:
This is how we obtain the ordinary least squares estimates for the example we saw earlier, where horsepower is the dependent variable and price (in $1,000s) is the independent variable. If we run the model and use the coef() function to extract the parameters, we get the following.

The estimated intercept is $\hat{\beta}_0$ = 60.45.
 A free car is expected to have 60.45 horsepower.
The estimated slope is $\hat{\beta}_1$ = 4.27.
 For every additional $1,000 in price, a car is expected to gain 4.27 horsepower.
Always be aware of the interpretation of the intercept and slope and what type of scaling we have!
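Below is a rough R sketch of how these estimates could be reproduced; the data frame carDat and its columns price (in $1,000s) and horsepower are hypothetical stand-ins for the lecture's data set.

## Sketch only: carDat, price, and horsepower are hypothetical stand-ins.
x <- carDat$price          # price in $1,000s
y <- carDat$horsepower

## Closed-form OLS estimates:
b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
b0 <- mean(y) - b1 * mean(x)
c(b0, b1)

## The same estimates via lm(), which minimizes the RSS internally:
coef(lm(horsepower ~ price, data = carDat))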

4. Centring - Improving Interpretation (Shifting the Scale)


We don’t care about the horsepower of a “free” car, so we may want to transform our data
in some way to improve interpretation.
 The intercept is defined as the expected value of Y when X = 0
o We can make the intercept interesting by shifting the scale of X so that X = 0
is a meaningful point.
 A common way to do this is (mean) centring. We mean-centre X by subtracting $\bar{X}$ from each of the $X_n$.
 After mean-centring, the zero point of $X_C$ corresponds to the original $\bar{X}$.
 Centring only translates the scale of the X-axis.
 Centring does not change the linear relationship in linear models.
 In general, the highest order relationship is not affected by centring.

Estimate the least squares coefficients with X mean-centred:

The estimated intercept is $\hat{\beta}_0$ = 143.83.
 A car with average price is expected to have 143.83 horsepower.
The estimated slope is $\hat{\beta}_1$ = 4.27. The slope doesn't change.
 For every additional $1,000 in price, a car is expected to gain 4.27 horsepower.
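A minimal R sketch of the centring step, again using the hypothetical carDat from above:

## Mean-centre the predictor and re-fit (carDat is a hypothetical data frame):
carDat$price_c <- carDat$price - mean(carDat$price)
coef(lm(horsepower ~ price_c, data = carDat))
## The slope is unchanged; the intercept is now the expected horsepower
## of a car with average price.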

5. Inference

We need to know how generalizable our results are, so we use statistical inference to account for the precision with which we have estimated $\hat{\beta}_0$ and $\hat{\beta}_1$.

Sampling Distributions

Our regression intercept and slope both have sampling distributions that we can use to
judge the precision of our estimates.
 The mean of the distribution is going to be our estimated statistic.
 In OLS regression, the coefficients are normally distributed.
 The standard deviation of these distributions is equal to the standard error (SE).
When the standard deviation of a distribution is equal to the standard error, we are typically estimating a population parameter from a sample statistic.
 The sample standard deviation (s, or $\hat{\sigma}$) measures the spread, or variability, of the individual data points within the sample: how much the data points deviate from the sample mean.


$s = \hat{\sigma} = \sqrt{\dfrac{\sum_{n=1}^{N} (X_n - \bar{X})^2}{N - 1}}$
 The Standard Error(SE) is a measure of the variability or precision of a statistic
(particularly the mean) when it's estimated from a sample of data. It quantifies how
much the sample mean is expected to vary from the true population mean. In other
words, it provides a way to estimate the uncertainty or margin of error associated
with a sample statistic.
 The standard error is an important concept in statistics because it's often used to
calculate confidence intervals, which help researchers and analysts make inferences
about population parameters based on sample data.
 The standard error of the mean is typically calculated from the standard deviation ($\hat{\sigma}$) and the sample size (N):

$SE = \dfrac{\hat{\sigma}}{\sqrt{N}}$
Standard Errors
The standard deviations of the preceding sampling distributions quantify the precision of our estimated $\hat{\beta}_0$ and $\hat{\beta}_1$.
 The sampling distributions shown above are theoretical entities.
 In practice, we estimate the standard deviations of these distributions as $SE(\hat{\beta}_0)$ and $SE(\hat{\beta}_1)$:

$SE(\hat{\beta}_0) = \sqrt{\hat{\sigma}^2 \left[ \dfrac{1}{N} + \dfrac{\bar{X}^2}{\sum_{n=1}^{N} (X_n - \bar{X})^2} \right]}$

$SE(\hat{\beta}_1) = \sqrt{\dfrac{\hat{\sigma}^2}{\sum_{n=1}^{N} (X_n - \bar{X})^2}}$

We can use $SE(\hat{\beta})$ in combination with $\hat{\beta}$ for inference.


Inferential Tools: Wald Tests

We can construct a Wald statistic for $\hat{\beta}$ as follows:

$t = \dfrac{\hat{\beta}}{SE(\hat{\beta})}$

Inferential Tools: Confidence Intervals

$CI = \hat{\beta} \pm t \times SE(\hat{\beta})$
The t value is obtained based on the confidence level that we want and the degrees of freedom in our data.
CI 95: We are 95% certain/confident in the sense that, if we repeated this analysis an infinite number of times, 95% of the CIs that we calculate would surround the true value of $\beta_1$.
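A hedged R sketch of these inferential tools, reusing the hypothetical carDat example; the object name fit is mine:

fit <- lm(horsepower ~ price, data = carDat)   # hypothetical example model
summary(fit)$coefficients    # estimates, SEs, Wald t values, and p-values
confint(fit, level = 0.95)   # beta-hat +/- t * SE(beta-hat)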

Interpreting Confidence Intervals


Say we estimate a regression slope of $\hat{\beta}_1$ = 0.5 with an associated 95% confidence interval of CI = [0.25; 0.75].
 We cannot say that there is a 95% chance that the true value of $\beta_1$ is between 0.25 and 0.75.
 We cannot say that the true value of $\beta_1$ is between 0.25 and 0.75 with probability 0.95.
 In both cases we must speak of being 95% certain/confident, not of probability.
The true value of $\beta_1$ is fixed; it's a single quantity.
 Once the interval is calculated, $\beta_1$ is either in our interval or it is not; there is no uncertainty.
 The probability that $\beta_1$ is within our estimated interval is either exactly 1 or exactly 0.
We don't talk about 95% probabilities when interpreting CIs; instead, we talk about 95% confidence.
 If we collected a new sample (of the same size), re-estimated our model, and re-computed the 95% CI for $\hat{\beta}_1$, we would get a different interval.
 Repeating this process an infinite number of times would give us a distribution of CIs.
 95% of those CIs would surround the true value of $\beta_1$.

In terms of sampling distributions, our inferential task is to say something about how distinct the null and alternative distributions are.

CIs give us a plausible range for the population value of $\beta$, so we can use CIs to support inference.

6. Model Fit
We quantify the proportion of the outcome's variance that is explained by our model using the $R^2$ statistic:

$R^2 = \dfrac{TSS - RSS}{TSS} = 1 - \dfrac{RSS}{TSS}$

where

$TSS = \sum_{n=1}^{N} (Y_n - \bar{Y})^2 = Var(Y) \times (N - 1)$

$R^2 = 0.62$ indicates that car price explains 62% of the variability in horsepower.

Model Fit for Prediction


When assessing predictive performance, we most often use the mean squared error (MSE) as our criterion (N here is the sample size):

$MSE = \dfrac{1}{N} \sum_{n=1}^{N} (Y_n - \hat{Y}_n)^2 = \dfrac{1}{N} \sum_{n=1}^{N} \left( Y_n - \hat{\beta}_0 - \sum_{p=1}^{P} \hat{\beta}_p X_{np} \right)^2 = \dfrac{RSS}{N}$

The MSE quantifies the average squared prediction error.

Taking the square root improves interpretation:
$RMSE = \sqrt{MSE}$
The RMSE estimates the magnitude of the expected prediction error.
 RMSE = 32.06: when using price as the only predictor of horsepower, we expect prediction errors with magnitudes of about 32.06 horsepower.
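A short sketch of how these fit measures could be computed in R from the hypothetical fit object defined earlier:

rss <- sum(resid(fit)^2)
tss <- sum((carDat$horsepower - mean(carDat$horsepower))^2)
1 - rss / tss               # R^2, as also reported by summary(fit)$r.squared
mse <- rss / nrow(carDat)   # mean squared error
sqrt(mse)                   # RMSE: typical magnitude of a prediction error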

III. Conclusions & Questions


Conclusion
1. Simple linear regression allows us to model the mean of one variable, Y, as a function of another variable, X.
o The distribution of Y is a conditional distribution.
2. In regression modelling we find the best fit line, $\hat{Y}$.
o This line does not fully describe the X → Y relationship.
o We also need to consider the errors/residuals that account for the unmodelled deviations between the best-fit line and the data points.
3. Even after accounting for residuals, our estimated regression line is only an
approximation to the true regression line.
 We can think about our regression model in four different ways.
4. We can use centering to improve the model’s interpretation.
5. After fitting a regression model, we can make inferences about the nature of the X →
Y relationship.
6. We can use confidence intervals for inference.
 When interpreting CIs, we must be careful not to talk about probabilities of the interval surrounding the parameter; we speak of being certain/confident instead.
7. We can assess the fit of the model using R2 and MSE.

 R-squared ( R2): In simple linear regression or multiple regression, R2
quantifies the proportion of the total variance in the dependent variable that
is explained by the regression model. It ranges from 0 to 1, with 0 indicating
that the model explains none of the variance and 1 indicating that the model
explains all of the variance. The higher the R-squared, the better the model
fits the data.
 When interpreting MSE, you're assessing the overall goodness-of-fit of a
model. A lower MSE suggests that the model's predictions are closer to the
true values.

Question 1
In the estimated regression model, $Y = \hat{\beta}_0 + \hat{\beta}_1 X + \hat{\varepsilon}$, the $\hat{\varepsilon}$ term represents a vector of ____
a) errors.
b) residuals.

Question 2
Consider that we regress income (Y) on years of experience (X). Without centring X, the intercept in the regression is defined as the expected value of Y when X = 0.
After mean-centring X (with a mean of 5 years of experience), the intercept in the regression of Y on the mean-centred version of X (X_c) corresponds to the expected value of Y when the original X = 5.

IV. Multiple Linear Regression
Simple linear regression: A single outcome is predicted by a single independent variable.
Simple linear regression implies a 1D line in 2D space. Adding another predictor leads to a
3D point cloud.

Partial effects
In MLR, we want to examine the partial effects of the predictors.
 What is the effect of a predictor after controlling for some other set of variables?
This approach is crucial to controlling confounds and adequately modelling real-world
phenomena.
 For example, when investigating the effect of income on career satisfaction, we
might want to control for tenure.
 If we conducted experiments and randomization worked out well, we would not
need to control for confounds. But usually, we work with observational data and
not with experimental data.

 SLR asks: What is the effect of age on average blood pressure?

 MLR asks: What is the effect of BMI on average blood pressure, after controlling for age? (Or "above and beyond age", or "keeping age fixed".)
o We’re partialing age out of the effect of BMI on blood pressure.
o Interpretation:

 The expected average blood pressure for a patient with age 0 and BMI 0 (an "unborn patient with zero weight") is 52.25.
 For each additional year of age, average blood pressure is expected to increase by 0.29 points, after controlling for BMI.
 For each additional point of BMI, average blood pressure is expected to increase by 1.08 points, after controlling for age.

Centring – for improved interpretation

 The expected average blood pressure for a 30-year-old patient is 88.09.


 For each year older, average blood pressure is expected to increase by 0.35 points.

 The expected average blood pressure for a 30-year-old patient with a BMI of 25 is 87.85.
 For each year older, average blood pressure is expected to increase by 0.29 points, after
controlling for BMI.
 For each additional point of BMI, average blood pressure is expected to increase by 1.08
points, after controlling for age.
 According to the corresponding p-values, both partial predictors are significant, but we
cannot say anything about the two predictors combined.
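The two blood-pressure models discussed here could be fit as sketched below; the object names fit_age and fit_age_bmi are mine, while dDat, age30 (age minus 30), and bmi25 (BMI minus 25) are the variable names used later in these notes.

fit_age     <- lm(bp ~ age30, data = dDat)           # centred age only
fit_age_bmi <- lm(bp ~ age30 + bmi25, data = dDat)   # centred age + centred BMI
summary(fit_age_bmi)$coefficients                    # partial effects, SEs, and p-values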

Explained Variance
Multiple R2
How much variation in blood pressure is explained by the two models?

 Check the $R^2$ values.

About 11% of the variability in blood pressure is accounted for by the age-only model, and about 22% by the model with both age and BMI.

Significance Testing for R2


To judge whether our model explains a significant proportion of the variation in Y, we use an F statistic. The F statistic represents a ratio of two measures of variability:

$F = \dfrac{Var_{good}}{Var_{bad}} = \dfrac{Var_{effect}}{Var_{noise}} = \dfrac{Var_{model}}{Var_{error}}$

Computing the F-Statistic


 For MLR, the good variability is the variability in Y that is explained by our model:

Model sum of squares: $SSM = \sum_{n=1}^{N} (\hat{Y}_n - \bar{Y})^2$

Degrees of freedom for SSM (the number of predictor variables): $df_M = P$

Mean squares for the model: $MSM = \dfrac{SSM}{df_M}$

 The bad variability is the error variability (i.e., the variability in Y not explained by our model):

Sum of squared errors (i.e., residual sum of squares): $SSE = \sum_{n=1}^{N} (Y_n - \hat{Y}_n)^2$

Degrees of freedom for SSE: $df_E = N - P - 1$

Mean squared error: $MSE = \dfrac{SSE}{df_E}$

$F = \dfrac{SSM / df_M}{SSE / df_E} = \dfrac{MSM}{MSE}$

The shape of the F-stat's sampling distribution is defined by two numbers:

 The numerator degrees of freedom: $df_M = P$
 The denominator degrees of freedom: $df_E = N - P - 1$

How do we know if these F-statistics are big enough?

 Compare them to their sampling distributions to compute p-values.
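A sketch of how the overall F-test could be checked in R for the hypothetical fit_age_bmi model defined above:

fstat <- summary(fit_age_bmi)$fstatistic              # F value, df_M, df_E
pf(fstat[1], fstat[2], fstat[3], lower.tail = FALSE)  # p-value from the F distribution

## Or by hand from the sums of squares (assuming complete data in dDat):
ssm <- sum((fitted(fit_age_bmi) - mean(dDat$bp))^2)
sse <- sum(resid(fit_age_bmi)^2)
(ssm / 2) / (sse / (nrow(dDat) - 2 - 1))              # P = 2 predictors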

4. Model Comparison
 How do we quantify the additional variation explained by BMI, above and beyond age?
o We compute the $\Delta R^2$.

 How do we know if $\Delta R^2 = 0.115$ represents a significantly greater degree of explained variation?
o We use an F-test of the $\Delta R^2$.

 We can also compare models based on their prediction errors.


o For OLS regression, we usually compare MSE values.
o In this case, the MSE for the model with BMI included is smaller. We should
prefer the 2nd model.
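A brief R sketch of this comparison, using the hypothetical fit_age and fit_age_bmi objects from above:

summary(fit_age_bmi)$r.squared - summary(fit_age)$r.squared   # Delta R^2
anova(fit_age, fit_age_bmi)                                   # F-test of the Delta R^2
mean(resid(fit_age)^2); mean(resid(fit_age_bmi)^2)            # compare training MSEs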

5. Deriving the F-Statistic


General Definition of the F-Statistic

$F = \dfrac{MSM}{MSE}$

Although correct, the above definition is a special case that does not generalize in an obvious way. In general, the F-statistic is defined in terms of model comparisons:

$F = \dfrac{(E_R - E_F) / (df_R - df_F)}{E_F / df_F}$

 $E_R$ and $E_F$ are the errors (i.e., the SSEs) of the restricted and full models.
 $df_R$ and $df_F$ are the restricted and full degrees of freedom.

The Compared Models
The full model is always the model with more estimated parameters.
 The model with more predictor variables.
The restricted model is the model with fewer estimated parameters.
 The restricted model must be nested within the full model.
o If our full model is the one with age and BMI as predictors, then our restricted model cannot be one with cholesterol as a predictor, because cholesterol is not part of the full model.
When we use the F-statistic to test whether $R^2 > 0$ (i.e., when we are not explicitly comparing two models):
 The full model is our estimated model.
 The restricted model is an intercept-only model ($\hat{Y}_n = \bar{Y}$).
An intercept-only model assumes there is no relationship between the dependent variable and any independent variables; it only predicts the mean of the dependent variable: $\hat{Y} = \hat{\beta}_0 = \bar{Y}$.
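For illustration (a sketch reusing the hypothetical objects above), the overall F-test reported by summary() is equivalent to comparing the fitted model against an intercept-only model:

fit_null <- lm(bp ~ 1, data = dDat)   # intercept only: predicts Y-bar for everyone
anova(fit_null, fit_age_bmi)          # reproduces the overall F-test of fit_age_bmi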

Model Building
 We'll take $Y_{bp} = \beta_0 + \beta_1 X_{age30} + \varepsilon$ as our baseline model.
 Next we simultaneously add predictors of LDL and HDL cholesterol.

 Age, LDL, and HDL explain a combined 14.4% of the variation in blood pressure. That
proportion of variation is significantly greater than zero.
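A sketch of this step in R; out3 is presumably the object name used in the code further below, ldl100 and hdl60 are the centred cholesterol variables that appear there, and fit_age is the hypothetical baseline defined earlier:

out3 <- lm(bp ~ age30 + ldl100 + hdl60, data = dDat)
summary(out3)$r.squared    # about 0.144 according to the notes
anova(fit_age, out3)       # do LDL and HDL add explained variation beyond age?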

Continuing Model Comparison

 Adding LDL and HDL produces a model that explains 3.1% more variation in blood
pressure than a model with age as the only predictor.
 This increase in variance explained is significantly greater than zero.

 Adding LDL and HDL produces a model with lower prediction error (i.e., MSE = 163.4
vs. MSE = 169.4).

Further example:
So far we’ve established that age, LDL, and HDL are all significant predictors of average
blood pressure.
 We’ve also established that LDL and HDL, together, explain a significant amount of
additional variation, above and beyond age.
Next, we’ll add BMI to see what additional predictive role it can play above and beyond age
and cholesterol.

out4 <- lm(bp ~ age30 + ldl100 + hdl60 + bmi25, data = dDat)

1. BMI seems to have a pretty strong effect on average blood pressure, after controlling for
age and cholesterol levels.
 after controlling for BMI, cholesterol levels no longer seem to be important
predictors.
 Let’s take a look at what happens to the cholesterol effects when we add BMI.
2. How much additional variability in blood pressure is explained by BMI above and beyond
age and cholesterol levels?

r2.4 <- summary(out4)$r.squared
r2.4 - r2.3

## [1] 0.08595543

3. Is the additional 8.6% of explained variation a significant increase? Yes, as shown by:

anova(out3, out4)

4. What about the relative predictive performance?


mse2.4 <- MSE(y_pred = predict(out4), y_true = dDat$bp)
mse2.3
## [1] 163.3983
mse2.4
## [1] 146.9918
Maybe cholesterol levels are not important features once we’ve accounted for BMI. Try:

## Take out the cholesterol variables:

out5 <- lm(bp ~ age30 + bmi25, data = dDat)

1. How much explained variation did we lose by removing the LDL and HDL variables?

r2.5 <- summary(out5)$r.squared
r2.4 - r2.5

## [1] 0.002330906

2. Is it significant? No; the small loss in explained variation is not significant.

3. How do the prediction errors compare?
mse2.5 <- MSE(y_pred = predict(out5), y_true = dDat$bp)
mse2.4
## [1] 146.9918
mse2.5
## [1] 147.4367

QUESTION: Can we compare models 3 and 5 using the change-in-$R^2$ test?


ANSWER: No! The restricted model must be nested within the full model.

Thinking about Partialing: Partial effect of a specific predictor, above and beyond other
predictors

Recall the definition of the residual $\hat{\varepsilon}$:

$\hat{\varepsilon}_n = Y_n - \hat{\beta}_0 - \hat{\beta}_1 X_n$

 The $\hat{\varepsilon}$ are the "leftover" part of Y (i.e., the residue).
 The $\hat{\varepsilon}$ contain all the information in Y that is not (linearly) associated with X.
o The $\hat{\varepsilon}$ are (linearly) independent of X.
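A brief sketch of this idea in R (the Frisch-Waugh view of a partial coefficient), using the hypothetical blood-pressure variables from above:

## Partial age out of both bp and BMI, then regress residuals on residuals:
r_y <- resid(lm(bp ~ age30, data = dDat))      # part of bp not explained by age
r_x <- resid(lm(bmi25 ~ age30, data = dDat))   # part of BMI not explained by age
coef(lm(r_y ~ r_x))["r_x"]                     # equals the partial BMI slope ...
coef(lm(bp ~ age30 + bmi25, data = dDat))["bmi25"]   # ... from the two-predictor model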

V. Conclusions & Questions
Conclusion
1. Each variable in a regression model corresponds to a dimension in the data-space.
 A regression model with P predictors implies a P-dimensional (hyper)plane in (P + 1)-dimensional space.
2. The coefficients in MLR are partial coefficients.
 Each effect is interpreted as controlling for other predictors.

Question 1
Imagine we regress the outcome variable “satisfaction with life” (swl) on the four
predictor variables “age”, “income”, “number of facebook friends” (nr_fb_friend), and
“weekly working hours”(wwh):
lm(swl ~ age + income + nr_fb_friend + wwh, data = dDat)
Which of the following models would be nested in this "full" model? (You can select more than one answer.)
a) lm(swl ~ age + wwh + nr_pets, data = dDat)
b) lm(swl ~ age + nr_fb_friend, data = dDat)
c) lm(swl ~ income + nr_fb_friend + wwh, data = dDat)
d) lm(swl ~ age + income + nr_fb_friend, data = dDat)

Question 2
Consider the following output from a regression of blood pressure on age.

True or false: Based on the output, the effect of age on blood pressure is significant.
a) True
b) False

Question 3
Consider the following output from a regression of blood pressure on age and bmi.

True or false: Based on the output, the effect of age on blood pressure is significant.
a) True
b) False

20230912
