BA501 Week5 Linear Regression

BA501 Mastery Program

Copyright Policy
All content included on the Site or third-party platforms as part of the class, such as text,
graphics, logos, button icons, images, audio clips, video clips, live streams, digital
downloads, data compilations, and software, is the property of BitTiger or its content
suppliers and protected by copyright laws.

Any attempt to redistribute or resell BitTiger content will result in appropriate legal action being taken.

We thank you in advance for respecting our copyrighted content.

For more info, see https://www.bittiger.io/termsofuse and https://www.bittiger.io/termsofservice
Summary
● Simple linear regression
○ Modeling E(Y|X)
○ Assumptions
○ Coefficient estimation
■ Least square error estimation
■ Maximum likelihood estimation
■ Hypothesis testing
Summary
● Multiple linear regression
○ Coefficient estimation
○ Evaluate model performance
○ ANOVA and F test
● Residual
○ Residual diagnostics
○ Leverage, standardizing
Simple Linear Regression
Classical example
● Let yi be the height of child i; xi the height of child i's parent
● How can we use xi to predict yi?
● What are we predicting?
○ Expected Y given X
● Simple linear regression
○ Y = β0 + β1 X + ε
○ E[Y] = β0 + β1 X
Simple linear regression model

Y = β0 + β1 X + ε
○ Y: dependent / response variable
○ X: independent variable
○ β0, β1: coefficients
○ ε: error term

● Assumptions
○ 1. Linear relationship: Y = β0 + β1 X + ε
○ 2. The errors ε are independent and identically distributed (i.i.d.)
○ 3. Homoscedasticity (constant variance): Var[ε | X = x] = σ², no matter what x is
○ 4. Gaussian noise: normal distribution ε ~ N(0, σ²)
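These assumptions can be made concrete by simulating a dataset that satisfies all four. A minimal sketch; the parameter values (β0 = 10, β1 = 0.8, σ = 2) and the height range are illustrative assumptions, not taken from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
beta0, beta1, sigma = 10.0, 0.8, 2.0        # assumed "true" parameters

x = rng.uniform(50, 80, size=n)             # parent heights (arbitrary range)
eps = rng.normal(0.0, sigma, size=n)        # i.i.d. Gaussian noise, constant variance
y = beta0 + beta1 * x + eps                 # linear relationship in X

# By construction the noise is centered at 0 and homoscedastic in x
print(eps.mean(), eps.std())
```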
Assumption Violation

● Linear Relationship
Assumption Violation

● Errors ε dependent on each other


Assumption Violation
● Heteroscedasticity (non-constant variance)
Some error term example

Why Gaussian noise?

● Central limit theorem
○ Noise might be the sum of many little random noises from different sources, independent and of similar magnitude.
● Mathematical convenience
○ Closed-form estimation
Under Gaussian Assumption

Y = β0 + β1 X + ε
E(Y) = β0 + β1 X

ε = Y - β0 - β1 X ~ Normal(0, σ²)
⇒ Y = β0 + β1 X + ε ~ Normal(β0 + β1 X, σ²)
Coefficient estimation
● Some definitions
○ Sale price Y: dependent variable, output, response
○ Lot area X: predictor, independent variable, covariate, input
● Coefficient estimation
○ Remember the assumption Y = β0 + β1 X + ε
○ How do we find the optimal (β0, β1)?
■ Least square error estimation (LSE)
■ Maximum likelihood estimation (MLE)
■ Compare LSE and MLE
Least square error estimation
● Mean square error (MSE): MSE(β0, β1) = (1/n) Σi (yi - β0 - β1 xi)²
● The LSE estimates minimize the MSE
Least square error estimation (cont'd)
● The estimates β̂0, β̂1 are estimators of β0, β1
● Criteria for good estimators (week 1)
○ Unbiased
○ Variance decreases with more data

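The closed-form LSE solution for simple linear regression can be sketched as follows; the synthetic data and its "true" parameters are assumptions, and the `np.polyfit` call is only a sanity cross-check:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(50, 80, size=300)
y = 10.0 + 0.8 * x + rng.normal(0, 2.0, size=300)   # synthetic data

xbar, ybar = x.mean(), y.mean()
Sxx = ((x - xbar) ** 2).sum()
Sxy = ((x - xbar) * (y - ybar)).sum()

beta1_hat = Sxy / Sxx                # slope: sample cov(x, y) / var(x)
beta0_hat = ybar - beta1_hat * xbar  # intercept: line passes through (x̄, ȳ)

# Cross-check against numpy's least-squares polynomial fit
slope, intercept = np.polyfit(x, y, 1)
```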

What is likelihood
● Probability
○ If we know β0 = 10, β1 = 0.8, E(Y) = 10 + 0.8X, possibility to observe (x1=70,
y1=72), …, (xn, yn)?
● Reverse above logic
○ If we observe (x1=70, y1=72), …, (xn, yn)
○ Likelihood (L) of β0 = 10, β1 = 0.8?
● Connection
○ L(β0,β1|{X, Y}) = P({X, Y}|β0, β1)

● How to use likelihood to estimate β0, β1


○ Maximizes Likelihood function L(β0,β1|{X, Y})
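The idea can be illustrated by evaluating the Gaussian log-likelihood at two candidate coefficient pairs on synthetic data; the data-generating parameters below are assumptions for illustration:

```python
import numpy as np

def log_likelihood(beta0, beta1, x, y, sigma=2.0):
    """Gaussian log-likelihood of (beta0, beta1) given observed (x, y)."""
    resid = y - (beta0 + beta1 * x)
    n = len(y)
    return -0.5 * n * np.log(2 * np.pi * sigma**2) - 0.5 * (resid**2).sum() / sigma**2

rng = np.random.default_rng(2)
x = rng.uniform(50, 80, size=200)
y = 10.0 + 0.8 * x + rng.normal(0, 2.0, size=200)

ll_true = log_likelihood(10.0, 0.8, x, y)   # parameters close to the truth
ll_off  = log_likelihood(10.0, 0.5, x, y)   # clearly wrong slope
```

Parameters near the truth yield a much larger log-likelihood, which is exactly what MLE exploits.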
Maximum likelihood estimation (MLE)
● Under a Gaussian error distribution, maximizing the likelihood is equivalent to minimizing the sum of squared errors
● MLE and LSE are identical when ε ~ N(0, σ²)
Hypothesis testing for coefficient
● Test assumption
○ Null hypothesis β1 = 0 (what type of test is this?)
● Calculate the statistic: compare the observed value against H0
○ Why a t distribution? Recall the t score. How do we decide the d.f. (degrees of freedom)?
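A manual sketch of the t test for H0: β1 = 0 on synthetic data (the data-generating parameters are assumed). The degrees of freedom are n - 2 because two coefficients are estimated:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n = 100
x = rng.uniform(50, 80, size=n)
y = 10.0 + 0.8 * x + rng.normal(0, 2.0, size=n)

xbar = x.mean()
Sxx = ((x - xbar) ** 2).sum()
beta1_hat = ((x - xbar) * (y - y.mean())).sum() / Sxx
beta0_hat = y.mean() - beta1_hat * xbar

resid = y - (beta0_hat + beta1_hat * x)
s2 = (resid ** 2).sum() / (n - 2)      # unbiased noise-variance estimate, df = n - 2
se_beta1 = np.sqrt(s2 / Sxx)           # standard error of the slope

t_stat = beta1_hat / se_beta1          # H0: beta1 = 0, two-sided t test
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 2)
```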
How to predict for new data?
● Given a new data point x, how do we predict y?
● Predicted (fitted) value at x: ŷ = β̂0 + β̂1 x
● Is ŷ a single value?
○ The larger σ² is, the larger the variance: more noise in predictions
○ The larger n is, the smaller the variance: more precise predictions
Distribution of predicted value

● Under the Gaussian error assumption, the predicted value ŷ follows a normal distribution
○ What are its expectation, variance, and confidence interval?
● Visualization
○ Observations as circles
○ Mean prediction as a solid line
○ [5th, 95th] quantiles of the predicted value
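The variance of the mean prediction at x is σ²(1/n + (x - x̄)²/Sxx), so the confidence band is narrowest at x̄ and widens away from the data. A sketch on synthetic data (parameters and the query point x = 90 are assumptions):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n = 80
x = rng.uniform(50, 80, size=n)
y = 10.0 + 0.8 * x + rng.normal(0, 2.0, size=n)

xbar = x.mean()
Sxx = ((x - xbar) ** 2).sum()
b1 = ((x - xbar) * (y - y.mean())).sum() / Sxx
b0 = y.mean() - b1 * xbar
s2 = ((y - b0 - b1 * x) ** 2).sum() / (n - 2)

def mean_ci(x_new, alpha=0.10):
    """90% CI for E[Y | X = x_new]; width grows with distance from x̄."""
    yhat = b0 + b1 * x_new
    se = np.sqrt(s2 * (1.0 / n + (x_new - xbar) ** 2 / Sxx))
    tcrit = stats.t.ppf(1 - alpha / 2, df=n - 2)
    return yhat - tcrit * se, yhat + tcrit * se

lo_c, hi_c = mean_ci(xbar)    # narrowest interval, at the center of the data
lo_e, hi_e = mean_ci(90.0)    # wider interval, far from the data
```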
Bias and variance tradeoff
● Prediction error: suppose Y = f(X) + ε; then the expected squared prediction error decomposes as E[(Y - f̂(x))²] = Bias[f̂(x)]² + Var[f̂(x)] + σ²
Bias and variance
● Accurate (small bias)
● Precise (small variance)
Multiple Linear Regression
Multiple linear regression
● More than one predictor, say p. For the ith data point:

Yi = β0 + β1 X1i + β2 X2i + … + βp Xpi + εi

● How to get the estimates (LSE)?
○ MSE: (1/n) Σi (Yi - β0 - β1 X1i - … - βp Xpi)²
○ (β̂0, β̂1, …, β̂p) minimize the MSE (set the derivatives to 0).
● What about MLE?
LSE for multiple regression
● Matrix form
○ n×1 vector Y, n×(p+1) matrix X, (p+1)×1 vector β, n×1 vector ε
Y = Xβ + ε
○ MSE: (1/n)(Y - Xβ)ᵀ(Y - Xβ)
○ Optimal estimator: β̂ = (XᵀX)⁻¹XᵀY
○ Fitted values: Ŷ = Xβ̂ = HY, where H = X(XᵀX)⁻¹Xᵀ is the projection (hat) matrix
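The matrix-form estimator can be verified numerically in a few lines of NumPy; the design and true coefficients below are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(5)
n, p = 120, 3
beta_true = np.array([5.0, 1.0, -2.0, 0.5])         # intercept + p slopes (assumed)
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])   # n x (p+1) design matrix
y = X @ beta_true + rng.normal(0, 1.0, size=n)

# Normal equations: beta_hat = (X'X)^{-1} X'y (solve instead of inverting)
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Hat (projection) matrix H = X (X'X)^{-1} X'; fitted values y_hat = H y
H = X @ np.linalg.solve(X.T @ X, X.T)
y_hat = H @ y
```

Note that H is idempotent (H @ H = H), the defining property of a projection matrix.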
How do you know if your model works?
ANOVA
● ANOVA (Analysis of variance)
○ Decompose the variance into sources
○ F test
■ Null hypothesis: β1 = β2 = … = βp = 0
■ Calculation (p does not include β0):

Source of variance | df      | Sum of squares
Regression model   | p       | SSR = Σi (ŷi - ȳ)²
Residual           | n-1-p   | SSE = Σi (yi - ŷi)²
Total              | n-1     | SST = Σi (yi - ȳ)²

F = (SSR / p) / (SSE / (n-1-p)) follows an F(p, n-1-p) distribution under the null hypothesis.
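The decomposition SST = SSR + SSE and the F statistic can be checked numerically; the synthetic data with p = 2 predictors is an assumption for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
n, p = 100, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(0, 1.0, size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
y_hat = X @ beta_hat

SST = ((y - y.mean()) ** 2).sum()      # total sum of squares,      df = n - 1
SSR = ((y_hat - y.mean()) ** 2).sum()  # regression sum of squares, df = p
SSE = ((y - y_hat) ** 2).sum()         # residual sum of squares,   df = n - 1 - p

F = (SSR / p) / (SSE / (n - 1 - p))
p_value = stats.f.sf(F, p, n - 1 - p)  # H0: beta1 = ... = betap = 0
```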
How to evaluate model performance?
● R² = SSR/SST = 1 - SSE/SST; in simple linear regression, R² equals the squared correlation between x and y

● Properties
○ R²: percentage of variance explained by the model
○ 0 <= R² <= 1
○ R² can be misleading
■ Adding features never decreases R²
■ Hence the need for adjusted R²
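A small sketch showing that R² never decreases when a pure-noise feature is added, while adjusted R² applies a penalty; the data and feature choices are illustrative:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 60
x1 = rng.normal(size=n)
noise_feature = rng.normal(size=n)      # pure noise, unrelated to y
y = 3.0 + 2.0 * x1 + rng.normal(0, 1.0, size=n)

def fit_r2(X, y):
    """R^2 and adjusted R^2 for an OLS fit with intercept."""
    Xd = np.column_stack([np.ones(len(y)), X])
    beta = np.linalg.lstsq(Xd, y, rcond=None)[0]
    sse = ((y - Xd @ beta) ** 2).sum()
    sst = ((y - y.mean()) ** 2).sum()
    r2 = 1 - sse / sst
    p = X.shape[1]
    adj = 1 - (1 - r2) * (len(y) - 1) / (len(y) - 1 - p)
    return r2, adj

r2_base, adj_base = fit_r2(x1.reshape(-1, 1), y)                     # 1 real feature
r2_aug, adj_aug = fit_r2(np.column_stack([x1, noise_feature]), y)    # + 1 noise feature
```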
How do you know your model meets its assumptions?
Residual Diagnostics
Residual vs. Error Term
● Residual: ei = yi - ŷi
● Error (noise): εi = yi - β0 - β1 xi
○ ei serves as an estimate of εi
● Properties of residuals (if the model assumptions hold)
○ Mean 0
○ Constant variance, unchanging with x
○ Uncorrelated with each other (or only extremely weakly correlated)
○ Normally distributed
Residual v.s. Fitted value
● To check unbiasedness (linearity) and homoscedasticity
• The residuals spread randomly around the 0 line, indicating that the relationship is linear.
• The residuals form an approximately horizontal band around the 0 line, indicating homogeneity of the error variance.
• No single residual stands visibly apart from the random pattern, indicating that there are no outliers.
Residual v.s. Fitted value – Violation Examples
QQ Plot
● To check for normality
• The points should roughly follow the diagonal straight line
• You can also check normality using the Shapiro-Wilk test or the Kolmogorov-Smirnov test
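The Shapiro-Wilk test can be sketched on two synthetic residual sets, one normal and one clearly skewed; the sample size is arbitrary. A small p-value means rejecting normality:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
normal_resid = rng.normal(0, 1, size=200)        # residuals consistent with normality
skewed_resid = rng.exponential(1, size=200) - 1  # centered but clearly non-normal

_, p_normal = stats.shapiro(normal_resid)
_, p_skewed = stats.shapiro(skewed_resid)
# Expect p_skewed to be tiny (normality rejected) and p_normal much larger
```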
QQ Plot – Violation Examples
Outlier, High Leverage, Influential
• An outlier is a data point whose response y does not follow the general trend of the rest of the data.
• A data point has high leverage if it has "extreme" predictor x values. Including or excluding such a point can strongly affect the coefficient estimates.
• A data point is influential if it unduly influences any part of a regression analysis, such as the predicted responses, the estimated slope coefficients, or the hypothesis test results. Outliers and high-leverage data points have the potential to be influential, but we generally have to investigate further to determine whether or not they are actually influential.
Standardized residuals vs. Leverage
● To check for high-leverage and influential data points

• Leverage: think of the fitted line as a lever passing through the center point (x̄, ȳ). Points further from the center have larger leverage. hii = [H]ii measures the distance of xi from the center of the x's.

• Rule of thumb: Cook's distance > 1 flags an influential point
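Leverage and Cook's distance can be computed directly from the hat matrix. The sketch below plants one extreme-x point to show it receives the largest leverage; Cook's distance uses the p + 1 = 2 parameters of simple regression, and all values are synthetic:

```python
import numpy as np

rng = np.random.default_rng(9)
n = 50
x = rng.uniform(0, 10, size=n)
x[0] = 30.0                                   # one extreme-x (high leverage) point
y = 1.0 + 2.0 * x + rng.normal(0, 1.0, size=n)

X = np.column_stack([np.ones(n), x])          # simple regression: p + 1 = 2 columns
H = X @ np.linalg.solve(X.T @ X, X.T)         # hat matrix
h = np.diag(H)                                # leverages h_ii

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta_hat
s2 = (resid ** 2).sum() / (n - 2)

# Cook's distance D_i = e_i^2 h_ii / ((p+1) s^2 (1 - h_ii)^2):
# how much deleting point i would move all fitted values
cooks_d = (resid ** 2 / (2 * s2)) * h / (1 - h) ** 2
```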


How to Deal with Problematic Data?
Do NOT simply delete them by default!!!
You must have a good, objective reason for deleting data points

1. Check for obvious data errors; correct or delete them

2. Consider the possibility of a mis-formulated model:


• Did you leave out any important predictors?
• Should you consider adding some interaction terms?
• Is there any nonlinearity that needs to be modeled?

3. If you delete any data after you've collected it, justify and describe it in your reports.
If you are not sure what to do about a data point, analyze the data twice — once with and once
without the data point — and report the results of both analyses.
Interview Questions
Basic

• What model will you use? Linear regression suits a continuous Y variable with a linear relationship; it is not appropriate for a categorical y variable

• What are the assumptions of linear regression? How do you check them?

• How are coefficients estimated?

• What kind of diagnostics will you do?

Advanced

• What is the estimate of the coefficients? How do you derive it?

• What is the variance of β̂ and of the error term? What are the degrees of freedom?

• What do you do with non-constant variance? Non-linearity? Non-normality?


Interview Questions
1. Take-home project report
2. Product sense questions (part of a 2-hour session)
3. Industry functions: dynamic pricing, marketing analytics, growth, risk
4. What BAs do; how to grow into a DS role / work experience, day-to-day work
5. Q&A
Appendix
Leverage

hii: leverage of the ith data point

● Properties of hii
○ hii measures the distance between xi and the mean of the x values
○ hii is a number between 0 and 1
○ The hii sum to p+1, the number of parameters (regression coefficients including the intercept)
Standardizing
● Standardized residual
○ What: residuals rescaled to have a mean of 0 and a variance of 1
○ Why:
■ Data points far from the center have a larger impact on the coefficient estimates (leverage), so the residual variance at these points is smaller
○ How to standardize: ri = ei / (s √(1 - hii)), where s² is the estimated residual variance
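The standardization above can be sketched directly; the synthetic simple-regression data is assumed, and s² is the usual residual variance estimate with df = n - 2:

```python
import numpy as np

rng = np.random.default_rng(10)
n = 100
x = rng.uniform(0, 10, size=n)
y = 1.0 + 2.0 * x + rng.normal(0, 1.0, size=n)

X = np.column_stack([np.ones(n), x])
H = X @ np.linalg.solve(X.T @ X, X.T)
h = np.diag(H)                                # leverages h_ii

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta_hat
s = np.sqrt((resid ** 2).sum() / (n - 2))

# Var(e_i) = sigma^2 (1 - h_ii): high-leverage points have smaller raw residuals,
# so dividing by sqrt(1 - h_ii) puts all residuals on a common scale
std_resid = resid / (s * np.sqrt(1 - h))
```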
Standardizing in simple linear regression
● Residual and leverage
○ In simple linear regression, hii = 1/n + (xi - x̄)² / Σj (xj - x̄)²
Should we standardize features?
● Standardizing won't change coefficient significance
● For comparing coefficients of different predictors within a model, standardizing helps
● For comparing coefficients of the same predictor across different data sets, don't standardize
