
Lecture for 111424

Topics for today


Activity
Power training data
Motor vehicle theft data
Influence and multicollinearity
Star data
How do you estimate polynomial
curve models?
Estimate these models sequentially
Add linear term
Add quadratic term
Add cubic term
etc. until two terms in a row are not significant
As you build the model, keep significant terms in the model even if they
become non-significant after additional terms are added.
Remember that powers are essentially interaction terms
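The sequential procedure above can be sketched in code. The class files (powerser.training.sas, powerser.training.session1.R) do this in SAS and R; below is an illustrative Python/numpy sketch on simulated data, where the true curve is quadratic, so the partial F for the quadratic term should be large and the cubic and quartic terms should not be.

```python
import numpy as np

# Simulated data (an assumption for this sketch): a true quadratic trend.
rng = np.random.default_rng(0)
x = np.linspace(-2, 2, 40)
y = 1 + 0.5 * x + 2.0 * x**2 + rng.normal(scale=0.5, size=x.size)

def rss(X, y):
    """Residual sum of squares from an OLS fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return r @ r

n = x.size
X = np.ones((n, 1))            # start from the intercept-only model
fs = {}
for degree in range(1, 5):     # add linear, quadratic, cubic, quartic terms
    X_new = np.column_stack([X, x**degree])
    # partial F for the term just added
    fs[degree] = (rss(X, y) - rss(X_new, y)) / (rss(X_new, y) / (n - X_new.shape[1]))
    # keep the term and continue; in practice you would stop after
    # two consecutive non-significant terms, per the lecture's rule
    X = X_new

for degree, f in fs.items():
    print(f"degree {degree}: partial F = {f:.2f}")
```

In practice each partial F would be compared to an F(1, n − k) critical value, stopping once two terms in a row are non-significant.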
Here's why you need to go two additional steps

     x   x2   x3   x4  |  xcen  xcen2  xcen3  xcen4
     0    0    0    0  |   -2     4     -8     16
     1    1    1    1  |   -1     1     -1      1
     2    4    8   16  |    0     0      0      0
     3    9   27   81  |    1     1      1      1
     4   16   64  256  |    2     4      8     16

The even and odd powers are tied together.
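The table can be checked numerically. In this Python/numpy sketch (made-up symmetric x, as in the table), centering makes odd and even powers exactly uncorrelated, while odd powers stay tied to other odd powers:

```python
import numpy as np

x = np.array([0., 1., 2., 3., 4.])
xc = x - x.mean()                        # centered: [-2, -1, 0, 1, 2]

# raw powers are strongly correlated with each other
r_raw = np.corrcoef(x, x**2)[0, 1]
# centered odd vs even powers: correlation is exactly zero by symmetry
r_cen = np.corrcoef(xc, xc**2)[0, 1]
# but odd powers remain tied to each other (and even to even)
r_odd = np.corrcoef(xc, xc**3)[0, 1]

print(r_raw, r_cen, r_odd)
```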
An example
Training series 1 data set
powerser.training.sas
powerser.training.session1.R
Using polynomials with different
scales
Three ways of creating polynomials
different.scaling.of.polynomials.R
Motor vehicle data
predict.Model.for.Bob.add.manipulate.R
pa.mvtheft.frst40.higher.order.poly.toclass.R
pa.mvtheft.frst40.dat
Maximum-likelihood regression
Maximum-likelihood regression.docx
Influence statistics
What is the effect of removing a data point?
How influential is each data point?
Leverage
Residual value
Determine this for each data point
This is equivalent to the ‘jackknife’
For problematic data points, remove one at a time (the worst first)
See if the regression results change
add a data point to class.R
Based on the work of Belsley, Kuh, & Welsch (1980)
Examples in R and SAS
first.influence.and.collinearity.example.five.predi
ctors.sas
influence.statistics.R
collinearity.statistics.R
Hat matrix: the degree of leverage

h_i = x_i (X'X)^-1 x_i'

Cutoff: h_i > 2p/n
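The leverage formula and the 2p/n cutoff, as a Python/numpy sketch (data are made up; the last point is deliberately far from the rest):

```python
import numpy as np

x = np.array([1., 2., 3., 4., 5., 10.])   # toy data; x = 10 is extreme
y = np.array([1.2, 1.9, 3.2, 4.1, 5.6, 10.5])
X = np.column_stack([np.ones_like(x), x])  # intercept + slope
n, p = X.shape

H = X @ np.linalg.inv(X.T @ X) @ X.T       # hat matrix
h = np.diag(H)                             # leverages h_i
cutoff = 2 * p / n
print(h.round(3), "flagged:", np.where(h > cutoff)[0])
```

Note that the leverages always sum to p, the number of estimated coefficients.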
Studentized residual

rstudent(i) = r_i / ( s(i) sqrt(1 - h_i) )

where s(i) is the residual standard deviation with observation i deleted.

Cutoff: |rstudent(i)| > 2
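A Python/numpy sketch of the deletion formula (made-up data; the class files compute this in R/SAS):

```python
import numpy as np

x = np.array([1., 2., 3., 4., 5., 10.])
y = np.array([1.2, 1.9, 3.2, 4.1, 5.6, 10.5])  # toy data
X = np.column_stack([np.ones_like(x), x])
n, p = X.shape

beta = np.linalg.lstsq(X, y, rcond=None)[0]
e = y - X @ beta                                # raw residuals
h = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)   # leverages

# s(i): residual standard deviation with observation i deleted
t = np.empty(n)
for i in range(n):
    keep = np.arange(n) != i
    bi = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0]
    ei = y[keep] - X[keep] @ bi
    s_i = np.sqrt(ei @ ei / (n - 1 - p))
    t[i] = e[i] / (s_i * np.sqrt(1 - h[i]))

print(t.round(2))   # flag any |t| > 2
```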
Covariance ratio

COVRATIO(i) = det( s(i)^2 (X(i)'X(i))^-1 ) / det( s^2 (X'X)^-1 )

Cutoff: |COVRATIO - 1| > 3p/n
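The same deletion idea, written out directly from the determinant definition as a Python/numpy sketch (toy data again):

```python
import numpy as np

x = np.array([1., 2., 3., 4., 5., 10.])
y = np.array([1.2, 1.9, 3.2, 4.1, 5.6, 10.5])  # toy data
X = np.column_stack([np.ones_like(x), x])
n, p = X.shape

beta = np.linalg.lstsq(X, y, rcond=None)[0]
e = y - X @ beta
s2 = e @ e / (n - p)
denom = np.linalg.det(s2 * np.linalg.inv(X.T @ X))  # full-data covariance det

covratio = np.empty(n)
for i in range(n):
    keep = np.arange(n) != i
    Xi, yi = X[keep], y[keep]
    bi = np.linalg.lstsq(Xi, yi, rcond=None)[0]
    ei = yi - Xi @ bi
    s2i = ei @ ei / (n - 1 - p)                     # s(i)^2
    covratio[i] = np.linalg.det(s2i * np.linalg.inv(Xi.T @ Xi)) / denom

cutoff = 3 * p / n
print(covratio.round(2), "flagged:", np.where(np.abs(covratio - 1) > cutoff)[0])
```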
Difference in fits: DFFITS

DFFITS(i) = ( ŷ_i - ŷ_i(i) ) / ( s(i) sqrt(h_i) )

where ŷ_i(i) is the fitted value for point i when point i is deleted.

Cutoff: |DFFITS(i)| > 2 sqrt(p/n)
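A Python/numpy sketch of DFFITS by direct deletion (toy data; 2*sqrt(p/n) is the usual rule-of-thumb cutoff):

```python
import numpy as np

x = np.array([1., 2., 3., 4., 5., 10.])
y = np.array([1.2, 1.9, 3.2, 4.1, 5.6, 10.5])  # toy data
X = np.column_stack([np.ones_like(x), x])
n, p = X.shape

beta = np.linalg.lstsq(X, y, rcond=None)[0]
yhat = X @ beta
h = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)

dffits = np.empty(n)
for i in range(n):
    keep = np.arange(n) != i
    bi = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0]
    ei = y[keep] - X[keep] @ bi
    s_i = np.sqrt(ei @ ei / (n - 1 - p))           # s(i)
    yhat_i_del = X[i] @ bi                         # fitted value for i with i deleted
    dffits[i] = (yhat[i] - yhat_i_del) / (s_i * np.sqrt(h[i]))

cutoff = 2 * np.sqrt(p / n)
print(dffits.round(2), "flagged:", np.where(np.abs(dffits) > cutoff)[0])
```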
Difference in betas: DFBETAS
(Mike's favorite)

DFBETAS(j,i) = ( b_j - b_j(i) ) / ( s(i) sqrt( [(X'X)^-1]_jj ) )

Cutoff: |DFBETAS(j,i)| > 2/sqrt(n)
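A Python/numpy sketch: one DFBETAS value per coefficient per deleted observation (toy data; 2/sqrt(n) is the usual rule-of-thumb cutoff):

```python
import numpy as np

x = np.array([1., 2., 3., 4., 5., 10.])
y = np.array([1.2, 1.9, 3.2, 4.1, 5.6, 10.5])  # toy data
X = np.column_stack([np.ones_like(x), x])
n, p = X.shape

b = np.linalg.lstsq(X, y, rcond=None)[0]        # full-data coefficients
xtx_inv = np.linalg.inv(X.T @ X)

dfbetas = np.empty((n, p))                      # rows: deleted obs; cols: coefficients
for i in range(n):
    keep = np.arange(n) != i
    bi = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0]
    ei = y[keep] - X[keep] @ bi
    s_i = np.sqrt(ei @ ei / (n - 1 - p))        # s(i)
    dfbetas[i] = (b - bi) / (s_i * np.sqrt(np.diag(xtx_inv)))

cutoff = 2 / np.sqrt(n)
print("flagged (obs, coef):", list(zip(*np.where(np.abs(dfbetas) > cutoff))))
```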
Cook's Distance

D_i = r_i^2 h_i / ( p s^2 (1 - h_i)^2 )

Cutoff: D_i > 4/n (a common rule of thumb)
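Unlike the deletion statistics above, Cook's distance needs no refitting; it comes straight from the residuals and leverages. A Python/numpy sketch on the same toy data:

```python
import numpy as np

x = np.array([1., 2., 3., 4., 5., 10.])
y = np.array([1.2, 1.9, 3.2, 4.1, 5.6, 10.5])  # toy data
X = np.column_stack([np.ones_like(x), x])
n, p = X.shape

beta = np.linalg.lstsq(X, y, rcond=None)[0]
e = y - X @ beta                                # raw residuals
h = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)   # leverages
s2 = e @ e / (n - p)                            # residual variance

D = e**2 * h / (p * s2 * (1 - h)**2)            # Cook's distance
print(D.round(3))
```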
Multicollinearity
Tolerance (Tol)
Variance Inflation Factor (VIF)
Eigenvalues
Condition numbers
Tolerance and Variance Inflation Factor

Tolerance_k = 1 - R_k^2

Rough cutoff 1: Tol < .1
Rough cutoff 2: Tol < 1 - R^2

VIF_k = 1 / Tolerance_k = 1 / (1 - R_k^2)

Rough cutoff 1: VIF > 10
Rough cutoff 2: VIF > 1/(1 - R^2)
Definition of R_k^2 for the variance inflation factor

R_k^2 is the multiple R^2 from regressing X_k on the other
covariates; this regression does not involve the
response variable Y.
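A Python/numpy sketch straight from the definition: regress each predictor on the others, take that R_k^2, and form Tol and VIF. The simulated predictors (an assumption of this sketch) build in near-collinearity, so the VIFs should exceed the rule-of-thumb 10:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = x1 + x2 + rng.normal(scale=0.1, size=n)   # nearly a linear combination
X = np.column_stack([x1, x2, x3])

def vif(X, k):
    """VIF for column k: regress X[:, k] on the other columns (plus intercept)."""
    others = np.delete(X, k, axis=1)
    Z = np.column_stack([np.ones(len(X)), others])
    beta = np.linalg.lstsq(Z, X[:, k], rcond=None)[0]
    resid = X[:, k] - Z @ beta
    r2 = 1 - resid @ resid / np.sum((X[:, k] - X[:, k].mean())**2)
    return 1.0 / (1.0 - r2)                     # = 1 / Tolerance_k

vifs = np.array([vif(X, k) for k in range(3)])
print(vifs.round(1))
```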
Condition numbers

Condition numbers (condition indices) are the square roots of the ratios
of the largest eigenvalue to each individual eigenvalue. Eigenvalues are
the characteristic roots of X'X. Conventionally, an eigenvalue close to zero
(say, less than .01) or a condition number greater than 50 (30 for conservative
analysts) indicates serious multicollinearity. Belsley, Kuh, and Welsch (1980)
treat 10 as the point where collinearity begins to affect estimates and 100
as seriously problematic.
Eigenvalues

Covariance matrix A = [ 3  1 ]     Identity matrix I = [ 1  0 ]
                      [ 1  3 ]                         [ 0  1 ]

Eigenvalue equation: |A - λI| = 0

|A - λI| = | 3-λ    1  |
           |  1    3-λ | = 0

(3 - λ)^2 - 1 = 0
9 - 6λ + λ^2 - 1 = 0
(λ - 4)(λ - 2) = 0
λ = 4, 2
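The hand calculation can be checked numerically; a Python/numpy sketch (eigvalsh handles symmetric matrices) that also forms the condition numbers sqrt(λ_max / λ_i):

```python
import numpy as np

A = np.array([[3., 1.],
              [1., 3.]])

eigvals = np.sort(np.linalg.eigvalsh(A))[::-1]   # descending: [4, 2]
cond = np.sqrt(eigvals[0] / eigvals)             # condition numbers: [1, sqrt(2)]
print(eigvals, cond)
```

Here the eigenvalues 4 and 2 are roughly equal, so the condition numbers are small and there is no sign of multicollinearity.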
Interpretation
Eigenvalues roughly equal or none particularly large indicate no
multicollinearity
Turn into condition number

Criteria
>50
10 beginning; 100 problematic (Belsley, Kuh, & Welsch, 1980)
For high condition number check which independent variable has largest
proportion of variance associated with that eigenvalue and consider
removing
Star data
stardata.toclass.R
stardata.forR.dat
stardata.dat
