This document summarizes key concepts in simple and multiple linear regression models:
1. It defines the prediction equation, sample slope, sample y-intercept, and measures of fit including R², adjusted R², and the standard error of the estimate, as well as tests for individual predictors and overall model significance.
2. It provides formulas and explanations for confidence and prediction intervals for mean and individual predicted y-values.
3. It also covers topics in multiple linear regression including partial correlations, omitted variable bias, and measures of collinearity like variance inflation factors.
Simple Linear Regression
SIMPLE LINEAR REGRESSION
1. Prediction Equation
   ŷᵢ = β̂₀ + β̂₁xᵢ

2. Sample Slope
   β̂₁ = SSxy/SSxx
      = Σ(xᵢ − x̄)(yᵢ − ȳ) / Σ(xᵢ − x̄)²
      = (Σxᵢyᵢ − nx̄ȳ) / (Σxᵢ² − nx̄²)
      = cov(x, y)/var(x)
      = r · Sy/Sx
   where SSxx = Σx² − (Σx)²/n,  SSxy = Σxy − (Σx)(Σy)/n,  and SSX = (n − 1)Sx²

3. Sample Y-Intercept
   β̂₀ = ȳ − β̂₁x̄

4. Coefficient of Determination
   R² = SSR/SST = 1 − SSE/SST

5. Standard Error of Estimate
   Se = √[ Σ(Yᵢ − Ŷᵢ)² / (n − k − 1) ]

6. Standard Errors of β̂₀ and β̂₁
   Se(β̂₀) = Se · √[ 1/n + x̄²/((n − 1)Sx²) ] = Se · √(Σx²) / √(n·SSxx)
   Se(β̂₁) = Se / (Sx·√(n − 1)) = Se/√SSxx

7. Test Statistic for β̂₁
   t(n−2) = (β̂₁ − β₁) / Se(β̂₁)
   In general: t = (Estimate − Parameter) / (Estimated standard error of the estimate)

8. Confidence Intervals for β₀ and β₁
   β̂₀ ± t(α/2, n−2) · Se(β̂₀)
   β̂₁ ± t(α/2, n−2) · Se(β̂₁)

9. Confidence Interval for the Mean Value of Y Given x
   A (1 − α)100% confidence interval for E(Y|X):
   Ŷᵢ ± t(α/2, n−2) · Se · √[ 1/n + (Xᵢ − X̄)²/SSX ]
   Here Ŷᵢ is the estimate of E(Y|X).

10. Prediction Interval for a Random Value of Y Given x
    A (1 − α)100% prediction interval for Y:
    Ŷᵢ ± t(α/2, n−2) · Se · √[ 1 + 1/n + (Xᵢ − X̄)²/SSX ]
    where Xᵢ is the observed value of the independent variable, Ŷᵢ is the estimate of Y,
    n is the sample size, and Se is the standard error.

11. Coefficient of Correlation
    r = √R²  (for simple regression)
      = SSXY / √(SSXX·SSYY)

12. Adjusted R²
    R²A = 1 − [SSE/(n − k − 1)] / [SST/(n − 1)]
        = 1 − (1 − R²) · (n − 1)/(n − (k + 1))
    R²A = adjusted coefficient of determination
    R²  = unadjusted coefficient of determination
    n   = number of observations
    k   = number of explanatory variables

13. Variance Inflation Factor
    VIF(Xⱼ) = 1/(1 − Rⱼ²)
    Rⱼ² is the coefficient of determination for the regression with Xⱼ as the dependent
    variable and all other Xᵢ as independent variables. If VIF > 10, multicollinearity
    is suspected.

14. Tolerance Factor
    Tolerance = 1 − Rⱼ² = 1/VIF

15. Beta Weights (Standardized Beta)
    Beta = βᵢ × Sx/Sy
    Sx = standard deviation of X
    Sy = standard deviation of Y

Stepwise criteria:
    Forward regression: enter a variable if Fin > 3.84 (Pin < 0.05)
    Backward regression: remove a variable if Fout < 2.71 (Pout > 0.10)

ANOVA TABLE
Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square               F Statistic
Regression            SSR              k                    MSR = SSR/k               F(k, n−(k+1)) = MSR/MSE
Error                 SSE              n − (k + 1)          MSE = SSE/(n − (k + 1))
Total                 SST              n − 1
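The core fit formulas (items 1–5) and the ANOVA F statistic can be checked numerically. The sketch below uses pure Python and a small invented dataset (the x and y values are illustrative, not from this sheet):

```python
from math import sqrt

# Toy data, invented for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n, k = len(x), 1  # k = number of explanatory variables

x_bar, y_bar = sum(x) / n, sum(y) / n

# Item 2: sample slope via SSxy / SSxx
ss_xy = sum(a * b for a, b in zip(x, y)) - n * x_bar * y_bar
ss_xx = sum(a * a for a in x) - n * x_bar ** 2
b1 = ss_xy / ss_xx

# Item 3: sample y-intercept
b0 = y_bar - b1 * x_bar

# Item 1: prediction equation
y_hat = [b0 + b1 * a for a in x]

# Item 4: R^2 = 1 - SSE/SST, using the decomposition SST = SSR + SSE
sse = sum((b - h) ** 2 for b, h in zip(y, y_hat))
sst = sum((b - y_bar) ** 2 for b in y)
ssr = sst - sse
r2 = 1 - sse / sst

# Item 5: standard error of estimate
se = sqrt(sse / (n - k - 1))

# ANOVA table: F = MSR / MSE with (k, n-(k+1)) degrees of freedom
msr = ssr / k
mse = sse / (n - (k + 1))
f_stat = msr / mse

print(f"slope={b1:.3f} intercept={b0:.3f} R2={r2:.4f} F={f_stat:.1f}")
```

Comparing `f_stat` against the critical F value with (k, n−(k+1)) degrees of freedom gives the overall significance test of item 17.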
16. Partial F Test
    F(r, n−(k+1)) = [(SSE_R − SSE_F)/r] / MSE_F
                  = [(R²F − R²R)/r] / [(1 − R²F)/(n − (k + 1))]
    SSE_R = sum of squared errors for the reduced model
    SSE_F = sum of squared errors for the full model
    r = number of variables dropped from the full model (or added to the reduced model)

17. F Test (Overall Significance of the Model)
    F(k, n−(k+1)) = MSR/MSE
                  = (SSR/k) / [SSE/(n − (k + 1))]
                  = (R²/k) / [(1 − R²)/(n − (k + 1))]
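The R² form of the partial F test (item 16) is easy to compute directly. A minimal sketch, with hypothetical R² values and sample sizes chosen purely for illustration:

```python
# Partial F test in its R^2 form (item 16). All numbers below are hypothetical.
def partial_f(r2_full: float, r2_reduced: float, r: int, n: int, k: int) -> float:
    """F statistic with (r, n-(k+1)) degrees of freedom.

    r = number of variables dropped from the full model;
    k = number of predictors in the full model.
    """
    numerator = (r2_full - r2_reduced) / r
    denominator = (1 - r2_full) / (n - (k + 1))
    return numerator / denominator

# Dropping r=2 of k=5 predictors with n=50 observations:
f = partial_f(r2_full=0.80, r2_reduced=0.75, r=2, n=50, k=5)
# Compare f against the critical F value with (2, 44) degrees of freedom.
```

A large F means the dropped variables explained a non-trivial share of variance and should be retained.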
MULTIPLE LINEAR REGRESSION
18. Prediction Interval
    A (1 − α)100% prediction interval for the value of a randomly chosen Y, given values of Xᵢ:
    ŷ ± t(α/2, n−(k+1)) · √[s²(ŷ) + MSE]

19. Confidence Interval
    A (1 − α)100% confidence interval for a conditional mean of Y, given values of Xᵢ:
    ŷ ± t(α/2, n−(k+1)) · S[Ê(Y)]

20. Partial Correlation
    Correlation between y and x₁ when the influence of x₂ is removed from both y and x₁.

21. Semi-Partial Correlation (Part Correlation)
    Correlation between y and x₁ when the influence of x₂ is removed from x₁ (but not from y):
    sr(y1·2) = [r(y1) − r(y2)·r(12)] / √(1 − r(12)²)
    The square of the part correlation of an explanatory variable equals that variable's
    unique contribution to R². When the variable is added:
    sr² = change in R² = R²new − R²old

22. Omitted Variable Bias
    Actual relationship:  Y = β₀ + β₁X₁ + β₂X₂
    Fitted model:         Y = α₀ + α₁X₁
    Then:
    α₁ = β₁ + β₂ × Cov(x₁, x₂)/Var(x₁)
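The omitted-variable identity in item 22 can be verified numerically. The sketch below uses made-up, noise-free data (y is an exact linear function of x₁ and x₂), so the slope from the short regression of y on x₁ alone matches β₁ + β₂·Cov(x₁, x₂)/Var(x₁) exactly:

```python
# Toy, noise-free data: y is an exact linear function of x1 and x2.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.0, 1.0, 2.0, 3.0, 5.0]  # deliberately correlated with x1
beta0, beta1, beta2 = 1.0, 2.0, 0.5
y = [beta0 + beta1 * a + beta2 * b for a, b in zip(x1, x2)]

n = len(x1)
m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n

# Raw deviation sums; the (n-1) divisors cancel in every ratio below.
var1 = sum((a - m1) ** 2 for a in x1)
cov12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
cov1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))

# Slope from the short regression of y on x1 alone (x2 omitted)
alpha1 = cov1y / var1

# Item 22: alpha1 = beta1 + beta2 * Cov(x1, x2) / Var(x1)
predicted = beta1 + beta2 * cov12 / var1
print(alpha1, predicted)
```

Because Cov(x₁, x₂) > 0 and β₂ > 0 here, the short regression overstates β₁, which is the direction of bias the formula predicts.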