
Lecture Week 9 - Regression

The document discusses various predictive analytic techniques, focusing on regression analysis, including simple linear regression, multiple linear regression, and logistic regression. It highlights the importance of model adequacy, including the significance of regression coefficients, the coefficient of determination (R²), and the need for residual analysis to ensure the model's assumptions are met. Additionally, it addresses multicollinearity and methods for assessing model performance through metrics like PRESS and adjusted R².

Uploaded by

Layan Mahasneh

LO4: Investigate a range of predictive analytic techniques to discover new knowledge for forecasting future events
• Agenda
• Linear regression
• Multiple linear regression
• Categorical regression
• Logistic regression
Simple Linear Regression and Correlation
• Many problems in engineering and science involve
exploring the relationships between two or more
variables. Regression analysis is a statistical technique
that is very useful for these types of problems. For
example, in a chemical process, suppose that the yield
of the product is related to the process-operating
temperature. Regression analysis can be used to build a
model to predict yield at a given temperature level. This
model can also be used for process optimization, such
as finding the level of temperature that maximizes
yield, or for process control purposes.
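As a minimal sketch of the yield-versus-temperature example, the following fits a least-squares line and predicts yield at a chosen temperature. The data values, the function name fit_simple_linear, and the prediction point of 155 are illustrative assumptions, not figures from the lecture.

```python
# Illustrative temperature and yield readings (assumed values, not from the lecture).
temperature = [100, 110, 120, 130, 140, 150, 160, 170, 180, 190]
yield_pct = [45, 51, 54, 61, 66, 70, 74, 78, 85, 89]


def fit_simple_linear(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sxy and Sxx are the corrected sums of cross-products and squares.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


b0, b1 = fit_simple_linear(temperature, yield_pct)
predicted_yield_at_155 = b0 + b1 * 155  # prediction at an assumed temperature

print(f"yield = {b0:.3f} + {b1:.4f} * temperature")
print(f"predicted yield at 155: {predicted_yield_at_155:.2f}")
```

The slope estimates the change in yield per one-unit increase in temperature. Because a straight line has no interior maximum, using the model to find a yield-maximizing temperature would need a model with curvature, such as a quadratic term.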
HYPOTHESIS TESTS IN SIMPLE LINEAR REGRESSION (Individual Coefficients)
Analysis of Variance Approach to Test Significance of Regression
ADEQUACY OF THE REGRESSION MODEL: Coefficient of Determination (R²)
A model may have a high R² but still be inadequate if its
assumptions are violated: linearity, normality of
residuals, homoscedasticity, independence of errors, and
no multicollinearity among the predictors (Variance Inflation
Factor: VIF < 5 indicates low to moderate multicollinearity,
VIF > 5 moderate to high).
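To illustrate why a high R² alone does not establish adequacy, the sketch below fits a straight line to hypothetical data generated from y = x², so the linearity assumption is violated by construction. The data and variable names are assumptions made for this example.

```python
# Hypothetical data: y is exactly x squared, so the true relationship is curved.
x = list(range(1, 11))
y = [xi ** 2 for xi in x]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least-squares straight line fitted to the curved data.
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

fitted = [intercept + slope * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

sse = sum(e ** 2 for e in residuals)        # error sum of squares
sst = sum((yi - y_bar) ** 2 for yi in y)    # total sum of squares
r_squared = 1 - sse / sst

print(f"R^2 = {r_squared:.3f}")
print("residuals:", [round(e, 1) for e in residuals])
```

R² comes out at about 0.95, yet the residuals are positive at both ends of the x range and negative in the middle. That systematic pattern is what a residual plot would reveal and what R² by itself hides.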
Sample Correlation Coefficient
MULTIPLE LINEAR REGRESSION
Matrix Approach to Multiple Linear Regression
Multicollinearity: Variance Inflation Factor (VIF). VIF < 5 indicates low to
moderate multicollinearity; VIF > 5 indicates moderate to high.
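One common way to obtain the VIF for predictor j is to regress it on the remaining predictors and compute 1 / (1 - R_j²). The sketch below does this with NumPy; the function name variance_inflation_factors and the three predictor columns are hypothetical, chosen so that x2 is nearly twice x1 while x3 is unrelated to both.

```python
import numpy as np


def variance_inflation_factors(X):
    """Return VIF_j = 1 / (1 - R_j^2) for each column j of X, where R_j^2
    comes from regressing column j on the other columns plus an intercept."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    vifs = []
    for j in range(k):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        sse = float(resid @ resid)
        sst = float(((target - target.mean()) ** 2).sum())
        r_squared = 1.0 - sse / sst
        vifs.append(1.0 / (1.0 - r_squared))
    return vifs


# Hypothetical predictors: x2 is roughly 2 * x1, x3 is unrelated to both.
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]
x3 = [5, 3, 8, 1, 9, 2, 7, 4]

vifs = variance_inflation_factors(np.column_stack([x1, x2, x3]))
for name, vif in zip(["x1", "x2", "x3"], vifs):
    flag = "moderate to high" if vif > 5 else "low to moderate"
    print(f"{name}: VIF = {vif:.2f} ({flag})")
```

With these assumed values, x1 and x2 fall well above the threshold of 5 and x3 stays close to 1, which matches the rule of thumb stated above.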

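R-sqr: computed on the training data; the share of variation in y explained by the fitted model
R-sqr adj: penalizes extra predictors; guards against overfitting
R-sqr pred: estimates how well the model predicts new data; computed from PRESS
PRESS (Prediction Sum of Squares): assesses how well a regression model will
predict new, unseen data. It is a leave-one-out cross-validation (LOOCV)
statistic: each observation i is left out in turn, the model is refit on the
remaining n - 1 observations, and the squared error of predicting y_i from
that fit is added to the sum. Note: R-sqr pred = 1 - PRESS/SST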
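The leave-one-out procedure can be sketched directly for simple linear regression. The helper names fit_line and press_statistic and the data values are assumptions for illustration, and the explicit refit loop is written for clarity rather than speed.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    return y_bar - b1 * x_bar, b1


def press_statistic(x, y):
    """Sum of squared leave-one-out prediction errors."""
    press = 0.0
    for i in range(len(x)):
        # Refit the model without observation i, then predict y[i].
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        b0, b1 = fit_line(x_train, y_train)
        press += (y[i] - (b0 + b1 * x[i])) ** 2
    return press


# Illustrative data (assumed values).
x = [100, 110, 120, 130, 140, 150, 160, 170, 180, 190]
y = [45, 51, 54, 61, 66, 70, 74, 78, 85, 89]

b0, b1 = fit_line(x, y)
y_bar = sum(y) / len(y)
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
sst = sum((yi - y_bar) ** 2 for yi in y)
press = press_statistic(x, y)

r2 = 1 - sse / sst
r2_pred = 1 - press / sst
print(f"R-sqr = {r2:.4f}, PRESS = {press:.3f}, R-sqr pred = {r2_pred:.4f}")
```

PRESS is never smaller than SSE, so R-sqr pred is never larger than R-sqr. A large gap between the two suggests the model fits the training data better than it predicts new observations.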
Test for Significance of Regression (Overall Model Significance)
• n: Total number of observations (samples)
• p: Total number of parameters estimated, including the intercept (p = k + 1)
• k: Number of predictor variables (independent variables)
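Using n, p, and k as defined above, the overall F statistic is F = (SSR / k) / (SSE / (n - p)), and adjusted R² is 1 - (SSE / (n - p)) / (SST / (n - 1)). The sketch below evaluates both from the sums of squares. The function name and the numeric inputs are hypothetical, chosen to keep the arithmetic easy to follow.

```python
def overall_f_and_adjusted_r2(sst, sse, n, k):
    """Return (F statistic, R^2, adjusted R^2) for a model with k predictors
    fitted to n observations; p = k + 1 counts the intercept as well."""
    p = k + 1
    ssr = sst - sse              # regression sum of squares
    msr = ssr / k                # mean square for regression
    mse = sse / (n - p)          # mean square error
    f_stat = msr / mse
    r2 = 1 - sse / sst
    r2_adj = 1 - mse / (sst / (n - 1))
    return f_stat, r2, r2_adj


# Hypothetical sums of squares for a model with 4 predictors and 25 observations.
f_stat, r2, r2_adj = overall_f_and_adjusted_r2(sst=100.0, sse=20.0, n=25, k=4)
print(f"F = {f_stat:.2f}, R^2 = {r2:.2f}, adjusted R^2 = {r2_adj:.2f}")
# prints: F = 20.00, R^2 = 0.80, adjusted R^2 = 0.76
```

The F statistic is compared with the F distribution on k and n - p degrees of freedom. A value above the critical value at the chosen significance level indicates that at least one predictor contributes to the model.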
R² and Adjusted R²
MODEL ADEQUACY CHECKING: Residual Analysis