Module 4.0 Improve Phase - 4.1 Simple Linear Regressions
IMPROVE PHASE
Improve Phase – Overview and Objectives
• The correlation coefficient, denoted by the symbol r, quantifies the strength and direction of the linear
relationship between two continuous variables.
• The correlation coefficient is used in the Analyze Phase of Six Sigma projects to quantify the linear relationship
between a continuous X and a continuous Y variable. A strong correlation does not by itself prove that X causes Y.
• An r value of -1 indicates a perfect negative correlation and an r value of +1 indicates a perfect positive
correlation.
• When r = 1 or r = -1, all points fall on a straight line; when r = 0, they are scattered and give no
evidence of a linear relationship. Any other value of r suggests the degree to which the points
tend to be linearly related.
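The behaviour of r at its extremes can be checked with a short calculation. The following is a minimal sketch using only the Python standard library; the function name `pearson_r` and the small data sets are made up for illustration and are not from the course material.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient r for two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable has no variation")
    return sxy / math.sqrt(sxx * syy)

# Illustrative (made-up) data
x = [1, 2, 3, 4, 5]
r_pos = pearson_r(x, [2 * v + 1 for v in x])    # points on a rising line
r_neg = pearson_r(x, [-3 * v + 10 for v in x])  # points on a falling line
r_mid = pearson_r(x, [2, 1, 4, 3, 5])           # scattered but increasing

print(r_pos, r_neg, r_mid)
```

Points lying exactly on a line give r = +1 or r = -1 depending on the sign of the slope, while the scattered set gives a value in between.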
Simple Linear Regression
Simple linear regression uses only a single predictor (independent) variable ‘X’ (e.g., cutting speed) and a
single response (dependent) variable ‘Y’ (e.g., tool life).
Yi = β0 + β1 xi + εi,   i = 1, 2, ..., n
where β0 is the intercept, β1 is the slope of the line, and εi is the random error.
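As an illustration of how β0 and β1 are usually estimated, the sketch below applies the ordinary least squares formulas (slope = Sxy / Sxx, intercept = mean of Y minus slope times mean of X). The slides do not state the estimation method, so least squares is an assumption here, and the cutting speed and tool life numbers are made up purely for demonstration.

```python
def fit_simple_linear_regression(x, y):
    """Return (b0, b1), the least squares estimates of intercept and slope."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must vary to estimate a slope")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx               # estimated slope
    b0 = mean_y - b1 * mean_x    # estimated intercept
    return b0, b1

# Hypothetical data: cutting speed (X) and tool life (Y)
cutting_speed = [100, 120, 140, 160, 180]
tool_life = [42, 37, 33, 28, 22]

b0, b1 = fit_simple_linear_regression(cutting_speed, tool_life)

# Predicted tool life at a new cutting speed, using the fitted line
predicted_at_150 = b0 + b1 * 150
print(f"Y = {b0:.3f} + ({b1:.3f}) * X")
```

The fitted line can then be used for prediction within the range of X values that was actually observed.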
Simple Linear Regression – Example
A study was performed on the wear of a bearing (Y) and its relationship to X1 = oil viscosity. The following data
were obtained. Fit a simple regression model to the data (Y vs. X1) at α = 0.05.
Source           DF      SS      MS      F      P
Regression        1   10240   10240  10.58  0.031
Residual Error    4    3872     968
Total             5   14112
Conclusion: There is a regression between wear and oil viscosity (P = 0.031 is less than 0.05).
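The entries of the ANOVA table are linked by simple arithmetic: each mean square is SS divided by DF, and F is the regression mean square divided by the residual mean square. The sketch below reproduces those steps from the SS values in the table. The function name is illustrative, the number of observations (6) is inferred from the total DF of 5, and the P value is taken from the table rather than computed. Treating 0.05 as the significance level is an assumption based on the problem statement.

```python
def anova_simple_regression(ss_regression, ss_residual, n_observations):
    """Build the ANOVA quantities for a regression with one predictor."""
    df_regression = 1                     # one predictor
    df_residual = n_observations - 2      # n minus two estimated coefficients
    ms_regression = ss_regression / df_regression
    ms_residual = ss_residual / df_residual
    return {
        "df_total": df_regression + df_residual,
        "ss_total": ss_regression + ss_residual,
        "ms_regression": ms_regression,
        "ms_residual": ms_residual,
        "f_statistic": ms_regression / ms_residual,
    }

# Sums of squares from the table above; n = 6 follows from total DF = 5
table = anova_simple_regression(ss_regression=10240, ss_residual=3872,
                                n_observations=6)

p_value_from_table = 0.031   # read from the table, not computed here
alpha = 0.05                 # assumed significance level
is_significant = p_value_from_table < alpha

print(round(table["f_statistic"], 2), is_significant)
```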
Coefficient of Determination (R²)
The P value of the regression equation as per the ANOVA table is 0.015, which means the equation is valid.
The adjusted R² value is 75.9%, which means only about 76% of the variation in Y is explained by X2. Some other
factors also contribute to the variation.
More potential Xs need to be identified. Adjusted R² should be at least 85%.
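For reference, R² and adjusted R² can be computed from the residual and total sums of squares. The sketch below uses the standard definitions, R² = 1 - SS_residual / SS_total and adjusted R² = 1 - (1 - R²)(n - 1) / (n - p - 1), where p is the number of predictors. The input numbers are made up for illustration and do not come from the example above. The 85% target is the guideline quoted on the slide.

```python
def r_squared(ss_residual, ss_total):
    """Fraction of the variation in Y explained by the model."""
    return 1.0 - ss_residual / ss_total

def adjusted_r_squared(r2, n_observations, n_predictors=1):
    """R-squared penalised for the number of predictors in the model."""
    return 1.0 - (1.0 - r2) * (n_observations - 1) / (
        n_observations - n_predictors - 1
    )

# Hypothetical values, chosen only to show the arithmetic
ss_residual = 20.0
ss_total = 100.0
n_observations = 12

r2 = r_squared(ss_residual, ss_total)
r2_adj = adjusted_r_squared(r2, n_observations, n_predictors=1)

# Guideline from the slide: adjusted R-squared should be at least 85%
meets_target = r2_adj >= 0.85

print(f"R-sq = {r2:.1%}, R-sq(adj) = {r2_adj:.1%}, meets target: {meets_target}")
```

When the target is not met, as in this made-up case, the usual next step is the one the slide recommends: look for additional potential Xs.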
Residual Analysis:
Residuals vs. fits: points fall randomly above and below the line with no trend (OK)
Residuals vs. time order: points fall randomly above and below the line with no trend (OK)
Overall conclusion: The regression equation is valid, but X2 accounts for only 76% of the variation in Y.
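The residual checks above are normally done by inspecting plots, but the residuals themselves are easy to compute. The sketch below fits a least squares line to made-up data, computes residuals (observed minus fitted) and counts how often their sign changes in observation order. The run count is only a crude numeric stand-in for the visual check and is an assumption of this sketch, not a method from the slides. All names and data are illustrative.

```python
def fit_line(x, y):
    """Least squares intercept and slope for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1

def count_sign_runs(residuals):
    """Number of runs of consecutive same-sign residuals (zeros skipped)."""
    signs = [r > 0 for r in residuals if r != 0]
    if not signs:
        return 0
    return 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)

# Hypothetical data, listed in time order
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

b0, b1 = fit_line(x, y)
fits = [b0 + b1 * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fits)]

# Very few runs would suggest a trend; frequent sign changes suggest scatter
runs = count_sign_runs(residuals)

for order, (fit, res) in enumerate(zip(fits, residuals), start=1):
    print(f"{order}: fit = {fit:.3f}, residual = {res:+.3f}")
```

Plotting `residuals` against `fits` and against the observation order gives the two charts named in the residual analysis.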
Summary – Simple Linear Regression
• Correlation Analysis
• Regression Equation