QT - Unit 2 - Part B - Regression

UNIT II – Part B

REGRESSION
Regression Analysis
Establishing correlation is a prerequisite for linear regression: we cannot use
linear regression unless there is a linear correlation between the variables.
Correlation analysis describes the present or past situation; it uses sample data
to infer a property of the source population or process, and does not look into
the future. Linear regression, by contrast, is used to predict results.

Correlation analysis studies whether the variables under study are related or not,
and to what degree. Correlation analysis does not attempt to identify a
cause-effect relationship; regression does.
In correlation, we ask to what degree the plotted data forms a shape that seems
to follow an imaginary line through it, but we do not try to specify that line.
In linear regression, establishing that line is the whole point: we calculate a
best-fit line through the data, y = a + bx.
Regression analysis establishes the "nature of relationship" between the
variables. It studies the functional relationship and provides a mechanism for
prediction or forecasting.
Regression analysis is a statistical method to model the relationship
between a dependent (target or outcome) variable and one or more
independent (predictor) variables.
Regression analysis helps us to understand how the value of the dependent
variable is changing corresponding to an independent variable when other
independent variables are held fixed. It predicts continuous/real values
such as temperature, age, salary, price, etc.
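As a minimal sketch of this idea, the following fits a line y = a + bx to a handful of experience/salary pairs and predicts a new value. The figures are made-up illustrative data, not taken from the text:

```python
# A minimal sketch of simple linear regression (y = a + b*x) using only the
# standard library. The experience/salary figures below are made-up
# illustrative data.
from statistics import mean

x = [1, 2, 3, 4, 5]            # independent variable: years of experience
y = [30, 35, 42, 48, 53]       # dependent variable: salary (in thousands)

x_bar, y_bar = mean(x), mean(y)
b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / \
    sum((xi - x_bar) ** 2 for xi in x)   # slope: b_yx
a = y_bar - b * x_bar                    # intercept

def predict(xi):
    """Predicted value of the dependent variable for a given xi."""
    return a + b * xi

print(round(predict(6), 2))   # salary predicted for 6 years of experience
```

Here salary is the dependent (target) variable and years of experience the independent (predictor) variable.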

Dependent Variable: The main factor in regression analysis which we want to
predict or understand is called the dependent variable. It is also called the
outcome or target variable.

Independent Variable: The factors which affect the dependent variable, or which
are used to predict its values, are called independent variables, also called
predictor variables.
(In the accompanying scatter plot, the green points are the actual data points.)
Least Squares Method
Line of Best Fit/Regression Line
The least-squares regression method is a technique commonly used in
Regression Analysis. It is a mathematical method used to find the best fit
line that represents the relationship between an independent and
dependent variable in such a way that the error is minimized.
The Line of best fit is drawn across a scatter plot of data points in order to
represent a relationship between those data points.
The least squares method is one of the most effective ways to draw the line of
best fit. It is based on the idea that the sum of the squares of the errors
(residuals) must be minimized as far as possible, hence the name "least squares"
method.
Regression Line
A regression line is a statistical concept that describes the relationship
between an independent variable and a dependent variable and facilitates
prediction. It is a straight line that reflects the best-fit connection between
the independent and dependent variables in a dataset.

• The independent variable is generally shown on the X-axis and the dependent
variable on the Y-axis.
• The main purpose of developing a regression line is to predict or estimate the
value of the dependent variable based on the values of one or more independent
variables.
There are always two lines of regression:
Y on X – predicts the value of Y from known values of X.
X on Y – predicts the value of X from known values of Y.

Each line is "best fit" in the least-squares sense: the line of Y on X minimizes
the sum of the squares of the vertical distances from the observed points to the
line, while the line of X on Y minimizes the sum of the squares of the
horizontal distances.
When X is known and Y is to be predicted, Y on X is used.
When Y is known and X is to be predicted, X on Y is used.
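A short sketch of the two lines on the same illustrative data set. Y on X minimizes vertical distances and X on Y minimizes horizontal ones, so in general the two lines differ (they coincide only when r = ±1):

```python
# Sketch of the two regression lines on one data set. The data are
# illustrative, not from the text.
from statistics import mean

x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 4, 6]

def reg_coeff(u, v):
    """Regression coefficient of v on u: cov(u, v) / var(u)."""
    u_bar, v_bar = mean(u), mean(v)
    return sum((ui - u_bar) * (vi - v_bar) for ui, vi in zip(u, v)) / \
           sum((ui - u_bar) ** 2 for ui in u)

b_yx = reg_coeff(x, y)   # coefficient for the line Y on X
b_xy = reg_coeff(y, x)   # coefficient for the line X on Y

# Y on X: predict Y when X = 6; X on Y: predict X when Y = 7.
y_hat = mean(y) + b_yx * (6 - mean(x))
x_hat = mean(x) + b_xy * (7 - mean(y))
```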
Assumptions of (Simple Linear) Regression
We make a few assumptions when we use linear regression to model the
relationship between independent and dependent variables. These
assumptions are essentially conditions that should be met before we draw
inferences regarding the model estimates or before we use a model to make a
prediction.

Regression fails to deliver good results with data sets that do not fulfil its
assumptions. Therefore, for a successful regression analysis, it is essential to
validate these assumptions:
1. Linear relationship - There should be a linear relationship between dependent
(response) variable and independent (predictor) variable(s).
2. Normality of Errors- The errors or residuals must be normally distributed.
3. Homoscedasticity (or, equal variance around the line) - The error terms must
have constant variance.
4. No multicollinearity - The independent variables should not be correlated.
5. No autocorrelation - There should be no correlation between the residuals
(errors). The presence of correlation in the error terms drastically reduces the
model's accuracy.
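A quick residual check related to the assumptions above, assuming a line fitted by least squares. The residuals always average to zero by construction, and a crude lag-1 correlation of the residuals gives a rough autocorrelation check. The data are illustrative:

```python
# Residual diagnostics sketch for a least-squares fit. Data are illustrative.
from statistics import mean

x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

x_bar, y_bar = mean(x), mean(y)
b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / \
    sum((xi - x_bar) ** 2 for xi in x)
a = y_bar - b * x_bar

residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
assert abs(mean(residuals)) < 1e-9   # holds for any least-squares fit

# Crude lag-1 autocorrelation of residuals; values near 0 are reassuring.
e = residuals
lag1 = sum(e[i] * e[i + 1] for i in range(len(e) - 1)) / sum(ei ** 2 for ei in e)
```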
Coefficient of Regression
The regression coefficient b_yx measures the average change in Y per unit change
in X; likewise, b_xy measures the average change in X per unit change in Y.

Regression Equations/Lines
Y on X:  y − ȳ = b_yx (x − x̄)
X on Y:  x − x̄ = b_xy (y − ȳ)

Regression Coefficient – Some Formulas
1. From original data: b_yx = (nΣxy − Σx·Σy) / (nΣx² − (Σx)²)
2. From actual means: b_yx = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)²
3. From assumed means (dx = x − A, dy = y − B):
   b_yx = (nΣdx·dy − Σdx·Σdy) / (nΣdx² − (Σdx)²)
4. From covariance & standard deviation: b_yx = Cov(X, Y) / σx²
5. From correlation coefficient & standard deviation: b_yx = r·(σy/σx)
(The corresponding b_xy formulas follow by interchanging x and y, e.g.
b_xy = r·(σx/σy).)

Relation between Correlation Coefficient & Regression Coefficient
b_yx · b_xy = r², so r = ±√(b_yx · b_xy), taking the sign common to the two
regression coefficients.
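The relation between the regression coefficients and r can be checked numerically. The sketch below computes b_yx, b_xy, and r from covariance and standard deviations on illustrative data and verifies b_yx = r·(σy/σx) and b_yx·b_xy = r²:

```python
# Sketch verifying the relation between regression coefficients and r.
# Data are illustrative.
from statistics import mean, pstdev

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(x), mean(y)
cov = mean([(xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)])  # population covariance
sx, sy = pstdev(x), pstdev(y)   # population standard deviations

b_yx = cov / sx ** 2
b_xy = cov / sy ** 2
r = cov / (sx * sy)

assert abs(b_yx - r * sy / sx) < 1e-9
assert abs(b_yx * b_xy - r ** 2) < 1e-9
```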
Properties of Regression Coefficients
• The correlation coefficient is the geometric mean of the two regression
coefficients: r = ±√(b_yx · b_xy).
• Both regression coefficients have the same sign, which is also the sign of r.
• If one regression coefficient is numerically greater than 1, the other must be
less than 1, since b_yx · b_xy = r² ≤ 1.
• Regression coefficients are independent of change of origin but not of change
of scale.
• The two regression lines intersect at the point (x̄, ȳ).
Coefficient of Determination
• The coefficient of determination (R²) measures how well a
statistical model predicts an outcome. The outcome is represented by the
model’s dependent variable.
• The coefficient of determination is written as R², pronounced "R squared."
• The lowest possible value of R² is 0 and the highest possible value is 1. Put
simply, the better a model is at making predictions, the closer R² will be to 1.
You can see in the first dataset that when R² is high, the observations are
close to the model's predictions; in other words, most points are close to the
line of best fit.

In contrast, in the second dataset, when R² is low, the observations are far
from the model's predictions; in other words, many points are far from the line
of best fit.

Note: The coefficient of determination is always positive, even when the
correlation is negative.
Calculating the Coefficient of Determination
You can choose between two formulas to calculate the coefficient of
determination (R²) of a simple linear regression. The first is specific to
simple linear regression: R² is simply the square of the correlation
coefficient,
R² = r².
Alternatively, a formula that can be used for many types of statistical models
compares the residual sum of squares (RSS) with the total sum of squares (TSS):
R² = 1 − RSS/TSS = 1 − Σ(y − ŷ)² / Σ(y − ȳ)².
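The two approaches can be checked against each other on illustrative data: (i) square the correlation coefficient, and (ii) compute 1 − RSS/TSS from the fitted line. For simple linear regression they agree:

```python
# Sketch computing R² two ways for a simple linear regression and checking
# that they agree. Data are illustrative.
from statistics import mean

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(x), mean(y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)

r_squared = sxy ** 2 / (sxx * syy)   # method (i): r²

b = sxy / sxx                        # fitted slope
a = y_bar - b * x_bar                # fitted intercept
rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))  # residual SS
r_squared_alt = 1 - rss / syy        # method (ii): 1 - RSS/TSS

assert abs(r_squared - r_squared_alt) < 1e-9
```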
Interpretation of Coefficient of Determination
• R² is the proportion of variance that is shared between the independent and
dependent variables.
• The coefficient of determination (R²) is interpreted as the proportion of
variance in the dependent variable that is predicted by the statistical model.
• R² is the proportion of variance "explained" or "accounted for" by the model.
The proportion that remains (1 − R²) is the variance that is not predicted by
the model.
Correlation vs. Regression

1. Correlation means the relationship between two or more variables which vary
in sympathy, so that movements in one tend to be accompanied by corresponding
movements in the other.
   Regression is a mathematical measure expressing the average relationship
between the two variables.
2. Correlation analysis attempts to determine the "degree and direction of
relationship" between variables.
   Regression analysis attempts to determine the "nature and extent of
relationship" between variables, i.e. the functional relationship between them.
3. Correlation need not imply a cause-and-effect relationship between the
variables under study.
   Regression analysis indicates a cause-and-effect relationship between the
variables.
4. There may be nonsense or spurious correlation between two variables, which is
due to pure chance and has no practical relevance.
   There is no such thing as nonsense regression.
5. The correlation coefficient is symmetric in the two variables:
r(X, Y) = r(Y, X).
   Regression coefficients are not symmetric in X and Y: b_yx is not equal to
b_xy.
6. Correlation cannot be used for forecasting/prediction purposes.
   Regression is a forecasting device: it can be used to predict the value of
the dependent variable from a given value of the independent variable.
7. Correlation analysis is confined to the study of linear relationships between
the variables and therefore has limited applications.
   Regression analysis has much wider applications, as it studies linear as well
as non-linear relationships between the variables.
8. The correlation coefficient is independent of change of origin and scale.
   Regression coefficients are independent of change of origin but not of scale.

Similarities between correlation and regression
In addition to the differences, there are some similarities between correlation
and regression:
• Both work to quantify the direction and strength of the relationship between
two numeric variables.
• Any time the correlation is negative, the regression coefficients will also be
negative; any time the correlation is positive, the regression coefficients will
be positive.
• The correlation coefficient always lies between −1 and +1. Regression
coefficients are not bounded in this way, but their product b_yx · b_xy = r² can
never exceed 1.
Uses of Regression Analysis
The regression analysis as a statistical tool has a number of uses, or utilities for which it
is widely used:

• It provides a functional relationship between two or more related variables with the
help of which we can easily estimate or predict the unknown values of one variable
from the known values of another variable.

• It provides a measure of the errors of estimates made through the regression
line. Little scatter of the observed (actual) values around the relevant
regression line indicates good estimates, and vice versa.

• It provides a measure of the coefficient of correlation, obtained by taking
the square root of the product of the two regression coefficients:
r = ±√(b_xy · b_yx).

• It provides a measure of the coefficient of determination, computed by taking
the product of the two regression coefficients: R² = b_xy · b_yx.
• It provides a formidable tool of statistical analysis in the field of business and
commerce where people are interested in predicting the future events viz.:
consumption, production, investment, prices, sales, profits, etc. and success of
businessmen depends very much on the degree of accuracy in their various estimates.

• It provides a valuable tool for measuring and estimating the cause and effect
relationship among the economic variables that constitute the essence of economic
theory and economic life. It is highly used in the estimation of Demand curves, Supply
curves, Production functions, Cost functions, Consumption functions etc.

• This technique is highly used in our day-to-day life and sociological studies as well to
estimate the various factors viz. birth rate, death rate, tax rate, yield rate, etc.

• Last but not least, the regression analysis technique gives us an idea about
the relative variation of a series.
Pitfalls of Correlation & Regression Analysis

• It involves a very lengthy and complicated procedure of calculations and
analysis.
• It cannot be used in the case of qualitative phenomena, viz. honesty, crime,
etc.
• Regression relationships can change over time, as do correlations. This is
called parameter instability.
• Regression analysis is difficult to apply when identifying the independent
variable and the dependent variable is challenging.
• Limited generalizability: Results may not apply beyond the data. The
functional relationship that is established between any two or more
variables on the basis of some limited data may not hold good if more
and more data are taken into consideration.
• Sensitivity to outliers: Correlation and regression can be affected by
extreme values.
• Correlation and regression may not handle non-normal data.
