Unit III Part B
Why Is It Called Regression?
• In its literal sense, the term regression means ‘moving backward’.
• Although there is some debate about the origins of the name, this
statistical technique was most likely termed "regression" by
Sir Francis Galton in 1877 to describe the tendency of biological
data (such as the heights of people in a population) to regress
to some mean level. In other words, while there are shorter and
taller people, only outliers are very tall or very short, and
most people cluster somewhere around (or "regress" to) the average.
What Is the Purpose of Regression?
• In statistical analysis, regression is used to identify the associations
between variables occurring in some data.
• It can show both the magnitude of such an association and its
statistical significance (i.e., whether or not the association
is likely due to chance).
• Regression is a powerful tool for statistical inference and has also
been used to try to predict future outcomes based on past
observations.
Regression Analysis
• Definition: The statistical technique that expresses the relationship
between two or more variables in the form of an equation to
estimate the value of a variable, based on the given value of
another variable, is called regression analysis.
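Normal equations
• The regression equation of Y on X, $Y = a + bX$, has the normal equations
  $\sum y = na + b \sum x$
  $\sum xy = a \sum x + b \sum x^2$
• When deviations are taken from the mean, the model is
  $Y - \bar{Y} = b_{yx} (X - \bar{X})$
• The regression equation of X on Y, $X = a + bY$, has the normal equations
  $\sum x = na + b \sum y$
  $\sum xy = a \sum y + b \sum y^2$
• When deviations are taken from the mean, the model is
  $X - \bar{X} = b_{xy} (Y - \bar{Y})$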
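As a rough illustration of how the normal equations for Y on X can be solved for a and b, here is a minimal Python sketch. The function name `fit_y_on_x` and the five data points are hypothetical and chosen only to keep the arithmetic easy to follow; they do not come from these slides.

```python
def fit_y_on_x(xs, ys):
    """Solve the two normal equations of Y on X for a (intercept) and b (slope)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Normal equations:
    #   sum_y  = n*a     + b*sum_x
    #   sum_xy = a*sum_x + b*sum_x2
    # Eliminating a gives b; substituting b back into the first gives a.
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return a, b


# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = fit_y_on_x(xs, ys)
print(f"Y = {a:.2f} + {b:.2f} X")
```

Swapping the two arguments, `fit_y_on_x(ys, xs)`, solves the normal equations of X on Y in the same way.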
Regression Equations
• The regression equation of Y on X is $Y = a + bX$, where
  X = independent variable (regressor or predictor),
  Y = dependent variable (response variable),
  a = intercept term,
  b = slope or regression coefficient.
• The regression equation of X on Y is $X = a + bY$, where
  Y = independent variable (regressor or predictor),
  X = dependent variable (response variable),
  a = intercept term,
  b = slope or regression coefficient.
Regression coefficient
• The regression coefficient of Y on X is represented by $b_{yx}$ and is
  mathematically expressed as
  $b_{yx} = r \frac{\sigma_y}{\sigma_x}$
• The regression coefficient of X on Y is represented by $b_{xy}$ and is
  mathematically expressed as
  $b_{xy} = r \frac{\sigma_x}{\sigma_y}$
• The range of each regression coefficient is $-\infty$ to $+\infty$.
• When the deviations $x_i$ and $y_i$ are taken from the means,
  $b_{yx} = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2}$
  $b_{xy} = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} y_i^2}$
• Where,
• $r$ = Correlation coefficient
• $\sigma_y$ = Standard deviation of Y
• $\sigma_x$ = Standard deviation of X
• $x = X - \bar{X}$
• $y = Y - \bar{Y}$
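To see how the two ways of writing the regression coefficients relate, the following Python sketch computes $b_{yx}$ and $b_{xy}$ from deviation sums and compares them with $r \sigma_y / \sigma_x$ and $r \sigma_x / \sigma_y$. The helper name `regression_coefficients` and the data are assumptions made for illustration, and population standard deviations (dividing by n) are assumed.

```python
from math import sqrt


def regression_coefficients(xs, ys):
    """Return b_yx, b_xy, r, sigma_x and sigma_y computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]  # x = X - mean(X)
    dy = [y - mean_y for y in ys]  # y = Y - mean(Y)
    sxy = sum(p * q for p, q in zip(dx, dy))
    sxx = sum(p * p for p in dx)
    syy = sum(q * q for q in dy)
    b_yx = sxy / sxx  # sum(xy) / sum(x^2)
    b_xy = sxy / syy  # sum(xy) / sum(y^2)
    r = sxy / sqrt(sxx * syy)
    sigma_x = sqrt(sxx / n)  # population standard deviation (assumption)
    sigma_y = sqrt(syy / n)
    return b_yx, b_xy, r, sigma_x, sigma_y


# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b_yx, b_xy, r, sigma_x, sigma_y = regression_coefficients(xs, ys)

# The same coefficients written in terms of r and the standard deviations.
b_yx_from_r = r * sigma_y / sigma_x
b_xy_from_r = r * sigma_x / sigma_y
print(b_yx, b_yx_from_r)
print(b_xy, b_xy_from_r)
```

Both forms should agree up to floating-point rounding, because the factor n in the standard deviations cancels.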
Properties
• The correlation coefficient is the geometric mean of the regression
  coefficients, i.e. $r = \pm\sqrt{b_{xy} \, b_{yx}}$.
• If one of the regression coefficients is greater than unity, then
  the other is less than unity.
• The arithmetic mean of the regression coefficients is greater than
  or equal to the correlation coefficient (in absolute value).
• Regression coefficients are independent of change of origin but
  not of scale.
• The range of each regression coefficient is $-\infty$ to $+\infty$.
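These properties can be checked numerically. The sketch below is one possible way to do so in Python; the helper `coefficients` and the sample values are hypothetical, and the checks only demonstrate the properties on this one data set rather than prove them.

```python
from math import isclose, sqrt


def coefficients(xs, ys):
    """Return (b_yx, b_xy, r) from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sxx, sxy / syy, sxy / sqrt(sxx * syy)


# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b_yx, b_xy, r = coefficients(xs, ys)

# r is the geometric mean of the two coefficients: r^2 = b_yx * b_xy.
geometric_mean_ok = isclose(r * r, b_yx * b_xy)

# The product is r^2 <= 1, so both coefficients cannot exceed unity together.
product_at_most_one = b_yx * b_xy <= 1 + 1e-12

# The arithmetic mean is at least |r|.
arithmetic_mean_ok = abs((b_yx + b_xy) / 2) >= abs(r)

# Change of origin: adding constants to X and Y leaves the coefficients unchanged.
shifted = coefficients([x + 100 for x in xs], [y - 7 for y in ys])
origin_invariant = isclose(shifted[0], b_yx) and isclose(shifted[1], b_xy)

# Change of scale: multiplying X by 10 divides b_yx by 10 and multiplies b_xy by 10.
scaled = coefficients([10 * x for x in xs], ys)
scale_dependent = isclose(scaled[0], b_yx / 10) and isclose(scaled[1], b_xy * 10)

print(geometric_mean_ok, product_at_most_one, arithmetic_mean_ok,
      origin_invariant, scale_dependent)
```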
[Figure: scatter plot of Spice Sales (y) against Shelf Space (x). The two fields appear highly correlated.]
Calculate the correlation and find the two lines of regression from the above data.
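The data values behind the plot are not reproduced in this text, so the exercise cannot be worked here with the real numbers. As a sketch of how the calculation could be organised, the Python below computes the correlation and both regression lines for placeholder values. The function `two_regression_lines` and the shelf-space and sales figures are hypothetical and are not taken from the figure.

```python
from math import sqrt


def two_regression_lines(xs, ys):
    """Return r and the (intercept, slope) pairs for Y on X and X on Y."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    r = sxy / sqrt(sxx * syy)
    b_yx = sxy / sxx
    b_xy = sxy / syy
    # Y - mean_y = b_yx (X - mean_x)  ->  Y = (mean_y - b_yx*mean_x) + b_yx X
    y_on_x = (mean_y - b_yx * mean_x, b_yx)
    # X - mean_x = b_xy (Y - mean_y)  ->  X = (mean_x - b_xy*mean_y) + b_xy Y
    x_on_y = (mean_x - b_xy * mean_y, b_xy)
    return r, y_on_x, x_on_y


# Placeholder values only; NOT the data from the figure.
shelf_space = [200, 250, 300, 350, 400]
spice_sales = [20, 40, 50, 40, 50]

r, (a_yx, b_yx), (a_xy, b_xy) = two_regression_lines(shelf_space, spice_sales)
print(f"r = {r:.4f}")
print(f"Y on X: y = {a_yx:.2f} + {b_yx:.2f} x")
print(f"X on Y: x = {a_xy:.2f} + {b_xy:.2f} y")
```

Replacing the two placeholder lists with the actual observations would give the answer to the exercise. Both lines pass through the point of means, which is a quick sanity check on the result.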