Unit II

Unit II covers correlation and linear regression, explaining scatter diagrams and the correlation coefficient, which measures the relationship between two variables. It also discusses regression lines, the principle of least squares, and the angle between regression lines, highlighting the differences between dependent and independent variables. Practice questions are provided to apply the concepts of regression coefficients and correlation coefficients.

Uploaded by

ggaba3855

Unit II: Correlation and Linear Regression

A scatter diagram is a graph of the observed data in which each point
represents a pair of values (X, Y) as its coordinates. It portrays the relationship
between these two variables graphically. It is a dot representation of
bivariate data.
Correlation is a statistical tool that helps to measure and analyze the degree
of relationship between two variables.
Karl Pearson’s coefficient of correlation
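
The formula for this heading did not survive in the text. As a minimal sketch, assuming the usual product-moment definition r = Cov(X, Y) / (σ_X σ_Y), the coefficient could be computed as below. The function name `pearson_r` and the sample values are illustrative assumptions, not taken from the notes.

```python
from math import sqrt

def pearson_r(x, y):
    """Karl Pearson's r = Cov(X, Y) / (sigma_X * sigma_Y) for paired data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least two values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squared deviations and of cross-deviations about the means
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when either variable is constant")
    # The common factor 1/n cancels between numerator and denominator
    return sxy / sqrt(sxx * syy)

# Illustrative (made-up) data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 4))
```

Working with sums of deviations rather than covariances avoids having to choose between dividing by n and by n − 1, since the choice cancels in the ratio.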

Properties of Correlation coefficient


• Two independent variables are uncorrelated, but the converse is not true.
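• Correlation coefficient is independent of change of origin and, up to sign, of change of scale. For constants a, b, c, d with a ≠ 0 and c ≠ 0:
$$r(aX + b,\, cY + d) = \frac{ac}{|ac|}\, r(X, Y) = \begin{cases} r(X, Y), & \text{if } a, c \text{ are of the same sign} \\ -r(X, Y), & \text{if } a, c \text{ are of opposite signs} \end{cases}$$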

• Correlation coefficient lies between −1 and +1, i.e. −1 ≤ r ≤ 1.


• For perfect positive correlation, 𝑟=1 and for perfect negative correlation 𝑟=−1.
• Coefficient of determination: $r^2 = \frac{\text{Explained variation}}{\text{Total variation}}$
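
The properties listed above can be checked numerically. The following is a small sketch on made-up data; the helper names `deviations` and `pearson_r` are assumptions introduced for illustration. It checks the change of origin and scale rule, shows a dependent but uncorrelated pair (Y = X² on values symmetric about zero), and compares r² with explained variation over total variation for the least-squares line of Y on X.

```python
from math import sqrt

def deviations(x, y):
    """Return (mean_x, mean_y, Sxx, Syy, Sxy) for paired data."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return mean_x, mean_y, sxx, syy, sxy

def pearson_r(x, y):
    _, _, sxx, syy, sxy = deviations(x, y)
    return sxy / sqrt(sxx * syy)

# Illustrative (made-up) data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)

# Change of origin and scale: a and c of the same sign, then of opposite signs
r_same = pearson_r([2 * xi + 3 for xi in x], [5 * yi - 1 for yi in y])
r_opposite = pearson_r([-2 * xi + 3 for xi in x], [5 * yi - 1 for yi in y])

# Dependent but uncorrelated: v is a function of u, yet r(u, v) = 0
u = [-2, -1, 0, 1, 2]
v = [ui ** 2 for ui in u]
r_uv = pearson_r(u, v)

# Coefficient of determination as explained / total variation
mean_x, mean_y, sxx, syy, sxy = deviations(x, y)
slope = sxy / sxx
intercept = mean_y - slope * mean_x
fitted = [intercept + slope * xi for xi in x]
explained = sum((f - mean_y) ** 2 for f in fitted)
total = syy
ratio = explained / total
```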

Q. Calculate the correlation coefficient.


Rank Correlation Coefficient

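The rank correlation coefficient is $\rho = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$, where d = difference in ranks and n = sample size.

When ranks are repeated, let m = multiplicity of a repeated rank. For each group of m tied ranks, $\frac{m(m^2 - 1)}{12}$ is known as the correction factor and is added to $\sum d^2$.

Total correction factor = $t_X + t_Y$, where $t_X$ and $t_Y$ represent the correction factors of X and Y respectively.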
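
As a sketch of how the correction factor enters the calculation, the code below assumes the common textbook form ρ = 1 − 6(Σd² + t_X + t_Y) / (n(n² − 1)), with tied values given the average of the ranks they occupy. The function names and the marks data are illustrative assumptions, not taken from the notes.

```python
from collections import Counter

def average_ranks(values):
    """Rank from smallest (rank 1) upward; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the last position holding the same value as position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def correction_factor(values):
    """Sum of m(m^2 - 1)/12 over every group of m tied values."""
    return sum(m * (m * m - 1) / 12 for m in Counter(values).values() if m > 1)

def rank_correlation(x, y):
    n = len(x)
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    total_cf = correction_factor(x) + correction_factor(y)  # t_X + t_Y
    return 1 - 6 * (sum_d2 + total_cf) / (n * (n * n - 1))

# Illustrative (made-up) marks with repeated values in both series
x = [68, 64, 75, 50, 64, 80, 75, 40, 55, 64]
y = [62, 58, 68, 45, 81, 60, 68, 48, 50, 70]
rho = rank_correlation(x, y)
print(round(rho, 4))
```

Here 64 occurs three times and 75 twice in X, and 68 twice in Y, so t_X = 2 + 0.5 and t_Y = 0.5.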
Practice Questions
Lines of Regression
Regression is a mathematical measure of the average
relationship between two or more variables in terms of the
original units of the data.
• The variable whose value is influenced or is to be predicted
is called the dependent variable (regressed or explained
variable), and the variable that influences the values or is
used for prediction is called the independent variable
(regressor or explanatory variable).
• The curve around which the points of the scatter diagram cluster is
called the curve of regression.
• If the curve is a straight line, it is called the line of regression
(linear regression); otherwise the regression is curvilinear.
• The line of regression is the line which gives the best
estimate of the value of one variable for any specific value
of the other variable (line of best fit).
Principle of Least Squares
It consists in minimizing the sum of the squares of the deviations
of the actual values of the dependent variable from their estimated
values as given by the line of best fit.
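
A minimal sketch of the principle for the regression of Y on X: assuming the standard closed-form solution (slope = Sxy / Sxx, with the line passing through the point of means), the fitted line should give a smaller sum of squared deviations than any nearby line. The function names and data are illustrative assumptions.

```python
def fit_line(x, y):
    """Least-squares line y = a + b*x for the regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx             # regression coefficient of y on x
    a = mean_y - b * mean_x   # the line passes through (mean_x, mean_y)
    return a, b

def sum_squared_deviations(x, y, a, b):
    """Sum of squared vertical deviations of the data from y = a + b*x."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

# Illustrative (made-up) data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a, b = fit_line(x, y)

sse_best = sum_squared_deviations(x, y, a, b)
# Shifting or tilting the fitted line can only increase the sum
sse_shifted = sum_squared_deviations(x, y, a + 0.1, b)
sse_tilted = sum_squared_deviations(x, y, a, b + 0.1)
```

For the regression of X on Y the roles are swapped: horizontal deviations are minimized and the slope is Sxy / Syy.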
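Angle between the two regression lines:

$$\tan\theta = \left| \frac{b_{yx} - \frac{1}{b_{xy}}}{1 + b_{yx} \cdot \frac{1}{b_{xy}}} \right|$$

where $\theta$ is the acute angle; the obtuse angle is $\theta_1 = \pi - \theta$. Use this formula if data is given.

$$\tan\theta = \left| \frac{m_1 - m_2}{1 + m_1 m_2} \right|$$

Use this formula if the regression lines are given; $m_1$ and $m_2$ are the slopes of the two lines.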

Special cases of 𝜽
For uncorrelated variables, regression lines are perpendicular

For perfectly correlated variables, regression lines coincide.
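
The angle formula and both special cases can be sketched in code. To avoid dividing by zero when the variables are uncorrelated, the version below multiplies the numerator and denominator of the first formula by b_xy, giving tan θ = |b_yx·b_xy − 1| / |b_xy + b_yx|, and uses `atan2`. This rearrangement and the function names are assumptions made for the example.

```python
from math import atan2, pi, tan

def angle_between_regression_lines(b_yx, b_xy):
    """Acute angle in radians, from the two regression coefficients."""
    # Same ratio as |(b_yx - 1/b_xy) / (1 + b_yx/b_xy)|, scaled by b_xy
    return atan2(abs(b_yx * b_xy - 1), abs(b_xy + b_yx))

def angle_between_slopes(m1, m2):
    """Acute angle in radians between two lines with slopes m1 and m2."""
    return atan2(abs(m1 - m2), abs(1 + m1 * m2))

# Illustrative coefficients: b_yx = 0.6 and b_xy = 1.0, so r^2 = 0.6
theta = angle_between_regression_lines(0.6, 1.0)
tan_theta = tan(theta)

# In the (x, y) plane the line of X on Y has slope 1 / b_xy
theta_from_slopes = angle_between_slopes(0.6, 1 / 1.0)

# Special cases
perpendicular = angle_between_regression_lines(0.0, 0.0)  # r = 0
coincident = angle_between_regression_lines(0.5, 2.0)     # b_yx * b_xy = r^2 = 1
```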

• Ratio of the variances of X and Y:

$$\frac{\sigma_X^2}{\sigma_Y^2} = \frac{b_{XY}}{b_{YX}}$$
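
This ratio follows from the standard expressions for the regression coefficients in terms of r, σ_X and σ_Y. Those expressions are assumed here, since the notes do not state them at this point. Dividing one coefficient by the other cancels r, while multiplying them gives r²:

```latex
b_{XY} = r\,\frac{\sigma_X}{\sigma_Y}, \qquad
b_{YX} = r\,\frac{\sigma_Y}{\sigma_X}
\quad\Longrightarrow\quad
\frac{b_{XY}}{b_{YX}} = \frac{\sigma_X^2}{\sigma_Y^2} \quad (r \neq 0), \qquad
b_{XY}\, b_{YX} = r^2 .
```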

Q. Obtain both regression lines.

Q. Use the regression line to estimate the height of the son (𝑌) when the father's height (𝑋)
is 70.
Q. For the data

X   10   15   22   31   42
Y    7    7    7    7    7

find:

(i) Regression coefficient of Y on X.
(ii) Correlation coefficient between X and Y.
(iii) Regression line of Y on X.
(iv) Angle between the regression lines.
(v) Regression coefficient of X on Y.
(vi) Regression line of X on Y.
(vii) Intersection point of the two regression lines.
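
One way to check the arithmetic for this data is sketched below. It is not a model answer: the variable names are assumptions, and `None` is used here simply to mark a quantity whose defining ratio has a zero denominator. Because every Y value is 7, the variation of Y is zero, so any quantity that divides by it cannot be computed and needs a separate argument.

```python
# Data from the question above
x = [10, 15, 22, 31, 42]
y = [7, 7, 7, 7, 7]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Sums of squared deviations and of cross-deviations about the means
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

# Regression coefficient of Y on X divides by Sxx, which is non-zero here
b_yx = sxy / sxx

# r and the coefficient of X on Y both divide by a quantity involving Syy = 0
r = sxy / (sxx * syy) ** 0.5 if syy > 0 else None
b_xy = sxy / syy if syy > 0 else None

print(mean_x, mean_y, sxx, syy, sxy, b_yx, r, b_xy)
```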

Q.

Answer questions (i) to (vi), and give a thought on part (vii)
