Unit II Notes: Correlation and Regression
JSPM’s
RAJARSHI SHAHU COLLEGE OF ENGINEERING, PUNE
(Autonomous Institute Affiliated to Savitribai Phule Pune University, Pune)
FIRST YEAR
BACHELOR OF TECHNOLOGY
(A.Y. 2021-2022, SEMESTER - II)
COMPUTER/IT /CSBS ENGINEERING
SUBJECT: STATISTICAL METHODS
2.1 CORRELATION
We have already discussed distributions involving one variable, or univariate distributions. In many problems of a practical nature, we are required to deal with two or more variables. Distributions involving two variables are called bivariate distributions. In such distributions, we are often interested in knowing whether there exists some kind (or degree) of relationship between the two variables. In statistics, this means asking whether there is correlation, or covariance, between the two variables. If a change in one variable is accompanied by a change in the other variable, the variables are said to be correlated and the relationship is called correlation. Correlation measures the intensity or degree of relationship between two variables.
For example, change in rainfall will affect the crop output and thus the variables 'Rainfall recorded'
and 'Crop output' are correlated. If the increase (or decrease) in one variable causes corresponding
increase (or decrease) in the other, the correlation is said to be positive or direct. On the other hand, if
increase in the value of one variable shows a corresponding decrease in the value of the other or vice
versa, the correlation is called negative or inverse. For example, if the income of a worker increases, his expenditure also naturally increases; hence the correlation between income and expenditure is positive or direct. If we consider the price and demand of a certain commodity, experience tells us that as the price of a commodity rises, its demand falls, and thus the correlation between these variables is negative or inverse. Correlation can also be classified as linear and non-linear. When the amount of change in one variable bears a constant ratio to the amount of change in the other variable, the correlation is linear; otherwise it is non-linear.
There are different methods to determine correlation between two variables. The graphical method called the 'scatter diagram' gives a rough idea about the correlation without giving any specific numerical value, while 'Karl Pearson's coefficient of correlation' gives a numerical measure of the intensity (or degree) of linear relationship between two variables and is widely used.
Remark: Besides the scatter diagram and Karl Pearson's coefficient of correlation, there is another measure called Spearman's coefficient of correlation, which measures the linear association between ranks assigned to individual items according to their attributes. It is of less consequence for our purposes.
We shall now discuss 'Karl Pearson's Coefficient of Correlation' which is widely used in practice.
2.2 KARL PEARSON'S COEFFICIENT OF CORRELATION:
To measure the intensity or degree of linear relationship between two variables, Karl Pearson
developed a formula called correlation coefficient.
The correlation coefficient between two variables x and y, denoted by r(x, y), is defined as

r(x, y) = cov(x, y) / (σ_x σ_y)                                  ... (1)

In a bivariate distribution, if (x_i, y_i) take the values (x_1, y_1), (x_2, y_2), (x_3, y_3), ..., (x_n, y_n), we define

cov(x, y) = (1/n) Σ (x_i − x̄)(y_i − ȳ)                           ... (2)

where x̄, ȳ are the arithmetic means of the x and y series respectively.
Also, the standard deviations for the x and y series are:

σ_x = √[(1/n) Σ (x_i − x̄)²]  and  σ_y = √[(1/n) Σ (y_i − ȳ)²]    ... (3)
The correlation coefficient r(x, y) = cov(x, y)/(σ_x σ_y) can be calculated using the following simplifications:

cov(x, y) = (1/n) Σ (x_i − x̄)(y_i − ȳ)

On simplifying we get

cov(x, y) = (1/n) Σ x_i y_i − x̄ ȳ                                ... (4)

σ_x² = (1/n) Σ x_i² − x̄²                                         ... (5)

Similarly,

σ_y² = (1/n) Σ y_i² − ȳ²                                         ... (6)
For simplification of calculation, we put

u_i = x_i − A and v_i = y_i − B,  or  u_i = (x_i − A)/h and v_i = (y_i − B)/k,  then

cov(u, v) = (1/n) Σ u_i v_i − ū v̄                                ... (7)

σ_u² = (1/n) Σ u_i² − ū²  and  σ_v² = (1/n) Σ v_i² − v̄²          ... (8)

and r(u, v) is given by r(u, v) = cov(u, v)/(σ_u σ_v). It can be established that r(x, y) = r(u, v).
We note here that the calculation of r(u, v) is simpler as compared to that of r(x, y).
Using results (4), (5) and (6) in (1), we can write the formula as

r(x, y) = cov(x, y)/(σ_x σ_y)
        = [(1/n) Σ x_i y_i − x̄ ȳ] / [√((1/n) Σ x_i² − x̄²) · √((1/n) Σ y_i² − ȳ²)]
        = (Σ x_i y_i − n x̄ ȳ) / [√(Σ x_i² − n x̄²) · √(Σ y_i² − n ȳ²)]

Also, using results (7) and (8) in (1), we can write the formula as

r(u, v) = cov(u, v)/(σ_u σ_v)
        = (Σ u_i v_i − n ū v̄) / [√(Σ u_i² − n ū²) · √(Σ v_i² − n v̄²)]
Property: It can be shown that the coefficient of correlation always satisfies the condition −1 ≤ r ≤ 1.
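The simplified formula above is easy to check numerically. The following Python sketch is an illustration only (the function name `pearson_r` is our own, not part of the notes); it implements r = (Σxᵢyᵢ − n x̄ȳ) / √((Σxᵢ² − n x̄²)(Σyᵢ² − n ȳ²)):

```python
import math

def pearson_r(xs, ys):
    """Karl Pearson's coefficient of correlation via the simplified formula
    r = (sum(x*y) - n*xbar*ybar) / sqrt((sum(x^2) - n*xbar^2)*(sum(y^2) - n*ybar^2))."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    num = sum(x * y for x, y in zip(xs, ys)) - n * xbar * ybar
    den = math.sqrt((sum(x * x for x in xs) - n * xbar ** 2) *
                    (sum(y * y for y in ys) - n * ybar ** 2))
    return num / den
```

Running it on the import/export data of Ex. 1 below reproduces r ≈ 0.9458.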
Ex. 1: Following are the values of import of raw material and export of finished product
Export 10 11 14 14 20 22 16 12 15 13
Import 12 14 15 16 21 26 21 15 16 14
Calculate the coefficient of correlation between the import and export values.
Solution: For n = 10, the data is tabulated as:
x      y      x²     y²     xy
10     12     100    144    120
11     14     121    196    154
14     15     196    225    210
14     16     196    256    224
20     21     400    441    420
22     26     484    676    572
16     21     256    441    336
12     15     144    225    180
15     16     225    256    240
13     14     169    196    182
Total: Σx = 147, Σy = 170, Σx² = 2291, Σy² = 3056, Σxy = 2638
Here x̄ = 147/10 = 14.7 and ȳ = 170/10 = 17.

r(x, y) = cov(x, y)/(σ_x σ_y)
        = (Σ x_i y_i − n x̄ ȳ) / [√(Σ x_i² − n x̄²) · √(Σ y_i² − n ȳ²)]
        = (2638 − 10 × 14.7 × 17) / [√(2291 − 10 × 14.7²) · √(3056 − 10 × 17²)]
        = 139 / (√130.1 × √166)
        = 0.9458
Sol.: Here, x̄ = Σx_i/n = 40/10 = 4, so x̄² = 16, and ȳ = Σy_i/n = 40/10 = 4, so ȳ² = 16.
Ex. 3: Given: r = 0.9, ΣXY = 70, σ_y = 3.5, ΣX² = 100. Find the number of items n, if X and Y are deviations from the arithmetic means.

Sol.: Since X = x − x̄, we have σ_x² = (1/n) Σ(x − x̄)² = (1/n) ΣX² = 100/n

r(x, y) = cov(x, y)/(σ_x σ_y) = [(1/n) Σ(x − x̄)(y − ȳ)] / (σ_x σ_y) = (ΣXY)/(n σ_x σ_y)

Squaring, we get

r² = (ΣXY)² / (n² σ_x² σ_y²)

0.9² = 70² / [n² × (100/n) × 3.5²], i.e. 0.81 = 4900 / (1225 n)

n = 4900 / (0.81 × 1225) = 4900 / 992.25 ≈ 5
2.3 t-TEST FOR A CORRELATION COEFFICIENT:
The most frequently used test to examine whether two variables X and Y are correlated is the t-test. To apply this test, we first set up the two hypotheses

H₀: ρ = 0 (the variables are uncorrelated) and H₁: ρ ≠ 0

and compute the test statistic

t = r √(n − 2) / √(1 − r²)

which, under H₀, follows Student's t-distribution with n − 2 degrees of freedom.
Soln.: In order to examine whether the relationship between the two variables is statistically significant, we apply the t-test.
The two hypotheses are H₀: ρ = 0 and H₁: ρ ≠ 0. As the calculated value of t is less than the critical value of t at the given degrees of freedom and level of significance, the null hypothesis is accepted. The conclusion is that the relationship between the two variables is not statistically significant.
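The test statistic can be sketched in Python (an illustration only; `corr_t_stat` is our own name). It evaluates t = r√(n−2)/√(1−r²), to be compared with the tabulated critical value at n − 2 degrees of freedom:

```python
import math

def corr_t_stat(r, n):
    """t statistic for testing H0: rho = 0 against H1: rho != 0.
    Under H0, t follows Student's t-distribution with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)
```

For example, with r = 0.9458 and n = 10 (Ex. 1 of Section 2.2), t ≈ 8.24, far above the usual critical values, so that correlation would be judged significant.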
2.4 ASSUMPTIONS OF THE KARL PEARSONIAN CORRELATION:
1. The two variables x and y are linearly related. This implies that when the individual pairs are plotted on a graph, the resulting scatter diagram is such that if the points are joined together, approximately a straight line will be formed.
2. The two variables are affected by several independent causes, so as to form a normal distribution. For example, relationships between price and demand, price and supply, advertising expenditure and sales, length of experience and earnings, and so on, are affected by several factors such that each series results in a normal distribution.
EXERCISE: 2.1
Ex.1 From a group of 10 students, marks obtained by each in two papers x and y are given below
x 23 28 42 17 26 35 29 37 16 46
y 25 22 38 21 27 39 24 32 18 44
Calculate coefficient of correlation between x and y .
Ex.2 Obtain the correlation coefficient between population density (per square mile) and death rate
(per thousand persons) from the following data:
Population density 200 500 400 700 300
Death rate 12 18 16 21 10
2.5 REGRESSION
After having established that the two variables are correlated, we are generally interested in
estimating the value of one variable for a given value of the other variable. For example, if we know
that rainfall affects the crop output then it is possible to predict the crop output at the end of a rainy
season. If the variables in a bivariate distribution are related, the points in scatter diagram cluster
round some curve called the curve of regression or the regression curve. If the curve is a straight line,
it is called the line of regression and in such case the regression between two variables is linear. The
line of regression gives best estimate for the value of one variable for some specified value of the
other variable.
The simplest regression curve is a straight line. Suppose the line of regression is

y = ax + b                                                       ... (1)

If the point (x_i, y_i) is assumed to lie on (1), then the y-coordinate of the point can be calculated as

y'_i = a x_i + b

If the point actually lies on (1), then y_i = y'_i. Otherwise, y_i − y'_i represents the deviation of the observed value y_i from the value y'_i calculated using formula (1).
In the method of least squares, we take the sum of the squares of these deviations and minimize this sum using the principles of maxima and minima. The values of a and b in (1) are calculated using this criterion, called the least-squares criterion. The fitted curve can be of any degree; using the least-squares criterion we can find its equation. We shall now discuss fitting a straight line and a second-degree parabola to a given set of points.
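The least-squares fit of a straight line can be sketched in Python (an illustration only; `fit_line` is our own name). It solves the two normal equations Σy = aΣx + nb and Σxy = aΣx² + bΣx for the slope a and intercept b:

```python
def fit_line(xs, ys):
    """Fit y = a*x + b by least squares, solving the normal equations
    sum(y) = a*sum(x) + n*b and sum(x*y) = a*sum(x^2) + b*sum(x)."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)  # slope
    b = (sy - a * sx) / n                          # intercept
    return a, b
```

For points lying exactly on a line, e.g. (0, 1), (1, 3), (2, 5), it returns a = 2 and b = 1.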
Consider the set of values (x_i, y_i), i = 1, 2, 3, ..., n. Let the line of regression of y on x be

y = mx + c

Using the method of least squares, the values of m and c are estimated from the normal equations, and the regression line of y on x reduces to

y − ȳ = r (σ_y/σ_x)(x − x̄),  i.e.  y − ȳ = b_yx (x − x̄)          ... (9)

where b_yx = r σ_y/σ_x is called the regression coefficient of y on x.
Similarly, the regression line of x on y is

x − x̄ = r (σ_x/σ_y)(y − ȳ),  i.e.  x − x̄ = b_xy (y − ȳ)          ... (10)

where b_xy = r σ_x/σ_y is called the regression coefficient of x on y.
For obtaining regression lines (9) and (10), we have to calculate r(x, y), x̄, ȳ, σ_x and σ_y.
If u_i = x_i − a and v_i = y_i − b, then x̄ = a + ū, ȳ = b + v̄, and

σ_x = σ_u,  σ_y = σ_v,  cov(u, v) = (1/n) Σ u_i v_i − ū v̄,
σ_u² = (1/n) Σ u_i² − ū²,  σ_v² = (1/n) Σ v_i² − v̄²

In particular, if u = (x − a)/h and v = (y − b)/k, then

r = r(x, y) = cov(x, y)/(σ_x σ_y) = cov(u, v)/(σ_u σ_v) = r(u, v)

where σ_x = h σ_u, σ_y = k σ_v, and x̄ = a + h ū, ȳ = b + k v̄.
9
Statistical Methods (Comp/IT)
Property 1: Since b_yx · b_xy = r (σ_y/σ_x) · r (σ_x/σ_y) = r², we have r = ±√(b_yx · b_xy).
Property 2: If θ is the acute angle between the two regression lines in the case of two variables x and y, then

tan θ = [(1 − r²)/r] · [σ_x σ_y / (σ_x² + σ_y²)]
Sol.: x̄ = Σx_i/n = 30/5 = 6 and ȳ = Σy_i/n = 40/5 = 8

σ_x² = Σx_i²/n − x̄² = 220/5 − 6² = 8

σ_y² = Σy_i²/n − ȳ² = 340/5 − 8² = 4

cov(x, y) = Σx_i y_i/n − x̄ ȳ = 214/5 − 6 × 8 = −5.2

b_yx = cov(x, y)/σ_x² = −5.2/8 = −0.65,  b_xy = cov(x, y)/σ_y² = −5.2/4 = −1.3

Regression line of y on x: y − ȳ = b_yx (x − x̄)
y − 8 = −0.65(x − 6)
y = −0.65x + 11.9

Regression line of x on y: x − x̄ = b_xy (y − ȳ)
x − 6 = −1.3(y − 8)
x = −1.3y + 16.4
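The regression coefficients in the worked example above can be computed directly from the raw sums. A Python sketch (an illustration only; `regression_coeffs` is our own name):

```python
def regression_coeffs(n, sx, sy, sxx, syy, sxy):
    """Return (b_yx, b_xy) from the raw sums, using the 1/n definitions:
    b_yx = cov(x, y)/var(x) and b_xy = cov(x, y)/var(y)."""
    xbar, ybar = sx / n, sy / n
    cov = sxy / n - xbar * ybar
    var_x = sxx / n - xbar ** 2
    var_y = syy / n - ybar ** 2
    return cov / var_x, cov / var_y
```

With n = 5, Σx = 30, Σy = 40, Σx² = 220, Σy² = 340, Σxy = 214 (the sums used above), it returns b_yx = −0.65 and b_xy = −1.3.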
Ex. 2: Obtain the regression lines for the following data, and estimate y for x = 14.5 and x for y = 29.5.
Sol.: We prepare the table, taking u = x − 26 and v = y − 26:
x      y      u = x − 26   v = y − 26   u²     v²     uv
10     12     −16          −14          256    196    224
14     16     −12          −10          144    100    120
19     18     −7           −8           49     64     56
26     26     0            0            0      0      0
30     29     4            3            16     9      12
34     35     8            9            64     81     72
39     38     13           12           169    144    156
Total:        Σu = −10     Σv = −8      Σu² = 698   Σv² = 594   Σuv = 640
Here n = 7, ū = −10/7 = −1.429, v̄ = −8/7 = −1.143

σ_u² = Σu_i²/n − ū² = 698/7 − (−1.429)² = 97.673, so σ_u = 9.883
σ_v² = Σv_i²/n − v̄² = 594/7 − (−1.143)² = 83.551, so σ_v = 9.14

cov(u, v) = Σu_i v_i/n − ū v̄ = 640/7 − (−1.429)(−1.143) = 89.795

r = r(x, y) = r(u, v) = cov(u, v)/(σ_u σ_v) = 89.795/(9.883 × 9.14) = 0.9941

b_yx = r σ_y/σ_x = r σ_v/σ_u = 0.9941 × 9.14/9.883 = 0.9194
b_xy = r σ_x/σ_y = r σ_u/σ_v = 0.9941 × 9.883/9.14 = 1.0749

x̄ = a + ū = 26 − 1.429 = 24.571
ȳ = b + v̄ = 26 − 1.143 = 24.857

Regression line of y on x: y − ȳ = b_yx (x − x̄)
y − 24.857 = 0.9194(x − 24.571), i.e. y = 0.9194x + 2.266
At x = 14.5: y = 0.9194 × 14.5 + 2.266 = 15.60

Regression line of x on y: x − x̄ = b_xy (y − ȳ)
x − 24.571 = 1.0749(y − 24.857), i.e. x = 1.0749y − 2.148
At y = 29.5: x = 1.0749 × 29.5 − 2.148 = 29.56
Ex. 3: The regression equations are 8x − 10y + 66 = 0 and 40x − 18y = 214. The variance of x is 9. Find: A) the mean values of x and y; B) the correlation coefficient; C) the standard deviation of y.

Sol.: A) Given regression equations are 8x − 10y + 66 = 0 and 40x − 18y = 214. Since both regression lines pass through the point (x̄, ȳ), solving the above equations we get

x̄ = 13 and ȳ = 17

B) Let 8x − 10y + 66 = 0 be the line of regression of y on x, and 40x − 18y = 214 be the line of regression of x on y. The equations can be written as

y = (8/10)x + 66/10  and  x = (18/40)y + 214/40

b_yx, the regression coefficient of y on x, = 0.8, and b_xy, the regression coefficient of x on y, = 0.45.

The coefficient of correlation between x and y is given by r² = b_xy · b_yx = 0.45 × 0.8 = 0.36, so r = ±0.6. Since both regression coefficients are positive, we take r = 0.6.

C) Variance of x = 9, i.e. σ_x² = 9, therefore σ_x = 3.
We have b_yx = r σ_y/σ_x, so 0.8 = 0.6 × σ_y/3, giving σ_y = 4.
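The steps of Ex. 3 can be replayed in a short Python sketch (an illustration only; the function name is our own): solve the two lines for their common point (x̄, ȳ), read off the regression coefficients, and recover r and σ_y:

```python
def solve_ex3():
    """Ex. 3: lines 8x - 10y + 66 = 0 (y on x) and 40x - 18y = 214 (x on y),
    with variance of x = 9. Returns (xbar, ybar, r, sigma_y)."""
    # Eliminate x: 5*(8x - 10y) - (40x - 18y) = 5*(-66) - 214  =>  -32y = -544
    ybar = 544 / 32                # 17
    xbar = (10 * ybar - 66) / 8    # 13
    b_yx = 8 / 10                  # from y = 0.8x + 6.6
    b_xy = 18 / 40                 # from x = 0.45y + 5.35
    r = (b_yx * b_xy) ** 0.5       # both coefficients positive, so r = +0.6
    sigma_y = b_yx * 3 / r         # from b_yx = r*sigma_y/sigma_x, sigma_x = 3
    return xbar, ybar, r, sigma_y
```

It reproduces x̄ = 13, ȳ = 17, r = 0.6 and σ_y = 4, matching the hand computation.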
EXERCISE 2.2
1. Obtain lines of regression for the following data.
x 40 44 28 30 44 38 31
y 32 39 26 30 38 34 28
2. Determine the equations of regression lines for the following data:
x 1 2 3 4 5 6 7 8 9
y 9 8 10 12 11 13 14 16 15
and obtain an estimate of y for x = 4.5
3. Obtain lines of regression for the following data.
x 6 2 10 4 8
y 9 11 5 8 7
and find x for y = 9.3
4. If the two lines of regression are 9x + y − λ = 0 and 4x + y = μ, and the means of x and y are 2 and −3 respectively, find the values of λ and μ, and the coefficient of correlation between x and y.
5. The regression equations are: 3x + 2y − 26 = 0 and 6x + y − 31 = 0. Then find:
(i) The mean values of x and y (ii) The correlation coefficient between x and y.
6. Fit a straight line of the form y = ax + b to the following data, using the least-squares method:
x 0 2 4 6 8 12 20
y 10 12 18 22 20 30 30
2.10 MULTIPLE CORRELATION
Introduction: Multiple correlation is based on three or more variables, without excluding the effect of any one of them. It is denoted by R, as against r, which is used to denote the simple bivariate correlation coefficient.
In the case of three variables X₁, X₂, X₃, the multiple correlation coefficients are R₁.₂₃, R₂.₁₃ and R₃.₁₂. The multiple correlation coefficient with X₁ as the dependent variable and X₂, X₃ as the independent variables is defined as

R₁.₂₃ = √[(r₁₂² + r₁₃² − 2 r₁₂ r₁₃ r₂₃) / (1 − r₂₃²)]

where r₁₂, r₁₃ and r₂₃ are the zero-order (simple) correlation coefficients.
As is the case with simple bivariate correlation, the coefficient of multiple correlation lies between 0 and 1. As R moves closer to 0, the relationship becomes more and more negligible; as it moves closer to 1, the relationship becomes stronger. If R is 1, the correlation is called perfect. It may be added that when R is 0, showing the absence of a linear relationship, it is still possible that there is a non-linear relationship among the variables. Another point to note is that the multiple coefficient of correlation is always positive. This is in contrast to the simple bivariate coefficient of correlation, which may vary from −1 to +1.
We can obtain the coefficient of multiple determination, R², by squaring the multiple coefficient of correlation.
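The defining formula for R₁.₂₃ can be evaluated with a small Python function (an illustration only; `multiple_R` is our own name):

```python
import math

def multiple_R(r12, r13, r23):
    """Multiple correlation of X1 on X2 and X3 from zero-order coefficients:
    R_{1.23} = sqrt((r12^2 + r13^2 - 2*r12*r13*r23) / (1 - r23^2))."""
    return math.sqrt((r12 ** 2 + r13 ** 2 - 2 * r12 * r13 * r23) / (1 - r23 ** 2))
```

For instance, with hypothetical values r₁₂ = 0.8, r₁₃ = 0.6, r₂₃ = 0.5, it gives R₁.₂₃ ≈ 0.83; note that the result is always non-negative, as stated above.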
Example 1: Given the following zero-order coefficients of correlation, calculate the multiple coefficient of correlation.
Solution:
Example 2: Given the following zero-order coefficients of correlation, calculate the multiple coefficient of correlation.
Solution:
Example 3: Calculate the multiple coefficient of correlation for the following data:
X 3 4 5 6 7 8 9
Y 2 5 6 4 3 2 4
Z 5 6 4 5 6 5 8
Solution:
X Y Z X2 Y2 Z2 XY XZ YZ
3 2 5 9 4 25 6 15 10
4 5 6 16 25 36 20 24 30
5 6 4 25 36 16 30 20 24
6 4 5 36 16 25 24 30 20
7 3 6 49 9 36 21 42 18
8 2 5 64 4 25 16 40 10
9 4 8 81 16 64 36 72 32
Total: ΣX = 42, ΣY = 26, ΣZ = 39, ΣX² = 280, ΣY² = 110, ΣZ² = 227, ΣXY = 153, ΣXZ = 243, ΣYZ = 144
negative correlation. When the points are extremely scattered on a graph, it becomes evident that there is almost no relationship between the two variables. However, when it comes to intermediate values of r, we have to be careful in their interpretation. Suppose we get a correlation of 0.9; we may be tempted to say that it is 'twice as good' or 'twice as strong' as a correlation of 0.45. This comparison is wrong. The strength of r is judged by the coefficient of determination, r². For r = 0.9, r² = 0.81; multiplying by 100, we get 81 percent. This suggests that when r is 0.9, 81 percent of the total variation in the dependent series can be attributed to the relationship with the other variable. When r = 0.45, r² = 0.2025, which in percentage terms is 20.25.
The multiple regression equation takes the form

y = a + b₁x₁ + b₂x₂ + ... + b_k x_k

where y is the dependent variable which is to be predicted; x₁, x₂, ..., x_k are the k known variables on which the predictions are to be based; and a, b₁, b₂, ..., b_k are parameters, the values of which are determined by the method of least squares.
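For k = 2 predictors, the three normal equations can be solved directly. The following self-contained Python sketch (an illustration only; `fit_two_predictors` is our own name) fits y = a + b₁x₁ + b₂x₂ by Gauss-Jordan elimination on the normal equations:

```python
def fit_two_predictors(ys, x1s, x2s):
    """Fit y = a + b1*x1 + b2*x2 by least squares.
    Builds the augmented matrix of the three normal equations and
    solves it by Gauss-Jordan elimination (no external libraries)."""
    n = len(ys)

    def sp(us, vs):  # sum of products
        return sum(u * v for u, v in zip(us, vs))

    A = [[n,        sum(x1s),     sum(x2s),     sum(ys)],
         [sum(x1s), sp(x1s, x1s), sp(x1s, x2s), sp(x1s, ys)],
         [sum(x2s), sp(x1s, x2s), sp(x2s, x2s), sp(x2s, ys)]]
    for i in range(3):                      # Gauss-Jordan elimination
        A[i] = [v / A[i][i] for v in A[i]]  # normalize the pivot row
        for j in range(3):
            if j != i:
                f = A[j][i]
                A[j] = [vj - f * vi for vj, vi in zip(A[j], A[i])]
    return A[0][3], A[1][3], A[2][3]        # a, b1, b2
```

On data generated exactly from y = 1 + 2x₁ + 3x₂, it recovers a = 1, b₁ = 2, b₂ = 3 up to rounding.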
Example 1) The following data relate to radio advertising expenditure, newspaper advertising expenditure and sales. Fit a regression equation.
Solution: It may be noted here that as there are three variables, viz. Y, X₁ and X₂, there will be three normal equations. The computations are tabulated as:

Y    X₁   X₂    Y²    YX₁   X₁²   YX₂   X₁X₂
4    1    7     16    4     1     28    7
7    2    12    49    14    4     84    24
9    5    17    81    45    25    153   85
Example 2) The following data relate to sales and advertising expenditure. Fit a regression equation.

Obs.   Y     X₁    X₂
1      100   40    10
2      80    30    10
3      60    20    7
4      120   50    15
5      150   60    20
6      90    40    12
7      70    20    8
8      130   60    14

Solution: It may be noted here that as there are three variables, viz. Y, X₁ and X₂, there will be three normal equations:

ΣY = n a + b₁ ΣX₁ + b₂ ΣX₂
ΣX₁Y = a ΣX₁ + b₁ ΣX₁² + b₂ ΣX₁X₂
ΣX₂Y = a ΣX₂ + b₁ ΣX₁X₂ + b₂ ΣX₂²