Correlation-Regression 2019

Uploaded by

ANCHURI NANDINI


Introduction to Linear Regression and Correlation Analysis

• The correlation between two random variables, X and Y, is a measure of the degree of linear association between the two variables.
• The population correlation coefficient, denoted by ρ, and the sample correlation coefficient, denoted by r, can take any value from -1 to 1.

Methods of Correlation Analysis

• Scatter diagram method
• Karl Pearson's correlation coefficient
• Spearman's rank correlation method

Scatter Plots and Correlation

• A scatter plot (or scatter diagram) is used to show the relationship between two variables.
• Correlation analysis is used to measure the strength of the association (linear relationship) between two variables.
  • It is concerned only with the strength of the relationship.
  • No causal effect is implied.
Scatter Plot Examples
[Figure: scatter plots of y against x in three groups: linear versus curvilinear relationships, strong versus weak relationships, and no relationship.]
Correlation Coefficient

• The population correlation coefficient ρ (rho) measures the strength of the association between the variables.
• The sample correlation coefficient r is an estimate of ρ and is used to measure the strength of the linear relationship in the sample observations.

Features of ρ and r
• Unit free
• Range between -1 and 1
• The closer to -1, the stronger the negative linear relationship
• The closer to 1, the stronger the positive linear relationship
• The closer to 0, the weaker the linear relationship
Examples of Approximate r Values
[Figure: five scatter plots of y against x, with r = -1, r = -0.6, r = 0, r = +0.3 and r = +1.]
Calculating the Correlation Coefficient Using Karl Pearson's Method
To measure the intensity of the relationship between the variables, Karl Pearson proposed a formula known as Karl Pearson's correlation coefficient:

$$ r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\left[\sum (x - \bar{x})^2\right]\left[\sum (y - \bar{y})^2\right]}} $$

or the algebraic equivalent:

$$ r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}} $$

where:
r = sample correlation coefficient
n = sample size
x = value of the independent variable
y = value of the dependent variable
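The two forms of Pearson's formula can be checked against each other numerically. The following is a minimal sketch in plain Python; the function names and the five-point data set are hypothetical illustrations, not part of the slides.

```python
from math import sqrt

def pearson_r_deviation(x, y):
    """Pearson's r computed from deviations about the means."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)

def pearson_r_sums(x, y):
    """Pearson's r computed from raw sums (the algebraic equivalent)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)
    sum_y2 = sum(yi ** 2 for yi in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Hypothetical illustrative data (not taken from the slides)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r_dev = pearson_r_deviation(x, y)
r_sum = pearson_r_sums(x, y)
print(round(r_dev, 4), round(r_sum, 4))  # both forms give the same value
```

The raw-sums form is convenient for hand calculation from a table of totals, while the deviation form is usually less sensitive to rounding when the values are large.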
Assumptions of Pearson's Correlation Coefficient
• Pearson's correlation coefficient is appropriate when both variables, x and y, are measured on an interval or a ratio scale.
• Both variables are normally distributed, and there is a linear relationship between them.
• There is a cause-and-effect relationship between the two variables that influences the distribution of both.
Probable Error and Standard Error of the Coefficient of Correlation
• The probable error is used to judge whether the obtained correlation coefficient is significant.

$$ \text{P.E.}(r) = 0.6745 \, \frac{1 - r^2}{\sqrt{n}} $$

where (1 − r²)/√n is the standard error of r.

• If |r| < 6 P.E.(r), the value of r is not significant.
• If |r| > 6 P.E.(r), the value of r is significant.
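The probable-error rule can be sketched in a few lines of Python. The function names are hypothetical, the formula assumes the usual form P.E.(r) = 0.6745 (1 − r²)/√n, and the inputs r = 0.759 and n = 10 are illustrative values only.

```python
from math import sqrt

def probable_error(r, n):
    """Probable error of r: 0.6745 times the standard error (1 - r^2) / sqrt(n)."""
    return 0.6745 * (1 - r ** 2) / sqrt(n)

def is_significant(r, n):
    """Apply the rule of thumb: r is significant when |r| exceeds 6 * P.E.(r)."""
    return abs(r) > 6 * probable_error(r, n)

# Illustrative inputs (assumed for this sketch)
r, n = 0.759, 10
pe = probable_error(r, n)
print(round(pe, 4), round(6 * pe, 4), is_significant(r, n))
```

With these inputs, 6 P.E.(r) is well below |r|, so the rule classifies the correlation as significant.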
Coefficient of Determination
• The coefficient of determination is denoted by r².
• Its value always lies between 0 and 1.
• It measures the strength of the relationship between the variables, but the information about the direction is lost.
• If r² = 0, none of the variation in y can be explained by the variable x.
• If r² = 1, the values of y are completely explained by x.
Examples of Approximate R² Values
[Figure: scatter plots of y against x for three cases.]
• R² = 1: perfect linear relationship between x and y; 100% of the variation in y is explained by variation in x.
• 0 < R² < 1: weaker linear relationship between x and y; some but not all of the variation in y is explained by variation in x.
• R² = 0: no linear relationship between x and y; the value of y does not depend on x (none of the variation in y is explained by variation in x).
Example:
• The sales manager of a copier company wants to determine whether there is a relationship between the number of sales calls made in a month and the number of copiers sold in that month. The manager selects a random sample of 10 representatives and determines the number of sales calls each representative made last month and the number of copiers sold. The sample information is given below.

Sales Calls and Copier Sales
Salesperson   Number of sales calls   No. of copiers sold
Medha         20                      30
Mahathi       40                      60
Nikhil        20                      40
Sai Ram       30                      60
Sathya        10                      30
Sashi         10                      40
Krishna       20                      40
Pavan         20                      50
Raman         20                      30
Hari          30                      70
Calculation Example
        X     Y      XY     X²      Y²
       20    30     600    400     900
       40    60    2400   1600    3600
       20    40     800    400    1600
       30    60    1800    900    3600
       10    30     300    100     900
       10    40     400    100    1600
       20    40     800    400    1600
       20    50    1000    400    2500
       20    30     600    400     900
       30    70    2100    900    4900
Total 220   450   10800   5600   22100
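Calculation Example

$$ r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}} = \frac{10(10800) - (220)(450)}{\sqrt{\left[10(5600) - (220)^2\right]\left[10(22100) - (450)^2\right]}} = 0.759014 $$

[Figure: scatter plot of the sample data, with Y on the vertical axis.]

There is a positive relationship between the number of sales calls and the number of copiers sold.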
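The arithmetic for the copier example can be reproduced from the column totals of the calculation table. This is a small sketch in Python; the variable names are illustrative, and n = 10 is used in both brackets of the denominator because the sample has 10 representatives.

```python
from math import sqrt

# Column totals from the calculation table for the copier example
n = 10
sum_x, sum_y = 220, 450
sum_xy, sum_x2, sum_y2 = 10800, 5600, 22100

numerator = n * sum_xy - sum_x * sum_y          # 108000 - 99000 = 9000
denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator
r_squared = r ** 2                              # coefficient of determination

print(round(r, 6), round(r_squared, 4))
```

Squaring r gives the coefficient of determination, roughly 0.58, so a little under 58% of the variation in copiers sold is associated with variation in sales calls in this sample.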
Calculation Example II
        Tree height (y)   Trunk diameter (x)    xy      y²     x²
              35                  8             280    1225     64
              49                  9             441    2401     81
              27                  7             189     729     49
              33                  6             198    1089     36
              60                 13             780    3600    169
              21                  7             147     441     49
              45                 11             495    2025    121
              51                 12             612    2601    144
Total        321                 73            3142   14111    713
Calculation Example (continued)

$$ r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}} = \frac{8(3142) - (73)(321)}{\sqrt{\left[8(713) - (73)^2\right]\left[8(14111) - (321)^2\right]}} = 0.886 $$

[Figure: scatter plot of tree height, y, against trunk diameter, x.]

r = 0.886 → relatively strong positive linear association between x and y
Excel Output

Excel correlation output (Tools / Data Analysis / Correlation…):

                 Tree Height   Trunk Diameter
Tree Height      1
Trunk Diameter   0.886231      1

The off-diagonal entry, 0.886231, is the correlation between tree height and trunk diameter.
Ex: Pepsi Cola is studying the effect of its last advertising campaign. People chosen at random were called and asked how many cans of Pepsi Cola they had bought (X) in the past week and how many advertisements (Y) they had either read or seen in the past week.

X:    3    7    4    2    0    4    1    2
Y:   11   18    9    4    7    6    3    8

Calculate the coefficient of correlation and the coefficient of determination.

An economist wanted to find out whether there was any relationship between the unemployment rate in a country and its inflation rate. Data gathered from 7 countries for the year 2004 are given below.

Country   Unemployment rate (%)   Inflation rate (%)
A         4.0                     3.2
B         8.5                     8.2
C         5.5                     9.4
D         0.8                     5.1
E         7.3                     10.1
F         5.8                     7.8
G         2.1                     4.7

Find the degree of linear association between a country's unemployment rate and its level of inflation.

Spearman Rank Correlation (ρ)
• The correlation between two sets of ranks is known as rank correlation.
• To measure the intensity of the relationship between variables that have ordinal data, we use the Spearman rank correlation.
• The Spearman rank correlation lies between -1 and +1.
• If the rank correlation coefficient is +1 there is perfect positive correlation, and if it is -1 there is perfect negative correlation.

Spearman's rank correlation is given by

$$ \rho(x, y) = 1 - \frac{6 \sum d^2}{n(n^2 - 1)} $$

where d = R_x − R_y is the difference between the two ranks of an observation and n is the number of pairs of observations.

When ranks are equal (tied), a correction factor is added to ∑d²:

$$ \rho(x, y) = 1 - \frac{6\left[\sum d^2 + \text{correction factor}\right]}{n(n^2 - 1)} $$

where the correction factor is m(m² − 1)/12 and m is the number of times an item is repeated.

Ten competitors in a beauty contest are ranked by three judges in the following order.

Judge I:     1    6    5   10    3    2    4    9    7    8
Judge II:    3    5    8    4    7   10    2    1    6    9
Judge III:   6    4    9    8    1    2    3   10    5    7

Determine which pair of judges has the nearest approach to common tastes in beauty.

Worksheet columns: ranks by Judges I, II and III (R1, R2, R3); D1 = R1 − R2; D2 = R1 − R3; D3 = R2 − R3; (D1)²; (D2)²; (D3)².
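The worksheet for the three judges can be sketched in Python using Spearman's formula for untied ranks. The function name and the optional `correction` argument are hypothetical; the rankings are the ones listed for Judges I, II and III, which contain no ties, so the correction stays at zero.

```python
def spearman_rho(rank_x, rank_y, correction=0.0):
    """Spearman's rho = 1 - 6 * (sum of d^2 + correction) / (n * (n^2 - 1))."""
    n = len(rank_x)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * (sum_d2 + correction) / (n * (n ** 2 - 1))

# Rankings given by the three judges to the ten competitors
judge_1 = [1, 6, 5, 10, 3, 2, 4, 9, 7, 8]
judge_2 = [3, 5, 8, 4, 7, 10, 2, 1, 6, 9]
judge_3 = [6, 4, 9, 8, 1, 2, 3, 10, 5, 7]

rho = {
    "I & II": spearman_rho(judge_1, judge_2),
    "I & III": spearman_rho(judge_1, judge_3),
    "II & III": spearman_rho(judge_2, judge_3),
}
closest_pair = max(rho, key=rho.get)  # pair with the largest rank correlation

for pair, value in rho.items():
    print(pair, round(value, 4))
print("Nearest approach to common tastes:", closest_pair)
```

The pair with the highest coefficient is taken as having the nearest approach to common tastes; negative values indicate that the two judges tend to rank the competitors in opposite orders.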
A financial analyst wanted to find out whether inventory turnover influences a company's earnings per share (in %). A random sample of 7 companies listed on a stock exchange was selected, and the following data were obtained for each.

Company   Inventory turnover (no. of times)   Earnings per share (%)
A         4                                   11
B         5                                   9
C         7                                   13
D         8                                   7
E         6                                   13
F         3                                   8
G         5                                   8

Find the strength of association between inventory turnover and earnings per share. Interpret the result.

Coefficient of Determination:

The squared value of the coefficient of correlation is called the coefficient of determination.

It indicates "the proportion of the total variability of the dependent variable that is accounted for or explained by the independent variable".

It always lies between 0 and 1.

• The following table gives indices of industrial production and the number of registered unemployed people (in lakh). Calculate the value of the correlation coefficient.

Year                  1991   1992   1993   1994   1995   1996   1997   1998
Index of production    100    102    104    107    105    112    103     99
No. of unemployed       15     12     13     11     12     12     19     26

Introduction to Regression Analysis

• Regression analysis is used to:
  • Predict the value of a dependent variable based on the value of at least one independent variable
  • Explain the impact of changes in an independent variable on the dependent variable
• Dependent variable: the variable we wish to explain
• Independent variable: the variable used to explain the dependent variable
Simple Linear Regression Model

• Only one independent variable, x
• The relationship between x and y is described by a linear function
• Changes in y are assumed to be caused by changes in x
Types of Regression Models
[Figure: four scatter plots showing a positive linear relationship, a negative linear relationship, a relationship that is not linear, and no relationship.]

Population Linear Regression

The population regression model:

$$ y = \beta_0 + \beta_1 x + \varepsilon $$

where:
y = dependent variable
β0 = population y-intercept
β1 = population slope coefficient
x = independent variable
ε = random error term, or residual

β0 + β1x is the linear component and ε is the random error component.
Linear Regression Assumptions

• Error values (ε) are statistically independent
• Error values are normally distributed for any given value of x
• The probability distribution of the errors is normal
• The probability distribution of the errors has constant variance
• The underlying relationship between the x variable and the y variable is linear
Population Linear Regression (continued)
[Figure: the population regression line y = β0 + β1x + ε, with intercept β0 and slope β1. For a given value xi, the random error εi is the vertical distance between the observed value of y and the predicted value of y on the line.]
Estimated Regression Model
The sample regression line provides an estimate of the population regression line:

$$ \hat{y}_i = b_0 + b_1 x $$

where:
ŷi = estimated (or predicted) y value
b0 = estimate of the regression intercept
b1 = estimate of the regression slope
x = independent variable

The individual random error terms ei have a mean of zero.

Least Squares Criterion

• b0 and b1 are obtained by finding the values of b0 and b1 that minimize the sum of the squared residuals:

$$ \sum e^2 = \sum (y - \hat{y})^2 = \sum \left(y - (b_0 + b_1 x)\right)^2 $$
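The Least Squares Equation
• The formulas for b1 and b0 are:

$$ b_1 = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} $$

algebraic equivalent:

$$ b_1 = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sum x^2 - \dfrac{\left(\sum x\right)^2}{n}} $$

and

$$ b_0 = \bar{y} - b_1 \bar{x} $$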
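The least squares formulas translate directly into code. The sketch below uses the algebraic form for b1 and then checks, on a small hypothetical data set that is not from the slides, that moving the coefficients away from the fitted values increases the sum of squared residuals. The function names are illustrative.

```python
def least_squares(x, y):
    """Return (b0, b1) using the algebraic form of the least squares formulas."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)
    b1 = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    b0 = sum_y / n - b1 * sum_x / n
    return b0, b1

def sum_squared_residuals(x, y, b0, b1):
    """Sum of (y - (b0 + b1 * x))^2 over the sample."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

# Hypothetical illustrative data (not taken from the slides)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = least_squares(x, y)
best = sum_squared_residuals(x, y, b0, b1)
nudged = sum_squared_residuals(x, y, b0 + 0.1, b1 - 0.05)  # any other line does worse
print(round(b0, 4), round(b1, 4), round(best, 4), round(nudged, 4))
```

Comparing `best` with `nudged` is only a spot check of the least squares criterion, not a proof; the formulas themselves come from setting the partial derivatives of the sum of squared residuals to zero.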
Interpretation of the Slope and the Intercept

• b0 is the estimated average value of y when the value of x is zero
• b1 is the estimated change in the average value of y as a result of a one-unit change in x
Finding the Least Squares Equation

• The coefficients b0 and b1 will usually be found using computer software, such as Excel or Minitab
• Other regression measures will also be computed as part of computer-based regression analysis
Simple Linear Regression Example

• A real estate agent wishes to examine the relationship between the selling price of a home and its size (measured in square feet)
• A random sample of 10 houses is selected
  • Dependent variable (y) = house price in $1000s
  • Independent variable (x) = square feet

Sample Data for House Price Model
House price in $1000s (y)   Square feet (x)
245                         1400
312                         1600
279                         1700
308                         1875
199                         1100
219                         1550
405                         2350
324                         2450
319                         1425
255                         1700
Regression Using Excel
• Tools / Data Analysis / Regression

Excel Output

Regression Statistics
Multiple R           0.76211
R Square             0.58082
Adjusted R Square    0.52842
Standard Error       41.33032
Observations         10

ANOVA
             df   SS           MS           F         Significance F
Regression   1    18934.9348   18934.9348   11.0848   0.01039
Residual     8    13665.5652   1708.1957
Total        9    32600.5000

              Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept     98.24833       58.03348         1.69296   0.12892   -35.57720   232.07386
Square Feet   0.10977        0.03297          3.32938   0.01039   0.03374     0.18580

The regression equation is:
house price = 98.24833 + 0.10977 (square feet)
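Several entries of the Excel output can be reproduced from the sample data with plain Python. This is a sketch under the assumption that Excel's "Standard Error" in the regression statistics is the square root of the residual mean square; the variable names are illustrative.

```python
from math import sqrt

# Sample data for the house price model
sqft = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
price = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]  # in $1000s

n = len(sqft)
x_bar = sum(sqft) / n
y_bar = sum(price) / n
sxx = sum((x - x_bar) ** 2 for x in sqft)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(sqft, price))
sst = sum((y - y_bar) ** 2 for y in price)   # total sum of squares

b1 = sxy / sxx                   # slope
b0 = y_bar - b1 * x_bar          # intercept
ssr = b1 * sxy                   # regression sum of squares
sse = sst - ssr                  # residual sum of squares
r_square = ssr / sst
std_error = sqrt(sse / (n - 2))  # standard error of the estimate
f_stat = ssr / (sse / (n - 2))

print(round(b0, 5), round(b1, 5))
print(round(r_square, 5), round(std_error, 5), round(f_stat, 4))
```

The p-values and confidence limits in the output additionally require the t and F distributions, which are left out of this sketch.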
Graphical Presentation
• House price model: scatter plot and regression line
[Figure: house price ($1000s) plotted against square feet, with the fitted regression line; slope = 0.10977, intercept = 98.248.]

house price = 98.24833 + 0.10977 (square feet)

Interpretation of the Intercept, b0

house price = 98.24833 + 0.10977 (square feet)

• b0 is the estimated average value of Y when the value of X is zero (if x = 0 is in the range of observed x values)
• Here, no houses had 0 square feet, so b0 = 98.24833 just indicates that, for houses within the range of sizes observed, $98,248.33 is the portion of the house price not explained by square feet
Interpretation of the Slope Coefficient, b1

house price = 98.24833 + 0.10977 (square feet)

• b1 measures the estimated change in the average value of Y as a result of a one-unit change in X
• Here, b1 = 0.10977 tells us that the average value of a house increases by 0.10977 ($1000) = $109.77, on average, for each additional square foot of size
Example: House Prices

Estimated regression equation, fitted to the sample of 10 houses (house price in $1000s, y, against square feet, x):

house price = 98.25 + 0.1098 (sq. ft.)

Predict the price for a house with 2000 square feet.
Example: House Prices (continued)
Predict the price for a house with 2000 square feet:

house price = 98.25 + 0.1098 (sq. ft.)
            = 98.25 + 0.1098(2000)
            = 317.85

The predicted price for a house with 2000 square feet is 317.85 ($1000s) = $317,850.
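The prediction step can be wrapped in a small helper that also reports whether the requested size lies inside the range of sizes in the sample (1100 to 2450 square feet), since the fitted line is only supported by data within that range. The function name and the range check are additions for illustration, not part of the slides.

```python
def predict_price(sqft, b0=98.25, b1=0.1098, x_min=1100, x_max=2450):
    """Return (predicted price in $1000s, True if sqft is inside the sampled range)."""
    within_range = x_min <= sqft <= x_max
    return b0 + b1 * sqft, within_range

price, in_range = predict_price(2000)
print(round(price, 2), in_range)       # prediction for a 2000 sq. ft. house

_, zero_in_range = predict_price(0)    # x = 0 lies outside the observed sizes
print(zero_in_range)
```

The second call illustrates why the intercept should not be read as the price of a house with zero square feet: that value of x is far outside the observed data.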
Example: Market Trend
• In finance, it is of interest to look at the relationship between Y, a stock's average return, and X, the overall market return. The slope coefficient computed by linear regression is called the stock's beta by investment analysts. A beta greater than 1 indicates that the stock is relatively sensitive to changes in the market; a beta less than 1 indicates that the stock is relatively insensitive. For the following data, compute the beta and suggest the market trend.

Overall market return % (X)   Average return % (Y)
10                            11
12                            15
8                             3
15                            18
9                             10
11                            12
8                             6
10                            7
13                            18
11                            13
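One way to work this example is to compute the slope of the regression of Y on X, which is the beta described above. The sketch below uses plain Python and the ten data pairs from the table; the variable names are illustrative.

```python
# Overall market return % (X) and the stock's average return % (Y)
market = [10, 12, 8, 15, 9, 11, 8, 10, 13, 11]
stock = [11, 15, 3, 18, 10, 12, 6, 7, 18, 13]

n = len(market)
sum_x, sum_y = sum(market), sum(stock)
sum_xy = sum(x * y for x, y in zip(market, stock))
sum_x2 = sum(x ** 2 for x in market)

# Slope of the least squares line of Y on X (the stock's beta)
beta = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
alpha = sum_y / n - beta * sum_x / n   # intercept of the fitted line

print(round(beta, 4), round(alpha, 4))
print("sensitive to the market" if beta > 1 else "relatively insensitive")
```

For these data the slope comes out above 1, so by the rule stated in the example the stock is relatively sensitive to changes in the market.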
Properties of regression lines and their coefficients:
1. The correlation coefficient is the geometric mean of the two regression coefficients.
2. The sign of the correlation coefficient is the same as that of the regression coefficients.
3. Regression coefficients are independent of a change of origin but not of a change of scale.
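These properties can be checked numerically. The sketch below reuses the tree height and trunk diameter data from Calculation Example II; the helper name `slope` is hypothetical, and the shifts and scale factor are arbitrary choices made for the illustration.

```python
from math import sqrt, copysign

def slope(x, y):
    """Regression coefficient of y on x: sum of cross-deviations over sum of squared x-deviations."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx

# Tree data from Calculation Example II
height = [35, 49, 27, 33, 60, 21, 45, 51]
diameter = [8, 9, 7, 6, 13, 7, 11, 12]

b_yx = slope(diameter, height)   # height regressed on diameter
b_xy = slope(height, diameter)   # diameter regressed on height

# Properties 1 and 2: r is the geometric mean of the two coefficients, with their sign
r = copysign(sqrt(b_yx * b_xy), b_yx)

# Property 3: shifting the origin leaves the slope unchanged, rescaling x changes it
shifted = slope([d + 100 for d in diameter], [h - 20 for h in height])
scaled = slope([d * 10 for d in diameter], height)

print(round(r, 3), round(b_yx, 4), round(shifted, 4), round(scaled, 4))
```

Multiplying x by 10 divides the coefficient of y on x by 10, which is the sense in which regression coefficients depend on scale.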
Problem
• The following data give the ages and blood pressure (BP) of 10 women. Find:
1. The correlation coefficient between age and BP
2. The least squares regression equation of BP on age
3. The estimated BP of a woman whose age is 45

Data and Calculations
        Age (x)   BP (y)     x²       y²       xy
          56       147      3136    21609     8232
          42       125      1764    15625     5250
          36       118      1296    13924     4248
          47       128      2209    16384     6016
          49       145      2401    21025     7105
          42       140      1764    19600     5880
          60       155      3600    24025     9300
          72       160      5184    25600    11520
          63       149      3969    22201     9387
          55       150      3025    22500     8250
Total    522      1417     28348   202493    75188

• Correlation coefficient

$$ r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}} = \frac{10(75188) - (522)(1417)}{\sqrt{\left[10(28348) - (522)^2\right]\left[10(202493) - (1417)^2\right]}} = 0.891679 $$

• Regression equation of y on x

$$ \hat{y}_i = b_0 + b_1 x, \qquad b_1 = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sum x^2 - \dfrac{\left(\sum x\right)^2}{n}}, \qquad b_0 = \bar{y} - b_1 \bar{x} $$

b1 = 1.11, b0 = 83.755

• The regression equation is ŷ = 83.755 + 1.11x
• When x = 45, ŷ = 83.755 + 1.11(45) = 133.705
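The three parts of the problem can be reproduced from the raw age and blood pressure data. This sketch keeps the slope unrounded, so the intercept and the estimate at age 45 differ slightly in the later decimals from the values obtained with b1 rounded to 1.11. Variable names are illustrative.

```python
from math import sqrt

age = [56, 42, 36, 47, 49, 42, 60, 72, 63, 55]
bp = [147, 125, 118, 128, 145, 140, 155, 160, 149, 150]

n = len(age)
sum_x, sum_y = sum(age), sum(bp)
sum_xy = sum(x * y for x, y in zip(age, bp))
sum_x2 = sum(x ** 2 for x in age)
sum_y2 = sum(y ** 2 for y in bp)

# 1. Correlation coefficient between age and BP
r = (n * sum_xy - sum_x * sum_y) / sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)

# 2. Least squares regression of BP on age
b1 = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
b0 = sum_y / n - b1 * sum_x / n

# 3. Estimated BP at age 45
bp_at_45 = b0 + b1 * 45

print(round(r, 6), round(b1, 3), round(b0, 3), round(bp_at_45, 2))
```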
Multiple Regression Analysis
• A linear regression equation with more than one independent variable is called a multiple regression model.

The linear regression equation with k independent variables takes the form:

$$ y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \dots + \beta_k x_k + \varepsilon $$

where
• y is the value of the dependent variable to be estimated
• β0 is a constant
• β1, β2, ..., βk are the regression coefficients associated with each of the independent variables x1, x2, ..., xk
• ε is the random error due to chance.

Let the fitted linear regression equation be

$$ \hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_k x_k $$

which minimizes the sum of squared errors, SSE = ∑(y − ŷ)²,
where
• ŷ is the estimated value of the dependent variable y
• b1, b2, b3, ..., bk are the partial regression coefficients, obtained by the principle of least squares.

• Let us consider the case of two independent variables and one dependent variable.

The multiple linear regression model involving two independent variables is:

$$ y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon $$

where
• y is the dependent variable
• x1 and x2 are the independent variables
• ε is the random error due to chance
• β0 is the y-intercept
• β1, β2 are the regression coefficients.

Let the fitted multiple linear regression equation be

$$ \hat{y} = b_0 + b_1 x_1 + b_2 x_2 $$

or

$$ \hat{y} = b_0 + b_{y1.2}\, x_1 + b_{y2.1}\, x_2 $$

where
• ŷ is the estimated value of the dependent variable y
• x1, x2 are the independent variables
• b0, b1, b2 are the unknown constants, determined by the principle of least squares, which minimizes the sum of squared errors SSE = ∑(y − ŷ)².
The values of b0, b1, b2 can be determined by solving the following equations:

$$ \sum y = n b_0 + b_{y1.2} \sum x_1 + b_{y2.1} \sum x_2 $$

$$ \sum y x_1 = b_0 \sum x_1 + b_{y1.2} \sum x_1^2 + b_{y2.1} \sum x_1 x_2 $$

$$ \sum y x_2 = b_0 \sum x_2 + b_{y1.2} \sum x_1 x_2 + b_{y2.1} \sum x_2^2 $$
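Let the fitted multiple linear regression equation be ŷ = b0 + b1x1 + b2x2, or

$$ \hat{y} = b_0 + b_{y1.2}\, x_1 + b_{y2.1}\, x_2 \qquad (1) $$

$$ \bar{y} = b_0 + b_{y1.2}\, \bar{x}_1 + b_{y2.1}\, \bar{x}_2 \qquad (2) $$

Subtracting (2) from (1):

$$ (\hat{y} - \bar{y}) = b_{y1.2}\,(x_1 - \bar{x}_1) + b_{y2.1}\,(x_2 - \bar{x}_2) $$

$$ Y = b_{y1.2}\, X_1 + b_{y2.1}\, X_2 $$

$$ b_{y1.2} = \frac{\left(\sum Y X_1\right)\left(\sum X_2^2\right) - \left(\sum Y X_2\right)\left(\sum X_1 X_2\right)}{\left(\sum X_1^2\right)\left(\sum X_2^2\right) - \left(\sum X_1 X_2\right)^2} $$

$$ b_{y2.1} = \frac{\left(\sum Y X_2\right)\left(\sum X_1^2\right) - \left(\sum Y X_1\right)\left(\sum X_1 X_2\right)}{\left(\sum X_1^2\right)\left(\sum X_2^2\right) - \left(\sum X_1 X_2\right)^2} $$

where Y = y − ȳ, X1 = x1 − x̄1, X2 = x2 − x̄2.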
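Relationship between partial regression coefficients and correlation coefficients:

$$ b_{y1.2} = \left(\frac{r_{y1} - r_{y2}\, r_{12}}{1 - r_{12}^2}\right)\frac{\sigma_y}{\sigma_1} $$

$$ b_{y2.1} = \left(\frac{r_{y2} - r_{y1}\, r_{12}}{1 - r_{12}^2}\right)\frac{\sigma_y}{\sigma_2} $$

where

$$ r_{y1} = \frac{\sum Y X_1}{\sqrt{\sum Y^2 \sum X_1^2}} \quad \text{is the correlation between } y \text{ and } x_1 $$

$$ r_{y2} = \frac{\sum Y X_2}{\sqrt{\sum Y^2 \sum X_2^2}} \quad \text{is the correlation between } y \text{ and } x_2 $$

$$ r_{12} = \frac{\sum X_1 X_2}{\sqrt{\sum X_1^2 \sum X_2^2}} \quad \text{is the correlation between } x_1 \text{ and } x_2 $$
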
 A marketing manager of a company wants to
predict demand for the product. He strongly believes that demand is highly influenced by the annual average price of the product (in units) and advertising expenditure (Rs in lakh). He has collected past data to study the effect of these factors on demand, given below:

Y     4    6    7    9   13   15
X1   15   12    8    6    4    3
X2   30   24   20   14   10    4
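For a data set like this one, the three normal equations for b0, b_y1.2 and b_y2.1 form a 3 × 3 linear system that can be solved numerically. The sketch below uses a small Gauss-Jordan elimination written in plain Python; the helper names `solve_linear_system` and `s` are hypothetical, and this is one possible way to solve the system rather than a prescribed method.

```python
def solve_linear_system(a, b):
    """Solve a small square system by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def s(u, v=None):
    """Sum of u, or sum of the products u*v when v is given."""
    return sum(u) if v is None else sum(p * q for p, q in zip(u, v))

# Demand data from the example
y = [4, 6, 7, 9, 13, 15]
x1 = [15, 12, 8, 6, 4, 3]
x2 = [30, 24, 20, 14, 10, 4]
n = len(y)

# Coefficient matrix and right-hand side of the three normal equations
a = [
    [n,     s(x1),     s(x2)],
    [s(x1), s(x1, x1), s(x1, x2)],
    [s(x2), s(x1, x2), s(x2, x2)],
]
rhs = [s(y), s(y, x1), s(y, x2)]

b0, b1, b2 = solve_linear_system(a, rhs)
residuals = [yi - (b0 + b1 * u + b2 * v) for yi, u, v in zip(y, x1, x2)]
print(round(b0, 4), round(b1, 4), round(b2, 4))
```

A useful check on any solution is that the residuals sum to zero and are uncorrelated with x1 and x2, which is exactly what the normal equations state.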
• The following results were obtained from measurements of length (in mm), volume (in cc) and weight (in gm) of 300 eggs.

x̄1 = 55.95     x̄2 = 51.48     ȳ = 56.03
σ1 = 2.26      σ2 = 4.39      σy = 4.41
r_y1 = 0.578   r_y2 = 0.581   r_12 = 0.974

Obtain the linear regression equation of egg weight on its length and volume. Hence estimate the weight of an egg whose length is 58 mm and volume is 52.5 cc.
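Because only summary statistics are given, the partial regression coefficients can be obtained from the correlation-based formulas. The sketch below assumes, following the order in which the quantities are listed, that x1 is length, x2 is volume and y is weight; the variable names are illustrative.

```python
# Summary statistics for the 300 eggs (x1 = length, x2 = volume, y = weight; assumed mapping)
x1_bar, x2_bar, y_bar = 55.95, 51.48, 56.03
s1, s2, sy = 2.26, 4.39, 4.41
r_y1, r_y2, r_12 = 0.578, 0.581, 0.974

# Partial regression coefficients from the correlation coefficients
b_y1_2 = (r_y1 - r_y2 * r_12) / (1 - r_12 ** 2) * sy / s1
b_y2_1 = (r_y2 - r_y1 * r_12) / (1 - r_12 ** 2) * sy / s2

# The fitted plane passes through the point of means, which gives the intercept
b0 = y_bar - b_y1_2 * x1_bar - b_y2_1 * x2_bar

# Estimated weight for length 58 mm and volume 52.5 cc
weight = b0 + b_y1_2 * 58 + b_y2_1 * 52.5

print(round(b_y1_2, 4), round(b_y2_1, 4), round(b0, 4), round(weight, 2))
```

Since r_12 is close to 1, the denominator 1 − r_12² is small, so the coefficients are sensitive to rounding in the reported correlations.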
 The Federal Reserve is performing a preliminary
study to determine the relationship between certain economic indicators and the annual percentage change in the gross national product (GNP). Two such indicators being examined are the amount of the federal government's deficit (in billions of dollars) and the Dow Jones Industrial Average (the mean value over the year). Data for 6 years follow:

Change in GNP       2.5    -1.0     4.0     1.0     1.5     3.0
Federal deficit   100.0   400.0   120.0   200.0   180.0    80.0
Dow Jones          2850    2100    3300    2400    2550    2700

i. Calculate the least squares equation that best describes the data.
ii. What % change in GNP would be expected in a year in which the federal deficit was $240 billion and the mean Dow Jones value was 3000?
 Multiple correlation analysis:

It is a measure of association between a dependent variable and several independent variables taken together.
The coefficient of multiple correlation is given by

$$ R_{y.12} = \sqrt{\frac{r_{y1}^2 + r_{y2}^2 - 2\, r_{y1}\, r_{y2}\, r_{12}}{1 - r_{12}^2}} $$

Its value always lies between 0 and 1.
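The formula for the coefficient of multiple correlation can be evaluated directly from the three pairwise correlations. The sketch below applies it to the correlations quoted in the egg example (0.578, 0.581 and 0.974); the function name is hypothetical.

```python
from math import sqrt

def multiple_correlation(r_y1, r_y2, r_12):
    """R_y.12 from the pairwise correlations of y, x1 and x2."""
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return sqrt(r_squared)

# Pairwise correlations from the egg example
R = multiple_correlation(0.578, 0.581, 0.974)
R_squared = R ** 2   # coefficient of multiple determination

# With uncorrelated predictors and r_y2 = 0, R reduces to |r_y1|
R_single = multiple_correlation(0.6, 0.0, 0.0)

print(round(R, 4), round(R_squared, 4), round(R_single, 4))
```

Squaring R gives the coefficient of multiple determination, the share of variation in y associated with the two predictors taken together.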

 Coefficient of multiple determination:

It is the proportion of the total variation in the values of the dependent variable y that is accounted for or explained by the independent variables in the multiple regression model.

• The square of the coefficient of multiple correlation is called the coefficient of multiple determination.
