
LINEAR REGRESSION AND CORRELATION

GE 104 - MATHEMATICS IN THE MODERN WORLD


WHAT ARE LINEAR REGRESSION AND CORRELATION?

• The most commonly used techniques for investigating the relationship between two quantitative variables are correlation and linear regression. Correlation quantifies the strength of the linear relationship between a pair of variables, whereas regression expresses the relationship in the form of an equation.
OBJECTIVES

•Understand the basics of linear regression.
•Learn how correlation quantifies relationships.
•Explore how these concepts are applied in real-world scenarios.
UNDERSTANDING CORRELATION

• On a scatter diagram, the closer the points lie to a straight line, the stronger the linear relationship between two variables. To quantify the strength of the relationship, we can calculate the correlation coefficient. In algebraic notation, if we have two variables x and y, and the data take the form of n pairs (i.e. [x1, y1], [x2, y2], [x3, y3] ... [xn, yn]), then the correlation coefficient is given by the following equation:

r = Σ(xi − x̄)(yi − ȳ) / √[ Σ(xi − x̄)² × Σ(yi − ȳ)² ]

• where x̄ is the mean of the x values, and ȳ is the mean of the y values.
• This is the product moment correlation coefficient (or Pearson
correlation coefficient). The value of r always lies between -1 and
+1. A value of the correlation coefficient close to +1 indicates a
strong positive linear relationship (i.e. one variable increases with
the other; Fig. 2). A value close to -1 indicates a strong negative
linear relationship (i.e. one variable decreases as the other
increases; Fig. 3). A value close to 0 indicates no linear relationship
(Fig. 4); however, there could be a nonlinear relationship between
the variables (Fig. 5).
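As an illustration, the sketch below computes r directly from the formula above. It assumes NumPy is available; the x and y arrays are hypothetical placeholder values (the A&E data are not reproduced here), so the printed value is only illustrative.

```python
import numpy as np

# Hypothetical placeholder data (not the A&E data), e.g. age and ln urea pairs
x = np.array([20, 25, 31, 38, 42, 47, 53, 60, 66, 72], dtype=float)
y = np.array([0.9, 1.1, 1.0, 1.3, 1.2, 1.5, 1.6, 1.7, 1.9, 2.0])

# Pearson product moment correlation coefficient, straight from the formula
r = np.sum((x - x.mean()) * (y - y.mean())) / np.sqrt(
    np.sum((x - x.mean()) ** 2) * np.sum((y - y.mean()) ** 2)
)

print(round(float(r), 3))                         # hand-rolled r
print(round(float(np.corrcoef(x, y)[0, 1]), 3))   # same value from NumPy's built-in
```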
FIGURE 2;
CORRELATION COEFFICIENT (R) = +0.9. POSITIVE LINEAR RELATIONSHIP
FIGURE 3;
CORRELATION COEFFICIENT (R) = -0.9. NEGATIVE LINEAR RELATIONSHIP.
FIGURE 4;
CORRELATION COEFFICIENT (R) = 0.04. NO RELATIONSHIP.
FIGURE 5;
CORRELATION COEFFICIENT (R) = -0.03. NONLINEAR RELATIONSHIP.
HYPOTHESIS TEST OF CORRELATION

• We can use the correlation coefficient to test whether there is a linear relationship between the variables in the population as a whole. The null hypothesis is that the population correlation coefficient equals 0. The value of r can be compared with those given in Table 2, or alternatively exact P values can be obtained from most statistical packages. For the A&E data, r = 0.62 with a sample size of 20 is greater than the value highlighted bold in Table 2 for P = 0.01, indicating a P value of less than 0.01. Therefore, there is sufficient evidence to suggest that the true population correlation coefficient is not 0 and that there is a linear relationship between ln urea and age.
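As a sketch of how an exact P value can be obtained, the snippet below applies the standard t test for a correlation coefficient, t = r√(n − 2)/√(1 − r²) with n − 2 degrees of freedom, assuming SciPy is available. Only r = 0.62 and n = 20 are taken from the slide; the printed P value is an illustration rather than a quoted result.

```python
import math
from scipy import stats

r, n = 0.62, 20   # correlation and sample size from the A&E example

# Under H0 (population correlation = 0), this statistic follows a
# t distribution with n - 2 degrees of freedom.
t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
p = 2 * stats.t.sf(abs(t), df=n - 2)   # two-tailed P value

print(f"t = {t:.2f}, P = {p:.4f}")     # P comes out well below 0.01

# Given the raw data, scipy.stats.pearsonr(x, y) returns r and the exact
# two-tailed P value directly.
```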
TABLE 2.
5% AND 1% POINTS FOR THE DISTRIBUTION OF THE CORRELATION COEFFICIENT
UNDER THE NULL HYPOTHESIS THAT THE POPULATION CORRELATION IS 0 IN A TWO-
TAILED TEST
UNDERSTANDING LINEAR REGRESSION

• In the A&E example we are interested in the effect of age (the predictor or x variable) on ln urea (the response or y variable). We want to estimate the underlying linear relationship so that we can predict ln urea (and hence urea) for a given age. Regression can be used to find the equation of this line. This line is usually referred to as the regression line.
EQUATION OF A STRAIGHT LINE

• The equation of a straight line is given by y = a + bx, where the coefficients a and b are the intercept of the line on the y axis and the gradient, respectively. The equation of the regression line for the A&E data (Fig. 7) is as follows: ln urea = 0.72 + (0.017 × age) (calculated using the method of least squares, which is described below). The gradient of this line is 0.017, which indicates that for an increase of 1 year in age the expected increase in ln urea is 0.017 units (and hence urea is expected to be multiplied by a factor of e^0.017 ≈ 1.02).
• The predicted ln urea of a patient aged 60 years, for example, is 0.72 + (0.017 × 60) = 1.74 units. This transforms to a urea level of e^1.74 = 5.70 mmol/l. The y intercept is 0.72, meaning that if the line were projected back to age = 0, then the ln urea value would be 0.72. However, this is not a meaningful value because age = 0 is a long way outside the range of the data and therefore there is no reason to believe that the straight line would still be appropriate.
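A short worked check of these numbers, assuming only the fitted equation ln urea = 0.72 + (0.017 × age) quoted on this slide:

```python
import math

a, b = 0.72, 0.017             # intercept and gradient of the regression line

age = 60
ln_urea = a + b * age           # 0.72 + (0.017 * 60) = 1.74
urea = math.exp(ln_urea)        # back-transform: e^1.74 ≈ 5.70 mmol/l

print(f"predicted ln urea = {ln_urea:.2f}")
print(f"predicted urea    = {urea:.2f} mmol/l")

# Each extra year adds 0.017 to ln urea, i.e. multiplies urea by e^0.017 ≈ 1.02
print(f"multiplicative change per year = {math.exp(b):.3f}")
```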
FIGURE 7;
REGRESSION LINE FOR LN UREA AND AGE: LN UREA = 0.72 + (0.017 × AGE).
METHOD OF LEAST SQUARES

• The regression line is obtained using the method of least squares. Any line y = a + bx that we draw through the points gives a predicted or fitted value of y for each value of x in the data set. For a particular value of x the vertical difference between the observed and fitted value of y is known as the deviation, or residual (Fig. 8). The method of least squares finds the values of a and b that minimise the sum of the squares of all the deviations. This gives the following formulae for calculating a and b:

b = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)²   and   a = ȳ − (b × x̄)
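A minimal sketch of these formulae in code, assuming NumPy; the x and y arrays are the same hypothetical placeholders used earlier, not the A&E data.

```python
import numpy as np

# Hypothetical placeholder data again (not the A&E data)
x = np.array([20, 25, 31, 38, 42, 47, 53, 60, 66, 72], dtype=float)
y = np.array([0.9, 1.1, 1.0, 1.3, 1.2, 1.5, 1.6, 1.7, 1.9, 2.0])

# Least squares estimates of the gradient (b) and intercept (a)
b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()

print(f"fitted line: y = {a:.3f} + {b:.3f}x")
print(np.polyfit(x, y, deg=1))   # NumPy's built-in fit: [gradient, intercept]
```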
FIGURE 8;
REGRESSION LINE OBTAINED BY MINIMIZING THE SUMS OF SQUARES OF ALL OF
THE DEVIATIONS.
ASSUMPTIONS AND LIMITATIONS

• The use of correlation and regression depends on some underlying assumptions. The observations are assumed to be independent. For correlation both variables should be random variables, but for regression only the response variable y must be random. In carrying out hypothesis tests or calculating confidence intervals for the regression parameters, the response variable should have a Normal distribution and the variability of y should be the same for each value of the predictor variable.
• A scatter diagram of the data provides an initial check of the
assumptions for regression. The assumptions can be assessed in
more detail by looking at plots of the residuals [4,7]. Commonly, the
residuals are plotted against the fitted values. If the relationship is
linear and the variability constant, then the residuals should be
evenly scattered around 0 along the range of fitted values (Fig. 11).
• (a) Scatter diagram of y against x suggests that the relationship is
nonlinear. (b) Plot of residuals against fitted values in panel a; the
curvature of the relationship is shown more clearly. (c) Scatter
diagram of y against x suggests that the variability in y increases
with x. (d) Plot of residuals against fitted values for panel c; the
increasing variability in y with x is shown more clearly.
• In addition, a Normal plot of residuals can be produced. This is a plot
of the residuals against the values they would be expected to take if
they came from a standard Normal distribution (Normal scores). If
the residuals are Normally distributed, then this plot will show a
straight line. (A standard Normal distribution is a Normal distribution
with mean = 0 and standard deviation = 1.) Normal plots are
usually available in statistical packages.
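A sketch of these two residual checks, assuming NumPy, SciPy and Matplotlib are available; the data and fit are the hypothetical placeholders from the earlier sketch, not the A&E data.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

# Hypothetical placeholder data and its least squares fit
x = np.array([20, 25, 31, 38, 42, 47, 53, 60, 66, 72], dtype=float)
y = np.array([0.9, 1.1, 1.0, 1.3, 1.2, 1.5, 1.6, 1.7, 1.9, 2.0])
b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()

fitted = a + b * x
residuals = y - fitted

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 4))

# Residuals against fitted values: should scatter evenly around 0 if the
# relationship is linear and the variability is constant
ax1.scatter(fitted, residuals)
ax1.axhline(0, linestyle="--")
ax1.set_xlabel("Fitted values")
ax1.set_ylabel("Residuals")

# Normal plot of residuals: an approximately straight line supports Normality
stats.probplot(residuals, dist="norm", plot=ax2)

plt.tight_layout()
plt.show()
```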
CONCLUSION

• Both correlation and simple linear regression can be used to examine the presence of a linear relationship between two variables providing certain assumptions about the data are satisfied. The results of the analysis, however, need to be interpreted with care, particularly when looking for a causal relationship or when using the regression equation for prediction. Multiple and logistic regression will be the subject of future reviews.
