1. Correlation and Regression

The document discusses correlation and regression, explaining correlation as a statistical measure of the relationship between two variables, which can be positive or negative. It details various types of correlation, how to calculate the Pearson correlation coefficient, and introduces regression as a method to predict one variable based on another. The differences between correlation and regression are highlighted, emphasizing that correlation measures strength while regression provides a predictive equation.

MS. T. F. HARIRARI
BIOTECHNOLOGY AND BIOCHEMISTRY DEPARTMENT
ROOM 144

LECTURE 20: CORRELATION AND REGRESSION


CORRELATION

• Correlation is a statistical measure that expresses the extent to which two
variables are related.
• Two variables can either be correlated or uncorrelated.
• It is completely symmetrical: the correlation between A and B is the same as the
correlation between B and A.
• Correlation can either be positive or negative.
• Correlation analysis is concerned with finding whether a relationship exists
between two variables and the degree of association.
CORRELATION ANALYSIS

• Four commonly used types of correlation are the Pearson correlation, the Kendall
rank correlation, the Spearman correlation and the point-biserial correlation.
• The Pearson correlation is the measure of a linear association between two
normally distributed variables.
• The Pearson correlation coefficient (r) describes the strength of an association
between two variables.
• r is a number between -1 and +1; the sign of r denotes the direction of the
association and the magnitude of r denotes its strength.
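For reference, the standard definition of the Pearson coefficient for n paired observations (x_i, y_i) can be written as below. The symbols \bar{x} and \bar{y}, denoting the sample means of the two variables, are notation introduced here rather than taken from the lecture.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The numerator is positive when the two variables tend to lie on the same side of their means and negative when they tend to lie on opposite sides, which is what gives r its sign.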
SCATTER PLOT

• A scatter plot can be used to determine whether a linear (straight line) correlation
exists between two variables.

[Figure: scatter plots illustrating linear correlation, nonlinear correlation and no correlation]


POSITIVE CORRELATION

• Positive correlation is a relationship between two variables that move in tandem,
i.e., in the same direction.
• When one variable increases or decreases, the other variable increases or
decreases respectively.
NEGATIVE CORRELATION

• Negative correlation is a relationship between two variables that move in opposite
directions from one another.
• When one variable increases the other decreases, and when one variable
decreases the other increases.
CALCULATING ‘R’

• r = 0; no correlation.
• 0 < |r| < 0.25; weak correlation.
• 0.25 ≤ |r| < 0.75; intermediate correlation.
• 0.75 ≤ |r| < 1; strong correlation.
• |r| = 1; perfect correlation.
EXAMPLE

• A sample of 6 dogs was selected, and their age in years and weight in kilograms were
recorded as shown in Table 1. Find the correlation between the age and the weight
of the dogs.

Table 1. Age and weight of six dogs
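Table 1's values are not reproduced here, so the following is only a minimal sketch of how such a calculation might be carried out. The ages and weights are made up for illustration, and the function names `pearson_r` and `strength` are hypothetical helpers. The strength labels follow the bands listed under CALCULATING ‘R’, applied to the magnitude of r so that negative correlations are labelled too.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def strength(r):
    """Label the magnitude of r using the bands from the lecture."""
    size = abs(r)
    if size == 0:
        return "no correlation"
    if size < 0.25:
        return "weak"
    if size < 0.75:
        return "intermediate"
    if size < 1:
        return "strong"
    return "perfect"

# Hypothetical ages (years) and weights (kg); NOT the values from Table 1.
ages = [1, 2, 3, 4, 5, 6]
weights = [4.0, 6.5, 7.0, 9.5, 10.0, 12.5]

r = pearson_r(ages, weights)
print(round(r, 3), strength(r))
```

Swapping the two arguments of `pearson_r` gives the same value, which reflects the symmetry of correlation.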


REGRESSION

• Regression is the measure of the relation between the mean value of one variable
and corresponding values of the other variable.
• The average value of y is a “function” of x, that is, it changes with x.
• Therefore, it is possible to predict variable y using variable x.
• If y represents the dependent variable and x the independent variable, this
relationship is described as the regression of y on x.
REGRESSION EQUATION

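• The relationship can be represented by a simple equation called the regression
equation.
• y = a + bx; where
y is the predicted value of the dependent variable
x is the value of the independent variable
a is the y-intercept (the value of y when x = 0)
b is the regression coefficient (the slope of the line)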
REGRESSION LINE

• The regression equation can be used to construct a regression line on a scatter
diagram.
• The regression equation describes the regression line mathematically.
• The least squares method chooses the straight line that minimises the sum of the
squared vertical deviations of the plotted points from the line.
• The result is a “best-fit” line for the data set, drawn on the scatter plot.
• These vertical deviations are the residuals, so regression minimises the sum of
squared residuals.
• The direction in which the line slopes depends on whether the correlation is
positive or negative.
• When the two sets of observations increase or decrease together (positive) the
line slopes upwards from left to right.
• When one set decreases as the other increases (negative) the line slopes
downwards from left to right.
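The least squares idea described above can be sketched in a few lines. This is an illustrative sketch rather than the lecture's own procedure: the data are hypothetical, and `least_squares` and `sum_squared_residuals` are names introduced here. It uses the standard closed-form result that the slope is the sum of products of deviations divided by the sum of squared deviations of x, and that the fitted line passes through the point of means.

```python
def least_squares(xs, ys):
    """Return (a, b) for the line y = a + b*x that minimises squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # intercept: the line passes through (mean_x, mean_y)
    return a, b

def sum_squared_residuals(xs, ys, a, b):
    """Sum of squared vertical deviations of the points from y = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data for illustration only.
xs = [1, 2, 3, 4, 5, 6]
ys = [4.0, 6.5, 7.0, 9.5, 10.0, 12.5]

a, b = least_squares(xs, ys)
best = sum_squared_residuals(xs, ys, a, b)

# Nudging the slope or the intercept away from the fitted values increases the error.
print(best < sum_squared_residuals(xs, ys, a, b + 0.1))
print(best < sum_squared_residuals(xs, ys, a + 0.5, b))
# A positive slope means the line rises from left to right.
print(b > 0)
```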
EXAMPLE

• A sample of goats was selected. Their ages and weights are shown in Table 2.
1. Find the regression equation.
2. What is the predicted weight when the age is 8.5 years?

Table 2: Age and weight of six goats

y = a + bx
a = 4.675
b = 0.92
y = 4.675 + 0.92x

For x = 8.5:
y = 4.675 + (0.92 × 8.5)
y = 4.675 + 7.82 = 12.495 ≈ 12.50 kg
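The prediction step can be checked with a short sketch. The intercept and slope are the values given in the worked example; the function name `predict_weight` is a hypothetical helper introduced here.

```python
# Coefficients as given in the worked example (y = a + b*x).
A = 4.675   # y-intercept, in kg
B = 0.92    # slope: change in predicted weight (kg) per year of age

def predict_weight(age_years):
    """Predicted weight in kg for a goat of the given age in years."""
    return A + B * age_years

print(round(predict_weight(8.5), 3))
```

The same function can be evaluated at other ages, although predictions far outside the range of ages in the sample should be treated with caution.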
MULTIPLE REGRESSION

• Multiple regression analysis is a straightforward extension of simple regression
analysis.
• It allows more than one independent variable.
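With k independent variables, the simple equation y = a + bx generalises in the usual way, as sketched below. The subscripted symbols x_1, ..., x_k and b_1, ..., b_k are notation introduced here, not taken from the lecture.

```latex
y = a + b_1 x_1 + b_2 x_2 + \dots + b_k x_k
```

Each coefficient b_j describes how the predicted value of y changes with x_j when the other independent variables are held fixed.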
REGRESSION VS CORRELATION

• Correlation describes the strength of a linear relationship between two variables.
• Regression tells us how to draw the straight line described by the correlation.
• Although they are studied together, there is a big difference between correlation
and regression: correlation gives a single measure of association, while regression
gives an equation for predicting one variable from the other.
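One way to see the difference is that correlation is symmetrical while regression is not. The sketch below, which uses hypothetical data and a hypothetical helper `r_and_slope`, computes r and the least squares slope with the roles of the two variables swapped. It also checks the standard identity that the product of the two slopes equals r squared.

```python
from math import sqrt, isclose

def r_and_slope(xs, ys):
    """Return (r, b): Pearson r and the least squares slope of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy), sxy / sxx

# Hypothetical data for illustration only.
xs = [1, 2, 3, 4, 5, 6]
ys = [4.0, 6.5, 7.0, 9.5, 10.0, 12.5]

r_xy, b_y_on_x = r_and_slope(xs, ys)   # regression of y on x
r_yx, b_x_on_y = r_and_slope(ys, xs)   # regression of x on y

print(isclose(r_xy, r_yx))                        # correlation is unchanged by the swap
print(isclose(b_y_on_x, b_x_on_y))                # the slopes differ
print(isclose(b_y_on_x * b_x_on_y, r_xy ** 2))    # product of slopes equals r squared
```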
