QT - Unit 2 - Part A - Correlation

Uploaded by

diyajakhar665

Course : BBA LL.B
Semester : I Semester
Subject : Quantitative Techniques
Subject Code : BBA LLB

Ms. Preeti Goel, Assistant Professor, MAIMS


CORRELATION
Correlation means association – more precisely, it measures
the extent to which two variables are related.

Correlation is used to measure the degree and extent to which two
variables fluctuate with reference to each other.
On the basis of degree
On the basis of number of variables
On the basis of linearity
Correlation and Causation
Correlation
The word correlation means that a statistical relationship exists between the
variables under investigation: a change in one variable is mathematically
related to a change in the other. Variables with no causal relationship can
still show a strong statistical correlation.

Causation
The word causation means that there is a cause-and-effect relationship
between the variables under investigation: a change in one variable brings
about a change in the other. For example, if I don’t study, I will fail the
exam. Alternatively, if I study, I will pass the exam. In this simple example,
the cause is ‘study,’ whereas ‘success in the exam’ is the effect.

E.g., growth from a child to an adult. When your height increased, your mass
increased too. Getting taller didn’t make you wider. Instead, maturing to
adulthood caused both variables to increase: maturation is the cause, while
height and mass are merely correlated with each other.
Correlation Coefficient
The quantitative measure of the strength of the linear relationship between two
variables is called the correlation coefficient. It is denoted by r.

It measures the extent to which the points cluster about a straight line.

Correlation coefficient ranges from -1 to +1.


Properties of Correlation Coefficient
METHODS OF CORRELATION

1. Scatter Diagram Method

2. Karl Pearson’s Coefficient of Correlation

3. Rank Correlation Method


SCATTER DIAGRAM METHOD
A Scatter diagram is a graphical presentation of bivariate
data on two axes X and Y.
Karl Pearson’s Coefficient of Correlation
Assumptions
Covariance
Covariance is a statistical measure of the joint variability of two variables.

The covariance indicates whether the two variables tend to move in the same
direction (positive covariance) or in opposite directions (negative
covariance). The covariance is denoted as Cov(X,Y) and the formulas for
covariance are given below.
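For reference, the usual textbook definition of covariance can be written as follows. This is a sketch that assumes the population form (division by N); the sample form divides by N - 1 instead. The symbols \bar{X} and \bar{Y} (the means of X and Y) are notation introduced here, not taken from the slides.

```latex
\operatorname{Cov}(X,Y)
  = \frac{1}{N}\sum_{i=1}^{N}\left(X_i-\bar{X}\right)\left(Y_i-\bar{Y}\right)
  = \frac{1}{N}\sum_{i=1}^{N}X_iY_i-\bar{X}\,\bar{Y}
```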
Karl Pearson’s Coefficient of Correlation

1. With Original Data

2. Deviations from Actual Mean

3. When Deviations are taken from Assumed Mean

4. From Covariance

5. For bivariate frequency distributions
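The first four computational forms listed above can be sketched in Python as follows. This is a minimal illustration, not the slides' own worked example: the function names, the sample data and the assumed means (2 and 5) are hypothetical, and the covariance form assumes population formulas (division by N). The bivariate frequency case is not covered.

```python
import math

def r_assumed_mean(x, y, ax, ay):
    """Pearson's r using deviations taken from assumed means ax and ay."""
    n = len(x)
    dx = [a - ax for a in x]
    dy = [b - ay for b in y]
    num = n * sum(p * q for p, q in zip(dx, dy)) - sum(dx) * sum(dy)
    den = (math.sqrt(n * sum(p * p for p in dx) - sum(dx) ** 2)
           * math.sqrt(n * sum(q * q for q in dy) - sum(dy) ** 2))
    return num / den

def r_original(x, y):
    """Pearson's r from the original data (assumed means of zero)."""
    return r_assumed_mean(x, y, 0, 0)

def r_actual_mean(x, y):
    """Pearson's r using deviations taken from the actual means."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    num = sum(p * q for p, q in zip(dx, dy))
    return num / math.sqrt(sum(p * p for p in dx) * sum(q * q for q in dy))

def r_from_covariance(x, y):
    """Pearson's r as Cov(X,Y) / (sigma_x * sigma_y), population form."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sd_x = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sd_y = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    return cov / (sd_x * sd_y)

# Hypothetical data, chosen only to show that all four forms agree.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
results = [
    r_original(x, y),
    r_actual_mean(x, y),
    r_assumed_mean(x, y, 2, 5),
    r_from_covariance(x, y),
]
print([round(r, 4) for r in results])
```

All four forms are algebraically equivalent, so the choice between them is a matter of computational convenience for the data at hand.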


Merits of Karl Pearson’s Method

1. This method not only indicates the presence or absence of correlation
between any two variables but also determines the exact extent, or
degree, to which they are correlated.
2. Under this method, we can also ascertain the direction of the correlation,
i.e. whether the correlation between the two variables is positive or
negative.
3. It takes into account all the items of the variables and is thus based on a
suitable measure of variation.
4. This method enables us to estimate the value of a dependent variable
with reference to a particular value of an independent variable through
regression equations.
5. This method has many algebraic properties that simplify the calculation of
the coefficient of correlation and of related measures such as the
coefficient of determination.
Demerits of Karl Pearson’s Method

1. It is comparatively difficult to calculate, as its computation involves
intricate algebraic methods of calculation.
2. It is strongly affected by the values of extreme items.
3. It is based on a large number of assumptions, viz. linear relationship,
cause and effect relationship etc., which may not always hold good.
4. It is very likely to be misinterpreted, particularly in the case of
homogeneous data.
5. In comparison to the other methods, it takes much time to arrive at the
results.
6. It is subject to probable error, as its propounder himself admits, and
it is therefore always advisable to compute its probable error while
interpreting its results.
Spearman’s Coefficient of Rank Correlation
WHEN RANKS ARE NOT EQUAL/NOT REPEATED

WHEN RANKS ARE EQUAL/REPEATED

• The value of the rank correlation coefficient will range from -1 to +1.
• Value of +1 indicates perfect association for identical rankings.
• Value of -1 indicates perfect association for reverse rankings.
Case I:
When Ranks are Not Repeated/Equal
Data is given in rank form and rank is not repeated
Data is given in variable form and rank is not repeated
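Both situations in Case I can be sketched in Python using the standard no-ties formula, 1 - 6Σd² / (n(n² - 1)), where d is the difference between paired ranks. The function names and the sample ranks are hypothetical, and ranking the largest value as 1 is an assumption made here for illustration.

```python
def ranks_descending(values):
    """Convert values to ranks (largest value = rank 1). Assumes no repeated values."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]

def spearman_no_ties(rank_x, rank_y):
    """Rank correlation from two lists of ranks with no repeated ranks."""
    n = len(rank_x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Data given in rank form (hypothetical ranks from two judges).
judge_a = [1, 2, 3, 4, 5]
judge_b = [2, 1, 4, 3, 5]
print(round(spearman_no_ties(judge_a, judge_b), 4))

# Data given in variable form: rank first, then apply the same formula.
marks_x = [35, 40, 25, 55, 85]
marks_y = [85, 70, 60, 50, 40]
print(spearman_no_ties(ranks_descending(marks_x), ranks_descending(marks_y)))
```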
Case II:
When Ranks are Repeated/Equal
• If there are two or more items with the same rank in either
series, then assign a common rank to each of the repeated items.
• The common rank is the average of the ranks those items would
otherwise occupy.
• E.g. two items are at rank 4. The common rank assigned to both
of those items would be 4.5 (average of 4 and 5).
• The next item gets the rank following those used in computing
the common rank. In the example above, the next item would be
assigned rank 6.

Where m is the number of times a rank is repeated.
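The common-rank rule and the role of m can be sketched in Python as follows. The sketch assumes the commonly taught correction, in which (m³ - m)/12 is added to Σd² once for every group of repeated values in either series, giving 1 - 6[Σd² + Σ(m³ - m)/12] / (n³ - n). The function names and sample data are hypothetical.

```python
from collections import Counter

def average_ranks(values):
    """Rank values (largest = rank 1); repeated values share the average rank."""
    ordered = sorted(values, reverse=True)
    ranks = []
    for v in values:
        first = ordered.index(v) + 1   # first rank occupied by this value
        m = ordered.count(v)           # number of times the value is repeated
        ranks.append(first + (m - 1) / 2)
    return ranks

def tie_correction(values):
    """Sum of (m^3 - m) / 12 over every group of repeated values."""
    return sum((m ** 3 - m) / 12 for m in Counter(values).values() if m > 1)

def spearman_with_ties(x, y):
    """Rank correlation with the correction factor for repeated ranks."""
    n = len(x)
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    correction = tie_correction(x) + tie_correction(y)
    return 1 - 6 * (sum_d2 + correction) / (n ** 3 - n)

# Two items tie at rank 4, so both get 4.5 and the next item gets rank 6.
scores = [90, 80, 70, 60, 60, 50]
print(average_ranks(scores))
```

When no value is repeated, the correction is zero and the function reduces to the no-ties formula.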


Merits of Spearman’s Rank Correlation Method
1. This method is easy to understand.
2. The correlation coefficient is easier to calculate by this method than by Karl
Pearson’s method.
3. When the related data is qualitative, this is the only method to find the measure of
correlation.
4. This is the only method that can be used where we are given the ranks and not the
actual data.
5. Even where actual data are given, the rank method can be applied for ascertaining
correlation.
6. When there is more dispersion in the related numerical data or extreme
observations are present in the data, Spearman’s method is preferred to Karl
Pearson’s method.

Demerits of Spearman’s Rank Correlation Method


1. This method does not provide as accurate a measure of the correlation coefficient
as Karl Pearson’s method.
2. It is tedious to assign ranks when the number of observations is large.
3. This method cannot be used for a bivariate frequency distribution.
4. In the case of a grouped frequency distribution, this method can’t be applied.
PROBABLE ERROR
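As a rough sketch, the probable error of r is conventionally computed as P.E. = 0.6745 (1 - r²) / √N, where N is the number of pairs of observations. The Python below assumes that formula together with the usual rule of thumb (r is treated as significant when it exceeds six times its probable error, and as not significant when it is smaller than the probable error). The function names and sample values are hypothetical.

```python
import math

def probable_error(r, n):
    """Probable error of a correlation coefficient r based on n pairs."""
    return 0.6745 * (1 - r ** 2) / math.sqrt(n)

def interpret(r, n):
    """Apply the conventional rule of thumb for the significance of r."""
    pe = probable_error(r, n)
    if abs(r) < pe:
        return "not significant"
    if abs(r) > 6 * pe:
        return "significant"
    return "inconclusive"

# Hypothetical example: r = 0.8 computed from 25 pairs of observations.
pe = probable_error(0.8, 25)
print(round(pe, 4), interpret(0.8, 25))
```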
Conditions for use of Probable Error
Pitfalls/Limitations of Correlation Analysis
The correlation analysis has certain limitations:
1. Two variables can have a strong non-linear relation and still have a very low
correlation. Recall that correlation is a measure of the linear relationship between
two variables.
2. The correlation can be unreliable when outliers are present.
3. The correlation may be spurious. Spurious correlation refers to the following
situations:
o The correlation between two variables that reflects chance relationships in a
particular data set.
o The correlation induced by a calculation that mixes each of two variables with a
third variable.
o The correlation between two variables arising not from a direct relation between
them, but from their relation to a third variable. Ex: shoe size and vocabulary of
school children. The third variable here is age. Larger shoe sizes simply indicate
older children, who have a better vocabulary.
4. Correlation does not imply cause and effect relationship.
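Limitations 1 and 2 can be illustrated with a short Python sketch. The data are hypothetical and chosen for clarity: in the first case y is exactly x squared, yet r is zero; in the second, a single extreme point turns a perfect negative correlation into a strong positive one.

```python
import math

def pearson_r(x, y):
    """Pearson's r using deviations from the actual means."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    num = sum(p * q for p, q in zip(dx, dy))
    return num / math.sqrt(sum(p * p for p in dx) * sum(q * q for q in dy))

# Limitation 1: a perfect non-linear (quadratic) relation with zero correlation.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [v ** 2 for v in x]
r_quadratic = pearson_r(x, y)

# Limitation 2: one outlier reverses the sign of the correlation.
x_clean, y_clean = [1, 2, 3, 4, 5], [5, 4, 3, 2, 1]
r_clean = pearson_r(x_clean, y_clean)
r_outlier = pearson_r(x_clean + [20], y_clean + [20])

print(r_quadratic, r_clean, round(r_outlier, 4))
```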
