Kendall Tau
Kendall's Tau rank correlation coefficient is a measure of the association between two scalar or
ordinal variables, say X and Y. It is a non-parametric measure of the extent to which the order of the
observations in X differs from the order of the observations in Y. It is similar to Spearman's rank
correlation coefficient and to Pearson's Product Moment Correlation Coefficient, or Pearson's r, in that
it measures the relationship between two variables. As with Spearman's coefficient and Pearson's r, a
negative correlation indicates that Y tends to decrease as X increases. Although Kendall's Tau is, like
Spearman's coefficient, a non-parametric measure of relationship, it differs in how the correlation value
is interpreted. Spearman's coefficient and Pearson's r are similar in magnitude, whereas Kendall's Tau
represents a probability: it is the difference between the probability that the observed data are in the
same order and the probability that the observed data are not in the same order.
It is also a measure of rank correlation: the similarity of the orderings of the data when ranked by each
of the quantities. It is named after Maurice Kendall, who developed it in 1938, though Gustav Fechner had
proposed a similar measure in the context of time series in 1897.
Advantages:
A pair {(xi, yi), (xj, yj)} is said to be tied if xi = xj or yi = yj; a tied pair is neither concordant nor discordant.
When tied pairs arise in the data, the coefficient may be modified in a number of ways to keep it in the
range [-1, 1]:
Tau-a
The Tau-a statistic tests the strength of association in a cross tabulation. Both variables have to be
ordinal. Tau-a makes no adjustment for ties. The equation can be written in several equivalent ways.
Equation 1 shows that Kendall's Tau is the difference between the proportion of concordant pairs and the
proportion of discordant pairs, because the denominator is the number of all possible pairs.
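Tau-a = (C - D) / n0     (Equation 1)
Or
Tau-a = (nc - nd) / (n(n-1)/2)
Where:
C/nc = Concordant Pairs
D/nd = Discordant Pairs
n0 = n(n-1)/2
A pair of observations (xi, yi) and (xj, yj) is concordant if both variables are ordered the same way, i.e.
xi > xj and yi > yj
or
xi < xj and yi < yj
and we consider it discordant if the two variables are ordered in opposite ways, i.e.
xi > xj but yi < yj
or
xi < xj but yi > yj
(different orders for variables X and Y)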
But there is an easier way to determine whether a pair is concordant or discordant. Sort the observations
by their rank on the first variable and compare each observation with those that follow it. A pair is
concordant when the later observation has the greater rank on the second variable. A pair is discordant
when the later observation has the lesser rank on the second variable. Equal ranks form a tied pair.
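The pair-counting definition above can be sketched in Python. This is a minimal illustration, not a reference implementation: the function names and the sample data are assumptions made for the example, and tied pairs are counted separately so that they are neither concordant nor discordant.

```python
from itertools import combinations


def count_pairs(x, y):
    """Return (concordant, discordant, tied) counts over all pairs of observations."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = tied = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        # The product is positive when X and Y order the pair the same way,
        # negative when they order it in opposite ways, zero when either is tied.
        product = (xi - xj) * (yi - yj)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
        else:
            tied += 1
    return concordant, discordant, tied


def tau_a(x, y):
    """Tau-a = (C - D) / n0, with n0 = n(n-1)/2 and no adjustment for ties."""
    n = len(x)
    c, d, _ = count_pairs(x, y)
    return (c - d) / (n * (n - 1) / 2)


# Hypothetical ranks for five observations (illustrative data only).
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
print(count_pairs(x, y))  # (8, 2, 0)
print(tau_a(x, y))        # 0.6
```

Of the 10 possible pairs, 8 are concordant and 2 are discordant, so Tau-a = (8 - 2) / 10 = 0.6.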
Tau-b
The Tau-b statistic, unlike Tau-a, makes adjustments for ties. Values of Tau-b range from -1 (100%
negative association, or perfect inversion) to +1 (100% positive association, or perfect agreement). A value
of zero indicates the absence of association.
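The Kendall Tau-b coefficient is defined as:
Tau-b = (nc - nd) / sqrt((n0 - n1)(n0 - n2))
Where:
nc = number of concordant pairs
nd = number of discordant pairs
n0 = n(n-1)/2
We can calculate
n1 = sum over i of ti(ti-1)/2
where
ti is the number of ties in the ith group of ties in variable X. Similarly for Y:
n2 = sum over j of uj(uj-1)/2, where uj is the number of ties in the jth group of ties in variable Y.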
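As an illustration of the tie adjustment, here is a minimal Python sketch of Tau-b. It assumes the standard form Tau-b = (nc - nd) / sqrt((n0 - n1)(n0 - n2)), where n1 and n2 sum t(t-1)/2 over the groups of tied values in X and Y. The helper names and the sample data are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def tie_term(values):
    """Sum of t(t-1)/2 over each group of equal values, where t is the group size."""
    return sum(t * (t - 1) // 2 for t in Counter(values).values())


def tau_b(x, y):
    """Kendall Tau-b: (nc - nd) / sqrt((n0 - n1)(n0 - n2))."""
    n = len(x)
    nc = nd = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        product = (xi - xj) * (yi - yj)
        if product > 0:
            nc += 1
        elif product < 0:
            nd += 1
        # A zero product is a tied pair: neither concordant nor discordant.
    n0 = n * (n - 1) // 2
    n1 = tie_term(x)  # tie adjustment for X
    n2 = tie_term(y)  # tie adjustment for Y
    denominator = sqrt((n0 - n1) * (n0 - n2))
    if denominator == 0:
        raise ValueError("Tau-b is undefined when a variable is constant")
    return (nc - nd) / denominator


# Hypothetical data with one tie in each variable (illustrative only).
x = [1, 2, 2, 3]
y = [1, 2, 3, 3]
print(tau_b(x, y))  # 0.8
```

Here nc = 4, nd = 0, n0 = 6 and n1 = n2 = 1, so Tau-b = 4 / sqrt(5 * 5) = 0.8. When there are no ties, n1 = n2 = 0 and the result equals Tau-a. If SciPy is available, `scipy.stats.kendalltau` reports Tau-b by default and can serve as a cross-check.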
Tau-c
Tau-c, also called Kendall-Stuart Tau-c, differs from Tau-b in being more suitable for rectangular
tables than for square tables. It equals the excess of concordant over discordant pairs, multiplied by a
term representing an adjustment for the size of the table.
Tau-c = (C - D)*[2m/(n^2(m-1))]
Where:
m = the number of rows or columns, whichever is smaller
n = the sample size.
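The Tau-c formula can be sketched in Python for a cross tabulation whose rows and columns are ordered categories. This is an illustrative sketch under that assumption; the function names and the 2 x 2 table are hypothetical.

```python
def concordant_discordant(table):
    """Count concordant and discordant pairs in a table of cell frequencies."""
    rows, cols = len(table), len(table[0])
    c = d = 0
    for i in range(rows):
        for j in range(cols):
            for k in range(i + 1, rows):      # cells in a later row
                for l in range(cols):
                    if l > j:                 # later row and later column: concordant
                        c += table[i][j] * table[k][l]
                    elif l < j:               # later row but earlier column: discordant
                        d += table[i][j] * table[k][l]
    return c, d


def tau_c(table):
    """Tau-c = (C - D) * [2m / (n^2 (m - 1))], with m = min(rows, columns)."""
    c, d = concordant_discordant(table)
    n = sum(sum(row) for row in table)
    m = min(len(table), len(table[0]))
    if m < 2:
        raise ValueError("the table needs at least two rows and two columns")
    return (c - d) * (2 * m) / (n ** 2 * (m - 1))


# Hypothetical 2 x 2 table of counts (illustrative only).
table = [[3, 1],
         [1, 3]]
print(concordant_discordant(table))  # (9, 1)
print(tau_c(table))                  # 0.5
```

With C = 9, D = 1, n = 8 and m = 2, the adjustment term is 2*2 / (64*1), so Tau-c = 8 * 4 / 64 = 0.5.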
Hypothesis:
Ho: The variables are not correlated/There is no ordered relationship between the ordered distributions of
categories.
Ha: The variables are correlated/There is an ordered relationship between the ordered distributions of
categories.
Decision Rule
Reject the null hypothesis if the computed Tau is not equal to zero.
Calculating Kendall's Tau by hand is tedious and is rarely done without a computer. With a large
dataset it becomes almost impossible to do manually.
EXAMPLES:
Hypothesis:
Change in Testosterone    Display
1.16                      5.40
0.81                      3.80
1.06                      3.00
1.01                      4.80
0.96                      3.60
1.07                      3.60
0.90                      3.40
1.23                      5.20
Decision Rule:
Reject the null hypothesis if the computed Tau is greater than zero.
If the relationship were perfect and positive, then we would expect that the person who had the
lowest score for change in testosterone would also have the lowest score for display. Each person who
had a lower display score than that person would make the correlation worse. The same is true for the
second person, and the third person, and so on, for every possible pair of people. To calculate the
Kendall tau-a correlation, all we do is count the pairs that are concordant with the theory and the
pairs that are discordant with the theory.
We have already put the variables in order, in Table 1. This isn't strictly necessary, but it makes our life
easier. Taking the first person, who is ranked 1 for change in testosterone, how many people are ranked
above that person for display? These are concordant, and the answer is 7 people, so C = 7. The number
of discordant people, who are ranked below, is zero, so D = 0.
Take the second person. 6 people are ranked above that person, and they are concordant, so C = 6,
and 1 person (the person ranked 4th in display) is equal, so they are discordant, D = 1. We keep doing this
for each person, but we can make our lives easier by putting this into a table, which is shown in Table 2.
For each pair of people, we say whether the scores are concordant, in which case we give them a C, or
discordant, in which case we give them a D.
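A minimal Python sketch of this person-by-person tally, assuming the people are already sorted by their rank on change in testosterone. The display ranks below are hypothetical values chosen for illustration (they are not the values from Table 1) and contain no ties. The function name is likewise illustrative.

```python
def tally_by_person(second_ranks):
    """For each person, count the later people ranked above (C) and below (D)."""
    rows = []
    for i, own_rank in enumerate(second_ranks):
        later = second_ranks[i + 1:]
        c = sum(1 for rank in later if rank > own_rank)  # concordant pairs
        d = sum(1 for rank in later if rank < own_rank)  # discordant pairs
        rows.append((c, d))
    return rows


# Hypothetical display ranks, listed in order of rank on change in testosterone.
display_ranks = [1, 3, 2, 5, 4, 6, 8, 7]
rows = tally_by_person(display_ranks)
for person, (c, d) in enumerate(rows, start=1):
    print(f"Person {person}: C = {c}, D = {d}")

total_c = sum(c for c, _ in rows)
total_d = sum(d for _, d in rows)
n = len(display_ranks)
tau = (total_c - total_d) / (n * (n - 1) / 2)
print(total_c, total_d, round(tau, 3))  # 25 3 0.786
```

Each person is compared only with the people listed after them, so every pair is counted exactly once and the C and D totals add up to n(n-1)/2 = 28.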
We count the total number of Cs, and find there are 21. We count the number of Ds, and find there are
7. Kendall's tau-a can be computed by the following formula:
Tau-a = (C - D) / (n(n-1)/2) = (21 - 7) / 28 = 0.5
Decision Remarks:
We reject the null hypothesis because the computed tau is 0.5, which is not equal to zero.
Conclusion:
We conclude that there is a correlation between the two variables, namely a person's change in
testosterone and that person's score for display.
2. The following data represent a tutor's ranking of ten clinical psychology students as to
their suitability for their career and their knowledge of psychology.
Psychology: 5, 8, 6, 2, 10, 3, 9, 4, 7, 1
Hypothesis:
Ho: The variables are not correlated.
Ha: The variables are correlated.
Decision Rule:
Reject the null hypothesis if the computed Tau is not equal to zero.
Computation:
Pair      C     D
Pair 1    4     5
Pair 2    0     8
Pair 3    4     3
Pair 4    5     1
Pair 5    0     5
Pair 6    3     1
Pair 7    2     1
Pair 8    1     1
Pair 9    0     1
Total     19    26
Formula:
Tau-a = (C - D) / (n(n-1)/2)
= (19 - 26) / 45
= -0.156
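As a quick arithmetic check, the per-pair tallies listed in the computation above can be summed in Python. The variable names are illustrative, and the values are copied from the computation rather than recomputed from the rankings.

```python
# Per-pair tallies as listed in the computation (Pair 1 through Pair 9).
concordant_per_pair = [4, 0, 4, 5, 0, 3, 2, 1, 0]
discordant_per_pair = [5, 8, 3, 1, 5, 1, 1, 1, 1]

n = 10                          # ten students
total_pairs = n * (n - 1) // 2  # 45 possible pairs

c = sum(concordant_per_pair)    # 19
d = sum(discordant_per_pair)    # 26

# With no tied pairs, every pair is either concordant or discordant.
assert c + d == total_pairs

tau = (c - d) / total_pairs
print(c, d, round(tau, 3))      # 19 26 -0.156
```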
Decision Remarks:
Reject the null hypothesis since the computed tau, -0.156, is not equal to zero.
Conclusion:
We can conclude that there is a correlation or relationship between the ranking of the career suitability
and psychology knowledge of the students. The tutor tended to rank students with apparently greater
knowledge as more suitable to their career than those with apparently less knowledge and vice versa.