Kendall Tau

The key takeaways are that Kendall's Tau is a non-parametric measure of rank correlation used to assess the association between two ordinal variables. There are different variants of Kendall's Tau that make adjustments for ties in the data.

The different types of Kendall's Tau discussed are Tau-a, Tau-b, and Tau-c. Tau-a does not adjust for ties, Tau-b makes adjustments for ties, and Tau-c is more suitable for rectangular tables than square tables.

Kendall's Tau is calculated by counting the number of concordant and discordant pairs in the data and dividing the difference between these counts by the total number of pairs. A pair is concordant when both variables rank its two observations in the same order, and discordant when they rank them in opposite orders.

Kendall's Tau Rank Correlation Coefficient or Tau Test

Kendall's Tau rank correlation coefficient is a measure of the association between two scalar or
ordinal variables, say X and Y. It is a non-parametric statistic and simply measures the extent to which the
order of the observations in X differs from the order of the observations in Y. It is similar to Spearman's
rank correlation and Pearson's Product Moment Correlation Coefficient, or Pearson's r, in that it measures the
relationship between two variables. As with Spearman's and Pearson's r, a negative correlation indicates
that Y tends to decrease as X increases. Although it is similar to Spearman's in being a
non-parametric measure of relationship, it differs in the interpretation of the correlation value. The magnitudes
of Spearman's and Pearson's r are interpreted in a similar way. Kendall's Tau, however, represents a probability:
it is the difference between the probability that the observed data are in the same order and the
probability that the observed data are not in the same order.
It is also a measure of rank correlation: the similarity of the orderings of the data when ranked by each
of the quantities. It is named after Maurice Kendall, who developed it in 1938, though Gustav Fechner had
proposed a similar measure in the context of time series in 1897.

Advantages:

Spearman's rank correlation is satisfactory for testing the null hypothesis of independence between two variables, but it is difficult to interpret when the null hypothesis is rejected. Kendall's Tau improves upon this by reflecting the strength of dependence between the variables being compared.
It is more appropriate in square tables.
It can deal with ties.
It has a simpler interpretation.

Accounting for Ties

A pair $\{(x_i, y_i), (x_j, y_j)\}$ is said to be tied if $x_i = x_j$ or $y_i = y_j$; a tied pair is neither concordant nor discordant.
When tied pairs arise in the data, the coefficient may be modified in a number of ways to keep it in the
range [-1, 1]:

Tau-a
The Tau-a statistic tests the strength of association of the cross tabulations. Both variables have to be
ordinal. Tau-a makes no adjustment for ties. There are several ways to write the equation. Equation
1 shows Kendall's Tau as the difference between the proportion of concordant pairs and the proportion of
discordant pairs, because the denominator is the number of all possible pairs.

$$\tau_a = \frac{C - D}{n(n-1)/2} \qquad (1)$$

Or

$$\tau_a = \frac{n_c - n_d}{n_0}$$

Where:
$C$ or $n_c$ = the number of concordant pairs
$D$ or $n_d$ = the number of discordant pairs
$n$ = the number of observations
$n_0 = n(n-1)/2$

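For each pair of observations $(x_1, y_1)$ and $(x_2, y_2)$, we consider the pair concordant if $x_1 > x_2$ and $y_1 > y_2$, or $x_1 < x_2$ and $y_1 < y_2$ (i.e. observations 1 and 2 are in the same order for variables X and Y). The pair is considered discordant if $x_1 > x_2$ but $y_1 < y_2$, or $x_1 < x_2$ but $y_1 > y_2$ (in this case observations 1 and 2 are in different orders for variables X and Y).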

There is an easier way to determine whether a pair is concordant or discordant. Once the observations are
sorted by their rank on the first variable, a pair is concordant when the later observation's rank on the
second variable is greater than the earlier observation's rank, and discordant when it is less. Equal ranks
form a tie, which counts as neither.
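The pair counting behind Tau-a can be sketched in Python. This is a minimal illustration rather than a reference implementation: the function names count_pairs and tau_a are made up for this example, and tied pairs are simply left out of both counts, since Tau-a makes no adjustment for them.

```python
from itertools import combinations


def count_pairs(x, y):
    """Return (C, D): the numbers of concordant and discordant pairs."""
    c = d = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        s = (xi - xj) * (yi - yj)
        if s > 0:      # both variables order the pair the same way
            c += 1
        elif s < 0:    # the variables order the pair in opposite ways
            d += 1
        # s == 0 means a tie in x or y: neither concordant nor discordant
    return c, d


def tau_a(x, y):
    """Tau-a = (C - D) / (n(n-1)/2), with no adjustment for ties."""
    n = len(x)
    c, d = count_pairs(x, y)
    return (c - d) / (n * (n - 1) / 2)


# Hypothetical ranks for four observations
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
print(count_pairs(x, y))  # (5, 1)
print(tau_a(x, y))        # (5 - 1) / 6, about 0.667
```

Checking every pair takes n(n-1)/2 comparisons, which is fine for small samples like the ones in the examples.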

Tau-b
The Tau-b statistic, unlike Tau-a, makes adjustments for ties. Values of Tau-b range from -1 (100%
negative association, or perfect inversion) to +1 (100% positive association, or perfect agreement). A value
of zero indicates the absence of association.
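The Kendall Tau-b coefficient is defined as:

$$\tau_b = \frac{n_c - n_d}{\sqrt{(n_0 - n_1)(n_0 - n_2)}}$$

Where:
$n_1$ = the number of pairs with a tie in variable X.
$n_2$ = the number of pairs with a tie in variable Y.

We can calculate

$$n_1 = \sum_i \frac{t_i (t_i - 1)}{2}$$

where $t_i$ is the number of tied values in the ith group of ties in variable X. Similarly for Y.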
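The tie adjustment can be sketched in Python as follows. This is an illustrative sketch under the usual definition of Tau-b, which is written out in the docstring; the names tied_pairs and tau_b are made up for this example, and the code assumes that neither variable is constant (otherwise the denominator is zero).

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def tied_pairs(values):
    """Number of tied pairs: sum of t(t-1)/2 over each group of t equal values."""
    return sum(t * (t - 1) // 2 for t in Counter(values).values())


def tau_b(x, y):
    """Tau-b = (nc - nd) / sqrt((n0 - n1)(n0 - n2)).

    n0 = n(n-1)/2, n1 = tied pairs in x, n2 = tied pairs in y.
    Assumes neither x nor y is constant.
    """
    n = len(x)
    n0 = n * (n - 1) // 2
    n1 = tied_pairs(x)
    n2 = tied_pairs(y)
    nc = nd = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        s = (xi - xj) * (yi - yj)
        if s > 0:
            nc += 1
        elif s < 0:
            nd += 1
    return (nc - nd) / sqrt((n0 - n1) * (n0 - n2))


# Hypothetical data with one tie in each variable
x = [1, 2, 2, 3]
y = [1, 2, 3, 3]
print(tau_b(x, y))  # 4 / sqrt(5 * 5) = 0.8
```

When there are no ties, n1 and n2 are both zero and the result is the same as Tau-a.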

Tau-c
Tau-c, also called Kendall-Stuart Tau-c, differs from Tau-b in being more suitable for rectangular
tables than for square tables. It equals the excess of concordant over discordant pairs, multiplied by a
term representing an adjustment for the size of the table.
$$\tau_c = (C - D) \cdot \frac{2m}{n^2 (m - 1)}$$
Where:
m = the number of rows or columns, whichever is smaller
n = the sample size.
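For a cross tabulation, C and D can be counted directly from the cell frequencies. The Python sketch below is only an illustration: the function names and the 2 x 3 table are made up, and the rows and columns are assumed to be listed in increasing order of their categories.

```python
def concordant_discordant(table):
    """Count concordant (C) and discordant (D) pairs in a table of counts."""
    rows, cols = len(table), len(table[0])
    c = d = 0
    for i in range(rows):
        for j in range(cols):
            for k in range(i + 1, rows):      # cells in a lower row
                for l in range(cols):
                    if l > j:                 # lower and to the right
                        c += table[i][j] * table[k][l]
                    elif l < j:               # lower and to the left
                        d += table[i][j] * table[k][l]
    return c, d


def tau_c(table):
    """Tau-c = (C - D) * 2m / (n^2 (m - 1)), m = min(rows, columns)."""
    c, d = concordant_discordant(table)
    n = sum(sum(row) for row in table)
    m = min(len(table), len(table[0]))
    return (c - d) * 2 * m / (n ** 2 * (m - 1))


# Hypothetical 2 x 3 table of counts
table = [[10, 5, 0],
         [2, 8, 5]]
print(concordant_discordant(table))  # (155, 10)
print(tau_c(table))                  # 145 * 4 / 900, about 0.644
```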

Hypothesis:
Ho: The variables are not correlated; there is no ordered relationship between the ordered distributions of
categories.
Ha: The variables are correlated; there is an ordered relationship between the ordered distributions of
categories.

Decision Rule
Reject the null hypothesis if the computed Tau is not equal to zero.
Calculating Kendall's Tau by hand is very tedious and is rarely done without a computer. With large
datasets it is practically impossible to do manually.
EXAMPLES:

1. This example has no ties, so Tau-a is used.


Table 1: Short version of Roney, et al, data, ranked and sorted.

Change in Testosterone   Display   Ranked Change in Testosterone   Ranked Display
1.16                     5.40
.81                      3.80
1.06                     3.00
1.01                     4.80
.96                      3.60
1.07                     3.60
.90                      3.40
1.23                     5.20

Hypothesis:
Ho: The variables are not correlated.
Ha: The variables are correlated.

Decision Rule:
Reject the null hypothesis if the computed Tau is not equal to zero.

If the relationship were perfect and positive, then we would expect that the person who had the
lowest score for change in testosterone would also have the lowest score for display. The more people
who had a display score lower than that person's display score, the worse the correlation would
be. The same is true for the second person, and the third person, and so on, for every possible pair of
people. To calculate the Kendall tau-a correlation, all we do is count the pairs that are concordant with the
theory and those that are discordant with it.
We have already put the variables in order in Table 1. This isn't strictly necessary, but it makes our life
easier. Taking the first person, who is ranked 1 for change in testosterone, how many people are ranked
above that person for display? These are concordant and the answer is 7 people, so C = 7. The number
of discordant people, who are ranked below, is zero, so D = 0.
Take the second person. 6 people are ranked above that person, and they are concordant, so C = 6,
and 1 person (the person ranked 4th in display) is ranked below, so they are discordant, D = 1. We keep doing this
for each person, but we can make our lives easier by putting this into a table, which is shown in Table 2.
For each pair of people, we say whether the scores are concordant, in which case we give them a C, or
discordant, in which case we give them a D.
We count the total number of Cs and find there are 21. We count the number of Ds and find there are
7. Kendall's tau-a can be computed by the following formula:

$$\tau_a = \frac{C - D}{n(n-1)/2}$$

Where C and D represent the number of Cs and Ds, and n is the number of people.

In our data, with n = 8:

$$\tau_a = \frac{21 - 7}{8(8-1)/2} = \frac{14}{28} = 0.50$$

Decision Remarks:
We reject the null hypothesis because the computed tau is 0.50, which is not equal to zero.
Conclusion:

We conclude that there is a correlation between the two variables, namely a person's change in
testosterone and that person's score for display.

2. The following data represent a tutor's ranking of ten clinical psychology students as to their suitability for their career and their knowledge of psychology:

Career       4  10   3   1   9   2   6   7   8   5
Psychology   5   8   6   2  10   3   9   4   7   1

Here, there are no ties so again, we will use Tau-a.

Hypothesis:
Ho: The variables are not correlated.
Ha: The variables are correlated.
Decision Rule:
Reject the null hypothesis if the computed Tau is not equal to zero.

Computation:
Sort the students by career rank (Rank 1). For each student in turn, count how many of the students
further down the list have a higher psychology rank (C) and how many have a lower one (D).

          Rank 1 (Career)   Rank 2 (Psychology)    C    D
Pair 1           1                   2             8    1
Pair 2           2                   3             7    1
Pair 3           3                   6             4    3
Pair 4           4                   5             4    2
Pair 5           5                   1             5    0
Pair 6           6                   9             1    3
Pair 7           7                   4             3    0
Pair 8           8                   7             2    0
Pair 9           9                  10             0    1

Sum all Cs and Ds.

C = 34
D = 11

Formula:

$$\tau_a = \frac{C - D}{n(n-1)/2} = \frac{34 - 11}{45} = 0.511$$

Decision Remarks:
Reject the null hypothesis since the computed tau is greater than zero.
Conclusion:
We can conclude that there is a positive relationship between the ranking of the career suitability
and psychology knowledge of the students. The tutor tended to rank students with apparently greater
knowledge as more suitable to their career than those with apparently less knowledge and vice versa.
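The sort-and-count procedure can also be written as a short Python sketch, which is a convenient way to check a hand count. This is only an illustration using the career and psychology ranks listed for this example; the helper name row_counts is made up, and the code assumes there are no ties.

```python
career = [4, 10, 3, 1, 9, 2, 6, 7, 8, 5]
psychology = [5, 8, 6, 2, 10, 3, 9, 4, 7, 1]


def row_counts(first, second):
    """Sort by the first ranking, then count (C, D) for each row.

    For each student, C is the number of later students with a higher
    rank on the second variable and D the number with a lower rank.
    Assumes no ties in either ranking.
    """
    ordered = [s for _, s in sorted(zip(first, second))]
    rows = []
    for i, s in enumerate(ordered[:-1]):   # the last student has no one after
        later = ordered[i + 1:]
        rows.append((sum(v > s for v in later), sum(v < s for v in later)))
    return rows


rows = row_counts(career, psychology)
C = sum(c for c, _ in rows)
D = sum(d for _, d in rows)
n = len(career)
tau = (C - D) / (n * (n - 1) / 2)

for number, (c, d) in enumerate(rows, start=1):
    print(f"Pair {number}: C = {c}, D = {d}")
print(f"C = {C}, D = {D}, tau-a = {tau:.3f}")
```

Sorting first means each pair is visited exactly once, so the C and D totals always add up to n(n-1)/2 when there are no ties.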
