Circular Data Analysis
Chapter 230
Circular Data
Analysis
Introduction
This procedure computes summary statistics, generates rose plots and circular histograms, computes hypothesis
tests appropriate for one, two, and several groups, and computes the circular correlation coefficient for circular
data.
Angular data, recorded in degrees or radians, is generated in a wide variety of scientific research areas. Examples
of angular (and cyclical) data include daily wind directions, ocean current directions, departure directions of
animals, direction of bone-fracture plane, and orientation of bees in a beehive after stimuli.
The usual summary statistics, such as the sample mean and standard deviation, cannot be used with angular
values. For example, consider the average of the angles 1° and 359°. The simple average is 180°, which points
directly away from both observations; 0° is clearly a better answer. Because of this and other problems, a special
set of techniques has been developed for analyzing angular data. This procedure implements many of those techniques.
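The 1° and 359° example can be checked numerically. The sketch below is only an illustration, not NCSS code: the helper name `mean_direction_deg` is hypothetical, and it treats each angle as a unit vector and takes the direction of the vector sum.

```python
import math

def mean_direction_deg(angles_deg):
    """Direction (degrees, in [0, 360]) of the sum of unit vectors."""
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(s, c)) % 360.0

angles = [1.0, 359.0]
linear_mean = sum(angles) / len(angles)      # ordinary average: 180.0
circular_mean = mean_direction_deg(angles)   # within rounding of 0 (equivalently 360)
```

Because of floating-point rounding, the circular result may come out as a value just below 360 rather than exactly 0; both describe the same direction.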
© NCSS, LLC. All Rights Reserved.
NCSS Statistical Software NCSS.com
Technical Details
Suppose a sample of n angles $a_1, a_2, \ldots, a_n$ is to be summarized. It is assumed that these angles are in degrees.
Fisher (1993) and Mardia & Jupp (2000) contain definitions of various summary statistics that are used for
angular data. These results will be presented next. Let
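$$C_p = \sum_{i=1}^{n} \cos(p a_i), \quad \bar{C}_p = \frac{C_p}{n}, \quad S_p = \sum_{i=1}^{n} \sin(p a_i), \quad \bar{S}_p = \frac{S_p}{n},$$
$$R_p = \sqrt{C_p^2 + S_p^2}, \quad \bar{R}_p = \frac{R_p}{n}$$
$$T_p = \begin{cases}
\tan^{-1}\left(\dfrac{S_p}{C_p}\right) & C_p > 0,\ S_p > 0 \\
\tan^{-1}\left(\dfrac{S_p}{C_p}\right) + \pi & C_p < 0 \\
\tan^{-1}\left(\dfrac{S_p}{C_p}\right) + 2\pi & S_p < 0,\ C_p > 0
\end{cases}$$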
To interpret these quantities it may be useful to imagine that each angle represents a vector of length one in the
direction of the angle. Suppose these individual vectors are arranged so that the beginning of the first vector is at
the origin, the beginning of the second vector is at the end of the first, the beginning of the third vector is at the
end of the second, and so on. We can then imagine a single vector a that will stretch from the origin to the end of
the last observation.
$R_1$, called the resultant length, is the length of a. $\bar{R}_1$ is the mean resultant length of a. Note that $\bar{R}_1$ varies
between zero and one and that a value of $\bar{R}_1$ near one implies that there was little variation in the values of the
angles.
The mean direction, θ , is a measure of the mean of the individual angles. θ is estimated by T1 .
The circular variance, V, measures the variation in the angles about the mean direction. V varies from zero to one.
The formula for V is
$$V = 1 - \bar{R}_1$$
The circular standard deviation, v, is defined as
$$v = \sqrt{-2 \ln(\bar{R}_1)}$$
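The summary quantities defined above can be computed in a few lines. This is a minimal sketch, not the NCSS implementation; the function name `circular_summary` is hypothetical, and converting v from radians to degrees for reporting is an assumption made here. The function fails if the mean resultant length is exactly zero, since the logarithm is then undefined.

```python
import math

def circular_summary(angles_deg):
    """Return T1 (degrees), R1, R-bar, V and v for angles given in degrees."""
    n = len(angles_deg)
    c1 = sum(math.cos(math.radians(a)) for a in angles_deg)  # C_1
    s1 = sum(math.sin(math.radians(a)) for a in angles_deg)  # S_1
    r1 = math.hypot(c1, s1)                 # resultant length R_1
    r_bar = min(r1 / n, 1.0)                # mean resultant length, clipped for rounding
    t1 = math.degrees(math.atan2(s1, c1)) % 360.0   # mean direction T_1
    v_var = 1.0 - r_bar                     # circular variance V
    # Assumption: v = sqrt(-2 ln R-bar) is in radians and is converted to degrees.
    v_sd = math.degrees(math.sqrt(-2.0 * math.log(r_bar)))
    return {"T1": t1, "R1": r1, "R_bar": r_bar, "V": v_var, "v": v_sd}

summary = circular_summary([0.0, 90.0])
```

Using `atan2` covers all three quadrant cases of the $T_p$ definition in one call.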
$$\sigma = \sqrt{\frac{n(1 - H)}{4R^2}}$$
$$H = \frac{1}{n}\left[\cos(2T_1)\sum_{i=1}^{n}\cos(2a_i) + \sin(2T_1)\sum_{i=1}^{n}\sin(2a_i)\right]$$
Tests of Uniformity
Uniformity refers to the situation in which all values around the circle are equally likely. Occasionally, it is useful
to test whether a set of data departs from the uniform distribution. Several tests of
uniformity have been developed. Note that when any of the following tests rejects the null hypothesis, we can conclude that the
data were not uniform. However, when the test does not reject, we cannot conclude that the data follow the
uniform distribution. Rather, we do not have enough evidence to reject the null hypothesis of uniformity.
Rayleigh Test
The Rayleigh test, discussed in Mardia & Jupp (2000) pages 94-95, is the score test and the likelihood ratio test
for uniformity within the von Mises distribution family. The Rayleigh test statistic is $2n\bar{R}^2$. For large samples, the
distribution of this statistic under uniformity is a chi-square with two degrees of freedom with an error of
approximation of $O(n^{-1})$. A closer approximation to the chi-square with two degrees of freedom is achieved by
the modified Rayleigh test. This test, which has an error of $O(n^{-2})$, is calculated as follows.
$$S^* = \left(1 - \frac{1}{2n}\right) 2n\bar{R}^2 + \frac{n\bar{R}^4}{2}$$
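As a rough illustration of the two statistics, the sketch below computes $2n\bar{R}^2$ and $S^*$ from a list of angles. The function name `rayleigh_statistics` is hypothetical. The probability level uses the fact that the upper tail of a chi-square with two degrees of freedom is exp(-x/2); applying it to $S^*$ is the large-sample approximation described above, not necessarily how NCSS computes its reported value.

```python
import math

def rayleigh_statistics(angles_deg):
    """Return (Rayleigh statistic, modified statistic S*, approximate p-value)."""
    n = len(angles_deg)
    c_bar = sum(math.cos(math.radians(a)) for a in angles_deg) / n
    s_bar = sum(math.sin(math.radians(a)) for a in angles_deg) / n
    r_bar_sq = c_bar ** 2 + s_bar ** 2            # squared mean resultant length
    rayleigh = 2.0 * n * r_bar_sq                 # 2 n R-bar^2
    modified = (1.0 - 1.0 / (2.0 * n)) * rayleigh + n * r_bar_sq ** 2 / 2.0
    p_value = math.exp(-modified / 2.0)           # chi-square(2) upper tail
    return rayleigh, modified, p_value

concentrated = rayleigh_statistics([30.0] * 10)           # all angles identical
spread = rayleigh_statistics([0.0, 90.0, 180.0, 270.0])   # evenly spread
```

A tightly concentrated sample gives a large statistic and a small probability level, while evenly spread angles give a statistic near zero.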
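$$V = V_n\left(\sqrt{n} + 0.155 + \frac{0.24}{\sqrt{n}}\right)$$
where
$$V_n = \max_{i=1,\ldots,n}\left(\frac{a_{(i)}}{360} - \frac{i}{n}\right) - \min_{i=1,\ldots,n}\left(\frac{a_{(i)}}{360} - \frac{i}{n}\right) + \frac{1}{n}$$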
Published critical values of V are
V Alpha
1.537 0.150
1.620 0.100
1.747 0.050
1.862 0.025
2.001 0.010
This table was used to create an interpolation formula from which the alpha values are calculated.
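The statistic V can be sketched as follows. The function name `kuiper_statistic` is hypothetical, the angles are assumed to be in degrees and are reduced to [0, 360) before sorting, and the interpolation of alpha from the table is not reproduced because its formula is not given here.

```python
import math

def kuiper_statistic(angles_deg):
    """Return (V_n, V) computed from the sorted angles a_(1) <= ... <= a_(n)."""
    u = sorted((a % 360.0) / 360.0 for a in angles_deg)   # a_(i) / 360
    n = len(u)
    diffs = [u[i] - (i + 1) / n for i in range(n)]        # a_(i)/360 - i/n
    v_n = max(diffs) - min(diffs) + 1.0 / n
    v = v_n * (math.sqrt(n) + 0.155 + 0.24 / math.sqrt(n))
    return v_n, v

v_n, v = kuiper_statistic([45.0, 135.0, 225.0, 315.0])
reject_at_5_percent = v > 1.747   # critical value for alpha = 0.050 from the table
```

For these four evenly spaced angles V is far below every tabled critical value, so uniformity is not rejected.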
Watson Test
The following uniformity test is outlined in Mardia & Jupp (2000) pages 103-105. The test is conducted by calculating
$U^2$ and comparing it to a table of values. If the calculated value is greater than the critical value, the null
hypothesis of uniformity is rejected. Note that the test is only valid for samples of at least eight angles.
The calculation of $U^2$ is as follows
$$U^2 = \sum_{i=1}^{n}\left[u_{(i)} - \bar{u} - \frac{i - \frac{1}{2}}{n} + \frac{1}{2}\right]^2 + \frac{1}{12n}$$
where
$$\bar{u} = \frac{\sum_{i=1}^{n} u_{(i)}}{n}, \quad u_{(i)} = \frac{a_{(i)}}{360}$$
and $a_{(1)} \le a_{(2)} \le a_{(3)} \le \cdots \le a_{(n)}$ are the sorted angles. Note that maximum likelihood estimates of κ and θ are used
in the distribution function. Mardia & Jupp (2000) present a table of critical values that has been entered into
NCSS. When a value of $U^2$ is calculated, the table is interpolated to determine its significance level.
Published critical values of $U^2$ are
U^2 Alpha
0.131 0.150
0.152 0.100
0.187 0.050
0.221 0.025
0.267 0.010
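A minimal sketch of the $U^2$ calculation for the uniformity test is shown below. The function name `watson_u2` is hypothetical, angles are assumed to be in degrees, and the table interpolation is left out. The small example uses only four angles to keep the arithmetic visible, although the test itself is stated to need at least eight.

```python
def watson_u2(angles_deg):
    """Watson's U^2 against the uniform distribution on the circle."""
    u = sorted((a % 360.0) / 360.0 for a in angles_deg)   # u_(i) = a_(i) / 360
    n = len(u)
    u_bar = sum(u) / n
    # With 0-based index i, the term (i - 1/2)/n of the formula is (i + 0.5)/n.
    total = sum((u[i] - u_bar - (i + 0.5) / n + 0.5) ** 2 for i in range(n))
    return total + 1.0 / (12.0 * n)

u2_even = watson_u2([45.0, 135.0, 225.0, 315.0])   # evenly spaced: sum term is 0
reject_at_5_percent = u2_even > 0.187              # critical value for alpha = 0.050
```

Subtracting the mean of the $u_{(i)}$ makes the statistic unchanged when every angle is shifted by the same amount, which is why it suits circular data.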
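$$f(a; \theta, \kappa) = \frac{1}{2\pi I_0(\kappa)}\exp\left[\kappa\cos(a - \theta)\right]$$
where $I_p(x)$ (the modified Bessel function of the first kind and order p) is defined by
$$I_p(x) = \sum_{r=0}^{\infty}\frac{1}{(r+p)!\,r!}\left(\frac{x}{2}\right)^{2r+p}, \quad p = 0, 1, 2, \ldots$$
In particular
$$I_0(x) = \sum_{r=0}^{\infty}\frac{1}{(r!)^2}\left(\frac{x}{2}\right)^{2r} = \frac{1}{2\pi}\int_0^{2\pi} e^{x\cos(\theta)}\,d\theta$$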
The parameter θ is the mean direction and the parameter κ is the concentration parameter.
The distribution is unimodal. It is symmetric about θ. It appears as a normal distribution that is truncated at plus
and minus 180 degrees. When κ is zero, the von Mises distribution reduces to the uniform distribution. As κ gets
large, the von Mises distribution approaches the normal distribution.
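The density and the Bessel series can be evaluated directly. The following is a sketch under stated assumptions: the function names `bessel_i` and `von_mises_pdf` are hypothetical, angles are taken in radians, and the series is truncated at 60 terms, which is adequate for moderate κ but not for very large values.

```python
import math

def bessel_i(p, x, terms=60):
    """Modified Bessel function of the first kind, order p, by truncated series."""
    return sum((x / 2.0) ** (2 * r + p) / (math.factorial(r + p) * math.factorial(r))
               for r in range(terms))

def von_mises_pdf(a, theta, kappa):
    """Von Mises density at angle a (radians), mean direction theta, concentration kappa."""
    return math.exp(kappa * math.cos(a - theta)) / (2.0 * math.pi * bessel_i(0, kappa))

uniform_density = von_mises_pdf(0.3, 1.0, 0.0)   # kappa = 0 gives 1 / (2 pi)
# Midpoint-rule check that the density integrates to one over the circle.
steps = 1000
area = sum(von_mises_pdf(2.0 * math.pi * (k + 0.5) / steps, 1.0, 2.0)
           for k in range(steps)) * 2.0 * math.pi / steps
```

With κ set to zero the density is constant at 1/(2π), which is the uniform case mentioned above.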
Point Estimation
The maximum likelihood estimate of θ is the sample mean direction. That is, $\hat{\theta} = T_1$.
The maximum likelihood estimate of κ is the solution to
$$A_1(\kappa) = \bar{R}$$
where
$$A_1(x) = \frac{I_1(x)}{I_0(x)}.$$
That is, the MLE of κ is given by
$$\kappa^* = A_1^{-1}(\bar{R})$$
This can be approximated by (see Fisher (1993) page 88 and Mardia & Jupp (2000) pages 85-86)
$$\kappa^* = \begin{cases}
2\bar{R} + \bar{R}^3 + \dfrac{5\bar{R}^5}{6} & \bar{R} < 0.53 \\
-0.4 + 1.39\bar{R} + \dfrac{0.43}{1 - \bar{R}} & 0.53 \le \bar{R} < 0.85 \\
\dfrac{1}{3\bar{R} - 4\bar{R}^2 + \bar{R}^3} & \bar{R} \ge 0.85
\end{cases}$$
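In small samples this estimate can be badly biased. The bias is corrected by using the following modified estimator.
$$\hat{\kappa} = \begin{cases}
\max\left(\kappa^* - \dfrac{2}{n\kappa^*},\ 0\right) & \kappa^* < 2,\ n \le 15 \\
\dfrac{(n-1)^3\kappa^*}{n(n^2+1)} & \kappa^* \ge 2,\ n \le 15 \\
\kappa^* & n > 15
\end{cases}$$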
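The piecewise approximation and the small-sample correction translate directly into code. This is a sketch rather than the NCSS routine; the function names `kappa_star` and `kappa_corrected` are hypothetical, and the guard for a zero estimate is an added safeguard against division by zero. The approximation is undefined when the mean resultant length equals one.

```python
def kappa_star(r_bar):
    """Piecewise approximation to the MLE of kappa from the mean resultant length."""
    if r_bar < 0.53:
        return 2.0 * r_bar + r_bar ** 3 + 5.0 * r_bar ** 5 / 6.0
    if r_bar < 0.85:
        return -0.4 + 1.39 * r_bar + 0.43 / (1.0 - r_bar)
    return 1.0 / (3.0 * r_bar - 4.0 * r_bar ** 2 + r_bar ** 3)

def kappa_corrected(r_bar, n):
    """Apply the small-sample (n <= 15) bias correction to kappa_star."""
    k = kappa_star(r_bar)
    if n > 15:
        return k
    if k < 2.0:
        # Added guard: avoid dividing by zero when the estimate is exactly 0.
        return max(k - 2.0 / (n * k), 0.0) if k > 0.0 else 0.0
    return (n - 1) ** 3 * k / (n * (n ** 2 + 1))

example = kappa_corrected(0.9324, 10)   # roughly 5.54
```

The inputs 0.9324 and n = 10 are the rounded R bar and sample size shown for group 1 in the Statistical Summary report of the example, so the result is close to, but not exactly, the Kappa value printed there.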
$$R^* = \sqrt{[S^*]^2 + [C^*]^2}$$
Score Test
The score test, given by Mardia & Jupp (2000) page 123, is computed as
$$\chi_S^2 = \frac{n\kappa}{A_1(\kappa)}\left(S^*\right)^2$$
For large n, $\chi_S^2$ follows the chi-square distribution with one degree of freedom.
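$$\chi_L^2 = \begin{cases}
\dfrac{4n\left[(R^*)^2 - (C^*)^2\right]}{2 - (C^*)^2} & \text{if } n \ge 5 \text{ and } C^* \le 2/3 \\
\dfrac{2n^3}{n^2 + (nC^*)^2 + 3n}\log\left(\dfrac{1 - (C^*)^2}{1 - (R^*)^2}\right) & \text{if } n \ge 5 \text{ and } C^* > 2/3
\end{cases}$$
The test statistic, $\chi_L^2$, follows a chi-square distribution with one degree of freedom.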
The test statistic, F, follows an F distribution with one and n-1 degrees of freedom.
Stephens Test
This test, given by Fisher (1993) pages 93-94, is computed as
$$E = \frac{\sin(T_1 - \theta_0)}{\sqrt{1 / (n\kappa\bar{R})}}$$
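$$\begin{cases}
T_1 \pm \cos^{-1}\left[\sqrt{\dfrac{2n\left(2R^2 - n z_\alpha^2\right)}{R^2\left(4n - z_\alpha^2\right)}}\right] & \text{if } \bar{R} \le 2/3 \\
T_1 \pm \cos^{-1}\left[\dfrac{\sqrt{n^2 - \left(n^2 - R^2\right)\exp\left(z_\alpha^2 / n\right)}}{R}\right] & \text{if } \bar{R} > 2/3
\end{cases}$$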
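$$\bar{R} < 1 - \frac{\chi^2_{n-1;\alpha}}{2n}\left(\frac{1}{\kappa_0} + \frac{3}{8\kappa_0^2}\right)$$
When testing $\kappa = \kappa_0$ versus $\kappa > \kappa_0$, reject the null hypothesis if
$$\bar{R} > 1 - \frac{\chi^2_{n-1;1-\alpha}}{2n}\left(\frac{1}{\kappa_0} + \frac{3}{8\kappa_0^2}\right)$$
These tests are based on the result that
$$\frac{2n(1 - \bar{R})}{\dfrac{1}{\kappa_0} + \dfrac{3}{8\kappa_0^2}} \sim \chi^2_{n-1}$$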
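The chi-square quantity on which these tests rest is simple to evaluate. The sketch below uses a hypothetical function name, `concentration_chi_square`, and computes only the statistic; the probability levels would additionally need a chi-square distribution function with n-1 degrees of freedom, which is not included here.

```python
def concentration_chi_square(r_bar, n, kappa0):
    """Statistic 2n(1 - R-bar) / (1/kappa0 + 3/(8 kappa0^2)), referred to chi-square(n-1)."""
    scale = 1.0 / kappa0 + 3.0 / (8.0 * kappa0 ** 2)
    return 2.0 * n * (1.0 - r_bar) / scale

# Rounded inputs taken from group 1 of the example report: R bar = 0.9324, n = 10, kappa0 = 2.
chi_square = concentration_chi_square(0.9324, 10, 2.0)   # about 2.28
```

With these rounded inputs the result is close to the Chi-Square Value shown for group 1 in the concentration test report of the example; the small difference comes from the rounding of R bar.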
$$d = \frac{n(1 - \bar{R})}{\chi^2_{n-1,\alpha/2}}$$
where
$$\bar{p} = \frac{\sum_{i=1}^{n} p_{(i)}}{n}$$
$$p_{(i)} = F_\kappa\left(a_{(i)} - T_1\right)$$
$a_{(1)} \le a_{(2)} \le a_{(3)} \le \cdots \le a_{(n)}$ are the sorted angles and $F_\kappa(a - \theta)$ is the cumulative distribution function of the von
Mises distribution. Note that maximum likelihood estimates of κ and θ are used in the distribution function.
Lockhart & Stephens (1985) present a table of critical values that has been entered into NCSS. When a value of
$U^2$ is calculated, the table is interpolated to determine its significance level.
Cox Test
Mardia & Jupp (2000) pages 142-143 present a von Mises goodness-of-fit test that was originally given by Cox
(1975).
The test statistic, C, is distributed as a chi-squared variable with two degrees of freedom under the null hypothesis
that the data follow the von Mises distribution. It is calculated as follows.
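$$C = \frac{s_c^2}{n v_c(\kappa)} + \frac{s_s^2}{n v_s(\kappa)}$$
where
$$s_c = \sum_{i=1}^{n}\cos 2(a_i - T_1) - n\alpha_2(\kappa)$$
$$s_s = \sum_{i=1}^{n}\sin 2(a_i - T_1)$$
$$v_c(x) = \frac{1 + \alpha_4}{2} - \alpha_2^2 - \frac{\left[\alpha_1/2 + \alpha_3/2 - \alpha_1\alpha_2\right]^2}{(1 + \alpha_2)/2 - \alpha_1^2}$$
$$v_s(x) = \frac{\alpha_1 - \alpha_4}{2} - \frac{(\alpha_1 - \alpha_3)^2}{1 - \alpha_2}$$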
Multi-Group Tests
Three multi-group tests are available for testing hypotheses about two or more groups. The nonparametric
uniform-scores test tests whether the distributions of the groups are identical. The Watson-Williams F test tests
whether the mean directions are equal, assuming that each group follows a von Mises distribution and that the
concentrations are unknown but equal. The concentration homogeneity test tests whether the concentration
parameters are equal, given that the groups each follow a von Mises distribution.
$$W_g = 2\sum_{i=1}^{g}\frac{C_{Ri}^2 + S_{Ri}^2}{n_i}$$
where $C_{Ri} = \sum_{j=1}^{n_i}\cos(\gamma_{ij})$, $S_{Ri} = \sum_{j=1}^{n_i}\sin(\gamma_{ij})$, $n = \sum_{i=1}^{g} n_i$, and $\gamma_{ij}$ are the circular ranks of the corresponding angles.
If all $n_i$ are greater than 10, $W_g$ is approximately distributed as a chi-square with 2g-2 degrees
of freedom.
Since ranks are used in this test, ties become an issue. We have adopted the strategy of applying average ranks.
Note that little has been done to test the adoption of this strategy within the realm of circular statistics.
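The uniform-scores statistic with average ranks for ties might be sketched as follows. Two points are assumptions made for this illustration, since they are not spelled out in the text above: the circular rank is taken as 2π times the rank in the pooled sample divided by n, and the function name `uniform_scores_statistic` is hypothetical.

```python
import math

def uniform_scores_statistic(groups):
    """W_g for a list of groups, each a list of angles in degrees."""
    pooled = sorted(a % 360.0 for g in groups for a in g)
    n = len(pooled)
    # Average 1-based rank for each distinct value (handles ties).
    rank = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank[pooled[i]] = (i + 1 + j) / 2.0   # mean of ranks i+1 .. j
        i = j
    w = 0.0
    for g in groups:
        # Assumption: circular rank gamma = 2 * pi * rank / n.
        gammas = [2.0 * math.pi * rank[a % 360.0] / n for a in g]
        c = sum(math.cos(x) for x in gammas)
        s = sum(math.sin(x) for x in gammas)
        w += (c * c + s * s) / len(g)
    return 2.0 * w

w_separated = uniform_scores_statistic([[10.0, 20.0], [30.0, 40.0]])
```

The tiny groups here only make the arithmetic easy to follow; the chi-square approximation is stated to need more than 10 observations per group.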
$$F_{WW} = \left(1 + \frac{3}{8\kappa}\right)\frac{\left(\sum_{j=1}^{g} R_j - R\right) / (g - 1)}{\left(n - \sum_{j=1}^{g} R_j\right) / (n - g)}$$
$F_{WW}$ is approximately distributed as an F with g-1 and n-1 degrees of freedom when the
assumptions that $\kappa_1 = \kappa_2 = \cdots = \kappa_g$ and that the distributions are von Mises are made. The approximation also
requires that $\kappa \ge 1$.
where $w_j = \dfrac{4(n_j - 4)}{3}$ and $f_j = \sin^{-1}\left(2\bar{R}_j\sqrt{3/8}\right)$
Case II. $0.45 \le \bar{R} \le 0.70$
$U_2$ is approximately distributed as a chi-square with g-1 degrees of freedom
$$U_2 = \sum_{j=1}^{g} w_j h_j^2 - \frac{\left(\sum_{j=1}^{g} w_j h_j\right)^2}{\sum_{j=1}^{g} w_j}$$
where $w_j = \dfrac{n_j - 3}{0.797449}$ and $h_j = \sinh^{-1}\left(\dfrac{\bar{R}_j - 1.089}{0.258}\right)$
where $\nu_j = n_j - 1$, $\nu = n - g$, and $d = \dfrac{1}{3(g-1)}\left[\sum_{j=1}^{g}\dfrac{1}{\nu_j} - \dfrac{1}{\nu}\right]$.
$$\sum_{k=1}^{n}\sin^2(a_{1k} - T_{1,1})\sum_{k=1}^{n}\sin^2(a_{2k} - T_{2,1})$$
where $T_{1,1}$ is the mean direction of the first circular variable and $T_{2,1}$ is the mean direction of the second.
The significance of this correlation coefficient can be tested using the fact that $z_r$ is approximately distributed as a
standard normal, where
$$z_r = r_c\sqrt{\frac{n\lambda_{20}\lambda_{02}}{\lambda_{22}}}$$
and
$$\lambda_{ij} = \frac{1}{n}\sum_{k=1}^{n}\sin^i(a_{1k} - T_{1,1})\sin^j(a_{2k} - T_{2,1})$$
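The $\lambda_{ij}$ terms and $z_r$ can be computed as sketched below. The function names are hypothetical, and because the full expression for $r_c$ does not survive in the text above, the sketch assumes the usual sine-based form, the sum of products of sines divided by the square root of the product of the sums of squared sines, which equals $\lambda_{11} / \sqrt{\lambda_{20}\lambda_{02}}$.

```python
import math

def mean_direction_rad(angles_rad):
    """Mean direction (radians) of a list of angles in radians."""
    return math.atan2(sum(math.sin(a) for a in angles_rad),
                      sum(math.cos(a) for a in angles_rad))

def circular_correlation(a1_deg, a2_deg):
    """Return (r_c, z_r) for two paired lists of angles in degrees."""
    x = [math.radians(a) for a in a1_deg]
    y = [math.radians(a) for a in a2_deg]
    n = len(x)
    sx = [math.sin(v - mean_direction_rad(x)) for v in x]   # sin(a_1k - T_1,1)
    sy = [math.sin(v - mean_direction_rad(y)) for v in y]   # sin(a_2k - T_2,1)

    def lam(i, j):
        return sum(p ** i * q ** j for p, q in zip(sx, sy)) / n

    # Assumption: r_c = lambda_11 / sqrt(lambda_20 * lambda_02).
    r_c = lam(1, 1) / math.sqrt(lam(2, 0) * lam(0, 2))
    z_r = r_c * math.sqrt(n * lam(2, 0) * lam(0, 2) / lam(2, 2))
    return r_c, z_r

first = [10.0, 25.0, 40.0, 70.0, 95.0]
r_same, z_same = circular_correlation(first, first)
r_mirror, z_mirror = circular_correlation(first, [-a for a in first])
```

Correlating a variable with itself gives a coefficient of one, and correlating it with its mirror image gives minus one, which is a quick sanity check on the sign convention.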
Data Structure
The data consist of one or more variables. Each variable contains a set of angular values. The rows may be
separated into groups using the unique values of an optional grouping variable. An example of a dataset
containing circular data is Circular1.S0. Missing values are entered as blanks (empty cells).
Data Type
The data type of the plot is specified independently of the data type specified on the Variables tab of the Circular
Data Analysis procedure.
Direction
This option indicates whether the orientation of the plot is in a 'Clockwise' or 'Counter-Clockwise' direction.
Interior Objects
The two choices for plot styles are Rose Plot and Circular Histogram.
Group Display
When the data are grouped, this option determines whether the petals within a bin are side-by-side, stacked
upon each other, or overlaid.
Side-by-Side
The bin width is divided equally by the number of groups and the petals are laid out sequentially in the bin.
Although the petals are narrower, they still encompass the points of the group that fall within the boundaries of the
whole bin.
Stacked
A single petal in each bin is divided by the number of groups. Rose plots with the group display set to Stacked
may be misleading because the proportional area is larger for the outside groups.
Overlaid
Each petal for each group is overlaid in each bin. Some degree of transparency is recommended when using the
Overlaid group display. It is also difficult to distinguish groups when there are more than 2 or 3 groups.
Petal Width
Specify the percent of the total width of each bin that is to be used for each petal.
Number of Bins
Specify the number of bins for the circle.
References Tab
Direction References
The options in this section allow you to specify the tick marks and references going around the plot.
Magnitude References
The options in this section allow you to specify the tick marks and references going from the center to the outside
of the plot.
Setup
To run this example, complete the following steps:
Option Value
Variables Tab
Data Variables ........................................ Wind
Grouping Variable ................................... Group
Hypothesized Theta ................................ 40
Hypothesized Kappa .............................. 2
                            Mean                   Circular
        Sample  Mean        Resultant   Circular   Standard    Circular    Von Mises
        Size    Direction   Length      Variance   Deviation   Dispersion  Concentration
Group   (N)     (Theta)     (R bar)     (V)        (v)         (Delta)     (Kappa)
1       10      41.5869     0.9324      0.0676     21.4299     0.1449      5.5452
2       10      42.6725     0.9599      0.0401     16.3991     0.0768      9.1850
Notes:
Theta is the mean direction.
R bar is a measure of data concentration. 0 <= R bar <= 1. R bar = 1 implies high concentration.
V = 1 - R bar, so 0 <= V <= 1. V = 0 implies high concentration.
v is the circular analog of the linear standard deviation.
Delta is another measure of circular spread.
Kappa is the concentration parameter of the Von Mises distribution.
Group
This is the group (or variable) presented on this line.
Sample Size
This is the number of nonmissing values in this group.
Mean Direction
This is the estimated mean direction, $T_1$.
Circular Variance
The circular variance, V, is a measure of variation in the data. Note that $V = 1 - \bar{R}_1$.
Circular Dispersion
The circular dispersion, $\delta = \dfrac{1 - T_2}{2R_1^2}$, is another measure of variation.
Notes:
This large-sample confidence interval does not require a von Mises distribution assumption.
This report provides the large sample confidence interval for the mean direction as described by Upton &
Fingleton (1989) page 220. Note that this interval does not require the assumption that the data come from the von
Mises distribution.
                            Circular
        Sample  Circular    Standard    Circular
        Size    Variance    Deviation   Dispersion  Skewness   Kurtosis
Group   (N)     (V)         (v)         (Delta)     (s)        (k)
1       10      0.0676      21.4299     0.1449      -0.0795    -1.7244
2       10      0.0401      16.3991     0.0768      -4.8582    5.4248
Notes:
These statistics are designed to assess and compare the variation in the data.
V = 1 - R bar, so 0 <= V <= 1. V = 0 implies high concentration.
v is the circular analog of the linear standard deviation.
Delta is another measure of circular spread.
For symmetric, unimodal data sets, the skewness is close to zero.
For von Mises data sets, the kurtosis is close to zero.
This report provides measures of data variation and dispersion which were defined in the Statistical Summary
Report. It also provides measures of the skewness and kurtosis of the data.
Skewness
This is a measure of the skewness (lack of symmetry about the mean) in the data. Symmetric, unimodal datasets
have a skewness value near zero.
Kurtosis
This is a measure of the kurtosis (peakedness) in the data. Von Mises datasets have a kurtosis near zero.
Notes:
These large-sample confidence intervals assume that the data follow a von Mises distribution.
The confidence interval for kappa requires the estimated kappa to be > 2.
This report provides estimates and confidence intervals of the parameters (mean direction and concentration) of
the von Mises distribution that best fits the data. Note that the von Mises distribution is a symmetric, unimodal
distribution. You should check the rose plot or circular histogram to determine if the data are symmetric.
The formulas used in the estimation and confidence intervals were given earlier in this chapter. They come from
Mardia & Jupp (2000).
Notes:
These values are used in the calculation of other statistics.
This report provides summary statistics that are used in other calculations.
Mean Cos(a)
This is $\bar{C}_1 = \dfrac{1}{n}\sum_{i=1}^{n}\cos(a_i)$.
Mean Sin(a)
This is $\bar{S}_1 = \dfrac{1}{n}\sum_{i=1}^{n}\sin(a_i)$.
Mean Cos(2a)
This is $\bar{C}_2 = \dfrac{1}{n}\sum_{i=1}^{n}\cos(2a_i)$.
Mean Sin(2a)
This is $\bar{S}_2 = \dfrac{1}{n}\sum_{i=1}^{n}\sin(2a_i)$.
R bar
This is $\bar{R}_1 = \dfrac{1}{n}\sqrt{C_1^2 + S_1^2}$.
2R bar
This is $\bar{R}_2 = \dfrac{1}{n}\sqrt{C_2^2 + S_2^2}$.
Theta, 2 Theta
This is calculated using the following formula with p set to 1 and then 2, respectively.
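$$T_p = \begin{cases}
\tan^{-1}\left(\dfrac{S_p}{C_p}\right) & C_p > 0,\ S_p > 0 \\
\tan^{-1}\left(\dfrac{S_p}{C_p}\right) + \pi & C_p < 0 \\
\tan^{-1}\left(\dfrac{S_p}{C_p}\right) + 2\pi & S_p < 0,\ C_p > 0
\end{cases}$$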
Null                                                     Test       Prob     Reject H0
Hypothesis (H0)        Test Name                         Statistic  Level    at 0.05 Level
Equal Distributions    Uniform Scores Test               6.7392     0.0344   Yes
Equal Directions       Watson-Williams F Test            0.0147     0.9047   No
Equal Concentrations   Concentration Homogeneity Test    0.5717     0.4496   No
Notes:
These statistics test various hypotheses about the parameters of von Mises distributions.
They require that each group follow the von Mises distribution.
The Uniform Scores test requires samples of at least 10.
The Watson-Williams F-test assumes that all kappa's are equal and that their average is > 1.
This report provides tests for three hypotheses about the features of several von Mises datasets. That is, it
provides a test of whether the distributions are identical, whether the mean directions are identical, and whether
the concentrations are identical. These tests are documented in the Technical Details section of this chapter.
Notes:
These statistics test various hypotheses about the parameters of von Mises distributions.
They require that each group follow the von Mises distribution.
Equal distributions tested by the Mardia-Watson-Wheeler uniform scores test. Requires all Ni > 10.
Equal directions tested by the Watson-Williams F test. Assumes Von Mises data with equal kappa's, all > 1.
Equal concentrations tested by concentration homogeneity test. Assumes Von Mises data.
This report provides the same three tests as the Multiple-Group Hypothesis Tests Section, taken two groups at a
time. It allows you to pinpoint where differences occur.
Tests for a Specified Mean Direction Assuming Von Mises Data – Test Statistic & Prob Levels
Tests for a Specified Mean Direction Assuming Von Mises Data - Test Statistics - Wind ─────────────
Notes:
These procedures test whether the mean direction is equal to a specified value, when kappa (concentration) is
unknown.
They assume that the data follow the von Mises distribution.
The Score Test requires a large sample size.
The Likelihood Ratio Test requires a sample size of at least 5.
The Watson & Williams Test requires a large value of kappa.
The Stephens Test requires kappa to be greater than 2.
Tests for a Specified Mean Direction Assuming Von Mises Data - Probability Levels - Wind ───────────
Notes:
This report gives the probability levels of the test statistics displayed in the previous report.
Although the probability levels of four tests are given, you should use only one of these.
This section reports the results of four tests of the hypothesis that the mean direction of a particular group is equal
to a specific value. These are two-sided tests. They were documented earlier in this chapter.
The first table gives the values of the test statistics. The second table gives the probability levels. The null
hypothesis is rejected when the probability level is less than 0.05 (or some other appropriate cutoff).
                                                            Prob        Prob
        Sample  Actual          H0                          Level of    Level of
        Size    Concentration   Concentration   Chi-Square  (H1: Kappa  (H1: Kappa
Group   (N)     (Kappa)         (Kappa0)        Value       < Kappa0)   > Kappa0)
1       10      5.5452          2.0000          2.2756      0.0137      0.9863
2       10      9.1850          2.0000          1.3518      0.0019      0.9981
Notes:
These statistics test whether the kappa (concentration) parameter is equal to the specified value.
The tests require that the estimated kappa is > 2.
This section reports the results of two one-sided tests of the hypothesis that the concentration parameter of each
group is equal to a specific value. They were documented earlier in this chapter.
The first probability level is for testing the null hypothesis that kappa is greater than or equal to kappa0. The
second probability level is for testing the null hypothesis that kappa is less than or equal to kappa0.
Notes:
The tests in this report assess the goodness-of-fit of the uniform distribution.
The Rayleigh test requires samples of at least 20.
The Kuiper and Watson tests require samples of at least 8.
This section reports the results of three goodness-of-fit tests for the uniform distribution. They were documented
earlier in this chapter.
These tests may be viewed as testing whether the data are distributed uniformly around the circle.
Notes:
The tests in this report assess the goodness-of-fit of the von Mises distribution.
Both tests require samples of at least 20.
This section reports the results of two goodness-of-fit tests for the von Mises distribution. They were documented
earlier in this chapter. Several hypothesis tests assume that the data follow a von Mises distribution. These tests
allow you to check the accuracy of this assumption.
Rose Plots
Combined Rose Plot ─────────────────────────────────────────────────────────
These plots show the distribution of the data around the circle.
Circular Histograms
Combined Rose Plot ─────────────────────────────────────────────────────────
The circular histograms are generated by clicking the Plot Format buttons and setting Interior Objects to “Circular
Histogram.”