Fitting to the Power-Law Distribution

Michel L. Goldstein, Steven A. Morris, Gary G. Yen


School of Electrical and Computer Engineering, Oklahoma State University, Stillwater, OK 74078
(Receipt date: 02/11/2004)

This paper reviews and compares methods of fitting power-law distributions and methods
to test goodness-of-fit of power-law models. It is shown that the maximum likelihood
estimation (MLE) and Bayesian methods are far more reliable for estimation than using
graphical fitting on log-log transformed data, which is the most commonly used fitting
technique. The Kolmogorov-Smirnov (KS) goodness-of-fit test is explained and a table
of KS values designed for the power-law distribution is given. The techniques presented
here will advance the application of complex network theory by allowing reliable
estimation of power-law models from data and further allowing quantitative assessment
of goodness-of-fit of proposed power-law models to empirical data.

PACS Number(s): 02.50.Ng, 05.10.Ln, 89.75.-k

I. INTRODUCTION

In recent years, a significant amount of research has focused on showing that many
physical and social phenomena follow a power-law distribution. Some examples of these
phenomena are the World Wide Web [1], metabolic networks [2], Internet router
connections [3], journal paper reference networks [4], and sexual contact networks [5].
Often, simple graphical methods are used for establishing the fit of empirical data to a
power-law distribution. Such graphical analysis can be erroneous, especially for data
plotted on a log-log scale. In this scale, a pure power law distribution appears as a straight
line in the plot with a constant slope.
The pure power-law distribution, known as the zeta distribution, or discrete Pareto
distribution [6] is expressed as:

$$p(k) = \frac{k^{-\gamma}}{\zeta(\gamma)} \qquad (1)$$

where: k is an integer usually measuring some variable of interest, e.g., number of links
per network node
p(k) is the probability of observing the value k;
γ is the power-law exponent;
ζ(γ) is the Riemann zeta function.
Without a quantitative measure of goodness of fit, it is difficult to make final
conclusions about how well the data approximates a power-law distribution. Moreover, a
quantitative analysis of the goodness of fit enables the identification of possible
interesting external phenomena that could be causing the distribution to deviate from a
power-law. In some cases the underlying process may not actually generate power-law
distributed data, but outside influences, such as data collection techniques, may cause the
data to appear as power-law distributed. Quantitative assessment of the goodness-of-fit to
the power-law distribution can help identify these cases.
In the remainder of this paper, Section II discusses the methods for fitting a power
law. Section III presents two candidate goodness-of-fit tests of a power-law distribution,
the Kolmogorov-Smirnov test, and the χ2 goodness-of-fit test. Section IV illustrates the
application of fitting and goodness-of-fit testing to analysis of a series of collections of
journal papers. Finally, Section V presents conclusions about the problem of fitting
power-law distributions and discusses some possible further analysis that can be
implemented.

II. FITTING POWER-LAW DISTRIBUTIONS

Many methods exist in the theory of parameter estimation that can be used for
estimating the exponent of the power-law distribution [7]. This section overviews three
methods, namely maximum likelihood estimation (MLE), Bayesian Estimation, and
linear regression-based methods.

In some cases the head of the distribution may deviate from a power-law, while the
tail appears to be a power-law. A good example of this is the distribution of outbound
links on a webpage [8]. Most pages have only a few links, but some have many more,
especially pages that list other interesting pages, also called hubs [9]. On a log-log
plot, the tail of the outbound-link distribution appears to be linear, suggesting a
“power-law tail.”
It is necessary to have a strict definition of a power-law tail, and define estimators and
tests for this distribution. It is important to note that the tail usually contains only a small
fraction of the data. Thus, no statistical methods may be available to accurately estimate
the power-law exponent, or even determine that the distribution has a power-law tail. An
analysis of this uncertainty was recently performed by Jones and Handcock [10]. The
scope of this paper is limited to fitting power-law functions to an
entire distribution and applying goodness-of-fit tests for validation and comparison. A
deeper analysis for tail distributions would require first an analysis of the basis of a tail-
only distribution and is beyond the scope of this paper.

A. Maximum likelihood estimation (MLE)


MLE is often used for estimating the exponent of a power-law distribution [6]. It is
based on finding the maximum value of the likelihood function:

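$$l(\gamma \mid x) = \prod_{i=1}^{N} \frac{x_i^{-\gamma}}{\zeta(\gamma)}$$

$$L(\gamma \mid x) = \log l(\gamma \mid x) = \sum_{i=1}^{N} \left( -\gamma \log x_i - \log \zeta(\gamma) \right) = -\gamma \sum_{i=1}^{N} \log x_i - N \log \zeta(\gamma) \qquad (2)$$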

where: l(γ|x) is the likelihood function of γ given the data x;
L(γ|x) is the log-likelihood function.
The log-likelihood is used because it simplifies the calculation and, because the
logarithm is monotonically increasing, it does not move the point where
the maximum is attained. For the zeta distribution, this maximum can be obtained
analytically by finding the root of the derivative of the log-likelihood function:

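$$\frac{d}{d\gamma} L(\gamma \mid x) = -\sum_{i=1}^{N} \log x_i - N \frac{1}{\zeta(\gamma)} \frac{d}{d\gamma} \zeta(\gamma) = 0$$

$$\Rightarrow\; -\frac{\zeta'(\gamma)}{\zeta(\gamma)} = \frac{1}{N} \sum_{i=1}^{N} \log x_i \qquad (3)$$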

where: ζ′(γ) is the derivative of the Riemann zeta function.

A table of values of the ratio ζ′(γ)/ζ(γ) can be found in [11], or values can
be generated with most modern mathematical and engineering calculation programs.


Calculation of MLE is very fast and robust; however it only offers a single estimate
of the power-law exponent without information to define the confidence interval of that
estimate. In order to deal with this deficiency, a Bayesian estimator, discussed below, can
be used.
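The MLE procedure above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names (`zeta`, `log_likelihood`, `fit_mle`), the search interval, and the use of a ternary search on Equation (2) instead of root-finding on Equation (3) are all assumptions made here. The zeta function is approximated with a direct sum plus an Euler-Maclaurin tail correction so that the sketch needs only the standard library.

```python
import math

def zeta(s, terms=100):
    """Approximate the Riemann zeta function for s > 1.

    The first terms-1 summands are added directly; the remaining tail is
    approximated with an Euler-Maclaurin correction.
    """
    head = sum(k ** -s for k in range(1, terms))
    tail = (terms ** (1.0 - s) / (s - 1.0)
            + 0.5 * terms ** -s
            + s * terms ** (-s - 1.0) / 12.0)
    return head + tail

def log_likelihood(gamma, sum_log, n):
    """Equation (2): L = -gamma * sum(log x_i) - N * log(zeta(gamma))."""
    return -gamma * sum_log - n * math.log(zeta(gamma))

def fit_mle(data, lo=1.0001, hi=20.0, iterations=100):
    """Estimate gamma by maximising the log-likelihood.

    The log-likelihood is concave in gamma, so a ternary search on the
    (assumed) interval [lo, hi] converges to the maximum.
    """
    n = len(data)
    sum_log = sum(math.log(x) for x in data)
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if log_likelihood(m1, sum_log, n) < log_likelihood(m2, sum_log, n):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)
```

Note that the data enter the estimate only through N and the sum of log x_i, which is why the computation is fast. A call such as `fit_mle([1, 1, 2, 1, 3, 1, 1, 5])` returns a single point estimate, with no indication of its uncertainty.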

B. Bayesian estimation
A Bayesian estimator, derived from MLE, differs from MLE in the meaning of the
parameter estimate [12]. In a Bayesian approach, the unknown parameter is not a single
value, but a distribution, called the posterior distribution. Moreover, this approach
incorporates what is known about the values of the parameter before any data is analyzed,
speeding up convergence and assuring that the final estimate is constrained to values that
are deemed reasonable. This is done by defining a prior distribution p(γ). The
final posterior distribution is given by a normalized multiplication of the prior and the
likelihood. For the power-law distribution:

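$$p(\gamma \mid x) = \frac{p(x \mid \gamma)\, p(\gamma)}{\int_{-\infty}^{\infty} p(x \mid \gamma)\, p(\gamma)\, d\gamma} = \frac{l(\gamma \mid x)\, p(\gamma)}{\int_{-\infty}^{\infty} l(\gamma \mid x)\, p(\gamma)\, d\gamma} \propto \left( \prod_{i=1}^{N} \frac{x_i^{-\gamma}}{\zeta(\gamma)} \right) p(\gamma) \qquad (4)$$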

The choice of the prior, as mentioned, relates to the range and, possibly, the
distribution in this range. Moreover, the prior also defines possible discretization of the
estimated parameter. A discrete prior always generates a discrete posterior distribution.

Another good feature of the Bayesian estimation process is that it is naturally
adaptable to iterative estimation, where one sample or a subset of the samples is analyzed
at a time. This is particularly useful if it is interesting to analyze the influence of each of
the samples, and if the amount of memory available for implementation is low (the only
thing that has to be stored is the posterior at each step).
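The iterative Bayesian update described above can be sketched as follows. This is an illustrative sketch under stated assumptions: the prior is taken to be uniform on a discrete grid of exponents from 1.5 to 4.0 in steps of 0.01 (the paper does not specify its prior), and the names `make_grid`, `uniform_log_prior`, `update`, and `credible_interval` are hypothetical. Only the current log-posterior on the grid is stored between batches.

```python
import math

def zeta(s, terms=100):
    """Approximate the Riemann zeta function for s > 1 (Euler-Maclaurin tail)."""
    head = sum(k ** -s for k in range(1, terms))
    tail = (terms ** (1.0 - s) / (s - 1.0)
            + 0.5 * terms ** -s
            + s * terms ** (-s - 1.0) / 12.0)
    return head + tail

def make_grid(lo=1.5, hi=4.0, step=0.01):
    """Discrete candidate exponents; a discrete prior gives a discrete posterior."""
    count = int(round((hi - lo) / step)) + 1
    return [lo + i * step for i in range(count)]

def uniform_log_prior(grid):
    """Log of a uniform prior over the grid (an assumption of this sketch)."""
    return [-math.log(len(grid))] * len(grid)

def update(log_post, grid, batch):
    """One Bayes step, Equation (4): add the batch log-likelihood, renormalise."""
    n = len(batch)
    sum_log = sum(math.log(x) for x in batch)
    raw = [lp - g * sum_log - n * math.log(zeta(g))
           for lp, g in zip(log_post, grid)]
    peak = max(raw)  # subtract the peak before exponentiating to avoid underflow
    log_norm = peak + math.log(sum(math.exp(v - peak) for v in raw))
    return [v - log_norm for v in raw]

def credible_interval(log_post, grid, mass=0.95):
    """Central interval of the discrete posterior holding roughly `mass`."""
    lower_tail = (1.0 - mass) / 2.0
    cumulative = 0.0
    low, high, found_low = grid[-1], grid[-1], False
    for g, lp in zip(grid, log_post):
        cumulative += math.exp(lp)
        if not found_low and cumulative >= lower_tail:
            low, found_low = g, True
        if cumulative >= 1.0 - lower_tail:
            high = g
            break
    return low, high
```

Starting from `log_post = uniform_log_prior(grid)`, each call `log_post = update(log_post, grid, batch)` absorbs one more batch of observations. Unlike the MLE point estimate, the resulting posterior yields an interval for γ through `credible_interval`.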

C. Linear regression-based estimators


For power-law exponent estimation, linear regression is an often used estimation
procedure [13]. Different variations of this technique are all based on the same principle:
a linear fit is made to the data that is plotted on a log-log scale. Actually, with reasonable
accuracy, the linear fit can be made by hand on a log-log plot of the distribution.
However, the linear fitting does not take into consideration that almost all of the data
observed is on the first few points of the distribution. For example, for an exponent, γ, of
3.0, 93.6% of the data is expected to have k=1 or k=2. Therefore, an estimation method
that does not incorporate this fact will fit to the “noise” in the tail, where very few
observations occur [14].
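The 93.6% figure quoted above follows directly from Equation (1): for γ = 3.0 the mass on k = 1 and k = 2 is (1 + 1/8)/ζ(3). A short check, using a hypothetical helper `head_mass` and a standard-library approximation of the zeta function, is:

```python
def zeta(s, terms=100):
    """Approximate the Riemann zeta function for s > 1 (Euler-Maclaurin tail)."""
    head = sum(k ** -s for k in range(1, terms))
    tail = (terms ** (1.0 - s) / (s - 1.0)
            + 0.5 * terms ** -s
            + s * terms ** (-s - 1.0) / 12.0)
    return head + tail

def head_mass(gamma, k_max):
    """Probability that a zeta-distributed value is at most k_max."""
    return sum(k ** -gamma for k in range(1, k_max + 1)) / zeta(gamma)

print(round(head_mass(3.0, 2), 3))  # 0.936
```

An unweighted regression over, say, 50 distinct values of k therefore gives the two points that carry almost all of the data the same influence as the 48 sparsely populated tail points.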
Because of this, two modifications to direct linear fitting were proposed: 1) the use of
only the first 5 points for regression and, 2) the use of logarithmically binned data. The
first variation is straightforward to implement. The second variation is based on adding
all values that fall into bins that are logarithmically spaced (same size in the logarithmic
scale), and then performing the linear regression on the log of the quantity of these
groups of data. This method is similar to binning methods used for curve estimation [1].
The advantage of this method is that, by grouping the data points the noise is reduced.
The reduction of noise is dependent on the size chosen for the bins. However, this
method only generates a graph that is approximately linear even when the distribution is a
power-law, as can be seen in Figure 1 for “log2” bins, i.e., bin boundaries at (0, 1, 2, 4,
8, ...). The linearization error decreases the estimation accuracy. More importantly, the
slope obtained is not directly the value of the power-law exponent. This can be observed
by plotting the exponent of the power-law distribution and the slope obtained by
simulating this distribution and using the method explained above. Figure 2 shows this
plot for “log2” bins. Approximating the relation by a line, the following transformation
equation was obtained:

$$b = 1.094\,\gamma - 0.963 \qquad (5)$$

where: b is the measured exponent;
γ is the actual exponent.
Another common variant of linear estimation uses 5 bins per decade. Using the
same simulation procedure, the transformation equation obtained with 10 bins (2 decades) was:

$$b = 1.026\,\gamma - 0.931 \qquad (6)$$

The most important observation about the linear fitting methods is that they are not
tied to the definition of a probability distribution. Thus, the fitted line may
not be normalized to unity. If the slope of the fitted line is kept and the intercept is
adjusted so that the fitted function is a valid probability distribution, the final function
may end up visually distant from the empirical distribution. Examples in Section IV
illustrate this behavior.
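The logarithmic binning variant can be sketched as below. The conventions used here are assumptions, since the text does not spell them out: values are summed into bins with boundaries (0, 1, 2, 4, 8, ...], the x-coordinate of each bin is the base-2 logarithm of its upper edge, and an ordinary least-squares line is fitted to the base-2 logarithm of the bin totals. Because of these choices, the slope returned should not be expected to reproduce the coefficients of Equation (5); the sketch only illustrates that the binned slope is not itself the exponent γ. The names `log2_bin_index` and `binned_slope` are hypothetical.

```python
import math
from collections import Counter

def log2_bin_index(k):
    """Bin index for boundaries (0, 1, 2, 4, 8, ...]: 1->0, 2->1, 3..4->2, 5..8->3."""
    return (k - 1).bit_length()

def binned_slope(data):
    """Least-squares slope of log2(bin total) against log2(upper bin edge).

    Empty bins are skipped because their logarithm is undefined.
    """
    totals = Counter(log2_bin_index(k) for k in data)
    xs = sorted(totals)                       # bin index = log2 of the upper edge
    ys = [math.log2(totals[b]) for b in xs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx
```

Since the bin widths grow with k, the bin totals of a power law with exponent γ decay roughly like k^(1-γ), so under these conventions the magnitude of the slope is closer to γ - 1 than to γ. A correction of the kind given in Equations (5) and (6) is therefore needed before the slope can be read as an exponent.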

III. GOODNESS-OF-FIT TESTS

In statistical analysis, many methods for assessing the goodness-of-fit of a
distribution have been proposed. Among these methods, the most commonly used for
general distributions are Pearson’s χ2 goodness-of-fit test, and the Kolmogorov-Smirnov
(KS) type test.

A. Pearson’s χ2 goodness-of-fit test


Pearson’s χ2 test is the most commonly used test for large samples. It was introduced
by Pearson in 1900 [15] and is defined as the following test when hypothesizing for a
specific distribution:

$$Q = \sum_{i=1}^{C} \frac{(O_i - E_i)^2}{E_i} \;\overset{\cdot}{\sim}\; \chi^2_{C-1} \qquad (7)$$

where: C is the total number of classes;
Oi is the observed value related to class i;
Ei is the expected value of class i.
When the hypothesized distribution is independent of the data, the number of degrees of
freedom of the χ2 distribution is, as shown in Equation (7), C-1. However, when the
hypothesized distribution has parameters that are estimated from the same data to which the
test is going to be applied, the number of degrees of freedom decreases [16]. More
specifically, the number of degrees of freedom of the χ2 distribution is C-s-1, where s is
the number of parameters that were obtained from the data. This decrease in the number
of degrees of freedom assumes that the parameters were obtained using the MLE method.
Using other methods may cause this number to decrease even more. Thus, the χ2 test can
only be applied when MLE is used. Because the power-law distribution has a single
estimated parameter (s = 1), the number of degrees of freedom used for testing it is C-2.
Another important decision for the χ2 test is the number of classes to use. Later
analysis of the χ2 test has shown that the test is not valid when the expected count
in any of the classes is less than 5 [16]. Therefore, it is necessary to pool all
values from the tail of the distribution into a class whose total expected count is greater
than 5. For example, in a dataset with 5,000 samples and a γ of 3.0, there would be 10
classes: one class for each integer from 1 to 9 (the expected count is about 5.7 for k = 9
but only about 4.2 for k = 10) and a tail class for all values greater
than 9.
Nicholls [17] points out that requiring all class expected values to be greater than 5
is a rule of thumb with a purely heuristic justification, and that this is the main criticism
researchers have of this test. For example, another possible solution would be to use the
smallest classes possible for the tail, i.e., instead of grouping all tail values into one class,
they would be grouped into classes such that each class has an expected value as
close to 5 as possible. Although this alternative does not conflict
with any of the test assumptions, it may change the χ2 test statistic considerably. Because of
this sensitivity, most analyses tend to employ the Kolmogorov-Smirnov test.
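The class construction and the statistic of Equation (7) can be sketched as follows. This is a minimal illustration with hypothetical function names (`expected_classes`, `chi_square`); it pools the whole tail into one class, which is the first of the two heuristics discussed above, and it returns the statistic and the degrees of freedom without computing a p value, since that would require a χ2 distribution function from outside the standard library.

```python
from collections import Counter

def zeta(s, terms=100):
    """Approximate the Riemann zeta function for s > 1 (Euler-Maclaurin tail)."""
    head = sum(k ** -s for k in range(1, terms))
    tail = (terms ** (1.0 - s) / (s - 1.0)
            + 0.5 * terms ** -s
            + s * terms ** (-s - 1.0) / 12.0)
    return head + tail

def expected_classes(gamma, n, min_expected=5.0):
    """Expected counts: one class per integer while n*p(k) >= min_expected,
    followed by a single pooled tail class."""
    z = zeta(gamma)
    expected = []
    k = 1
    while n * k ** -gamma / z >= min_expected:
        expected.append(n * k ** -gamma / z)
        k += 1
    expected.append(n - sum(expected))        # pooled tail: all values >= k
    # If the pooled tail is still too small, merge it into the previous class.
    while len(expected) > 1 and expected[-1] < min_expected:
        tail = expected.pop()
        expected[-1] += tail
    return expected

def chi_square(data, gamma):
    """Return (Q, degrees of freedom) for an MLE-fitted power law."""
    n = len(data)
    expected = expected_classes(gamma, n)
    head = len(expected) - 1                  # number of single-integer classes
    counts = Counter(data)
    observed = [counts.get(k, 0) for k in range(1, head + 1)]
    observed.append(n - sum(observed))        # everything else falls in the tail
    q = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return q, len(expected) - 2               # C - s - 1 with s = 1
```

For the example in the text, `len(expected_classes(3.0, 5000))` gives the 10 classes mentioned above. The sketch assumes the sample is large enough that at least two classes remain after pooling.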

B. Kolmogorov-Smirnov-type test
The KS-type test has recently been applied to testing goodness-of-fit when total
sample size is small. The test is based on the following value:

$$K = \sup_{x} \left| F^{*}(x) - S(x) \right| \qquad (8)$$

where: F*(x) is the hypothesized cumulative distribution function;
S(x) is the empirical distribution function based on the sampled data.
Kolmogorov [18] first supplied a table of the quantiles of this statistic for the
case where the hypothesized distribution is independent of the data points. However, when
its parameters are estimated from the same data, other tables must be used. This limitation
was not taken into consideration by Pao and Nicholls in their application [17, 19] of the
KS test to power-laws. Without correcting for this factor, the KS test gives a rejection
rate lower than what is expected [20].
Lilliefors later introduced tables for using the KS test with other distributions, such as
normal and exponential [21, 22]. These tables were obtained using a Monte Carlo
method, which is based on the generation of a large number of distributions with random
parameters and calculating the test statistic for each of the test cases. From these tests,
empirical values for the quantiles can be extracted. The same procedure was used to
obtain these values for the power-law distribution. For each of the logarithmically spaced
sample sizes, 10,000 power-law distributions were simulated, with random exponent
from 1.5 to 4.0. Statistics were collected from these simulations to generate the KS table,
shown in Table I. This table was created assuming that the estimation method used was
the MLE. If other estimation methods are used, it would be necessary to construct a new
KS table.
A step by step example of how to apply the KS test for determining the goodness-of-
fit to a power-law is presented in the next section.
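The Monte Carlo construction of the KS table described above can be sketched as follows. Several details are assumptions of this sketch rather than statements about the authors' code: samples are drawn with Devroye's rejection method for the zeta distribution, the exponent is drawn uniformly from 1.5 to 4.0, the KS statistic is evaluated at the observed values only, and the MLE is found by ternary search. The function names are hypothetical, and far fewer than 10,000 repetitions are practical in pure Python.

```python
import math
import random
from collections import Counter

def hurwitz(s, q, terms=30):
    """Approximate sum over k >= 0 of (q + k)**-s for s > 1 (Euler-Maclaurin tail).
    hurwitz(s, 1) is the Riemann zeta function."""
    head = sum((q + k) ** -s for k in range(terms))
    a = q + terms
    return (head + a ** (1.0 - s) / (s - 1.0)
            + 0.5 * a ** -s + s * a ** (-s - 1.0) / 12.0)

def sample_zeta(gamma, rng):
    """Draw one zeta-distributed integer (Devroye's rejection method)."""
    am1 = gamma - 1.0
    b = 2.0 ** am1
    while True:
        u = 1.0 - rng.random()                # in (0, 1], avoids 0 ** negative
        v = rng.random()
        x = math.floor(u ** (-1.0 / am1))
        t = (1.0 + 1.0 / x) ** am1
        if v * x * (t - 1.0) / (b - 1.0) <= t / b:
            return x

def fit_mle(data, lo=1.0001, hi=20.0, iterations=60):
    """Maximise Equation (2) by ternary search (the log-likelihood is concave)."""
    n = len(data)
    sum_log = sum(math.log(x) for x in data)
    def loglik(g):
        return -g * sum_log - n * math.log(hurwitz(g, 1))
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if loglik(m1) < loglik(m2):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)

def ks_statistic(data, gamma):
    """Equation (8), evaluated at the observed values of x."""
    n = len(data)
    z = hurwitz(gamma, 1)
    counts = Counter(data)
    seen, worst = 0, 0.0
    for x in sorted(counts):
        seen += counts[x]
        model_cdf = 1.0 - hurwitz(gamma, x + 1) / z
        worst = max(worst, abs(model_cdf - seen / n))
    return worst

def ks_quantiles(n, reps, levels=(0.9, 0.95, 0.99), seed=0):
    """Empirical quantiles of the KS statistic when gamma is fitted by MLE."""
    rng = random.Random(seed)
    stats = []
    for _ in range(reps):
        gamma = rng.uniform(1.5, 4.0)
        sample = [sample_zeta(gamma, rng) for _ in range(n)]
        stats.append(ks_statistic(sample, fit_mle(sample)))
    stats.sort()
    return {q: stats[min(reps - 1, int(q * reps))] for q in levels}
```

A call such as `ks_quantiles(100, 300)` gives rough estimates for the row of 100 samples. With so few repetitions the upper quantiles are noisy, so the values will only approximate those in Table I.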

IV. EXAMPLES

First a simple example will be given on how the KS table can be used for determining
the goodness-of-fit of an empirical distribution to a power-law distribution. Using data
from a small collection of 131 papers and 359 authors that cover the topic of MEMS RF
switches, the process of using the KS goodness-of-fit test for the papers per author
distribution follows four steps:
1) Use the MLE method for estimating the power-law exponent. In this case, the
estimated exponent was 2.76.
2) Generate the hypothesized cumulative distribution F*(x) using the cumulative
sum of equation (1) and build a table showing side by side the values of F*(x)
and S(x) where there were values observed in the dataset. This table is shown
in Table II.
3) Calculate the absolute difference between each pair of values and find the
maximum. This is the KS test statistic. The absolute differences can be seen in
Table II. The largest difference for this dataset is 0.0313, at x = 1.
4) In the KS table (Table I), the largest value, 0.0313, is compared with the
values in the row with the closest number of points. For a more conservative
approach, where it is better to accept the hypothesized distribution when there
is a doubt, the row with the lower number of points should be used. In this
case, use the row for 100 points. This row shows that for 90% of the cases
when the distribution was a power-law, the KS statistic was 0.0580 or below.
The maximum observed KS statistic for this example was much lower than
this. In other words, the p value, or Observed Significance Level (OSL), is
greater than 10%. Thus, using a significance level of 5%, there is no statistical
evidence to reject the hypothesis that this distribution is a power-law.
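The four steps above can be sketched in Python as follows. The per-value counts used here are not the authors' raw data: they are back-calculated from the S(x) column of Table II under the assumption of 357 observations, which reproduces that column to four decimals (the text reports 359 authors, and the source does not explain the difference). The table excerpt is copied from Table I, and the names `KS_TABLE`, `ks_statistic`, `conservative_row`, and `p_value_bracket` are hypothetical.

```python
def hurwitz(s, q, terms=50):
    """Approximate sum over k >= 0 of (q + k)**-s for s > 1 (Euler-Maclaurin tail).
    hurwitz(s, 1) is the Riemann zeta function."""
    head = sum((q + k) ** -s for k in range(terms))
    a = q + terms
    return (head + a ** (1.0 - s) / (s - 1.0)
            + 0.5 * a ** -s + s * a ** (-s - 1.0) / 12.0)

# Excerpt of Table I: sample size -> critical values for the quantiles in LEVELS.
LEVELS = (0.9, 0.95, 0.99, 0.999)
KS_TABLE = {
    50: (0.0826, 0.0979, 0.1281, 0.1719),
    100: (0.0580, 0.0692, 0.0922, 0.1164),
    500: (0.0258, 0.0307, 0.0412, 0.0550),
    1000: (0.0186, 0.0216, 0.0283, 0.0358),
}

def ks_statistic(counts, gamma):
    """Steps 2 and 3: largest |F*(x) - S(x)| over the observed values of x."""
    n = sum(counts.values())
    z = hurwitz(gamma, 1)
    seen, worst = 0, 0.0
    for x in sorted(counts):
        seen += counts[x]
        model_cdf = 1.0 - hurwitz(gamma, x + 1) / z
        worst = max(worst, abs(model_cdf - seen / n))
    return worst

def conservative_row(n):
    """Step 4: largest tabulated sample size that does not exceed n."""
    return max(size for size in KS_TABLE if size <= n)

def p_value_bracket(k_stat, n):
    """Bracket the p value using the conservative row of the table."""
    row = KS_TABLE[conservative_row(n)]
    exceeded = [lvl for lvl, crit in zip(LEVELS, row) if k_stat > crit]
    if not exceeded:
        return "p > 0.1"
    return "p < %g" % (1.0 - max(exceeded))

# Assumed counts per value of x, reconstructed from S(x) in Table II.
counts = {1: 273, 2: 55, 3: 18, 4: 6, 5: 2, 6: 2, 17: 1}
k_stat = ks_statistic(counts, 2.76)           # step 1's estimate, as reported
print(k_stat, p_value_bracket(k_stat, sum(counts.values())))
```

The statistic should come out near the 0.0313 reported in Table II; a small difference is expected because the exponent is used here rounded to 2.76. The lookup then selects the row for 100 points, as in step 4.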
This simple example shows two important details about any goodness-of-fit test. First, the
result of the test does not prove that a sample actually comes from a power-law
distribution; it can only indicate when the chance of its being a power-law is low. Second,
a higher p value suggests a higher chance that the tested sample comes from a power-law
distribution. The p value can therefore be used to
compare samples and infer which are more likely to have been generated by a power-law
distribution.
A second example is a collection of papers covering the topic of vibrating sandpiles,
containing 368 papers with 6272 references. The power-law exponent of the papers per
reference distribution was estimated using four of the methods discussed above (the
Bayesian method was not used because it generates results that are not easily comparable
to the other results): MLE, linear regression in log-log scale, linear regression using the
first 5 points only, and linear regression using logarithmically binned data. A third
example estimates the power-law exponent of the authors per paper distribution of a
collection of 336 papers and 422 authors from a collaboration network associated with
researchers at the University of Maryland Psychiatric Research Center. Figure 3 shows a
comparison of the different fits for the vibrating sandpiles dataset. As discussed, the
linear regression on log-log transformed data fits all points with equal weight, and greatly
underestimates the exponent. Using the first 5 points, the method overestimated the
exponent, because it does not take the tail into consideration, while the log2 binned data
underestimated the exponent because it places too much emphasis on the tail points.
Table III shows a summary of the estimated exponents obtained using each of the
fitting methods applied to 27 different collections of journal papers from 27 different
research topics. These collections were gathered from the Institute for Scientific
Information Web of Science product over a period of two years from queries and seed
references and were used for research in information visualization and knowledge
domain mapping. The characteristics of these collections are summarized in Table IV.
For paper collections two distributions are usually claimed as power-laws: papers per
author (Lotka’s Law [23]) and papers per reference ([24]). The number of papers in each
collection varies from 131 to 14,211 and the MLE power-law exponents vary from 1.99
to 3.71 for the distribution of papers per author and 1.98 to 3.93 for the distribution of
papers per reference.
Using the MLE estimates, the two goodness-of-fit tests discussed above were applied to
all 27 datasets. Table V shows the number of distributions that were
accepted as power laws by each of the goodness-of-fit tests described in the previous
section, using a 95% confidence level.

These results show that these distributions cannot be assumed to be
power-laws in all datasets. Papers per author distributions have a 56%
acceptance rate using the KS test and appear to be more likely to be accepted as actual
power-laws. However, using the KS test of power-law fit on the papers per reference
distribution, the acceptance rate was only 7%, so this distribution is very unlikely to be
the actual power law suggested by Naranan [24]. Further analysis on the ability to define
power-law tail distributions would be needed to test if the paper-per-reference distribution is
power-law tail only as reported by Redner [4]. The two collections where paper per
reference distribution was accepted as power-law by both χ2 and KS tests are two small
datasets containing 148 papers and 3,767 references, and 131 papers and 1,573
references, respectively.
In Figure 4 to Figure 7, some examples of distributions and their actual goodness of
fit test values are shown. In these examples, it is easy to observe that visual inspection in
a log-log plot is not accurate enough to determine if these distributions are actual power-
laws or not.

V. CONCLUSIONS

This paper presents an analysis of the extent to which empirical data can be assumed
to be power-law distributed. First, a brief discussion of possible fitting methods was
presented. Then two well-known goodness-of-fit tests were used for the
analysis: Pearson’s χ2 test and the Kolmogorov-Smirnov test. A KS table for testing the
goodness-of-fit of MLE-estimated power-laws was provided.
Using these goodness-of-fit tests on 27 collections of journal papers and testing them
for two distributions that are usually believed to be power-laws, the papers per author and
papers per reference distributions, it was shown that caution must be taken when
assuming power-law distributions. Especially for the papers per reference distributions, it
was observed that in many cases the power-law distribution could not be substantiated.
Importantly, the usual method for observing power-law distributions, that is, plotting on
a log-log scale, offers no support for identifying poor goodness of fit.

Most of the fit problems may be caused by external effects that usually affect the
initial points of the distribution only (the head of the distribution). Further analysis would
be required to test if the tail of the distribution is a power-law. However, when testing for
the fit of the tail, it is wise to be cautious about the extreme paucity of sample points that
generate the tail of the distribution. The power of goodness-of-fit tests decreases when
fewer points are sampled. Therefore, it becomes much more difficult to confirm that the
distribution of the tail is power-law and not any other distribution.
Another possible analysis that could be performed with this data is a quantitative
analysis of the modifying external effects. For example, suppose it is known that some
collections of journal papers contain survey papers that reference many papers
that were never referenced before and that are actually external to the dataset. This may
cause an unexpected increase in the number of references appearing in only one paper
(the first value in the distribution). With a goodness-of-fit test it is possible to form
hypotheses about the number of external references that were added to the database
and, possibly, remove them from further analyses.
Overall, the evaluation of these tests is simple and does not add much to the overall
processing complexity. A sound understanding of goodness-of-fit measurements
when testing for power-law distributions enhances the analysis of datasets
that may show highly skewed distributions. It is a vital step for confirming
assumptions and making meaningful comparisons when modeling the datasets.

Table I - KS test table for power-law distributions
Quantile
# samples 0.9 0.95 0.99 0.999
10 0.1765 0.2103 0.2835 0.3874
20 0.1257 0.1486 0.2003 0.2696
30 0.1048 0.1239 0.1627 0.2127
40 0.0920 0.1075 0.1439 0.1857
50 0.0826 0.0979 0.1281 0.1719
100 0.0580 0.0692 0.0922 0.1164
500 0.0258 0.0307 0.0412 0.0550
1000 0.0186 0.0216 0.0283 0.0358
2000 0.0129 0.0151 0.0197 0.0246
3000 0.0102 0.0118 0.0155 0.0202
4000 0.0087 0.0101 0.0131 0.0172
5000 0.0073 0.0086 0.0113 0.0147
10000 0.0059 0.0069 0.0089 0.0117
50000 0.0025 0.0034 0.0061 0.0077

Table II - Sample results for using the KS goodness-of-fit test


x S(x) F*(x) |F*(x) - S(x)|
1 0.7647 0.7960 0.0313
2 0.9188 0.9132 0.0056
3 0.9692 0.9513 0.0178
4 0.9860 0.9686 0.0174
5 0.9916 0.9779 0.0137
6 0.9972 0.9835 0.0137
17 1.0000 0.9971 0.0029

Table III – Sample statistics for power-law fitting of all datasets varying the fitting method
              Papers per author    Papers per reference
Method        µ       σ            µ       σ
MLE           2.63    0.47         2.78    0.51
Linear        2.17    0.48         2.04    0.46
Linear (5p)   2.59    0.51         2.77    0.49
Log2 bins     2.73    0.55         2.60    0.46

Table IV - Summary of the 27 paper collections used for demonstrating power-law
fitting and goodness of fit testing.

Index topic no. of papers no. of references no. of authors
1 agent based models 148 3767 259
2 angiogenesis 453 8246 1590
3 anthrax 2472 25010 4493
4 atrial ablation 3095 22670 6574
5 biosensors 5892 32767 11034
6 botox 1560 20819 3521
7 cocitation and bibliographic coupling 550 13010 492
8 complex networks 902 19185 1665
9 distance education 1391 16603 2472
10 econophysics 482 6281 588
11 ht supercon 1631 29044 3001
12 info science 14211 119289 9413
13 information visualization 2450 56912 5545
14 mems RF switch 131 1573 359
15 milgrams 404 6791 465
16 molecular imprinting 513 5717 785
17 nerve agents 407 8293 1064
18 neuroimaging 671 25279 2042
19 ontology 224 6501 456
20 schizophrenia 513 20422 1477
21 scientometrics 3468 70117 2928
22 self organized criticality 1634 27622 2176
23 silicon on insulator semiconductor 2383 23041 4902
24 superstring 6652 53568 4813
25 TQM 1893 28216 2875
26 U of Maryland 336 5890 422
27 vibrating sandpiles 368 6272 547

Table V - Overall results for the goodness of fit for all datasets
              Papers per author        Papers per reference
              χ2 test     KS test      χ2 test     KS test
# rejected    10 (37%)    12 (44%)     26 (96%)    25 (93%)
# accepted    17 (63%)    15 (56%)     1 (4%)      2 (7%)

Figure 1 - Log-2 bin results for a theoretical power-law distribution with N=1000 and γ=2.0

Figure 2 - Empirical transformation between the slope of the log-2 binned data and the power-law
exponent

Figure 3 - Results of fitting using the different fitting methods for papers per reference distribution
for the vibrating sandpiles dataset.

Figure 4 –Papers per author distribution for the University of Maryland dataset. The circles
represent the actual empirical distribution, the line is the Maximum Likelihood fit (gamma = 2.02).
The database has 336 papers, 422 authors, Q = 3.24, pQ = 0.7785, K = 0.0158, pK > 0.1.

Figure 5 – Papers per author distribution for the atrial ablation dataset. The circles represent the
actual empirical distribution, the line is the Maximum Likelihood fit (gamma = 2.11). The database
has 3,095 papers, 6,574 authors, Q = 62.7, pQ = 1.5·10^-5, K = 0.0125, pK < 0.01.

Figure 6 – Reference distribution for the superstring dataset. The circles represent the actual
empirical distribution, the line is the Maximum Likelihood fit (gamma = 1.98). The database has
6,652 papers, 53,568 references, 208,119 citations, Q = 604, pQ = 0, K = 0.0139, pK < 0.001.

Figure 7 – Reference distribution for the angiogenesis dataset. The circles represent the actual
empirical distribution, the line is the Maximum Likelihood fit (gamma = 2.33). The database has 453
papers, 8,246 references, 18,818 citations, Q = 1.74, pQ = 1.2·10^-5, K = 0.0107, pK < 0.01.

[1] R. Albert, H. Jeong, and A.-L. Barabási, Nature 401, 130 (1999).
[2] H. Jeong, B. Tombor, R. Albert, et al., Nature 407, 651 (2000).
[3] M. Faloutsos, P. Faloutsos, and C. Faloutsos, Comput. Commun. Rev. 29, 251
(1999).
[4] S. Redner, Eur. Phys. J. B 4, 131 (1998).
[5] F. Liljeros, C. R. Edling, L. A. N. Amaral, et al., Nature 411, 907 (2001).
[6] N. L. Johnson, S. Kotz, and A. W. Kemp, Univariate discrete distributions (John
Wiley & Sons, New York, 1992).
[7] J. M. Mendel, Lessons in Estimation Theory for Signal Processing,
Communications, and Control (Prentice Hall, Upper Saddle River, NJ, 1995).
[8] R. Albert, H. Jeong, and A.-L. Barabási, Nature 401, 130 (1999).
[9] J. M. Kleinberg, J. ACM 46, 604 (1999).
[10] J. H. Jones and M. S. Handcock, Nature 423, 605 (2003).
[11] A. Walther, Acta Math. 48, 393 (1926).
[12] G. E. P. Box and G. C. Tiao, Bayesian inference in statistical analysis (Addison-
Wesley Pub. Co., Reading, Mass., 1973).
[13] R. Albert and A.-L. Barabási, Rev. Mod. Phys. 74, 47 (2002).
[14] J. H. Jones and M. S. Handcock, P Roy Soc Lond B Bio 270, 1123 (2003).

[15] K. Pearson, Philos. Mag. 50, 157 (1900).
[16] W. G. Cochran, Ann. Math. Stat. 23, 315 (1952).
[17] P. T. Nicholls, J. Am. Soc. Inform. Sci. 40, 379 (1989).
[18] A. N. Kolmogorov, G. Inst. Ital. Attuari 4, 77 (1933).
[19] M. L. Pao, Info. Proc. Man. 21, 305 (1985).
[20] W. J. Conover, Practical nonparametric statistics (Wiley, New York, 1999).
[21] H. W. Lilliefors, J. Am. Stat. Assoc. 62, 399 (1967).
[22] H. W. Lilliefors, J. Am. Stat. Assoc. 64, 387 (1969).
[23] A. J. Lotka, J. Wash. Acad. Sci. 16, 317 (1926).
[24] S. Naranan, J. Doc. 27, 83 (1971).
