Report3 Pattern Analysis
1. Introduction
The first step towards spatial pattern modelling is the spatial randomness analysis. A
rich source of methods can be found in the literature but in general terms the methods
can be classified into three categories:
• Distance analysis
• Quadrat count analysis
• Second moment analysis
Distance analysis uses the distances between events or the distances between events
and selected points. Quadrat count analysis uses the number of events falling into the
quadrats of certain shape, either located randomly or according to a pre-arranged
pattern. Second moment analysis mainly deals with the K-function analysis though
the second order intensity and the covariance density also fall into this category. In
this report, each category of the analysis is discussed in turn and some examples are
given.
2. Distance analysis
As illustrated in Figure 2.1, we use t, y and x to represent three different measures of
distance. t is the distance between any pair of events and is referred to as the inter-event
distance. y is the distance between an event and its nearest neighbouring event and is named
the nearest neighbour distance. x is the distance between a selected point in the region
and its nearest event and is referred to as the point to nearest event distance. For the
latter two types of distance, if the interest is in the k-th nearest event, the
distances are then called the k-th nearest neighbour distance and the k-th point to
nearest event distance, respectively.
Stochastic Modelling of Fractures in Rock Masses, 2003
For a point process in $\mathbb{R}^d$, under the assumption of a homogeneous Poisson process
with intensity $\lambda$, the probability that there is no event within distance y of an
arbitrary event can be obtained from the inter-event property of the Poisson process and can
be expressed as:
$$P[\text{no further event within distance } y \text{ of an event}] = e^{-\lambda V_y}$$
where $V_y$ is the d-dimensional volume of a sphere with radius y and is given below:
$$V_y = \frac{\pi^{d/2} \cdot y^d}{\Gamma(1 + \frac{d}{2})}$$
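Therefore the probability distribution function of the measure y and its density function
are:
$$G(y) = 1 - e^{-\lambda V_y} \quad \text{(probability function)}$$
$$g(y) = \frac{d \cdot \lambda \cdot \pi^{d/2}}{\Gamma(1 + \frac{d}{2})} \cdot y^{d-1} \cdot e^{-\lambda V_y} \quad \text{(density function)}$$
For the two-dimensional case,
$$G(y) = 1 - e^{-\lambda \pi y^2} \quad \text{(probability function)}$$
$$g(y) = 2\pi\lambda \cdot y \cdot e^{-\lambda \pi y^2} \quad \text{(density function)}$$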
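The nearest neighbour formulas above can be evaluated directly. The following is a minimal sketch, not part of the original report; the function names `ball_volume`, `nn_cdf` and `nn_pdf` are illustrative choices, and `lam` stands for the Poisson intensity λ.

```python
import math


def ball_volume(radius, d):
    """Volume of a d-dimensional ball: pi^(d/2) * r^d / Gamma(1 + d/2)."""
    return math.pi ** (d / 2) * radius ** d / math.gamma(1 + d / 2)


def nn_cdf(y, lam, d=2):
    """G(y) = 1 - exp(-lam * V_y) for a homogeneous Poisson process."""
    return 1.0 - math.exp(-lam * ball_volume(y, d))


def nn_pdf(y, lam, d=2):
    """g(y) = lam * dV_y/dy * exp(-lam * V_y)."""
    dv_dy = d * math.pi ** (d / 2) * y ** (d - 1) / math.gamma(1 + d / 2)
    return lam * dv_dy * math.exp(-lam * ball_volume(y, d))
```

With d = 2 these reduce to the two-dimensional expressions, since the volume of the ball becomes πy².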
If we consider up to the kth nearest event of an arbitrary event, the joint distribution
function for y1, y2, …, yk can be derived as follows:
$$G_k(y_1^*, y_2^*, \cdots, y_k^*) = \int_0^{V_1^*} \int_{V_1^*}^{V_2^*} \cdots \int_{V_{k-1}^*}^{V_k^*} \lambda^k e^{-\lambda \cdot V_k} \, dV_1 \cdot dV_2 \cdots dV_k$$
$$= \prod_{i=1}^{k-1} (V_i^* - V_{i-1}^*) \cdot \lambda^{k-1} \cdot (e^{-\lambda \cdot V_{k-1}^*} - e^{-\lambda \cdot V_k^*}) \quad \text{with } V_0^* = 0$$
The same theory can be applied to the distribution of x, the nearest event distance to a
randomly selected point, i.e., the following relations can be obtained:
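$$F(x) = 1 - e^{-\lambda V_x} \quad \text{(probability function)}$$
$$f(x) = \frac{d \cdot \lambda \cdot \pi^{d/2}}{\Gamma(1 + \frac{d}{2})} \cdot x^{d-1} \cdot e^{-\lambda V_x} \quad \text{(density function)}$$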
For the inter-event distance t, Bartlett [] derived the distribution density in 1964 for
homogeneous Poisson process within a two dimensional square, which is given in the
following equation:
$$h(t) = \begin{cases} \dfrac{2\pi t}{L^2} - \dfrac{8t^2}{L^3} + \dfrac{2t^3}{L^4} & (t^2 \le L^2) \\[2ex] \dfrac{4t \cdot \arcsin(\frac{2L^2}{t^2} - 1)}{L^2} - \dfrac{4t}{L^2} + \dfrac{8t \cdot \sqrt{t^2 - L^2}}{L^3} - \dfrac{2t^3}{L^4} & (L^2 < t^2 \le 2L^2) \end{cases}$$
where L is the size of the square. Diggle [] gives the distribution function for a unit
square as follows:
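$$H(t) = \begin{cases} \pi \cdot t^2 - \dfrac{8 \cdot t^3}{3} + \dfrac{t^4}{2} & 0 \le t \le 1 \\[2ex] \dfrac{1}{3} - 2 \cdot t^2 - \dfrac{t^4}{2} + \dfrac{4}{3} \sqrt{t^2 - 1} \cdot (2 \cdot t^2 + 1) + 2 \cdot t^2 \cdot \arcsin(\dfrac{2}{t^2} - 1) & 1 < t \le \sqrt{2} \end{cases}$$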
where i,j and m have the same meanings as the last equation, k is the distribution class
id and ∆ is the interval dividing the ranges of t, y and x.
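The theoretical inter-event distance formulas for a square quoted earlier in this section can be checked numerically. The sketch below is an illustration only, assuming the piecewise forms of Bartlett's density h(t) for a square of side L and Diggle's distribution function H(t) for the unit square; the function names are hypothetical.

```python
import math


def H_unit_square(t):
    """Distribution function of the inter-event distance, unit square."""
    if t <= 0.0:
        return 0.0
    if t <= 1.0:
        return math.pi * t**2 - 8.0 * t**3 / 3.0 + t**4 / 2.0
    if t <= math.sqrt(2.0):
        return (1.0 / 3.0 - 2.0 * t**2 - t**4 / 2.0
                + 4.0 / 3.0 * math.sqrt(t**2 - 1.0) * (2.0 * t**2 + 1.0)
                + 2.0 * t**2 * math.asin(2.0 / t**2 - 1.0))
    return 1.0


def h_square(t, L=1.0):
    """Density of the inter-event distance in a square of side L."""
    if t <= 0.0 or t > L * math.sqrt(2.0):
        return 0.0
    if t <= L:
        return 2.0 * math.pi * t / L**2 - 8.0 * t**2 / L**3 + 2.0 * t**3 / L**4
    return (4.0 * t * math.asin(2.0 * L**2 / t**2 - 1.0) / L**2
            - 4.0 * t / L**2
            + 8.0 * t * math.sqrt(t**2 - L**2) / L**3
            - 2.0 * t**3 / L**4)
```

For L = 1 the density is the derivative of the distribution function on both branches, and the two branches of H(t) join continuously at t = 1, which gives a quick consistency check on the reconstructed formulas.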
To compare the results with theory, however, edge effect has to be taken into account.
For the calculation of $\hat{G}$ and $\hat{F}$, the edge effect can be corrected by the following
relations:
$$\tilde{H}(t) = \frac{\#(t_{ij} \le t,\; d_i > t)}{\frac{1}{2} n_t (n_t - 1)} \quad i, j = 1, 2, \cdots, n, \quad n_t = \#(d_i > t)$$
$$\tilde{G}(y) = \frac{\#(y_i \le y,\; d_i > y)}{\#(d_i > y)} \quad i = 1, 2, \cdots, n$$
$$\tilde{F}(x) = \frac{\#(x_i \le x,\; d_i > x)}{\#(d_i > x)} \quad i = 1, 2, \cdots, m$$
where $d_i$ is the distance of the selected event (out of n events) or the selected point
(out of m points) to the nearest border (edge) of the region ℜ being considered. This
treatment is equivalent to imposing a guard area (or volume) around the edge
of the region (see Upton []) and discarding any event (or point) falling within it. Note that
the guard area (volume) changes in size according to the distance t, y or x being
considered. Another approach to edge correction, specific to the two-dimensional case
in which the region ℜ is a rectangle, is to virtually join the opposite sides
of the rectangle and turn the region into a torus. Distances are then calculated on the
basis of this virtual region. For example, the nearest event of an event located at the
bottom left corner of the region can be an event located at the top right corner of the
region.
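Both edge treatments described above are straightforward to sketch for a rectangular region with its lower left corner at the origin. The code below is an illustration under that assumption, not the implementation used for the report; all function names are hypothetical.

```python
import math


def torus_distance(p, q, width, height):
    """Distance between p and q after joining opposite sides of the rectangle."""
    dx = abs(p[0] - q[0])
    dy = abs(p[1] - q[1])
    dx = min(dx, width - dx)   # the shorter way round horizontally
    dy = min(dy, height - dy)  # the shorter way round vertically
    return math.hypot(dx, dy)


def border_distance(p, width, height):
    """Distance d_i from p to the nearest edge of the rectangle."""
    return min(p[0], width - p[0], p[1], height - p[1])


def nn_distances(points):
    """Nearest neighbour distance y_i of every event (plain Euclidean)."""
    return [min(math.dist(p, q) for j, q in enumerate(points) if j != i)
            for i, p in enumerate(points)]


def G_border(points, y, width, height):
    """Border-corrected estimate #(y_i <= y, d_i > y) / #(d_i > y)."""
    nn = nn_distances(points)
    eligible = [yi for yi, p in zip(nn, points)
                if border_distance(p, width, height) > y]
    if not eligible:
        return float("nan")  # every event lies inside the guard area
    return sum(1 for yi in eligible if yi <= y) / len(eligible)
```

Note that the set of eligible events shrinks as y grows, which is the changing guard area mentioned in the text.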
The random points selected for the calculation of $\hat{F}$ or $\tilde{F}$ values can be chosen
randomly or from a fixed grid. Diggle [] suggests using a k×k regular grid and
increasing k until $\hat{F}$ or $\tilde{F}$ effectively stabilizes throughout its range.
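A grid-based estimate of the point to nearest event distribution can be sketched as follows. This is a minimal uncorrected version; placing the sample points at the cell centres of the k×k grid is an assumption made here for illustration, and the function names are hypothetical.

```python
import math


def grid_points(k, width=1.0, height=1.0):
    """Cell centres of a k-by-k regular grid covering the region."""
    return [((i + 0.5) * width / k, (j + 0.5) * height / k)
            for i in range(k) for j in range(k)]


def F_hat(events, x, k, width=1.0, height=1.0):
    """Fraction of grid points whose nearest event lies within distance x."""
    pts = grid_points(k, width, height)
    hits = sum(1 for p in pts
               if min(math.dist(p, e) for e in events) <= x)
    return hits / len(pts)
```

In use, `F_hat` would be evaluated over the range of x for k, 2k, 4k and so on, stopping once successive curves differ by less than a chosen tolerance.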
For example, if the inter-event distance of a point process is being analysed, one of
the obvious choices of statistic for null hypothesis testing is the integrated squared
difference between the calculated $\hat{H}(t)$ and the theoretical $H(t)$, i.e.,
$$s_d = \int (\hat{H}(t) - H(t))^2 \, dt$$
where $H(t)$ is the probability distribution of inter-event distances of the point model
to be tested. k independent realisations are generated and the squared
differences between each of $\hat{H}_1, \hat{H}_2, \cdots, \hat{H}_k$ and $H$ are calculated. The above
criterion is then applied to test the hypothesis. In this case, it is an upper tail test. For
instance, if 99 simulations are used and the significance level is 5%, then the hypothesis
is rejected when the value of $s_d$ for the data ranks 96th or higher among the 100 values.
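The Monte Carlo ranking procedure for the inter-event distance statistic can be sketched as below. This is an illustration under stated assumptions: the region is the unit square, the null model is CSR with the same number of points as the data, the theoretical H(t) is Diggle's unit-square formula, and the integral is approximated by a Riemann sum on 101 grid values. All names are hypothetical.

```python
import bisect
import math
import random


def H_theory(t):
    """Inter-event distance distribution for CSR in the unit square."""
    if t <= 0.0:
        return 0.0
    if t <= 1.0:
        return math.pi * t**2 - 8.0 * t**3 / 3.0 + t**4 / 2.0
    if t < math.sqrt(2.0):
        return (1.0 / 3.0 - 2.0 * t**2 - t**4 / 2.0
                + 4.0 / 3.0 * math.sqrt(t**2 - 1.0) * (2.0 * t**2 + 1.0)
                + 2.0 * t**2 * math.asin(2.0 / t**2 - 1.0))
    return 1.0


def s_d(points, t_grid):
    """Riemann-sum approximation of the integrated squared difference."""
    dists = sorted(math.dist(p, q)
                   for i, p in enumerate(points) for q in points[i + 1:])
    dt = t_grid[1] - t_grid[0]
    total = 0.0
    for t in t_grid:
        h_hat = bisect.bisect_right(dists, t) / len(dists)  # empirical H(t)
        total += (h_hat - H_theory(t)) ** 2
    return total * dt


def monte_carlo_rank(data, n_sim=99, seed=0):
    """Rank (1 = smallest) of the data statistic among n_sim + 1 values."""
    rng = random.Random(seed)
    t_grid = [i * math.sqrt(2.0) / 100 for i in range(101)]
    observed = s_d(data, t_grid)
    sims = [s_d([(rng.random(), rng.random()) for _ in data], t_grid)
            for _ in range(n_sim)]
    return 1 + sum(1 for s in sims if s < observed)
```

Because this is an upper tail test, the hypothesis is rejected when the returned rank falls in the upper 5% of the n_sim + 1 values.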
Note also that in the above arguments, the point model can refer to any known model. If
the homogeneous Poisson process model is used, the test will be against the
discrepancy between the data points and a completely spatially random (CSR) point
pattern. In other words, the test is against the departure of the data from CSR.
With respect to the edge effect issue, it can either be taken into account or ignored if
statistics from data are only to be compared to those from Monte Carlo simulation
using the same treatment. If the results are to be compared with the theoretical
values, however, edge corrections must be considered. See examples below.
100 simulations. Three statistics are displayed from the Monte Carlo simulation: the average
simulated value, and the minimum and the maximum acceptance envelope values at
95% confidence based on the two-tailed test described in the above section. As can be
seen, in the homogeneous case the calculated statistics agree well with the Monte
Carlo test results in all cases.
Figure 2.3 (a) $\hat{H}(t)$ vs distance t    Figure 2.3 (b) $\hat{h}(t)$ vs distance t
Figure 2.4 (a) $\hat{G}(y)$ vs distance y    Figure 2.4 (b) $\hat{g}(y)$ vs distance y
Figure 2.5 (a) $\hat{F}(x)$ vs distance x    Figure 2.5 (b) $\hat{f}(x)$ vs distance x
where u and v are the horizontal and vertical coordinates. The distance statistics
$\hat{H}(t)$, $\hat{h}(t)$, $\hat{G}(y)$, $\hat{g}(y)$, $\hat{F}(x)$ and $\hat{f}(x)$ are shown in Figure 2.7 to Figure 2.9
respectively. As can be seen from the $\hat{H}(t)$ and $\hat{h}(t)$ analysis, the distributions start to
depart from the statistics obtained from homogeneous simulation. The degree of
departure depends on the data but it cannot easily be quantified by the analysis. The
$\hat{G}(y)$ and $\hat{g}(y)$ analysis does not even suggest non-homogeneity for the pattern.
$\hat{F}(x)$ and $\hat{f}(x)$ only show a very modest degree of departure.
In general, for the non-homogeneous case, $\hat{h}(t)$ will tend to be negatively skewed as there
will always be some point aggregation compared with the average distribution. This
feature is clearly visible from the figure. More discussion about this point will follow
in the next section.
Figure 2.7 (a) $\hat{H}(t)$ vs distance t    Figure 2.7 (b) $\hat{h}(t)$ vs distance t
Figure 2.8 (a) $\hat{G}(y)$ vs distance y    Figure 2.8 (b) $\hat{g}(y)$ vs distance y
Figure 2.9 (a) $\hat{F}(x)$ vs distance x    Figure 2.9 (b) $\hat{f}(x)$ vs distance x
uniformly distributed around their parent within a circle of radius of 5 and centred at
their parent location. The realisation consists of daughter points only.
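A cluster realisation of the kind described here (daughters uniform in a disc around each parent, daughters only retained) can be sketched as follows. The parent intensity, the mean number of daughters per parent, the use of a Poisson number of daughters and the square window are assumptions for illustration, since those details are not given in the surviving text; only the disc radius of 5 comes from the description above.

```python
import math
import random


def poisson(mean, rng):
    """Poisson variate by Knuth's multiplication method (small means only)."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def uniform_in_disc(centre, radius, rng):
    """Point uniformly distributed inside a disc around centre."""
    r = radius * math.sqrt(rng.random())  # sqrt gives uniform area density
    theta = 2.0 * math.pi * rng.random()
    return (centre[0] + r * math.cos(theta), centre[1] + r * math.sin(theta))


def cluster_process(parent_intensity, mean_daughters, radius, size, rng):
    """Return (parents, daughters) for a square window of side `size`."""
    n_parents = poisson(parent_intensity * size * size, rng)
    parents = [(rng.uniform(0, size), rng.uniform(0, size))
               for _ in range(n_parents)]
    daughters = []
    for parent in parents:
        for _ in range(poisson(mean_daughters, rng)):
            p = uniform_in_disc(parent, radius, rng)
            if 0 <= p[0] <= size and 0 <= p[1] <= size:
                daughters.append(p)  # daughters outside the window are dropped
    return parents, daughters
```

A call such as `cluster_process(0.002, 10, 5.0, 100.0, random.Random(1))` returns both lists; the realisation to analyse is the second one. Parents outside the window are not simulated in this sketch, so clusters near the border are slightly under-represented.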
The distance statistics $\hat{H}(t)$, $\hat{h}(t)$, $\hat{G}(y)$, $\hat{g}(y)$, $\hat{F}(x)$ and $\hat{f}(x)$ for the realisation are
shown in Figure 2.11 to Figure 2.13. All figures suggest a certain degree of
departure of the point pattern from the homogeneous case. Only $\hat{h}(t)$, however,
demonstrates some interesting results. $\hat{h}(t)$ shows multi-modal characteristics. The
first mode peaks at around t = 5, which suggests from the histogram analysis point of
view that there is a point aggregation with the average inter-event distance inside the
aggregation at about 5. This is exactly the size of the cluster we specified when
generating the pattern. There are also some other modes that suggest aggregation at
different levels. The overall trend of $\hat{h}(t)$, however, follows more or less the curve for
a homogeneous process, which suggests that the underlying parent process is
homogeneous. Recall from the last section that for the non-homogeneous case, the
distribution of $\hat{h}(t)$ will tend to be negatively skewed. The figures for the other distance
analyses also demonstrate a considerable amount of deviation of this pattern
from the homogeneous case, but the connection between the results and the defined
cluster process is not obvious.
Figure 2.11 (a) $\hat{H}(t)$ vs distance t    Figure 2.11 (b) $\hat{h}(t)$ vs distance t
Figure 2.12 (a) $\hat{G}(y)$ vs distance y    Figure 2.12 (b) $\hat{g}(y)$ vs distance y
Figure 2.13 (a) $\hat{F}(x)$ vs distance x    Figure 2.13 (b) $\hat{f}(x)$ vs distance x
2π ⋅ 3.5²
with variance σ² = 13.5. The parent process again is homogeneous. The point image
is shown in Figure 2.14.
Distance analysis results are displayed in Figures 2.15 to 2.17. By comparing these
figures with Figures 2.11 to 2.13, it is not difficult to conclude that there are great
similarities in the results between the two point realisations. The average distance
between points from the same parent is √2 × 3.5 ≈ 5, which is correctly identified in
the $\hat{h}(t)$ analysis result by the peak of the first mode.
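The value of about 5 can be motivated by a short derivation. This is a sketch under an assumption that is not confirmed by the surviving text: each daughter is displaced from its parent by an independent isotropic bivariate normal vector with standard deviation σ = 3.5 per coordinate.

```latex
% Two daughters of the same parent, displacements D_1, D_2 ~ N(0, \sigma^2 I).
% Their difference is again normal with doubled variance:
D_1 - D_2 \sim N(0,\, 2\sigma^2 I)
% so the separation t = |D_1 - D_2| is Rayleigh distributed with scale \sigma\sqrt{2}:
p(t) = \frac{t}{2\sigma^2} \exp\!\left(-\frac{t^2}{4\sigma^2}\right), \qquad t \ge 0
% Setting dp/dt = 0 gives the modal (most probable) separation:
t_{\text{mode}} = \sigma\sqrt{2} = 3.5 \times \sqrt{2} \approx 4.95
% For comparison, the mean separation under the same assumption is
E[t] = \sigma\sqrt{\pi} \approx 6.20
```

Under this assumption the first peak of the inter-event distance histogram is expected near the modal separation of roughly 5, while the arithmetic mean separation is somewhat larger.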
Figure 2.15 (a) $\hat{H}(t)$ vs distance t    Figure 2.15 (b) $\hat{h}(t)$ vs distance t
Figure 2.16 (a) $\hat{G}(y)$ vs distance y    Figure 2.16 (b) $\hat{g}(y)$ vs distance y
Figure 2.17 (a) $\hat{F}(x)$ vs distance x    Figure 2.17 (b) $\hat{f}(x)$ vs distance x
These are not specially selected examples. In fact, clustered point patterns always show the
same characteristics, i.e., multiple modes in the $\hat{h}(t)$ graph, with the overall trend
implicitly representing the parent process. In other words, if $\hat{h}(t)$ calculated from a
particular data set shows the same or similar characteristics, a cluster process can then
be used as the underlying model. If there is no evidence of a multi-modal $\hat{h}(t)$, a
non-homogeneous model may be more suitable.
This, however, only serves as a general rule of thumb and should not be treated as a
fixed relation. For example, in two extreme cases, if the number of clusters
(aggregated point clouds) is small compared to the whole point population, or if the
distribution of daughters is more spread out, resulting in greater mixing between
daughters from different parents, the multi-mode judgement based on $\hat{h}(t)$ may not be
suitable, or may even be misleading. Other limitations include the
application of the rule in the case of an anisotropic distribution of the daughter process. In
this situation, directional distance analysis may help resolve the problem.
Note that the rule described here can only help to judge whether the point pattern has clusters
and, if it has, to establish the most likely cluster size. It does not reveal any
information about the distribution of the daughter process itself.
location. By comparing the image with that of Figure 2.6, it is obvious that the realisation
of the Cox process expresses more randomness, which renders features of the defined underlying
model, such as the mean, less obvious. This is certainly due to the extra random
component which controls the realisation of the random density field. This feature of
the Cox process obviously makes it more difficult to identify an effective parametric
model for the modelling exercise.
The distance statistics $\hat{H}(t)$, $\hat{h}(t)$, $\hat{G}(y)$, $\hat{g}(y)$, $\hat{F}(x)$ and $\hat{f}(x)$ are shown in Figure
2.19 to Figure 2.21 respectively. Though the results show features very similar to those of the
non-homogeneous point pattern (compare Figure 2.7 to Figure 2.9), they tend
to be closer to the CSR pattern for the reason stated above.
Figure 2.19 (a) $\hat{H}(t)$ vs distance t    Figure 2.19 (b) $\hat{h}(t)$ vs distance t
Figure 2.20 (a) $\hat{G}(y)$ vs distance y    Figure 2.20 (b) $\hat{g}(y)$ vs distance y
Figure 2.21 (a) $\hat{F}(x)$ vs distance x    Figure 2.21 (b) $\hat{f}(x)$ vs distance x
Figure 2.22 (a) Example 1 (fracture traces)    Figure 2.22 (b) Example 1 (Points data set)
Figure 2.23 (a) Example 1 – fracture trace 1    Figure 2.23 (b) Example 1 - fracture set 1
Figure 2.24 (a) Example 1 – fracture trace 2    Figure 2.24 (b) Example 1 - fracture set 2
Figure 2.25 (a) Example 2 (fracture traces)    Figure 2.25 (b) Example 2 (Points data set)
Figure 2.26 (a) Example 2 – fracture trace set 1    Figure 2.26 (b) Example 2 – fracture trace set 1
Figure 2.27 (a) Example 2 – fracture trace set 2    Figure 2.27 (b) Example 2 – fracture trace set 2
Since the cumulative histograms for the nearest event distance y and the point to nearest event
distance x do not give a clear picture of the distance characteristics,
they are not presented in the following analysis in order to save space in
the report. Figure 2.28 shows the three distance analysis histograms for example 1
for fracture trace sets 1 and 2 combined. The analysis for trace set 1 only is displayed in
Figure 2.29, and that for trace set 2 in Figure 2.30. For the second practical example,
the corresponding analysis results are given in Figure 2.31 to Figure 2.33.
Figure 2.28 (a) h(t) for E.1 – F.1 + F.2    Figure 2.28 (b) h(t) for E.1 – F.1 + F.2
Figure 2.28 (c) g(y) for E.1 – F.1 + F.2    Figure 2.28 (d) f(x) for E.1 – F.1 + F.2
Figure 2.29 (a) h(t) for E.1 – F.1    Figure 2.29 (b) h(t) for E.1 – F.1
Figure 2.29 (c) g(y) for E.1 – F.1    Figure 2.29 (d) f(x) for E.1 – F.1
Figure 2.30 (a) h(t) for E.1 – F.2    Figure 2.30 (b) h(t) for E.1 – F.2
Figure 2.30 (c) g(y) for E.1 – F.2    Figure 2.30 (d) f(x) for E.1 – F.2
Figure 2.31 (a) h(t) for E.2 – F.1 + F.2    Figure 2.31 (b) h(t) for E.2 – F.1 + F.2
Figure 2.31 (c) g(y) for E.2 – F.1 + F.2    Figure 2.31 (d) f(x) for E.2 – F.1 + F.2
Figure 2.32 (a) h(t) for E.2 – F.1    Figure 2.32 (b) h(t) for E.2 – F.1
Figure 2.32 (c) g(y) for E.2 – F.1    Figure 2.32 (d) f(x) for E.2 – F.1
Figure 2.33 (a) h(t) for E.2 – F.2    Figure 2.33 (b) h(t) for E.2 – F.2
Figure 2.33 (c) g(y) for E.2 – F.2    Figure 2.33 (d) f(x) for E.2 – F.2
For example 1, the analysis for all three cases (fracture trace set 1 only, fracture trace set
2 only, or fracture trace set 1 + set 2) shows very similar results. A brief list of some of
the common features is given below:
1. Serious departure from a homogeneous Poisson process.
2. Negatively skewed histograms for the inter-event distance analysis suggest that in general
a non-homogeneous type of Poisson process should be used for the modelling.
3. Only a single mode can be observed in h(t) for all three cases, which suggests there are
no cluster features in the point patterns.
4. Compared with points from fracture trace set 2, points from fracture trace set 1
show less deviation from the homogeneous case, which reveals that for this set of
fractures there is less aggregation, i.e., fractures tend to be more evenly distributed.
The degree of deviation can be obtained by inspecting the histogram. The greater skew of
set 2 suggests more serious point aggregation.
5. A logical geological extension to the above point is that fracture trace set 1 was
created before set 2. When set 2 fractures were being formed by later geological
activity, certain parts of the rock had been weakened more seriously than other parts by the
existing set 1 fractures and therefore attracted more new fractures. More
fracture aggregation would be the result of this action. This argument, of course, is
only a speculation based on the distance analysis results. More verification needs to
be done from different angles, especially from the geological history of the site.
6. A common feature of non-homogeneous point patterns is the significantly high
proportion of small nearest event distances compared with the homogeneous case. This is
so because more point aggregation means more point pairs with small inter-event distances and
a stronger tendency toward non-homogeneity. This feature is clearly visible from the g(y)
analysis of the three cases.
7. Since a regular grid covering the whole area of the region being studied is used to
calculate the point to nearest event distance x, the distribution f(x) can be viewed
as a measure of the distribution of points across the whole area. For the case when
the whole area is covered by points realised from a homogeneous point process, f(x)
will show characteristics identical to those of g(y). For cases when points only occupy a certain
area of the whole region, the distribution of f(x) will tend to be uniform. An extreme
case is shown in Figure 2.34 below.
Figure 2.34 (b) h(t) for the case Figure 2.34 (c) h(t) for the case
Figure 2.34 (d) g(y) for the case Figure 2.34 (e) f(x) for the case
For the example 1 data set, a considerable amount of space in the region is not
covered by data points and therefore f(x) tends to skew to the right compared with the
simulated results derived by the Monte Carlo simulations. As a result, the degree of
uniformity of f(x) may be used as an index to describe the proportion of the space
occupied by the point cloud. More consideration may be given to this
speculation at a later stage.
For the example 2 data set, all analyses basically suggest features similar to those of the
example 1 data set. A non-homogeneous process is again suggested for
fracture trace set 1 only, fracture trace set 2 only and the combined fracture traces. Whether or
not the two fracture trace sets form a hierarchy depends on a rigorous test of
dependency between the sets. Relations between different point sets will be covered
in later research.
Suppose that n samples, marked as $N_1, N_2, \ldots, N_n$, are obtained by the
sampling. The most obvious test to see if they are from a Poisson distribution is to
derive a histogram, hence a distribution, from the samples and use a goodness-of-fit
testing technique. There is another quick and effective way to reach the same or a
similar conclusion, which uses the equality of the mean and variance of a
Poisson distribution. We do not, however, know the mean and variance of the
Poisson distribution we are testing and therefore they can only be estimated from the
samples. If $\bar{N}$ is the mean value of the samples, the ratio between the variance and
the mean of the samples, without any distribution assumption, is:
$$r = \frac{1}{(n - 1) \cdot \bar{N}} \sum_{i=1}^{n} (N_i - \bar{N})^2$$
If the hypothesis being tested does not hold and we are still assuming a Poisson point
field, the only implication is that the Poisson process is not homogeneous. In this
case, the ratio r helps to describe the degree of departure of the point pattern from
homogeneity. For this reason, r in quadrat count analysis is given a special name,
the index of dispersion, i.e.,
$$\text{index of dispersion (IOD)} = \frac{1}{(n - 1) \cdot \bar{N}} \sum_{i=1}^{n} (N_i - \bar{N})^2$$
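The index of dispersion is simple to compute from a list of quadrat counts. A minimal sketch (the function name is an illustrative choice):

```python
def index_of_dispersion(counts):
    """Sample variance of the quadrat counts divided by their sample mean."""
    n = len(counts)
    if n < 2:
        raise ValueError("at least two quadrat counts are needed")
    mean = sum(counts) / n
    if mean == 0:
        raise ValueError("all quadrats are empty; the ratio is undefined")
    return sum((c - mean) ** 2 for c in counts) / ((n - 1) * mean)
```

For example, the counts 1, 2, 3 have mean 2 and sum of squared deviations 2, giving an index of 2 / (2 × 2) = 0.5, while perfectly even counts give 0.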
There are about half a dozen different indices, such as the index of cluster size, the index of
patchiness and the index of mean crowding. For a more detailed listing, see Upton []. These
indices are derived more or less from the IOD and do not reveal more
information about the point pattern, and therefore they will not be discussed here.
Based on the definition, a significantly large IOD indicates large variations in the
number of points inside the quadrats, which implies point aggregation. A significantly
small IOD means the variation between quadrat counts is small and the point pattern
is more regular. Diggle [] suggests that the IOD test is powerful against point aggregation,
but weak against point regularity.
$$X^2 = \sum_{i=1}^{n} \frac{(N_i - \bar{N})^2}{\bar{N}}$$
which can be approximated by $\chi^2_{n-1}$ on the condition that n > 6 and $\bar{N} > 1$. When
this criterion is used, lower tail and upper tail tests are possible. The lower tail
(significantly small $X^2$ value) is used to test against regularity and the upper tail
(significantly large $X^2$ value) is used to test against point aggregation.
Note that $X^2 = (n - 1) \cdot \text{IOD}$. In other words, these two analyses are equivalent.
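The two one-sided tests can be sketched with the standard library only. Note the substitution made here: exact chi-square quantiles are replaced by the Wilson-Hilferty approximation, which is adequate for the moderate degrees of freedom used in quadrat analysis but is not the method stated in the report. The function names and the verdict strings are illustrative.

```python
from statistics import NormalDist


def chi2_quantile_wh(p, df):
    """Approximate chi-square quantile (Wilson-Hilferty cube-root formula)."""
    z = NormalDist().inv_cdf(p)
    c = 2.0 / (9.0 * df)
    return df * (1.0 - c + z * c ** 0.5) ** 3


def dispersion_test(counts, alpha=0.05):
    """Return (statistic, verdict); each one-sided test is made at level alpha."""
    n = len(counts)
    mean = sum(counts) / n
    # The chi-square approximation is only advised for n > 6 and mean > 1.
    stat = sum((c - mean) ** 2 for c in counts) / mean  # equals (n - 1) * IOD
    if stat > chi2_quantile_wh(1.0 - alpha, n - 1):
        return stat, "aggregated"
    if stat < chi2_quantile_wh(alpha, n - 1):
        return stat, "regular"
    return stat, "no significant departure"
```

With 50 quadrats (49 degrees of freedom) the approximate upper 5% critical value is about 66.3.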
There are two issues for the sampling scheme: the number of quadrats to use and the
locations of the quadrats. Obviously there is a wide range of choices here but the main
concern for the selection is to satisfy the conditions of independent and representative
sampling. For independent sampling, regularly spaced mutually exclusive quadrats
can serve the purpose, and for representative sampling, randomly located quadrats may
be a better choice. It is also possible to impose the mutually exclusive condition on
the random quadrats, though computational performance may become an issue when a large
number of quadrats is used.
Figure 3.1 shows the difference between the random quadrat sampling scheme and the contiguous
sampling scheme. As can be seen, random sampling may not be able to produce
totally independent samples because some quadrats overlap. If,
however, the total number of quadrats used is not too high, the effect of sample
dependency on the analysis results may not be severe. There is research in distance
sampling analysis on the upper limit of the number of samples for independent
sampling, which is n = 0.1⋅N, where N is the total number of Poisson points (Diggle
[]). For quadrat count analysis, no similar value has been reported to my knowledge. Some
research effort could be directed to investigating this limit if it became an
issue. For contiguous quadrat sampling, the grid can float in the region to get the
most representative samples. If performance is not an issue, covering the whole region
with the sampling grid may be a safer choice. Random selection of a certain number
of quadrats from a contiguous grid can also be used in some cases.
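The two sampling schemes can be sketched as follows for square quadrats in a rectangular region with its lower left corner at the origin. This is an illustration only; the function names and the half-open quadrat boundaries are assumptions, and the random scheme deliberately allows overlapping quadrats as discussed above.

```python
import random


def grid_counts(points, origin, cell, nx, ny):
    """Counts in an nx-by-ny contiguous grid of square cells of side `cell`."""
    counts = [0] * (nx * ny)
    ox, oy = origin
    for x, y in points:
        i = int((x - ox) // cell)  # column index of the cell holding the point
        j = int((y - oy) // cell)  # row index
        if 0 <= i < nx and 0 <= j < ny:
            counts[j * nx + i] += 1
    return counts


def random_quadrat_counts(points, cell, n_quadrats, width, height, rng):
    """Counts in randomly located (possibly overlapping) square quadrats."""
    counts = []
    for _ in range(n_quadrats):
        x0 = rng.uniform(0, width - cell)
        y0 = rng.uniform(0, height - cell)
        counts.append(sum(1 for x, y in points
                          if x0 <= x < x0 + cell and y0 <= y < y0 + cell))
    return counts
```

Shifting `origin` moves the whole contiguous grid within the region, and passing a random subset of the returned counts on to the analysis corresponds to random selection of quadrats from a contiguous grid.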
In our analysis, in order to eliminate the directional effect imposed by the original
proposal, intermediate blocks, whose results differ depending on whether horizontal or vertical
agglomeration is used, are not used. This leaves the Greig-Smith variance being
calculated by:
$$G_q = 4 \cdot T_q - T_{4q} \quad q = 1, 4, 16, 64, \ldots$$
3.2 Quadrat count analysis of simulated point patterns
In this section, we present some quadrat count analyses for four different
types of simulated point patterns: homogeneous, non-homogeneous, cluster and Cox
Poisson processes. The intention here is to make the connection between the
point patterns and the expected quadrat count analysis results.
Figure 3.2 (a) Homogeneous points    Figure 3.2 (b) By random quadrats
Figure 3.2 (c) By regular quadrats - 1    Figure 3.2 (d) By regular quadrats - 2
Figure 3.2 (e) By regular quadrats - 3    Figure 3.2 (f) By Greig-Smith grid
As can be seen from Figure 3.2 (b), the index of dispersion (IOD) remains more or
less constant at a value of around 1.0 for different quadrat sizes. It also follows the
curve of the average IOD based on the Monte Carlo simulation. The χ² values for
different quadrat sizes are also calculated and shown as the green curve in the figure.
The number of degrees of freedom in this case is 49 and the 95% confidence critical
value is 67. As can be seen from the figure, most of the χ² values are not significant.
Only the χ² value for the quadrat size of about 0.2 exceeds the critical value. This
phenomenon, however, is not consistent, as it does not occur in other runs
(i.e., with different random quadrats), and therefore the χ² values for different
quadrat sizes should be considered not significant. In the figure, the mean count and the
quadrat count variance are also calculated and displayed. For the homogeneous case,
these two statistics increase quadratically with the size of the quadrats, as can be
seen from the figure.
For quadrat count analysis using regular quadrats, three different options are used. In
the examples shown here, the grid covers 80% of the region. The first
option is for the grid to be fixed at a certain location, for example, starting at 0.1 and
ending at 0.9 (relative scales), and the result is shown in Figure 3.2 (c). The second
option is for the whole grid to be located randomly in the region and the result is
shown in Figure 3.2 (d). The last option is to fix the grid at a certain location but to select
only a certain proportion of the quadrats inside the grid randomly for the
analysis, and the result for this option is shown in Figure 3.2 (e). The results in these
three figures show very similar features. Compared with random quadrat analysis, two
interesting differences are obvious. Firstly, the χ² values for small quadrat sizes are
extremely significant, implying that at small scales the point process is not
homogeneous. This is true as any point process can be viewed as non-homogeneous
if the scale used is small enough. The reason this feature does not show up in random
quadrat analysis is considered to be the fixed number of random quadrats used
(50, compared with about 6000 quadrats used in regular grid quadrat
analysis); in the case of small quadrat sizes, 50 samples may not be representative.
Secondly, note the difference between the shapes of the 95% confidence envelopes
for the homogeneous Monte Carlo simulation. The difference between the upper and
lower 95% envelope values vanishes (as it should) as the quadrat size
decreases in the regular grid quadrat case, but this is not so in the random quadrat case.
Non-representative samples or sample correlations may be the reason. These
features suggest that, in general, regular grid quadrat analysis should give a more
reliable analysis result.
For the Greig-Smith analysis, the result is presented in Figure 3.2 (f). There are no
significant variations in the variance except for the large quadrat sizes. For large
quadrat sizes, however, the number of quadrats available for calculating the variances
is normally small and the values are considered less reliable than in the smaller quadrat
cases. In this example, for quadrat sizes less than 25% of the size of the region, the
variances are more or less constant, which implies that no point aggregation is detected
according to the intentions of the analysis. Note also that the empirical variances follow
the variances from the Monte Carlo simulations quite well.
Figure 3.3 (a) Non-homogeneous points    Figure 3.3 (b) By random quadrats
Figure 3.3 (c) By regular quadrats - 1    Figure 3.3 (d) By regular quadrats - 2
Figure 3.3 (e) By regular quadrats - 3    Figure 3.3 (f) By Greig-Smith grid
From these figures, it is not difficult to conclude from the IOD values that a serious
departure from the homogeneous point process has been detected by the analysis. The χ²
values are all extremely significant, which also implies the departure of the point
pattern from CSR. The Greig-Smith variances also show the discrepancy from CSR but
no sign of point aggregation.
One of the interesting points shown by these figures concerns the cases in which
the quadrat sizes are small. For these cases, the results do not actually suggest
departure of the point pattern from CSR. This can be considered one of the weak
points of quadrat count testing. In other words, quadrat count analysis is not
sensitive for small quadrat sizes.
Figure 3.4 (b) – (f) display the results of the quadrat count analysis. Serious departure
from CSR is again evident in all results.
Figure 3.4 (a) Poisson cluster points    Figure 3.4 (b) By random quadrats
Figure 3.4 (c) By regular quadrats - 1    Figure 3.4 (d) By regular quadrats - 2
Figure 3.4 (e) By regular quadrats - 3    Figure 3.4 (f) By Greig-Smith grid
Compared with the analysis results for non-homogeneous points (Figure 3.3), the
notable difference is in the IOD values and G-S variances for small quadrat sizes.
The greater values in these figures suggest there is more point aggregation at smaller
scales in this example than in the non-homogeneous case presented above.
This is true as the cluster process creates point aggregations at the scale of 10 (or 0.1
in relative scale). From these figures, it is only possible to conclude that point
aggregation happens at scales roughly less than 20 (or 0.2 relative), but no
further detail can be obtained.
Quadrat count analysis is considered to be good at detecting cluster sizes but, based on
our examples, this is not generally the case. Several factors contribute to the
effectiveness of this detection. The most important factor is that the quadrats used are
arranged in such a way that they coincide with the locations of the clusters. An idealised
example is given in Figure 3.5 (a). In this example, the quadrats used happen to be placed in
such a way that the main parts of most of the point clusters are contained within the
quadrats. In this case, there will be a peak value of IOD for this quadrat size (which
is 10), as can be seen from Figure 3.5 (b). The cluster size in this case can easily be
identified from such a quadrat count analysis. However, in practice, this kind of
quadrat arrangement is unlikely to occur consistently in cluster point pattern analysis
and therefore detecting cluster sizes by quadrat count analysis is not always reliable.
For example, when the same point pattern as in Figure 3.5 (a) is analysed again using the same
regular grid, but repositioned in a slightly different location as shown in Figure 3.5
(c), the analysis result is as displayed in Figure 3.5 (d). As can be seen, the peak value
of IOD implying the cluster size disappears altogether. A similar result is
obtained by random quadrat count analysis as displayed in Figure 3.5 (e) and (f). The
cluster size in the latter cases fails to be detected.
Figure 3.5 (a) Special case of clusters (grid used shown)    Figure 3.5 (b) IOD results
Figure 3.5 (c) Different grid location (grid used shown)    Figure 3.5 (d) IOD results
Figure 3.5 (e) IOD by random quadrats    Figure 3.5 (f) IOD by random quadrats
Figure 3.5 (g) Special case of clusters (grid used shown)    Figure 3.5 (h) IOD results
Figure 3.5 (i) G-S variance by 70×70 grid    Figure 3.5 (j) G-S variance by 75×75 grid
The same is true when Greig-Smith variance analysis is used. For the point
pattern displayed in Figure 3.5 (a), the Greig-Smith variance analysis result for the
70×70 grid is given in Figure 3.5 (i). The peak in the variance at a grid cell size of
about 10 is obvious. However, this result is again sensitive to changes in the grid
used. Figure 3.5 (j) gives the analysis result for the same point pattern using the
75×75 grid. As can be seen, the peak present in Figure 3.5 (i) disappears. If we use
the 70×70 grid to analyse the point pattern in Figure 3.5 (g), the peak is not present
either. This supports the conclusion reached above that quadrat count analysis is not
a reliable tool for detecting cluster size. Quadrat count analysis can provide reliable
results for detecting the departure of a point pattern from CSR, but not for detecting
the size of clusters.
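The sensitivity to grid position can be reproduced with a small sketch. It assumes that
the IOD is the variance-to-mean ratio of the quadrat counts and uses a synthetic pattern
of compact clusters; the function names, the pattern and the grid offset are all
hypothetical and are not taken from the analysis program used for the figures.

```python
import random
from statistics import mean, variance


def quadrat_counts(points, size, cell, offset=0.0):
    """Count points in square quadrats of side `cell` on a grid shifted by `offset`.

    Only quadrats lying wholly inside the (0, size) x (0, size) region are kept.
    """
    n = int((size - offset) // cell)  # whole quadrats per axis
    counts = [[0] * n for _ in range(n)]
    for x, y in points:
        if x < offset or y < offset:
            continue  # point lies outside the shifted grid
        i, j = int((x - offset) // cell), int((y - offset) // cell)
        if i < n and j < n:
            counts[i][j] += 1
    return [c for row in counts for c in row]


def index_of_dispersion(counts):
    """Variance-to-mean ratio of the quadrat counts (assumed IOD definition)."""
    return variance(counts) / mean(counts)


# Hypothetical test pattern: 25 compact clusters of 20 points, each centred in
# a distinct cell of a 10 x 10 grid over the region (0, 100) x (0, 100).
rng = random.Random(1)
cells = rng.sample([(i, j) for i in range(10) for j in range(10)], 25)
points = [(10 * i + 5 + rng.uniform(-3, 3), 10 * j + 5 + rng.uniform(-3, 3))
          for i, j in cells for _ in range(20)]

iod_aligned = index_of_dispersion(quadrat_counts(points, 100, 10))
iod_shifted = index_of_dispersion(quadrat_counts(points, 100, 10, offset=5))
print(iod_aligned, iod_shifted)
```

With the grid aligned to the clusters, every cluster falls in a single quadrat and the
IOD is large; shifting the same grid by half a cell splits each cluster over four
quadrats and the IOD drops, although the point pattern itself has not changed.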
\[ \mathrm{mean}(u, v) = 0.1 \cdot e^{\frac{-u - 2v}{100}} \]
\[ \mathrm{variance}(u, v) = 0.015 \cdot e^{\frac{-u - 2v}{100}} \]
where u and v are the horizontal and vertical coordinates. At each location, the mean,
together with a variance that is about 15% of the mean value, defines a normal
distribution for the density at that location. A random value is then generated from
this distribution to serve as the realisation of the density field at the location. A
realisation of this Cox process is given in Figure 3.6 (a).
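The construction of the random density field can be sketched as below. Several details
are assumptions made only for illustration: the exponent (-u - 2v)/100 is applied to
both the mean and the variance, negative density draws are truncated at zero, the field
is sampled on unit cells over the region (0,100)×(0,100), and points are placed by a
Poisson count per cell. All function names are hypothetical.

```python
import math
import random


def mean_density(u, v):
    return 0.1 * math.exp((-u - 2.0 * v) / 100.0)


def variance_density(u, v):
    return 0.015 * math.exp((-u - 2.0 * v) / 100.0)


def sample_density(u, v, rng):
    """Draw one density value from N(mean, variance); truncation at 0 is an assumption."""
    lam = rng.gauss(mean_density(u, v), math.sqrt(variance_density(u, v)))
    return max(lam, 0.0)


def poisson(mu, rng):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit, k, p = math.exp(-mu), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def simulate_cox(size=100, cell=1.0, seed=0):
    """Realise the density on a grid of cells, then scatter points within each cell."""
    rng = random.Random(seed)
    pts = []
    n = int(size / cell)
    for i in range(n):
        for j in range(n):
            lam = sample_density((i + 0.5) * cell, (j + 0.5) * cell, rng)
            for _ in range(poisson(lam * cell * cell, rng)):
                pts.append(((i + rng.random()) * cell, (j + rng.random()) * cell))
    return pts


cox_points = simulate_cox()
print(len(cox_points))
```

A finer cell size gives a closer approximation to a continuous density field at the
cost of more random draws.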
Fig. 3.6 (a) Cox points Figure 3.6 (b) By random quadrats
Figure 3.6 (c) By regular quadrats - 1 Figure 3.6 (d) By regular quadrats -2
Figure 3.6 (e) By regular quadrat - 3 Figure 3.6 (f) By Greig-Smith grid
Results from the quadrat count analysis are given in Figure 3.6 (b) – (f). These figures
show features similar to those of the analysis for the non-homogeneous case, Figure 3.3.
Apart from the conclusion that there is a serious departure of the point pattern from
CSR, no other specific features are apparent.
Figure 3.7 (c) By regular quadrats - 1 Figure 3.7 (d) By regular quadrats -2
Figure 3.7 (e) By regular quadrat - 3 Figure 3.7 (f) By Greig-Smith grid
Fig. 3.8 (a) Example 1 – set 1 Figure 3.8 (b) By random quadrats
Figure 3.8 (c) By regular quadrats - 1 Figure 3.8 (d) By regular quadrats -2
Figure 3.8 (e) By regular quadrat - 3 Figure 3.8 (f) By Greig-Smith grid
Fig. 3.9 (a) Example 1 – set 2 Figure 3.9 (b) By random quadrats
Figure 3.9 (c) By regular quadrats - 1 Figure 3.9 (d) By regular quadrats -2
Figure 3.9 (e) By regular quadrat - 3 Figure 3.9 (f) By Greig-Smith grid
Figure 3.10 (c) By regular quadrats - 1 Figure 3.10 (d) By regular quadrats -2
Figure 3.10 (e) By regular quadrat - 3 Figure 3.10 (f) By Greig-Smith grid
Fig. 3.11 (a) Example 2 – set 1 Figure 3.11 (b) By random quadrats
Figure 3.11 (c) By regular quadrats - 1 Figure 3.11 (d) By regular quadrats -2
Figure 3.11 (e) By regular quadrat - 3 Figure 3.11 (f) By Greig-Smith grid
Fig. 3.12 (a) Example 2 – set 2 Figure 3.12 (b) By random quadrats
Figure 3.12 (c) By regular quadrats - 1 Figure 3.12 (d) By regular quadrats -2
Figure 3.12 (e) By regular quadrat - 3 Figure 3.12 (f) By Greig-Smith grid
These figures reveal nothing more than the conclusion that the point patterns being
analysed depart seriously from CSR. No point aggregation features can be inferred
from these results. Many of the figures are presented here solely to give a complete
set of the analysis.
pattern. Quite often, characteristics such as cluster size cannot be detected by this
method.
4. K-function analysis
The last pattern analysis tool to be discussed in this report is the K-function analysis,
which belongs to the category of second moment analysis of point density. The
analysis is equivalent to the variogram analysis in geostatistical modelling and it
reveals some valuable spatial correlations for the point density measurement.
Using these definitions, and supposing λ(X) can be evaluated accurately at all locations
within the region ℜ being considered, the random points are transformed into a
random field quantified by the density values. Tools such as geostatistics can then be
used to analyse and model the variable. There are, however, two main obstacles to
taking this route directly. Firstly, the point density λ(X) is difficult to estimate in
an objective and accurate manner. Secondly, geostatistics only describes a model that
is correct at the global scale and lacks a description of local details. This makes it
difficult for the technique to model sensibly some special point processes that require
both global and local models. A handy example is the cluster process, which needs a
global model to describe the distribution of clusters within the region and a local
model to describe the point distribution within clusters. This argument does not
imply that geostatistics is not applicable to point process modelling. It may be
worthwhile to do some comparative analysis at a later stage.
Meanwhile, a tool suitable for modelling the spatial correlation of point density for a
point process is needed. The K function is such a tool. A simple and practical
definition of the K function is as follows:
\[ K(t) = \frac{E[\text{number of further events within distance } t \text{ of an arbitrary event}]}{\lambda} \]
For a formal definition of the K function based on the reduced Palm distribution, see
Cressie []. The point density appears in the denominator in order to normalise K(t),
hence eliminating the scaling effect of the density value. In other words, K(t)
corresponds to the expected number of further events within distance t of an arbitrary
event when the point density is unity. Take the stationary point process as an
example. According to the definition of point density λ(X), the expected number of
events within a volume V can be expressed as:
\[ N(V) = \int_V \lambda(X) \cdot dX \]
For stationary cases, N(V) = λ⋅V, and therefore the K function in these cases will be:
\[ K(t) = \frac{\lambda \cdot V(t)}{\lambda} = V(t) \]
which is independent of the density value λ, i.e., point patterns that differ only in
their density values can be described with the same model. For the two dimensional
case, K(t) = π⋅t².
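As a quick numerical check of K(t) = V(t) for the stationary case, the sketch below
evaluates the d-dimensional sphere volume formula given in Section 2 and confirms that
it reduces to π⋅t² for d = 2. The function name is hypothetical.

```python
import math


def ball_volume(t, d):
    """Volume of a d-dimensional sphere of radius t: pi^(d/2) * t^d / Gamma(1 + d/2)."""
    return math.pi ** (d / 2) * t ** d / math.gamma(1 + d / 2)


# Under a stationary Poisson process K(t) = V(t); in two dimensions this is pi * t^2.
for t in (0.5, 1.0, 2.0):
    print(t, ball_volume(t, 2), math.pi * t * t)
```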
Since the K function is also a second order measure directly related to point density,
their relation can be formally established. Conditioned on a known arbitrary event
located at X, we can find the conditional probability of another event at location Y as:
\[ P\{N(dY) > 0 \mid N(dX) = 1\} = \frac{P\{N(dY) > 0,\ N(dX) = 1\}}{P\{N(dX) = 1\}} \]
where dX and dY are infinitesimal volumes centred at locations X and Y. If dY is chosen
so that only one event is possible inside the volume, we then have the following
relations:
\[ E\{N(dY) = 1 \mid N(dX) = 1\} = 1 \cdot P\{N(dY) = 1 \mid N(dX) = 1\} = P\{N(dY) = 1 \mid N(dX) = 1\} \]
\[ E\{N(dY) = 1,\ N(dX) = 1\} = 1 \cdot P\{N(dY) = 1,\ N(dX) = 1\} = P\{N(dY) = 1,\ N(dX) = 1\} \]
\[ E\{N(dX) = 1\} = 1 \cdot P\{N(dX) = 1\} = P\{N(dX) = 1\} \]
Note dY is only an infinitesimal volume centred at location Y and to get the total
expected number of events within distance t of the location X, which is λ⋅K(t)
according to definition, dY must be integrated over the ball centred at X. For
example, in two dimensional case, the integration must be done over the area of the
circle centred at X, i.e.,
\[ \lambda(X) \cdot K(X, t) = \int_0^{2\pi} \int_0^t \frac{\lambda_2(X, y)}{\lambda(X)} \cdot y \cdot dy \cdot d\theta = \frac{2\pi}{\lambda(X)} \int_0^t \lambda_2(X, y) \cdot y \cdot dy \]
Note that dy here means the integration over y and is different from the meaning of dY
discussed above. The above relation can be re-arranged as:
\[ \lambda_2(X, t) = \frac{[\lambda(X)]^2}{2\pi t} \cdot K'(X, t) \]
For the stationary case, X can be dropped from the relation:
\[ \lambda_2(t) = \frac{\lambda^2}{2\pi t} \cdot K'(t) \]
For a point process in d dimensions, a similar integration can be done and the result is
given below:
\[ \lambda_2(t) = \frac{\lambda^2 \, \Gamma(1 + \frac{d}{2})}{d \, \pi^{d/2} \cdot t^{d-1}} \cdot K'(t) \]
where λ can be replaced with the empirical intensity N/V(ℜ). This estimate does not
include pairs of events for which event j lies outside the region ℜ and is therefore
not observable. In other words, the edge effect is not taken into account and the
estimate is biased. To obtain an unbiased estimate for K(t), we can use the guard
volume technique discussed in Section 2 of this report. This approach, however,
effectively throws away a considerable number of valuable points. Another approach is
to take into account the conditional probability pij that event j is observed given
that the distance between it and event i is dij.
For the two dimensional case, pij can be calculated as the proportion of the
circumference of the circle centred at i with radius dij that lies inside the region ℜ.
As shown in Figure 4.1, pij is the ratio of the solid part of the circumference to the
whole circumference of the circle.
When the circle is fully enclosed by the region ℜ, pij=1. In other words, the edge
does not affect the pair. When the circle is only partially enclosed, pij<1, which
means that some events at the same distance from event i as event j may lie outside
the region ℜ, and therefore the calculated indicator value Iij(t) must be
compensated. To compensate Iij(t), the weight wij=1/pij is used to increase the
indicator value going into the calculation of K(t).
Figure 4.1 (a) Edge correction Figure 4.1 (b) Numerical calculation
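The way the weights enter the calculation can be sketched as follows. The estimator
form used here, V(ℜ)/N² times the sum of wij⋅Iij(t) over all ordered pairs, is an
assumption (a common form obtained by replacing λ with N/V(ℜ)); the function name and
the `weight` callback are hypothetical. With no callback every weight is 1, which
corresponds to the estimate without edge correction.

```python
import math


def k_estimate(points, area, t_values, weight=None):
    """Estimate K(t) as area / N^2 * sum over ordered pairs of w_ij * I(d_ij <= t).

    `weight(point_i, d_ij)` should return w_ij = 1 / p_ij; if omitted, w_ij = 1.
    """
    n = len(points)
    totals = [0.0] * len(t_values)
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if i == j:
                continue
            d = math.hypot(xi - xj, yi - yj)
            w = 1.0 if weight is None else weight(points[i], d)
            for k, t in enumerate(t_values):
                if d <= t:
                    totals[k] += w
    return [area * s / (n * n) for s in totals]


# Four events at the corners of a unit square: each has two neighbours at
# distance 1 and one at distance sqrt(2).
square = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
print(k_estimate(square, 1.0, [0.5, 1.0, 1.5]))  # [0.0, 0.5, 0.75]
```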
A similar correction giving an unbiased estimate of K(t) in the d-dimensional case can
also be obtained. In this case, surface areas of the d-dimensional sphere are used
to calculate pij. Note that the total surface area of a d-dimensional sphere with
radius r is:
\[ S = \frac{d \, \pi^{d/2} \cdot r^{d-1}}{\Gamma(1 + \frac{d}{2})} \]
In practice, K(t) is normally evaluated for a certain number of distances t and the
graph of K(t) vs t is then plotted. There is a restriction on the selection of t values
for the evaluation: t cannot be too large. From the above discussion of the edge
correction, as t becomes large it is possible that pij→0 and wij→∞. Diggle [] suggests
half the maximum possible distance within the region ℜ as the upper bound for t. For a
unit square region, this works out to be about 0.7.
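The value for the unit square can be checked directly, assuming that the maximum
possible distance within the region is the length of its diagonal:

\[ t_{\max} = \frac{1}{2}\sqrt{1^2 + 1^2} = \frac{\sqrt{2}}{2} \approx 0.707 \]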
For analysis purposes, the graph of [K(t)-πt²] vs t is normally used instead of K(t) vs
t. Because K(t)=πt² for a homogeneous Poisson point process, the plot of [K(t)-πt²] vs
t will be a horizontal line at 0 for homogeneous cases. For cases other than the
homogeneous one, the plot will demonstrate directly the degree of departure of
the point pattern from CSR.
where d1 = min(xi, a-xi), d2 = min(yi, b-yi), and a and b are the sizes of the rectangle
in the x and y directions. As defined, d1 and d2 are the shorter distances from i to
the two vertical edges and to the two horizontal edges of the rectangular region.
The above equations are only applicable when dij is in the range [0, ½ min(a, b)],
which is often inadequate in practical analysis, especially when the region is an
extremely slim (flat) rectangle (i.e., a and b differ considerably). This restriction
provides only a partial edge correction for the evaluation of K(t), covering small t
values rather than the whole range of interest. We therefore calculate pij purely by
the numerical approximation described below.
As shown in Figure 4.1 (b), we divide the circumference of the circle into L equal
length segments. The probability pij is then calculated as:
\[ p_{ij} = \frac{\sum_{L} (s_l \mid s_l \text{ is inside } \Re)}{\sum_{L} \left[ (s_l \mid s_l \text{ is inside } \Re) + (s_l \mid s_l \text{ is outside } \Re) \right]} \]
If the value of L is large enough, the numerical method should provide a very good
approximation. Acceptable L value obviously depends on the size of the region. In
most of the cases we use the value of 50 for L and the differences between the
numerical results and those calculated by the above equations for small dij values are
negligible.
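A minimal sketch of this numerical approximation for a rectangular region is given
below. Classifying each segment as inside or outside by its midpoint is an assumption
made for simplicity, and the function and parameter names are hypothetical.

```python
import math


def arc_fraction_inside(xi, yi, d, a, b, segments=50):
    """Approximate p_ij: fraction of the circle of radius d centred at (xi, yi)
    that lies inside the rectangle [0, a] x [0, b].

    The circumference is split into `segments` equal arcs and each arc is
    classified by its midpoint.
    """
    inside = 0
    for l in range(segments):
        theta = 2.0 * math.pi * (l + 0.5) / segments  # midpoint angle of arc l
        x = xi + d * math.cos(theta)
        y = yi + d * math.sin(theta)
        if 0.0 <= x <= a and 0.0 <= y <= b:
            inside += 1
    return inside / segments


p_centre = arc_fraction_inside(50, 50, 10, 100, 100)  # circle fully enclosed
p_edge = arc_fraction_inside(5, 50, 10, 100, 100)     # about 2/3 of the circle inside
p_corner = arc_fraction_inside(0, 0, 10, 100, 100)    # about 1/4 of the circle inside
print(p_centre, p_edge, p_corner)
```

The edge correction weight is then wij = 1/pij. Because the same loop works for any dij,
no separate formulae are needed for circles that cross more than one edge.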
The advantage of using the numerical approximation is that pij can be calculated for
large dij values up to the maximum possible distance in the region. This is required
for a complete edge correction in the evaluation of K(t). As discussed above, wij may
become unbounded for very large dij values, which will push the corrected K(t) values
for large t towards infinity. Another advantage of this approach is that the technique
can be readily adapted to higher dimensional cases where analytical solutions for pij
are not available. More will be said on this point when we come to deal with three
dimensional problems.
The second issue that needs discussing is the evaluation of the second order point
density λ2(t) and the covariance density γ(t). From Section 4.1, we understand that the
evaluation of these two densities involves the derivative of K(t), i.e., K′(t). As
only discrete values of K(t) are available, K′(t) can only be evaluated numerically at
the distances t where a K(t) value is available. The average of the forward and
backward derivatives at t is used as K′(t) at that distance.
As illustrated in Figure 4.2, the forward and backward derivatives at distance t can be
written as:
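\[ K'(t)\big|_{\text{forward}} = \frac{K(t + \Delta t) - K(t)}{\Delta t} \]
\[ K'(t)\big|_{\text{backward}} = \frac{K(t) - K(t - \Delta t)}{\Delta t} \]
\[ K'(t) = \frac{1}{2} \left( K'(t)\big|_{\text{forward}} + K'(t)\big|_{\text{backward}} \right) \]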
[Figure 4.2: curve of K(t) against t, with the values K(t-∆t), K(t) and K(t+∆t) marked at
distances t-∆t, t and t+∆t]
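The averaging of the forward and backward derivatives can be sketched as below,
together with the two dimensional stationary relation λ2(t) = λ²⋅K′(t)/(2πt). The
sketch assumes equally spaced t values and evaluates only interior distances, since
the treatment of the two end points is not stated here; the function names are
hypothetical.

```python
import math


def k_derivative(k_values, dt):
    """Average of the forward and backward differences at each interior distance."""
    derivs = []
    for m in range(1, len(k_values) - 1):
        forward = (k_values[m + 1] - k_values[m]) / dt
        backward = (k_values[m] - k_values[m - 1]) / dt
        derivs.append(0.5 * (forward + backward))
    return derivs


def second_order_density(k_values, t_values, lam):
    """lambda_2(t) = lam^2 / (2 pi t) * K'(t) at the interior distances (2-D, stationary)."""
    dt = t_values[1] - t_values[0]  # assumes equally spaced distances
    derivs = k_derivative(k_values, dt)
    return [lam * lam * dk / (2.0 * math.pi * t)
            for dk, t in zip(derivs, t_values[1:-1])]


# Check against the homogeneous Poisson case, K(t) = pi t^2, where lambda_2 = lambda^2.
t_values = [0.0, 1.0, 2.0, 3.0, 4.0]
k_csr = [math.pi * t * t for t in t_values]
lambda2 = second_order_density(k_csr, t_values, lam=0.02)
print(lambda2)
```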
γ (t ) = 0
As can be seen, γ(t) = 0, which implies there is no correlation between point densities
in different locations for this process.
The first data set is generated from a homogeneous Poisson process with λ=0.02 for
the same region (0,100)×(0,100) used in previous demonstrations. Figure 4.3 (a) is
one realisation of the process.
The functions K(t), λ2(t) and γ(t) for this point pattern are given in Figure 4.3 (b).
The K function values from 100 Monte Carlo simulations are also plotted. The green
and pink lines are the 95% confidence envelope of [K(t)-πt²] when the underlying
process is homogeneous. As can be seen from the figure, [K(t)-πt²] from the data
more or less follows the average value from the Monte Carlo simulations and
oscillates around 0, implying a homogeneous point pattern. γ(t) also oscillates
around 0, which is again the behaviour of a homogeneous Poisson point pattern, as
discussed above. Note that this analysis is conducted up to a distance of 90.
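A Monte Carlo envelope of this kind can be sketched as follows. How the envelope in
Figure 4.3 (b) was actually constructed is not stated, so the sketch assumes pointwise
order statistics over the simulated curves. To keep it short it also uses an estimate
of K(t) without edge correction, which is only reasonable for small t; all names and
the demonstration parameters are hypothetical.

```python
import math
import random


def k_minus_pi_t2(points, area, t_values):
    """K(t) - pi t^2 from an estimate of K(t) WITHOUT edge correction (small t only)."""
    n = len(points)
    out = []
    for t in t_values:
        pairs = sum(1 for i in range(n) for j in range(n)
                    if i != j and math.dist(points[i], points[j]) <= t)
        out.append(area * pairs / (n * n) - math.pi * t * t)
    return out


def csr_envelope(n_points, size, t_values, n_sim=100, level=0.95, seed=0):
    """Pointwise envelope of K(t) - pi t^2 over simulated homogeneous patterns."""
    rng = random.Random(seed)
    sims = []
    for _ in range(n_sim):
        pts = [(rng.uniform(0, size), rng.uniform(0, size)) for _ in range(n_points)]
        sims.append(k_minus_pi_t2(pts, size * size, t_values))
    k = int((1 - level) / 2 * n_sim)  # simulations trimmed from each tail
    lower, upper = [], []
    for column in zip(*sims):
        ordered = sorted(column)
        lower.append(ordered[k])
        upper.append(ordered[n_sim - 1 - k])
    return lower, upper


# Small demonstration: 50 points in (0, 100) x (0, 100), 40 simulations.
lower, upper = csr_envelope(50, 100, [2.0, 5.0, 10.0], n_sim=40)
print(lower, upper)
```

An observed [K(t)-πt²] curve lying outside the band at some distance suggests a
departure from the homogeneous process at that scale.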
To demonstrate the effect of edge correction on the evaluation of K(t), λ2(t) and γ(t),
we make the following comparison. Figure 4.4 (a) and (b) show the empirical values
of K(t), λ2(t) and γ(t) with edge correction imposed for a homogeneous Poisson
process. Figure 4.4 (c) and (d) give the evaluated values without edge correction.
As can be seen, the K(t), λ2(t) and γ(t) values without edge correction can be very
misleading. Significant departure from the values they should have is obvious after a
very short distance. All functions wrongly suggest a non-CSR point pattern. The
function values with edge correction, on the other hand, agree well with the analytical
results and all suggest the correct point process.
As can be seen, all functions [K(t)-πt²], λ2(t) and γ(t) display significant departures
from the values for the CSR case. A few interesting points are worth listing:
• The [K(t)-πt²] curve has a roughly parabolic shape, concave downward. It increases
with t when t is small, peaks at a certain distance (to be discussed) and then
decreases as t increases for large t values. By definition, K(t) is directly
proportional to the expected number of points within an area of πt². Therefore
a positive value of [K(t)-πt²] implies that the actual number within the area is
greater than that expected if the points were evenly distributed in the
region (i.e., homogeneous distribution of points). In other words, point
aggregation occurs within the area πt² defined by the distance. A negative
[K(t)-πt²], on the other hand, signifies that the actual number is lower than that
expected for the even distribution case. This will always happen for large
distances t, and hence large areas covered. These behaviours are exactly
the basic characteristics of a non-homogeneous point distribution across the
region as a whole.
• The distance t at which [K(t)-πt²] peaks represents a balancing point at which the
degree of point aggregation starts decreasing as the area πt² increases. This
point corresponds to the point where the covariance γ(t) changes sign from
positive to negative, implying negative correlation between point densities
separated by more than this distance. After the balancing point, [K(t)-πt²]
continues decreasing as t increases and eventually becomes negative, implying
a lower number of points than expected. For very large distances, [K(t)-πt²]
becomes unbounded. This may be caused by the unbounded edge correction
weight wij, and such values should therefore be discarded. Note that in this
example the distance of the balancing point is about 50, half the length of the
edge of the rectangular region.
• As discussed, the covariance density γ(t) changes from positive to negative as t
increases. The distance where γ(t) = 0 represents the boundary within which the
correlation between the point densities of two locations is positive. Density
values at two locations separated by more than this distance are negatively
correlated. As mentioned above, this distance corresponds to the peak value of
[K(t)-πt²]. Recall that in geostatistics the structural analysis always suggests a
range of influence beyond which two random variables are no longer correlated,
i.e., correlation coefficient = 0. This seems not to be the case in point density
analysis. The influence of the point density at one location on another location
is either positive or negative (except for homogeneous cases). The reason lies in
the construction of the reference homogeneous (average) point distribution used
for the density analysis, which takes all points and the whole region. The K
function analysis is actually about the difference between the actual and the
reference point patterns, and therefore density variables in the whole region are
correlated.
• The shape of the curve for γ(t) may be important. It may be used to reveal the
characteristics of the underlying density model of the point process. We will
come back to this point in later discussions.
(In the following analysis, I use the word “bump”. Please let me know if you can think
of a better term.)
Figure 4.6 (a) and (b) show the generated pattern and the results for the empirical
K(t), λ2(t) and γ(t). The most interesting feature of the graph is its bumpy character.
This is in fact a feature unique to cluster point processes. To demonstrate this, we
start with just one cluster, such as the one shown in Figure 4.7 (a). The K(t), λ2(t)
and γ(t) for this point set are given in Figure 4.7 (b) and (c). As can be seen, one
bump is present for [K(t)-πt²], but none for the others. For two clusters, the results
are shown in Figure 4.8: two bumps are observed for [K(t)-πt²] and one for λ2(t) and
γ(t). For three clusters, Figure 4.9 shows four bumps for [K(t)-πt²] and three for
λ2(t) and γ(t). In fact, for n clusters, the number of bumps present in the curve of
[K(t)-πt²] vs t will be ½n(n−1)+1, and ½n(n−1) for the curves of λ2(t) and γ(t). This
behaviour can be explained as follows:
[Figures 4.7 (a), 4.8 (a) and 4.9 (a): point clusters of size ≈ 10. In the two cluster
case, clusters A and B are separated by a distance ≈ 58. In the three cluster case,
clusters A, B and C are separated by distances ≈ 58, ≈ 88 and ≈ 32.]
For the two cluster case, Figure 4.8 (a), when the distance t is within the scale of the
cluster size, the behaviour of K(t), λ2(t) and γ(t) is the same as in the single cluster
case except for the absolute function values. As t increases, and before the area
defined by t spans both clusters A and B, K(t) and γ(t) stay unchanged and λ2(t)
remains zero. As the area starts to include points from both clusters at the same time,
the values of these functions start increasing until the maximum separation distance
(58 in the example) is reached. After that, K(t) and γ(t) again stay unchanged and
λ2(t) again becomes zero.
As for the three cluster case, Figure 4.9 (a), K(t), λ2(t) and γ(t) behave in the same
way as discussed above; only the absolute values of the functions are different. As
the distance t increases, K(t), λ2(t) and γ(t) remain unchanged until the area defined
by t can cover points from at least two of the three clusters. The function
values then start increasing until they stabilise at another level. Note that in this
case there are three different distances separating the clusters, and therefore three
further stabilising stages, or three further bumps, can be observed.
For the case of n clusters, characteristics similar to those discussed above can be
expected for K(t), λ2(t) and γ(t). As mentioned above, the total number of bumps is
equal to ½n(n−1)+1 for [K(t)-πt²] and ½n(n−1) for λ2(t) and γ(t). The actual
numbers, however, may be different depending on the distribution of clusters across the
region. For example, if all clusters are separated by the same distance, only one
further bump will be observed, as points from all clusters come into effect at the same
distance. Another possible case is when the differences in the distances between
clusters are small, or there are too many clusters in the region, so that it is not
possible to distinguish two or more very close bumps. In the extreme case when n→∞,
i.e., there is an infinite number of clusters separated by all possible distances, an
infinite number of bumps will make up the curves, which will actually come out as
smooth curves, i.e., no bumps at all.
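The counting argument above can be illustrated with a short sketch that takes the
cluster centres, lists the distinct centre-to-centre distances and reports the number
of bumps to be expected. The function name, the tolerance used to merge nearly equal
distances and the example coordinates are all hypothetical.

```python
import math
from itertools import combinations


def expected_bumps(centres, tol=1.0):
    """Return (bumps in K(t) - pi t^2, bumps in lambda_2(t) and gamma(t)).

    Centre-to-centre distances that differ by no more than `tol` are merged,
    since their bumps cannot be distinguished.
    """
    distances = sorted(math.dist(p, q) for p, q in combinations(centres, 2))
    distinct = []
    for d in distances:
        if not distinct or d - distinct[-1] > tol:
            distinct.append(d)
    return len(distinct) + 1, len(distinct)


print(expected_bumps([(0, 0)]))                        # (1, 0): single cluster
print(expected_bumps([(0, 0), (30, 40), (100, 0)]))    # (4, 3): three distinct distances
print(expected_bumps([(0, 0), (30, 40), (60, 0)]))     # (3, 2): two distances coincide
```

The last example shows how equal separation distances reduce the count below
½n(n−1)+1, as noted above.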
In most cases this estimation is correct. It can still be used when the distance between
clusters is less than the cluster size, as in this case clusters join together to form
larger clusters and the first bump can still be used to estimate the average “joined”
cluster size. The only exception is when all clusters mix together to form a “smeared”
picture of points in the region so that the point clusters visually disappear. In this
case, there may still be a first bump in the [K(t)-πt²] curve, which may or may not
correctly identify the size of the underlying point clusters, or the first bump may not
show up at all. To demonstrate this point, consider the following examples.
Figure 4.10 (a) is realised from a cluster process with a cluster radius of 20 (cluster
size = 40) and, as can be seen from Figure 4.10 (b), the cluster size is correctly
identified from the first bump in the [K(t)-πt²] function. When the number of clusters
is increased, however, the cluster patterns become mixed up and a smeared point pattern
is obtained, as shown in Figure 4.10 (c). The [K(t)-πt²] function shown in
Figure 4.10 (d) fails to identify any cluster effect at all. It only reveals a
non-homogeneous point process.
Fig. 4.10 (c) A few more clusters added Figure 4.10 (d) [K(t)-πt²] vs t
The feature of the first bump in the [K(t)-πt²] curve discussed above is only broadly
correct if the cluster pattern is the dominant characteristic within the region. In
some point processes, point clusters are present but the dominant feature of the whole
point pattern is something else, such as a non-homogeneous point process. In this case,
the [K(t)-πt²] curve will still show the bump features discussed above, but the first
bump disappears because, at distances on the scale of the cluster size, the
non-homogeneous point process is dominant. Two examples are given in Figure 4.11, where
the parent process is a non-homogeneous process with density λ=f(x). 10 or 20 daughter
points are generated for each parent and they are distributed uniformly within a circle
of radius 5 centred at the parent point. As can be seen from Figure 4.11 (b) and (d),
the first bump of the [K(t)-πt²] curve, corresponding to the cluster size, fails to
show up clearly. The curves do still preserve the bump features. Care should be
exercised in estimating the cluster size when features other than clustering (such as
non-homogeneity) are dominant in the point pattern.
From the theoretical side, the K(t) function for a cluster process can be expressed as
follows (see Diggle []):
\[ K(t) = \pi t^2 + \frac{E[S(S-1)] \cdot H_2(t)}{\rho \cdot E[S]} \]
where ρ is the parent point density, S is the number of daughters per parent, E[⋅] is the
expectation and H2[⋅] is the distribution function of the PDF h2[⋅] defined as:
\[ h_2(Y) = \int h(X) \cdot h(X - Y) \cdot dX \]
and h[⋅] is the PDF of daughter points relative to their parents. For example, if each
parent produces a Poisson number of daughter points and daughter points are
distributed around their parents according to the bi-variate model:
\[ h(u, v) = \frac{1}{2\pi\sigma^2} \, e^{-\frac{u^2 + v^2}{2\sigma^2}} \]
where σ² is the dispersion variance of daughter points, the K(t) function for this
process can be deduced as (see Cressie []):
\[ K(t) = \pi t^2 + \frac{1}{\rho} \left[ 1 - e^{-\frac{t^2}{4\sigma^2}} \right] \]
I think this theoretical solution, however, only takes into account the clustering effect
at the scale of the cluster size, i.e., clustering of points from the same parent. The
clustering or non-homogeneous effects from points in different clusters are only
approximated with a homogeneous term πt² in the equation. In other words, the
equation is only correct up to a distance t equal to the average size of the clusters.
Take the example shown in Figure 4.6. The theoretical curve based on the above
equation is only roughly correct up to the distance 10, which is the average size of
clusters within the region, as shown in Figure 4.12 below.
[Figure 4.12: empirical curve and theoretical model curve; the distance ≈10 is marked]
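The behaviour of the theoretical model can be examined with a short sketch of the
closed form K(t) = πt² + (1/ρ)[1 − exp(−t²/(4σ²))]. The parameter values below are
hypothetical and are chosen only to show how the excess over πt² levels off.

```python
import math


def cluster_k_theoretical(t, rho, sigma):
    """K(t) for Poisson parents (density rho) with bivariate normal daughters (std sigma)."""
    return math.pi * t * t + (1.0 - math.exp(-t * t / (4.0 * sigma * sigma))) / rho


rho, sigma = 0.002, 2.5  # hypothetical parent density and daughter dispersion
for t in (0.0, 2.5, 5.0, 10.0, 20.0, 50.0):
    excess = cluster_k_theoretical(t, rho, sigma) - math.pi * t * t
    print(t, round(excess, 2))
```

The excess [K(t)-πt²] rises over distances of a few σ and then stays at 1/ρ, so the
model contains no further structure beyond the scale of a single cluster, which is
consistent with the remark above.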
(Figure 4.5 (a) and (b) are reproduced here for easy comparison)
The K(t), λ2(t) and γ(t) function results for the whole of dataset 2 and the two subsets
are given in Figure 4.15 below.
(e), but the overall behaviour of the point pattern is non-homogeneous. This may be a
total misperception, but it is nevertheless worth some more investigation.
For non-homogeneous and Cox processes, the second order functions show very similar
features, which implies that both processes are non-homogeneous in nature. In other
words, an inhomogeneous point pattern can be modelled either by a non-homogeneous
Poisson model or by a Cox model. Both, if modelled accurately, should statistically
give the same answer (on average). Apart from the extra modelling component
(freedom) provided by the Cox process, Cox modelling is no different from
non-homogeneous modelling. To illustrate this point, we use a one dimensional example.
Fig. 4.16 (a) Non-homogeneous modelling Figure 4.16 (b) Cox modelling (horizontal axis: X)
Simple coordinate transformation does not cut out all the white space within the
region. This suggests another possible approach to improve the modelling.
If a polygonal boundary is defined around the region of interest, it is possible to use a
simpler point process for easier and more accurate modelling of the point pattern
within the polygonal region. Two examples are given in Figure 4.18 below, where (a)
is for sub-set 1 of example data set 1 and (b) is for the whole of dataset 2. The
program we have does not yet implement this technique. However, even by visual
inspection alone, it is reasonable to suggest that the points within the defined
polygon(s) can be modelled by a homogeneous point process.
Figure 4.18 (a) Polygonal dataset - 1 Figure 4.18 (b) Polygonal dataset - 2
5. References
1. Cressie, N. A. C., Statistics for spatial data, John Wiley & Sons, Inc., New York,
1993.
2. Diggle, P., Statistical analysis of spatial point patterns, Academic Press, 1983.
3. Lee, J. S., Einstein, H. H. & Veneziano, D., Stochastic and centrifuge modelling of
jointed rock, Part III – stochastic and topological fracture geometry model,
technical report MIT CE R-90-25, MIT, 1990.
4. Upton, G and Fingleton, B, Spatial data analysis by example, John Wiley & Sons,
1985.
5. van Lieshout, M. N. M., Markov point processes and their applications, Imperial
College Press, London, 2000.