
Spatial Statistics for Bioimage Analysis

What’s going on?


I 12 lectures.
I 2 problem classes, Tuesday and Wednesday.
I Office Hour 2-3pm Thursday, Huxley 536.
I Exam (Friday).
Who am I?
I Dr Ed Cohen, Statistics, Department of Mathematics
I Contact: [email protected]
I Research: Statistical Signal and Image Processing
What is this module about?
I Studying simple models for the type of spatial data found in
fluorescence microscopy, as well as several other applications
I Developing simple statistical tools for analysing these types of
data
1 / 172
Overview: Spatial Statistics for Bioimage Analysis

Things we’ll look at in this module:


I An intro to bioimaging and microscopy
I Spatial point patterns
I Spatial point processes
I Hypothesis testing for spatial data
I Model fitting
I Multivariate spatial data

2 / 172
The notes are self-contained; however, should you desire extra
reading, I recommend

Diggle, P.J., Statistical Analysis of Spatial and Spatio-Temporal
Point Patterns, CRC Press.

Cressie, N., Statistics for Spatial Data, Wiley.

3 / 172
Bioimaging
An intro to fluorescence microscopy
See other slides

4 / 172
Spatial data and spatial point
patterns

5 / 172
Introduction to spatial data

I Spatial data is classed as any data that represents
observations over a spatial domain.
I Typically, this spatial domain will be R^2, but it is often R^3, and
may even be a more general space (beyond the scope of this
module).
I Spatial data can be broadly categorised into two types: (i)
data sampled over a continuous domain, (ii) event data.

6 / 172
(i) Continuous domain data
Data sampled over a continuous domain are those that observe a
continuous process. They could be sampled anywhere in the
domain (hence the term continuous). Examples include
I Temperature recorded at weather stations, i.e.,
{T(s_1), T(s_2), ...},
where s_1, s_2, ... ∈ R^2 are the grid coordinates of the weather
stations.

7 / 172
(i) Continuous domain data
I Wind velocity at weather stations, i.e.
{V(s_1), V(s_2), ...},
where s_1, s_2, ... ∈ R^2 are the grid coordinates of the weather
stations.

8 / 172
(i) Continuous domain data
I Air pollution recorded at pollution sensors, i.e.
{P(s_1), P(s_2), ...},
where s_1, s_2, ... ∈ R^2 are the grid coordinates of the
pollution sensors.

9 / 172
(ii) Event data

Event data is the location of specific events in R^2 or R^3. They can
be represented as a list {s_1, s_2, ...}. Examples of event data
include
I Location of Leukaemia sufferers 1964-1968 in Woburn,
Massachusetts, US.

10 / 172
(ii) Event data

I Location of crimes in a city

11 / 172
(ii) Event data

I Location of Japanese pine trees in a region of forest.

Figure: the japanesepines point pattern dataset.

12 / 172
(ii) Event data
I Location of SrcN15-mEos2 on the plasma membrane of HeLa
cells.

From: P. Annibale, Investigating the Impact of Single Molecule Fluorescence Dynamics on Photo Activated Localization Microscopy Experiments, 2012

13 / 172
Spatial Point Patterns

I It is event data that will be the focus of this module. When
the event locations are random, this is known as a spatial
point pattern.
I Broadly speaking, “a spatial point pattern is a set of
locations, irregularly distributed within a designated region
and presumed to have been generated by some form of
stochastic mechanism (random process).” [Diggle, p. xxix].
I Any such data-set is termed a spatial point pattern. The
locations are referred to as events. This is to distinguish them
from arbitrary points of the region of space.

14 / 172
Question:

Point patterns arise in many different contexts. Just a few
examples have already been given. Can you think of any more?

15 / 172
Spatial Point Patterns
I Here are two spatial point patterns in a square region of
interest (ROI).
(a) Completely Spatially Random (b) Clustered

Figure: Example point patterns

I The two patterns appear to be very different. The left pattern
shows no clear structure and might be regarded as
“completely random”. The right pattern, on the other hand,
shows clear clustering.
16 / 172
Here is a further different type of pattern. This shows the location
of cell nuclei.
Figure: Position of cell nuclei in a widefield microscopy image. An
example of a regular point pattern.

They are distributed more or less regularly over the ROI. This
would be highly improbable under complete randomness, unless there
is some underlying mechanism which promotes an even distribution.
17 / 172
Question

Can you think of data that might appear completely random,
data which you would expect to be clustered, and data which you
might expect to be regular?

18 / 172
Objectives of statistical analysis

Spatial statistics is about analysing spatial data to make inference
on some underlying mechanism.

19 / 172
Point processes

20 / 172
Poisson distribution
Before we move forward, it will be necessary for us to define the
Poisson distribution.

Definition
A discrete random variable X is said to have a Poisson distribution
with parameter µ > 0, X ∼ Poisson(µ), if for k = 0, 1, 2, ...

P(X = k) = µ^k exp(−µ)/k!.

If X ∼ Poisson(µ), then E {X } = µ and Var(X ) = µ.

NOTE: 0! = 1 by convention.
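
As a quick check, the pmf can be evaluated in R (a minimal sketch; dpois is the built-in equivalent):

mu <- 5; k <- 0:3
mu^k * exp(-mu) / factorial(k)  # pmf from the definition above
dpois(k, mu)                    # built-in Poisson pmf, should agree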

21 / 172
Poisson distribution

Figure: Probability mass function for the Poisson distribution when
µ = 1, µ = 2, µ = 5, µ = 20.

22 / 172
Poisson distribution

The Poisson distribution is a hugely important distribution in
statistics as it models count data - in particular, it models the
count of events that happen independently. For example:
I The number of letters you receive each day.
I The number of emails you receive each day.
I The number of people who join a post office queue between
9:30am and 10:00am.
I The number of phone calls a call centre receives each day.

Question: What other examples can you think of?

23 / 172
Simulating Poisson Data
To simulate n Poisson random variables with expected value µ in R
we use the function rpois(n,mu).
For example:

X = rpois(10,5)

which gives me
3 5 7 3 6 5 8 6 6 3

X = rpois(500,5)
hist(X,freq=F,breaks = seq(0,20,1))

24 / 172
Question

The number of emails I get each day is Poisson distributed with
expected value 10. What is the probability I get no emails on any
given day?
(a) 0.1
(b) 0
(c) exp(−10)
(d) 10 exp(−10)
(e) exp(−240)

25 / 172
Solution
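
If X is the number of emails, then X ∼ Poisson(10), so
P(X = 0) = 10^0 exp(−10)/0! = exp(−10). The answer is (c).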

26 / 172
Question

The number of buses that arrive at my local bus stop between
08:00am and 09:00am is Poisson distributed with expected value 4.
What is the probability at least one bus turns up?
(a) 0
(b) 1
(c) exp(−4)
(d) 1 − exp(−4)
(e) None of the above

27 / 172
Solution
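
If X is the number of buses, then X ∼ Poisson(4), so
P(X ≥ 1) = 1 − P(X = 0) = 1 − exp(−4). The answer is (d).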

28 / 172
Stochastic processes

I A stochastic process is a random process that evolves in either
time, space, or time and space.
I A key concept with stochastic processes is the difference
between the process and the realization.
I The process is the stochastic mechanism that generates the
data.
I The realization is the observed data for any one run of the
stochastic process.
I Run the same stochastic process again and you get a new
realization.
I Often, we will only see one realization of the random process,
from which we must try and understand the mechanics of the
process itself.

29 / 172
Examples
Stochastic process (Markov chain):

Realisations (starting in A):
A, B, B, C, A, C, C, A, B, A, ...
A, C, C, A, B, C, C, C, A, C, ...
A, C, A, B, A, B, A, C, A, B, ...

Stochastic process (Random walk):

X_t = X_{t−1} + ε_t;  ε_t ∼ N(0, 1).

Realisations (starting with X_0 = 0) are shown in the figure.

Figure: realisations of the random walk, X_t plotted against t.
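
Such a realisation can be generated in R in a couple of lines (a minimal sketch):

set.seed(1)              # for reproducibility
eps <- rnorm(10000)      # epsilon_t ~ N(0, 1)
X <- cumsum(eps)         # X_t = X_{t-1} + epsilon_t, starting at 0
plot(X, type = "l", xlab = "t", ylab = "X")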

30 / 172
Point processes
I Point processes (also called event processes), are a type of
stochastic process.
I They are used to model event data, in either the temporal
domain, the spatial domain, or both.
I Temporal examples:
I The times at which you receive an email
I The times at which children are born in a maternity unit.
I The times at which a bus arrives at a bus-stop
I Spatial examples:
I where a particular species of trees grow in a forest
I where protein molecules position themselves on a cellular
membrane
I where crimes happen in a city
I Spatio-temporal examples:
I The time and location of earthquake events

31 / 172
STOCHASTIC MECHANISM | REALIZATIONS

ROLLING TWO DICE:
BIVARIATE DISCRETE UNIFORM DISTRIBUTION ON
{1,2,3,4,5,6} x {1,2,3,4,5,6}

CLATHRIN-COATED PITS ON CELL MEMBRANE:
POISSON PROCESS WITH INTENSITY λ
32 / 172
Point process in 1D
We will begin our exploration into point processes with temporal
point processes (1D point processes).
I We will represent an event process as N(A), where A ⊂ R.
N(A) is a random number describing the number of events
that occur in A.
I We will also use the notation N(t) to denote the number of
events that have occurred in the interval (0, t]. I.e., for an
interval (a, b], N((a, b]) = N(b) − N(a).

Figure: point process in 1D


34 / 172
Simple processes

We will solely be dealing with simple point processes.

Definition (Simple)
N is called a simple point process if no two events can occur at
exactly the same time.
I We will make use of the incremental process
dN(t) = N(t + dt) − N(t).
I Essentially this tells us the number of events that occur in the
interval (t, t + dt] and for simple processes is Bernoulli (i.e.
takes a value of 0 or 1).

35 / 172
Intensity

Let us consider a key descriptor of a point process.

Definition (Intensity)
The intensity λ(t) of a point process is the expected number of
events per unit time, i.e.

λ(t) = E{dN(t)}/dt.

36 / 172
Poisson process in 1D

A Poisson process is the simplest type of event/point process.

Definition (Poisson process)


Event process N is called a Poisson process if the following two
conditions hold:
1. For any set A ⊂ R, N(A) is Poisson distributed with expected
value µ(A) = ∫_A λ(t)dt. We write N(A) ∼ Poisson(µ(A)).
2. For any disjoint pair of sets A and B, N(A) and N(B) are
independent random variables.

37 / 172
Homogeneous process
Definition (Homogeneous)
A Poisson process N is called homogeneous if λ(t) = λ for all t.
That is to say, it has a constant intensity for all time.
Processes that have a variable intensity are called inhomogeneous.
Figure: inhomogeneous and homogeneous intensity functions (top)
and the corresponding event realisations (bottom), plotted against
time on [0, 1].

38 / 172
Properties of Poisson Processes

I A Poisson process has the key memoryless property.
I That is to say, the presence or absence of an event at time t
has no bearing on whether there will be events anywhere else
in the process. All events occur independently of the others.
I Question: Can you think of any real-life event processes that
can legitimately be modelled as Poisson?

39 / 172
Question

People join a Post Office queue according to a homogeneous
Poisson process at a constant intensity of 1 per minute. What is
the probability that no people join the queue between 10:00:00am
and 10:05:00am?
(a) exp(−5)
(b) 1
(c) exp(−1)
(d) 1 − exp(−5)
(e) None of the above

40 / 172
Solution
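
The number joining in a 5-minute window is Poisson with expected
value 1 × 5 = 5, so P(no arrivals) = exp(−5). The answer is (a).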

41 / 172
Question

Mass extinction events happen according to a Poisson process with
intensity 1 per 10 million years. What is the probability that at
least one will occur in the next 1 million years?
(a) exp(−24)
(b) 1 − exp(−1)
(c) 1 − exp(−24)
(d) 1
(e) None of the above

42 / 172
Solution
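
Over 1 million years the expected number of events is 1/10 = 0.1,
so P(at least one) = 1 − exp(−0.1), which is not among options
(a)-(d). The answer is (e).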

43 / 172
Question

A radioactive sample decays according to an inhomogeneous
Poisson process at an intensity of 10 exp(−0.1t) (t > 0) events per
second. What is E{N((0, 10])}?
(a) exp(−10)
(b) exp(−1)
(c) 10 exp(−1)
(d) 10
(e) 1 − exp(−1)

44 / 172
Solution

45 / 172
Simulating temporal Poisson processes
Suppose we wish to simulate a homogeneous Poisson process with
intensity λ on some interval (0, T]. How might we go about doing
so?
I We know that the number of events in (0, T], i.e.
N(T) = N((0, T]) ∼ Poisson(λT), so we first simulate a
Poisson random variable with expected value λT. This will
give us the total number of events in our realization.
n = rpois(1,lambda*T)
I Then we wish to distribute these events uniformly in the
region.
E = runif(n,min=0,max=T)
I It will probably be convenient to order these events in
increasing order.
E = sort(E)
I We could plot these with
plot(E,rep(0,length(E)),type="p",pch=19,cex=0.3)
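
Putting the steps together (a minimal sketch, taking λ = 2 and T = 10 as example values):

lambda <- 2; T <- 10
n <- rpois(1, lambda * T)              # total number of events in (0, T]
E <- sort(runif(n, min = 0, max = T))  # event times, uniform given n
plot(E, rep(0, length(E)), type = "p", pch = 19, cex = 0.3)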
46 / 172
Realisations of a homogeneous Poisson process on (0,10]
with intensity λ = 2

Figure: each panel shows the event times of one realisation,
plotted against time on (0, 10].

47 / 172
Spatial Point Processes
I The concept of a point process extends to higher dimensions
(space).
I In this course, we will only be working with data in R^2, so we
will define it as such.
I We will in fact work with a region X ⊆ R^2.
I It is often assumed the process exists on the entirety of R^2,
i.e. X = R^2, however it could be some other region if we
know the process is restricted to a certain domain.

48 / 172
Spatial Point Process

I We will again use the notation N to represent a point process,
where for some set A ⊂ X, N(A) is the random number of
events in set A.
I We again start by defining the intensity of a spatial point
process. In a slight switch of notation, we will let ds be an
infinitesimal ball (a circle in R^2) centred at s.
I We will solely be dealing with simple point processes.

Definition (Simple)
N is called a simple point process if no two events can occur at
exactly the same point in X.

49 / 172
Intensity

Definition (Intensity)
The intensity λ(s) of a point process is
the expected number of events per unit
area, i.e.

λ(s) = lim_{|ds|→0} E{N(ds)}/|ds|.

Figure: realisations with λ = 10 and λ = 1 on [0,10]x[0,10].

50 / 172
Intensity

A useful interpretation of the intensity is as follows. The number
of events in the infinitesimal ball ds is a Bernoulli random variable
(i.e. takes a value of either 0 or 1).

Therefore

E {N(ds)} = 1 · P(N(ds) = 1) + 0 · P(N(ds) = 0) = P(N(ds) = 1)

and
P(N(ds) = 1) = λ(s)|ds|.

51 / 172
Poisson Process

Definition (Poisson process)

Event process N is called a Poisson process if the following two
conditions hold:
1. For any set A ⊂ X, N(A) is Poisson distributed with expected
value µ(A) = ∫_A λ(s)ds. We write N(A) ∼ Poisson(µ(A)).
2. For any disjoint pair of sets A and B, N(A) and N(B) are
independent random variables.

52 / 172
Figure: Properties of a Poisson process

53 / 172
Homogeneous

Definition (Homogeneous)
A Poisson process N is called homogeneous if λ(s) = λ for all
s ∈ X. That is to say, it has a constant intensity over all space.

Processes that have a variable intensity are called inhomogeneous.

Homogeneous Poisson processes can be called completely spatially
random (CSR). In other words, events are equally likely to occur
anywhere, irrespective of where any other events are.

54 / 172
Question

Let N be a homogeneous Poisson process with intensity λ = 2, and
let S be the unit circle (r = 1) centred at 0. Which of the
following statements is FALSE?
(a) E {N(S)} = 2π.
(b) The number of events in S is random.
(c) P(N(S) = 0) = exp(−2π).
(d) P(N(S) = 1) = π exp(−2π).
(e) P(N(S) > 0) = 1 − exp(−2π).

55 / 172
Complete spatial randomness

56 / 172
Complete spatial randomness
Figure: a grid of 0s and 1s generated independently at random,
illustrating complete spatial randomness.
57 / 172
Different types of process
CSR again forms the dividing line between two other broad classes of
process:
I Cluster processes tend to have events occur in groupings.
What real-life spatial point processes exhibit clustering?
I Regular processes tend to have events that are well separated
in space. What real-life spatial point processes exhibit
regularity?

Figure: realisations of CSR, clustered and inhibited processes.

58 / 172
Obvious

Figure: CSR, clustered and inhibited realisations (obvious cases).

59 / 172
Not obvious

Figure: CSR, clustered and inhibited realisations (less obvious cases).

60 / 172
Simulating spatial Poisson processes
I Suppose we wish to simulate a homogeneous Poisson process
with intensity λ on some rectangular/square ROI
A = (a, b) × (c, d).
I How might we go about doing so?
I We know that N(A), the number of events in A, is
Poisson(λ|A|), so we first simulate a Poisson random variable
with expected value λ|A|. This will give us the total number
of events in our realization.
I Then we wish to distribute these events uniformly in the
region.
I You could write your own piece of code to do this (see the
sketch below), or you could just use the function rpoispp in
spatstat.
I E.g.
N = rpoispp(10,win=owin(c(0,10),c(0,10)))
plot(N,type="p",pch=19,cex=0.3)
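
A hand-rolled version for A = (0, 10) × (0, 10) with λ = 10 (a minimal sketch):

lambda <- 10
x0 <- 0; x1 <- 10; y0 <- 0; y1 <- 10
n <- rpois(1, lambda * (x1 - x0) * (y1 - y0))  # N(A) ~ Poisson(lambda |A|)
x <- runif(n, x0, x1)                          # uniform given n
y <- runif(n, y0, y1)
plot(x, y, pch = 19, cex = 0.3)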
61 / 172
Second-order intensity and covariance intensity

I Recall from module 2, that covariance describes the joint
variability of two random variables X and Y, and is defined as
E(XY) − E(X)E(Y).
I The covariance intensity extends this notion to point
processes to describe joint variability in the process between
two points in space.
I We first describe the second order intensity.

62 / 172
Second-order intensity

Definition (Second order intensity)

The second order intensity of a spatial point process at points s
and u is given as

γ(s, u) = lim_{|ds||du|→0} E{N(ds)N(du)}/(|ds||du|).

63 / 172
Covariance intensity

We can now define the covariance intensity.

Definition (Covariance intensity)

The covariance intensity of a spatial point process at points s and
u is given as

c(s, u) = γ(s, u) − λ(s)λ(u).

The covariance intensity can be interpreted as the covariance
between whether there is an event or not at s and at u.

64 / 172
Covariance intensity for homogeneous Poisson

I Let us consider the homogeneous Poisson process as an example.

I We have said that all events happen independently of each other. This
means that for two disjoint sets A, B ⊂ R^2, we have

E{N(A)N(B)} = E{N(A)}E{N(B)}.

Therefore, for s ≠ u, we have

E{N(ds)N(du)} = E{N(ds)}E{N(du)},

and hence

c(s, u) = lim_{|ds||du|→0} E{N(ds)N(du)}/(|ds||du|) − lim_{|ds|→0} E{N(ds)}/|ds| · lim_{|du|→0} E{N(du)}/|du|
        = λ(s)λ(u) − λ(s)λ(u) = 0.

65 / 172
Covariance intensity for homogeneous Poisson
I This demonstrates that a homogeneous Poisson process has
a covariance intensity of zero between any two distinct points,
i.e. there is no covariance between N(s) (indicating whether
an event occurs at s) and N(u) (indicating whether an event
occurs at u).

I Suppose the process is not homogeneous Poisson, and instead an
event occurring at s means we are more likely to see an event
at u than by chance; then the covariance intensity c(s, u)
would be positive.

I Oppositely, suppose an event occurring at s means we are less
likely to see an event at u than by chance; then the
covariance intensity c(s, u) would be negative.

66 / 172
Pair correlation function

Recall: correlation is a normalised measure of covariance. For a
pair of random variables X and Y, it is defined as

ρ = cov(X, Y) / √(var(X)var(Y)).

This is a measure between −1 and 1 that adjusts for the variances.
We say the random variables are uncorrelated when ρ = 0.

We can again extend this concept to spatial point processes;
however, the formulation is slightly different.

67 / 172
Pair correlation function

Definition (pair correlation function)

The pair correlation function for a spatial point process at
s, u ∈ R^2 is

ρ(s, u) = γ(s, u) / (λ(s)λ(u)).

Note that in the case of uncorrelated data (i.e. CSR), the pair
correlation function is 1 (not zero) for all s ≠ u.

68 / 172
Stationarity and Isotropy
Two key properties of a point process are stationarity and isotropy.

Definition (Stationary)
A point process N is called stationary if λ(s) = λ for all s ∈ X and
γ(s, u) is a function only of s − u.

Definition (Isotropic)
A stationary point process N is called isotropic if γ(s, u) is a
function only of the radial distance r = ||s − u||.

Recall back to module 2, where you were taught about the concept of
stationarity in time series. The definitions given here are in the same spirit.
I Stationarity tells us that no matter where we are in X, the second
order structure of the process looks the same.
I Isotropy (not defined for time series) tells us that it looks the same in all
directions.
69 / 172
Stationarity and Isotropy

This now means we can simplify the functions we’ve just presented
for the stationary setting.
I Second order intensity: γ(s, u) = γ(r ).
I Covariance intensity: c(s, u) = c(r ) = γ(r ) − λ2 .
I Pair correlation function: ρ(s, u) = ρ(r ) = γ(r )/λ2 .

70 / 172
Figure: Second-order intensity, covariance intensity and pair correlation
function for CSR, clustered and regular processes.

71 / 172
Ripley’s K -function

72 / 172
Ripley’s K-function
In this section, we will give a detailed treatment of what is quite
possibly the most commonly used analysis tool for spatial point
patterns.

Definition (Ripley’s K-function)
Let N be a stationary isotropic process, and let
N_0(r) represent the random number of events within a distance r
of an arbitrarily chosen event (not including that event). Ripley’s
K-function is defined as

K(r) = λ^{−1} E{number of events within a distance r of an arbitrary event}
     = λ^{−1} E{N_0(r)}.

Note: this is a theoretical function of the actual process.


73 / 172
Figure: Ripley’s K-function

74 / 172
Ripley’s K -function for Poisson process

I Let us consider the example of the Poisson process with
intensity λ. The number of additional events within a circle of
radius r drawn around some arbitrary event is distributed
Poisson(λπr^2).

I Therefore, the expected number of events within that circle is
λπr^2. This gives

K(r) = λ^{−1} λπr^2 = πr^2.

75 / 172
Question

Let N be a homogeneous Poisson process with intensity λ = 10.
What is

E{number of events within a distance 2 of an arbitrary event}?

(a) 50π.
(b) 4π.
(c) 0.4π.
(d) 40π.
(e) π.

76 / 172
Assessing CSR

I This forms a baseline with which we can assess the clustering
or regularity behaviour of a point process.

I If K(r) > πr^2, then we expect more points within a radius r
of an arbitrary event than we would under complete spatial
randomness. This implies clustering.

I If K(r) < πr^2, we expect fewer events than under complete
spatial randomness. This implies regularity.

77 / 172
L(r ) − r

It, in fact, makes some sense to standardise this. The L-function
linearises K(r) and is defined as

L(r) = (K(r)/π)^{1/2}.

In the case of a homogeneous Poisson process, L(r) = r. We can
further define the L(r) − r function, which in the case of a
homogeneous Poisson process equals 0.
Therefore, we can broadly characterise a point process as follows
I Clustered: L(r) − r > 0
I CSR: L(r) − r = 0
I Regular: L(r) − r < 0

78 / 172
Figure: L(r) − r function for CSR, clustered and regular data.

79 / 172
Theoretical consideration (tricky)
Let us consider the relationship between the K-function and the
second-order intensity γ(r) of a stationary, isotropic point process
N on R^2.
Let a(0, dr) be the annulus of width dr of a circle of radius r
centred at 0. Then

lim_{dr→0, |d0|→0} P(N(a(0, dr)) > 0 | N(d0) > 0) / dr
  = lim_{dr→0, |d0|→0} P(N(a(0, dr)) > 0, N(d0) > 0) / (P(N(d0) > 0) dr)
  = 2πr γ(r) / λ.

Integrating with respect to r, the expected number of events
within distance r of an arbitrary event is

λK(r) = (2π/λ) ∫_0^r s γ(s) ds;  r > 0.

80 / 172
Estimating Ripley’s K -function

First, let E(r) = E{N_0(r)}. It seems sensible to construct an
estimator for E(r) as follows. Let d_ij = ||s_i − s_j||, and define a
crude estimator for E(r) as

Ẽ(r) = n^{−1} Σ_{i=1}^n Σ_{j≠i} I(d_ij ≤ r).

Here, I is the indicator function which takes a value of 1 if the
argument is true, and a value 0 if it is false.

81 / 172
Estimating Ripley’s K -function

Ẽ will be negatively biased. Why? (Events near the boundary of A
may have neighbours outside A that go unobserved.)

Several methods have been proposed to correct for this source of
bias. We will focus on Ripley’s method.

82 / 172
Edge Correction

I Let w(s, r) be the proportion of the circumference of the
circle with centre s and radius r which lies within A.

I We will use the notation w_ij for w(s_i, ||s_i − s_j||).

I Then, for any stationary isotropic process, w_ij is the
conditional probability that an event is observed, given there is
an event a distance r_ij = ||s_i − s_j|| away from the ith event s_i.

83 / 172
Figure: Edge correction

Note: in general w_ij ≠ w_ji.

84 / 172
Edge correction
Therefore, an unbiased estimator for E(r) is

Ê(r) = n^{−1} Σ_{i=1}^n Σ_{j≠i} w_ij^{−1} I(d_ij ≤ r).

Finally, to get an estimator for K(r), replace the unknown intensity
by an estimator for the intensity. The obvious choice is n/|A|.

The Ripley’s estimator for K(r) is

K̂(r) = n^{−2} |A| Σ_{i=1}^n Σ_{j≠i} w_ij^{−1} I(d_ij ≤ r).

This estimator is approximately unbiased for small r.


85 / 172
Weights - an aside

I The formula for the weights is a geometry problem and can be
written down for a small number of cases.
I Consider the rectangle A = (0, a) × (0, b).
I Write s = (s_1, s_2) and let δ_1 = min(s_1, a − s_1),
δ_2 = min(s_2, b − s_2).
I The values of δ_1 and δ_2 are the distances from the point s to
the nearest vertical and horizontal edges of A, respectively. To
calculate w(s, r), we need to distinguish two cases (see the
sketch after the formulas):
I if r^2 ≤ δ_1^2 + δ_2^2, then

w(s, r) = 1 − π^{−1}[cos^{−1}(min(δ_1, r)/r) + cos^{−1}(min(δ_2, r)/r)]

I if r^2 > δ_1^2 + δ_2^2, then

w(s, r) = 0.75 − (2π)^{−1}[cos^{−1}(δ_1/r) + cos^{−1}(δ_2/r)]
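
These two cases translate directly into R (a minimal sketch; the function name w_edge is ours):

# edge-correction weight w(s, r) on the rectangle (0,a) x (0,b)
w_edge <- function(s, r, a, b) {
  d1 <- min(s[1], a - s[1])  # distance to nearest vertical edge
  d2 <- min(s[2], b - s[2])  # distance to nearest horizontal edge
  if (r^2 <= d1^2 + d2^2) {
    1 - (acos(min(d1, r)/r) + acos(min(d2, r)/r)) / pi
  } else {
    0.75 - (acos(d1/r) + acos(d2/r)) / (2 * pi)
  }
}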

86 / 172
R implementation

Here is some example code:

# load point pattern from csv
N=read.table('test_data_dense.csv',sep=",")

# convert it into the ppp data format for SpatStat
N=as.ppp(N,c(0,1,0,1))

# estimate K(r) using Ripley's edge correction
K = Kest(N,correction = 'Ripley')

# plot K(r) and L(r)-r
plot(K)
plot(K,sqrt(./pi)-r ~ r)

87 / 172
Simple tests for complete spatial
randomness

88 / 172
Hypothesis tests and Monte Carlo

89 / 172
Hypothesis tests

I A hypothesis test is a method of statistical inference.

I It lies at the very heart of the scientific method.

I A hypothesis is proposed for either
I the statistical relationship between two data sets, or
I a data set obtained by sampling, compared to an idealized
model for the data.

I We will restrict this discussion to the latter of these.

90 / 172
Null and alternative hypothesis
I The null hypothesis H0 states that the sampled data comes
from a specified model. It is typically associated with a
contradiction to a theory one would like to prove.

I The alternative hypothesis HA is typically associated with a


theory one would like to prove.

I Examples:
I H0 : The Higgs Boson does not exist
vs
HA : The Higgs Boson does exist

I H0 : This drug treatment does not cure the illness


vs
HA : This drug treatment does cure the illness.

91 / 172
Test statistic

I For the data set I am testing the hypothesis with, I need to
extract a summary statistic from it; this is known as the test
statistic.

I I then wish to know how consistent this test statistic is
with my null hypothesis,

I i.e. if, assuming the null hypothesis is true, this value of the
test statistic would be very strange/extreme, then I reject the
null hypothesis.

I If the value of the test statistic is not strange/extreme under
the null hypothesis then I fail to reject it.

92 / 172
p-value and significance level

I To do this, we determine the p-value. The p-value is defined
as the probability under the null hypothesis that I see a test
statistic at least as extreme as the value I observe.

I The threshold at which I deem the statistic to be strange or
not is called the significance level. E.g., if my p-value < 0.05 I
reject H0 at the 5% level.

I The critical region for the test statistic is that in which I
reject the null hypothesis.

93 / 172
Example 1
I think a coin I have is unfair. That is to say, I don’t think the
probability it lands heads (H) and the probability it lands tails (T)
are both equal to 0.5. In fact, I think it favours heads.

Null hypothesis: H0: it is a fair coin.

Alternative hypothesis: HA: the coin has a higher probability of
falling H than T.

How would we write these hypotheses mathematically?

Null hypothesis: H0: P(H) = P(T) = 0.5.

Alternative hypothesis: HA: P(H) > 0.5.


94 / 172
Example 1

I wish to run a hypothesis test at the 5% level. My test statistic is
going to be the number of heads I observe, denoted #Heads.

The distribution for the number of heads is Binomial(n, 0.5), and
the corresponding probability mass function is

P(#Heads = k) = nCk 0.5^k 0.5^{n−k} = nCk 0.5^n,  k = 0, ..., n.

Therefore, the p-value for a particular observation t of the test
statistic is

p = P(#Heads ≥ t) = Σ_{k=t}^n nCk 0.5^n.

95 / 172
Example 1
I now observe some data.

H H T H H T H T H H H H H T T H H H

Therefore, here, my test statistic takes the value t = 13 and
n = 18.

To determine whether this is significant, I determine the p-value
for these data under the null hypothesis.

p = P(#Heads ≥ 13) = Σ_{k=13}^{18} P(#Heads = k)
  = Σ_{k=13}^{18} 18Ck 0.5^k 0.5^{18−k} = 0.0481.

This is less than 0.05, and I therefore reject the null hypothesis at
the 5% level.
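
This p-value can be verified in R:

sum(dbinom(13:18, 18, 0.5))              # 0.0481...
pbinom(12, 18, 0.5, lower.tail = FALSE)  # equivalent tail probability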
96 / 172
Example 1
Figure: pmf of #Heads under the null hypothesis, with the
fail-to-reject region and the critical region (in which we reject)
indicated.

For this test, the critical region is C = {13, 14, 15, 16, 17, 18}
97 / 172
Example 2
Consider a population with X ∼ N(µ, σ 2 ), with both µ and σ 2
unknown. I wish to test the null hypothesis

H0 : µ = 0

against the alternative hypothesis

HA : µ ≠ 0.

Suppose I observe n samples from the population X_1, ..., X_n. In
this setting, my test statistic is

T = X̄ / (Ŝ/√n),

where X̄ is the sample mean and Ŝ is the sample standard
deviation.
98 / 172
Example 2

Under the null hypothesis, T ∼ t_{n−1} (“Student” t-distribution
with n − 1 degrees of freedom).

I observe the data

−0.0479, 1.7013, −0.5097, −0.0029, 0.9199,
1.4049, 1.0341, 0.2916, −0.7777, 0.1498.

Here, n = 10, x̄ = 0.4164 and ŝ = 0.8182, thus t = 1.6092. It is a
fact that P(T < 1.6092) = 0.9290. Therefore we fail to reject the
null hypothesis at the 5% level.
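
This can be reproduced in R:

x <- c(-0.0479, 1.7013, -0.5097, -0.0029, 0.9199,
       1.4049, 1.0341, 0.2916, -0.7777, 0.1498)
t.test(x, mu = 0)  # t = 1.609, two-sided p-value = 0.142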

99 / 172
Example 2
Figure: pdf of the Student t-distribution with 9 degrees of freedom.

For this test, the critical region is C = (−∞, −2.26] ∪ [2.26, ∞)


100 / 172
KEY POINT!!!!!!!!!!!

THE TEST STATISTIC IS RANDOM!!!

EACH TIME A NEW DATA SET IS SAMPLED FROM THE


POPULATION, A DIFFERENT TEST STATISTIC IS
OBTAINED!!!!

OUR GOAL IS TO DETERMINE WHAT THE DISTRIBUTION


OF THIS TEST STATISTIC IS UNDER THE NULL.

101 / 172
Statements of inference

I Correct things to say:
I “We reject the null hypothesis at the 5% level” or “The data
is not consistent with the null hypothesis”.
I “We fail to reject the null hypothesis” or “The data is
consistent with the null hypothesis”.
I Incorrect things to say:
I “We accept the null hypothesis” or “The alternative
hypothesis is wrong”.
I “The null hypothesis is wrong”.
I “The p-value is the probability the null hypothesis is correct.”

102 / 172
Type I and Type II errors

I Type I error is the probability of incorrectly rejecting the null


hypothesis when it is in fact true.
I This is set by the level of the test.
I Type II error is the probability of incorrectly failing to reject
the null hypothesis when the alternative is in fact true.
I Power of test = 1-type II error.

103 / 172
Type I and Type II errors

104 / 172
Question

I run a hypothesis test. The p-value of my test statistic is 0.02.
Which of the following is a correct statement of inference?
(a) I reject the null hypothesis at the 1% level.
(b) I fail to reject the null hypothesis at the 10% level.
(c) The probability the null hypothesis is wrong is 0.02.
(d) I reject the null hypothesis at 5% level.
(e) The null hypothesis is wrong.

105 / 172
Monte Carlo testing

I In the previous examples, we have known what the
distribution of the test statistic is under the null hypothesis.

I Suppose now this is not the case - we don’t know the
distribution of the test statistic under the null hypothesis.

I How can we proceed?

106 / 172
Example

I Return again to Example 1, but now suppose I’ve never been
taught that the distribution for the number of heads from n
tosses of a fair coin is Binomial(n, 0.5). How can I test the
null hypothesis?
I Easy! If I have a coin that I know is fair, and toss it n times
(in our case 18 times) and record the number of heads, I have
taken a sample from the null distribution.
I If I repeat this s times, I have taken s samples from the null
distribution.
I I can then compare the original sample I observed with these s
samples from the null to determine whether it is consistent
with the null or not.

107 / 172
Example: Method

I Toss a fair coin 18 times, and record the number of heads.

I Repeat this s times (s is typically 99, but the larger the
better).
I Sort these s samples in ascending order.
I Find the (1 − α)s-th sample of this list. This is my Monte Carlo
estimate of the critical value.

108 / 172
Example: Implementation

s = 99
S = rep(0,s)
for(ii in seq(1,s,1))
{
S[ii] = sum(rbinom(1,18,0.5))
}
S = sort(S)

The sorted samples S (one run):
3 3 4 5 5 6 6 6 6
6 6 6 6 6 6 7 7 7
7 7 7 7 7 8 8 8 8
8 8 8 8 8 8 8 8 8
8 8 8 9 9 9 9 9 9
9 9 9 9 9 9 9 9 9
9 9 9 9 9 9 10 10 10 10
10 10 10 10 10 10 10 10 11 11
11 11 11 11 11 11 11 11 11 12
12 12 12 12 13 13 13 13 13 13
13 14 14 14 16

109 / 172
Example: Implementation

s = 999
S = rep(0,s)
for(ii in seq(1,s,1))
{
S[ii] = sum(rbinom(1,18,0.5))
}
S = sort(S)

S[950] = 12, the Monte Carlo estimate of the 5% critical value.

Figure: Histogram of S.

110 / 172
Testing for complete spatial randomness
An immediate question a data analyst will ask when presented with
event data is:
Are these events completely spatially random? If not, do they
exhibit clustering or regularity?

Figure: Dataset: test data dense.csv

111 / 172
Testing for complete spatial randomness

In this case, the null hypothesis is

H0: The spatial point process N that generates these data is
completely spatially random.

or, equivalently

H0: The spatial point process N that generates these data is
homogeneous Poisson.

Versus the alternative hypothesis

HA: N is not a completely spatially random/homogeneous
Poisson process.

112 / 172
Sampling

I Typically, we will observe the point pattern on a limited
region A of the full domain X.
I When testing, we have to take into account that the point
pattern continues outside of the region.
113 / 172
Testing for complete spatial randomness

I To test the null hypothesis we will need to summarise the
spatial properties of the observed data.

I We will then wish to know what this summary would normally
look like if the null hypothesis is true, to determine if what we
observe is consistent with it being completely spatially
random, or if it is unusual/weird for CSR and hence evidence
that the null hypothesis is not actually correct.

I Let us first consider some methods for summarising the spatial
data.

114 / 172
Nearest Neighbour Distances
I For n events in a ROI A, let d_i denote the distance from the
ith event to the nearest other event in A.
I The set {d_1, d_2, ..., d_n} are called the nearest neighbour
distances.
I Typically, the n nearest neighbour distances will contain
duplicates if nearest neighbour pairs are reciprocal.

Figure: Nearest neighbour distances


115 / 172
Nearest Neighbour Distances
We can define the Empirical Distribution Function (EDF) for the
nearest neighbour distribution function as

Ĝ(d) = n^{−1} #(d_i ≤ d).

Figure: EDF for Nearest Neighbour Distances

116 / 172
Computing G(d) in R

Here is some example code:

# load point pattern from csv
N=read.table('test_data_sparse.csv',sep=",")

# convert it into the ppp data format for SpatStat
N=as.ppp(N,c(0,1,0,1))

# estimate G(d)
G = Gest(N)

# plot G(d)
plot(G)

117 / 172
Testing

I We wish to compare the estimated EDF with what it should
be under CSR. Large deviations away from it would indicate
the null hypothesis should be rejected.

I The theoretical distribution of the nearest neighbour distance D
under CSR depends on n and A, and is not expressible in
closed form because of complicated edge effects.

118 / 172
Large n approximation

I If we, for the moment, ignore edge effects, then, noting that |A|
denotes the area of A, πd^2|A|^{−1} is the probability under
CSR that an arbitrary event is within distance d of a specified
event. Since the events are located independently, the
distribution of D can be approximated as

G(d) = 1 − (1 − πd^2 |A|^{−1})^{n−1},  d ≥ 0.

For large n, a further approximation is

G(d) = 1 − exp(−λπd^2),  d ≥ 0.

I This has therefore reduced to the theoretical result for the
nearest neighbour distribution of a Poisson process.

119 / 172
Testing G(r)

I Remember: Our EDF is random. It is dependent on the
realization. A different realization of the same process would
give a different EDF.

I Our goal is to determine if the observed EDF is consistent
with CSR data (fail to reject H0), or if it seems abnormal
(reject H0).

I To do so, we need to determine the sampling distribution of
Ĝ(d) under complete spatial randomness.

I Analytically, this is very difficult!!

120 / 172
Simulation envelopes

I To tackle this problem, we consider simulation envelopes.
Proceed with the following:
I Call our EDF Ĝ_0(d)
I Create s (s typically 99) independent simulations of n
uniformly distributed points in ROI A. For each one of these
simulated sets of points, compute Ĝ_i(d), i = 1, ..., s.
I Define the upper and lower simulation envelopes,

U(d) = max_{i=1,..,s} {Ĝ_i(d)};  L(d) = min_{i=1,..,s} {Ĝ_i(d)}.

These two could be plotted against d, and have the property
that under CSR, and for each d,

P(Ĝ_0(d) > U(d)) = P(Ĝ_0(d) < L(d)) = (s + 1)^{−1}.

121 / 172
Formation of envelope plots

Figure: Formation of envelope plots

122 / 172
Producing envelope plots in R

The following code will produce envelope plots for Ĝ(r) in R:

Genv = envelope(N, fun = Gest)
plot(Genv)

Figure: envelope plot showing Ĝ_obs(r), G_theo(r), Ĝ_hi(r) and
Ĝ_lo(r) against r.

123 / 172
Question

Which of the following is a correct statement of inference?

(a) The data is consistent with complete spatial randomness.
(b) The process is not completely spatially random.
(c) The data is not consistent with complete spatial randomness.
(d) The process is clustered.
(e) None of the above.

Figure: an envelope plot for Ĝ(r) for the data under test.

124 / 172
I Simulation envelopes are not a rigorous test, but are instead
intended to be a visual aid to assess how well the data
matches the hypothesis of CSR.

I If we instead wanted to do a more formal Monte Carlo test for
CSR we could choose one of the following methods.
125 / 172
Monte Carlo test 1
I For the observed point pattern compute the sample mean of
the n nearest neighbour distances {d_1, ..., d_n},

d̄_0 = n^{−1} Σ_{i=1}^n d_i.

I Simulate s realizations (s typically 99) of a CSR process on
the same ROI. For j = 1, ..., s, compute the sample mean of
the n nearest neighbour distances,

d̄^{(j)} = n^{−1} Σ_{i=1}^n d_i^{(j)}.

I Sort them and relabel such that d̄^{(1)} < d̄^{(2)} < ... < d̄^{(s)}.
I To obtain the critical region for an α level test, take the
(αs/2)th and ((1 − α/2)s)th elements of this list.
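
A minimal sketch of this test using spatstat (assuming the observed pattern N is a ppp object):

library(spatstat)
dbar0 <- mean(nndist(N))   # observed mean nearest neighbour distance
s <- 99
dbar <- numeric(s)
for (j in 1:s) {
  Nsim <- runifpoint(N$n, win = N$window)  # CSR with the same n and ROI
  dbar[j] <- mean(nndist(Nsim))
}
dbar <- sort(dbar)
# e.g. for a 5% two-sided test, compare dbar0 against the
# lower and upper ~2.5% order statistics of dbar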
126 / 172
Monte Carlo test 2

An alternative test statistic for large n is

u_0 = ∫ (Ĝ_0(y) − G(y))^2 dy,

where G(·) is the theoretical nearest neighbour distribution
function given earlier.

We can then simulate s realizations of a CSR process on the same
ROI. For j = 1, ..., s, compute

u_j = ∫ (Ĝ_j(y) − G(y))^2 dy

and test in the same way.

127 / 172
Point to nearest event distances

I A closely related analysis uses the distances x_i from each of m
sample points in A to the nearest of the n events.

I The EDF
F̂(x) = m^{−1} #(x_i ≤ x)
measures the empty spaces in A.

I This is because 1 − F̂(x) is an estimate of the area |B(x)| of
the region B(x) consisting of all points in A a distance at
least x from every one of the n events in A.

128 / 172
Figure: point to nearest event distances and forming the ECDF

129 / 172
Point to nearest event distances

I Under the same reasoning as the nearest neighbour
distribution function,

F(x) = 1 − exp(−πλx^2),  x ≥ 0,

approximately, where λ = n|A|^{−1}.

I We are faced with the challenge of how to choose m, the
number of points at which we measure the nearest event
distance. There has been some recommendation that a k × k
grid should be chosen where k ≈ √n.

I A Monte Carlo test of CSR can be performed in an analogous
manner to that used for the nearest neighbour distances.

130 / 172
Producing envelope plots in R

The following code will produce envelope plots for F̂(r) in R:

Fenv = envelope(N, fun = Fest)
plot(Fenv)

Figure: envelope plot showing F̂_obs(r), F_theo(r), F̂_hi(r) and
F̂_lo(r) against r.

131 / 172
Quadrat counts
I The methods discussed thus far are what we call distance
based approaches (as they analyse the distances to or between
events).
I An alternative to a distance-based approach is to partition A
into m sub-regions, or quadrats, of equal area and use the
event counts in each of the m quadrats to test for CSR.
I How we go about choosing the m quadrats is completely
arbitrary, but if we have a square ROI, it makes sense to chop
this up into a k × k grid of square sub-regions, so that
m = k^2.

132 / 172
Quadrat counts
I Denote n_1, n_2, ..., n_m to be the event counts in each quadrat.
Let n̄ = n/m, the sample mean of the n_i. A sensible statistic with
which to test for departures from the uniform distribution,
implied by CSR, is

X^2 = Σ_{i=1}^m (n_i − n̄)^2 / n̄.

I If the events are uniformly distributed on A, we expect this to
be small. If they are not evenly distributed we expect this to
be large.

I We note, this does not necessarily tell us about whether it is
truly uniform (in the distributional sense). For example, a
regular grid of events will also give a low value.
133 / 172
I However, if we also note that this is m − 1 times the ratio of
the sample variance to the sample mean, we can obtain more
insight.

I Recall that the expected value and variance of a Poisson
distributed random variable equal one another. Therefore, in
the case of CSR, where each quadrat count will be a Poisson
random variable, we expect X^2 to be close to m − 1.

I If it is higher than this then the variability amongst quadrats
is high, which could imply clustering. If it is low then the
variance between quadrat counts is low, which could imply
regularity.

I It has been shown that in the case of CSR data X^2 ∼ χ^2_{m−1},
and a hypothesis test can be constructed using this.
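
In spatstat this test is available directly (a sketch, assuming N is a ppp object; the 4 × 4 grid is an illustrative choice):

library(spatstat)
quadratcount(N, nx = 4, ny = 4)  # the counts n_1, ..., n_m themselves
quadrat.test(N, nx = 4, ny = 4)  # chi-squared test of CSR based on X^2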

134 / 172
Figure: Hypothesis testing for CSR using quadrat counts

135 / 172
Example

136 / 172
Ripley’s K -function

Using Ripley’s K-function for testing CSR follows a very similar
approach to that demonstrated previously. Although approximate
distributions exist in the case of CSR in rectangular regions, they
are messy, and instead we consider the Monte Carlo approach.

137 / 172
Simulation envelopes
The first exploratory data technique is to consider simulation
envelopes. The procedure is as follows:
I Compute K̂_0(r), the estimate of Ripley’s K-function for the
point pattern under analysis.
I Estimate the intensity of the point process as λ̂ = n/|A|.
I Simulate s (s typically 99) Poisson processes with intensity
λ = λ̂.
I For each one, compute K̂_i(r), i = 1, ..., s on a vector of values
for r.
I Compute the upper and lower simulation envelopes

U(r) = max_{i=1,..,s} {K̂_i(r)};  D(r) = min_{i=1,..,s} {K̂_i(r)}.

These two could be plotted against r, and have the
property that under CSR, and for each r,
P(K̂_0(r) > U(r)) = P(K̂_0(r) < D(r)) = (s + 1)^{−1}.
138 / 172
Envelope plots
It is typically more useful to look at L̂_0(r) − r = (K̂_0(r)/π)^{1/2} − r
and the simulation envelopes (U(r)/π)^{1/2} − r and (D(r)/π)^{1/2} − r.

Kenv = envelope(N,Kest)
plot(Kenv,sqrt(./pi)-r ~ r)

Figure: envelope plot of (K̂(r)/π)^{1/2} − r against r, showing the
observed curve, the theoretical CSR curve and the hi/lo envelopes.
139 / 172
Hypothesis test

An alternative approach is to obtain a summary test statistic from
K̂_0(r). There are two common test statistics that have been
proposed.
I The first measures globally how different K̂_0(r) is from the
theoretical value of πr^2,

T_I = ∫_0^{r_max} (L̂_0(r) − r)^2 dr.

I The second measures the maximum deviation K̂_0(r) takes
from the null hypothesis value of πr^2,

T_S = sup_r |L̂_0(r) − r|.

140 / 172
Figure: Test statistics

141 / 172
Alternative models of spatial point
processes

Thus far, the only model of a spatial point process we have
encountered is the Poisson process. Here we consider some
simple models for alternative processes.

142 / 172
Clustered processes

The following model is called the Thomas process. It is used to
model clustered event data seen in ecology and microscopy images,
amongst others.
1. Parent events are generated by a homogeneous Poisson
process (with intensity λ_p(s) = λ_p).
2. Each parent produces a Poisson(ξ) distributed number of
offspring.
3. The positions of the offspring relative to their parents are
independent and identically distributed according to a
2-dimensional Gaussian distribution N_2(µ, σ^2 I).
4. The final process is composed of the superposition of offspring
only.
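
A realisation can be simulated with spatstat's rThomas (a minimal sketch; the parameter values are illustrative assumptions):

library(spatstat)
N <- rThomas(kappa = 10,    # parent intensity lambda_p
             scale = 0.02,  # sigma, sd of offspring displacement
             mu = 5,        # xi, expected offspring per parent
             win = square(1))
plot(N)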

143 / 172
Figure: Thomas Cluster Process

144 / 172
First and second order properties

We will consider the first and second order properties of a Thomas
process N with parent intensity λ_p and Poisson(ξ) distributed
number of offspring.
Proposition
Let N be a Thomas process with parent intensity λp and offspring
rate ξ. The intensity of N is λ = λp ξ.

Proposition
Let N be a Thomas process with parent intensity λp and offspring
rate ξ, then N is both stationary and isotropic.

145 / 172
First and second order properties

Proposition
Let N be a Thomas process with parent intensity λ_p and offspring
rate ξ, the second order intensity is given by

γ(r) = λ^2 + (λ_p ξ^2 / (4πσ^2)) exp(−r^2/(4σ^2)),  r > 0,

and consequently, the K-function is given as

K(r; σ^2, λ_p) = πr^2 + λ_p^{−1}(1 − exp(−r^2/(4σ^2))),  r > 0.

The proof of this is reasonably involved, c.f. Cressie.

146 / 172
Model fitting

The K-function can be used for fitting a Thomas process model.
The procedure is as follows:
I For the dataset, compute the estimate K̂(r) of Ripley’s
K-function.
I Find values of σ and λ_p that minimize

∫_{r_min}^{r_max} |(K(r; σ^2, λ_p))^{0.25} − (K̂(r))^{0.25}|^2 dr.

This can be performed in R with thomas.estK.
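
For example (a sketch, assuming N is the observed point pattern as a ppp object):

library(spatstat)
fit <- thomas.estK(N)  # minimum contrast fit; returns kappa (= lambda_p) and sigma2
fit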

147 / 172
Model fitting

148 / 172
Inhibited processes

In this section we will introduce two related models for
inhibited/regular point processes. We will soon see why these are
important in imaging. The first is the Matérn I process.

A point process N_I is called Matérn I if:
1. Events are first generated as a homogeneous Poisson process
with intensity ρ.
2. There exists some fixed “hardcore” distance δ > 0 such that
all events within distance δ of another event are removed.
3. The remaining events form the point process N_I.

149 / 172
First order properties
Let us consider the first order properties of this process.
Proposition
Let N_I be a Matérn I process generated from a Poisson process N_0
with intensity ρ and hardcore distance δ. The intensity is given as

λ = ρ exp(−ρπδ^2).

Proof.
The probability that an arbitrary event in N_0 is retained is the
same as the probability that an arbitrary event has no other events
within distance δ of it. This is equal to exp(−ρπδ^2). Therefore,

λ(s) = P(N_I(ds) = 1) / |ds|
     = P(N_I(ds) = 1 | N_0(ds) = 1) P(N_0(ds) = 1) / |ds|
     = exp(−ρπδ^2) ρ.

150 / 172
Second-order properties

Proposition
Let N_I be a Matérn I process generated from a Poisson process N_0
with intensity ρ and hardcore distance δ. The second order
intensity is given as

γ(r) = ρ^2 exp{−ρ U_δ(r)} for r ≥ δ, and γ(r) = 0 for r < δ,

where U_δ(r) is the area of the union of two circles of radius δ
whose centres are separated by a distance r.

151 / 172
Second-order properties

Proof.

γ(s, u) = P(N_I(ds) = 1, N_I(du) = 1) / (|ds||du|)
        = P(N_I(ds) = 1, N_I(du) = 1 | N_0(ds) = 1, N_0(du) = 1) P(N_0(ds) = 1, N_0(du) = 1) / (|ds||du|)
        = P(N_I(ds) = 1, N_I(du) = 1 | N_0(ds) = 1, N_0(du) = 1) ρ^2.

Let us consider three cases for assessing P(N_I(ds) = 1, N_I(du) = 1 | N_0(ds) = 1, N_0(du) = 1):
1. If r = ||s − u|| < δ then they will definitely both be deleted, therefore

P(N_I(ds) = 1, N_I(du) = 1 | N_0(ds) = 1, N_0(du) = 1) = 0.

2. If r = ||s − u|| > 2δ then we just need to know the probability there are no further events in either circle.
This is given as

exp(−ρπδ^2) exp(−ρπδ^2) = exp(−2ρπδ^2).

3. If δ ≤ r = ||s − u|| ≤ 2δ, then we need to know the probability that there are no further events in the
area U_δ(r). This is given as exp(−ρU_δ(r)).

The result follows.

152 / 172
Pair-correlation function and Ripley’s K -function for
Matern I

Following from this result, the pair correlation function takes the
form

ρ(r) = exp(ρ(2πδ^2 − U_δ(r))) for r ≥ δ, and ρ(r) = 0 for r < δ,

and the Ripley’s K-function is given as

K(r) = 2π ∫_δ^r r′ exp(ρ(2πδ^2 − U_δ(r′))) dr′ for r ≥ δ, and
K(r) = 0 for r < δ.

153 / 172
Matérn II

A second useful model that we will consider is the Matérn II
process.
Definition (Matérn II)
A point process N_II is called Matérn II if:
1. Events are first generated as a homogeneous Poisson process
N_0 with intensity ρ.
2. Independently mark the events {s_1, s_2, ...} with numbers
{Z(s_1), Z(s_2), ...} from any absolutely continuous distribution
F.
3. An event s of N_0 is deleted if there exists another event u
with ||s − u|| < δ and Z(u) < Z(s).
4. The remaining events form the point process N_II.

154 / 172
First order properties of Matern II

Proposition
Let N_II be a Matérn II process generated from a Poisson process N_0
with intensity ρ and hardcore distance δ. The intensity is given as

λ = (1 − exp(−ρπδ^2)) / (πδ^2).

155 / 172
Second-order properties of Matern II

Proposition
Let N_II be a Matérn II process generated from a Poisson process
N_0 with intensity ρ and hardcore distance δ. The second order
intensity is given as

γ(r) = [2U_δ(r)(1 − exp(−ρπδ^2)) − 2πδ^2(1 − exp(−ρU_δ(r)))] / [πδ^2 U_δ(r)(U_δ(r) − πδ^2)]

for r ≥ δ, and γ(r) = 0 for r < δ, where U_δ(r) is the area of the
union of two circles of radius δ whose centres are separated by a
distance r.

156 / 172
Multivariate Point Processes and
Point Patterns

157 / 172
Multivariate Point Processes and Point Patterns
We now turn our attention to the case where we have two (or
more) spatial point processes.
Some examples of this include
I two different types of protein molecule in an image. One is
tagged with a green fluorophore and the other red.
I two different species of plant in a forest.
I the location of nuclear power plants and the location of
leukaemia sufferers.

From: Lagache et al., Nature Communications (2018) 9:698

158 / 172


Multivariate Point Processes and Point Patterns

I One immediate question we might ask ourselves is: “are these
two processes independent of each other, or is there
inter-dependency?”

I Another way to say this is: where I see an event of one
process, am I more or less likely to see an event of the other
process than I would if they were completely independent?

I Before introducing a method for dealing with these types of
data, we will introduce some theory on bivariate point
processes.

159 / 172
Theoretical framework for bivariate point processes

I Let us consider two point processes. We will label these N_1
and N_2.

I Process N_1 has intensity λ_1(s) and second order intensity
γ_1(s, u), and process N_2 has intensity λ_2(s) and second order
intensity γ_2(s, u).

160 / 172
Cross-intensity

Definition (Cross-intensity)
The cross-intensity between N_1 and N_2 at points s and u is
defined as

γ_12(s, u) = lim_{|ds||du|→0} E{N_1(ds)N_2(du)} / (|ds||du|).

161 / 172
Cross-covariance intensity

From this we can further define the cross-covariance intensity of
N_1 and N_2.
Definition (Cross-covariance intensity)
The cross-covariance intensity between N_1 and N_2 at points s and
u is defined as

c_12(s, u) = γ_12(s, u) − λ_1(s)λ_2(u).

162 / 172
I This function provides a measure of dependency between the
two point processes.
I Let us consider the case where the two processes are
independent. Recall, that for two independent random
variables X and Y we have E{XY} = E{X}E{Y}.
I With this in mind, we have

γ_12(s, u) = lim_{|ds||du|→0} E{N_1(ds)N_2(du)} / (|ds||du|)
           = lim_{|ds||du|→0} E{N_1(ds)}E{N_2(du)} / (|ds||du|)
           = lim_{|ds|→0} E{N_1(ds)}/|ds| · lim_{|du|→0} E{N_2(du)}/|du|
           = λ_1(s)λ_2(u),

and following from this the cross-covariance intensity is

c_12(s, u) = 0.
163 / 172
Joint stationarity and isotropy

The concepts of stationarity and isotropy also extend to the
bivariate setting.

Definition (Joint stationary and isotropic)

Let N_1 and N_2 be a pair of spatial point processes. We say they
are jointly stationary and isotropic if they are both individually
stationary and isotropic and the cross-intensity is a function of
r = ||s − u|| only.

In this setting, the cross-intensity and cross-covariance intensity
can be represented as γ_12(r) and c_12(r), respectively.

164 / 172
Cross-K -function
The cross-K -function provides a natural extension to Ripley’s
K -function.

Definition (cross-K-function)
Let N_1 and N_2 be a pair of jointly stationary and isotropic
processes, and let N_ij(r) represent the random number of type j
events within a distance r of an arbitrarily chosen type i event
(i, j = 1, 2, not including that event). The cross-K-function is
defined as

K_ij(r) = λ_j^{−1} E{number of type j events
         within a distance r of an arbitrary type i event}
       = λ_j^{−1} E{N_ij(r)}.

165 / 172
Cross-K -function for independence

I Let us consider the case where N_1 and N_2 are independent
jointly stationary and isotropic processes. It will be the case
that

E{number of type 2 events within a
distance r of an arbitrary type 1 event} = λ_2 πr^2

and therefore
K_12(r) = πr^2.
I Therefore, irrespective of the type of process N_1 and N_2 are, if
they are independent then K_12(r) = πr^2. This therefore
means that we can use K_12(r) to test for independence.

166 / 172
Question

Let N_1 and N_2 be independent processes with intensities λ_1 = 5
and λ_2 = 2, respectively. What is

E{number of type 2 events within a distance 3 of an arbitrary type 1 event}?

(a) 18π.
(b) 45π.
(c) 9π.
(d) π.
(e) 4.5π

167 / 172
I Note also that analogously to the standard Ripley’s
K-function, we have the relationship

K_12(r) = (2π / (λ_1 λ_2)) ∫_0^r r′ γ_12(r′) dr′.

I We also make the observation that γ_12(r) = γ_21(r) (this is
because for random variables X and Y it is true that
E(XY) = E(YX)). Therefore it follows that K_12(r) = K_21(r).

168 / 172
Estimating the cross-K-function

I Estimating the cross-K-function is very similar to how we
estimated Ripley’s K-function; however, this time, we
measure distances between pairs of events of different types.

I Let u_ij be the distance between the ith event of type 1 and
the jth event of type 2, let w_ij be the weights as used before,
and let the numbers of events of type 1 and type 2 be n_1 and
n_2, respectively.
169 / 172
We can construct an estimator of K_12(r) as

λ̂_2 K̃_12(r) = n_1^{−1} Σ_{i=1}^{n_1} Σ_{j=1}^{n_2} w_ij I(u_ij ≤ r),

but we can also construct one as

λ̂_1 K̃_21(r) = n_2^{−1} Σ_{j=1}^{n_2} Σ_{i=1}^{n_1} w_ji I(u_ij ≤ r).

We can therefore average the two estimates to give

K̂_12(r) = (n_1 n_2)^{−1} |A| [ n_1 Σ_{i=1}^{n_1} Σ_{j=1}^{n_2} w_ij I(u_ij ≤ r)
         + n_2 Σ_{j=1}^{n_2} Σ_{i=1}^{n_1} w_ji I(u_ij ≤ r) ] / (n_1 + n_2)
       = (n_1 n_2)^{−1} |A| Σ_{i=1}^{n_1} Σ_{j=1}^{n_2} w*_ij I(u_ij ≤ r)

170 / 172
Question

where wij∗ equals...?


(a) n1 wij + n2 wji .
(b) (wij + wji )/(n1 + n2 ).
(c) wij + wji .
(d) (n1 wij + n2 wji )/(n1 + n2 ).
(e) wij .

171 / 172
Testing for independence

I Using the K-function to test

H0: N_1 and N_2 are independent

is somewhat troublesome.
I This is because the distribution of K̂_12(r) is dependent on the
type of processes that are being tested. E.g. the distribution
of K̂_12(r) when N_1 and N_2 are Poisson is different from when
they are Thomas.
I Monte Carlo methods exist.
I Monte Carlo methods exist.

172 / 172
