Lecture 1: Introduction to Spatial Econometrics

Professor: Mauricio Sarrias

Universidad de Talca

October 7, 2020
1 Introduction to Spatial Econometrics
    Goals and Mandatory Reading
    Why Do We Need Spatial Econometrics?
    Spatial Heterogeneity and Dependence
    Spatial Autocorrelation
2 Spatial Weight Matrix
    Definition
    Weights Based on Boundaries
    From Contiguity to the W Matrix
    Weights Based on Distance
    Row Standardization
    Spatial Lag
    Higher-Order Spatial Neighbors
3 Examples of Weight Matrices in R
    Creating Contiguity Neighbors
    Creating Distance-Based Weights
4 Testing for Spatial Autocorrelation
    Indicators of Spatial Association
    Moran’s I
Goals

Goals of this Lecture


To understand the concept of spatial heterogeneity and spatial
autocorrelation.
To understand the concept of the Spatial Weight Matrix W.
To learn how to obtain the spatial weight matrices in R.
To derive and understand the main test for spatial autocorrelation.
To learn how to perform the Moran’s I test in R.
Reading for: Introduction to Spatial Econometrics

(A) - Chapter 2
(LK)-Sections 1.1-1.2
(A)-Chapter 3
(AR)-Chapter 3 and 4.
Anselin, L. (1995). Local Indicators of Spatial Association - LISA.
Geographical Analysis, 27(2), 93-115.
Dall’Erba, S. (2005). Distribution of regional income and regional funds
in Europe 1989-1999: an exploratory spatial data analysis. The Annals of
Regional Science, 39(1), 121-148.
Celebioglu, F., & Dall’Erba, S. (2010). Spatial disparities across the
regions of Turkey: an exploratory spatial data analysis. The Annals of
Regional Science, 45(2), 379-400.
Why do We Need Spatial Econometrics?

An important aspect of studying spatial units (cities, regions, countries) is the
potential relationships and interactions between them.
Example: Modeling pollution:
Should we analyze regions as independent units?
No, regions are spatially interrelated through ecological and economic
interactions.
Existence of environmental externalities:
an increase in region i’s pollution will affect pollution in neighboring regions, but
the impact will be smaller for more distant regions.
Figure: Environmental Externalities (five regions, R1 to R5)
Distance Matters
Key Point:
Waldo Tobler’s first law of geography: “everything is related to everything
else, but near things are more related than distant things.”

This first law is the foundation of the fundamental concepts of spatial
dependence and spatial autocorrelation.

Figure: Professor Waldo Tobler

Why do We Need Spatial Econometrics?

Spatial econometrics deals with spatial effects:

(I) Spatial heterogeneity

Definition (Spatial heterogeneity)
Spatial heterogeneity relates to differences in the effects of space across the
sample units. Formally, for spatial unit i:

$$y_i = f_i(x_i) + \epsilon_i \implies y_i = \beta_i x_i + \epsilon_i$$

That is, the relationship lacks stability over geographical space.
(II) Spatial dependence

Definition (Spatial dependence)
What happens in i depends on what happens in j. Formally,

$$y_i = f(y_i, y_j) + \epsilon_i, \quad \forall i \neq j.$$
Spatial Dependence

How would you model this situation?

Figure: Environmental Externalities (five regions, R1 to R5)
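Spatial Dependence

Using our previous example, we would like to estimate

$$
\begin{aligned}
y_1 &= \beta_{21} y_2 + \beta_{31} y_3 + \beta_{41} y_4 + \beta_{51} y_5 + \epsilon_1 \\
y_2 &= \beta_{12} y_1 + \beta_{32} y_3 + \beta_{42} y_4 + \beta_{52} y_5 + \epsilon_2 \\
y_3 &= \beta_{13} y_1 + \beta_{23} y_2 + \beta_{43} y_4 + \beta_{53} y_5 + \epsilon_3 \\
y_4 &= \beta_{14} y_1 + \beta_{24} y_2 + \beta_{34} y_3 + \beta_{54} y_5 + \epsilon_4 \\
y_5 &= \beta_{15} y_1 + \beta_{25} y_2 + \beta_{35} y_3 + \beta_{45} y_4 + \epsilon_5
\end{aligned}
\tag{1}
$$

where $\beta_{ji}$ is the effect of pollution in region j on pollution in region i.

What is the problem with this modeling strategy?
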
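Spatial Dependence

With a single cross-section of n = 5 regions, system (1) has n(n − 1) = 20
unknown coefficients but only 5 observations, so it cannot be estimated without
further restrictions. Under standard econometric modeling, it is therefore
impossible to model spatial dependence.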
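The counting argument behind this statement can be sketched in a few lines of R. This is only an illustration, assuming a single cross-section with one observation per region; the object names are made up for the example.

```r
# Count the unknowns in the unrestricted system (1): the pollution of each
# region depends on the pollution of each of the other n - 1 regions.
n <- 5                      # regions R1, ..., R5
n_params <- n * (n - 1)     # one beta_ji per ordered pair (j, i) with j != i
n_obs <- n                  # assumption: one observation per region

c(parameters = n_params, observations = n_obs)
n_params > n_obs            # more unknowns than data points
```

The gap grows quadratically with n, which is why a restriction on the structure of the interactions is needed.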
Spatial Autocorrelation

Autocorrelation =⇒ the correlation of a variable with itself

Time series: the value of a variable at time t depends on the value of the
same variable at time t − 1.
Space: the correlation between the values of the variable at two different
locations.

Definition (Spatial Autocorrelation)
Correlation between the same attribute at two (or more) different
locations.
Coincidence of value similarity with location similarity.
Under spatial dependence it is not possible to change the location of the
values of a variable without affecting the information in the sample.
It can be positive or negative.
Spatial Autocorrelation

Definition (Positive Autocorrelation)
Observations with high (or low) values of a variable tend to be clustered in
space.

Figure: Positive Autocorrelation
Spatial Autocorrelation

Definition (Negative Autocorrelation)
Locations tend to be surrounded by neighbors having very dissimilar values.

Figure: Negative Autocorrelation
Spatial Autocorrelation: Another Example
Figure (two panels): Positive Spatial Autocorrelation; Negative Spatial Autocorrelation
Spatial Autocorrelation

Definition (Spatial Randomness)
When neither of the two situations occurs.
Spatial Autocorrelation
Two main sources of spatial autocorrelation (Anselin, 1988):
Measurement errors.
Importance of Space.
The second source is of much more interest.

Figure: Professor Luc Anselin


Why does space matter?

The essence of regional science and new economic geography is that
location and distance matter.
What is observed at one point is determined by what happens elsewhere in
the system.
First Law of Geography Again

Tobler’s First Law of Geography


Everything depends on everything else, but closer things more so

Important ideas:
Existence of Spatial Dependence.
Structure of Spatial Dependence
Distance decay.
Closeness = Similarities.
Spatial Weight Matrix

One crucial issue in spatial econometrics is the problem of formally
incorporating spatial dependence into the model.

Question:
What would be a good criterion to define closeness in space? In other
words, how do we determine which other units in the system influence the one
under consideration?
Spatial Weight Matrix

The device typically used in spatial analysis is the so-called spatial weight
matrix, or simply the W matrix.
It imposes a structure by specifying which units are the neighbors of each location.
It assigns weights that measure the intensity of the relationship between pairs
of spatial units.
It is not necessarily symmetric.
Spatial Weight Matrix

Definition (W Matrix)
Let n be the number of spatial units. The spatial weight matrix, W, is an n × n
symmetric, non-stochastic matrix with non-negative elements, where $w_{ij}$ is the
element in row i and column j. The values of $w_{ij}$, the weights for each pair of
locations, are assigned by preset rules that define the spatial relations among
locations. By convention, $w_{ii} = 0$ for the diagonal elements.

The symmetry assumption can be dropped later.

$$
W = \begin{pmatrix}
w_{11} & w_{12} & \cdots & w_{1n} \\
w_{21} & w_{22} & \cdots & w_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
w_{n1} & w_{n2} & \cdots & w_{nn}
\end{pmatrix}
$$
Spatial Weight Matrix

Two main approaches:

1. Based on contiguity.
2. Based on distance.
Weights Based on Boundaries

The availability of polygon or lattice data permits the construction of
contiguity-based spatial weight matrices. A typical specification of the
contiguity relationship in the spatial weight matrix is

$$
w_{ij} =
\begin{cases}
1 & \text{if } i \text{ and } j \text{ are contiguous} \\
0 & \text{if } i \text{ and } j \text{ are not contiguous}
\end{cases}
\tag{2}
$$

Binary contiguity criteria:
Rook criterion (common border)
Bishop criterion (common vertex)
Queen criterion (either common border or common vertex)
Rook Contiguity

Which regions are the neighbors of region 5?

Figure: Rook Contiguity

1 2 3
4 5 6
7 8 9
Common border: 2, 4, 6, 8
Bishop Contiguity

Figure: Bishop Contiguity

1 2 3
4 5 6
7 8 9

Common vertex: 1, 3, 7, 9
Queen Contiguity

Figure: Queen Contiguity

1 2 3
4 5 6
7 8 9

Common border or vertex: 1, 2, 3, 4, 6, 7, 8, 9.
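Rook Contiguity

For the 3 × 3 grid of regions

1 2 3
4 5 6
7 8 9

the rook contiguity matrix is

$$
W = \begin{pmatrix}
0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 & 0
\end{pmatrix}
$$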
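Bishop Contiguity

For the same 3 × 3 grid of regions

1 2 3
4 5 6
7 8 9

the bishop contiguity matrix is

$$
W = \begin{pmatrix}
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 1 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0
\end{pmatrix}
$$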
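The step from the grid to the W matrix can be reproduced with a short base R sketch. The helper grid_contiguity is hypothetical (it is not part of spdep or of the lecture); it assumes a regular grid whose cells are numbered row by row, as in the 3 × 3 example.

```r
# Build rook, bishop and queen contiguity matrices for a regular grid.
# Assumption: cells are numbered row by row (1 2 3 / 4 5 6 / 7 8 9).
grid_contiguity <- function(nrow = 3, ncol = 3) {
  n <- nrow * ncol
  r <- (seq_len(n) - 1) %/% ncol + 1   # row index of each cell
  k <- (seq_len(n) - 1) %% ncol + 1    # column index of each cell
  dr <- abs(outer(r, r, "-"))          # row gap between every pair of cells
  dk <- abs(outer(k, k, "-"))          # column gap between every pair of cells
  rook   <- (dr + dk == 1) * 1         # cells sharing a border
  bishop <- (dr == 1 & dk == 1) * 1    # cells sharing only a vertex
  list(rook = rook, bishop = bishop, queen = rook + bishop)
}

W <- grid_contiguity()
which(W$rook[5, ] == 1)     # rook neighbors of region 5
which(W$bishop[5, ] == 1)   # bishop neighbors of region 5
which(W$queen[5, ] == 1)    # queen neighbors of region 5
```

The queen matrix is simply the sum of the other two, because on a regular grid a pair of cells shares either a border or a vertex, never both.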
W based on distance

Weights may also be defined as a function of the distance between regions
i and j, $d_{ij}$.
$d_{ij}$ is usually computed as the distance between their centroids (or another
relevant point).
Let $x_i$ and $x_j$ be the longitudes and $y_i$ and $y_j$ the latitudes of
regions i and j, respectively.
Distance Metric

Definition (Minkowski metric)
Consider two points i and j, with respective coordinates $(x_i, y_i)$ and $(x_j, y_j)$:

$$d^{p}_{ij} = \left( |x_i - x_j|^{p} + |y_i - y_j|^{p} \right)^{1/p} \tag{3}$$

Definition (Euclidean metric)
Consider the Minkowski metric and set p = 2; then

$$d^{e}_{ij} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}. \tag{4}$$

Definition (Manhattan metric)
Consider the Minkowski metric and set p = 1; then

$$d^{m}_{ij} = |x_i - x_j| + |y_i - y_j|. \tag{5}$$
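As a quick check of equations (3) to (5), the following base R sketch computes the Minkowski distance for two made-up points and compares it with the built-in dist function. The function name minkowski and the coordinates are illustrative assumptions.

```r
# Minkowski distance between two points a and b, equation (3).
minkowski <- function(a, b, p = 2) {
  sum(abs(a - b)^p)^(1 / p)
}

pt_i <- c(0, 0)   # made-up coordinates (x_i, y_i)
pt_j <- c(3, 4)   # made-up coordinates (x_j, y_j)

d_euclid    <- minkowski(pt_i, pt_j, p = 2)   # equation (4)
d_manhattan <- minkowski(pt_i, pt_j, p = 1)   # equation (5)
c(euclidean = d_euclid, manhattan = d_manhattan)

# Base R offers the same metrics through stats::dist()
dist(rbind(pt_i, pt_j), method = "minkowski", p = 2)
dist(rbind(pt_i, pt_j), method = "manhattan")
```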
Distance Metric

Euclidean distance is not necessarily the shortest distance once the curvature
of the Earth is taken into account.

Definition (Great Circle Distance)
Consider two points i and j, with respective coordinates $(x_i, y_i)$ and $(x_j, y_j)$
(longitude and latitude, expressed in radians):

$$d^{cd}_{ij} = r \times \arccos\left[ \cos|x_i - x_j| \cos y_i \cos y_j + \sin y_i \sin y_j \right] \tag{6}$$

where r is the Earth’s radius. The arc distance is obtained in miles with
r = 3959 and in kilometers with r = 6371.
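Equation (6) can be coded directly. The sketch below is a minimal illustration, assuming that the inputs are longitude and latitude in decimal degrees (converted to radians inside the function); the name great_circle is hypothetical.

```r
# Great circle distance, equation (6).
# Assumption: inputs are decimal degrees; r defaults to kilometers.
great_circle <- function(lon_i, lat_i, lon_j, lat_j, r = 6371) {
  to_rad <- pi / 180
  x_i <- lon_i * to_rad; y_i <- lat_i * to_rad
  x_j <- lon_j * to_rad; y_j <- lat_j * to_rad
  cos_angle <- cos(abs(x_i - x_j)) * cos(y_i) * cos(y_j) + sin(y_i) * sin(y_j)
  cos_angle <- pmin(1, pmax(-1, cos_angle))  # guard against rounding outside [-1, 1]
  r * acos(cos_angle)
}

# A quarter of the equator: should equal r * pi / 2
great_circle(0, 0, 90, 0)             # kilometers
great_circle(0, 0, 90, 0, r = 3959)   # miles
```

The clamping step is a numerical precaution: for nearly identical points the bracketed term can exceed 1 by rounding error, and acos would then return NaN.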
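W based on distance

Inverse distance:

$$
w_{ij} =
\begin{cases}
\dfrac{1}{d_{ij}^{\alpha}} & \text{if } i \neq j \\
0 & \text{if } i = j
\end{cases}
\tag{7}
$$

Typically, α = 1 or α = 2.
Negative exponential model:

$$w_{ij} = \exp\left( -\frac{d_{ij}}{\alpha} \right) \tag{8}$$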
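A small base R sketch of equations (7) and (8), using three made-up centroids. Setting the diagonal of the negative exponential matrix to zero is an assumption that follows the convention $w_{ii} = 0$ from the definition of W; equation (8) itself does not state it.

```r
# Inverse-distance and negative exponential weights from a distance matrix.
coords <- rbind(c(0, 0), c(3, 4), c(6, 8))   # made-up centroids
D <- as.matrix(dist(coords))                 # Euclidean d_ij
alpha <- 1

W_inv <- 1 / D^alpha        # equation (7); the diagonal is Inf at this point
diag(W_inv) <- 0            # w_ii = 0 by convention

W_exp <- exp(-D / alpha)    # equation (8)
diag(W_exp) <- 0            # assumption: same convention for the diagonal

round(W_inv, 3)
round(W_exp, 5)
```

Both schemes give a dense matrix in which every pair of units is connected, with weights that shrink as distance grows.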
W based on distance

k-nearest neighbors: We explicitly limit the number of neighbors.

$$
w_{ij} =
\begin{cases}
1 & \text{if the centroid of } j \text{ is one of the } k \text{ nearest centroids to that of } i \\
0 & \text{otherwise}
\end{cases}
\tag{9}
$$

Threshold distance (distance band weights): In contrast to the
k-nearest neighbors method, the threshold distance specifies that
region i is a neighbor of j if the distance between them does not exceed a
specified maximum distance:

$$
w_{ij} =
\begin{cases}
1 & \text{if } 0 \le d_{ij} \le d_{max} \\
0 & \text{if } d_{ij} > d_{max}
\end{cases}
\tag{10}
$$

To avoid isolates that would result from too stringent a critical distance,
the distance must be chosen such that each location has at least one
neighbor. Such a distance conforms to a max-min criterion, i.e., it is the
largest of the nearest neighbor distances.
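Both rules, including the max-min choice of the threshold, can be sketched in base R. The four centroids below are made up, and the diagonal of the distance matrix is set to Inf so that a unit is never counted as its own neighbor (an implementation choice for this example).

```r
# k-nearest-neighbor and distance-band weights from a distance matrix.
coords <- rbind(c(0, 0), c(1, 0), c(3, 0), c(7, 0))  # made-up centroids
D <- as.matrix(dist(coords))
diag(D) <- Inf                      # a unit is never its own neighbor

# Equation (9): flag the k smallest distances in each row
k <- 1
W_knn <- t(apply(D, 1, function(d) as.numeric(rank(d, ties.method = "first") <= k)))

# Max-min criterion: the largest of the nearest-neighbor distances
d_max <- max(apply(D, 1, min))

# Equation (10) with w_ii = 0
W_band <- (D <= d_max) * 1

W_knn
d_max
rowSums(W_band)                     # every unit has at least one neighbor
```

Note that the k-nearest-neighbor matrix is generally not symmetric (unit 2 is the nearest neighbor of unit 3, but not the other way round), whereas the distance-band matrix is.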
Row standardization

W matrices are used to compute weighted averages in which more weight is
placed on nearby observations than on distant observations.
The elements of a row-standardized weights matrix equal

$$w^{s}_{ij} = \frac{w_{ij}}{\sum_{j} w_{ij}}.$$

This ensures that all weights are between 0 and 1 and facilitates the
interpretation of operations with the weights matrix as an averaging of
neighboring values.
Under row standardization, the elements of each row sum to unity.
The row-standardized weights matrix also ensures that the spatial
parameters in many spatial stochastic processes are comparable between
models.
Under row standardization the matrices are, in general, no longer symmetric.
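Row standardization is a one-line operation in base R. The sketch below uses a made-up binary W for four units with different numbers of neighbors, and assumes that no unit is an isolate (a zero row sum would produce a division by zero).

```r
# Row-standardize a binary contiguity matrix.
W <- matrix(c(0, 1, 1, 1,
              1, 0, 1, 0,
              1, 1, 0, 0,
              1, 0, 0, 0), nrow = 4, byrow = TRUE)   # made-up symmetric W

Ws <- W / rowSums(W)     # element (i, j) is divided by the sum of row i
Ws
rowSums(Ws)              # each row sums to one
isSymmetric(W)           # the binary matrix is symmetric
isSymmetric(Ws)          # the row-standardized matrix is not
```

Symmetry is lost because units 1 and 4 have different numbers of neighbors: the link carries weight 1/3 from the point of view of unit 1 and weight 1 from the point of view of unit 4.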
Row standardization

The row-standardized matrix is also known in the literature as the
row-stochastic matrix:

Definition (Row-stochastic Matrix)
A real n × n matrix A is called a Markov matrix, or row-stochastic matrix,
if
1. $a_{ij} \ge 0$ for $1 \le i, j \le n$;
2. $\sum_{j=1}^{n} a_{ij} = 1$ for $1 \le i \le n$.

An important characteristic of the row-stochastic matrix is related to its
eigenvalues:

Theorem (Eigenvalues of a row-stochastic matrix)
Every eigenvalue $\omega_i$ of a row-stochastic matrix satisfies $|\omega_i| \le 1$.

Therefore, the eigenvalues of the row-stochastic (i.e., row-normalized,
row-standardized, or Markov) neighborhood matrix $W^s$ are in the range [−1, +1]
(they are real whenever the underlying W is symmetric).
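The theorem can be checked numerically. The sketch below row-standardizes a hypothetical first-order contiguity matrix for five regions arranged in a row and inspects the eigenvalues with base R's eigen function.

```r
# Eigenvalues of a row-standardized W for five regions in a row.
W <- matrix(0, 5, 5)
W[cbind(1:4, 2:5)] <- 1      # links 1-2, 2-3, 3-4, 4-5
W <- W + t(W)                # make the contiguity matrix symmetric
Ws <- W / rowSums(W)         # row-stochastic version

ev <- eigen(Ws, only.values = TRUE)$values
ev
max(Mod(ev))                 # no eigenvalue exceeds one in modulus
```

The largest eigenvalue of a row-stochastic matrix is exactly one, since the rows sum to one and the vector of ones is therefore an eigenvector.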
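Spatial Lag

The spatial lag operator takes the form $y_L = Wy$ with dimension n × 1, where
each element is given by $y_{Li} = \sum_j w_{ij} y_j$, i.e., a weighted sum of the y
values in the neighborhood of i.
For example:

$$
Wy = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}
\begin{pmatrix} 10 \\ 50 \\ 30 \end{pmatrix}
= \begin{pmatrix} 50 \\ 10 + 30 \\ 50 \end{pmatrix}
\tag{11}
$$

Using a row-standardized weight matrix:

$$
W^{s} y = \begin{pmatrix} 0 & 1 & 0 \\ 0.5 & 0 & 0.5 \\ 0 & 1 & 0 \end{pmatrix}
\begin{pmatrix} 10 \\ 50 \\ 30 \end{pmatrix}
= \begin{pmatrix} 50 \\ 5 + 15 \\ 50 \end{pmatrix}
\tag{12}
$$

Therefore, when W is row-standardized, each element $(W^{s} y)_i$ is interpreted as a
weighted average of the y values for i’s neighbors.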
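Examples (11) and (12) can be reproduced with plain matrix multiplication in base R. The object names are chosen for illustration only.

```r
# Spatial lag with a binary and a row-standardized W, examples (11) and (12).
W <- matrix(c(0, 1, 0,
              1, 0, 1,
              0, 1, 0), nrow = 3, byrow = TRUE)
y <- c(10, 50, 30)

lag_binary <- drop(W %*% y)      # sum of the neighbors' values
Ws <- W / rowSums(W)             # row standardization
lag_std <- drop(Ws %*% y)        # average of the neighbors' values

lag_binary   # 50 40 50
lag_std      # 50 20 50
```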
Higher-Order Neighbors

How do we define higher-order neighbors?

We might be interested in the neighbors of the neighbors of spatial unit i.
We denote the spatial weight matrix of order l by $W^{l}$.
The spatial weight matrix of order l = 2 is given by $W^{2} = WW$.
The spatial weight matrix of order l = 3 is given by $W^{3} = WWW$.
As an illustration, consider the following structure for our previous
example:

$$
W = \begin{pmatrix}
0 & 1 & 0 & 0 & 0 \\
1 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 1 \\
0 & 0 & 0 & 1 & 0
\end{pmatrix}
\tag{13}
$$
Higher-Order Neighbors
Then $W^{2} = WW$, based on the 5 × 5 first-order contiguity matrix W from
(13), is:

$$
W^{2} = \begin{pmatrix}
1 & 0 & 1 & 0 & 0 \\
0 & 2 & 0 & 1 & 0 \\
1 & 0 & 2 & 0 & 1 \\
0 & 1 & 0 & 2 & 0 \\
0 & 0 & 1 & 0 & 1
\end{pmatrix}
\tag{14}
$$

Figure: Higher-Order Neighbors (two diagrams of regions R1 to R5)
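The product in (14) is easy to verify in base R. The last step, which extracts a binary second-order neighbor matrix, is an illustrative addition and not part of the slides: it keeps the pairs that are reachable in two steps but are neither first-order neighbors nor the unit itself.

```r
# Second-order weights for the five-region chain in (13).
W <- matrix(0, 5, 5)
W[cbind(1:4, 2:5)] <- 1      # R1-R2, R2-R3, R3-R4, R4-R5
W <- W + t(W)

W2 <- W %*% W                # equation (14): number of two-step paths
W2

# Illustrative: binary second-order neighbors only
N2 <- (W2 > 0 & W == 0) * 1
diag(N2) <- 0                # drop the paths that return to the origin
which(N2[3, ] == 1)          # second-order neighbors of R3
```

The nonzero diagonal of $W^{2}$ counts the two-step paths that leave a region and come back, which is why the raw power is usually cleaned before it is used as a neighbor matrix.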
Examples in R

Creating spatial weight matrices by hand is tedious (and almost
impossible).
However, several statistical software packages allow us to create
them in a very simple fashion.
First, we need the shapefile, which contains the geographical information:
It is a digital vector storage format for geometric location and associated
attribute information.
Mandatory files:
.shp: shape format; the feature geometry itself,
.shx: shape index format; a positional index of the feature geometry to
allow seeking forwards and backwards quickly,
.dbf: attribute format; columnar attributes for each shape, in dBase IV
format.
Examples in R
library("maptools")

If the shape file mr_chile.shp is in the same folder, then we can load it into R using the
command readShapeSpatial:
setwd("~/Dropbox/Mis Clases/Spatial Econometrics/Children")
mr <- readShapeSpatial("mr_chile.shp")

## Warning: readShapeSpatial is deprecated; use rgdal::readOGR or sf::st_read


## Warning: readShapePoly is deprecated; use rgdal::readOGR or sf::st_read

class(mr)

## [1] "SpatialPolygonsDataFrame"
## attr(,"package")
## [1] "sp"

The function readShapeSpatial reads data from the shapefile into a Spatial object, here a
SpatialPolygonsDataFrame from the sp package. The function names gives us the names of the
variables in the .dbf file associated with the shapefile.
names(mr)

## [1] "ID" "NAME" "NAME2" "URB_POP" "RUR_POP"


## [6] "MALE_POP" "TOT_POP" "FEM_POP" "N_PARKS" "N_PLAZA"
Examples in R

plot(mr, main = "Metropolitan Region-Chile", axes = TRUE)

Figure: Metropolitan Region-Chile (map of the region; longitude from −71.5 to −70.0 on the horizontal axis, latitude from −34.5 to −33.0 on the vertical axis)

W in R

To create spatial weight matrices we need to use the spdep package:

library("spdep")

In the spdep package, neighbor relationships between n observations are represented by
an object of class “nb”.
The function poly2nb is used to build the contiguity-based neighbor lists from
which weight matrices are constructed.
First, we create a neighbor list based on the ‘Queen’ criterion for the communes of the
Metropolitan Region:

queen.w <- poly2nb(mr, row.names = mr$NAME, queen = TRUE)

W in R

summary(queen.w)

## Neighbour list object:


## Number of regions: 52
## Number of nonzero links: 292
## Percentage nonzero weights: 10.79882
## Average number of links: 5.615385
## Link number distribution:
##
## 2 3 4 5 6 7 8 9 10 12
## 3 2 7 15 10 10 2 1 1 1
## 3 least connected regions:
## Tiltil San Pedro Maria Pinto with 2 links
## 1 most connected region:
## San Bernardo with 12 links
W in R
To transform the neighbor list into a spatial weights object (class “listw”), we can use the function nb2listw; style = "W" requests row-standardized weights:

queen.wl <- nb2listw(queen.w, style = "W")


summary(queen.wl)

## Characteristics of weights list object:


## Neighbour list object:
## Number of regions: 52
## Number of nonzero links: 292
## Percentage nonzero weights: 10.79882
## Average number of links: 5.615385
## Link number distribution:
##
## 2 3 4 5 6 7 8 9 10 12
## 3 2 7 15 10 10 2 1 1 1
## 3 least connected regions:
## Tiltil San Pedro Maria Pinto with 2 links
## 1 most connected region:
## San Bernardo with 12 links
##
## Weights style: W
## Weights constants summary:
## n nn S0 S1 S2
## W 52 2704 52 19.76751 216.466
W in R

Now, we construct binary contiguity neighbors using the Rook criterion:

# Rook W
rook.w <- poly2nb(mr, row.names = mr$NAME, queen = FALSE)
summary(rook.w)

## Neighbour list object:


## Number of regions: 52
## Number of nonzero links: 272
## Percentage nonzero weights: 10.05917
## Average number of links: 5.230769
## Link number distribution:
##
## 2 3 4 5 6 7 8 9 10
## 3 3 12 16 7 6 2 1 2
## 3 least connected regions:
## Tiltil San Pedro Maria Pinto with 2 links
## 2 most connected regions:
## Santiago San Bernardo with 10 links
W in R
Finally, we can plot the weight matrices using the following set of commands.
# Plot Queen and Rook W Matrices
plot(mr, border = "grey")
plot(queen.w, coordinates(mr), add = TRUE, col = "red")
plot(rook.w, coordinates(mr), add = TRUE, col = "yellow")
W in R
We now construct spatial weight matrices using the k-nearest neighbors criterion.
First, we extract the coordinates:
# K-neighbors
coords <- coordinates(mr) # coordinates of centroids
head(coords, 5) # show coordinates

## [,1] [,2]
## 0 -70.65599 -33.45406
## 1 -70.71742 -33.50027
## 2 -70.74504 -33.42278
## 3 -70.67735 -33.38372
## 4 -70.67640 -33.56294

k1neigh <- knearneigh(coords, k = 1, longlat = TRUE) # 1-nearest neighbor


k2neigh <- knearneigh(coords, k = 2, longlat = TRUE) # 2-nearest neighbor

The function coordinates extracts the centroid coordinates from the spatial object, whereas the
function knearneigh returns an object containing a matrix with the indices of the points
that belong to the set of the k nearest neighbors of each point.
The argument k indicates the number of nearest neighbors to be returned.
If the point coordinates are longitude-latitude decimal degrees, set longlat = TRUE: great
circle distances are then used, measured in kilometers.
The objects k1neigh and k2neigh are of class knn.
W in R
Weight matrices based on inverse distance can be computed in the following way:

# Inverse weight matrix


dist.mat <- as.matrix(dist(coords, method = "euclidean"))
dist.mat[1:5, 1:5]

## 0 1 2 3 4
## 0 0.00000000 0.07687010 0.09438408 0.07350782 0.11078109
## 1 0.07687010 0.00000000 0.08226867 0.12324109 0.07489489
## 2 0.09438408 0.08226867 0.00000000 0.07814455 0.15606360
## 3 0.07350782 0.12324109 0.07814455 0.00000000 0.17922003
## 4 0.11078109 0.07489489 0.15606360 0.17922003 0.00000000

dist.mat.inv <- 1 / dist.mat # 1 / d_{ij}


diag(dist.mat.inv) <- 0 # 0 in the diagonal
dist.mat.inv[1:5, 1:5]

## 0 1 2 3 4
## 0 0.000000 13.008960 10.595007 13.603994 9.026811
## 1 13.008960 0.000000 12.155295 8.114177 13.352046
## 2 10.595007 12.155295 0.000000 12.796797 6.407644
## 3 13.603994 8.114177 12.796797 0.000000 5.579733
## 4 9.026811 13.352046 6.407644 5.579733 0.000000
W in R

# Standardized inverse weight matrix


dist.mat.inve <- mat2listw(dist.mat.inv, style = "W", row.names = mr$NAME)
summary(dist.mat.inve)

## Characteristics of weights list object:


## Neighbour list object:
## Number of regions: 52
## Number of nonzero links: 2652
## Percentage nonzero weights: 98.07692
## Average number of links: 51
## Link number distribution:
##
## 51
## 52
## 52 least connected regions:
## Santiago Cerillos Cerro Navia Conchali El Bosque Estacion Central La Cisterna La
## 52 most connected regions:
## Santiago Cerillos Cerro Navia Conchali El Bosque Estacion Central La Cisterna La
##
## Weights style: W
## Weights constants summary:
## n nn S0 S1 S2
## W 52 2704 52 2.902384 214.3332
W in R
Figure (four panels): Queen, 1-Neigh, 2-Neigh, Inverse Distance

Global Autocorrelation

Indicators of spatial association


1 Global Autocorrelation
2 Local Autocorrelation

Definition (Global Autocorrelation)


It is a measure of overall clustering in the data. It yields only one statistic to
summarize the whole study area (Homogeneity).
1 Moran’s I.
2 Geary’s C.
3 Getis and Ord’s G(d)

Definition (Local Autocorrelation)


A measure of spatial autocorrelation for each individual location.
Local Indicators of Spatial Association (LISA)
Moran’s I

This statistic is given by:

$$
I = \frac{\sum_{i=1}^{n} \sum_{j=1, j \neq i}^{n} w_{ij} (x_i - \bar{x})(x_j - \bar{x})}{S_0 \sum_{i=1}^{n} (x_i - \bar{x})^2 / n}
  = \frac{n \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} (x_i - \bar{x})(x_j - \bar{x})}{S_0 \sum_{i=1}^{n} (x_i - \bar{x})^2} \tag{15}
$$

where $S_0 = \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij}$ and $w_{ij}$ is an element of the spatial weight matrix
that measures spatial distance or connectivity between regions i and j. In
matrix form:

$$
I = \frac{n}{S_0} \frac{z' W z}{z' z} \tag{16}
$$

where $z = x - \bar{x}$. If the W matrix is row standardized, then:

$$
I = \frac{z' W^{s} z}{z' z} \tag{17}
$$
because $S_0 = n$. Values typically range from -1 (perfect dispersion) to +1 (perfect
correlation), although the exact bounds depend on W. A value close to zero indicates a random spatial pattern.
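As an illustration, the three expressions (15) to (17) can be evaluated in base R. This is only a sketch: the binary "line" contiguity matrix W_toy and the random vector x_toy are hypothetical stand-ins, not the lecture's data.

```r
set.seed(123)
n <- 6

# Hypothetical binary contiguity: regions on a line, i and i + 1 are neighbors
W_toy <- matrix(0, n, n)
for (i in 1:(n - 1)) {
  W_toy[i, i + 1] <- 1
  W_toy[i + 1, i] <- 1
}

x_toy <- rnorm(n)
z <- x_toy - mean(x_toy)   # deviations from the mean
S0 <- sum(W_toy)

# Equation (15): explicit double sum (the diagonal of W_toy is zero)
num <- 0
for (i in 1:n) {
  for (j in 1:n) {
    num <- num + W_toy[i, j] * z[i] * z[j]
  }
}
I_sum <- n * num / (S0 * sum(z^2))

# Equation (16): matrix form
I_mat <- (n / S0) * as.numeric(t(z) %*% W_toy %*% z) / as.numeric(t(z) %*% z)

# Equation (17): row-standardized W, for which S0 = n
Ws_toy <- W_toy / rowSums(W_toy)
I_std <- as.numeric(t(z) %*% Ws_toy %*% z) / sum(z^2)

c(I_sum = I_sum, I_mat = I_mat, I_std = I_std)
```

I_sum and I_mat coincide by construction. I_std generally differs from them because row standardization changes the weights themselves.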
Moran Scatterplot
A very useful tool for understanding the Moran’s I test

[Figure: Moran scatterplot with the spatial lag Wx on the vertical axis; the plane is divided into Quadrant I, Quadrant II, Quadrant III, and Quadrant IV.]
Moran’s I

Note that:

$$
\hat{\beta}_{OLS} = \frac{\sum_{i} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i} (x_i - \bar{x})^2}
$$

Therefore?
Remark
With a row-standardized W, I is equivalent to the slope coefficient of a linear
regression of the spatial lag Wx on the observation vector x, both measured in
deviations from their means. It is, however, not equivalent to the slope of x on
Wx, which would be a more natural specification.
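The remark can be checked numerically. The sketch below assumes a row-standardized weight matrix built from a hypothetical "line" contiguity structure and simulated data; none of these objects come from the lecture's labs.

```r
set.seed(42)
n <- 8

# Hypothetical binary contiguity on a line, then row-standardized
W_toy <- matrix(0, n, n)
for (i in 1:(n - 1)) {
  W_toy[i, i + 1] <- 1
  W_toy[i + 1, i] <- 1
}
Ws_toy <- W_toy / rowSums(W_toy)

x_toy <- rnorm(n)
z <- x_toy - mean(x_toy)

# Moran's I with a row-standardized W (S0 = n)
I_std <- as.numeric(t(z) %*% Ws_toy %*% z) / sum(z^2)

# Spatial lag and the two candidate regressions
wx <- as.numeric(Ws_toy %*% x_toy)
slope_wx_on_x <- unname(coef(lm(wx ~ x_toy))[2])  # equals Moran's I
slope_x_on_wx <- unname(coef(lm(x_toy ~ wx))[2])  # generally a different number

c(I = I_std, wx_on_x = slope_wx_on_x, x_on_wx = slope_x_on_wx)
```

The equality holds because each row of the standardized matrix sums to one, so the intercept absorbs the mean of x and the slope reduces to z'Wz / z'z.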
Moran’s I

H0 : x is spatially independent; the observed x are assigned at random
among locations (I is close to zero).
H1 : x is not spatially independent (I is not zero).
Moran’s I

We are interested in the distribution of the following statistic:

$$
T_I = \frac{I - \mathrm{E}(I)}{\sqrt{\mathrm{Var}(I)}} \tag{18}
$$

Three approaches to compute the variance of Moran’s I:

Monte Carlo
Normality of xi : It is assumed that the random variables xi are the result
of n independent drawings from a normal population.
Randomization of xi : No matter what the underlying distribution of the
population, we treat the observed values of xi as if they were repeatedly
randomly permuted across locations.
Moran’s I

Theorem (Moran’s I Under Normality)


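Assume that $\{x_i\} = \{x_1, x_2, ..., x_n\}$ are independent and distributed as
$N(\mu, \sigma^2)$, but $\mu$ and $\sigma^2$ are unknown. Then:

$$
\mathrm{E}(I) = -\frac{1}{n - 1} \tag{19}
$$

and

$$
\mathrm{E}(I^2) = \frac{n^2 S_1 - n S_2 + 3 S_0^2}{S_0^2 (n^2 - 1)} \tag{20}
$$

where $S_0 = \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij}$, $S_1 = \sum_{i=1}^{n} \sum_{j=1}^{n} (w_{ij} + w_{ji})^2 / 2$,
$S_2 = \sum_{i=1}^{n} (w_{i.} + w_{.i})^2$, with $w_{i.} = \sum_{j=1}^{n} w_{ij}$ and $w_{.i} = \sum_{j=1}^{n} w_{ji}$. Then:

$$
\mathrm{Var}(I) = \mathrm{E}(I^2) - \mathrm{E}(I)^2 \tag{21}
$$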
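The moments in Equations (19) to (21) depend only on n and on the constants S0, S1 and S2 of the weight matrix. The following base-R sketch computes them; the function name moran_moments_normal and the toy "line" contiguity matrix are hypothetical and chosen only for illustration.

```r
# Moments of Moran's I under normality, Equations (19)-(21)
moran_moments_normal <- function(W) {
  n  <- nrow(W)
  S0 <- sum(W)
  S1 <- sum((W + t(W))^2) / 2
  S2 <- sum((rowSums(W) + colSums(W))^2)
  EI  <- -1 / (n - 1)
  EI2 <- (n^2 * S1 - n * S2 + 3 * S0^2) / (S0^2 * (n^2 - 1))
  c(S0 = S0, S1 = S1, S2 = S2, EI = EI, VarI = EI2 - EI^2)
}

# Hypothetical binary contiguity: 10 regions on a line
n <- 10
W_toy <- matrix(0, n, n)
for (i in 1:(n - 1)) {
  W_toy[i, i + 1] <- 1
  W_toy[i + 1, i] <- 1
}

m_norm <- moran_moments_normal(W_toy)
m_norm
```

Given an observed value of I, the standardized statistic of Equation (18) would then be (I - m_norm["EI"]) / sqrt(m_norm["VarI"]).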
Moran’s I
Theorem 17 gives the moments of Moran’s I under randomization.
Theorem (Moran’s I Under Randomization)
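Under permutation, we have:

$$
\mathrm{E}(I) = -\frac{1}{n - 1} \tag{22}
$$

and

$$
\mathrm{E}(I^2) = \frac{n \left[ (n^2 - 3n + 3) S_1 - n S_2 + 3 S_0^2 \right] - b_2 \left[ (n^2 - n) S_1 - 2 n S_2 + 6 S_0^2 \right]}{(n - 1)(n - 2)(n - 3) S_0^2} \tag{23}
$$

where $S_0 = \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij}$, $S_1 = \sum_{i=1}^{n} \sum_{j=1}^{n} (w_{ij} + w_{ji})^2 / 2$,
$S_2 = \sum_{i=1}^{n} (w_{i.} + w_{.i})^2$, with $w_{i.} = \sum_{j=1}^{n} w_{ij}$ and $w_{.i} = \sum_{j=1}^{n} w_{ji}$, and
$b_2 = n \sum_{i=1}^{n} (x_i - \bar{x})^4 / \left( \sum_{i=1}^{n} (x_i - \bar{x})^2 \right)^2$ is the sample kurtosis of x. Then:

$$
\mathrm{Var}(I) = \mathrm{E}(I^2) - \mathrm{E}(I)^2 \tag{24}
$$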

It is important to note that the expected value of Moran’s I under normality
and randomization is the same.
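Unlike the normality case, the variance under randomization also depends on the data through the sample kurtosis b2. A minimal base-R sketch follows; the function name moran_moments_random, the toy weight matrix and the simulated data are hypothetical.

```r
# Moments of Moran's I under randomization (needs n >= 4)
moran_moments_random <- function(x, W) {
  n  <- length(x)
  z  <- x - mean(x)
  S0 <- sum(W)
  S1 <- sum((W + t(W))^2) / 2
  S2 <- sum((rowSums(W) + colSums(W))^2)
  b2 <- n * sum(z^4) / sum(z^2)^2          # sample kurtosis
  EI <- -1 / (n - 1)
  EI2 <- (n * ((n^2 - 3 * n + 3) * S1 - n * S2 + 3 * S0^2) -
            b2 * ((n^2 - n) * S1 - 2 * n * S2 + 6 * S0^2)) /
    ((n - 1) * (n - 2) * (n - 3) * S0^2)
  c(EI = EI, VarI = EI2 - EI^2)
}

# Hypothetical example: 5 regions on a line with simulated data
set.seed(7)
n <- 5
W_toy <- matrix(0, n, n)
for (i in 1:(n - 1)) {
  W_toy[i, i + 1] <- 1
  W_toy[i + 1, i] <- 1
}
x_toy <- rnorm(n)

m_rand <- moran_moments_random(x_toy, W_toy)
m_rand
```

With n this small, the result can be compared against the mean and variance of I over all n! permutations of x_toy, which is a useful sanity check on the formula.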
Monte Carlo

Normality and randomization? We can use a Monte Carlo simulation


To test a null hypothesis H0 we specify a test statistic T such that large
values of T are evidence against H0 .
H0 : no spatial autocorrelation.
Let T have observed value $t_{obs}$. We generally want to calculate:

$$
p = \Pr(T \geq t_{obs} \mid H_0) \tag{25}
$$

We need the distribution of T when H0 is true to evaluate this probability.


Monte Carlo

Theorem (Moran’s I Monte Carlo Test)

The procedure is the following:
1 Rearrange the spatial data by shuffling their locations and compute
Moran’s I. Repeat this S times. This creates the distribution of I under H0.
2 Let $I_1^*, I_2^*, ..., I_S^*$ be the Moran’s I from each replication. A consistent Monte Carlo
p-value is then:

$$
\hat{p} = \frac{1 + \sum_{s=1}^{S} 1(I_s^* \geq I_{obs})}{S + 1} \tag{26}
$$

3 For tests at the α level or for 100(1 − α)% confidence intervals, there are
reasons for choosing S so that α(S + 1) is an integer. For example, use
S = 999 for confidence intervals and hypothesis tests when α = 0.05.
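The procedure above can be sketched in base R as follows. The helper moran_I, the "line" contiguity matrix and the random-walk data are hypothetical and serve only to illustrate the permutation logic; they are not the lecture's lab code.

```r
set.seed(2020)

# Moran's I in matrix form: (n / S0) * z'Wz / z'z
moran_I <- function(x, W) {
  z <- x - mean(x)
  (length(x) / sum(W)) * sum(z * as.numeric(W %*% z)) / sum(z^2)
}

# Hypothetical setting: 20 regions on a line
n <- 20
W_toy <- matrix(0, n, n)
for (i in 1:(n - 1)) {
  W_toy[i, i + 1] <- 1
  W_toy[i + 1, i] <- 1
}

# Simulated data: a random walk along the line, so neighbors tend to be similar
x_toy <- cumsum(rnorm(n))

S <- 999                      # alpha * (S + 1) is an integer for alpha = 0.05
I_obs <- moran_I(x_toy, W_toy)

# Step 1: shuffle the values across locations S times and recompute I
I_star <- replicate(S, moran_I(sample(x_toy), W_toy))

# Step 2: Monte Carlo p-value, Equation (26)
p_hat <- (1 + sum(I_star >= I_obs)) / (S + 1)

c(I_obs = I_obs, p_hat = p_hat)
```

This is a one-sided test against positive spatial autocorrelation, since only permuted values at least as large as the observed I count as evidence against H0.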
Inference

Inference:
If I > E(I), then a spatial unit tends to be surrounded by locations with
similar attributes: Spatial clustering (low/low or high/high). The
strength of positive spatial autocorrelation tends to increase with
I − E(I).
If I < E(I), observations tend to have values dissimilar from those of their
neighbors: Negative spatial autocorrelation (low/high or high/low).
Application

Lab1A.R
Lab1B.R
