See discussions, stats, and author profiles for this publication at:
https://www.researchgate.net/publication/352648085

Exploratory Spatial Data Analysis (ESDA)

Method · May 2021


DOI: 10.13140/RG.2.2.14436.50562


1 author:

Scott Bell
University of Saskatchewan

All content following this page was uploaded by Scott Bell on 22 June 2021.



6/22/2021 Exploratory Spatial Data Analysis (ESDA)

Exploratory Spatial Data Analysis (ESDA)
An overview of exploratory tools in Geography and GeoDa

February 12, 2021

A. Background
From geodacenter:

“GeoDa is a free software program that serves as an introduction to
spatial data analysis. GeoDa is the cross-platform, open source
version of Legacy GeoDa. GeoDa runs on different versions of
Windows (including XP, Vista and 7), Mac OS, and Linux. It is
written in C++ and no longer relies on ESRI's MapObjects library
(it uses wxwidgets instead).

GeoDa is the flagship program of the GeoDa Center, following a
long line of software tools developed by Dr. Luc Anselin. It is
designed to implement techniques for exploratory spatial data
analysis (ESDA) on points and polygons. The free program provides
a user friendly and graphical interface to methods of descriptive
spatial data analysis, such as spatial autocorrelation statistics, as well
as basic spatial regression functionality.

The development of GeoDa and related materials has been
primarily supported by the U.S. National Science Foundation/ the
Center for Spatially Integrated Social Science (CSISS)
(Grant BCS-9978058).

Reference: Anselin, L., I. Syabri and Y. Kho. (2006). GeoDa: An
Introduction to Spatial Data Analysis. Geographical Analysis 38(1),
5-22.”

GeoDa can be downloaded at:
https://geodacenter.github.io/download.html

GeoDa does run on Mac OS; Scott runs it on a Mac. It is stable, but
some quirks can make it look unstable. These instructions are
written for Windows; the functionality (though maybe not the
interface) is the same on Mac. Since we are all working on our own
computers, this has the potential to affect your workflow. Before
you download and start running GeoDa on Mac OS, consider the other
places you might need the data or output (in a format that can be
used in ArcGIS).

B. Data set
In these exercises we will use American Community Survey (ACS)
3-year data, ending in 2016 (NYC_Data). There is also a dataset for
the same geography (New York City) from the 2000 decennial
census, if you are interested. (Note: GeoDa comes with several
sample data files, e.g., Crime in Columbus [tracts] and SIDS in
North Carolina [counties].)

The shapefile NYC_Data.shp is the map of New York City with
ACS 2016 data. It holds socioeconomic attributes for census tracts
in the five boroughs and includes the following variables:

NYC_Data.shp

 1. PCT_WHT       Percent of population that is White
 2. STATEFP       State FIPS* code
 3. COUNTYFP      County FIPS* code
 4. TRACTCE       Extended tract id
 5. AFFGEOID      GeoID for joining
 6. GEOID
 7. ALAND         Area of tract
 8. GEOID1        Copy of GeoID
 9. GEOID2        Copy of GeoID (don't ask ;))
10. CT            Census Tract #
11. TOT-POP       Estimate of total population
12. TOT_WHT_POP   Estimate of total White population
13. TOT_BLK_POP   Estimate of total Black population
14. TOT_ASN_POP   Estimate of total Asian population
15. HSHLDS_TOT    Estimate of total households
16. FAM-TOT       Estimate of total families
17. Highschool    Estimated count of population with a high school diploma
18. Postsec       Estimated count of population with some post-secondary education
19. Bachelo_o     Estimated count of population with a Bachelor’s degree or higher
20. MED_House     Median house value
21. MED_income    Median household income (derived by compiling three other median income variables)

*Federal Information Processing Standard

C. GeoDa introduction
GeoDa employs ESRI shapefiles as its primary data format, making it
a convenient program to use in conjunction with ArcGIS Pro.

Start GeoDa by clicking Start > All Programs > GeoDa Software >
GeoDa.

You should see something like this:

This is the main GeoDa Toolbar

1) Go to File > New. Under File, click the folder icon, choose ESRI
Shapefile, and navigate to our data location.

2) Browse to NYC_Data.shp and click Connect.

3) Click OK. Your screen should now look something like this. (If
you get an error, don’t close the program; choose “ignore”. You might
have to adjust the windows a little bit. I’m testing and writing
while running GeoDa on a Mac, and there are some differences.)

The shapefile after being opened

The holes in the spatial data are places (census tracts) with no data
for several of the variables we are using. While the entire country is
“tracted” (covered by census tracts), not all tracts have
consistent data. The removed tracts include parks, open space, and
industrial areas, or are otherwise differently or less populated than
other places.

4) The GeoDa menu bar contains eleven menu items:

1. File (Project Toolbar)
2. Edit
3. Tools (Weights Toolbar)
4. Table
5. Map
6. Explore
7. Clusters
8. Space
9. Time
10. Regression
11. Options

Now let’s make some adjustments to the view.

We will change the background color for more clarity. Right-click on
the background, choose Color, then Background Color, and select a
light gray.

Then change the map color to a different color (right-click on the
rectangle in the table of contents and choose Fill Color for
Category).

Zoom in and out using the tools at the top of the map window.

Zooming in and out is a little crude in GeoDa. The + option lets you
draw a box and zoom to that box; the – (minus) tool is tricky. You
can zoom to the map’s full extent by clicking on the four arrows.

D. Data exploration
(Note: Most of the GeoDa exploration functions can be applied to
either polygon or point shapefiles.)

1) Univariate exploration with the variable of housing value
(MED_House)

I) Quantile distribution

The Quantile map function in GeoDa allows you to specify up to 9
categories; the default is 4. Let’s choose 5 categories.

Go to Map > Quantile.

(You can also access and change map types by right-clicking and
rolling over the Change Current Map Type option.)

In the variable selection dialog box, select a variable of interest
(MED_House) by scrolling down the variable list. Click OK. You
will now be asked how many classes; type 5.

A quantile map of House Value (notice that the pane on the left shows the range of $ values for each
quantile category AND the number of tracts in each).

Drag the vertical bar so that the legend shows properly. The
version you are using will show values without exponents (which is a
relief).
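
For intuition about what the quantile map is doing, here is a minimal Python sketch (not GeoDa's own code) that splits values into five equal-count classes and counts the tracts in each, much as the legend pane does. The house values are made up for illustration, and GeoDa's exact rule for placing break points may differ from the simple one assumed here.

```python
def quantile_breaks(values, k):
    """Return k-1 cut points that split the values into k equal-count classes."""
    ordered = sorted(values)
    n = len(ordered)
    if n < k:
        raise ValueError("need at least as many observations as classes")
    # Cut point i is the largest value in the i-th equal-count group.
    return [ordered[(i * n) // k - 1] for i in range(1, k)]


def classify(value, breaks):
    """0-based class index: the number of cut points strictly below the value."""
    return sum(value > b for b in breaks)


# Hypothetical median house values (in $1000s) for ten tracts.
house_values = [310, 450, 520, 275, 610, 980, 390, 720, 830, 560]

breaks = quantile_breaks(house_values, 5)
counts = [0] * 5
for v in house_values:
    counts[classify(v, breaks)] += 1

print("cut points:", breaks)
print("tracts per class:", counts)
```

With ten tracts and five classes, each class ends up with two tracts, which is the defining property of a quantile map: equal counts, unequal value ranges.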

II) Percentile distribution

GeoDa has six preset categories (classes) for percentile maps.
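
The six classes can be sketched in Python as follows. The cut points used here (1, 10, 50, 90, and 99 percent) are my understanding of the usual GeoDa percentile legend; treat them as an assumption and check them against the legend in your own map window. The data are simply the numbers 1 to 200, not tract values.

```python
from bisect import bisect_left, bisect_right
from collections import Counter

# Assumed cut points and legend labels for the six preset classes.
PERCENTILE_CUTS = [1, 10, 50, 90, 99]
LABELS = ["<1%", "1-10%", "10-50%", "50-90%", "90-99%", ">99%"]


def percentile_classes(values):
    """Label each value with the percentile class it falls into."""
    ordered = sorted(values)
    n = len(ordered)
    labels = []
    for v in values:
        # Percentile rank: share of observations less than or equal to v.
        rank = 100.0 * bisect_right(ordered, v) / n
        # A rank equal to a cut point stays in the lower class.
        labels.append(LABELS[bisect_left(PERCENTILE_CUTS, rank)])
    return labels


values = list(range(1, 201))  # illustrative data only
class_counts = Counter(percentile_classes(values))
for label in LABELS:
    print(label, class_counts[label])
```

Unlike the quantile map, the classes are deliberately unequal in size: the two extreme classes hold only the lowest and highest one percent of observations, which makes this map useful for spotting extreme values.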

As we make additional maps through the main menu bar, a new
map window will be created for each one, and each time we need to
specify a variable for the map.

III) Box Map

A Box Map is designed to show quartile distributions with outliers
defined by upper and lower hinges. The “hinge” value lets us
identify outliers based on the interquartile range (IQR). A hinge
value of 1.5 identifies as high outliers those observations that lie
above the 75th percentile by more than 1.5 times the IQR, and as low
outliers those that lie below the 25th percentile by more than 1.5
times the IQR.

Let’s create an additional window:

Go to Map > Box Map > Select Hinge = 1.5
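
The hinge rule can be worked through in a few lines of Python. This is a sketch under assumptions, not GeoDa's implementation: the quartiles come from Python's statistics.quantiles with its default method, which may give slightly different quartiles than GeoDa, and the values are made up.

```python
import statistics


def box_map_classes(values, hinge=1.5):
    """Return the outlier fences and a box-map style label for each value."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower_fence = q1 - hinge * iqr
    upper_fence = q3 + hinge * iqr

    def label(v):
        if v < lower_fence:
            return "lower outlier"
        if v < q1:
            return "< 25%"
        if v < median:
            return "25% - 50%"
        if v < q3:
            return "50% - 75%"
        if v <= upper_fence:
            return "> 75%"
        return "upper outlier"

    return (lower_fence, upper_fence), [label(v) for v in values]


# Hypothetical values; the last one is far above the rest.
values = [10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 100]
fences, labels = box_map_classes(values, hinge=1.5)
print("fences:", fences)
print("upper outliers:", [v for v, l in zip(values, labels) if l == "upper outlier"])
```

Raising the hinge to 3.0 widens the fences, so only the most extreme observations are still flagged as outliers.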

IV) Standard Deviation Map

GeoDa's Standard Deviation Map classifies observations by how far
they fall from the mean: within one standard deviation, within two
standard deviations, and beyond.

Let’s create an additional window:

Go to Map > Standard Deviation Map
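
A rough Python sketch of the classification behind a standard deviation map is shown below. It assumes six bands (one band per standard deviation on each side of the mean, plus everything beyond two) and uses the population standard deviation; both are assumptions for illustration and may not match GeoDa exactly. The values are made up.

```python
import math
import statistics

# Band index -> legend text (hypothetical wording).
BAND_LABELS = {
    -3: "more than 2 SD below the mean",
    -2: "1 to 2 SD below the mean",
    -1: "0 to 1 SD below the mean",
    0: "0 to 1 SD above the mean",
    1: "1 to 2 SD above the mean",
    2: "2 SD or more above the mean",
}


def std_dev_bands(values):
    """Assign each value to a band based on its z-score."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population SD; an assumption
    bands = []
    for v in values:
        z = (v - mean) / sd
        # floor(z) gives the band; clamp so the outer bands are open-ended.
        bands.append(max(-3, min(2, math.floor(z))))
    return bands


values = [2, 4, 4, 4, 5, 5, 7, 9]  # mean 5, population SD 2
bands = std_dev_bands(values)
for v, b in zip(values, bands):
    print(v, BAND_LABELS[b])
```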

V) Cartogram

A cartogram is another method to examine variable distribution. It is
a technique that maps locations according to the values of a
selected variable, using symbol size as the visual variable that
changes with the corresponding non-spatial variable. (Note: a
cartogram can only be made from polygons.)

Go to Map > Cartogram

The cartogram has been buggy this week in the lab; if it freezes your
computer, restart GeoDa and skip this step. When making a cartogram
we can symbolize two variables, one with symbol size (MED_House)
and one with color (MED_income).

VI) Show map movie

Map movie is a tool that lets you see the variable distribution
in an animated fashion. Specifically, it highlights locations in
ascending order of the selected variable's value.

Switch back to GeoDa and, from the main icons menu, select ►. Then
set the Speed Control selector towards the right end and click Play.
The animation displays median housing values from lowest to highest
across the mapped area. One useful aspect of GeoDa is the linking
across all map windows. In the case of map movie, we can see the
movie play out in each window, according to the variable we selected
for the movie.

VII) Histogram

A histogram is a bar graph that shows frequency data. The
horizontal axis shows the values of the variable, grouped into
intervals (bins), and the vertical axis shows the number of
observations in each interval.

Click Explore > Histogram. Select MED_House (or
another variable) and click OK.

Increase the Histogram window size. You can change the number
of intervals: right-click on the histogram, select Interval, type
“50” in the interval dialogue box, then click OK. You can also
uncheck Display Statistics to clean up the window.
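
To see what changing the number of intervals does, here is a small Python sketch of equal-width binning, the standard way a histogram is built. It is an illustration with made-up values, not GeoDa's code.

```python
def histogram_counts(values, intervals):
    """Count observations in equal-width intervals between the min and max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; no intervals to build")
    width = (hi - lo) / intervals
    counts = [0] * intervals
    for v in values:
        # The maximum value would land one past the end, so clamp it
        # into the last interval.
        idx = min(int((v - lo) / width), intervals - 1)
        counts[idx] += 1
    return lo, width, counts


values = [0, 1, 2, 3, 4, 5, 6, 7, 8, 10]  # illustrative data only
lo, width, counts = histogram_counts(values, 5)
for i, c in enumerate(counts):
    print(f"{lo + i * width:5.1f} to {lo + (i + 1) * width:5.1f}: {'#' * c}")
```

More intervals give narrower bars and more detail, at the cost of a noisier shape. With real tract data, 50 intervals usually reveals the long right tail of house values better than the default.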

VIII) Box Plot

A Box Plot is designed to show several critical distributional
measures in a single graph: the median, the upper and lower
quartiles, and outliers defined by upper and lower hinges.

Go to Explore > Box Plot.

2) Multivariate exploration

Choose Scatter plot

A scatter plot explores a bivariate relationship. Let’s first select
one more variable.

Go to Explore > Scatter Plot Matrix and select MED_House as the
X axis and MED_income as the Y axis.
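
A scatter plot is usually read together with a fitted straight line and a correlation coefficient. The Python sketch below computes both by ordinary least squares for MED_House (X) and MED_income (Y). The four pairs of values are hypothetical, in $1000s, and are not taken from NYC_Data.

```python
import math


def ols_line(x, y):
    """Least-squares slope and intercept of y on x, plus Pearson's r."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


med_house = [300, 450, 600, 750]   # hypothetical X values
med_income = [40, 55, 60, 85]      # hypothetical Y values

slope, intercept, r = ols_line(med_house, med_income)
print(f"income = {intercept:.2f} + {slope:.4f} * house value, r = {r:.3f}")
```

A positive slope with r close to 1 means tracts with higher house values also tend to have higher incomes; points far from the line are the tracts worth selecting and inspecting on the map.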

Parallel Coordinate Plot (PCP) and brushing

The Parallel Coordinate Plot (PCP) allows you to observe the
relationship between multiple variables.

I) Explore > Parallel Coordinate Plot

Select three variables: PCT_WHT, BACHELOR_o, and
MED_House, and click OK.

Variable Selection for Parallel Coordinate Plot

Make the window it produces as big as your screen allows, so you
can see with better resolution.

II) Brushing: You can move the mouse around, click on the PCP,
and highlight portions of the plot (“brushing”) to observe the
multivariate relationship in a dynamic fashion.

Then, with the Ctrl key held down, click, drag, and release to
create a small box in the scatter plot window. It will flash for a
couple of seconds, and then become continuously active. You can
move it around the PCP and dynamically highlight portions of the
plot (“brushing”), all the while viewing the active selections in the
map (“linking”). Brushing makes most sense when you move the
selection “box” along one of the three parallel variable axes (top,
middle, or bottom). Doing this lets you visualize a set of
observations (in this case each observation is a census tract) with
similar values for one of the three variables, and see how they vary
across the range of the remaining two variables. Simply click the
mouse to end the brushing.
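
The idea behind brushing and linking can be sketched in Python: pick the tracts whose value on one axis falls inside the brush, then look at how those same tracts spread on another axis. The five tracts, their ids, and their values are hypothetical, and the field names simply follow the variable list given earlier.

```python
# Hypothetical tract records; not taken from NYC_Data.
tracts = [
    {"id": "T1", "PCT_WHT": 12.0, "Bachelo_o": 310, "MED_House": 350000},
    {"id": "T2", "PCT_WHT": 55.0, "Bachelo_o": 1200, "MED_House": 720000},
    {"id": "T3", "PCT_WHT": 60.0, "Bachelo_o": 450, "MED_House": 410000},
    {"id": "T4", "PCT_WHT": 85.0, "Bachelo_o": 1500, "MED_House": 980000},
    {"id": "T5", "PCT_WHT": 58.0, "Bachelo_o": 900, "MED_House": 530000},
]


def brush(records, variable, low, high):
    """Select records whose value on one axis lies inside the brush box."""
    return [r for r in records if low <= r[variable] <= high]


def spread(records, variable):
    """Range of another variable among the selected records."""
    vals = [r[variable] for r in records]
    return min(vals), max(vals)


# Brush along the PCT_WHT axis, then inspect MED_House for the same tracts.
selected = brush(tracts, "PCT_WHT", 50.0, 65.0)
selected_ids = [r["id"] for r in selected]
house_range = spread(selected, "MED_House")

print("highlighted tracts:", selected_ids)
print("their house values range from", house_range[0], "to", house_range[1])
```

The list of selected ids is what "linking" passes to every other open window, so the same tracts light up in the map, the histogram, and the scatter plot at once.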


We are all done with ESDA. In the next two GeoDa exercises we will
be making edits to a shapefile before calculating spatial
autocorrelation; this means that we have to be careful with
variable names, file names, and locations. A notebook will be useful.

Credits and Land Acknowledgement


I acknowledge that I live and work on Treaty 6 Territory and the Homeland of
the Métis. We pay our respect to the First Nations and Métis ancestors of this
place and reaffirm our relationship with one another.

Harvard University is situated on the traditional and ancestral homelands of the
Massachusett people. Our university honors the historic Harvard Charter of 1650,
which committed our institution to “the education of English and Indian youth of
this country.”

Scott Bell, Boabang Owusu University of Saskatchewan

Scott Bell, Jill Kelly Harvard University

https://storymaps.arcgis.com/stories/cc015a906add4307abf62cfb9dd6429a/print