Clustering of Census Recorded Ethnic Background

This document discusses clustering ethnic groups using UK census data at the output area level. It introduces the topic, loads relevant packages in R, obtains ethnicity and population count data from a UK government database for the local authority of Slough by census output area. It then prepares to map the distribution of a specific ethnic minority group (Bangladeshi, Pakistani and Indian) by output area and look for evidence of clustering within or across local authorities. The goal is to help local public health teams better understand the resident populations to design culturally appropriate health services and reduce inequalities.


23/10/2020 Clustering of Census Recorded Ethnic Background

Simon Hailstone
December 2017
Source: https://rstudio-pubs-static.s3.amazonaws.com/346625_9b7a90358ca44d0b89db512afedc63b2.html

Contents
- Introduction
- Load Packages
- Load Ethnicity Data
- Load Resident Population Numbers
- Download and Unzip the Output Area Lookup Data
- Obtain Data From a PostGIS PostgreSQL System
- Map of Ethnic Group by Output Area
- Dots in Polygon Mapping
- Clustering of Ethnic Group
- GP Registration Numbers

Introduction

This project looks at the following:
- How are specific ethnic groups distributed within a local authority?
- Is there evidence of clustering of ethnic groups?
- How does GP practice coverage relate to these clusters?

What is the reason for looking at this information? Health inequalities are of significant importance in the field of public health. Those inequalities are driven by a wide variety of factors. One of those factors can be cultural attitudes towards health and wellbeing services, leading to poorer uptake or lower levels of engagement with programmes.

This work will hopefully form a foundation upon which a more fully-featured tool might be developed, allowing local public health teams to better understand their resident populations and design services which are both acceptable to and appropriate for those groups.

A more detailed discussion of the issue of health and ethnicity in the UK can be read here: http://www.parliament.uk/documents/post/postpn276.pdf

Load Packages

library("tidyverse")
library("dplyr")
library("rgdal")
library("extrafont")
library("sp")
library("maptools")
library("rgeos")
library("MASS")
library("raster")
library("broom") # contains the tidy function which now replaces the fortify function for ggplot
library("viridis") # For nicer ggplot colours
library("spdep")
library("gridExtra")
library("Cairo")
library("RSQLite")

# To connect to a postgis database
library("RPostgreSQL")
library("postGIStools")

Load Ethnicity Data


Census data can be downloaded from the NOMIS website: https://www.nomisweb.co.uk/

In this instance I use an API to pull the data I need directly at Output Area level. I’m currently only downloading data for Slough (ONS code E06000039). I would eventually like to download data for the entire country at this level of detail, as this would allow identification of clusters which span authority boundaries (i.e. it would address edge effects).

For now we will look at a single BME ethnic group. The NOMIS census code for this group is 132 (the detailed NOMIS ethnicity codelist is available from https://www.nomisweb.co.uk/api/v01/dataset/NM_575_1/cell.def.htm).

api_string <- "http://www.nomisweb.co.uk/api/v01/dataset/NM_575_1.data.csv?date=latest&geography=E06000039TYPE299&rural_urban=0&cell=132&measures=20100&select=date_name,geography_name,geography_code,rural_urban_name,cell_name,measures_name,obs_value,obs_status_name"

ethnicity_detailed <- read.csv(api_string, stringsAsFactors=F)

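
The query string used above packs several parameters into one long URL, which makes it easy to mistype when switching to another dataset, area or cell code. As a minimal sketch, the URL could be assembled from named parts. The helper name `build_nomis_url` and its arguments are hypothetical (they are not part of the original post or of any NOMIS client package); only the parameter names and values are taken from the URL above.

```r
# Hypothetical helper (an illustration, not part of the original analysis):
# assemble a NOMIS CSV query URL from named parts using base R only.
build_nomis_url <- function(dataset, geography, cell, measures = 20100,
                            columns = c("date_name", "geography_name",
                                        "geography_code", "rural_urban_name",
                                        "cell_name", "measures_name",
                                        "obs_value", "obs_status_name")) {
  base <- paste0("http://www.nomisweb.co.uk/api/v01/dataset/", dataset, ".data.csv")
  query <- list(date = "latest",
                geography = geography,
                rural_urban = 0,
                cell = cell,
                measures = measures,
                select = paste(columns, collapse = ","))
  # Convert every value to text without scientific notation
  values <- vapply(query, function(v) format(v, scientific = FALSE, trim = TRUE),
                   character(1))
  paste0(base, "?", paste(names(query), values, sep = "=", collapse = "&"))
}

# Rebuild the ethnicity query for Slough output areas (cell 132)
api_string <- build_nomis_url(dataset = "NM_575_1",
                              geography = "E06000039TYPE299",
                              cell = 132)
print(api_string)
```

The same helper would produce the population query used in the next section by passing `dataset = "NM_144_1"` and `cell = 0`.
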
Load Resident Population Numbers
We will also pull population information from NOMIS to use as a denominator when calculating a proportion for each output area.

api_string <- "http://www.nomisweb.co.uk/api/v01/dataset/NM_144_1.data.csv?date=latest&geography=E06000039TYPE299&rural_urban=0&cell=0&measures=20100&select=date_name,geography_name,geography_code,rural_urban_name,cell_name,measures_name,obs_value,obs_status_name"

pop_detailed <- read.csv(api_string, stringsAsFactors=F)

pop_detailed <- pop_detailed %>%
  dplyr::select(GEOGRAPHY_CODE, "POPULATION"=OBS_VALUE)

Download and Unzip the Output Area Lookup Data

I wanted to have R download the file directly, but this seems to be more complex than just pointing at a URL, so I downloaded it manually and use R only to unzip it and read in the csv file. This file allows us to assign Output Areas to the correct Local Authority.

# Unzip
unzip(zipfile="data\\OA11_WD11_LAD11_EW_LU.zip", exdir="data")

# Import
oa_lad_lkp <- read.csv("data\\OA11_WD11_LAD11_EW_LU.csv", stringsAsFactors=F)

# Trim out the fields we don't need in the LAD lookup
oa_lad_lkp <- oa_lad_lkp %>% dplyr::select(OA11CD,LAD11CD,LAD11NM)

Now the Local Authority Code can be joined to the census ethnicity table.

ethnicity_detailed <- ethnicity_detailed %>%
  left_join(pop_detailed, by="GEOGRAPHY_CODE") %>%
  left_join(oa_lad_lkp, by=c("GEOGRAPHY_CODE"="OA11CD")) %>%
  filter(!is.na(LAD11CD)) %>%
  mutate(
    "CELL_NAME"=gsub("[.]"," ",CELL_NAME),
    "POP_PROPORTION"=OBS_VALUE/POPULATION)
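
To make the join and the proportion calculation concrete, here is a small sketch on made-up data. It uses base R `merge` rather than dplyr so that it runs without extra packages, and the table names `ethnicity_toy` and `pop_toy` are hypothetical. The guard against a zero denominator is an added assumption for illustration; the original code divides directly.

```r
# Toy stand-ins for the two NOMIS tables (all values are made up)
ethnicity_toy <- data.frame(GEOGRAPHY_CODE = c("OA1", "OA2", "OA3"),
                            OBS_VALUE = c(30, 0, 12),
                            stringsAsFactors = FALSE)
pop_toy <- data.frame(GEOGRAPHY_CODE = c("OA1", "OA2", "OA3"),
                      POPULATION = c(300, 250, 0),
                      stringsAsFactors = FALSE)

# Left join on the output area code, keeping every ethnicity row
joined <- merge(ethnicity_toy, pop_toy, by = "GEOGRAPHY_CODE", all.x = TRUE)

# Proportion of residents in the group; NA where the denominator is zero
joined$POP_PROPORTION <- ifelse(joined$POPULATION > 0,
                                joined$OBS_VALUE / joined$POPULATION,
                                NA_real_)
print(joined)
```

Output areas with a missing proportion would need to be handled before the spatial statistics later on, since the Moran's I functions expect a complete numeric vector unless told otherwise.
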

Obtain Data From a PostGIS PostgreSQL System

In another post I discuss setting up a spatial database using PostgreSQL with PostGIS. This system can now be connected to in order to download the output areas for Slough.

# Should be able to pull data from my postgres database using the rgdal package but this doesn't want to work for some reason so using postGIStools package instead
# dsn <- "PG:dbname=spatial_data_store host=localhost port=5432 user=postgres password=postgres"
# ogrListLayers(dsn)

# Create connection to the postgis database where all my shapefiles are stored
con <- dbConnect(PostgreSQL(), dbname = "spatial_data_store", user = "postgres",
                 host = "localhost",
                 password = "postgres")

# Pull all the output areas from the postgis database for a specific local authority
oa_shp <- get_postgis_query(con,
                            "SELECT *
                             FROM output_area_december_2011_generalised_clipped_boundaries_in_eng
                             WHERE lad11cd='E06000039'",
                            geom_name = "geom")

Map of Ethnic Group by Output Area


Now we can see how the population of this particular ethnic group is distributed across Slough,
both as a rate per output area and also as actual numbers per output area.

oa_shp_tidy <- tidy(oa_shp,region="oa11cd")

# Merge population figures to shapefile
oa_shp_tidy <- oa_shp_tidy %>% left_join(ethnicity_detailed,by=c("id"="GEOGRAPHY_CODE"))

plot_proportion <- ggplot(oa_shp_tidy, aes(long, lat, fill=POP_PROPORTION, group=id)) +
  geom_polygon(col="grey") +
  scale_fill_gradient2(low="white",high="red", labels=scales::percent) +
  labs(fill="Proportion") +
  coord_fixed() +
  theme_void() +
  theme(legend.position="bottom")

plot_numbers <- ggplot(oa_shp_tidy, aes(long, lat, fill=OBS_VALUE, group=id)) +
  geom_polygon(col="grey") +
  scale_fill_gradient2(low="white",high="blue") +
  labs(fill="Count") +
  coord_fixed() +
  theme_void() +
  theme(legend.position="bottom")

grid.arrange(plot_proportion, plot_numbers, ncol=2, nrow=1)

Dots in Polygon Mapping


Purely to make an interesting presentation of this data, I use the dotsInPolys function from the
maptools package. This creates dot density data which we can use to create a 2d kernel density
estimate later on.

Generate the Dots

# merge the population data into the output area shapefile
oa_shp_data <- merge(oa_shp, ethnicity_detailed, by.x="oa11cd", by.y="GEOGRAPHY_CODE")
oa_shp_data <- oa_shp_data[!is.na(oa_shp_data$OBS_VALUE),]
oa_shp_data <- oa_shp_data[oa_shp_data$OBS_VALUE>0,]

# create the dot density data
dot_density <- dotsInPolys(oa_shp_data, x=as.integer(oa_shp_data$OBS_VALUE), f="random")

{
  par(mar=c(0,0,0,0))
  plot(dot_density, pch=16, cex=0.4, col="#00000044")
  plot(oa_shp_data, add=TRUE, border="black")
}

Create a 2d Kernel Density Raster

# Calculate a 500 metre margin around the shapefile to ensure the density hotspots aren't cut off on the map plot.
map_padding <- 500
map_padding <- c(-map_padding, map_padding,
                 -map_padding, map_padding)

shp_limits <- extent(oa_shp)[1:4]

kernal_density_estimate = kde2d(dot_density$x,
                                dot_density$y,
                                h=400, # bandwidth
                                n=500, # gridpoints in each direction
                                lims=shp_limits+map_padding)

raster_kde2d = raster(kernal_density_estimate)
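
As a rough illustration of what the `kde2d` call above produces, the sketch below runs it on synthetic points instead of the Slough dot density data. The coordinates are made up; only `h = 400` and the 500 metre padding mirror the original. To my understanding `MASS::kde2d` divides `h` by 4 internally, so `h = 400` would correspond to a Gaussian kernel with a standard deviation of roughly 100 metres, but treat that as an assumption to check against the MASS documentation.

```r
library(MASS)  # provides kde2d

set.seed(42)
# Two synthetic clusters of "dots" in metre-like coordinates (made-up data):
# a larger cluster near (0, 0) and a smaller one near (1000, 1000)
x <- c(rnorm(200, mean = 0, sd = 50), rnorm(100, mean = 1000, sd = 50))
y <- c(rnorm(200, mean = 0, sd = 50), rnorm(100, mean = 1000, sd = 50))

# Pad the evaluation window so the density is not cut off at the edges
padding <- 500
lims <- c(range(x) + c(-padding, padding),
          range(y) + c(-padding, padding))

kde <- kde2d(x, y, h = 400, n = 100, lims = lims)

# kde$z is an n-by-n matrix: rows follow kde$x, columns follow kde$y
cell_area <- diff(kde$x[1:2]) * diff(kde$y[1:2])
total_mass <- sum(kde$z) * cell_area   # should be close to 1 for a density

# Locate the grid cell with the highest estimated density
peak <- which(kde$z == max(kde$z), arr.ind = TRUE)[1, ]
peak_xy <- c(kde$x[peak["row"]], kde$y[peak["col"]])

print(dim(kde$z))
print(total_mass)
print(peak_xy)
```

The padding matters because the estimate is only evaluated inside `lims`: without it, the part of a hotspot that spills past the boundary would be lost and the surface would no longer sum to approximately one.
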

Plot the 2d Kernel Density Estimate Raster


Now a basic plot can be produced before we do the full version using ggplot.

{
plot(raster_kde2d, col=magma(7))
plot(oa_shp_data, add=TRUE, border="#555555")
}


Some Additional Details


Now I pull some extra contextual information from my PostgreSQL spatial database. This provides the overall outline of Slough, building outlines and roads.

# Pull the outline of Slough
slough_shp <- get_postgis_query(con,
                                "SELECT *
                                 FROM local_authority_districts_december_2016_generalised_clipped_bou
                                 WHERE lad16cd = 'E06000039'",
                                geom_name = "geom")

# Pull Slough buildings
bld_slough_shp <- get_postgis_query(con,
                                    "SELECT *
                                     FROM bld_clip_slough",
                                    geom_name = "geom")

# Pull Slough roads
rd_slough_shp <- get_postgis_query(con,
                                   "SELECT *
                                    FROM rd_clip_slough",
                                   geom_name = "geom")

# Prepare the new spatial data for plotting in ggplot
slough_shp_tidy <- tidy(slough_shp,region="lad16cd")
bld_slough_shp_tidy <- tidy(bld_slough_shp,region="id")
rd_slough_shp_tidy <- tidy(rd_slough_shp,region="id")

Plot Kernel Density Map in ggplot

# set background and foreground colours
bckgrnd_col <- viridis(2, alpha = 1, begin = 0.1, end = 0.9, option="magma")[1]
frgrnd_col <- viridis(2, alpha = 1, begin = 0.1, end = 0.9, option="magma")[2]

windowsFonts(arial=windowsFont("Arial"))

# set themes
theme_map <- function(...) {
  theme_minimal() +
  theme(
    text = element_text(family = "arial", colour="#FFFFFF"), #, size=36),
    axis.line = element_blank(),
    axis.text.x = element_blank(),
    axis.text.y = element_blank(),
    axis.ticks = element_blank(),
    axis.title.x = element_blank(),
    axis.title.y = element_blank(),
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),
    plot.background = element_rect(fill = bckgrnd_col, color = NA),
    panel.background = element_rect(fill = bckgrnd_col, color = NA),
    legend.background = element_rect(fill = bckgrnd_col, color = NA),
    panel.border = element_blank(),
    ...
  )
}

ggplot() +
  #stat_density_2d(data=as.data.frame(coordinates(x)), aes(x=x, y=y,fill = ..density..), geom = "raster", contour = FALSE) +
  geom_polygon(data=slough_shp_tidy,
               aes(long, lat, group=id),
               fill=NA,
               colour="white",
               size=1) +
  geom_raster(data=as.data.frame(raster_kde2d,
                                 xy=TRUE),
              aes(x,y,fill=layer)) +
  scale_fill_viridis(option="magma",
                     begin=0.1,
                     end=0.9,
                     guide = guide_colorbar(
                       direction = "horizontal",
                       #barheight = unit(30, units = "mm"),
                       #barwidth = unit(300, units = "mm"),
                       draw.ulim = FALSE,
                       title.position = 'top',
                       title.hjust = 0.5,
                       label.hjust = 0.5)) +
  geom_polygon(data=bld_slough_shp_tidy, aes(long, lat, group=id), fill="white", colour=NA, alpha=0.5) +
  geom_path(data=rd_slough_shp_tidy, aes(long, lat, group=id), colour="white", alpha=0.5) +
  geom_polygon(data=slough_shp_tidy, aes(long, lat, group=id), fill=NA, colour="white",size=0.25) +
  coord_fixed() +
  labs(x=NULL,
       y=NULL,
       fill="Density") +
  theme_void() +
  theme(legend.position = "bottom")


Clustering of Ethnic Group


Now we can test whether there is significant clustering of this ethnic group and which output areas are part of this clustering. For this, the Moran’s I statistic can be used to look at both global and local spatial autocorrelation. A useful tutorial on this can be found here: https://mgimond.github.io/Spatial/spatial-autocorrelation-in-r.html

Create a Neighbourhood
First we create a neighbourhood object using the poly2nb function and our output area shapefile.
We do this using the ‘Queen’s case’ setting, meaning that adjacent areas which share either a
border or a corner are counted as neighbours.

# remember to reset oa_shp_data as it's filtered in the previous script
neighbourhood <- poly2nb(oa_shp_data, queen=TRUE)

{
  par(mar=c(0,0,0,0))
  plot(oa_shp_data,
       border="grey")
  plot(neighbourhood,
       coords=coordinates(oa_shp_data),
       col="red",
       add=T)
}

Generate Neighbourhood Weights

# first generate list of weights
neighbourhood_weights_list <- nb2listw(neighbourhood, style="W", zero.policy=TRUE)

Compute Global Moran’s I test

moran.test(oa_shp_data$POP_PROPORTION,neighbourhood_weights_list)

##
##  Moran I test under randomisation
##
## data:  oa_shp_data$POP_PROPORTION
## weights: neighbourhood_weights_list
##
## Moran I statistic standard deviate = 23.441, p-value < 2.2e-16
## alternative hypothesis: greater
## sample estimates:
## Moran I statistic       Expectation          Variance
##      0.7327179450     -0.0026041667      0.0009840443
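
To show what the global statistic measures, here is a minimal base R sketch of Moran's I with row-standardised weights (the `style="W"` choice above). It uses a made-up chain of six cells rather than the Slough output areas, and the function names `chain_weights` and `morans_i` are hypothetical helpers, not spdep functions. Under the null hypothesis the expected value is -1/(n-1), so the expectation of -0.0026041667 reported above would be consistent with n = 385 areas.

```r
# Row-standardised weights for a chain of n cells, where each cell's
# neighbours are the cells immediately either side of it
chain_weights <- function(n) {
  w <- matrix(0, n, n)
  for (i in seq_len(n - 1)) {
    w[i, i + 1] <- 1
    w[i + 1, i] <- 1
  }
  w / rowSums(w)  # each row sums to 1
}

# Global Moran's I: (n / S0) * sum_ij(w_ij * z_i * z_j) / sum_i(z_i^2),
# where z are deviations from the mean and S0 is the sum of all weights
morans_i <- function(x, w) {
  z <- x - mean(x)
  n <- length(x)
  (n / sum(w)) * sum(w * outer(z, z)) / sum(z^2)
}

w <- chain_weights(6)
clustered   <- c(10, 9, 8, 2, 1, 0)  # high values sit next to high values
alternating <- c(1, 0, 1, 0, 1, 0)   # every neighbour is the opposite

morans_i(clustered, w)               # 0.75
morans_i(alternating, w)             # -1
-1 / (length(clustered) - 1)         # expectation under the null: -0.2
```

A value near +1 indicates that similar values sit next to each other, a value near -1 indicates a checkerboard pattern, and values near the (slightly negative) expectation indicate no spatial structure.
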

Now a Monte Carlo based test of significance can be performed.

moran_i_monte_carlo <- moran.mc(oa_shp_data$POP_PROPORTION,
                                neighbourhood_weights_list,
                                nsim=599)

# View results (including p-value)
moran_i_monte_carlo

##
##  Monte-Carlo simulation of Moran I
##
## data:  oa_shp_data$POP_PROPORTION
## weights: neighbourhood_weights_list
## number of simulations + 1: 600
##
## statistic = 0.73272, observed rank = 600, p-value = 0.001667
## alternative hypothesis: greater

# Plot the distribution (note that this is a density plot instead of a histogram)
plot(moran_i_monte_carlo)
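
The Monte Carlo test can be sketched in base R as a permutation test: shuffle the values across areas many times, recompute Moran's I each time, and see where the observed value ranks. The toy chain of six cells and the helper `morans_i` are illustrative assumptions; whether `moran.mc` computes its p-value in exactly this way is not confirmed here, although the reported p-value of 0.001667 equals 1/600, which is what this formula gives when the observed value beats all 599 simulations.

```r
set.seed(1)

# Row-standardised neighbour weights for a chain of six cells (toy example)
n <- 6
w <- matrix(0, n, n)
for (i in seq_len(n - 1)) {
  w[i, i + 1] <- 1
  w[i + 1, i] <- 1
}
w <- w / rowSums(w)

# Global Moran's I for values x and weights matrix w
morans_i <- function(x, w) {
  z <- x - mean(x)
  (length(x) / sum(w)) * sum(w * outer(z, z)) / sum(z^2)
}

x <- c(10, 9, 8, 2, 1, 0)   # made-up, strongly clustered values
nsim <- 599

observed  <- morans_i(x, w)
# Randomly reassign the values to cells and recompute the statistic
simulated <- replicate(nsim, morans_i(sample(x), w))

# One-sided permutation p-value (alternative: greater); the +1 counts the
# observed arrangement itself as one of the possible permutations
p_value <- (sum(simulated >= observed) + 1) / (nsim + 1)

print(observed)
print(p_value)
```

Because of the +1 terms, the smallest p-value this test can report is 1/(nsim + 1), so a larger `nsim` is needed if finer resolution is wanted.
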

A local Moran’s I statistic can now be calculated for each output area.

# Local Moran
LM_Results <- localmoran(oa_shp_data$POP_PROPORTION,
                         neighbourhood_weights_list,
                         p.adjust.method="bonferroni",
                         na.action=na.exclude,
                         zero.policy=TRUE)

summary(LM_Results)

##        Ii               E.Ii               Var.Ii             Z.Ii
##  Min.   :-0.9174   Min.   :-0.002604   Min.   :0.05603   Min.   :-2.1682
##  1st Qu.: 0.1113   1st Qu.:-0.002604   1st Qu.:0.13976   1st Qu.: 0.2451
##  Median : 0.4019   Median :-0.002604   Median :0.19670   Median : 0.9107
##  Mean   : 0.7327   Mean   :-0.002604   Mean   :0.20317   Mean   : 1.7362
##  3rd Qu.: 0.8386   3rd Qu.:-0.002604   3rd Qu.:0.24652   3rd Qu.: 1.9323
##  Max.   : 5.0224   Max.   :-0.002604   Max.   :0.99382   Max.   :14.1130
##    Pr(z > 0)
##  Min.   :0.0000
##  1st Qu.:0.1778
##  Median :1.0000
##  Mean   :0.6611
##  3rd Qu.:1.0000
##  Max.   :1.0000

The results of the local Moran’s I are merged back into the output area shapefile for plotting.

# add moran's I results back to the shapefile
oa_shp_data@data$lmoran_i <- LM_Results[,1]
oa_shp_data@data$lmoran_p <- LM_Results[,5]
oa_shp_data@data$lmoran_sig <- LM_Results[,5]<0.01

# manually make a moran plot based on standardised variables
# standardise variables and save to a new column
oa_shp_data$SCALED_POP_PROPORTION <- scale(oa_shp_data$POP_PROPORTION)

# create a lagged variable
oa_shp_data$LAGGED_SCALED_POP_PROPORTION <- lag.listw(neighbourhood_weights_list,
                                                      oa_shp_data$SCALED_POP_PROPORTION)

oa_shp_data$SPATIAL_LAG_CAT <- factor(
  ifelse(oa_shp_data$SCALED_POP_PROPORTION>0 & oa_shp_data$LAGGED_SCALED_POP_PROPORTION>0, "High-High",
  ifelse(oa_shp_data$SCALED_POP_PROPORTION>0 & oa_shp_data$LAGGED_SCALED_POP_PROPORTION<0, "High-Low",
  ifelse(oa_shp_data$SCALED_POP_PROPORTION<0 & oa_shp_data$LAGGED_SCALED_POP_PROPORTION<0, "Low-Low",
  ifelse(oa_shp_data$SCALED_POP_PROPORTION<0 & oa_shp_data$LAGGED_SCALED_POP_PROPORTION>0, "Low-High",
         "Equivalent")))))
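
The local statistic and the quadrant labels can be illustrated on a toy chain of six cells in base R. The data are made up, and the local Moran formula below is one common definition (deviation from the mean, scaled by the variance with divisor n, multiplied by the weighted average deviation of the neighbours); it is an assumption that this matches the default used by `localmoran`, although it is consistent with the summary above, where the mean of `Ii` (0.7327) equals the global Moran's I.

```r
# Row-standardised neighbour weights for a chain of six cells (toy example)
n <- 6
w <- matrix(0, n, n)
for (i in seq_len(n - 1)) {
  w[i, i + 1] <- 1
  w[i + 1, i] <- 1
}
w <- w / rowSums(w)

x <- c(10, 9, 8, 2, 1, 0)   # made-up proportions, scaled up for readability

# Local Moran's I for each cell
dev     <- x - mean(x)
m2      <- sum(dev^2) / n
local_i <- (dev / m2) * as.numeric(w %*% dev)

# Standardised value and its spatial lag (average of the neighbours)
z     <- as.numeric(scale(x))
lag_z <- as.numeric(w %*% z)

# Quadrant of the Moran scatterplot each cell falls into
quadrant <- ifelse(z > 0 & lag_z > 0, "High-High",
            ifelse(z > 0 & lag_z < 0, "High-Low",
            ifelse(z < 0 & lag_z < 0, "Low-Low",
            ifelse(z < 0 & lag_z > 0, "Low-High",
                   "Equivalent"))))

print(data.frame(x, local_i, z, lag_z, quadrant))
mean(local_i)   # with row-standardised weights this equals the global I, 0.75
```

Note that the quadrant only describes the direction of the association. Significance comes separately from the local p-values, which is why the later maps combine `SPATIAL_LAG_CAT` with `lmoran_sig`.
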

First we look at the relationship between population proportion and the spatially lagged values.

ggplot(oa_shp_data@data, aes(SCALED_POP_PROPORTION, LAGGED_SCALED_POP_PROPORTION,
                             colour=lmoran_p)) +
  geom_point(alpha=0.5, size=3) +
  geom_smooth(method="lm", se=F, col="red") +
  geom_hline(yintercept=0, lty=2) +
  geom_vline(xintercept=0, lty=2) +
  theme_bw() +
  labs(title="Scaled Spatial Lag Comparison",
       x="Scaled Value",
       y="Lagged Scaled Value")

Secondly we can plot a comparison set of maps using ggplot to look at the clusters which are
identified.

# set id columns to merge local moran's results back to the shapefile
oa_shp_data@data$id <- row.names(oa_shp_data@data)

# tidy the shapefile
oa_shp_data_tidy <- tidy(oa_shp_data,region="id")
oa_shp_data_tidy <- merge(oa_shp_data_tidy,oa_shp_data@data,by="id")

# ==================================================
# Comparison plot

gg1 <- ggplot(oa_shp_data_tidy, aes(long, lat, fill=lmoran_sig, group=id)) +
  geom_polygon(col="white") +
  scale_fill_manual(values=c("grey","red")) +
  coord_fixed() +
  theme_void()

gg2 <- ggplot(oa_shp_data_tidy, aes(long, lat, fill=POP_PROPORTION, group=id)) +
  geom_polygon(col="white") +
  scale_fill_gradient(low="white",high="red", labels=scales::percent) +
  coord_fixed() +
  theme_void()

gg3 <- ggplot(oa_shp_data_tidy, aes(long, lat, fill=lmoran_i, group=id)) +
  geom_polygon(col="white") +
  scale_fill_gradient(low="white",high="blue") +
  coord_fixed() +
  theme_void()

gg4 <- ggplot(oa_shp_data_tidy, aes(long, lat, fill=SPATIAL_LAG_CAT, group=id)) +
  geom_polygon(col="white") +
  scale_fill_manual(values=c("red","pink","light blue","blue")) +
  coord_fixed() +
  theme_void()

# grid.arrange(gg1, gg2, gg3, gg4, ncol=1, nrow=4)

gg1

gg2

gg3

gg4

Now we can create a plot which shows only the areas with a high proportion that are surrounded
by high proportion areas and are statistically significant.

oa_shp_data_tidy_sig_high_high <- oa_shp_data_tidy[oa_shp_data_tidy$lmoran_sig==T & oa_shp_data_tidy$SPATIAL_LAG_CAT=="High-High",]

ggplot() +
  geom_polygon(data=oa_shp_data_tidy, aes(long, lat, fill=lmoran_sig, group=id),fill="grey",col="white") +
  geom_polygon(data=oa_shp_data_tidy_sig_high_high, aes(long, lat, fill=lmoran_sig, group=id),fill="red",col="white") +
  coord_fixed() +
  theme_void()

We then union the clusters to form cluster boundaries.

oa_shp_data_sig_high_high <- oa_shp_data[oa_shp_data$lmoran_sig==T & oa_shp_data$S


PATIAL_LAG_CAT=="High-High",]
plot(oa_shp_data_sig_high_high)


cluster_boundary <- gUnaryUnion(oa_shp_data_sig_high_high)
plot(cluster_boundary)
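
gUnaryUnion comes from the rgeos package, which has since been retired from CRAN. As a rough sketch of the same dissolve step with the sf package, the toy example below unions two adjacent squares into one outline. The squares and object names are hypothetical stand-ins for the High-High output areas, and this is an alternative technique, not the method used in this analysis.

```r
library(sf)

# Two hypothetical unit squares sharing an edge (stand-ins for adjacent output areas)
sq1 <- st_polygon(list(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0))))
sq2 <- st_polygon(list(rbind(c(1, 0), c(2, 0), c(2, 1), c(1, 1), c(1, 0))))
toy_areas <- st_sfc(sq1, sq2)

# Dissolve the shared internal edge, leaving a single boundary geometry
toy_boundary <- st_union(toy_areas)

length(toy_boundary)   # number of geometries left after the union
st_area(toy_boundary)  # the union does not change the total area
```

Areas that do not touch would remain as separate parts of one multi-part geometry, which is why several distinct cluster outlines can come out of a single union.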


The cluster boundaries can then be plotted.


cluster_boundary_tidy <- tidy(cluster_boundary)

ggplot() +
  geom_polygon(data=slough_shp_tidy,
               aes(long, lat, group=id),
               fill=NA,
               colour="white",
               size=1) +
  geom_raster(data=as.data.frame(raster_kde2d,
                                 xy=TRUE),
              aes(x,y,fill=layer)) +
  scale_fill_viridis(option="magma",
                     begin=0.1,
                     end=0.9,
                     guide = guide_colorbar(
                       direction = "horizontal",
                       #barheight = unit(30, units = "mm"),
                       #barwidth = unit(300, units = "mm"),
                       draw.ulim = FALSE,
                       title.position = 'top',
                       title.hjust = 0.5,
                       label.hjust = 0.5)) +
  geom_polygon(data=bld_slough_shp_tidy, aes(long, lat, group=id), fill="white", colour=NA, alpha=0.5) +
  geom_path(data=rd_slough_shp_tidy, aes(long, lat, group=id), colour="white", alpha=0.5) +
  geom_polygon(data=slough_shp_tidy, aes(long, lat, group=id), fill=NA, colour="white",size=0.25) +
  geom_polygon(data=cluster_boundary_tidy, aes(long, lat, group=piece), fill=NA, colour="white",size=1, lty=1) +
  coord_fixed() +
  labs(x=NULL,
       y=NULL,
       fill="Density") +
  theme_void() +
  theme(legend.position = "bottom")


GP Registration Numbers
It would be useful to know if areas with low GP registration were in any way related to our
clusters. For now we will just do this visually.

Load GP Practice Locations


Here I pull GP practice coordinates from a SQLite database I have set up, which contains the ONS
postcode directory data. I then filter them to include only the practices in our selected Local
Authority, using a point-in-polygon selection function.


# Connect to pre-existing SQLite database containing ONS postcode directory
db_connection <- dbConnect(SQLite(), dbname="C:\\Users\\User1\\Documents\\Rstudio_Projects\\ONS_Lookup_Database\\ons_lkp_db")

# SQL query to pull postcode lookup data from SQLite database
pcd <- dbGetQuery(db_connection,
                  "SELECT pcd_spaceless, oseast1m, osnrth1m
                   FROM ONS_PD
                   WHERE length(oseast1m)!=0")

# Load NHS GP reference data
gpp <- read.csv(file="data\\epraccur\\epraccur.csv", stringsAsFactors=F)

# Join postcode directory to GP reference data
gpp <- gpp %>%
  mutate("pcd_spaceless"=gsub(" ","",gpp_pcd)) %>%
  left_join(pcd)

# We lose a couple of hundred practices mainly from Jersey, Guernsey and Northern Ireland
# drop those practices with null coordinates
gpp <- gpp %>% filter(!is.na(oseast1m))

# remove the pcd table as it takes up a lot of memory
rm(pcd)

# Now convert the GP practice coordinates to a spatial points dataframe
xy <- gpp[,5:6] # get the coordinates
gpp_shp <- SpatialPointsDataFrame(coords = xy, data = gpp,
                                  proj4string = CRS("+proj=tmerc +lat_0=49 +lon_0=-2 +k=0.9996012717 +x_0=400000 +y_0=-100000 +datum=OSGB36 +units=m +no_defs +ellps=airy +towgs84=446.448,-125.157,542.060,0.1502,0.2470,0.8421,-20.4894"))

# filter our practices to those within our local authority area using the over function to perform a point-in-polygon selection
point_in_polygon <- sp::over(gpp_shp, slough_shp)
point_in_polygon$row_number <- row.names(point_in_polygon)
point_in_polygon <- point_in_polygon[is.na(point_in_polygon$gid)==F,]

gpp_shp_filtered <- gpp_shp[point_in_polygon$row_number,]

{
  par(mar=c(0,0,0,0))
  plot(slough_shp)
  plot(gpp_shp_filtered, add=TRUE)
}
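
The point-in-polygon step above can be hard to follow on real data, so here is a minimal sketch of the same idea on toy geometry. The unit square and the two points are hypothetical. Note that with a plain SpatialPolygons object over() returns a vector of polygon indexes, whereas with a SpatialPolygonsDataFrame such as slough_shp it returns a data frame of attributes, which is why the code above tests the gid column for NA.

```r
library(sp)

# A hypothetical unit-square "local authority"
square   <- Polygon(cbind(c(0, 1, 1, 0, 0), c(0, 0, 1, 1, 0)))
toy_area <- SpatialPolygons(list(Polygons(list(square), ID = "toy")))

# Two hypothetical practices: the first inside the square, the second outside
toy_practices <- SpatialPoints(cbind(x = c(0.5, 2), y = c(0.5, 2)))

# over() gives the index of the polygon containing each point, or NA if none does
hit <- over(toy_practices, toy_area)

# Keep only the points that fell inside a polygon
inside <- toy_practices[!is.na(hit), ]
coordinates(inside)
```

Only the first point survives the filter, in the same way that only practices located inside the Slough boundary are kept.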

Generate GP Catchment Areas



Note that we will generate catchment areas and then select all of the ones which overlap with our
selected local authority.

# Read in the LSOA catchment information
gp_catch <- read.csv("data\\gp-reg-patients-LSOA-alt-tall.csv", stringsAsFactors=F)

# Get a list of LSOAs within our local authority
slough_lsoa <- dbGetQuery(db_connection, "SELECT DISTINCT lsoa11 FROM ONS_PD WHERE oslaua='E06000039'")

# Read in LSOA shapefile to attach to GP practices
lsoa_shp <- readOGR(dsn="shp", layer="Lower_Layer_Super_Output_Areas_December_2011_Generalised_Clipped__Boundaries_in_England_and_Wales", stringsAsFactors=FALSE)

## OGR data source with driver: ESRI Shapefile
## Source: "shp", layer: "Lower_Layer_Super_Output_Areas_December_2011_Generalised_Clipped__Boundaries_in_England_and_Wales"
## with 34753 features
## It has 6 fields
## Integer64 fields read as strings: objectid

# read in LSOA population figures
lsoa_pop_2016 <- read.csv("data\\ons_mid2016_lsoa_pop.csv", stringsAsFactors=F)
lsoa_pop_2016$All.Ages <- as.integer(gsub(",","",lsoa_pop_2016$All.Ages))

# View some summary information for each GP practice
gp_catch %>%
  group_by(PRACTICE_CODE) %>%
  summarise("n"=n(),
            "TOTAL_POP"=sum(ALL_PATIENTS),
            "MEAN_LSOA_POP"=mean(ALL_PATIENTS),
            "MEDIAN_LSOA_POP"=median(ALL_PATIENTS))


## # A tibble: 7,531 x 5
##    PRACTICE_CODE     n TOTAL_POP MEAN_LSOA_POP MEDIAN_LSOA_POP
##            <chr> <int>     <int>         <dbl>           <dbl>
##  1        A81001    50      4178      83.56000             6.5
##  2        A81002    91     19902     218.70330           220.0
##  3        A81004   134      9344      69.73134            42.0
##  4        A81005    23      7931     344.82609           167.0
##  5        A81006   106     13661     128.87736            95.0
##  6        A81007    68      9834     144.61765           160.5
##  7        A81008   129      3973      30.79845             4.0
##  8        A81009   128      9084      70.96875            49.5
##  9        A81011    70     11723     167.47143           182.0
## 10        A81012   132      4778      36.19697            25.0
## # ... with 7,521 more rows

lsoa_pop_var <- gp_catch %>%
  filter(substring(LSOA_CODE,1,1)=="E") %>% # remove Wales
  group_by(LSOA_CODE) %>%
  summarise(
    "Number_of_Practices"=n(),
    "All_Patients"=sum(ALL_PATIENTS)
  ) %>%
  left_join(lsoa_pop_2016, by=c("LSOA_CODE"="LSOA_Code")) %>%
  mutate("Difference"=All_Patients-All.Ages,
         "LA_Flag"=ifelse(LSOA_CODE %in% slough_lsoa$lsoa11,"LSOA_CODE",NA)) %>%
  filter(is.na(LA_Flag)==F)
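
To show what the pipeline above calculates, here is a small sketch on made-up data. The practice codes, LSOA codes and counts are all hypothetical; only the column names follow the real tables.

```r
library(dplyr)

# Hypothetical registrations: LSOA E01 is served by two practices, E02 by one
toy_catch <- data.frame(
  PRACTICE_CODE = c("P1", "P2", "P1"),
  LSOA_CODE     = c("E01", "E01", "E02"),
  ALL_PATIENTS  = c(100, 50, 200)
)

# Hypothetical resident population estimates for the same LSOAs
toy_pop <- data.frame(
  LSOA_Code = c("E01", "E02"),
  All.Ages  = c(120, 250)
)

toy_var <- toy_catch %>%
  group_by(LSOA_CODE) %>%
  summarise(Number_of_Practices = n(),               # practices with patients in the LSOA
            All_Patients = sum(ALL_PATIENTS)) %>%    # registered patients across practices
  left_join(toy_pop, by = c("LSOA_CODE" = "LSOA_Code")) %>%
  mutate(Difference = All_Patients - All.Ages)       # registered minus resident

toy_var
```

A positive Difference means more people are registered with a GP than the population estimate says live in the LSOA; a negative value means fewer are registered than are estimated to live there.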

lsoa_shp_pop <- sp::merge(lsoa_shp, lsoa_pop_var, by.x="lsoa11cd", by.y="LSOA_CODE", all.x=FALSE)

# get polygon centroids
xy <- coordinates(lsoa_shp_pop)
lsoa_shp_pop$x <- xy[,1]
lsoa_shp_pop$y <- xy[,2]

# define a function to handle tidying (which used to be called fortifying) and then joining the data items back in
clean <- function(shape){
  shape@data$id = rownames(shape@data)
  shape.points = tidy(shape, region="id")
  shape.df = inner_join(shape.points, shape@data, by="id")
}

lsoa_shp_pop_tidy <- clean(lsoa_shp_pop)

ggplot(data=lsoa_shp_pop_tidy,
       aes(long,
           lat,
           group=id,
           fill=Difference)) +
  geom_polygon(colour="white") +
  geom_text(aes(x,y,label=Number_of_Practices),color = "white", size=3) +
  scale_fill_gradient2(midpoint=0, low="red", mid="white",high="blue") +
  coord_fixed() +
  theme_void()


# ----------
median_ons_pop <- median(lsoa_pop_var$All.Ages, na.rm=T)

ggplot(lsoa_pop_var, aes(All.Ages)) +
  geom_histogram(bins=200) +
  geom_vline(xintercept=median_ons_pop, linetype=2, col="red") +
  theme_bw() +
  labs(x="LSOA Population (2016 Estimate)", y="Count", title="Distribution of LSOA Population Estimates (2016)")


# ----------
median_pat_pop_diff <- median(lsoa_pop_var$Difference, na.rm=T)

ggplot(lsoa_pop_var, aes(Difference)) +
  geom_histogram(bins=100) +
  geom_vline(xintercept=median_pat_pop_diff, linetype=2, col="red") +
  theme_bw() +
  labs(x="Difference between registered and resident population", y="Count", title="Distribution of Difference Between LSOA Population Estimates (2016) and Registrants") +
  xlim(-2500,2500)


# ----------
# x is the ONS population estimate and y the number of registrants, so that the axes match their labels
ggplot(lsoa_pop_var, aes(All.Ages,All_Patients,colour=Difference)) +
  geom_point() +
  scale_color_gradient2(low="#2c7bb6", mid="#ffffbf", high="#d7191c") +
  geom_smooth(method="lm", se=T, colour="black", linetype=3) +
  #geom_abline(intercept=0, slope=1) +
  theme_bw() +
  labs(x="Population Estimate 2016",
       y="Registrants",
       title="LSOA Population Estimates vs Number of Registrants"
  ) +
  coord_equal()


# ----------
median_pat_num_prac <- median(lsoa_pop_var$Number_of_Practices, na.rm=T)

ggplot(lsoa_pop_var, aes(Number_of_Practices)) +
  geom_histogram(binwidth=1) +
  geom_vline(xintercept=median_pat_num_prac, linetype=2, col="red") +
  theme_bw()
