Interactive Mapping in Python With UK Census Data
Introduction
In this article I will describe the process I followed to create this dashboard displaying maps of
London showing UK Census Data:
I am new to GIS and mapping, so this was a voyage of discovery – I will describe what I learned, and
some of the issues I encountered along the way, with their resolutions.
I assume familiarity with Python. The most common Python data visualization libraries are
compared in the article The Top 6 Python Data Visualization Libraries: How to choose. Their
summary was:
Matplotlib - foundation and comprehensive library, highly customizable, not very easy to
learn and use
Seaborn - easy to learn and use, good for exploratory data analysis, relatively customizable,
can integrate with Matplotlib if need more customizations
Plotly (Dash) - easy to use and customizable, create interactive graphs and web applications
Bokeh - create interactive graphs and web applications
Folium - good for nice looking map graphs
Plotnine - grammar of graphics, based on the R ggplot2 library
My approach was:
1. Download UK Census Data, and Census Ward and Local Authority geographic information (as
Shapefiles).
2. Plot a simple fixed map using GeoPandas and Matplotlib.
3. Change to use Plotly with GeoJSON geographic data.
4. Add interactive functionality using Dash.
My code is available in my GitHub repository. When I show code fragments below I will refer to the
file in the repository where it appears.
UK Census Data
The UK Office for National Statistics is responsible for the Census. The 2021 Census results are still
being prepared; the schedule is discussed here. The 2011 Census data is described and can be
downloaded here. For this exercise, I downloaded Bulk Data, which is described as:
I downloaded the Detailed Characteristics on demography and families for merged wards, local
authorities and regions. This is a 79MB Zip file, containing data in a structured CSV format, with an
Excel catalogue. I extracted the files into a subdirectory of my project data directory. Let’s look at
the code to read the data.
Tables
The different tables are described in the Excel catalogue Cell Numbered DC Tables
3.3.xlsx. The Index sheet of this file lists the tables, and I read it as follows. (This code appears
in the file census_read_data.py in the repository.)
import pandas as pd

CENSUS_DATA = 'data/BulkdatadetailedcharacteristicsmergedwardspluslaandregE&Wandinfo3.3'

f = None

def read_index():
    # Read Index sheet
    global f
    excel_file = CENSUS_DATA + '/Cell Numbered DC Tables 3.3.xlsx'
    f = pd.ExcelFile(excel_file)
    index = f.parse(sheet_name='Index')
    return index
There are 25 tables of different statistics; each has its own sheet in the catalogue. Looking at the
sheet for the first table DC1104EW, we see:
The table has categories for Residence Type, Sex and Age, with values for each combination of
category values, including All categories. The values in the table are the index of the column in the
data file for the statistic. The categories vary for each table, with one or two categories in columns,
and one or two in the rows. I could not find a way to automatically cope with these varied column
and row headings using pandas. Instead, I read the sheet into a DataFrame and then process the
headings in my own code, part of which is shown below.
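The two repetition patterns used in that code (row values repeated "in order", column values repeated "in turn") can be illustrated with a small stand-alone sketch. The category values here are invented, purely for illustration:

```python
# Hypothetical category values: two row values and three column values
rows = ['Male', 'Female']
cols = ['0-15', '16-64', '65+']

# Row values repeated in order: each value len(cols) times in a row
row_rep = [r for r in rows for _ in range(len(cols))]

# Column values repeated in turn: the whole list len(rows) times
col_rep = cols * len(rows)

print(row_rep)  # ['Male', 'Male', 'Male', 'Female', 'Female', 'Female']
print(col_rep)  # ['0-15', '16-64', '65+', '0-15', '16-64', '65+']
```

Zipping the two expanded lists together yields one entry per combination of category values, which is exactly the shape the resulting DataFrame needs.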
def read_table(table_name):
    # Read Table sheet, transforming it into a DataFrame with columns
    # for each category and the Dataset index
    if f is None:
        exit("Call read_index() to open table list")
    table = f.parse(sheet_name=table_name, header=None)
    # ... code to extract the row/column level names and values elided ...
    # Construct DataFrame from column and row level names and values
    num_cols = len(col_level_values[0])
    num_rows = len(row_level_values[0])
    for l in range(row_levels):
        # Repeat row values in order
        row_level_values[l] = [x for x in row_level_values[l]
                               for n in range(num_cols)]
    for l in range(col_levels):
        # Repeat col values in turn
        col_level_values[l] = col_level_values[l] * num_rows
    values = []
    for r in range(0, num_rows):
        row = data_row_indexes[r]
        values.extend(table.iloc[row, 1:])
    values = [str(v).zfill(4) for v in values]
    data = [row_level_values[l] for l in range(row_levels)] + \
        [col_level_values[l] for l in range(col_levels)] + \
        [values]
    index = row_level_names + col_level_names + ['Dataset']
    df = pd.DataFrame(
        data=data,
        index=index
    )
    df = df.transpose()
    return df
Each row in the DataFrame identifies the Dataset for a combination of the category values. The
first row, with the value All for the three categories identifies Dataset 0001, which corresponds
to the column DC1104EW0001 in the data file DC1104EWDATA.CSV.
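That mapping from a Dataset value to a data-file column name is simple string concatenation. As an illustration (dataset_column is a hypothetical helper, not a function from the repository):

```python
def dataset_column(table_name, dataset):
    # Data-file column names are the table name followed by the
    # 4-digit, zero-padded Dataset index
    return table_name + dataset

# read_table() produces Dataset values as zero-padded strings
dataset = str(1).zfill(4)
print(dataset_column('DC1104EW', dataset))  # DC1104EW0001
```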
Data Files
Each table has many CSV files; the file we load for DC1104EW is DC1104EWDATA.CSV. As
explained above, this data has columns for the different combinations of the category values. The
rows have the counts for different geographical areas. We read the data as follows:
def read_data(table_name):
    datafile = CENSUS_DATA + '/' + table_name + 'DATA.CSV'
    df = pd.read_csv(datafile)
    return df
Geography
The census data is summarised by Merged Ward and by Local Authority District. Merged Wards refer
to Electoral Wards, where a few small wards are merged to protect privacy. Merged Wards are
assigned to Local Authorities, which are themselves assigned to Regions. The geography data is
published on the Open Geography Portal in a number of formats:
Shapefile – A geospatial vector data format for geographic information system (GIS) software. It
is developed and regulated by Esri.
GeoJSON – An open standard format designed for representing simple geographical features,
along with their non-spatial attributes, based on JSON.
KML - Keyhole Markup Language is an XML format developed for use with Google Earth.
The Shapefile format is much more compact than GeoJSON, and is supported by GeoPandas, (see
below), so this is what I chose to download. (Plotly requires GeoJSON, which I created from the
Shapefiles later.)
The geography data for Wards and Local Authority Districts (LADs) that I used is on the Open
Geography Portal under the menu options Boundaries | Census Boundaries | Census Merged Wards
and Boundaries | Administrative Boundaries | Local Authority Districts. The files I downloaded were:
The Shapefiles I downloaded are high resolution, so large: about 120MB and 40MB respectively. The
portal has lower resolution versions that are a tenth of the size if you prefer to use those.
(Alternatively, you could use a site like mapshaper.org to compress the files to your preferred
resolution.)
These Shapefiles have the Coordinate Reference System OSGB36 / British National Grid. In order to
map the data we need to convert it to EPSG:4326 (aka WGS84); we will see this in the code below. (I
am afraid that it took me a long, frustrating time, during which no maps were displayed by Plotly, to
find this out!)
I used the lookup CSV file to create a geography lookup DataFrame, with rows for Merged
Wards and Local Authority Districts, and columns GeographyCode and Name:
def read_geography():
    # Get Census Merged Ward and Local Authority Data
    lookupfile = 'data/Ward_to_Census_Merged_Ward_to_Local_Authority_District_(December_2011)_Lookup_in_England_and_Wales.csv'
    cmwd = pd.read_csv(lookupfile, usecols=[
        'CMWD11CD', 'CMWD11NM', 'LAD11CD', 'LAD11NM'])
    cmwd.drop_duplicates(inplace=True)
    locationcol = "GeographyCode"
    cmwd[locationcol] = cmwd['CMWD11CD']
    namecol = 'Name'
    cmwd[namecol] = cmwd['CMWD11NM']
    lad = pd.read_csv(lookupfile, usecols=['LAD11CD', 'LAD11NM'])
    lad = lad.drop_duplicates()
    lad[locationcol] = lad['LAD11CD']
    lad[namecol] = lad['LAD11NM']
    lad['CMWD11CD'] = ''
    lad['CMWD11NM'] = ''
    geography = pd.concat([cmwd, lad])
    return geography
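To see how this lookup DataFrame gets used, here is a minimal sketch with invented codes and counts: a data table is merged onto the geography names on GeographyCode, as the mapping scripts later in this article do.

```python
import pandas as pd

# Invented geography rows: one Merged Ward and one Local Authority
geography = pd.DataFrame({
    'GeographyCode': ['E36000001', 'E09000099'],
    'Name': ['Some Ward', 'Some LAD'],
})

# Invented census counts keyed by GeographyCode
data = pd.DataFrame({
    'GeographyCode': ['E36000001', 'E09000099'],
    'DC1104EW0001': [1200, 7350],
})

# Merge attaches the human-readable Name to each data row
merged = pd.merge(data, geography, on='GeographyCode')
print(merged['Name'].tolist())  # ['Some Ward', 'Some LAD']
```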
GeoPandas and Matplotlib
Having downloaded the data, we are ready to produce our first map! While I downloaded data for
the whole of England and Wales, I will restrict the mapping to London for simplicity.
GeoPandas is an open source project to make working with geospatial data in Python easier.
GeoPandas extends the datatypes used by pandas to allow spatial operations on geometric types.
Geometric operations are performed by shapely. Geopandas further depends on Fiona for file access
and Matplotlib for plotting.
My first attempt to install GeoPandas on Windows failed. Because it depends on packages that are
implemented in C/C++, special procedures are required to install it on Windows; these are described
in the Appendix below. (Apparently the install is straightforward on Linux and Mac.)
import geopandas as gpd

def read_london_lad_geopandas():
    # Get Local Authority Boundaries as GeoPandas
    shapefile = 'data/Local_Authority_Districts_(December_2011)_Boundaries_EW_BFC/Local_Authority_Districts_(December_2011)_Boundaries_EW_BFC.shp'
    gdf = gpd.read_file(shapefile)
    # Convert coordinates
    gdf.to_crs(epsg=4326, inplace=True)
    # London
    ladgdf = gdf[gdf['lad11cd'].str.startswith('E090000')]
    return ladgdf
GeoPandas loads the Shapefile; we then convert the co-ordinates as discussed above, and filter the
rows to the LAD codes for London.
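That filter relies on the fact that the London borough codes share a common prefix. The same pattern can be shown with plain pandas and invented rows (the names and codes below are illustrative; only the prefix convention matters):

```python
import pandas as pd

# Invented LAD rows: two London boroughs and one authority outside London
gdf = pd.DataFrame({
    'lad11cd': ['E09000001', 'E09000002', 'E06000001'],
    'lad11nm': ['City of London', 'Barking and Dagenham', 'Hartlepool'],
})

# Keep only rows whose code has the London prefix
london = gdf[gdf['lad11cd'].str.startswith('E090000')]
print(london['lad11nm'].tolist())  # ['City of London', 'Barking and Dagenham']
```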
# Read the data table (all data items) and merge with the LAD geo data
df = crd.read_data(table_name)
gdf = london_lads_gdf.merge(df, left_on='lad11cd', right_on='GeographyCode')
# Create a Matplotlib plot, turn off axes, position color bar, and plot
fig, ax = plt.subplots(1, 1)
ax.set_axis_off()
divider = make_axes_locatable(ax)
cax = divider.append_axes("right", size="5%", pad=0.0)
gdf.plot(column=datacol, ax=ax, legend=True, cax=cax)
plt.show()
This displays a window with a choropleth map, colour coded according to the first dataset for the
table DC1104EW, i.e. the column DC1104EW0001:
def read_london_lad_geojson():
    # Get LAD GeoJSON
    london_jsonfile = "data/json_files/London_LAD_Boundaries.json"
    if not os.path.exists(london_jsonfile):
        lad_jsonfile = "data/json_files/Local_Authority_Districts_(December_2011)_Boundaries_EW_BFC.json"
        if not os.path.exists(lad_jsonfile):
            # Get Census LAD Boundaries as GeoPandas
            shapefile = 'data/Local_Authority_Districts_(December_2011)_Boundaries_EW_BFC/Local_Authority_Districts_(December_2011)_Boundaries_EW_BFC.shp'
            ladgdf = gpd.read_file(shapefile)
            # Convert coordinates
            ladgdf.to_crs(epsg=4326, inplace=True)
            # Write GeoJSON
            ladgdf.to_file(lad_jsonfile, driver='GeoJSON')
        with open(lad_jsonfile) as f:
            census_lads = json.load(f)
        london_lads = census_lads
        london_lads['features'] = list(filter(
            lambda f: f['properties']['lad11cd'].startswith('E090000'),
            london_lads['features']))
        with open(london_jsonfile, 'w') as f:
            json.dump(london_lads, f)
    else:
        with open(london_jsonfile) as f:
            london_lads = json.load(f)
    return london_lads
There are two cached JSON files: the London LAD Boundaries, and the complete LAD Boundaries.
The JSON files are simply converted from the Shapefile data via GeoPandas.
The first Plotly map uses the GeoJSON files. (Code in census_plotly_script.py.)
# Read the data table (all data items) and merge with the geography names
df = crd.read_data(table_name)
df = pd.merge(df, geography, on=locationcol)

fig = px.choropleth(london_lad_df,
                    geojson=london_lads,
                    locations=locationcol,
                    color=datacol,
                    color_continuous_scale="Viridis",
                    range_color=(0, max_value),
                    featureidkey=key,
                    scope='europe',
                    hover_data=[namecol],
                    title=title
                    )
fig.update_geos(
    fitbounds="locations",
    visible=False,
)
fig.update_layout(margin=dict(l=0, r=0, b=0, t=30),
                  title_x=0.5,
                  width=1200, height=600)
fig.show()
The px.choropleth function call maps the london_lads GeoJSON, colouring the map
according to the datacol (“DC1104EW0001”) column of the london_lad_df, matching
the GeoJSON feature property key (“properties.lad11cd”) with the locationcol
("GeographyCode") column.
The update_geos function call sets the bounds of the map from the displayed locations, and
hides the underlying map.
The update_layout function call reduces the margin around the map, and specifies the
width and height.
The code specifies the width and height because the default size is smaller and the default aspect
ratio is inappropriate. However, the map is still surrounded by a large amount of white space. I
improved this by manually specifying the bounds for the map:
from turfpy.measurement import bbox
from functools import reduce

def compute_bbox(gj):
    # Compute bounding box for GeoJSON
    gj_bbox_list = list(
        map(lambda f: bbox(f['geometry']), gj['features']))
    gj_bbox = reduce(
        lambda b1, b2: [min(b1[0], b2[0]), min(b1[1], b2[1]),
                        max(b1[2], b2[2]), max(b1[3], b2[3])],
        gj_bbox_list)
    return gj_bbox

gj_bbox = compute_bbox(london_lads)

fig.update_geos(
    # fitbounds="locations",
    center_lon=(gj_bbox[0]+gj_bbox[2])/2.0,
    center_lat=(gj_bbox[1]+gj_bbox[3])/2.0,
    lonaxis_range=[gj_bbox[0], gj_bbox[2]],
    lataxis_range=[gj_bbox[1], gj_bbox[3]],
    visible=False,
)
The package turfpy.measurement provides a function bbox to compute the bounding box
for a GeoJSON feature geometry.
The function compute_bbox computes the bounding box for each feature and reduces the list
of bounding boxes to compute the combined bounding box.
Then update_geos uses the box to specify the center and longitude and latitude ranges for
the map.
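The combining step does not depend on turfpy; it is a plain reduce over [min_lon, min_lat, max_lon, max_lat] boxes. A stand-alone sketch with invented per-feature bounding boxes:

```python
from functools import reduce

# Invented per-feature bounding boxes: [min_lon, min_lat, max_lon, max_lat]
gj_bbox_list = [
    [-0.5, 51.3, 0.0, 51.6],
    [-0.2, 51.4, 0.3, 51.7],
]

# Combine: take the outermost extent of all boxes
gj_bbox = reduce(
    lambda b1, b2: [min(b1[0], b2[0]), min(b1[1], b2[1]),
                    max(b1[2], b2[2]), max(b1[3], b2[3])],
    gj_bbox_list)

print(gj_bbox)  # [-0.5, 51.3, 0.3, 51.7]
```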
Dash
Dash is built on top of Plotly and “abstracts away all of the technologies and protocols that are
required to build a full-stack web app with interactive data visualization”. I used Dash to allow
selection of the table, dataset and granularity for the map.
In my first attempt I used the Dash Core Components to add the selection controls. While functional,
the appearance was not great, so I switched to using Dash Bootstrap Components, which give the
controls a consistent Bootstrap look and feel without needing CSS expertise.
The first version just allowed selection of the map granularity, adding these controls:
The Granularity radio items change the granularity of the map between Local Authority, which we
have seen so far, and Ward, for a more detailed map. The dropdown selection optionally specifies
the Local Authority for a Ward map.
# Read the data table (all data items) and merge with the geography names
df = crd.read_data(table_name)
df = pd.merge(df, geography, on=locationcol)

# Dash

def blank_fig():
    # Blank figure for initial Dash display
    fig = go.Figure(go.Scatter(x=[], y=[]))
    fig.update_layout(template=None)
    fig.update_xaxes(showgrid=False, showticklabels=False, zeroline=False)
    fig.update_yaxes(showgrid=False, showticklabels=False, zeroline=False)
    return fig

def compute_bbox(gj):
    # Compute bounding box for GeoJSON
    gj_bbox_list = list(
        map(lambda f: bbox(f['geometry']), gj['features']))
    gj_bbox = reduce(
        lambda b1, b2: [min(b1[0], b2[0]), min(b1[1], b2[1]),
                        max(b1[2], b2[2]), max(b1[3], b2[3])],
        gj_bbox_list)
    return gj_bbox

local_authorities = london_lad_ids
all_local_authorities = ['All'] + local_authorities

map_controls = dbc.Card(
    [
        dbc.Row([
            dbc.Label("Granularity", html_for="granularity", width=2),
            dbc.Col(
                [
                    dbc.RadioItems(
                        id='granularity',
                        options=[{'label': i, 'value': i}
                                 for i in ['Local Authorities', 'Wards']],
                        value='Local Authorities',
                        inline=True
                    ),
                ],
                width=8
            )
        ]),
        dbc.Row([
            dbc.Label('Wards for Local Authority',
                      html_for="local-authority", width=2),
            dbc.Col(
                [
                    dbc.Select(
                        id='local-authority',
                        options=[
                            {'label': 'All' if i == 'All'
                             else geography[geography[locationcol] == i][namecol].iat[0],
                             'value': i}
                            for i in all_local_authorities],
                        value='All'
                    )
                ],
                width=8
            )
        ]),
    ]
)

app.layout = dbc.Container(
    [
        html.H1("Census Data"),
        html.Hr(),
        dbc.Col(
            [
                dbc.Row(map_controls),
                dbc.Row(dcc.Graph(id='map', figure=blank_fig()),
                        class_name='mt-3'),
            ],
            align="center",
        ),
    ],
    fluid=True,
)

@app.callback(
    Output('map', 'figure'),
    Input('local-authority', 'value'),
    Input('granularity', 'value'),
)
def update_graph(local_authority, granularity):
    # ... selection of the GeoJSON (gj) and data (fdf) for the chosen
    # granularity and local authority elided ...
    gj_bbox = compute_bbox(gj)
    fig = px.choropleth(fdf,
                        geojson=gj,
                        locations=locationcol,
                        color=datacol,
                        color_continuous_scale="Viridis",
                        range_color=(0, max_value),
                        featureidkey=key,
                        scope='europe',
                        hover_data=[namecol, 'LAD11NM'],
                        title=title
                        )
    fig.update_geos(
        center_lon=(gj_bbox[0]+gj_bbox[2])/2.0,
        center_lat=(gj_bbox[1]+gj_bbox[3])/2.0,
        lonaxis_range=[gj_bbox[0], gj_bbox[2]],
        lataxis_range=[gj_bbox[1], gj_bbox[3]],
        visible=False,
    )
    fig.update_layout(margin=dict(l=0, r=0, b=0, t=30),
                      title_x=0.5,
                      width=1200, height=600)
    return fig

if __name__ == '__main__':
    app.run_server(debug=True)
This script reads the Ward data, in addition to the LAD data, using
crd.read_london_ward_geojson().
The Dash functionality starts after the comment # Dash.
The initial figure displayed by Dash is a chart, which I did not want to see, so the function
blank_fig() creates a blank figure to use as the initial display. (Credit to this Stack Overflow
answer.)
The call to dash.Dash() creates the Dash application, using the standard Bootstrap
stylesheet.
The assignment to map_controls creates the selection controls.
The assignment to app.layout creates a container for the page with heading,
map_controls and placeholder for the map initially showing the blank figure.
The @app.callback decorator defines a callback function that is called when either of the
selection controls is updated, returning the updated figure.
The callback builds the map, using Plotly as before, using the data appropriate to the controls.
One minor change is the hover_data=[namecol, 'LAD11NM'], which adds the LAD
name to the hover display.
app.run_server() starts the Dash server. In my environment it is accessed on
http://127.0.0.1:8050/ and displays this page:
If I change the granularity to Ward and select the Local Authority Ealing I get this map; the image
shows the hover text with the LAD name:
Next I added controls to select the table and dataset. This required many more inputs and outputs
on the callback. Ideally, I would like to have multiple callbacks chained together as described in Dash
Basic Callbacks. However, callbacks must be stateless, so it is not possible for the table selection to
update global state with the table data to be displayed on the dataset selection. So, I ended up with
one large callback that does everything.
The code is in census_dash_script_full.py. I will not reproduce it here, but summarise the
changes:
Appendix: Installing GeoPandas on Windows
The article Using geopandas on Windows by Geoff Boeing is referenced as the definitive explanation
of how to install GeoPandas on Windows. The comments on the article have many suggestions on
how best to proceed. The essential advice is to use pipwin, which installs unofficial Python package
binaries for Windows provided by Christoph Gohlke here. These are the steps I followed: