Thames Valley Crime Data Analysis
import pandas as pd
In [29]:
data = pd.read_csv("2024-09-thames-valley-street.csv")
data
Out [29]:
index | Crime ID | Month | Reported by | Location | LSOA code | LSOA name | Crime type | Last outcome category
0 | 600dc88bebb3499405cf71237c6a89fa4f0e94f51d669d... | 2024-09 | Thames Valley Police | On or near Winkfield Lane | E01016252 | Bracknell Forest 001C | Burglary | Investigation complete; no suspect identified
1 | 96339937ce33810cdd4421d7960675e206d9a57e8cddd7... | 2024-09 | Thames Valley Police | On or near Montague Park | E01016252 | Bracknell Forest 001C | Other theft | Under investigation
2 | 48064c2c5c76557638c7d6e9351d59dd3c6a9105a042d5... | 2024-09 | Thames Valley Police | On or near Park Lane | E01016252 | Bracknell Forest 001C | Other theft | Under investigation
3 | b88c4f1501e108884efed2d65a1affd624539250adb7c9... | 2024-09 | Thames Valley Police | On or near Lovel Lane | E01016252 | Bracknell Forest 001C | Other theft | Investigation complete; no suspect identified
4 | 1b31c1a8a16b1e04baddb82c0b993933388ccf7e8a3936... | 2024-09 | Thames Valley Police | On or near Lovel Road | E01016252 | Bracknell Forest 001C | Violence and sexual offences | Under investigation
... | ... | ... | ... | ... | ... | ... | ... | ...
15909 | d6814bb49b473226149688ad158d39667c0e9cf93ec676... | 2024-09 | Thames Valley Police | No Location | NaN | NaN | Other crime | Under investigation
15910 | 6545cbb38070967441ff0ab6e11d68fedf9ae473efe30a... | 2024-09 | Thames Valley Police | No Location | NaN | NaN | Other crime | Under investigation
15911 | a36eb3c08d3dcfc613037003f57796630a1dd5edf6c9ed... | 2024-09 | Thames Valley Police | No Location | NaN | NaN | Other crime | Awaiting court outcome
15912 | 4e8cf7cb5dede7bb61ed58559ca78ec6db80b0511c3465... | 2024-09 | Thames Valley Police | No Location | NaN | NaN | Other crime | Investigation complete; no suspect identified
15913 | dd5a226b82a5658ef985e246324d50eb7c967a1fe7eba4... | 2024-09 | Thames Valley Police | No Location | NaN | NaN | Other crime | Unable to prosecute suspect
Data Overview
In [13]:
# Check Data Types
print(data.dtypes)
Crime ID object
Month object
Reported by object
Location object
LSOA code object
LSOA name object
Crime type object
Last outcome category object
dtype: object
In [15]:
# Check for Missing Values
print(data.isnull().sum())
Crime ID 1512
Month 0
Reported by 0
Location 0
LSOA code 333
LSOA name 333
Crime type 0
Last outcome category 1512
dtype: int64
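Crime ID and Last outcome category are each missing in 1512 rows, which suggests (but does not prove) that the same rows lack both. A minimal sketch of one way to check this and to see which crime types the gaps fall in. The frame named sample is hypothetical: only the column names come from the notebook, the rows are made up.

```python
import pandas as pd

# Hypothetical miniature of the dataset: real column names, made-up rows.
sample = pd.DataFrame({
    "Crime ID": ["a1", None, "b2", None, "c3"],
    "Crime type": ["Burglary", "Anti-social behaviour", "Other theft",
                   "Anti-social behaviour", "Burglary"],
    "Last outcome category": ["Under investigation", None,
                              "Under investigation", None,
                              "Awaiting court outcome"],
})

# Rows where both fields are missing at the same time.
both_missing = (
    sample["Crime ID"].isna() & sample["Last outcome category"].isna()
).sum()

# Number of missing Crime IDs within each crime type.
missing_by_type = sample["Crime ID"].isna().groupby(sample["Crime type"]).sum()

print("Rows missing both fields:", both_missing)
print(missing_by_type)
```

Running the same two expressions with data in place of sample would show whether the missing values are concentrated in a single crime type.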
In [25]:
# Remove rows with null values in any of the key columns
data.dropna(
    subset=['Crime ID', 'LSOA code', 'LSOA name', 'Last outcome category'],
    inplace=True
)
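Because inplace=True overwrites data, the number of rows removed is not visible afterwards. The sketch below shows a possible alternative that keeps the original frame and reports how many rows were dropped. The names sample, required, cleaned and rows_removed are illustrative, and the rows are made up.

```python
import pandas as pd

# Hypothetical rows; only the column names come from the notebook.
sample = pd.DataFrame({
    "Crime ID": ["a1", None, "b2", "c3"],
    "LSOA code": ["E01016252", "E01016252", None, "E01016707"],
    "LSOA name": ["Bracknell Forest 001C", "Bracknell Forest 001C",
                  None, "Wokingham 020C"],
    "Last outcome category": ["Under investigation", None,
                              "Under investigation",
                              "Awaiting court outcome"],
})

required = ["Crime ID", "LSOA code", "LSOA name", "Last outcome category"]

# dropna returns a new frame here, so the original stays available.
cleaned = sample.dropna(subset=required)
rows_removed = len(sample) - len(cleaned)

print(f"Removed {rows_removed} of {len(sample)} rows")
print(cleaned.index.tolist())
```

dropna keeps the original index labels, so the labels of the cleaned frame need not run consecutively; cleaned.reset_index(drop=True) renumbers them if that is preferred.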
In [27]:
data
Out [27]:
index | Crime ID | Month | Reported by | Location | LSOA code | LSOA name | Crime type | Last outcome category
0 | 600dc88bebb3499405cf71237c6a89fa4f0e94f51d669d... | 2024-09 | Thames Valley Police | On or near Winkfield Lane | E01016252 | Bracknell Forest 001C | Burglary | Investigation complete; no suspect identified
1 | 96339937ce33810cdd4421d7960675e206d9a57e8cddd7... | 2024-09 | Thames Valley Police | On or near Montague Park | E01016252 | Bracknell Forest 001C | Other theft | Under investigation
2 | 48064c2c5c76557638c7d6e9351d59dd3c6a9105a042d5... | 2024-09 | Thames Valley Police | On or near Park Lane | E01016252 | Bracknell Forest 001C | Other theft | Under investigation
3 | b88c4f1501e108884efed2d65a1affd624539250adb7c9... | 2024-09 | Thames Valley Police | On or near Lovel Lane | E01016252 | Bracknell Forest 001C | Other theft | Investigation complete; no suspect identified
4 | 1b31c1a8a16b1e04baddb82c0b993933388ccf7e8a3936... | 2024-09 | Thames Valley Police | On or near Lovel Road | E01016252 | Bracknell Forest 001C | Violence and sexual offences | Under investigation
... | ... | ... | ... | ... | ... | ... | ... | ...
15576 | 885751ac442a3fc0986990d3560163d8583f04bb17668f... | 2024-09 | Thames Valley Police | On or near Frensham Road | E01016707 | Wokingham 020C | Other theft | Investigation complete; no suspect identified
15577 | 017181b524fa94ed54e287c474a5b1b8f7f4dbd5c616dc... | 2024-09 | Thames Valley Police | On or near Keats Way | E01016707 | Wokingham 020C | Violence and sexual offences | Unable to prosecute suspect
15578 | a051998aa292cb3247ab4edafca727c1730a7205564367... | 2024-09 | Thames Valley Police | On or near The Devil'S Highway | E01016708 | Wokingham 020D | Vehicle crime | Investigation complete; no suspect identified
15579 | df9a70642d0a6367afc606c425f0d05b27ad97f1d1c9ae... | 2024-09 | Thames Valley Police | On or near Butler Road | E01016709 | Wokingham 020E | Public order | Under investigation
15580 | a69c1481565d90268d5837fa23608b0143a467146ccca4... | 2024-09 | Thames Valley Police | On or near Wiltshire Avenue | E01016709 | Wokingham 020E | Violence and sexual offences | Under investigation
Statistical Summary
In [31]:
# Descriptive statistics (every column is object dtype, so describe() reports count, unique, top and freq)
print(data.describe())
Crime ID Month \
count 14402 15914
unique 14397 1
top 7f4757e9c065c9990940e66782f8a667b9887a8c51ae29... 2024-09
freq 2 15914
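The summary shows 14402 non-null Crime IDs but only 14397 unique values, with a top frequency of 2, so a few IDs appear more than once. A minimal sketch of how such repeats could be listed, using a made-up frame named sample (hypothetical rows):

```python
import pandas as pd

# Hypothetical rows: "a1" is repeated and two IDs are missing.
sample = pd.DataFrame({
    "Crime ID": ["a1", "b2", "a1", None, "c3", None],
    "Crime type": ["Burglary", "Drugs", "Burglary",
                   "Anti-social behaviour", "Robbery",
                   "Anti-social behaviour"],
})

# Drop missing IDs first: duplicated() treats NaN values as equal to
# each other, so they would otherwise be reported as repeats.
ids = sample["Crime ID"].dropna()

# keep=False marks every occurrence of a repeated ID, not just the later ones.
repeated = ids[ids.duplicated(keep=False)]

print("non-null:", ids.size, "unique:", ids.nunique())
print(repeated.value_counts())
```

Whether repeated IDs should be dropped depends on what they represent in the source data, so this only lists them.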
In [33]:
# Crime type count
crime_counts = data['Crime type'].value_counts()
print(crime_counts)
Crime type
Violence and sexual offences 5876
Shoplifting 1583
Anti-social behaviour 1512
Public order 1246
Other theft 1220
Vehicle crime 1123
Criminal damage and arson 1111
Drugs 527
Burglary 519
Bicycle theft 400
Other crime 315
Theft from the person 210
Robbery 151
Possession of weapons 121
Name: count, dtype: int64
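Raw counts are easier to compare as shares of the total. The sketch below rebuilds the printed counts as a Series and converts them to percentages; the variable name share is illustrative.

```python
import pandas as pd

# Counts copied from the value_counts() output above.
crime_counts = pd.Series({
    "Violence and sexual offences": 5876,
    "Shoplifting": 1583,
    "Anti-social behaviour": 1512,
    "Public order": 1246,
    "Other theft": 1220,
    "Vehicle crime": 1123,
    "Criminal damage and arson": 1111,
    "Drugs": 527,
    "Burglary": 519,
    "Bicycle theft": 400,
    "Other crime": 315,
    "Theft from the person": 210,
    "Robbery": 151,
    "Possession of weapons": 121,
}, name="count")

# Percentage of all recorded crimes, rounded to one decimal place.
share = (crime_counts / crime_counts.sum() * 100).round(1)

print("Total:", crime_counts.sum())
print(share.head(3))
```

On the DataFrame itself, data['Crime type'].value_counts(normalize=True) returns the same proportions (as fractions rather than percentages) in one call.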
In [37]:
# Count of each outcome category per crime type
crime_outcome_counts = data.groupby(['Crime type', 'Last outcome category']).size().reset_index(name='Count')
print(crime_outcome_counts)
Crime type \
0 Bicycle theft
1 Bicycle theft
2 Bicycle theft
3 Burglary
4 Burglary
.. ...
77 Violence and sexual offences
78 Violence and sexual offences
79 Violence and sexual offences
80 Violence and sexual offences
81 Violence and sexual offences
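Counts per (crime type, outcome) pair are hard to compare across crime types of very different sizes. One possible complement is the share of each outcome within a crime type, sketched here with pd.crosstab on a made-up frame named sample (hypothetical rows):

```python
import pandas as pd

# Hypothetical rows; column names match the notebook.
sample = pd.DataFrame({
    "Crime type": ["Burglary", "Burglary", "Burglary", "Burglary",
                   "Drugs", "Drugs"],
    "Last outcome category": ["Under investigation", "Under investigation",
                              "Under investigation", "Awaiting court outcome",
                              "Under investigation", "Awaiting court outcome"],
})

# normalize="index" divides each row by its total, so every crime type
# sums to 1 across the outcome columns.
outcome_share = pd.crosstab(
    sample["Crime type"],
    sample["Last outcome category"],
    normalize="index",
)

print(outcome_share.round(2))
```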
Visual Analysis
In [41]:
# Prepare the data for a stacked bar chart: one row per crime type, one column per outcome
crime_outcome_counts = data.groupby(['Crime type', 'Last outcome category']).size().unstack(fill_value=0)
# Group by 'Month' and 'Crime type' to get the count of each crime type per month
crime_counts = data.groupby(['Month', 'Crime type']).size().reset_index(name='Crime Count')
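The cell above builds the table for a stacked bar chart, but no plotting call appears in this export. A minimal sketch of how the chart might be drawn with pandas and matplotlib follows. The rows in sample, the axis labels and the title are assumptions, not taken from the notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical rows; column names match the notebook.
sample = pd.DataFrame({
    "Crime type": ["Burglary", "Burglary", "Drugs", "Drugs", "Drugs"],
    "Last outcome category": ["Under investigation", "Awaiting court outcome",
                              "Under investigation", "Under investigation",
                              "Awaiting court outcome"],
})

# Same reshaping as in the notebook: crime types as rows, outcomes as columns.
crime_outcome_counts = (
    sample.groupby(["Crime type", "Last outcome category"])
    .size()
    .unstack(fill_value=0)
)

# One bar per crime type, with one stacked segment per outcome category.
ax = crime_outcome_counts.plot(kind="bar", stacked=True, figsize=(8, 4))
ax.set_xlabel("Crime type")
ax.set_ylabel("Number of crimes")
ax.set_title("Outcomes by crime type (hypothetical data)")
ax.legend(title="Last outcome category")
plt.tight_layout()
```

In a notebook the figure renders inline, so the matplotlib.use("Agg") line can be dropped there.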
In [74]:
# Pivot table: locations as rows, crime types as columns, each cell the number of crimes
pivot_table = data.pivot_table(
    index=['Location'],
    columns=['Crime type'],
    aggfunc='size',
    fill_value=0
)
print(pivot_table)
In [76]:
# Group by Location and Crime Type
location_crime_aggregated = data.groupby(['Location', 'Crime type']).size().reset_index(name='Crime Count')
print(location_crime_aggregated)
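The pivot table and the grouped frame hold the same counts in wide and long form respectively. The sketch below illustrates this on a made-up frame named sample (hypothetical rows) and ranks locations by total crimes; the names wide, long_counts and totals are illustrative.

```python
import pandas as pd

# Hypothetical rows; column names match the notebook.
sample = pd.DataFrame({
    "Month": ["2024-09"] * 4,
    "Location": ["On or near Park Lane", "On or near Park Lane",
                 "On or near Lovel Road", "On or near Park Lane"],
    "Crime type": ["Other theft", "Other theft", "Burglary", "Burglary"],
})

# Wide form: one row per location, one column per crime type.
wide = sample.pivot_table(
    index="Location", columns="Crime type", aggfunc="size", fill_value=0
)

# Long form: one row per (location, crime type) pair that occurs.
long_counts = (
    sample.groupby(["Location", "Crime type"])
    .size()
    .reset_index(name="Crime Count")
)

# Total crimes per location, highest first.
totals = wide.sum(axis=1).sort_values(ascending=False)

print(wide)
print(long_counts)
print(totals)
```

The wide form includes zero cells for combinations that never occur, while the long form simply omits them, which tends to be more compact when there are many locations.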