Customer Service Requests Analysis - Solution

Uploaded by

monimugdhabora
9/16/2021 Customer Service Requests Analysis

In [1]: %matplotlib inline


### import libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib import style
import seaborn as sns
import datetime

In [2]: df = pd.read_csv('D:/311_Service_Requests_from_2010_to_Present.csv', header=0, sep='

D:\SoftwarePrograms\anaconda\lib\site-packages\IPython\core\interactiveshell.py:314
6: DtypeWarning: Columns (48,49) have mixed types.Specify dtype option on import or
set low_memory=False.
has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
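The DtypeWarning above names two remedies: pass an explicit `dtype` for the mixed columns, or set `low_memory=False`. The following is a minimal sketch on a tiny in-memory CSV. The sample rows, the choice of `str` for the `Landmark` column, and the date format string are assumptions for illustration, not the notebook's actual `read_csv` call (which is cut off in this export).

```python
import io
import pandas as pd

# Tiny stand-in for the 311 CSV; the rows are made up for illustration.
csv_text = """Unique Key,Created Date,Closed Date,Landmark
1,12/31/2015 11:59:45 PM,01/01/2016 12:55:00 AM,
2,12/31/2015 11:59:44 PM,01/01/2016 01:26:00 AM,PARK
"""

sample = pd.read_csv(
    io.StringIO(csv_text),
    index_col="Unique Key",
    dtype={"Landmark": str},   # explicit dtype for a column with mixed content
    low_memory=False,          # infer column types from the whole file at once
)

# Convert the date columns explicitly so that they can be subtracted later.
for col in ["Created Date", "Closed Date"]:
    sample[col] = pd.to_datetime(sample[col], format="%m/%d/%Y %I:%M:%S %p")

print(sample.dtypes)
```

With datetime columns in place, `sample["Closed Date"] - sample["Created Date"]` yields a Timedelta column, which is what the later `Request_Closing_Time` computation relies on.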

In [3]: df.head()

Out[3]:
Unique Key  Created Date         Closed Date          Agency  Agency Name                      Complaint Type           Descriptor                    Location Type    Incid
32310363    2015-12-31 23:59:45  2016-01-01 00:55:00  NYPD    New York City Police Department  Noise - Street/Sidewalk  Loud Music/Party              Street/Sidewalk  1003
32309934    2015-12-31 23:59:44  2016-01-01 01:26:00  NYPD    New York City Police Department  Blocked Driveway         No Access                     Street/Sidewalk  1110
32309159    2015-12-31 23:59:29  2016-01-01 04:51:00  NYPD    New York City Police Department  Blocked Driveway         No Access                     Street/Sidewalk  1045
32305098    2015-12-31 23:57:46  2016-01-01 07:43:00  NYPD    New York City Police Department  Illegal Parking          Commercial Overnight Parking  Street/Sidewalk  1046
32306529    2015-12-31 23:56:58  2016-01-01 03:24:00  NYPD    New York City Police Department  Illegal Parking          Blocked Sidewalk              Street/Sidewalk  1137

5 rows × 52 columns

In [4]: df.shape

Out[4]: (300698, 52)

In [5]: df.columns

Out[5]: Index(['Created Date', 'Closed Date', 'Agency', 'Agency Name',
       'Complaint Type', 'Descriptor', 'Location Type', 'Incident Zip',
       'Incident Address', 'Street Name', 'Cross Street 1', 'Cross Street 2',
       'Intersection Street 1', 'Intersection Street 2', 'Address Type',
       'City', 'Landmark', 'Facility Type', 'Status', 'Due Date',
       'Resolution Description', 'Resolution Action Updated Date',
       'Community Board', 'Borough', 'X Coordinate (State Plane)',
       'Y Coordinate (State Plane)', 'Park Facility Name', 'Park Borough',
       'School Name', 'School Number', 'School Region', 'School Code',
       'School Phone Number', 'School Address', 'School City', 'School State',
       'School Zip', 'School Not Found', 'School or Citywide Complaint',
       'Vehicle Type', 'Taxi Company Borough', 'Taxi Pick Up Location',
       'Bridge Highway Name', 'Bridge Highway Direction', 'Road Ramp',
       'Bridge Highway Segment', 'Garage Lot Name', 'Ferry Direction',
       'Ferry Terminal Name', 'Latitude', 'Longitude', 'Location'],
      dtype='object')

In [6]: df.describe()

Out[6]:
        Incident Zip  X Coordinate (State Plane)  Y Coordinate (State Plane)  School or Citywide Complaint  Vehicle Type  Taxi Company Borough  Taxi Pick Up Location  Garage Lot Name
count  298083.000000                2.971580e+05               297158.000000                           0.0           0.0                   0.0                    0.0              0.0
mean    10848.888645                1.004854e+06               203754.534416                           NaN           NaN                   NaN                    NaN              NaN
std       583.182081                2.175338e+04                29880.183529                           NaN           NaN                   NaN                    NaN              NaN
min        83.000000                9.133570e+05               121219.000000                           NaN           NaN                   NaN                    NaN              NaN
25%     10310.000000                9.919752e+05               183343.000000                           NaN           NaN                   NaN                    NaN              NaN
50%     11208.000000                1.003158e+06               201110.500000                           NaN           NaN                   NaN                    NaN              NaN
75%     11238.000000                1.018372e+06               224125.250000                           NaN           NaN                   NaN                    NaN              NaN
max     11697.000000                1.067173e+06               271876.000000                           NaN           NaN                   NaN                    NaN              NaN

In [7]: df['Descriptor'].unique()

Out[7]: array(['Loud Music/Party', 'No Access', 'Commercial Overnight Parking',


'Blocked Sidewalk', 'Posted Parking Sign Violation',
'Blocked Hydrant', 'With License Plate', 'Partial Access',
'Unauthorized Bus Layover', 'Double Parked Blocking Vehicle',
'Double Parked Blocking Traffic', 'Vehicle', 'Loud Talking',
'Banging/Pounding', 'Car/Truck Music', 'Tortured',
'In Prohibited Area', 'Congestion/Gridlock', 'Neglected',
'Car/Truck Horn', 'In Public', 'Other (complaint details)', nan,
'No Shelter', 'Truck Route Violation', 'Unlicensed',
'Overnight Commercial Storage', 'Engine Idling',
'After Hours - Licensed Est', 'Detached Trailer',
'Underage - Licensed Est', 'Chronic Stoplight Violation',
'Loud Television', 'Chained', 'Building', 'In Car',
'Police Report Requested', 'Chronic Speeding',
'Playing in Unsuitable Place', 'Drag Racing',
'Police Report Not Requested', 'Nuisance/Truant', 'Homeless Issue',
'Language Access Complaint', 'Disruptive Passenger',
'Animal Waste'], dtype=object)

In [8]: complaintTypecity = pd.DataFrame({'count': df.groupby(['Complaint Type','City']).siz


complaintTypecity

Out[8]:
     Complaint Type  City           count
0    Animal Abuse    ARVERNE           38
1    Animal Abuse    ASTORIA          125
2    Animal Abuse    BAYSIDE           37
3    Animal Abuse    BELLEROSE          7
4    Animal Abuse    BREEZY POINT       2
...  ...             ...              ...
759  Vending         STATEN ISLAND     25
760  Vending         SUNNYSIDE         15
761  Vending         WHITESTONE         1
762  Vending         WOODHAVEN          6
763  Vending         WOODSIDE          15

764 rows × 3 columns
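The statement in In [8] is cut off in this export, but the output is a flat table of counts per complaint type and city. One common way to build such a table is `groupby(...).size()` followed by `reset_index(name=...)`. The sketch below uses made-up rows and is not necessarily the exact expression the notebook used.

```python
import pandas as pd

# Made-up rows standing in for the 311 data.
toy = pd.DataFrame({
    "Complaint Type": ["Vending", "Vending", "Animal Abuse", "Vending"],
    "City": ["WOODSIDE", "WOODSIDE", "ASTORIA", "SUNNYSIDE"],
})

# size() counts rows per (Complaint Type, City) pair;
# reset_index(name=...) turns the resulting Series into a flat DataFrame.
counts = (
    toy.groupby(["Complaint Type", "City"])
       .size()
       .reset_index(name="count")
)
print(counts)
```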

In [9]: df.groupby(['Borough','Complaint Type','Descriptor']).size()

Out[9]:
Borough      Complaint Type         Descriptor
BRONX        Animal Abuse           Chained                      132
                                    In Car                        36
                                    Neglected                    673
                                    No Shelter                    71
                                    Other (complaint details)    311
                                                                ...
Unspecified  Noise - Vehicle        Engine Idling                 11
             Posting Advertisement  Vehicle                        1
             Traffic                Truck Route Violation          1
             Vending                In Prohibited Area             2
                                    Unlicensed                     5
Length: 288, dtype: int64


In [10]: #df = pd.read_csv("D:\311_Service_Requests_from_2010_to_Present.csv", parse_dates=["

In [11]: df["Request_Closing_Time"] = df["Closed Date"] - df["Created Date"]

In [12]: #Have a look at the status of tickets


df['Status'].value_counts().plot(kind='bar',alpha=0.6,figsize=(7,7))
plt.show()

Complaint type breakdown with a bar plot to identify the majority complaint types and the top 10 complaints
In [13]: #Complaint type Breakdown with bar plot to figure out majority of complaint types an
df['Complaint Type'].value_counts().head(10).plot(kind='barh',figsize=(5,5));

In [14]: df.groupby(["Borough","Complaint Type","Descriptor"]).size()

Out[14]:
Borough      Complaint Type         Descriptor
BRONX        Animal Abuse           Chained                      132
                                    In Car                        36
                                    Neglected                    673
                                    No Shelter                    71
                                    Other (complaint details)    311
                                                                ...
Unspecified  Noise - Vehicle        Engine Idling                 11
             Posting Advertisement  Vehicle                        1
             Traffic                Truck Route Violation          1
             Vending                In Prohibited Area             2
                                    Unlicensed                     5
Length: 288, dtype: int64

Most Common Complaints


In [15]: # Drop rows without a complaint type, then count requests per type
majorcomplints=df.dropna(subset=["Complaint Type"])
# Group the filtered frame (the original grouped df again, discarding the dropna)
majorcomplints=majorcomplints.groupby("Complaint Type")
sortedComplaintType = majorcomplints.size().sort_values(ascending =False)
sortedComplaintType = sortedComplaintType.to_frame('count').reset_index()
sortedComplaintType.head(10)

Out[15]: Complaint Type count

0 Blocked Driveway 77044

1 Illegal Parking 75361

2 Noise - Street/Sidewalk 48612

3 Noise - Commercial 35577

4 Derelict Vehicle 17718

5 Noise - Vehicle 17083

6 Animal Abuse 7778

7 Traffic 4498

8 Homeless Encampment 4416

9 Noise - Park 4042

In [16]: sortedComplaintType = sortedComplaintType.head()


plt.figure(figsize=(5,5))
plt.pie(sortedComplaintType['count'],labels=sortedComplaintType["Complaint Type"], a
plt.show()

In [17]: def prepareData(df):
    df['Request_Closing_Time'] = ((df['Closed Date'] - df['Created Date']).dt.second
    df_clean=df[df['Request_Closing_Time'].notnull()]
    df_perfect = df_clean[df_clean['Closed Date'] >= df_clean['Created Date']]
    return df_perfect

df_perfect=prepareData(df)
(df_perfect['Complaint Type'].value_counts()).head(15).plot(kind='bar',figsize=(10,6

Out[17]: <AxesSubplot:title={'center':'Most Common Complaints'}>
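The closing-time expression in `prepareData` is cut off in this export, so it is unclear which accessor it ends with. If it relies on `.dt.seconds`, be aware that this attribute returns only the seconds component of each Timedelta and ignores whole days, whereas `.dt.total_seconds()` covers the full duration. A small sketch with made-up timestamps shows the difference; the conversion to hours is an assumption for illustration.

```python
import pandas as pd

created = pd.Series(pd.to_datetime(["2015-12-31 23:59:45", "2015-12-30 08:00:00"]))
closed = pd.Series(pd.to_datetime(["2016-01-01 00:55:00", "2016-01-01 09:00:00"]))
delta = closed - created  # second request took 2 days and 1 hour

# .dt.seconds keeps only the seconds component (0..86399); whole days are dropped.
hours_component = delta.dt.seconds / 3600
# .dt.total_seconds() includes the whole days as well.
hours_total = delta.dt.total_seconds() / 3600

print(hours_component.round(2).tolist())
print(hours_total.round(2).tolist())
```

For requests that stay open longer than a day, the two columns diverge, which in turn shifts any mean closing time computed from them.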

In [18]: (df_perfect['Complaint Type'].value_counts()).head(15).plot(kind='bar',figsize=(10,6

Out[18]: <AxesSubplot:title={'center':'Most Common Complaints'}>

Least Frequent Complaints
In [19]: (df_perfect['Complaint Type'].value_counts()).tail(10).plot(kind='bar',figsize=(10,6

Out[19]: <AxesSubplot:title={'center':'Least Frequent Complaints'}>

In [20]: df['Borough'].value_counts().plot(kind='pie',autopct='%1.1f%%',explode=(0.15,0,0,0,0

Out[20]: <AxesSubplot:ylabel='Borough'>

In [21]: df_perfect['Location Type'].value_counts().plot(kind='bar',figsize=(10,8),title='Loc

Out[21]: <AxesSubplot:title={'center':'Location type Vs Complain'}>

In [22]: groupedby_complainttype= df_perfect.sort_values(['Request_Closing_Time']).groupby(['

In [23]: dataFrameByLocationType = pd.DataFrame(groupedby_complainttype)


dataFrameByLocationType

Out[23]:
                                           Request_Closing_Time
Location Type        Complaint Type
Bridge               Homeless Encampment               3.819306
Club/Bar/Restaurant  Drinking                          4.019785
                     Noise - Commercial                2.891485
                     Urinating in Public               4.491429
Commercial           Animal Abuse                      4.568575
...                  ...                                    ...
Street/Sidewalk      Urinating in Public               3.209283
                     Vending                           3.791013
Subway Station       Animal Abuse                      3.035606
                     Urinating in Public               1.152130
Vacant Lot           Derelict Vehicle                  4.045354

69 rows × 1 columns
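The grouping statement in In [22] is truncated in this export. The shape of Out[23] (one closing-time value per Location Type and Complaint Type pair) is what a grouped mean would produce, so the following sketch assumes a mean aggregation. The rows and values are made up, and the notebook may have used a different aggregate.

```python
import pandas as pd

toy = pd.DataFrame({
    "Location Type": ["Bridge", "Subway Station", "Subway Station", "Subway Station"],
    "Complaint Type": ["Homeless Encampment", "Animal Abuse", "Animal Abuse", "Urinating in Public"],
    "Request_Closing_Time": [3.5, 2.0, 4.0, 1.0],  # made-up values
})

# Double brackets keep the result a one-column DataFrame with a two-level index.
mean_by_group = (
    toy.groupby(["Location Type", "Complaint Type"])[["Request_Closing_Time"]]
       .mean()
)
print(mean_by_group)
```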

Hypothesis testing
Whether the average response time across complaint types is similar or not (overall)

Are the type of complaint or service requested and location related?

Whether the average response time across complaint types is similar or not (overall)
In [24]: df_complain_and_average = df_perfect.groupby('Complaint Type').Request_Closing_Time

In [25]: df_complain_and_average = pd.DataFrame(df_complain_and_average)

In [26]: average_response_time = df_perfect['Request_Closing_Time'].mean()


average_response_time

Out[26]: 3.929396621862499

In [27]: df_perfect.shape
sample_data = df_perfect.sample(n=2000)
# ttest_1samp tests the null hypothesis that the sample mean equals the given mean
Hnull = "Response time across complaint types is similar"
Halt = "Response time across complaint types is not similar"

In [28]: from scipy.stats import ttest_1samp

In [29]: ttest,pvalue = ttest_1samp(sample_data['Request_Closing_Time'],average_response_time

if pvalue<0.005:
    print("Reject the null hypothesis i.e",end=' ')
    print(Halt)
else:
    print("Fail to reject the null hypothesis i.e",end=' ')
    print(Hnull)

Fail to reject the null hypothesis i.e Response time across complaint types is similar
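A one-sample t-test compares a single sample mean with a fixed value, so it does not compare complaint types with each other. A test that addresses the stated question more directly is a one-way ANOVA across the per-type groups. The sketch below is an alternative under that assumption, using `scipy.stats.f_oneway` on made-up data; it is not the notebook's method.

```python
import pandas as pd
from scipy.stats import f_oneway

# Made-up closing times (hours) for three hypothetical complaint types.
toy = pd.DataFrame({
    "Complaint Type": ["A"] * 5 + ["B"] * 5 + ["C"] * 5,
    "Request_Closing_Time": [1.0, 1.2, 0.9, 1.1, 1.0,
                             3.0, 3.2, 2.9, 3.1, 3.0,
                             5.0, 5.1, 4.9, 5.2, 5.0],
})

# One array of closing times per complaint type.
groups = [g["Request_Closing_Time"].to_numpy()
          for _, g in toy.groupby("Complaint Type")]

# Null hypothesis: all complaint types share the same mean closing time.
f_stat, p_value = f_oneway(*groups)
alpha = 0.05
if p_value < alpha:
    print("Reject the null hypothesis: mean closing time differs across complaint types")
else:
    print("Fail to reject the null hypothesis: no evidence that mean closing time differs")
```

ANOVA assumes roughly normal groups with similar variances; for heavily skewed closing times a rank-based test such as `scipy.stats.kruskal` may be a safer choice.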

Are the type of complaint or service requested and location related?
In [30]: Hnull = "there is no relation between the complaint and location"
Halt = "there is a relation between the complaint and location"

In [31]: datatable = pd.crosstab(df_perfect['Complaint Type'],df_perfect['Location Type'])

In [32]: datatable.head()

Out[32]:
Location Type              Bridge  Club/Bar/Restaurant  Commercial  Highway  House and Store  House of Worship  Park  Park/
Complaint Type
Animal Abuse                    0                    0          62        0               93                 0     0
Animal in a Park                0                    0           0        0                0                 0     1
Bike/Roller/Skate Chronic       0                    0           0        0                0                 0     0
Blocked Driveway                0                    0           0        0                0                 0     0
Derelict Vehicle                0                    0           0       13                0                 0     0

In [33]: observed_values = datatable.values

In [34]: from scipy import stats

# chi2_contingency returns the statistic, the p-value, the degrees of freedom and the expected counts
val = stats.chi2_contingency(datatable)
pvalue = val[1]
alpha = 0.05

In [35]: if pvalue<alpha:
    print("Reject the null Hypothesis i.e ",end=' ')
    print(Halt)
else:
    print("Fail to reject the null Hypothesis i.e",end=' ')
    print(Hnull)

Reject the null Hypothesis i.e there is a relation between the complaint and location
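The chi-square test of independence compares the observed crosstab with the counts expected if complaint type and location were unrelated. A minimal sketch on a made-up 2×2 table shows all four values returned by `stats.chi2_contingency`; the category names and counts are assumptions for illustration.

```python
import pandas as pd
from scipy import stats

# Made-up data: 80 requests, two complaint types, two location types.
toy = pd.DataFrame({
    "Complaint Type": ["Noise"] * 40 + ["Parking"] * 40,
    "Location Type": ["Park"] * 30 + ["Street"] * 10 + ["Park"] * 10 + ["Street"] * 30,
})
table = pd.crosstab(toy["Complaint Type"], toy["Location Type"])

# Returns the test statistic, the p-value, the degrees of freedom,
# and the expected counts under independence.
chi2, pvalue, dof, expected = stats.chi2_contingency(table)

print(dof)        # (rows - 1) * (columns - 1)
print(expected)   # row total * column total / grand total, per cell
```

Inspecting `expected` is worthwhile on the real crosstab: a common rule of thumb is that the chi-square approximation becomes unreliable when many expected counts are small (often quoted as below 5), which is likely for a sparse table with many zero cells.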
