Zomato Rating Prediction

The document outlines a mini-project focused on predicting Zomato restaurant ratings using a dataset containing 51,717 entries and 17 features. It details the steps taken for data cleaning, including handling missing values, removing unnecessary columns, and converting data types for analysis. The project aims to prepare the dataset for further analysis and modeling to predict restaurant ratings.


NAME : Kanade Shubhada Sanjay

ROLL NO. : 65
DIV : A

MINI-PROJECT
Zomato-rating-prediction

1. Importing the libraries


In [1]: import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

import warnings
warnings.filterwarnings('ignore')

1.1 Loading the dataset

In [2]: data = pd.read_csv('../input/zomato-bangalore-restaurants/zomato.csv')

In [3]: data
Out[3]:
[DataFrame preview truncated in the export: the output shows the first and last few of 51717 rows × 17 columns, with columns such as url, address, name, online_order and book_table.]

1.2 Checking the shape of the dataset

In [4]: data.shape

Out[4]: (51717, 17)

There are a total of 51,717 samples with 17 features.

In [5]: data.columns

Out[5]: Index(['url', 'address', 'name', 'online_order', 'book_table', 'rate', 'votes',
'phone', 'location', 'rest_type', 'dish_liked', 'cuisines',
'approx_cost(for two people)', 'reviews_list', 'menu_item',
'listed_in(type)', 'listed_in(city)'],
dtype='object')

1.3 Checking the datatypes

In [6]: data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 51717 entries, 0 to 51716
Data columns (total 17 columns):
# Column Non-Null Count Dtype
0 url 51717 non-null object
1 address 51717 non-null object
2 name 51717 non-null object
3 online_order 51717 non-null object
4 book_table 51717 non-null object
5 rate 43942 non-null object
6 votes 51717 non-null int64
7 phone 50509 non-null object
8 location 51696 non-null object
9 rest_type 51490 non-null object
10 dish_liked 23639 non-null object
11 cuisines 51672 non-null object
12 approx_cost(for two people) 51371 non-null object
13 reviews_list 51717 non-null object
14 menu_item 51717 non-null object
15 listed_in(type) 51717 non-null object
16 listed_in(city) 51717 non-null object
dtypes: int64(1), object(16)
memory usage: 6.7+ MB

There are many object-type columns. Later we will convert the relevant object columns to numeric types so they can be used for analysis and modeling.
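
As a forward-looking sketch (not part of the original notebook, and using a hypothetical helper name), the yes/no object columns such as 'online_order' and 'book_table' could be mapped to integer codes before modeling, for example:

# Hypothetical sketch: encode categorical object columns as integer codes.
# 'encode_categoricals' is an illustrative name, not from the notebook.
def encode_categoricals(frame, columns):
    frame = frame.copy()
    for col in columns:
        # pandas category codes; missing values become -1
        frame[col] = frame[col].astype('category').cat.codes
    return frame

# Example usage on the raw dataframe loaded above:
# encoded = encode_categoricals(data, ['online_order', 'book_table'])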
2. Data Cleaning
2.1 Checking the missing values

In [7]: data.isnull().sum()

Out[7]: url 0
address 0
name 0
online_order 0
book_table 0
rate 7775
votes 0
phone 1208
location 21
rest_type 227
dish_liked 28078
cuisines 45
approx_cost(for two people) 346
reviews_list 0
menu_item 0
listed_in(type) 0
listed_in(city) 0
dtype: int64

There are many null values. We can clearly see that the 'rate', 'phone', 'location', 'rest_type', 'dish_liked', 'cuisines' and 'approx_cost(for two people)' columns have missing values, so first we have to handle them.
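
Before dropping anything, it can also help to look at the share of missing values per column to decide between dropping and imputing. A minimal sketch (not from the original notebook):

# Percentage of missing values per column, largest first
missing_pct = data.isnull().mean().mul(100).sort_values(ascending=False)
print(missing_pct[missing_pct > 0].round(2))

A column such as 'dish_liked', with over half of its values missing, would be a candidate for imputation or removal rather than row-wise dropping.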

2.2 Removing the unnecessary columns from the data

In [8]: df = data.drop(['url', 'phone'], axis = 1) # dropped 'url' and 'phone' columns

In [9]: df.head()
Out[9]:
[DataFrame preview truncated in the export: the first five rows after dropping 'url' and 'phone', with columns address, name, online_order, book_table, rate, votes, location, rest_type, dish_liked, ...]
2.3 Handling the null or missing values

In [10]: df.dropna(inplace = True)

In [11]: df.isnull().sum()

Out[11]: address 0
name 0
online_order 0
book_table 0
rate 0
votes 0
location 0
rest_type 0
dish_liked 0
cuisines 0
approx_cost(for two people) 0
reviews_list 0
menu_item 0
listed_in(type) 0
listed_in(city) 0
dtype: int64

Now there are no null values.


2.4 Checking and handling duplicate values

In [12]: df.duplicated().sum()

Out[12]: 11

In [13]: df.drop_duplicates(inplace = True)
df.duplicated().sum()

Out[13]: 0

Now there are no duplicate values.


2.5 Renaming the columns appropriately

In [14]: df = df.rename(columns = {'approx_cost(for two people)':'cost',
                                   'listed_in(type)':'type', 'listed_in(city)': 'city'})

In [15]: df.head()

Out[15]: [DataFrame preview truncated in the export: the first five rows of df with the renamed columns in place.]

Successfully renamed the columns.


2.6 Cleaning the 'cost' column

In [16]: df['cost'].unique()

Out[16]: array(['800', '300', '600', '700', '550', '500', '450', '650', '400',
'750', '200', '850', '1,200', '150', '350', '250', '1,500',
'1,300', '1,000', '100', '900', '1,100', '1,600', '950', '230',
'1,700', '1,400', '1,350', '2,200', '2,000', '1,800', '1,900',
'180', '330', '2,500', '2,100', '3,000', '2,800', '3,400', '40',
'1,250', '3,500', '4,000', '2,400', '1,450', '3,200', '6,000',
'1,050', '4,100', '2,300', '120', '2,600', '5,000', '3,700',
'1,650', '2,700', '4,500'], dtype=object)

Here we can see that the data points are strings, and some values, like 5,000 and 6,000, contain a comma (,). We have to remove the ',' from the values and convert them to a numeric type.

In [17]: df['cost'] = df['cost'].apply(lambda x: x.replace(',', ''))  # remove the thousands separator ','
df['cost'] = df['cost'].astype(float)

df['cost'].unique()

Out[17]: array([ 800., 300., 600., 700., 550., 500., 450., 650., 400.,
750., 200., 850., 1200., 150., 350., 250., 1500., 1300.,
1000., 100., 900., 1100., 1600., 950., 230., 1700., 1400.,
1350., 2200., 2000., 1800., 1900., 180., 330., 2500., 2100.,
3000., 2800., 3400., 40., 1250., 3500., 4000., 2400., 1450.,
3200., 6000., 1050., 4100., 2300., 120., 2600., 5000., 3700.,
1650., 2700., 4500.])

Now we have successfully converted the values to a numeric type.
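
An alternative, slightly more defensive conversion (shown only as a sketch, not the notebook's approach) is pd.to_numeric with errors='coerce', which turns any value that still fails to parse into NaN instead of raising an exception:

# Sketch: applied to the raw string column of the original dataframe,
# coercing anything unparseable to NaN instead of raising an error.
raw_cost = data['approx_cost(for two people)']
cost_numeric = pd.to_numeric(raw_cost.str.replace(',', '', regex=False),
                             errors='coerce')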

2.7 Handling the 'rate' column

In [18]: df['rate'].unique()

Out[18]: array(['4.1/5', '3.8/5', '3.7/5', '4.6/5', '4.0/5', '4.2/5', '3.9/5',
'3.0/5', '3.6/5', '2.8/5', '4.4/5', '3.1/5', '4.3/5', '2.6/5',
'3.3/5', '3.5/5', '3.8 /5', '3.2/5', '4.5/5', '2.5/5', '2.9/5',
'3.4/5', '2.7/5', '4.7/5', 'NEW', '2.4/5', '2.2/5', '2.3/5',
'4.8/5', '3.9 /5', '4.2 /5', '4.0 /5', '4.1 /5', '2.9 /5',
'2.7 /5', '2.5 /5', '2.6 /5', '4.5 /5', '4.3 /5', '3.7 /5',
'4.4 /5', '4.9/5', '2.1/5', '2.0/5', '1.8/5', '3.4 /5', '3.6 /5',
'3.3 /5', '4.6 /5', '4.9 /5', '3.2 /5', '3.0 /5', '2.8 /5',
'3.5 /5', '3.1 /5', '4.8 /5', '2.3 /5', '4.7 /5', '2.4 /5',
'2.1 /5', '2.2 /5', '2.0 /5', '1.8 /5'], dtype=object)

Here the rating column is also of string type; we have to convert it to numeric by removing the '/5' from the given values.

There is also a 'NEW' value which makes no sense, so we have to remove those rows.

In [19]: df = df.loc[df.rate != 'NEW'] # getting rid of 'NEW'

In [20]: df['rate'].unique()
Out[20]: array(['4.1/5', '3.8/5', '3.7/5', '4.6/5', '4.0/5', '4.2/5', '3.9/5',
'3.0/5', '3.6/5', '2.8/5', '4.4/5', '3.1/5', '4.3/5', '2.6/5',
'3.3/5', '3.5/5', '3.8 /5', '3.2/5', '4.5/5', '2.5/5', '2.9/5',
'3.4/5', '2.7/5', '4.7/5', '2.4/5', '2.2/5', '2.3/5', '4.8/5',
'3.9 /5', '4.2 /5', '4.0 /5', '4.1 /5', '2.9 /5', '2.7 /5',
'2.5 /5', '2.6 /5', '4.5 /5', '4.3 /5', '3.7 /5', '4.4 /5',
'4.9/5', '2.1/5', '2.0/5', '1.8/5', '3.4 /5', '3.6 /5', '3.3 /5',
'4.6 /5', '4.9 /5', '3.2 /5', '3.0 /5', '2.8 /5', '3.5 /5',
'3.1 /5', '4.8 /5', '2.3 /5', '4.7 /5', '2.4 /5', '2.1 /5',
'2.2 /5', '2.0 /5', '1.8 /5'], dtype=object)

In [21]: df['rate'] = df['rate'].apply(lambda x:x.replace('/5', ''))

df['rate'].unique()

Out[21]: array(['4.1', '3.8', '3.7', '4.6', '4.0', '4.2', '3.9', '3.0', '3.6',
'2.8', '4.4', '3.1', '4.3', '2.6', '3.3', '3.5', '3.8 ', '3.2',
'4.5', '2.5', '2.9', '3.4', '2.7', '4.7', '2.4', '2.2', '2.3',
'4.8', '3.9 ', '4.2 ', '4.0 ', '4.1 ', '2.9 ', '2.7 ', '2.5 ',
'2.6 ', '4.5 ', '4.3 ', '3.7 ', '4.4 ', '4.9', '2.1', '2.0', '1.8',
'3.4 ', '3.6 ', '3.3 ', '4.6 ', '4.9 ', '3.2 ', '3.0 ', '2.8 ',
'3.5 ', '3.1 ', '4.8 ', '2.3 ', '4.7 ', '2.4 ', '2.1 ', '2.2 ',
'2.0 ', '1.8 '], dtype=object)

In [22]: df['rate'] = df['rate'].apply(lambda x: float(x))  # float() also strips the stray trailing spaces, e.g. '3.8 '

df['rate']

Out[22]: 0 4.1
1 4.1
2 3.8
3 3.7
4 3.8
...
51705 3.8
51707 3.9
51708 2.8
51711 2.5
51715 4.3
Name: rate, Length: 23248, dtype: float64

Now our data is cleaned and we can perform visualization. After dropping nulls, duplicates and the 'NEW' ratings, the dataframe has 23,248 rows (down from 51,717).

3. Data Visualization
3.1 Most famous restaurant chains in Bangalore

In [23]: plt.figure(figsize = (17,10))
chains = df['name'].value_counts()[:20]
sns.barplot(x = chains, y = chains.index, palette = 'deep')
plt.title('Most famous restaurant chains in bangalore')
plt.xlabel('Number of outlets')
plt.show()

Insights:

'Onesta', 'Empire Restaurant' & 'KFC' are the most widespread restaurant chains in Bangalore.


3.2 Checking whether restaurants accept online orders

In [24]: v = df['online_order'].value_counts()
fig = plt.gcf()
fig.set_size_inches((10,6))
cmap = plt.get_cmap('Set3')
color = cmap(np.arange(len(v)))

plt.pie(v, labels = v.index, wedgeprops = dict(width = 0.6), autopct = '%0.02f', shadow = True, colors = color)

plt.title('Online orders', fontsize = 20)
plt.show()

Insight:
Most restaurants offer the option of online ordering and delivery.

3.3 Checking whether restaurants offer table booking

In [25]: v = df['book_table'].value_counts()

fig = plt.gcf()
fig.set_size_inches((8,6))
cmap = plt.get_cmap('Set1')
color = cmap(np.arange(len(v)))

plt.pie(v, labels = v.index, wedgeprops = dict(width = 0.6), autopct = '%0.02f', shadow = True, colors = color)

plt.title('Book Table', fontsize = 20)
plt.show()

Insight:

Most restaurants do not offer table booking.

3.4 Rating Distribution

In [26]: plt.figure(figsize = (9,7))
sns.distplot(df['rate'])
plt.title('Rating Distribution')

Out[26]: Text(0.5, 1.0, 'Rating Distribution')


Insight:

We can infer from the plot above that most of the ratings lie between 3.5 and 4.5.
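
To check this impression numerically, one could compute the share of ratings in that interval; a quick sketch (not from the original notebook):

# Fraction of cleaned ratings lying between 3.5 and 4.5 (inclusive)
share = df['rate'].between(3.5, 4.5).mean()
print(f'{share:.1%} of ratings lie between 3.5 and 4.5')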
