
Airbnb PDF

The document discusses using machine learning models to classify Airbnb hosts as Superhosts without requiring them to apply, by analyzing variables like review scores, response times, authenticity, and home information from a dataset of over 23,000 Airbnb listings. Several supervised learning classifiers like random forests and gradient boosting machines were tested on a training and test dataset, with random forests performing best at predicting Superhost status. The models can help Airbnb more easily identify Superhosts based on their hosting performance and reviews.

Uploaded by

Yiyang Wang

Superhost Program × Airbnb Plus

InfiniteLoop
Camille Sambor, Yiyang Wang, Rebecca Lu, Jiaxin Liu, Yifan He, Jiaqi Xu
Introduction

● Obtained Airbnb Superhost Dataset

● What is a Superhost?
○ 4.8+ overall rating
○ 10+ stays
○ <1% cancellation rate
○ 90% response rate
● How can Superhost benefit Airbnb?
● What is the problem?
○ Long pending times & long application form
○ COVID-19 may influence the cancellation rate of Superhosts
● Main objective:
○ Create an algorithm to help Airbnb more easily classify a Superhost without the need for an application process.
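
The four thresholds quoted on this slide (4.8+ overall rating, 10+ stays, <1% cancellation rate, 90% response rate) can be read as a simple rule check. The sketch below is only an illustration of that reading: the `HostStats` class and its field names are hypothetical, and treating the rating and response-rate thresholds as inclusive is an assumption.

```python
from dataclasses import dataclass


@dataclass
class HostStats:
    """Hypothetical per-host summary; field names are illustrative only."""
    overall_rating: float     # average rating on a 5-point scale
    stays: int                # number of completed stays
    cancellation_rate: float  # fraction in [0, 1]
    response_rate: float      # fraction in [0, 1]


def meets_superhost_criteria(h: HostStats) -> bool:
    """Apply the four thresholds quoted on the slide.

    Assumption: "4.8+", "10+" and "90%" are inclusive bounds,
    while "<1%" is a strict bound.
    """
    return (
        h.overall_rating >= 4.8
        and h.stays >= 10
        and h.cancellation_rate < 0.01
        and h.response_rate >= 0.90
    )
```

A rule check like this only restates the existing criteria; the objective above is to learn the classification from listing data instead.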
Methodology
Dataset from Airbnb official website (Airbnb 'Superhost' program): 23,452 rows × 106 columns
Ordinary Host: 8,374
Superhost: 5,733

1. Clean data
2. Exploratory Data Analysis
3. Variable Selection
● Review scores
● Timeliness / Promptness
● Authenticity
● House information
4. Modeling and Prediction
● Binary classification problem
● Supervised learning - Classification
● 80% Training dataset, 20% Testing dataset
○ Impute missing values with median
○ Prediction
○ Model Evaluation
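
The 80% / 20% split and the median imputation named in the methodology can be sketched in plain Python as follows. This is a minimal illustration, not the team's code: the column names and toy rows are made up, and computing the medians on the training rows only (then reusing them for the test rows) is an assumption, since the slide does not say where the medians come from.

```python
import random
from statistics import median


def train_test_split_rows(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and split them into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def column_medians(rows, columns):
    """Median of each listed column over its non-missing (not None) values."""
    return {
        c: median(r[c] for r in rows if r[c] is not None)
        for c in columns
    }


def impute(rows, medians):
    """Return new rows with missing values replaced by the given medians."""
    return [
        {k: (medians[k] if k in medians and v is None else v)
         for k, v in r.items()}
        for r in rows
    ]


# Toy rows with hypothetical column names, only to exercise the functions.
rows = [
    {"review_score": 4.0 + i / 10, "response_rate": 0.5 + i / 20,
     "is_superhost": i % 2}
    for i in range(10)
]
rows[3]["review_score"] = None
rows[7]["response_rate"] = None

train, test = train_test_split_rows(rows, test_fraction=0.2, seed=0)
medians = column_medians(train, ["review_score", "response_rate"])
train_filled = impute(train, medians)
test_filled = impute(test, medians)
```

Reusing the training medians for the test rows keeps information from the test set out of the preprocessing step.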


Exploratory Data Analysis
Results of Modeling
We use several supervised learning classifiers to predict the category (Superhost or not).

● The 'white box': Logistic Regression
○ The effect of each feature can be explained to the marketing team.

● The 'black box':
○ Classification Tree
○ Ensemble methods: Random Forest & Gradient Boosting Machine (GBM)
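
One way to see why logistic regression counts as a 'white box': the model is linear in the log-odds, so raising a feature by one unit multiplies the odds of being a Superhost by exp(coefficient). The sketch below illustrates that property with made-up numbers. The feature names, intercept, and coefficients are hypothetical and are not results from this project.

```python
import math


def predict_proba(intercept, coefs, x):
    """Logistic regression probability for one observation x (a dict)."""
    z = intercept + sum(coefs[name] * x[name] for name in coefs)
    return 1.0 / (1.0 + math.exp(-z))


def odds_ratios(coefs):
    """exp(coefficient): the factor applied to the odds per one-unit increase."""
    return {name: math.exp(b) for name, b in coefs.items()}


# Hypothetical coefficients, for illustration only (not fitted to any data).
intercept = -1.0
coefs = {"review_score": 0.7, "response_hours": -0.2}

ratios = odds_ratios(coefs)
p_before = predict_proba(intercept, coefs, {"review_score": 1.0, "response_hours": 2.0})
p_after = predict_proba(intercept, coefs, {"review_score": 2.0, "response_hours": 2.0})
```

With these made-up values, `ratios["review_score"]` is above 1 (the odds rise with the feature) and `ratios["response_hours"]` is below 1 (the odds fall), which is the kind of statement a non-technical audience can follow.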
Results of Modeling
● What could help us identify a Superhost?
Model Evaluation
Area Under ROC Curve (AUC): the closer the AUC is to 1, the better.

● Random Forest has the best prediction performance on the test data set.

Models compared:
- Random Forest
- Gradient Boosting Machines
- Logistic Regression
- Decision Tree
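
For readers unfamiliar with the metric, the area under the ROC curve equals the probability that a randomly chosen positive example (Superhost) receives a higher score than a randomly chosen negative one. The sketch below computes AUC directly from that definition. It is a plain-Python illustration with a hypothetical function name, not the evaluation code used for the slides, and its pairwise loop is only practical for small inputs.

```python
def roc_auc(labels, scores):
    """AUC as the chance that a random positive outranks a random negative.

    labels: iterable of 0/1 class labels (1 = Superhost).
    scores: predicted scores or probabilities, in the same order.
    A tie between a positive and a negative counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

A model that ranks every Superhost above every ordinary host scores 1.0, and a model that gives every host the same score gets 0.5, which is why values closer to 1 are better.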
Recommendation

Before:
● 4.8+ overall rating
● 10+ stays
● <1% cancellation rate
● 90% response rate
● 3+ pending months

Recommended factors:
● High Review Score Rating
● Exact Location
● Higher Number of Reviews
● Faster Host Response Rate
● Higher Host Listing Count
● Instant Bookable
● Lower Price for Extra People
● Host Identity Verified
