Problem Statement

You have been provided with a Seattle Airbnb dataset and asked to predict listing prices in the test set using a model of your choice. You are to perform exploratory data analysis to understand relationships between variables, conduct any needed data engineering, test several models and select a final model to make predictions. Your submission should include a Jupyter notebook documenting your process and a CSV file with predictions for the test set.

Uploaded by

suryansh


You have been provided with the Seattle AirBnb dataset of listings data. You will need to predict
the prices of listings shared in the test dataset. A natural use case for this would be in helping
people price their listings. In your assignment, feel free to use either Python or R, with any
libraries of your choice. You will be evaluated on being able to justify your solution during the
interview, explain the underlying mathematics and statistical phenomena, and make accurate
predictions.

Task Description

1. Download a sample of the Seattle AirBnb listings dataset linked here (original data can be
found here).

2. The goal of the assignment is to predict the prices of AirBnb listings from the test set, using a
model carefully selected by you, trained, tested, and explained.

3. Conduct some exploratory data analysis and understand the relationships between potential
predictors. Document your EDA in a notebook.

4. Note that this is not a particularly large dataset. You will be partially scored based on your
ability to perform ETL on the dataset. Describe what you have done for ETL in 3-4 sentences.

5. Try out a few different models (use your judgement after doing the EDA), and note down why
you have tried each one (2-3 sentences describing the “why” is enough).

6. Pick your final model, and explain why this model is better than the others. Train it, test it, and
list out your analyses (4-5 sentences, or more if required). Finally, run your predictions on the
real test set provided above.

Submission

Your final submission should have two files, as follows:

1. Notebook with the following components and partial scores for each component:

a. EDA - documented in the notebook, with graphs describing correlations between variables, potential
predictors, initial analyses on the data, and feature engineering (if any)

b. Data Engineering - documented in the notebook, in a few sentences describing the ETL
process and any data engineering that was performed

c. Initial Modelling - a few models run on smaller folds of the dataset, with explanations for why
each model was experimented with

d. Model Selection - analyses around output from each of the models initially selected, and
justification for selecting one model over the others you had initially contemplated

2. CSV file of listing prices for the test set:

a. Final Predictions - each listing from the data set and the model-predicted price (2 columns: id, price)
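
As a rough illustration of the "models run on smaller folds" and "2 columns: id, price" requirements, the sketch below compares two candidate models with k-fold cross-validation and then writes predictions in the two-column layout. Everything in it is an assumption made for illustration: the data is synthetic (not the Airbnb sample), the single feature name `accommodates`, the listing ids, and the helper functions are hypothetical, and it sticks to the Python standard library so it runs anywhere. A real notebook would more likely use pandas and a modelling library of your choice.

```python
import csv
import io
import random
from statistics import mean


def kfold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and split the indices into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_mean(xs, ys):
    """Baseline model: always predict the mean training price."""
    m = mean(ys)
    return lambda x: m


def fit_line(xs, ys):
    """Ordinary least squares on a single feature."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def cv_rmse(fit, xs, ys, k=5):
    """Average root-mean-squared error over k held-out folds."""
    scores = []
    for fold in kfold_indices(len(xs), k):
        held = set(fold)
        train = [i for i in range(len(xs)) if i not in held]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        squared_errors = [(model(xs[i]) - ys[i]) ** 2 for i in fold]
        scores.append(mean(squared_errors) ** 0.5)
    return mean(scores)


def write_predictions(out, ids, prices):
    """Write the two-column submission layout: id, price."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["id", "price"])
    for listing_id, price in zip(ids, prices):
        writer.writerow([listing_id, f"{price:.2f}"])


# Synthetic stand-in data (NOT the Airbnb sample): price grows with capacity.
rng = random.Random(42)
accommodates = [rng.randint(1, 8) for _ in range(200)]
price = [40 + 25 * a + rng.gauss(0, 15) for a in accommodates]

# Compare candidates on held-out folds before committing to one.
rmse_mean = cv_rmse(fit_mean, accommodates, price)
rmse_line = cv_rmse(fit_line, accommodates, price)

# Refit the chosen model on all training data and score a hypothetical test set.
final_model = fit_line(accommodates, price)
test_ids = [101, 102, 103]        # hypothetical listing ids
test_accommodates = [2, 4, 6]     # hypothetical feature values
buffer = io.StringIO()
write_predictions(buffer, test_ids, [final_model(a) for a in test_accommodates])
submission_text = buffer.getvalue()
```

Comparing every candidate against a trivial baseline on the same folds gives the model-selection write-up a concrete number to justify the final choice. To produce an actual file, the `io.StringIO` buffer can be swapped for `open("predictions.csv", "w", newline="")`, where the filename is again only a placeholder.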
