GoTo Data Science Recruiting Assignment

Uploaded by

Pranav Khurana


GoTo Data Science Recruiting Assignment: Solution Approach

The solution to Q1 has primarily achieved the following:

1. Ensuring seamless end-to-end execution of the pipeline
2. Augmenting the data merging script based on business and intent understanding
3. Implementing a feature capturing historical completed trips for each driver
4. Capturing and storing the model performance metrics
5. Improving performance of the classification model by hyperparameter tuning
6. Miscellaneous semantic changes

Detailed Walkthrough:

1. Initially, the code set the target to 1 wherever the merged dataset had
"ACCEPTED" in the `participant_status` column and to 0 for every other row. This
was incorrect because a "CREATED" event is logged whenever the system polls a
driver, after which the driver either ACCEPTS, REJECTS, or IGNORES the request.
A "CREATED" row therefore records the poll itself rather than a driver response,
and labelling it 0 biased the target towards 0. To correct this, I removed the
"CREATED" rows so that the target is 1 for "ACCEPTED" and 0 only for "REJECTED"
or "IGNORED" statuses, as our goal is to maximize the "ACCEPTED" responses.
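The relabelling described in this step can be sketched as follows. This is a minimal illustration with pandas, not the assignment's actual code: the sample rows and the `order_id` column are hypothetical, and `build_target` is an illustrative helper name. Only `participant_status`, `driver_id`, and the target name `is_accepted` come from the write-up.

```python
import pandas as pd

# Hypothetical sample of the merged dataset; the rows and the
# `order_id` column are illustrative assumptions.
merged = pd.DataFrame(
    {
        "order_id": [1, 1, 2, 2, 3, 3],
        "driver_id": ["a", "a", "b", "b", "c", "c"],
        "participant_status": [
            "CREATED", "ACCEPTED",
            "CREATED", "REJECTED",
            "CREATED", "IGNORED",
        ],
    }
)


def build_target(df: pd.DataFrame) -> pd.DataFrame:
    """Keep only driver responses and label ACCEPTED as 1, the rest as 0."""
    # CREATED rows record the poll, not a response, so they are dropped.
    is_response = df["participant_status"].isin(["ACCEPTED", "REJECTED", "IGNORED"])
    responses = df[is_response].copy()
    responses["is_accepted"] = (
        responses["participant_status"] == "ACCEPTED"
    ).astype(int)
    return responses.reset_index(drop=True)


labelled = build_target(merged)
print(labelled[["driver_id", "participant_status", "is_accepted"]])
```

Filtering on an explicit list of response statuses, rather than only excluding "CREATED", also keeps any unexpected status value out of the negative class.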

2. To implement the new feature capturing the track record of drivers, I counted the
number of unique rides COMPLETED by each driver in the booking_log
database. I then merged these counts with the master database on the driver_id
column and dropped the rows with null values.
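A rough sketch of this feature with pandas is shown below. The `booking_status` and `order_id` column names, the sample rows, and the helper name `add_historical_completes` are assumptions made for illustration; `booking_log`, `driver_id`, and `historical_completes` are the names used in this write-up.

```python
import pandas as pd

# Hypothetical booking log; column names other than `driver_id` are assumed.
booking_log = pd.DataFrame(
    {
        "order_id": [10, 10, 11, 12, 13],
        "driver_id": ["a", "a", "a", "b", "c"],
        "booking_status": [
            "COMPLETED", "COMPLETED", "COMPLETED",
            "COMPLETED", "CUSTOMER_CANCELLED",
        ],
    }
)

# Hypothetical master dataset with one row per polled driver.
master = pd.DataFrame({"order_id": [20, 21, 22], "driver_id": ["a", "b", "c"]})


def add_historical_completes(master_df, log_df):
    """Attach the number of unique completed rides per driver."""
    completed = log_df[log_df["booking_status"] == "COMPLETED"]
    counts = (
        completed.groupby("driver_id")["order_id"]
        .nunique()  # the same ride may be logged more than once
        .rename("historical_completes")
        .reset_index()
    )
    merged = master_df.merge(counts, on="driver_id", how="left")
    # Drivers with no completed ride get NaN and are dropped, as in the text.
    merged = merged.dropna(subset=["historical_completes"]).reset_index(drop=True)
    merged["historical_completes"] = merged["historical_completes"].astype(int)
    return merged


features = add_historical_completes(master, booking_log)
print(features)
```

One design alternative, not taken in the write-up, would be to fill missing counts with 0 instead of dropping those rows, which keeps drivers without any completed ride in the training data.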

3. To evaluate the model, I used accuracy, precision, recall, and F1 score metrics from
the `sklearn` library. I defined a new `predict_class` function in the
`SklearnClassifier` class to return the predicted classes instead of the predicted
probabilities, which the current `predict` function was computing.
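The split between `predict` and `predict_class` might look like the sketch below. The class body is a hypothetical stand-in for the assignment's `SklearnClassifier` wrapper, whose real code is not shown here; the choice of `RandomForestClassifier`, the `train` method, the `evaluate` helper, and the synthetic data are all assumptions for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score


class SklearnClassifier:
    """Hypothetical stand-in for the assignment's wrapper class."""

    def __init__(self, estimator):
        self.clf = estimator

    def train(self, X, y):
        self.clf.fit(X, y)

    def predict(self, X):
        # Probability of the positive class.
        return self.clf.predict_proba(X)[:, 1]

    def predict_class(self, X):
        # Hard 0/1 labels, which the classification metrics below require.
        return self.clf.predict(X)


def evaluate(y_true, y_pred):
    """Collect the four metrics named in the text into one dictionary."""
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred, zero_division=0),
        "recall": recall_score(y_true, y_pred, zero_division=0),
        "f1": f1_score(y_true, y_pred, zero_division=0),
    }


# Synthetic data, only to exercise the code path.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] > 0).astype(int)

model = SklearnClassifier(RandomForestClassifier(n_estimators=50, random_state=0))
model.train(X[:150], y[:150])

probs = model.predict(X[150:])
classes = model.predict_class(X[150:])
metrics = evaluate(y[150:], classes)
print(metrics)
```

Returning the metrics as a dictionary makes it straightforward to write them to a file or log so that they are stored alongside the trained model.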

4. The model was performing well with the predefined parameters on this data, but I
altered the hyperparameters, reducing the max_depth attribute to limit overfitting,
which resulted in a better score on test_data. Note that without the changes to the
code described in Step 1, the model performed poorly on the test_data,
reaffirming that the data was initially biased towards 0.
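One way to check the effect of max_depth is to compare training and held-out scores for a few candidate depths, as in the sketch below. It runs on synthetic data from `make_classification`; the estimator, the candidate depths, and the helper name `compare_depths` are assumptions and not the values used in the assignment. A large gap between the two scores at a given depth is the usual sign of overfitting.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the assignment's data.
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=4, random_state=0
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)


def compare_depths(depths):
    """Fit one model per max_depth and report train and validation F1."""
    rows = []
    for depth in depths:
        model = RandomForestClassifier(
            n_estimators=50, max_depth=depth, random_state=0
        )
        model.fit(X_train, y_train)
        rows.append(
            {
                "max_depth": depth,
                "train_f1": f1_score(y_train, model.predict(X_train)),
                "val_f1": f1_score(y_val, model.predict(X_val)),
            }
        )
    return rows


# None lets the trees grow without a depth limit.
results = compare_depths([2, 5, None])
for row in results:
    print(row)
```

Selecting the depth on a validation split (or with cross-validation) and touching the test data only once at the end avoids tuning the hyperparameters to the test set itself.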

5. Finally, I made some semantic changes, including adding relevant comments
wherever necessary, adding the new historical_completes column to the config file,
and renaming the target from is_complete to is_accepted.
