Data Science Challenge - 1

The Data Science Challenge requires participants to develop a predictive model for traffic volume forecasting using a provided dataset. Key tasks include data preprocessing, feature engineering, model training, and generating predictions to minimize RMSE. Submissions must include a Python Notebook and a CSV file of predictions, with evaluation based on data cleaning, feature engineering, model performance, and clarity.

Uploaded by

akshatmaurya1501


Data Science Challenge: Crack the Traffic Code: Build Smarter Predictions!

Problem Statement
In this challenge, you are tasked with developing a predictive model for traffic volume
forecasting using a real-world dataset. Your focus will be on data preprocessing, feature
engineering, and model training to achieve the best possible performance on the test
dataset.

As in other machine learning competitions, you will submit predictions on a test
dataset. Your goal is to demonstrate strong data preparation skills and achieve a low
RMSE on the test data while following best practices in model development.

Dataset: Download the dataset using the links below:

Train Data: https://docs.google.com/spreadsheets/d/e/2PACX-1vQKu1w3cJAnm1R-1or-jBLnU7SS48s6_u2ndxt7PyIThnw1ID-q1ewUZ1sm-xj6umpG23xLtq_4CSuM/pub?output=csv

Test Data: https://docs.google.com/spreadsheets/d/e/2PACX-1vTTvl5JV9ajz5vvQg0dWg798tvDjyR9OrigvYg_617dJojhBgF1mjMhowpefMFLrjYrVSXCFObx1H2u/pub?output=csv

Sample Submission: https://docs.google.com/spreadsheets/d/e/2PACX-1vTOtpdwwLU64wlMH_c1ImWfMVOCv1F_k0ihyzfN_EQXSKujNhCBbjaQ73ego3sIK2YWitN8yaLkFESK/pubhtml

Dataset Description

You will be provided with a dataset containing:

• Timestamp-based traffic volume data

• Environmental conditions (e.g., weather, temperature, etc.)

• Other influential factors affecting traffic flow

Your Task

1. Load and Explore the Data

• Read and inspect the dataset to understand its structure.
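A minimal sketch of the loading and inspection step, assuming pandas is available. The column names below (Timestamp, Temperature, Weather) are hypothetical stand-ins, since the brief does not list the real schema; only 'Traffic_Vol' is named in the brief. The in-memory CSV replaces the real train file so the example runs offline; in practice you would pass the Train Data link or a local path to the same function.

```python
import io

import pandas as pd

# Stand-in for the real train file: a tiny in-memory CSV with
# hypothetical column names (the actual schema may differ).
SAMPLE_CSV = io.StringIO(
    "Timestamp,Temperature,Weather,Traffic_Vol\n"
    "2024-01-01 08:00:00,4.5,Clear,5200\n"
    "2024-01-01 09:00:00,,Rain,4100\n"
    "2024-01-01 10:00:00,6.1,Clear,3900\n"
)


def load_and_inspect(source):
    """Read a CSV (path, URL, or file-like object) and summarise its structure."""
    df = pd.read_csv(source)
    summary = {
        "shape": df.shape,                          # (rows, columns)
        "dtypes": df.dtypes.astype(str).to_dict(),  # column name -> dtype name
        "missing": df.isna().sum().to_dict(),       # column name -> count of NaNs
    }
    return df, summary


df, summary = load_and_inspect(SAMPLE_CSV)
print(summary["shape"])
print(summary["missing"])
```

Looking at dtypes and missing counts first tells you which columns need parsing (timestamps usually arrive as strings) and which need imputation before any modelling.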

2. Data Preprocessing

• Handle missing values, outliers, and any inconsistencies.

• Convert timestamps into meaningful time-based features (hour, day of the week,
etc.).
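One way the preprocessing bullets could look in pandas is sketched below. The column names Timestamp and Temperature are assumptions, and median imputation with clipping to 1.5 × IQR fences is just one reasonable choice among several, not a method prescribed by the challenge.

```python
import numpy as np
import pandas as pd


def preprocess(df, time_col="Timestamp", num_cols=("Temperature",)):
    """Parse timestamps, impute and clip numeric columns, add time features.

    time_col and num_cols are hypothetical names; adjust to the real schema.
    """
    out = df.copy()

    # Unparseable timestamps become NaT and are dropped as inconsistent rows.
    out[time_col] = pd.to_datetime(out[time_col], errors="coerce")
    out = out.dropna(subset=[time_col]).sort_values(time_col).reset_index(drop=True)

    for col in num_cols:
        # Fill missing values with the column median.
        out[col] = out[col].fillna(out[col].median())
        # Clip outliers to the 1.5 * IQR fences.
        q1, q3 = out[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        out[col] = out[col].clip(q1 - 1.5 * iqr, q3 + 1.5 * iqr)

    # Time-based features derived from the timestamp.
    out["hour"] = out[time_col].dt.hour
    out["day_of_week"] = out[time_col].dt.dayofweek  # Monday = 0
    out["month"] = out[time_col].dt.month
    out["is_weekend"] = (out["day_of_week"] >= 5).astype(int)
    return out


raw = pd.DataFrame({
    "Timestamp": ["2024-01-06 08:00:00", "2024-01-01 17:00:00",
                  "not a date", "2024-01-02 09:00:00"],
    "Temperature": [5.0, np.nan, 7.0, 300.0],
})
clean = preprocess(raw)
print(clean[["Timestamp", "hour", "day_of_week", "is_weekend"]])
```

For a real submission, the median and the clipping fences should be computed on the training data and then reused on the test data, so both sets are transformed consistently.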

3. Feature Engineering
• Generate new features that could improve model performance.

• Handle categorical variables appropriately.

• Scale/normalize numerical features as needed.
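The three bullets above can be illustrated with a short sketch. It assumes an hour column has already been derived from the timestamp and uses hypothetical Weather and Temperature columns. Cyclical encoding of the hour, one-hot encoding, and standardisation are common choices here, not requirements of the challenge.

```python
import numpy as np
import pandas as pd


def engineer_features(df, cat_cols=("Weather",), num_cols=("Temperature",), stats=None):
    """Add cyclical hour features, one-hot encode, and standardise.

    Pass the `stats` returned from the training call when transforming the
    test data, so both sets are scaled with the same mean and std.
    """
    out = df.copy()

    # Cyclical encoding keeps hour 23 close to hour 0.
    out["hour_sin"] = np.sin(2 * np.pi * out["hour"] / 24)
    out["hour_cos"] = np.cos(2 * np.pi * out["hour"] / 24)

    # One-hot encode categorical columns.
    out = pd.get_dummies(out, columns=list(cat_cols), dtype=float)

    # Standardise numeric columns (zero mean, unit variance on the training set).
    if stats is None:
        stats = {c: (out[c].mean(), out[c].std(ddof=0) or 1.0) for c in num_cols}
    for c in num_cols:
        mean, std = stats[c]
        out[c] = (out[c] - mean) / std
    return out, stats


train = pd.DataFrame({
    "hour": [0, 6, 12, 18],
    "Weather": ["Clear", "Rain", "Clear", "Snow"],
    "Temperature": [2.0, 4.0, 6.0, 8.0],
})
test = pd.DataFrame({"hour": [6], "Weather": ["Rain"], "Temperature": [5.0]})

train_X, stats = engineer_features(train)
test_X, _ = engineer_features(test, stats=stats)
# Categories missing from the test set become all-zero columns.
test_X = test_X.reindex(columns=train_X.columns, fill_value=0.0)
print(test_X)
```

Tree-based models generally do not need scaling, so the standardisation step matters mainly for linear or distance-based models.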

4. Train a Predictive Model

• Choose an appropriate regression model or time series model to forecast traffic
volume.

• Justify your model choice and optimization strategy.
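As an illustration of the training step, the sketch below fits a scikit-learn random forest on synthetic hourly data and checks it against a mean-only baseline on a chronological hold-out. The data, the feature names, and the choice of random forest are all assumptions made for the example; the challenge leaves the model choice to you.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error

# Synthetic stand-in for the engineered training set: 30 days of hourly rows.
rng = np.random.default_rng(0)
hours = np.arange(24 * 30) % 24
X = pd.DataFrame({
    "hour_sin": np.sin(2 * np.pi * hours / 24),
    "hour_cos": np.cos(2 * np.pi * hours / 24),
})
y = 3000 + 2000 * X["hour_sin"] + rng.normal(0, 100, len(X))

# Chronological split: the last 20% of rows act as a validation set,
# which mimics forecasting unseen future data better than a random shuffle.
split = int(len(X) * 0.8)
X_train, X_val = X.iloc[:split], X.iloc[split:]
y_train, y_val = y.iloc[:split], y.iloc[split:]

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

val_rmse = float(np.sqrt(mean_squared_error(y_val, model.predict(X_val))))
# Baseline: always predict the training mean.
baseline_rmse = float(
    np.sqrt(mean_squared_error(y_val, np.full(len(y_val), y_train.mean())))
)
print(f"validation RMSE: {val_rmse:.1f} (mean-only baseline: {baseline_rmse:.1f})")
```

Comparing against a trivial baseline on a held-out slice is one simple way to justify a model choice. It also lets you tune locally without spending leaderboard submissions.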

5. Generate Predictions on Test Data

• Apply your trained model to the test dataset to predict traffic volume.

• Submit your predicted values for the test dataset.

6. Evaluate Performance

• Your submission will be evaluated using Root Mean Squared Error (RMSE) based on
actual test labels.
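For reference, RMSE is the square root of the mean of the squared differences between actual and predicted values: RMSE = sqrt((1/n) * sum((y_i - yhat_i)^2)). A small NumPy sketch of that definition follows; the function name rmse is just a local helper, and the numbers are made up for illustration.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root Mean Squared Error between two equally shaped arrays."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must have the same shape")
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Errors are -10, +10 and 0, so the mean squared error is 200 / 3.
print(rmse([100, 200, 300], [110, 190, 300]))  # about 8.165
```

Because the errors are squared before averaging, a few large misses raise RMSE more than many small ones, which is why outlier handling tends to matter for this metric.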

7. Documentation

• Clearly explain your approach, preprocessing steps, and feature engineering choices
in a Jupyter Notebook.

Challenge:

Submit your predictions as a CSV file with a single column named 'Traffic_Vol' to
https://challenge.astrikos.xyz:3443/ to see your ranking on the leaderboard. You have a total of 10
submission attempts, so focus on fine-tuning your existing model to improve the RMSE.
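A possible way to write the submission file with pandas is sketched below. Only the column name 'Traffic_Vol' comes from the brief. The helper name make_submission is hypothetical, and details such as row order and number formatting are assumptions, so check them against the sample submission.

```python
import io

import numpy as np
import pandas as pd


def make_submission(predictions, target):
    """Write predictions as a one-column CSV named 'Traffic_Vol'.

    `target` may be a file path or a writable text buffer. Rows are written
    in the order given, which is assumed to match the test file's row order.
    """
    preds = np.asarray(predictions, dtype=float).ravel()
    if np.isnan(preds).any():
        raise ValueError("predictions contain NaN values")
    submission = pd.DataFrame({"Traffic_Vol": preds})
    submission.to_csv(target, index=False)  # no index column in the output
    return submission


# Demo with an in-memory buffer; pass a path such as "submission.csv" in practice.
buf = io.StringIO()
make_submission([5200.4, 4100.0], buf)
print(buf.getvalue())
```

With only 10 attempts available, validating the file locally (row count equal to the test set, no missing values, a single correctly named column) before uploading avoids wasting a submission on a formatting error.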

Submission Guidelines

Submit a .zip file containing the following files:

• Python Notebook (.ipynb) containing your data preprocessing and feature
engineering steps, your trained model, and the RMSE on the training dataset.

• A CSV file containing your predictions for the test dataset.

Fill in the form below to submit (you can fill in the form only once):

Submission Form: https://forms.gle/rWamnUCQ24bfkzrN9

Evaluation Criteria

Criteria                              Weightage
Data Cleaning & Preprocessing         20%
Feature Engineering & Selection       30%
Model Performance (based on RMSE)     30%
Clarity & Explanation                 20%

Think like a real-world data scientist: focus on meaningful preprocessing and feature
engineering to improve prediction quality. Good luck!
