Hackathon Problem Statement

An e-commerce company wants to predict the top 3 categories each user might purchase from in the future based on their past transaction data. The training dataset contains user_id, product purchased, and order value for each transaction. Participants must predict the top 3 categories for each user in the test dataset and submit it in a CSV file with user_id and the 3 predicted categories (pred3) by July 17th. Submissions will be evaluated based on mean reciprocal rank and precision metrics which measure how accurate the top 3 predictions are for each user compared to actual future purchases.

Uploaded by

Tanmay Singh

Problem Statement

An e-commerce company wants to recommend products to its users.

So far the company has collected only transaction data. The training
dataset has just 3 columns: user_id, the product bought, and the order
value of the product. Using this dataset, predict, for every user in
the training dataset, the top 3 categories that the user might buy
from.

Training dataset sample

aov = Order Value of the product

category = Product Category where the purchase was made

What do you need to predict?

For each user, predict the top 3 probable product categories that they
may purchase from, in the future.
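
To make the expected output shape concrete, here is a minimal sketch of one possible baseline: rank each user's categories by how often they appear in past transactions and keep the first three. This is an illustrative assumption, not the organisers' method; the function name `top3_by_frequency`, the tuple layout `(user_id, category, aov)`, and the sample rows are all hypothetical.

```python
from collections import Counter, defaultdict


def top3_by_frequency(transactions):
    """Return {user_id: up to 3 categories}, most frequently bought first.

    `transactions` is an iterable of (user_id, category, aov) tuples.
    The order value (aov) is ignored by this simple baseline.
    """
    counts = defaultdict(Counter)
    for user_id, category, _aov in transactions:
        counts[user_id][category] += 1
    return {
        user_id: [category for category, _ in counter.most_common(3)]
        for user_id, counter in counts.items()
    }


# Hypothetical sample rows, made up purely for illustration.
transactions = [
    (1, "Phones", 250.0),
    (1, "Phones", 300.0),
    (1, "Phones", 120.0),
    (1, "Comics", 15.0),
    (1, "Comics", 12.0),
    (1, "Fruits", 5.0),
    (2, "Groceries", 40.0),
]

preds = top3_by_frequency(transactions)
print(preds)
```

A user with fewer than three distinct categories gets fewer than three predictions here; a real submission would need some fallback, such as globally popular categories.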

Timeline

Deadline extended.

Submissions open: Sat, Jun 26

Last date: Sat, Jul 17

Training Data

This file contains the detailed purchasing history for every user. It has
order value and the category of the product.

Training Data Target

This file contains data for some users about the category of items they
bought in future.

Test Data

This file contains the detailed purchasing history for some users. It has
the order value and the category of the product. You have to predict
the top 3 categories that the users with these user_ids will purchase
from in the future.

Evaluation

Submissions are scored on mean reciprocal rank (MRR) and precision.
Both measurements are explained below.

Mean Reciprocal Rank

User id | Products in the order shown                   | Product bought                                 | Reciprocal rank
1       | E-readers, Kitchen Supplies, Technology books | Phones, Comics, Technology Books               | 1/3
2       | Phones, Comics, Fruits                        | None                                           | 0
3       | Groceries, Fruits, Phones                     | None                                           | 0
4       | Phones, Home Decor, readers                   | Fruits, Home Decor, readers                    | 1/2
5       | Phones, Books, Fruits                         | Home Decor, Home Furnishings, Kitchen Supplies | 0

Technology Books is the 3rd prediction for user_id 1 and it is the
category the user bought, so the reciprocal rank is 1 / (rank of the
correct prediction) = 1/3. If more than one product matches, the
reciprocal rank still counts only the first match. For instance, for
user_id 4 both Home Decor and readers match, but the first match is at
position 2, so the reciprocal rank is 1/2. Once we have the reciprocal
ranks, we average them to get the mean reciprocal rank. In this example
the average is taken over the two users with a matching prediction
(user_ids 1 and 4):

Final MRR = (1/3 + 1/2) / 2 = 5/12 ≈ 0.4167
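
The calculation above can be sketched in a few lines of Python. This is a hedged illustration, not the official scorer: the function names are hypothetical, category names are compared case-insensitively (an assumption, since the table mixes "Technology books" and "Technology Books"), and the mean is taken only over users with at least one match, because that is what the worked example divides by.

```python
from fractions import Fraction


def reciprocal_rank(predicted, bought):
    """Return 1/rank of the first predicted category that was bought, else 0."""
    bought_set = {b.lower() for b in bought}
    for rank, category in enumerate(predicted, start=1):
        if category.lower() in bought_set:
            return Fraction(1, rank)
    return Fraction(0)


def mean_reciprocal_rank(rows):
    """Average the non-zero reciprocal ranks, mirroring the worked example."""
    ranks = [reciprocal_rank(predicted, bought) for predicted, bought in rows]
    matched = [r for r in ranks if r > 0]
    return sum(matched) / len(matched) if matched else Fraction(0)


# (predicted top 3, categories actually bought) for user_ids 1 to 5.
rows = [
    (["E-readers", "Kitchen Supplies", "Technology books"],
     ["Phones", "Comics", "Technology Books"]),
    (["Phones", "Comics", "Fruits"], []),
    (["Groceries", "Fruits", "Phones"], []),
    (["Phones", "Home Decor", "readers"],
     ["Fruits", "Home Decor", "readers"]),
    (["Phones", "Books", "Fruits"],
     ["Home Decor", "Home Furnishings", "Kitchen Supplies"]),
]

mrr = mean_reciprocal_rank(rows)
print(mrr)  # 5/12
```

Using `Fraction` keeps the result exact, so it can be compared directly with the hand calculation.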

Precision or Accuracy

We first count, for each row, the number of predicted products that
match the products bought by that user_id. We then average this number
across all valid predictions. For the above table, precision would
look like this:

User id | Products in the order shown                   | Product bought                                 | Precision
1       | E-readers, Kitchen Supplies, Technology books | Phones, Comics, Technology Books               | 1
2       | Phones, Comics, Fruits                        | None                                           | NA
3       | Groceries, Fruits, Phones                     | None                                           | NA
4       | Phones, Home Decor, readers                   | Fruits, Home Decor, readers                    | 2
5       | Phones, Books, Fruits                         | Home Decor, Home Furnishings, Kitchen Supplies | NA

For user_id 1, one product matched, and for user_id 4, two products
matched. So accuracy is

number of items that matched / number of unique users with a valid
prediction.

Here it is (1 + 2) / 2 = 3/2 = 1.5.

Recall, in this case, is the fraction of user_ids for which there is a
valid prediction: 2/5 = 0.4.
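
As a rough sketch of the two numbers above, the snippet below counts matches per user and then applies the same ratios. It assumes that a "valid prediction" means a row with at least one match (the rows not marked NA in the table) and that category names are compared case-insensitively; the function names are hypothetical.

```python
def count_matches(predicted, bought):
    """Count predicted categories that appear among the bought categories."""
    bought_set = {b.lower() for b in bought}
    return sum(1 for category in predicted if category.lower() in bought_set)


def precision_and_recall(rows):
    """Return (matched items per valid user, share of users that are valid)."""
    matches = [count_matches(predicted, bought) for predicted, bought in rows]
    valid = [m for m in matches if m > 0]  # rows not marked NA in the table
    precision = sum(valid) / len(valid) if valid else 0.0
    recall = len(valid) / len(rows) if rows else 0.0
    return precision, recall


# (predicted top 3, categories actually bought) for user_ids 1 to 5.
rows = [
    (["E-readers", "Kitchen Supplies", "Technology books"],
     ["Phones", "Comics", "Technology Books"]),
    (["Phones", "Comics", "Fruits"], []),
    (["Groceries", "Fruits", "Phones"], []),
    (["Phones", "Home Decor", "readers"],
     ["Fruits", "Home Decor", "readers"]),
    (["Phones", "Books", "Fruits"],
     ["Home Decor", "Home Furnishings", "Kitchen Supplies"]),
]

precision, recall = precision_and_recall(rows)
print(precision, recall)  # 1.5 0.4
```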

Ready to submit?

Submissions should be made in the same format as the sample
submission provided.

Sample Prediction Dataset

The prediction dataset should be a .csv file with 19,981 data rows plus
one header row, and with the columns user_id and pred3, in the same
format as the sample file.
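
A minimal sketch of writing such a file with Python's standard `csv` module follows. The sample file is not reproduced here, so the encoding of `pred3` is an assumption: this sketch stores the three categories as the string form of a Python list. The helper name `write_submission` and the example predictions are hypothetical; check the real sample submission before relying on this layout.

```python
import csv
import io


def write_submission(predictions, out):
    """Write a header row plus one (user_id, pred3) row per user to `out`.

    ASSUMPTION: pred3 is stored as the string form of a Python list.
    """
    writer = csv.writer(out)
    writer.writerow(["user_id", "pred3"])
    for user_id, categories in sorted(predictions.items()):
        writer.writerow([user_id, str(list(categories))])


# Hypothetical predictions for two users.
predictions = {
    2: ["Phones", "Comics", "Fruits"],
    1: ["E-readers", "Kitchen Supplies", "Technology books"],
}

buffer = io.StringIO()  # stands in for open("submission.csv", "w", newline="")
write_submission(predictions, buffer)
lines = buffer.getvalue().splitlines()
print(lines[0])  # user_id,pred3
```

Writing to an in-memory buffer keeps the example free of file-system side effects; for a real submission, pass a file opened with `newline=""` instead.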
