Lecture 2 - The Dataset Presentation
Agenda
• Introduction
Project (Group)
You will work in groups of 5-6 students.
Fill in the form to register for the group assignment. A separate link is available for individuals to sign up for the group
assignment. Both links are also available in Blackboard (Course work -> Group project).
The final deliverable will be a report and a presentation. Examples from the previous year are available in
Blackboard.
Project (Group)
Data sources
1) Company datasets
2) Open-source datasets
E-Commerce Orders
https://www.kaggle.com/jainaashish/orders-merged
Goodreads
https://www.kaggle.com/datasets/jealousleopard/goodreadsbooks?datasetId=231310
Kaggle
https://www.kaggle.com/datasets
Project (Group)
Suggested template
1. Introduction and problem definition – Describe the context and the problem you wish to address (max 3 pages).
2. Background – Present the specific objectives you want to achieve and describe how you will approach the problem, how you will
design your data strategy, and what goals it is intended to achieve (approx. 3-6 pages).
3. Method – Describe in detail the dataset you have selected and the methods you are applying to analyze it (2-4 pages).
4. Analysis – Describe the data analysis you conducted and present the results. It is important that the results are described in
detail and visualized appropriately (3-10 pages).
5. Interpretation and recommendations – Describe an implementation plan based on the insights you extracted. You can set out
specific actions that need to be implemented, a time plan for deployment, and ideas for future data collection and for improving
the analysis and results (3-5 pages).
Does it matter which company/dataset we use? No, but you should ask interesting and actionable questions.
Do we need permission/contact with an individual in the company? No, use publicly available datasets.
Do we need to be present physically for the report submission and presentation? No, but plan it with your team.
When will the groups be assigned? One week after the registration deadline (20th September).
We are looking for 5-8 students to comprise the reference group. The purpose of the reference group is to provide
constructive feedback about the course through an ongoing open dialogue with other students throughout the semester.
You can read more about the tasks of the reference group at this link.
If you want to sign up to be a member of the reference group, use this link.
A survey will be sent out to all students during the last week to evaluate the course.
Next to come
The next lecture (6th September), with Kshitij Sharma, will cover some basics of machine learning.
In the lecture after that (13th September), I will give an overview of some low-code and no-code data science tools.
The choice of tool/technique is open, and you can select any software/method/tool you think is best suited.
Nisha Dalal
Questions & Discussion [email protected]