Birla Institute of Technology & Science, Pilani: Work Integrated Learning Programmes Part A: Content Design
Credit Units 3
Last Revised by
Version / Date
Course Objectives
# Course Objectives
1 Gain a basic understanding of the role of Data Science in various real-world scenarios in
business, industry and government
2 Appreciate the application of the concepts of Statistics, Linear Algebra and Graph Theory in
Data Science, Machine Learning, etc.
3 Understand various roles and stages in a Data Science Project and ethical issues to be
considered.
4 Explore the processes, tools and technologies for collection and analysis of structured and
unstructured data
Text Books/References
ID Text/Ref Book
T1 Introducing Data Science by Cielen, Meysman and Ali
T2 Storytelling with Data, A data visualization guide for business professionals, by Cole
Nussbaumer Knaflic; Wiley
T3 The Art of Data Science by Roger D Peng and Elizabeth Matsui
R2 KDD, SEMMA and CRISP-DM: A Parallel Overview, Ana Azevedo and M.F. Santos,
IADS-DM, 2008
R4 Python Data Science Handbook: Essential tools for working with data by Jake VanderPlas
* The above materials are for reference only and are neither conclusive nor exhaustive. Students
are advised to also refer to the latest content from online sources or instructor-supplied materials
for a more thorough understanding of the topics.
Content Structure
1. Introduction to Data Science
1.1. Definition
1.2. Need of Data Science
1.3. Motivating Examples
1.4. Roles and responsibilities of a Data Scientist
1.5. Data Science vs BI
1.6. Data Science vs Statistics
1.7. Data Science Applications
1.8. Data Science Concerns
3. Data
3.1. Data quality
3.2. Types of Data
3.3. Data Formats
3.4. High dimensional data
3.5. Data representation
3.5.1. Graphs and networks,
3.5.2. Matrices, Vectors,
3.5.3. Data Frames, list
3.5.4. Libraries of Graph, Matrices and vectors
3.6. Data Models
3.6.1. Model as expectation
3.6.2. Comparing models to reality
3.6.3. Reactions to Data
3.6.4. Refining our expectations
3.7. Data Sampling
3.7.1. Probability sampling
3.7.2. Non-Probability sampling
4. Data Wrangling
4.1. Handling Numeric Data
4.2. Dealing with textual Data
4.3. Managing Categorical Attributes
4.4. Transforming Categorical to Numerical Values
4.5. Feature Engineering
4.6. Feature Selection
4.6.1. Curse of dimensionality
4.6.2. Dimensionality Reduction
4.6.2.1. Data Correlation,
4.6.2.2. PCA
4.6.3. Nonlinear Featurization
5. Data Analytics
5.1. Definitions
5.2. Types of data analytics
5.2.1. Predictive, Descriptive, Prescriptive, Diagnostic
5.3. Analytics terminology
5.4. Data analytics - methodologies
5.4.1. CRISP-DM Methodology
5.4.2. SEMMA
5.4.3. BIG DATA LIFE CYCLE
5.4.4. SMAM
5.4.5. ASUM-DM
5.5. Applications
6. Data visualization
6.1. Need for visualization
6.2. Exploratory vs Explanatory Analysis
6.3. Tables, Axis-based Visualization and Statistical Plots
6.4. The Data Visualization Design Process
6.5. Lessons in Data Visualization Design
6.6. Stories and Dashboards
9. Review
Contents & Session delivery
Session Topics to cover Content Reference
(2 hrs)
1. Introduction to Data Science T1 – Chapter 1
• Definition
• Need of Data Science
• Motivating Examples
• Roles and responsibilities of a Data Scientist
• Data Science vs BI
• Data Science vs Statistics
• Data Science Applications
• Data Science Concerns
2. Data Science Process T1 - Chapter 2
• Roles in a Data Science project
• Setting expectations
• Data Science methodology
o Business understanding,
o Data Requirements,
o Data Acquisition,
o Data Understanding,
3.
o Data preparation,
o Modelling,
o Model Evaluation,
o Deployment and feedback.
• Case Study
• Data Science Proposal Samples
• Data Science Proposal Evaluation
• Data Science Proposal Review Guide
4. Data T1 - Chapter 1
• Data quality
• Types of Data T3- Chapter 5
• Data Formats
• High dimensional data
• Data representation https://www.researchgate.net/publication/319998246_Sampling_Methods_in_Research_Methodology_How_to_Choose_a_Sampling_Technique_for_Research/link/59c5f8c2a6fdccc719164f0b/download
o Graphs and networks,
o Matrices, Vectors,
o Data Frames, list
5.
o Libraries of Graph, Matrices and vectors
• Data Models
o Model as expectation
o Comparing models to reality
o Reactions to Data
o Refining our expectations
• Data Sampling
o Probability sampling
o Non-Probability sampling
6. Data Wrangling R4 - Chapter 1, 5
• Handling Numeric Data http://www.feat.engineering
• Dealing with textual Data Class notes
• Managing Categorical Attributes
7.
• Transforming Categorical to Numerical Values
• Feature Engineering
• Feature Selection
o Curse of dimensionality
o Dimensionality Reduction
8.
• Data Correlation,
• PCA
o Nonlinear Featurization
9. Data Analytics R2
• Definitions
• Types of data analytics
o Predictive, Descriptive, Prescriptive,
Diagnostic
• Analytics terminology
• Data analytics - methodologies
10.
o CRISP-DM Methodology
o SEMMA
o BIG DATA LIFE CYCLE
o SMAM
o ASUM-DM
• Applications
16. Review
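As a small illustration of the "Probability sampling" topic listed under Data Sampling above, the following is a minimal sketch of three common probability sampling schemes using only the Python standard library. The record layout, the `segment` field, the function names and the sample sizes are all hypothetical and chosen for illustration; they are not taken from the course materials.

```python
import random
from collections import defaultdict


def simple_random_sample(records, k, seed=0):
    """Draw k records without replacement; every record is equally likely."""
    rng = random.Random(seed)
    return rng.sample(records, k)


def systematic_sample(records, step, seed=0):
    """Pick a random start in [0, step) and then take every step-th record."""
    rng = random.Random(seed)
    start = rng.randrange(step)
    return records[start::step]


def stratified_sample(records, key, fraction, seed=0):
    """Sample the same fraction from each stratum (group) defined by key."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for record in records:
        strata[key(record)].append(record)
    sample = []
    for label in sorted(strata):
        group = strata[label]
        k = max(1, round(len(group) * fraction))  # at least one per stratum
        sample.extend(rng.sample(group, k))
    return sample


# Hypothetical population: 40 customers, 10 "corporate" and 30 "retail".
records = [
    {"id": i, "segment": "corporate" if i % 4 == 0 else "retail"}
    for i in range(40)
]

srs = simple_random_sample(records, k=8)
systematic = systematic_sample(records, step=5)
stratified = stratified_sample(records, key=lambda r: r["segment"], fraction=0.2)

print(len(srs), len(systematic), len(stratified))
```

Stratified sampling keeps the corporate/retail proportions of the population in the sample, whereas a simple random sample of the same size may over- or under-represent the smaller group.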
Evaluation Scheme:
Legend: EC = Evaluation Component; AN = After Noon Session; FN = Fore Noon Session
No Name Type Duration Wt. Date/Deadline*
EC-1A Quiz-I (Pre-Mid / 20MCQ) Online 3 days open 5%
EC-1B Quiz-II (Post-Mid / 20MCQ) Online 3 days open 5%
EC-1C Assignment Take-home 3 weeks 10%
EC-2R Mid-Semester Regular Closed Book 1.5 hours 30%
EC-2M Mid-Sem Makeup Closed Book 1.5 hours 30%
EC-3R Comprehensive Exam Open Book 2.5 hours 50%
EC-3M Compre Makeup Open Book 2.5 hours 50%
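The weights in the table above add up to 100% when each makeup exam is counted in place of its regular counterpart. The sketch below shows how a weighted total could be computed from percentage scores. It assumes a plain weighted sum and treats the regular and makeup versions of an exam as a single component (`EC-2`, `EC-3`); the handout does not state the exact grading formula, and the example scores are hypothetical.

```python
# Weights (in percent) as listed in the evaluation scheme table.
# Assumption: a makeup exam replaces the regular exam at the same weight.
WEIGHTS = {
    "EC-1A": 5,   # Quiz-I
    "EC-1B": 5,   # Quiz-II
    "EC-1C": 10,  # Assignment
    "EC-2": 30,   # Mid-Semester (regular or makeup)
    "EC-3": 50,   # Comprehensive Exam (regular or makeup)
}


def weighted_total(percent_scores):
    """Combine per-component percentage scores (0-100) into a total out of 100."""
    missing = set(WEIGHTS) - set(percent_scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * percent_scores[name] / 100 for name in WEIGHTS)


# Hypothetical student scores, each as a percentage of that component's maximum.
example_scores = {"EC-1A": 80, "EC-1B": 90, "EC-1C": 70, "EC-2": 60, "EC-3": 75}
total = weighted_total(example_scores)
print(total)  # 4 + 4.5 + 7 + 18 + 37.5 = 71.0
```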
Notes:
➔ Quizzes and assignments will be released 3 days (for each Quiz) and 3 weeks (for the
Assignment) before the completion/submission deadline.
➔ Deadlines will NOT be extended for any reason; students are advised not to wait until the
deadline to start working on a Quiz/Assignment.
➔ Syllabus for Quiz-I: Sessions 1 to 4 / Quiz-II: all Sessions
➔ Syllabus for Assignment: Hands-on Python-based Exercise (real-world problem, for individual
group of 3 students) / Group formation will be announced before Assignment release
➔ All Quizzes/Assignments will be released, and are to be answered/submitted, in Canvas LMS
➔ Syllabus for Mid-Semester Test (Closed Book): Topics in Session Nos. 1 to 8
➔ Syllabus for Comprehensive Exam (Open Book): All topics (Session Nos. 1 to 16)
➔ Students are strictly advised to stick to the regular schedule of the Mid-Sem and Compre
examinations; Makeup examinations are only for students with business-related
absences or health-related issues.
➔ Strictly NO MAKEUPS for Quizzes and Assignments; submissions after the above stated
deadlines will not be considered/evaluated.
➔ All students should conform to the BITS students’ ethical code of conduct. All assignments will
be subjected to a plagiarism check; violations will be subject to disciplinary action, in addition
to nullification of all the marks/grades assigned.
Contact sessions: Students should attend the online lectures as per the schedule provided in the Course
Handout (posted on Canvas LMS)
Evaluation Guidelines:
1. EC-1 consists of 2 Quizzes and 1 Assignment. Students will attempt them through the course
pages on the Canvas portal. Announcements will be made on the portal in a timely manner.
2. For Closed Book tests: No books or reference material of any kind will be permitted.
3. For Open Book exams: Use of books and any printed / written reference material (filed or
bound) is permitted. However, loose sheets of paper will not be allowed. Use of calculators is
permitted in all exams. Laptops/Mobiles of any kind are not allowed. Exchange of any material
is not allowed.
4. If a student is unable to appear for the Regular Test/Exam due to genuine exigencies, the student
should follow the procedure to apply for the Make-Up Test/Exam which will be made available
on the Elearn portal. The Make-Up Test/Exam will be conducted only at selected exam centres.
It shall be the responsibility of the individual student to attend the contact sessions regularly,
as per the schedule given in the course handout, and to take all the prescribed evaluation components
(Assignment/Quiz, Mid-Semester Test and Comprehensive Exam) according to the evaluation scheme
provided in the handout.