DS-301 Introduction To Data Science
Course Objectives: By the end of the course, students should have acquired the core
skill set required of a data scientist. The course covers basic statistical concepts
such as probability distributions and statistical inference, and uses the Python
language to carry out basic statistical modeling and analysis. It also explores the
significance of exploratory data analysis (EDA) in data science, together with its
basic tools (plots, graphs, and summary statistics).
Course Contents: Introduction: What is data science? Big data and the data science
hype, and getting past the hype; skill sets needed; statistical inference; populations
and samples; statistical modeling; probability distributions; fitting a model;
exploratory data analysis and the data science process; basic tools of EDA; and
introductory machine learning concepts.
Course Outcomes: Upon completion of this course, the student should be able to:
Describe data science
Explain statistical inference in basic terms
Explain the significance of exploratory data analysis (EDA) in data science
Text Book: Cathy O'Neil and Rachel Schutt. Doing Data Science: Straight Talk
from the Frontline. O'Reilly, 2014.
Reference Books:
1. Van der Aalst, Wil. Process Mining: Data Science in Action. 2nd ed. Heidelberg:
Springer, 2016.
2. De Brouwer, Philippe J. S. The Big R-Book: From Data Science to Learning
Machines and Big Data. John Wiley & Sons, 2020.
Weekly Breakdown
Week  Section  Topics
1     Chap 1   Introduction: What is Data Science? Big Data and Data Science
               hype, and getting past the hype; skill sets needed
2     2.1      Statistical Inference: populations and samples, statistical modeling,
               probability distributions, fitting a model
3     2.2      Exploratory Data Analysis and the Data Science Process: basic
               tools (plots, graphs and summary statistics) of EDA
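
Illustrative exercise: the topics of weeks 2 and 3 (populations and samples, fitting a model, summary statistics) can be tried out with a few lines of Python. The sketch below is only one possible starting point and is not taken from the textbook. It uses the standard library alone. The population parameters (mean 170, standard deviation 10) and the function names draw_sample, summarize and fit_normal are assumptions chosen for illustration.

```python
import random
import statistics


def draw_sample(n, mu=170.0, sigma=10.0, seed=0):
    """Draw n values from a hypothetical normal population (assumed mu, sigma)."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return [rng.gauss(mu, sigma) for _ in range(n)]


def summarize(sample):
    """Return basic EDA summary statistics for a list of numbers."""
    q1, _, q3 = statistics.quantiles(sample, n=4)  # quartile cut points
    return {
        "n": len(sample),
        "min": min(sample),
        "q1": q1,
        "median": statistics.median(sample),
        "mean": statistics.fmean(sample),
        "q3": q3,
        "max": max(sample),
        "stdev": statistics.stdev(sample),  # sample standard deviation (n - 1)
    }


def fit_normal(sample):
    """Fit a normal model by maximum likelihood.

    The estimates are the sample mean and the standard deviation
    computed with divisor n (statistics.pstdev).
    """
    return statistics.fmean(sample), statistics.pstdev(sample)


if __name__ == "__main__":
    sample = draw_sample(1000)
    for name, value in summarize(sample).items():
        print(f"{name:>6}: {value:.2f}")
    mu_hat, sigma_hat = fit_normal(sample)
    print(f"fitted normal model: mu = {mu_hat:.2f}, sigma = {sigma_hat:.2f}")
```

Comparing the fitted values with the assumed population parameters, and repeating the run with different sample sizes, gives a first feel for how a sample relates to the population it was drawn from.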