1. School of Science and Technology
2. Department Computer Science and Information Systems
3. Programme Master of Science in Data Analytics
4. Module Title Statistical Methods for Data Science
5. Module Code CIS-SMD-611
6. Level 5
7. Credits 15
8. Revised After 3 years
9. Approval Date 2024
10. Lecture Hours 30
11. Tutorial Hours 10
12. Pre-requisites None
13. Co-requisites None
14. Aim(s) of the Course
The aim of the module is to provide a practical introduction to statistical methods and
applications in data science.
15. Intended Learning Outcomes
On successful completion of the module, students should be able to:
a) Understand and give examples of how probabilistic models are used in data science
applications.
b) Use statistical software libraries to compute descriptive statistics and visualisations.
c) Implement probabilistic models and apply them in data science applications.
d) Apply statistical tests for evaluating data science applications.
e) Justify which types of statistical methods are applicable to the most common types of
experiments in data science.
f) Discuss the advantages and drawbacks of the different types of probabilistic models that
may be applicable to a given data science application.
16. Indicative Content
a) Exploratory data analysis: basic visualisation for data preparation and modelling
strategy.
b) Probability models, in the context of the different statistical methods discussed in the
module.
c) Hypothesis testing and confidence intervals.
d) Regression: linear and non-linear methods for explaining outcomes.
e) Classification: logistic regression, linear discriminant analysis.
f) Point estimation, maximum likelihood and basic optimisation: fitting generic statistical
models, k-nearest neighbour.
g) Resampling methods: cross-validation, bootstrap.
h) Principal component analysis.
17. Method of Assessment
Continuous assessment: 50%
Examination: 50%
18. Teaching and Learning Methods / Activities
a) Lectures
b) Tutorials
c) Students’ individual study
d) Case studies
e) Assignments
f) Group discussions
19. Prescribed Reading Lists
Rice, J. A. (2007). Mathematical statistics and data analysis (3rd ed.). Duxbury Press.
James, G., Witten, D., Hastie, T. & Tibshirani, R. (2012). An introduction to statistical
learning: with applications in R. Springer, New York. https://www.statlearning.com/
20. Recommended Resources
Diez, D. M., Barr, C. D. & Çetinkaya-Rundel, M. (2015). OpenIntro Statistics (3rd ed.).
OpenIntro, Inc.
VanderPlas, J. (2017). Python data science handbook: Essential tools for working with
data. O'Reilly Media.