Data Science Principles - ITS65704
OVERVIEW:
This module is designed to expose students to a range of topics related to data science. It covers various facets of data science practice, from
data collection through processing, analysis and visualisation to effective communication. The focus is on breadth rather than depth, with
emphasis on the integration and synthesis of concepts and their application to solving problems. Module delivery includes lecture
sessions, tutorials, hands-on exercises and invited talks from expert data science practitioners. Assessment comprises a test, a group
project that gives students a platform to apply their knowledge of the data science process, and a 2-hour final exam that tests the overall
principles and applications covered in this module.
Name(s) of academic staff teaching the module, module leader and staff email:
Staff teaching the module: Dr Thulasyammal Ramiah Pillai
Module leader: Dr Thulasyammal Ramiah Pillai
Year-level: 2
Credit Value: 4
Pre-requisite: Nil
Co-requisite: Nil
Anti-requisite: Nil
Module offered as: Specialization, Minor, Free Elective, Extension (Choose 1 or more)
Programme Name: Bachelor of Computer Science (Hons), Bachelor of Software Engineering (BSE), Bachelor of Information Technology (BIT)
LEARNING OUTCOMES:
TEACHING, LEARNING AND ASSESSMENT
Assessment Task 4: Final Exam (40%)
MLO 1 - 20%, MLO 2 - 20%
1,2 1 Final Exam Period
Teaching and learning approach:
Teaching and Learning Activities:
MLO 2 is achieved once students have demonstrated, through implementation, their understanding of the concepts introduced in each chapter. The tutor
guides students through tutorial and practical computer laboratory tasks.
To achieve MLO 3, students are required to work in a team to design, develop and solve an identified problem in each domain using the data science
algorithms they have learned in this module. Interim reviews and critiques of the design are conducted progressively to provide feedback on
students' conceptual ideas, design strategy and its development towards solving the given problem.
Details of each assessment task:
Each student is required to take a 2-hour test on the topics covered in the first five weeks of this module. In the test, they have to solve 2
to 3 problems using data science principles, processes and tools.
Assessment 2: Group Assignment: Implementing Data Science Algorithms for a data set from a given domain: 20%
Students are required to extract or collect a data set from a repository and analyse it, based on the given scenario, to solve a problem in the
identified area. Once the data has been extracted, appropriate data cleaning methods should be applied to the raw data before the analysis is performed.
Once a clean data set has been obtained, students are required to use appropriate models and algorithms to derive a data model, which is then used in
the data visualisation phase. Students are required to show that the resulting data model
can be used for decision making or to solve the specific problem identified at the beginning of the assignment. This is a group assignment.
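As an illustration only, the workflow described above (clean the raw data, derive a model, then visualise) could be sketched in Python as follows. The records, the variables (hours studied, exam mark) and the helper functions are hypothetical and not part of the assignment brief. A simple least-squares line stands in for whichever algorithm suits the chosen domain, and a text bar chart stands in for a plotting library.

```python
from statistics import mean

# Hypothetical raw records: (hours_studied, exam_mark); None marks a missing value.
raw = [(2.0, 52.0), (4.0, 61.0), (None, 70.0), (6.0, 69.0), (4.0, 61.0), (8.0, 80.0)]


def clean(records):
    """Drop rows with missing values and exact duplicates, keeping first-seen order."""
    seen, out = set(), []
    for row in records:
        if None in row or row in seen:
            continue
        seen.add(row)
        out.append(row)
    return out


def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def text_plot(points, slope, intercept, width=40):
    """Crude text visualisation: one bar per observation, with the fitted value alongside."""
    top = max(y for _, y in points)
    lines = []
    for x, y in points:
        bar = "#" * round(width * y / top)
        fitted = slope * x + intercept
        lines.append(f"x={x:>4} {bar} observed={y:.1f} fitted={fitted:.1f}")
    return "\n".join(lines)


data = clean(raw)                      # data cleaning
slope, intercept = fit_line(data)      # data analysis / model derivation
print(text_plot(data, slope, intercept))  # data visualisation
```

In a real submission each stage would typically be replaced by the tools and libraries covered in the practical sessions; the point of the sketch is only the order of the stages and the hand-off of the cleaned data and fitted model from one stage to the next.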
Project Presentation: Each group is given 15 minutes for its project presentation. All members of the group MUST participate in the presentation.
The students must be able to confidently explain and justify how they used data science principles, processes, algorithms and tools to solve the problem.
Project Report: Each group must submit a detailed report explaining the data science process applied and the algorithms
implemented to solve the problem. The report should also discuss the results of the algorithm implementation extensively.
This assessment is a 2-hour closed-book examination that tests the concepts and implementation covered in this module. Students are required to
answer 4 out of 5 questions.
Rubrics for Each Assessment Task
Data Extraction: 5%
Data Cleaning: 5%
Data Analysis using the appropriate algorithm: 5%
Data Visualisation: 5%
A student must achieve at least 50% for the overall assessment and a final grade of C to pass the module. A student who obtains a minimum of 40% for the
overall assessment and an overall grade of D or higher for the module may be allowed to resit the examination. The maximum passing grade awarded for the
resit examination will be a grade C.
A student who obtains 39% or below for the final assessment will fail the module irrespective of the overall marks earned, even if
he/she has achieved 50% or more in the overall assessment. He/she will not be allowed to resubmit the final assessment.
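Read as a decision rule, the percentage thresholds above could be sketched as follows. This is an illustration, not official policy: the function name is hypothetical, the letter-grade conditions (C, D) are left out because the mark-to-grade mapping is not given here, and "39% and below" is assumed to mean any mark under 40%.

```python
def module_outcome(overall_pct, final_assessment_pct):
    """Return 'pass', 'resit eligible' or 'fail' from the two percentages.

    Assumption: only the percentage thresholds are modelled; the letter-grade
    conditions and the discretion implied by "may be allowed" are not.
    """
    # Failing the final assessment fails the module regardless of the overall mark.
    if final_assessment_pct < 40:
        return "fail"
    if overall_pct >= 50:
        return "pass"
    if overall_pct >= 40:
        return "resit eligible"
    return "fail"


# Hypothetical students: (overall %, final assessment %)
for marks in [(55, 45), (70, 39), (45, 42), (30, 50)]:
    print(marks, module_outcome(*marks))
```

The final-assessment check comes first because the rule states that it applies irrespective of the overall marks earned.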
STUDENT LEARNING TIME
Student Learning Time (SLT) per topic/week of the content outline (SLT mapping against MLO, Teaching & Learning Activities [Guided Learning F2F
(L,T,P,O), NF2F & Independent Student Learning Time]):
Week 4: Exploratory Data Analysis (EDA)
2h (1L, 1P) 2h 8 12
L:
Exploratory Data Analysis (EDA)
Philosophy of EDA
Data Science Process
P:
Basic tools (plots, graphs and summary statistics) of EDA
P:
Machine learning tools and libraries
Week 8: Extracting Meaning from Data
5h 10 15
Extracting Meaning from Data
Feature selection algorithms
Feature extraction libraries and tools
P:
Data Visualization tools
Week 12: Graph Processing
2h (1L, 1P) 2h 8 1 13
L:
Graph Processing
Social networks as graphs
Clustering graphs
Direct discovery of communities in graphs
Partitioning in graphs
Neighbourhood properties in graphs
P:
Graph processing and visualization (tools and libraries)
Assessment 2: Group Assignment (30%)
Assessment 3: Presentation (10%)
P:
Data Science and Ethical Issues case study and discussion
Week 15
Preparation for final exam
Week 16
Assessment 4: Final Exam (40%)
29 108 3 160
TOTAL 25
Hours
REFERENCES:
OTHER: