Coursework Specification
Read this coursework specification carefully. It tells you how you are going to be assessed,
how to submit your coursework on time, and how (and when) you’ll receive your marks and
Coursework Aim:
This team-based assignment involves analysing a real-world dataset and creating
meaningful insights to address certain business concerns and problems identified.
You’ll be working in pairs for this assignment.
Coursework Details:
Type: Report
Overview: The objective of this assignment is to evaluate
your understanding of the basic theory, concepts, and
various methods and algorithms in data mining, and to assess
your skill in applying appropriate Python packages, such
as NumPy, Pandas, Matplotlib, and Scikit-learn, to
carry out a data mining project.
1. Problem Identification
1.1. Read the data description file (metadata) to learn the
basic characteristics of the dataset, including the
business context associated with the data,
the total number of attributes (dimensions,
variables), the data type of each attribute, the value
range, mode, skewness, and kurtosis of each
attribute, and the total number of instances. Carry out
simple data exploration with essential plots.
1.2. Identify a set of meaningful business problems of
interest with regard to the data for analysis.
1.3. Identify what data mining tasks need to be performed
in order to address the business problems raised.
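The characteristics listed in 1.1 can be gathered into a single summary table. The following is a minimal sketch only, assuming the dataset has already been loaded into a Pandas DataFrame. The function name `profile_attributes` and the toy columns (`age`, `income`, `segment`) are hypothetical and are not taken from the coursework dataset.

```python
import numpy as np
import pandas as pd


def profile_attributes(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per attribute with basic descriptive statistics."""
    numeric = df.select_dtypes(include="number")
    summary = pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "missing": df.isna().sum(),
        # First mode of each attribute (ties are resolved by sort order).
        "mode": df.mode(dropna=True).iloc[0],
    })
    # Range, skewness and kurtosis are only computed for numeric attributes;
    # non-numeric attributes are left as NaN in these columns.
    summary["min"] = numeric.min()
    summary["max"] = numeric.max()
    summary["skewness"] = numeric.skew()
    summary["kurtosis"] = numeric.kurt()
    return summary


# Hypothetical toy data standing in for the real dataset.
toy = pd.DataFrame({
    "age": [23, 35, 35, 51, 62],
    "income": [18_000.0, 32_000.0, 41_000.0, np.nan, 250_000.0],
    "segment": ["a", "b", "b", "c", "b"],
})

profile = profile_attributes(toy)
n_instances, n_attributes = toy.shape
```

Calling `toy.hist()` or `toy["segment"].value_counts().plot.bar()` with Matplotlib installed would be one way to add the essential plots.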
2. Data Preparation
2.1. Determine which variables are to be used in which
analysis. Also refer to 1.2 and 1.3 in Task 1.
2.2. Get your data for analysis. Choose appropriate
methods for data pre-processing, including detecting
and dealing with incorrect data types, irrelevant
variables, missing values, outliers, imbalanced
classes, and duplicates, changing data type, and
conducting proper dimensionality reduction, feature
extraction, data transformation, data partition, and
normalisation, etc., where appropriate. Also refer to
1.1 in Task 1.
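Several of the pre-processing steps named in 2.2 can be chained as shown below. This is a sketch under stated assumptions, not a required procedure: the column names (`customer_id`, `spend`, `visits`, `churned`) and values are invented for illustration, median imputation and IQR clipping are just one possible choice for missing values and outliers, and steps such as class rebalancing and dimensionality reduction are not covered.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Hypothetical raw data: one duplicated row, a numeric column stored
# as text, one missing value and one extreme value.
raw = pd.DataFrame({
    "customer_id": [1, 2, 2, 3, 4, 5, 6, 7],
    "spend": ["120.5", "80", "80", None, "95.2", "4000", "110", "70"],
    "visits": [4, 3, 3, 5, 2, 6, 4, 1],
    "churned": [0, 0, 0, 1, 0, 1, 0, 1],
})

# Duplicates: drop exact repeated rows.
clean = raw.drop_duplicates().copy()

# Incorrect data type: convert text to numbers (unparseable values become NaN).
clean["spend"] = pd.to_numeric(clean["spend"], errors="coerce")

# Missing values: impute with the median (one option among several).
clean["spend"] = clean["spend"].fillna(clean["spend"].median())

# Outliers: clip to 1.5 * IQR beyond the quartiles.
q1, q3 = clean["spend"].quantile([0.25, 0.75])
iqr = q3 - q1
clean["spend"] = clean["spend"].clip(q1 - 1.5 * iqr, q3 + 1.5 * iqr)

# Irrelevant variables: the identifier is left out of the feature set.
X = clean[["spend", "visits"]]
y = clean["churned"]

# Data partition: stratify so both classes appear in each split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Normalisation: fit the scaler on the training split only, then apply
# the same transformation to the test split.
scaler = MinMaxScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)
```

Fitting the scaler on the training split alone keeps information from the test split out of the model-building stage.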
3. Model Construction
3.1. With the pre-processed dataset, undertake the data
mining tasks you have identified in 1.3. You are
required to apply two different algorithms for both
predictive and descriptive modelling. For
descriptive modelling, you may choose to use
k-means clustering and various EDA (Exploratory Data
Analysis) methods, e.g., histograms, bar charts, and
Pearson’s correlation coefficient. For predictive
modelling, you may use, for example, decision trees
and artificial neural networks, or decision trees and
k-nearest-neighbour.
3.2. In order to build the most appropriate and accurate
models and identify meaningful hidden patterns,
different settings for the relevant model parameters
should be considered for each of the selected
algorithms and methods.
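One way to consider different parameter settings, as 3.2 asks, is a cross-validated grid search for the predictive models and a comparison of cluster counts for k-means. The sketch below is illustrative only: it uses synthetic data from `make_classification` in place of the coursework dataset, and the parameter grids, the accuracy metric, and the silhouette score are assumptions chosen for the example rather than requirements of this specification.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.metrics import silhouette_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the pre-processed dataset (assumption).
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Predictive modelling: two algorithms, each with its own parameter grid.
candidates = {
    "decision_tree": (
        DecisionTreeClassifier(random_state=0),
        {"max_depth": [2, 4, 8, None], "min_samples_leaf": [1, 5, 20]},
    ),
    "knn": (
        # k-nearest-neighbour is distance based, so features are scaled first.
        make_pipeline(StandardScaler(), KNeighborsClassifier()),
        {"kneighborsclassifier__n_neighbors": [3, 7, 15]},
    ),
}

results = {}
for name, (estimator, grid) in candidates.items():
    search = GridSearchCV(estimator, grid, cv=5, scoring="accuracy")
    search.fit(X_train, y_train)
    results[name] = {
        "best_params": search.best_params_,
        "cv_accuracy": search.best_score_,
        "test_accuracy": search.score(X_test, y_test),
    }

# Descriptive modelling: compare k-means solutions for several values of k.
X_scaled = StandardScaler().fit_transform(X)
silhouette_by_k = {}
for k in range(2, 6):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X_scaled)
    silhouette_by_k[k] = silhouette_score(X_scaled, labels)
```

The `results` and `silhouette_by_k` dictionaries give a compact basis for comparing settings. Which metric is appropriate depends on the business problem identified in Task 1.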
Your mark may be reduced if you do not meet the word
count limit.
Referencing: Harvard referencing should be used; see your Library
Subject Guide for guidance and tips on referencing.
Learning Outcomes
This coursework will partially assess the following learning outcomes for this module as
indicated by *.
Intellectual Skills
On successful completion of this module, you will be able to
• Identify different types of data mining tasks in relation to various business concerns,
including classification, prediction, cluster analysis and segmentation, and association
analysis and market basket analysis. *
• Critically review and appreciate the strengths and weaknesses of different data mining
techniques, models, and tools. *
Practical Skills
On successful completion of this module, you will be able to
• Select and apply appropriate data mining techniques for a given real-world problem. *
• Evaluate various models built from a data mining process. *
• Undertake a data mining project with clear business focus, in particular, in relation to
CRM analysis, RFM modelling, and credit risk scoring. *
Transferable Skills
On successful completion of this module, you will be able to
• Demonstrate analytical skills. *
• Demonstrate project management skills. *
• Demonstrate teamwork skills. *
Feedforward comments
Criterion 1. Business Understanding and Data Understanding
100 - 80%: Exceptionally clear and concise analysis of business concerns and relevant data mining tasks. Excellent and creative initial data exploration with effective means. [...] presentation. Clear structure and layout.
79 - 70%: Thorough and clear analysis of business concerns and relevant data mining tasks. Excellent initial data exploration with effective means. Clear structure and layout.
69 - 60%: Clear analysis of business concerns and relevant data mining tasks to a certain depth. Sensible initial data exploration performed with appropriate means. [...] layout.
59 - 50%: Clear analysis of business concerns and relevant data mining tasks. Probably lack some in-depth view. Essential initial data [...]
49 - 40%: Adequate analysis of the key business concerns and data mining tasks. Limited initial data exploration. Probably lack some relevance.
39 - 30%: Inadequate analysis of business concerns and data mining tasks. Only simple initial data exploration performed. Lack clarity and poor presentation.
29 - 0%: Little or no analysis of business concerns and data mining tasks. Little or no initial data exploration [...]
How to get help
Consult the module’s lectures, tutorial handouts, and the references recommended
in the module guide.