CETM 24 Part 2
Skills
S1 Select and apply key analytical techniques (e.g. traditional and intelligent analytics) in order to be
able to conduct a big data analysis across the whole data science lifecycle on modern data science
platforms and with data science programming languages
S2 Conduct pre-processing, data fusion and data analysis on a wide variety of data sets and report the
results.
Important Information
You are required to submit your work within the bounds of the University Infringement of Assessment
Regulations (see your Programme Guide). Plagiarism, paraphrasing and downloading large amounts of
information from external sources will not be tolerated and will be dealt with severely. You
should make full use of any source material, but this would normally be an occasional sentence and/or
paragraph (referenced) followed by your own critical analysis/evaluation. You will receive no marks for
work that is not your own. Your work may be subject to checks for originality, which can include use of
an electronic plagiarism detection service.
Where you are asked to submit an individual piece of work, the work must be entirely your own. The
safety of your assessments is your responsibility. You must not permit another student access to your
work. Where referencing is required, unless otherwise stated, the Harvard referencing system must be
used (see your Programme Guide). Please ensure that you retain a duplicate of your assignment. We
are required to send samples of student work to the external examiners for moderation purposes. Retaining
a duplicate will also safeguard you in the unlikely event of your work going astray.
See Canvas
Submission Location
Via Canvas
The assignment will be assessed and counts for 50% of the marks for module CETM24.
Part 1: Report (40 marks)
You are required to produce a report of up to 2,000 words (maximum) on the subject of anomaly detection. NO
MORE THAN 10 PAGES FOR THE MAIN BODY; ANY PAGES OVER THE LIMIT WILL NOT BE MARKED!
SOURCE CODE LISTINGS AT BACK OF REPORT WILL NOT COUNT TOWARDS THE LIMIT. Choose an anomaly detection
algorithm that you find interesting and explain how this particular approach has been successfully deployed
in data mining. You are free to choose any application area, such as finance, health, fraud, stock market,
military, engineering, social media, prediction, telecoms or planning. The report should discuss
your data analysis and use R outputs, R code snippets and diagrams to assist your explanations.
Based on the machine learning method selected, implement a data mining analysis using this method on
your selected data set.
State where you obtained or simulated your data, the R packages you have used and any source code you
have used from others. Also, place a full, well-formatted R source listing at the back of the report; it
will not add to the word count or page count.
You can refer to any of your course handouts, any other books, journals, online resources etc.
Output