Assignment 2 - Implementation-1

This document outlines an assignment for a university data science course. It will contribute 65% to the student's final module mark. Students are required to conduct data analysis on a provided dataset using R coding and submit a short report. The report should include sections on the introduction, data used, machine learning methods, practical preprocessing and implementation, results, and conclusions. Screenshots of code and outputs are to be included. Students must follow submission guidelines, formatting, and academic integrity policies.

Uploaded by

Ankush Dhiman
Copyright
© All Rights Reserved

University of Sunderland

School of Computer Science

CETM72 Assignment 2 – Implementation and report


Data Science Principles
This assignment contributes 65% to your final module mark.

The following learning outcomes will be assessed:

1. Critically select and apply key machine learning and statistical techniques for data analytics
projects across the whole data science lifecycle on modern data science platforms and with
data science programming languages.
2. Appropriately characterize the types of data; to perform the pre-processing, transformation,
fusion, analysis of a wide range type of data; and to visualize and report the results of the
analysis of various types of data.

Important Information

You are required to submit your work within the bounds of the University Infringement of
Assessment Regulations (see Programme Guide). Plagiarism, paraphrasing and downloading large
amounts of information from external sources will not be tolerated and will be dealt with severely.
You should make full use of any source material, but this would normally be an occasional
sentence and/or paragraph (referenced) followed by your own critical analysis/evaluation. You will
receive no marks for work that is not your own. Your work may be subject to checks for originality
which can include use of an electronic plagiarism detection service.

Where you are asked to submit an individual piece of work, the work must be entirely your own.
The safety of your assessments is your responsibility. You must not permit another student access
to your work.

Where referencing is required, unless otherwise stated, the Harvard referencing system must be
used (see your Programme Guide).

Please ensure that you retain a duplicate of your assignment. We are required to send samples of
student work to the external examiners for moderation purposes. It will also safeguard you in the
unlikely event of your work going astray.

Submission Date and Time: As advised on Canvas

Submission Location: Via Canvas
Your Task

THIS ASSIGNMENT REQUIRES R CODING AND A SHORT REPORT (65% of module marks)
Your task is to conduct data analysis on a given data set. To help you in this task please look over our
past RStudio activities where we loaded in data, pre-processed it, trained machine learning
algorithms on the data and plotted the results.

The first part of the report is simply text describing the introduction, the application area and data
to be used, and the machine learning algorithms to be used.

What I expect to see for the practical implementation part of the report are screenshots of your
code in the RStudio script editor, screenshots of key outputs and screenshots of important diagrams,
along with text to describe what I’m seeing and identify any salient points. The presentation of your
practical work should be identical to the way I’ve presented the Activities in R over the last seven
weeks. You need to use the Snipping Tool in Windows or similar to grab screenshots of selected areas.

Finally, write up your work in a 1,500 word (+/- 10%) report.

Student Information

The report should include the following headings:

Report – (40 marks)

Introduction (10 marks)

Application area and data (10 marks)

Machine learning algorithms (10 marks)

Conclusion, structure of report, including refs (10 marks)

Practical Implementation – (60 marks)

Pre-processing on real or simulated data (10 marks)

R Programming content and your function (20 marks)

Display of data/results (20 marks)

Source code listing (10 marks)

State where you obtained or simulated your data, the R packages you have used, and any source code
you have used from others. Also, place a full R source listing at the back of the report. It will not add
to the word count, but DO NOT go over the page count of 15 pages.

You can refer to any of your course handouts, any other books, journals, online resources etc. 
1. Introduction
Your introduction should include a summary of the main points that you will discuss in your
report. Your report should outline the area your data is from and what you hope to achieve.
Your introduction should be about 150 words in length.

2. Data used
The purpose of this section is to ensure you understand the types of data and the pre-processing
you will use. Provide details of where your data was obtained. What types of variables will be
used (integers, dates, strings, etc.)? Provide literature and examples associated with your chosen
topic. Do not use the iris or mtcars data and please check with your tutor the suitability of your
chosen data. This section should be approximately 150 words.

3. Machine learning methods used


In this section you should identify the machine learning methods that you will apply to the data
from your chosen topic. What criteria will be used to measure the success of the machine
learning methods? This section should be approximately 150 words.

4. Practical: Pre-processing of data


In this section you should discuss how the data was read in, what pre-processing (if any) occurred
and why you did it. Show me screenshots of code with your text write-up. This section should
be no more than 150 words in length.
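
As an illustration only, the kind of pre-processing described here could be sketched in base R as follows. The data frame, its column names and the placeholder file name are all invented for this example, and dropping incomplete rows is just one possible strategy; your own data will need its own justified choices.

```r
# Hypothetical simulated data; the column names are illustrative only.
# With real data you would read a file instead, e.g. read.csv("my_data.csv")
# (the file name is a placeholder, not a file supplied with the module).
raw <- data.frame(
  age    = c(23, 35, NA, 41, 29, 52),
  income = c(21000, 48000, 39000, NA, 30500, 61000),
  group  = c("a", "b", "b", "a", "c", "a"),
  stringsAsFactors = FALSE
)

# 1. Inspect the structure and count missing values per column.
str(raw)
na_counts <- colSums(is.na(raw))
print(na_counts)

# 2. Drop rows containing any missing value (one simple strategy among several).
clean <- na.omit(raw)

# 3. Convert the string column to a factor so models treat it as categorical.
clean$group <- factor(clean$group)

# 4. Standardise the numeric columns to mean 0 and standard deviation 1.
num_cols <- c("age", "income")
clean[num_cols] <- lapply(clean[num_cols], function(x) as.numeric(scale(x)))

summary(clean)
```

Each numbered step is something you would explain briefly in the write-up: what was done and why.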

5. Practical: R Programming content


In this section you should show me screenshots of code with your text write-up. The R
programming content can include building your machine learning models, testing of models
and, perhaps, a comparison of several models. I would also like to see an R
function written by you. The source code should be neat and tidy; use comments where
necessary to explain the main actions of your code. This section should be no more than 300
words in length.
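
A minimal sketch of what this section asks for, assuming simulated data and a logistic regression fitted with base R's glm(), is given below. The variable names, the 70/30 split and the function classification_accuracy() are all hypothetical choices made for the example; they are not a required method, and other models or metrics may suit your data better.

```r
# Simulated two-class data (hypothetical; not a real data set).
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- factor(ifelse(x1 + x2 + rnorm(n, sd = 0.5) > 0, "yes", "no"))
dat <- data.frame(x1, x2, y)

# User-written function: proportion of predictions that match the true labels.
classification_accuracy <- function(actual, predicted) {
  stopifnot(length(actual) == length(predicted))
  mean(as.character(actual) == as.character(predicted))
}

# Split the rows into a 70% training set and a 30% test set.
train_idx <- sample(seq_len(n), size = round(0.7 * n))
train <- dat[train_idx, ]
test  <- dat[-train_idx, ]

# Fit a logistic regression; with a factor response, glm() models the
# probability of the second level ("yes").
model <- glm(y ~ x1 + x2, data = train, family = binomial)

# Predict class probabilities on the held-out rows and threshold at 0.5.
prob <- predict(model, newdata = test, type = "response")
pred <- ifelse(prob > 0.5, "yes", "no")

# Evaluate with the user-written function.
acc <- classification_accuracy(test$y, pred)
print(acc)
```

Evaluating on rows the model has not seen gives a fairer measure of success than accuracy on the training data, and the same function can be reused to compare several models.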

6. Practical: Display of data/results


In this section you should use screenshots of key R output, important diagrams and anything to do
with your machine learning models, along with text descriptions of the outputs. It should be no
more than 300 words in length.
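
One possible way to produce a key output and a diagram is sketched below, using a confusion matrix built with table() and a grouped bar chart. The labels are made-up values for illustration, and the plot is written to a temporary PDF only so that the sketch runs non-interactively; in RStudio you would normally draw it in the Plots pane and take a screenshot.

```r
# Hypothetical true and predicted labels (made-up values for illustration).
actual    <- factor(c("yes", "no", "yes", "yes", "no", "no", "yes", "no"))
predicted <- factor(c("yes", "no", "no", "yes", "no", "yes", "yes", "no"))

# Confusion matrix: rows are the true classes, columns the predicted classes.
conf_mat <- table(Actual = actual, Predicted = predicted)
print(conf_mat)

# Overall accuracy is the share of cases on the diagonal.
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
print(accuracy)

# Draw a grouped bar chart of the confusion matrix into a temporary PDF file.
out_file <- file.path(tempdir(), "confusion_matrix.pdf")
pdf(out_file)
barplot(conf_mat,
        beside = TRUE,
        legend.text = TRUE,
        args.legend = list(title = "Actual"),
        xlab = "Predicted class",
        ylab = "Count",
        main = "Predicted versus actual classes")
dev.off()
```

A short caption stating what the axes show and what the reader should notice is the kind of descriptive text expected alongside each diagram.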

7. Source code listing


This includes all your R code including the library commands. I expect to be able to obtain your
data, load in the libraries you have used and copy and paste and run your analysis.

8. Conclusions
In this section you should summarise your experimental results and findings. This section should
be approximately 150 words.

9. References and look and feel of report


These should be to Harvard standards (not included in the word count, but there should be between 5
and 10 references). References should be valid and appropriate. The formatting of the report should be
neat and tidy. Diagrams should be used with good descriptive text. Diagrams should be easy to
read, and a sensible number used (no more than 6-7). No more than 15 pages in total
for everything, including source code listings; put the source code listing in font size 10.

The word counts for the sections are just advisory based on marks allocated.
Submission Guidelines

Your report should be spell checked and contain references. You must use the Harvard style of
referencing, both for citations within the text and your reference list. It is important that you read
thoroughly the information on the cover sheet regarding the university assessment regulations,
including those regarding plagiarism and collusion. Assignment hand-in requirements are specified
on the front cover sheet. The approximate time you should spend on this assignment is 30-50 hours.
Your assignment must be handed in before the time specified. Your assignment will be assessed
according to the University’s Postgraduate Generic Assessment Criteria, which are provided on the
following pages.
CETM72 Data Science Principles – Assignment 2 Marking Sheet

Name:
Student Registration Number:

Total Mark (%):

Categories: Relevance; Knowledge; Analysis; Argument and Structure; Critical Evaluation; Presentation; Relevance to Literature

Pass

86 – 100%
The work examined is exemplary and provides clear evidence of a complete grasp of the knowledge, understanding and skills appropriate to the Level of the qualification. There is also unequivocal evidence showing that all the learning outcomes and responsibilities appropriate to that Level are fully satisfied. At this level it is expected that the work will be exemplary in all the categories cited above. It will demonstrate a particularly compelling evaluation, originality, and elegance of argument, interpretation or discourse.

76 – 85%
The work examined is excellent and demonstrates comprehensive knowledge, understanding and skills appropriate to the Level of the qualification. There is also excellent evidence showing that all the learning outcomes and responsibilities appropriate to that level are fully satisfied. At this level it is expected that the work will be excellent in the majority of the categories cited above or by demonstrating particularly compelling evaluation and elegance of argument, interpretation or discourse and some evidence of originality.

70 – 75%
The work examined is of a high standard and there is evidence of comprehensive knowledge, understanding and skills appropriate to the Level of the qualification. There is clearly articulated evidence demonstrating that all the learning outcomes and responsibilities appropriate to that level are satisfied. At this level it is expected that the standard of the work will be high in the majority of the categories cited above or by demonstrating particularly compelling evaluation and elegance of argument, interpretation or discourse.

60 – 69%
Relevance: Directly relevant to the requirements of the assessment.
Knowledge: A substantial knowledge of relevant material, showing a clear grasp of themes, questions and issues therein.
Analysis: Comprehensive analysis - clear and orderly presentation.
Argument and Structure: Well supported, focussed argument which is clear and logically structured.
Critical Evaluation: Contains distinctive or independent thinking; and begins to formulate an independent position in relation to theory and/or practice.
Presentation: Well written, with standard spelling and grammar, in a readable style with acceptable format.
Relevance to Literature: Critical appraisal of up-to-date and/or appropriate literature. Recognition of different perspectives. Very good use of a wide range of sophisticated source material.

50 – 59%
Relevance: Some attempt to address the requirements of the assessment: may drift away from this in less focused passages.
Knowledge: Adequate knowledge of a fair range of relevant material, with intermittent evidence of an appreciation of its significance.
Analysis: Significant analytical treatment which has a clear purpose.
Argument and Structure: Generally coherent and logically structured, using an appropriate mode of argument and/or theoretical mode(s).
Critical Evaluation: May contain some distinctive or independent thinking; may begin to formulate an independent position in relation to theory and/or practice.
Presentation: Competently written, with only minor lapses from standard grammar, with acceptable format.
Relevance to Literature: Uses a good variety of literature which includes recent texts and/or appropriate literature, including a substantive amount beyond library texts. Competent use of source material.

40 – 49%
Relevance: Some correlation with the requirements of the assessment but there are instances of irrelevance.
Knowledge: Basic understanding of the subject but addressing a limited range of material.
Analysis: Some analytical treatment, but may be prone to description, or to narrative, which lacks clear analytical purpose.
Argument and Structure: Some attempt to construct a coherent argument, but may suffer loss of focus and consistency, with issues at stake stated only vaguely, or theoretical mode(s) couched in simplistic terms.
Critical Evaluation: Sound work which expresses a coherent position only in broad terms and in uncritical conformity to one or more standard views of the topic.
Presentation: A simple basic style but with significant deficiencies in expression or format that may pose obstacles for the reader.
Relevance to Literature: Evidence of use of appropriate literature which goes beyond that referred to by the tutor. Frequently only uses a single source to support a point.

Fail

35 – 39%
Relevance: Relevance to the requirements of the assessment may be very intermittent, and may be reduced to its vaguest and least challenging terms.
Knowledge: A limited understanding of a narrow range of material.
Analysis: Largely descriptive or narrative, with little evidence of analysis.
Argument and Structure: A basic argument is evident, but mainly supported by assertion and there may be a lack of clarity and coherence.
Critical Evaluation: Some evidence of a view starting to be formed but mainly derivative.
Presentation: Numerous deficiencies in expression and presentation; the writer may achieve clarity (if at all) only by using a simplistic or repetitious style.
Relevance to Literature: Barely adequate use of literature. Over reliance on material provided by the tutor.

30 – 34%
The work examined provides insufficient evidence of the knowledge, understanding and skills appropriate to the Level of the qualification. The evidence provided shows that some of the learning outcomes and responsibilities appropriate to that Level are satisfied. The work will be weak in some of the indicators.

15 – 29%
The work examined is unacceptable and provides little evidence of the knowledge, understanding and skills appropriate to the Level of the qualification. The evidence shows that few of the learning outcomes and responsibilities appropriate to that Level are satisfied. The work will be weak in several of the indicators.

0 – 14%
The work examined is unacceptable and provides almost no evidence of the knowledge, understanding and skills appropriate to the Level of the qualification. The evidence fails to show that any of the learning outcomes and responsibilities appropriate to that Level are satisfied. The work will be weak in the majority or all of the indicators.

Comments
