CWBrief

Uploaded by

Christopher Neo

This assignment is worth 25% of the total coursework for the module.

The assignment is divided into two parts. You should submit a single Jupyter Notebook and any related scripts
or SQL files as a single archive. The notebook should contain a description of your approach as well as any/all
processing used to manipulate, cleanse and sanitise the data for purpose. If your dataset exceeds 10MB, then
include a working sample of the data that can be used in place of the full dataset. Your project should focus on
one of the following themes.

Theme: Exercise and physical fitness
Scope: Projects should focus on a dataset around physical fitness and exercise.
Example project: ‘Are we getting stronger? Weightlifting world records from the last decade.’

Theme: Interactive Media
Scope: Projects should explore an area of interactive media such as computer games or music performance.
Example project: ‘Community and fan engagement in online game streaming platforms.’

Theme: Social Responsibility
Scope: Projects should focus on a dataset that explores social responsibility, such as antisocial crime or environmental wellbeing.
Example project: ‘Public transport systems in Europe. Finding a balance between convenience, affordability and efficiency.’

Deliverables

Part 1 (10%)
Your brief is to design a manageable data science project, and acquire the necessary dataset in a usable form.
You will need to submit your notebook as well as any resources that you have used throughout the exercise.

For Part 1 you should:


• Acquire and prepare your dataset. Here, you will be expected to seek out and find your own dataset –
presumably online. Be sure you’re allowed to share the data with others.
• Preparation might involve collation and/or manipulation of data into a usable format.
• It may involve creating a database or a flat file format to store and manage data.
• It may involve writing Python which produces a dummy dataset, for instance if you are working with sensitive
data. If this is your preferred option, you may need to think carefully about what you would expect the results of
an analysis to look like, so you can tailor the data accordingly (i.e. if generating random numbers, choose a
function or method which produces a realistic distribution, and perhaps a realistic amount of noise too.)
• Explain what programming techniques you have used in the preparation of your data (including any
command-line or SQL programming).
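
As an illustration of the dummy-dataset option above, the following is a minimal sketch using only the Python standard library. The column names (`year`, `lift_kg`), the trend of 1.5 kg per year and the noise level of 5 kg are all hypothetical values chosen for the example, not requirements of the brief. It shows one way to combine an expected underlying pattern with Gaussian noise and to store the result in a flat-file (CSV) format.

```python
import csv
import io
import random


def make_dummy_records(n_rows, seed=0):
    """Generate hypothetical lift records: a slow upward trend plus Gaussian noise."""
    rng = random.Random(seed)  # local generator, so the output is reproducible
    rows = []
    for i in range(n_rows):
        year = 2014 + (i % 10)               # ten hypothetical competition years
        trend = 200.0 + 1.5 * (year - 2014)  # assumed underlying trend in kg
        noise = rng.gauss(0.0, 5.0)          # assumed athlete-to-athlete variation
        rows.append({"year": year, "lift_kg": round(trend + noise, 1)})
    return rows


def to_csv_text(rows):
    """Serialise the rows to CSV text (a simple flat-file format)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["year", "lift_kg"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


records = make_dummy_records(500, seed=42)
csv_text = to_csv_text(records)
```

Fixing the seed means anyone re-running the notebook gets the same dummy data, which makes the later analysis easier to check.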

For your chosen topic you should:


• Outline the idea behind your project (i.e. its context.)
• Briefly detail what you intend to do with the data. You are not expected to explain the exact techniques you
will use, though it is important that you identify your process as you work.
• You should carefully consider any weaknesses or potential caveats in your approach and present these too.

Part 2 (15%)
You should aim to carry out the main body of work in analysing your data and producing relevant outputs.
You should summarise the project, report on your findings, and discuss any new insights/grounds for further
work.

For Part 2 you should show evidence of the following:


• That an appropriate dataset has been obtained.
• There is evidence that the data has undergone some form of manipulation to get it in a suitable format.
• Some meaningful statistical values or metrics have been derived from the data (e.g. descriptives/summary
statistics.)
• Visualisation(s) have been created programmatically, and reveal aspects of
the dataset.
• A report has been produced in a notebook format:
– The report outlines the project, and summarises key findings.
– The report includes visualisation(s) and some discussion thereof.
– The report highlights areas of weakness/uncertainty, and suggests
further work.
• The related code is written in Python and submitted inside the notebook.
• Some or all of the techniques taught on the module have been demonstrated.
• There has been use and some extension of the techniques taught in the
module.
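
As an example of the descriptive/summary statistics mentioned in the list above, here is a minimal sketch using the standard-library `statistics` module. The `describe` helper and the `sample` values are hypothetical and only illustrate the kind of metrics that might be reported; a library such as pandas could be used instead.

```python
import statistics


def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    ordered = sorted(values)
    # quantiles(n=4) returns the three cut points between quartiles
    q1, _, q3 = statistics.quantiles(ordered, n=4)
    return {
        "count": len(ordered),
        "mean": statistics.mean(ordered),
        "std": statistics.stdev(ordered),   # sample standard deviation
        "min": ordered[0],
        "q1": q1,
        "median": statistics.median(ordered),
        "q3": q3,
        "max": ordered[-1],
    }


# Hypothetical values standing in for one numeric column of a dataset
sample = [201.5, 198.0, 205.5, 210.0, 207.5, 203.0, 212.5, 209.0]
summary = describe(sample)
```

Reporting the spread (standard deviation and quartiles) alongside the mean helps when discussing weaknesses or uncertainty in the findings.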

Report Format

• A report presented in a single notebook. This should include:


– introduction/context
– brief description of data set (or output a sample), including relevant
information (i.e. how it was obtained)
– data visualisations
– summary of key findings/insights
– some form of discussion/critical analysis
– conclusion and further work
– references to any resources used
• Visualisations can be presented inline in the Notebook, or in separately
exported files (e.g. PNGs)
• You should include a working sample of your dataset, not exceeding 10MB.
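
One possible way to produce a working sample within the size limit is sketched below. It assumes the data is held as a list of CSV text lines with a header row, and it reads "10MB" as 10,000,000 bytes (the more conservative interpretation). The function name `sample_csv_lines` is hypothetical.

```python
import random

MAX_BYTES = 10 * 1000 * 1000  # assumption: "10MB" taken as 10,000,000 bytes


def sample_csv_lines(lines, max_bytes=MAX_BYTES, seed=0):
    """Keep the header plus a random subset of rows that fits within max_bytes."""
    header, rows = lines[0], lines[1:]
    rng = random.Random(seed)   # seeded so the same sample can be regenerated
    shuffled = rows[:]          # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)

    kept = []
    used = len(header.encode("utf-8"))
    for row in shuffled:
        size = len(row.encode("utf-8"))
        if used + size > max_bytes:
            break               # stop once the next row would exceed the budget
        kept.append(row)
        used += size
    return [header] + kept
```

Sampling rows at random, rather than taking the first rows of the file, keeps the sample closer to the overall distribution when the full dataset is ordered (for example by date).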
