MANM528 Individual Assessment Brief-2024-02-12


Assessment Information/Brief 2023-24

Module title Data mining and text analytics

With application in SAS

Level 7

Assessment title Individual Assignment

Exploring Road Traffic Accident Data and Text Analytics Insights

Weighting within module This assessment is worth 100% of the overall module mark.

Submission deadline date and time Tuesday 28th May 2024 at 4pm

Set by Module leader: Abdolreza Roshani

How to submit Submit on SurreyLearn

Assessment task details and instructions

Module Overview:
This module provides an in-depth introduction to the data mining
process and its applications in the fields of business and management.
Students will learn a range of techniques and tools for collecting,
accessing, and analysing data. Special attention will be given to text
mining and web analytics. Additionally, the module explores the
practical use of data mining models in real-world scenarios.

Assignment Description:
For this assignment, you will work with a comprehensive dataset
comprising real data collected from road traffic accidents in the UK.
This dataset includes detailed information about personal injury road
collisions in the Surrey area during the year 2022. To assist you in your
analysis, a data dictionary file, "RoadAccident-2021-Guide.xlsx", is
provided, offering in-depth definitions for all fields.

Task 1 – Data Exploration and Cleaning [20 marks]

The objective of this task is to enhance your skills in data exploration,
visualization, summary statistics generation, and data cleaning. You
should:
• Load the dataset “RoadAccident-2021_Surrey.csv” and conduct an
exploratory data analysis of the dataset to gain insights into its
structure, content, and quality.
• Generate summary statistics for key variables to understand the
data's central tendencies and dispersion.
• Visualize the data using appropriate plots and charts to identify
patterns, outliers, and potential relationships between variables.
• Identify any data quality issues, including missing values, incorrect
data, and outliers.
• Develop a data cleaning strategy to address the identified issues.
• Execute the data cleaning process, which may involve imputing
missing values and addressing outliers.
Write a comprehensive report for this task, including clear
explanations of the steps taken.
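
As an illustration only, the summary statistics, outlier identification, and imputation steps listed for Task 1 could be sketched in Python as below. The module itself uses SAS, so this is not a required or recommended approach. The variable name speed_limit and its values are hypothetical and are not taken from the dataset, and the 1.5 x IQR rule and median imputation are just one possible cleaning strategy among several.

```python
import statistics

# Hypothetical toy values standing in for one numeric column.
# None marks a missing entry. These are NOT values from the real dataset.
speed_limit = [30, 30, 40, None, 60, 30, 70, 300, 40, None]

def summarise(values):
    """Count, missing count, mean, median and sample standard deviation."""
    present = [v for v in values if v is not None]
    return {
        "n": len(present),
        "missing": len(values) - len(present),
        "mean": statistics.mean(present),
        "median": statistics.median(present),
        "stdev": statistics.stdev(present),
    }

def iqr_bounds(values, k=1.5):
    """Tukey fences: anything outside [Q1 - k*IQR, Q3 + k*IQR] is flagged."""
    present = sorted(v for v in values if v is not None)
    q1, _, q3 = statistics.quantiles(present, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def clean(values):
    """Treat flagged outliers as missing, then impute with the median."""
    low, high = iqr_bounds(values)
    kept = [v if v is not None and low <= v <= high else None for v in values]
    fill = statistics.median(v for v in kept if v is not None)
    return [fill if v is None else v for v in kept]

print(summarise(speed_limit))
print(iqr_bounds(speed_limit))
print(clean(speed_limit))
```

Whatever tool is used, the report should state which rule flagged a value as an outlier and why a particular imputation method was chosen, since both choices affect the later modelling.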

Task 2 – Predicting Accident Severity [30 marks]

In this task, you will apply machine learning techniques to predict
accident severity using the dataset. You should:
• Develop a scenario and select appropriate variables for predicting
accident severity.
• Explore the use of at least two predictive models (e.g., neural
networks, decision trees, logistic regression, etc.).
• Provide a comparative analysis of the performance of these
models, discussing their strengths and weaknesses.
• Interpret the results obtained from both models and draw insights
from their outputs.
• Analyse the importance of different features in predicting accident
severity for each model.
• Summarize your findings and provide a conclusion regarding the
effectiveness of each model.
• Offer recommendations or insights for improving road safety
based on your analysis.
Write a comprehensive report for this task, including clear
explanations of the modelling process and results.
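
As an illustration only, the comparative analysis asked for in Task 2 could be sketched in Python as below. The module itself uses SAS, so this is not a required approach. The severity labels (Slight, Serious, Fatal) are an assumption about how the target is coded, and the actual values and both sets of predictions are invented for the example. The sketch compares the outputs of two hypothetical models on overall accuracy and on per-class recall.

```python
from collections import Counter

# Hypothetical labels and predictions, invented for illustration only.
actual  = ["Slight", "Slight", "Serious", "Slight", "Fatal", "Serious", "Slight", "Slight"]
model_a = ["Slight", "Slight", "Slight", "Slight", "Slight", "Serious", "Slight", "Slight"]
model_b = ["Slight", "Serious", "Serious", "Slight", "Fatal", "Slight", "Slight", "Serious"]

def accuracy(y_true, y_pred):
    """Share of cases where the predicted class equals the actual class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall_by_class(y_true, y_pred):
    """For each actual class, the share of its cases predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}

def macro_recall(y_true, y_pred):
    """Unweighted mean of per-class recall, so rare classes count equally."""
    recalls = recall_by_class(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)

for name, preds in [("model A", model_a), ("model B", model_b)]:
    print(name, accuracy(actual, preds), recall_by_class(actual, preds),
          macro_recall(actual, preds))
```

In this toy example the model with the higher accuracy never identifies the rarest class, which is why a comparison based on accuracy alone can be misleading when severity classes are imbalanced.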

Task 3 – Text Analysis of Tweets [20 marks]

For this task, you will work with a dataset containing text data
collected from tweets related to road traffic accidents in the Surrey
area. Your tasks include:
• Loading and exploring the dataset to familiarize yourself with the
• Performing text preprocessing tasks such as removing special
characters, punctuation, tokenization, and handling start and stop
• Conducting an exploratory analysis of the text data, which may
involve calculating word frequency and visualizing word clouds.
• Performing sentiment analysis on the tweets to determine overall
sentiment (positive, negative, neutral) and providing visualizations
or summary statistics to illustrate sentiment distribution.
• Summarizing key insights and findings from your text analysis.
Write a comprehensive report for this task, including clear
explanations of the text analysis process and results.
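
As an illustration only, the preprocessing, word frequency, and sentiment steps listed for Task 3 could be sketched in Python as below. The module itself uses SAS, so this is not a required approach. The tweets, the stop-word list, and the positive and negative word lists are all hypothetical and far smaller than anything usable in practice. The sentiment rule (count of positive words minus count of negative words) is a deliberately simple lexicon-based stand-in, not the method of any particular tool.

```python
import re
from collections import Counter

# Hypothetical tweets and word lists, invented for illustration only.
tweets = [
    "Crash on the A3 near Guildford, long delays. Road closed!",
    "Road reopened and traffic is moving, thanks @example_user",
    "Roadworks in Woking start Monday https://example.com/info",
]
STOP_WORDS = {"the", "a", "on", "in", "is", "at", "to", "and", "of", "for"}
POSITIVE = {"safe", "cleared", "thanks", "reopened"}
NEGATIVE = {"crash", "delays", "closed", "injured"}

def tokenize(text):
    """Lowercase, drop URLs and @mentions, keep alphabetic tokens, remove stop words."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return [w for w in re.findall(r"[a-z]+", text) if w not in STOP_WORDS]

def sentiment(text):
    """Label a text by the sign of (positive word count - negative word count)."""
    tokens = tokenize(text)
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

# Word frequency across all tweets, and the distribution of sentiment labels.
word_counts = Counter(w for t in tweets for w in tokenize(t))
sentiment_counts = Counter(sentiment(t) for t in tweets)

print(word_counts.most_common(5))
print(sentiment_counts)
```

The word counts produced this way are the usual input to a word cloud, and the sentiment counts can be reported as a bar chart or a summary table.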

Task 4 – Decision-Maker's Summary and Recommendations [20 marks]

Based on the results from the previous tasks, you will write a concise
summary intended for decision-makers. This report should provide an
explanation of the dataset, the insights gained, and offer
recommendations or suggestions related to road traffic safety or
public awareness. Ensure that the report is presented professionally,
includes clear explanations, and incorporates visualizations to support
your recommendations. Avoid technical jargon.

General Assessment Criteria [10 marks]

The overall layout, storytelling, professionalism, and Harvard
Referencing will be assessed. Make sure your assignment adheres to
appropriate formatting and citation standards.

Assessed intended learning outcomes

On successful completion of this assessment, you will be able to:

Knowledge and Understanding
• Demonstrate an understanding of the data and resources available
on the web of relevance to business intelligence
• Demonstrate capability to access structured and unstructured data
• Apply the practical experience and the theoretical insight needed to
reveal patterns and valuable information hidden in large data sets
• Practice with leading data mining methods and their applications to
real-world problems
• Apply the fundamentals of business intelligence to business decisions

Practical, Professional or Subject Specific Skills
The assessment strategy is designed to provide students with the
opportunity to demonstrate the ability to analyse a large batch of
information to discern trends and patterns.

Module Aims
• Facilitate a comprehensive understanding of the various data
mining and web analytics techniques
• Familiarize students with data mining and web analytics tools
• Equip students with the skills to apply data mining and web
analytics techniques effectively with real data in a business context
for intelligence gathering and decision making.

What to deliver / Word count (if applicable)

You are required to submit:
1. One file (Word or PDF) containing two parts: a technical report
for Tasks 1, 2, and 3 (maximum of 3000 words, excluding the title
page, tables, figures, and appendix) and a managerial report for
Task 4 (maximum of 2 pages, including tables and figures, with no
appendix for this task).
2. A PDF file of your saved SAS project.

Feedback arrangements

Formative feedback is provided during the module; summative
feedback will be provided for the assignment.

L7 - Group marking matrix

Grade bands: Outstanding (90% – 100%), Excellent (80% – 89%), Very good (70% – 79%), Good (60% – 69%), Satisfactory (50% – 59%), Unsatisfactory (40% – 49%), Inadequate (30% – 39%), Poor (20% – 29%), Very poor (10% – 19%), Extremely poor (0% – 9%)

For each criterion, the descriptor is identical in every band apart from the band name, shown here as [Band].

Criteria | Marks | Descriptor
Task 1 - Business Intelligence | 30% | [Band] business intelligence report demonstrating actionable business insights.
Task 2 – Machine Learning & AI | 30% | [Band] understanding of Machine Learning & AI in a managerial business context.
Task 3 - Optimisations | 30% | [Band] insights into the use of optimisations for decision making.
Presentation, storytelling, Harvard Referencing | 10% | [Band] presentation, storytelling, and Harvard Referencing

