Assignment 2 Task Sheet
Assignment 2
(20 marks)
Due: 21:00 AEST, Friday 23 September 2022
Aim
This assignment aims to provide students with essential experience conducting big data
analytics experiments with the R or the Python programming language. In this assignment,
you should
• carry out big data analytics by following the Big Data Analytics Lifecycle,
• appropriately choose, apply and evaluate core models/algorithms and analytics
techniques to complete the analysis tasks,
• understand and integrate the knowledge and skills learned in this subject, including
big data analytics lifecycle, data preparation, clustering, classification, regression,
association rules, data/model evaluation, data visualization and text processing.
Group work: You are to work on this assignment as a group. Each group is to work
independently from other groups on this assignment. Groups and group memberships are
as specified on Moodle. All group members are expected to contribute to this assignment.
Group members may use communication tools (e.g., UOW Zoom, UOW Webex, UOW
Teams, Slack, Discord, WhatsApp, etc.) and online collaboration workspaces (e.g., UOW
OneDrive, Google Drive, GitHub, ZenHub, etc.) to complete the assignment. Please plan
before starting the assignment, then keep a detailed digital work log and timesheet for each
group member. A justification and/or explanation must accompany every answer in this
assignment. One submission per group only.
Penalties: If a group member fails to make a minimum contribution, the member will be
awarded zero marks. Claims that a member contributed little or nothing should be supported
by evidence such as a work log. Plagiarism of any part of this assignment will result in zero
marks being awarded to the whole group.
Preliminaries
Read through the lecture slides, lab instructions and the recommended readings for Weeks
1–9. Conduct relevant background studies. You should use either R or Python for the tasks in
this assignment. You can use any publicly accessible toolbox or library for R and Python.
Your submission must include the source code file(s) which, when run, would re-create all
your results.
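As an illustration only, the reproducibility requirement above might be met by fixing a random seed wherever the analysis involves randomness (e.g., a train/test split). The sketch below uses only the Python standard library; the seed value, the function name split_rows and the 20% test fraction are hypothetical choices for illustration, not requirements of this assignment.

```python
import random

SEED = 2022  # hypothetical value; any fixed integer gives repeatable runs


def split_rows(rows, test_fraction=0.2, seed=SEED):
    """Shuffle rows with a seeded generator and split them into train/test lists."""
    rng = random.Random(seed)   # local generator, independent of global random state
    shuffled = list(rows)       # copy so the caller's data is not modified
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    # First n_test shuffled rows form the test set, the rest the training set.
    return shuffled[n_test:], shuffled[:n_test]


if __name__ == "__main__":
    train, test = split_rows(range(10))
    print(len(train), len(test))
```

Because the generator is seeded, running the same source file twice yields the same split. If you use third-party libraries, their own random generators may need to be seeded separately.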
Task 1: Design a big data analytics project by following the Big Data Analytics Lifecycle. (3 marks)
Task 2: Process data of different types and with different properties, and apply the
corresponding (mandatory) core models/algorithms: regression, association rules,
clustering, classification, and text processing. (10 marks)
Task 3: Visualize the data and use visualization to support your evaluations. (5 marks)
Task 4: Study factors in multiple views (e.g., text, color, tweet, etc.) and make suggestions to
amend switching between non-human and human profiles. (2 marks)
A report is required that summarizes Tasks 1-4 in a well-organized way and cites the
articles and programming resources you refer to. Tasks 2-4 need R/Python programming
to support your analysis.
Important:
• The report must be in a single file and in .pdf or .ipynb format. The title page
must list the full name and student ID of all members in the group. Clearly indicate
any members who did not make a minimum contribution.
• The report does not have a page limit.
• Marks will be deducted for incomplete or vague descriptions.
• Sufficient, suitable, and legible annotation shall be provided in your code to make it
easy to understand. Marks will be deducted for untidy code, code that is difficult to
read, code that does not run, or code that does not reproduce the results in your
report.
Note: Failure of your code to run may attract zero marks. Plagiarism of any part of your
code or any part of your report will attract zero marks for this assignment. It is the
responsibility of the group to ensure that your submission does not contain plagiarized
material. You may be requested to demonstrate and explain your program or explain your
answers in the report. Marks will be deducted if you are unable to offer an explanation. Marks
will be awarded for correct design, implementation, style, completeness, and justification.
---------------------------------------------------------- END----------------------------------------------------------