Assignment 2 Data Science Application Project

Uploaded by

bhattibaba118
Copyright
© All Rights Reserved

Assignment 2: Data science application project (individual assignment)

Students will work independently to perform the entire data science pipeline
on a given real-world dementia dataset using R. You will be required to
describe the entire project in a detailed report and submit the code.

The data set used in this study was obtained from a mobile health care service
offered in collaboration with non-governmental organizations that run elderly
care centers. This service was provided to elderly people residing in various
districts of Hong Kong for free from 2008 to 2018. The data set consists of 2299
cases, each of which includes eleven variables. These variables include age,
body height, body weight, education level, financial support, geriatric
depression scale score, out-of-pocket financial source (whether they were
independent or dependent on family), marital status, Mini Nutritional
Assessment part A score, and Mini Nutritional Assessment part B score. The
outcome labels were based on the categories of the Mini Mental State Exam.

Assignment guidelines:
• Each student is required to submit one project report as a Word
document, together with R files that reproduce all the results in
the report.
• R is the only accepted programming language for this assignment. You
must use R to complete all tasks and analyses.

Project report guidelines:
• Do not include code snippets of any kind in the report. All code
should be included solely in the submitted R files.
• Word limit: 800 words (within +/- 10% of this limit),
excluding references, figures, and tables. The report should be formatted in
Times New Roman, 12-point font, with normal margins (from the Word
'Layout' menu, choose 'Normal').
• Note that 800 words is a relatively short length for a project report, so
it is important to write clearly and concisely and to make
maximum use of well-designed visualizations to convey information
efficiently and with impact.

The report should follow this outline:

Introduction: Introduce the topic of the data science project, including the
problem statement and the goals that the project aims to achieve.

Dataset description: Provide background information on the dataset used
in the project, including its source and any relevant characteristics. Include
summary statistics to give readers an overview of the data.

Data pre-processing: Explain any pre-processing steps that were necessary
for the dataset and justify why they were performed. This section should
consider steps such as cleaning, transforming or encoding the data.
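
The cleaning, encoding and transforming steps named above could look roughly like the following minimal sketch. It runs on a small toy data frame, not the real dataset; the column names (age, education, gds_score) and the choice to drop incomplete rows are illustrative assumptions, not requirements of this assignment.

```r
# Toy data standing in for the real dataset; all column names are hypothetical.
toy <- data.frame(
  age       = c(72, 85, NA, 90, 68),
  education = c("primary", "none", "secondary", "primary", NA),
  gds_score = c(3, 7, 5, NA, 2),
  stringsAsFactors = FALSE
)

# Cleaning: count missing values per column before deciding how to handle them.
missing_counts <- colSums(is.na(toy))

# One simple option (an assumption, not a requirement): drop incomplete rows.
clean <- toy[complete.cases(toy), ]

# Encoding: convert the categorical column to a factor with explicit levels.
clean$education <- factor(clean$education,
                          levels = c("none", "primary", "secondary"))

# Transforming: standardise a numeric column to mean 0 and standard deviation 1.
clean$age_scaled <- as.numeric(scale(clean$age))
```

Whichever steps you choose, the report should justify them; dropping rows, for example, is only one of several ways to treat missing values.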

Exploratory data analysis: Perform preliminary investigations on the
dataset using summary statistics and visualizations. This section should
provide insights into the dataset and help identify any potential patterns or
trends.
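
A preliminary investigation of this kind might be sketched as follows in base R. The data are randomly generated toy values, and the variable names (age, group) and group labels are hypothetical placeholders for whatever the real dataset contains.

```r
# Toy data only; the variable names and group labels are hypothetical.
set.seed(1)
toy <- data.frame(
  age   = round(rnorm(60, mean = 80, sd = 7)),
  group = factor(rep(c("normal", "impaired"), each = 30))
)

# Summary statistics for a numeric variable.
summary(toy$age)

# Mean of the numeric variable within each outcome group.
group_means <- aggregate(age ~ group, data = toy, FUN = mean)

# Class balance of the outcome label.
class_counts <- table(toy$group)

# A simple visualization comparing the distribution across groups.
boxplot(age ~ group, data = toy,
        xlab = "Outcome group (hypothetical)",
        ylab = "Age (years)",
        main = "Age by outcome group (toy data)")
```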

Prediction modelling: Select two prediction models and apply them to
the given dataset. This section should include brief information
on the selected models and explain why they are appropriate
for the dataset. Also evaluate the performance of the two models and
compare their results using appropriate performance metrics.
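
To illustrate what fitting and comparing two models can involve, here is a minimal base-R sketch on synthetic data. The two models (logistic regression and a hand-written k-nearest-neighbours classifier), the binary outcome, the 70/30 split, the accuracy metric and all variable names are assumptions chosen for the example; you remain free to choose other models and metrics.

```r
set.seed(42)

# Synthetic toy data; variable names and the binary outcome are hypothetical.
n <- 200
toy <- data.frame(age = rnorm(n, 80, 7), gds_score = rnorm(n, 5, 2))
p <- plogis(0.15 * (toy$age - 80) + 0.4 * (toy$gds_score - 5))
toy$impaired <- factor(rbinom(n, 1, p), levels = c(0, 1),
                       labels = c("no", "yes"))

# Hold out 30% of the rows for testing.
train_idx <- sample(n, size = round(0.7 * n))
train <- toy[train_idx, ]
test  <- toy[-train_idx, ]

# Model 1: logistic regression; predict "yes" when the probability exceeds 0.5.
fit_glm  <- glm(impaired ~ age + gds_score, data = train, family = binomial)
prob     <- predict(fit_glm, newdata = test, type = "response")
pred_glm <- factor(ifelse(prob > 0.5, "yes", "no"),
                   levels = levels(toy$impaired))

# Model 2: k-nearest neighbours, written by hand to avoid extra packages.
knn_predict <- function(train_x, train_y, test_x, k = 5) {
  apply(test_x, 1, function(row) {
    d <- sqrt(rowSums(sweep(train_x, 2, row)^2))  # Euclidean distances
    nearest <- train_y[order(d)[seq_len(k)]]      # labels of the k closest rows
    names(which.max(table(nearest)))              # majority vote
  })
}

# Scale the features using the training-set mean and standard deviation.
features <- c("age", "gds_score")
train_x  <- scale(train[, features])
test_x   <- scale(test[, features],
                  center = attr(train_x, "scaled:center"),
                  scale  = attr(train_x, "scaled:scale"))
pred_knn <- factor(knn_predict(train_x, train$impaired, test_x, k = 5),
                   levels = levels(toy$impaired))

# Compare both models on the same test set with the same metric.
accuracy <- function(pred, truth) mean(pred == truth)
results <- c(logistic = accuracy(pred_glm, test$impaired),
             knn      = accuracy(pred_knn, test$impaired))
confusion_glm <- table(predicted = pred_glm, actual = test$impaired)
print(results)
```

Accuracy alone can mislead when the outcome classes are imbalanced, so a confusion matrix or class-wise metrics may be worth reporting as well.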

Results and discussion: Analyze the results and discuss the findings in a
clear and engaging manner. This section should include visualizations and
any insights gleaned from the data.

Conclusion: Summarize the project, giving a concise overview of the
work and its key insights and conclusions.

In addition to the project report, we also require the submission of an R file
that includes the complete code, from data loading to prediction
modelling. The code should be well-organized, easy to follow, and produce the
same outcomes as presented in the project report.

R file guidelines:
• In your submitted code file, include comments to explain the purpose and
functionality of each section of code.
• Organize the code into clear sections, such as data cleaning, exploratory
data analysis and prediction model implementation.
• Use white space and indentation to enhance readability.
• Avoid overly complicated code; focus instead on writing clear,
concise code.
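
One possible way to lay out such a file is sketched below. The section names, the helper functions (load_data, clean_data, explore_data) and the toy CSV are hypothetical and only show the commenting and sectioning style; they are not a prescribed template.

```r
# ---- 1. Setup ----
set.seed(123)  # fix the random seed so that results are reproducible

# ---- 2. Data loading ----
# Read a CSV file into a data frame; text columns become factors.
load_data <- function(path) {
  read.csv(path, stringsAsFactors = TRUE)
}

# ---- 3. Data cleaning ----
# Keep only rows without missing values (one simple option among several).
clean_data <- function(df) {
  df[complete.cases(df), ]
}

# ---- 4. Exploratory data analysis ----
# Print summary statistics for every column.
explore_data <- function(df) {
  summary(df)
}

# ---- 5. Prediction modelling ----
# Fit and evaluate the two chosen models in this section.

# ---- Example run on a small temporary file (toy data, not the real dataset) ----
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(age = c(70, NA, 82), weight = c(55, 60, 48)),
          tmp, row.names = FALSE)
dat <- clean_data(load_data(tmp))
explore_data(dat)
```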