Assignment 2 Data Science Application Project
You will work independently to carry out the entire data science pipeline
on a given real-world dementia dataset using R. You must describe the
whole project in a detailed report and submit your code.
The dataset used in this study was obtained from a mobile health care service
offered in collaboration with non-governmental organizations that run elderly
care centers. The service was provided free of charge to elderly people living
in various districts of Hong Kong from 2008 to 2018. The dataset consists of
2299 cases, each with eleven variables. These include age, body height, body
weight, education level, financial support, Geriatric Depression Scale score,
out-of-pocket financial source (whether the person was independent or
dependent on family), marital status, Mini Nutritional Assessment part A
score, and Mini Nutritional Assessment part B score. The outcome labels are
based on the categories of the Mini-Mental State Examination.
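As a first step after loading the data, it can help to confirm that the data frame matches the description above (number of cases, number of variables, variable types, missing values). The following is only a minimal sketch in base R. The helper name summarise_structure, the column names and the category labels are hypothetical, and the toy data frame is an invented stand-in for the real file, whose name and layout are not given here.

```r
# Hypothetical helper: report the class and number of missing values
# for every column of a data frame.
summarise_structure <- function(df) {
  data.frame(
    variable  = names(df),
    class     = vapply(df, function(x) class(x)[1], character(1)),
    n_missing = vapply(df, function(x) sum(is.na(x)), integer(1)),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# Toy stand-in for the real dataset (values invented for illustration only).
toy <- data.frame(
  age             = c(78, 82, NA, 69),
  education_level = factor(c("primary", "none", "secondary", "primary")),
  mmse_category   = factor(c("normal", "mild", "normal", "severe"))
)

# The real data would be expected to have 2299 rows and 11 columns.
print(dim(toy))

structure_report <- summarise_structure(toy)
print(structure_report)
```

Running the same helper on the real data frame gives a quick view of which variables need type conversion or missing-value handling before any modelling.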
Assignment guidelines:
Each student must submit one project report as a Word document, together
with R files that reproduce all the results in the report.
R is the only accepted programming language for this assignment. You
must use R to complete all tasks and analyses.
Results and discussion: Analyze the results and discuss the findings in a
clear and engaging manner. This section should include visualizations and
any insights gleaned from the data.
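For the visualizations mentioned in the results and discussion item, base R graphics are often enough. The sketch below is one possible approach, not a required one: it draws a bar chart of outcome categories and a box plot of age by category, and saves both to a PDF so the figure can be regenerated from the script. The column names, the category labels and the toy values are assumptions made for illustration.

```r
# Toy stand-in data (invented values; column names are hypothetical).
toy <- data.frame(
  age = c(71, 84, 79, 90, 68, 75, 88, 81),
  mmse_category = factor(
    c("normal", "mild", "normal", "severe", "normal", "mild", "severe", "mild"),
    levels = c("normal", "mild", "severe")
  )
)

# Count the cases in each outcome category.
counts <- table(toy$mmse_category)

# Write both plots to one PDF file so the figure is reproducible.
plot_file <- file.path(tempdir(), "eda_plots.pdf")
pdf(plot_file, width = 9, height = 4.5)
par(mfrow = c(1, 2))  # two panels side by side

barplot(counts,
        main = "Cases per outcome category",
        xlab = "Outcome category",
        ylab = "Number of cases")

boxplot(age ~ mmse_category, data = toy,
        main = "Age by outcome category",
        xlab = "Outcome category",
        ylab = "Age (years)")

dev.off()  # close the device so the file is written to disk
```

Saving figures from code, rather than exporting them by hand, keeps the plots in the report consistent with the submitted R files.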
R file guidelines:
In your submitted code file, include comments to explain the purpose and
functionality of each section of code.
Organize the code into clear sections, such as data cleaning, exploratory
data analysis and prediction model implementation.
Use white space and indentation to enhance readability.
Avoid using overly complicated code, and instead focus on writing clear,
concise code.
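The guidelines above could be followed with a script laid out roughly as below. This is a sketch under stated assumptions, not a model answer: the data are simulated, the variable names (age, gds, impaired) are hypothetical, the outcome is simplified to two classes, and logistic regression via glm is used only as a simple placeholder model. Comment lines ending in several dashes are recognized by RStudio as foldable code sections.

```r
# ---- 0. Setup ----------------------------------------------------------------
set.seed(42)  # fix the random seed so every run gives the same results

# Simulated stand-in data; the real script would read the provided dataset.
n <- 200
dat <- data.frame(
  age      = round(rnorm(n, mean = 78, sd = 7)),
  gds      = sample(0:15, n, replace = TRUE),
  impaired = factor(sample(c("no", "yes"), n, replace = TRUE))
)
dat$age[sample(n, 5)] <- NA  # inject a few missing values for the demo

# ---- 1. Data cleaning --------------------------------------------------------
# Simplest possible strategy: drop rows with any missing value.
clean <- dat[complete.cases(dat), ]

# ---- 2. Exploratory data analysis --------------------------------------------
print(summary(clean))
print(table(clean$impaired))

# ---- 3. Prediction model -----------------------------------------------------
# Hold out 30% of the rows for testing.
train_idx <- sample(nrow(clean), size = floor(0.7 * nrow(clean)))
train <- clean[train_idx, ]
test  <- clean[-train_idx, ]

# Logistic regression as a placeholder classifier.
fit  <- glm(impaired ~ age + gds, data = train, family = binomial)
prob <- predict(fit, newdata = test, type = "response")
pred <- factor(ifelse(prob > 0.5, "yes", "no"),
               levels = levels(test$impaired))

# ---- 4. Evaluation -----------------------------------------------------------
confusion <- table(predicted = pred, actual = test$impaired)
accuracy  <- sum(diag(confusion)) / sum(confusion)
print(confusion)
print(accuracy)
```

Because the labels here are random, the accuracy is not meaningful. The point of the sketch is the layout: one commented section per pipeline stage, a fixed seed, and no manual steps between loading the data and printing the results.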