STAT1008 - S1 2024 - Assignment Instructions

Research School of Finance, Actuarial Studies and Statistics

Semester 1, 2024
STAT1008 Quantitative Research Methods
ASSIGNMENT

DUE DATE: Wednesday 15 May, 11:59pm

OBJECTIVES
The main goal of this assignment is to help you apply the statistical tools that you have learned in
this course, and to do so in a realistic data analysis setting. This assignment is a great way to solidify
your understanding of exploratory data analysis, a key preliminary step in any data analysis task. This
assignment will also give you hands-on practice in conducting statistical hypothesis testing. Hypothesis
testing supports evidence-based decision making, and this assignment will give you practice in carrying
out a hypothesis test of your choice and interpreting the results.

REQUIREMENTS
In this assignment you are required to analyse a dataset of your choice using the statistical tools and
methods discussed in Chapters 1, 2, 3, 7, 8, 9 and 10 of the textbook. You will need to write a report
to present and discuss the results of your analysis. You will need to decide on what data summaries and
graphical displays to produce, and what numerical descriptive measures to calculate for inclusion in your
report. You are required to carry out one hypothesis test on your chosen data set. You will need to
decide on what applied question you want to answer with your chosen data set, formulate the question in
terms of a hypothesis test, provide step-by-step details of your calculations, and interpret your analysis
results in relation to your original question of interest. Specific requirements of the report are described
in detail on page 3.

Listed below are a few applied questions to give you an example of the type of research questions that
you could investigate through the collection of relevant data, and running a hypothesis test:

• Do NBA players over 2 metres tall have a higher average two-point field goal percentage than NBA
players who are 2 metres or less in height?

• Do piano players have longer fingers than non-piano players?

• Do athletes sleep less than the recommended amount of 7 hours per night?

The dataset you choose could be one of academic or personal interest to you and can come from any field,
such as economics, law, medicine, education, sport, psychology or politics. The freedom to choose your
own dataset gives you the opportunity to stimulate your intellectual curiosity, so find data you are
excited to work with! Your analysis must be original and must not be copied from another source. Once
you have chosen your dataset, you may wish to confirm with your tutor or the course convenor that your
choice of dataset is suitable for the assignment in terms of complexity and structure.

Page 1 of 6
CAUTIONS!!

• YOU MUST ANALYSE RAW, INDIVIDUAL LEVEL DATA (NOT AGGREGATED DATA). DO
NOT ANALYSE DATA ALREADY SUMMARISED IN A FREQUENCY OR SUMMARY TABLE.
(Note: data tables available for public download from the Australian Bureau of Statistics tend to
be in aggregated form already, so this data is not suitable for the assignment). An example of a
data table that is already aggregated is shown in the screenshot below.

Figure 1: Example of aggregated (not individual level) data

• THE STATISTICAL HYPOTHESIS TESTING TOOLS WE LEARN IN THIS COURSE REQUIRE
INDEPENDENT UNITS OF OBSERVATION. FOR EXAMPLE, SERIALLY CORRELATED
DATA (that is, data taken at yearly or other regular time intervals) ARE NOT INDEPENDENT.
So time series data (e.g. share prices over time) are not suitable for the assignment.
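
To make the first caution concrete, here is a minimal Python sketch of the difference between individual-level and aggregated data. The records, field names and measurements are all hypothetical and are loosely based on the piano-player question above. Python is used purely for illustration; the assignment itself asks for Excel.

```python
from collections import Counter
import statistics

# Hypothetical individual-level data: one record per surveyed person.
raw_records = [
    {"person": 1, "plays_piano": "yes", "finger_length_cm": 8.4},
    {"person": 2, "plays_piano": "no", "finger_length_cm": 7.9},
    {"person": 3, "plays_piano": "yes", "finger_length_cm": 8.8},
    {"person": 4, "plays_piano": "no", "finger_length_cm": 7.6},
    {"person": 5, "plays_piano": "no", "finger_length_cm": 8.1},
]

# With raw records, any summary can still be computed, for example
# the sample standard deviation within one group.
non_pianist_lengths = [
    r["finger_length_cm"] for r in raw_records if r["plays_piano"] == "no"
]
non_pianist_sd = statistics.stdev(non_pianist_lengths)

# An aggregated table keeps only the counts per category. The individual
# measurements cannot be recovered from it, so the standard deviation
# above could not be computed from this table alone.
frequency_table = Counter(r["plays_piano"] for r in raw_records)

print(dict(frequency_table))
print(round(non_pianist_sd, 3))
```

If your download looks like `frequency_table` rather than `raw_records`, it is already aggregated and is not suitable for the assignment.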

DATA SOURCES
If you have a specific topic or data field in mind, try running an online search for your topic, e.g. basketball
data, human rights data, cost of living data.
For a general search of datasets from a variety of topics, the following websites may be useful:
• The Data and Story Library https://dasl.datadescription.com/
• UC Irvine Machine Learning Repository https://archive.ics.uci.edu/ml/index.php
For those interested in global and country-level data, you may find the following websites useful:
• Our World in Data https://ourworldindata.org/
• OECD Data https://data.oecd.org/
As an alternative to sourcing data from the internet, you might like to collect your own data (by conducting
your own survey, for example) to answer some question of interest to you. For example, according to the
2017 Universities Australia survey, the average number of hours worked per week during semester is 16.3
hours for domestic undergraduate students. Do ANU domestic undergraduate students work, on average,
more or less than this amount per week?

A minimum sample size of 50 is recommended for a data set sourced from the internet.
A minimum sample size of 25 is recommended if you conduct your own survey to collect your own data.

Page 2 of 6
REPORT GUIDELINES
You must submit a written report to communicate your project findings. Please include the following
sections in your report:

• INTRODUCTION:

• State your research objective. What applied question are you trying to answer? Be very
explicit and clear on what your null hypothesis and what your alternative hypothesis are using
the framework discussed in class (H0 : .....) (HA : .....)
• State why your research question is of personal and practical interest.
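
As one illustration of the notation, the example question from the REQUIREMENTS section about whether athletes sleep less than the recommended 7 hours per night could be framed as a lower-tailed test on a population mean. The symbol $\mu$ (assumed here to denote the mean nightly sleep hours in the athlete population of interest) and the one-sided direction are choices made for this sketch. Your own hypotheses should follow the framework shown in class.

```latex
\begin{align*}
H_0 &: \mu = 7 \\
H_A &: \mu < 7
\end{align*}
```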

• DATA SET DESCRIPTION:

• State the source of your data set. Either provide the website address(es) if you downloaded
the data from the internet or state that you conducted your own study to collect the data.
• Target population and data collection method. What is your population of interest? What
was the date of data collection from the original source (e.g. GDP by country as at 31 Dec
2020)? How were the records in your data set chosen for inclusion in your sample? If there are
biases in the data collection method, be sure to comment on how this may affect the validity
of your results.
• Data set size and variables. How many observations are in the data set? Which variables will
you analyse? Classify the variables by type (numerical, categorical, etc.). Note: You do not
need to analyse all the variables in your chosen data set. For example, suppose you have a
data set which was collected to study the relationship between exercise and sleep patterns.
The data set has 20+ variables containing demographic and lifestyle information on the study
participants. However, you are particularly interested to see whether there is an association
between the number of hours slept per night and exercise hours per week, so you focus on these
two variables for your analysis.

• DATA SUMMARIES:

• Provide summary (frequency or contingency) tables for your chosen variables. Include some
graphical displays (bar charts, histograms, scatter plots etc.)
• Provide some numerical descriptive measures of your chosen variables (for example, sample means,
sample variances, sample proportions). Include a box and whisker plot for your numerical
variable(s). You do not need to show working for any numerical descriptive measures that you
calculate in this section. You can simply report the result. For example, the sample mean is
6.8 hours of sleep per night.
• From your data summaries, what conclusions can you draw about the shape of the distribution
of your variables or relationships between variables? Try to explain any patterns in the data
you notice.
• In the online submission box, there will be a separate tab to submit your data set as an Excel
spreadsheet. You must submit your data set as an Excel spreadsheet. This spreadsheet
must also show your Excel output with the relevant data summaries referred to in your report.
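
As a rough cross-check on the kind of numerical descriptive measures listed above, the following Python sketch computes them for a small, entirely hypothetical sample of nightly sleep hours. It uses only the standard `statistics` module. Quartile conventions differ between tools, so the quartiles may not match Excel's output exactly. The assignment still requires the Excel output itself.

```python
import statistics

# Hypothetical sample: hours of sleep per night for eight respondents.
sleep_hours = [6.5, 7.0, 5.5, 8.0, 6.0, 7.5, 6.5, 7.0]

sample_size = len(sleep_hours)
sample_mean = statistics.mean(sleep_hours)
sample_variance = statistics.variance(sleep_hours)  # divides by n - 1
sample_sd = statistics.stdev(sleep_hours)           # square root of the above

# Three cut points that split the sorted data into quarters; together with
# the minimum and maximum they give the values behind a box and whisker plot.
q1, median, q3 = statistics.quantiles(sleep_hours, n=4)

print(f"n = {sample_size}")
print(f"mean = {sample_mean:.2f}, variance = {sample_variance:.3f}, sd = {sample_sd:.3f}")
print(f"min = {min(sleep_hours)}, Q1 = {q1}, median = {median}, Q3 = {q3}, max = {max(sleep_hours)}")
```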

Page 3 of 6
• HYPOTHESIS TESTS - RESULTS and DISCUSSION:

• Carry out your hypothesis test. Restate H0 : ..... and HA : ..... as provided in your introduction.
Clearly state the test statistic you calculated and report the p-value of your test.
You must show all working. Specifically you need to provide the mathematical expressions for
your test statistic calculation and your p-value calculation. Stating the generic formula (for
example, $t = \frac{\bar{x} - \mu}{s_x / \sqrt{n}}$ for a one-sample t-test on a population mean) is not
acceptable. You need to insert the specific numeric values for $\bar{x}$, $\mu$, $s_x$ and $n$ that you
used in your calculations.
Report the exact p-value calculated using the Excel function T.DIST(...) as demonstrated in
lectures. Also state your chosen significance level.
• Justify that your data variables conform to the assumptions required by your hypothesis test.
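
The sketch below shows what "inserting the specific numeric values" might look like for a lower-tailed one-sample t-test. It reuses the 6.8-hour sample mean and the 7-hour benchmark mentioned elsewhere in these instructions. The sample standard deviation (1.2) and sample size (50) are hypothetical. Python is used only to illustrate the arithmetic. The helper `t_cdf` is an illustrative numerical approximation written for this example, not a library function, and the assignment still requires the p-value to be reported from Excel's T.DIST.

```python
import math

def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x: float, df: int, steps: int = 2000) -> float:
    """Left-tail probability P(T <= x), via Simpson's rule on [0, |x|]."""
    if x == 0:
        return 0.5
    upper = abs(x)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    # The distribution is symmetric about zero.
    return 0.5 + area if x > 0 else 0.5 - area

# Sample mean and benchmark taken from the examples in this document;
# s_x and n are hypothetical values chosen for illustration.
x_bar, mu_0, s_x, n = 6.8, 7.0, 1.2, 50

standard_error = s_x / math.sqrt(n)
t_stat = (x_bar - mu_0) / standard_error  # (6.8 - 7) / (1.2 / sqrt(50))
df = n - 1

# Lower-tailed test (HA: mu < 7), so the p-value is the left-tail area.
# This plays the role of Excel's cumulative T.DIST(t_stat, df, TRUE).
p_value = t_cdf(t_stat, df)

print(f"t = {t_stat:.4f}, df = {df}, p-value = {p_value:.4f}")
```

For an upper-tailed alternative the p-value would be `1 - t_cdf(t_stat, df)`, and for a two-sided alternative it would be `2 * t_cdf(-abs(t_stat), df)`.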

• HYPOTHESIS TESTS - DISCUSSION:

• Interpret your results in relation to your applied research question and provide some practical,
intuitive reasoning behind your results. For example, a significant positive correlation is found
between exercise hours and sleep hours. This makes sense as more rest time may be required
to recover from the additional physical exertion during exercise.

• CONCLUSIONS:

• Briefly summarise your key findings.

• Discuss any limitations of your analysis and potential future improvements. Are there any
further questions you would like to answer if you had the relevant statistical knowledge or if
you had access to additional data?

• REFERENCE LIST: If applicable, please use the Harvard referencing style as detailed here:
https://www.anu.edu.au/students/academic-skills/academic-integrity/referencing/harvard.

SUBMISSION GUIDELINES

• Total length: 5-8 pages (including graphs, excluding reference list). Note this is a guideline on
total length. A submission greater than 10 pages will be penalised and the pages exceeding the page
limit will not be graded. On the other hand, it is doubtful that all elements of the assignment can
be adequately addressed in 2 pages, hence a minimum length of around 5 pages is expected.

• The assignment must be submitted online on the Wattle course website via the Turnitin submission
box. Please submit your report as a ‘.doc’ or ‘.pdf’ file. Turnitin is ‘text-matching’ software
and will compare your submission against an archive of Internet documents, Internet data, a
repository of previously submitted papers, and a subscription repository of periodicals, journals, and
publications. Turnitin then creates an ‘Originality Report’, which can be viewed by both lecturers
and students and which identifies where the text within a student submission has matched another
source. It is important to note that Turnitin does not detect plagiarism. Turnitin will only match
the text within a student’s assignment to text located elsewhere (e.g. found on the Internet, within
journals or on databases of student papers).

• No late assignments or hard copy assignments will be accepted without prior permission from the
course convenor. Extension requests are to be submitted online on the course Wattle site. (See the
assessment extension block on the right hand side of the Wattle site).

Page 4 of 6
ACADEMIC INTEGRITY

• This assignment is to be done individually and not in collaboration with other students in the class.

• Students should not have another person/entity do any portion of the assignment for them, which
includes hiring a person or a company to complete any portion of the assignment.

• All parts of your assignment must uphold the principles of academic integrity, as defined in the ANU
Policy: Code of Practice for Students University Academic Integrity Rule
(https://services.anu.edu.au/learning-teaching/academic-integrity). You must attach a completed RSFAS
ASSESSMENT INTEGRITY DECLARATION form to the front of your assessment when submitted.
This form is available on the course Wattle site.

• Analytical and critical thinking skills, and effective communication of statistical ideas are part of
the learning outcomes of this course. Developing strong competencies in these areas will prepare
you for a competitive workplace. The assignment you submit must present your own, original work.

USE OF ARTIFICIAL INTELLIGENCE (AI)

• It is acceptable to use AI tools (like ChatGPT) to (i) generate project ideas, and/or (ii) as an editing
tool for your own work. If you choose to use AI in the above ways you must reference your use
of AI in detail as follows:

1. Include the following declaration after the Introduction to your report (and before the Dataset
Description):
I acknowledge the use of [insert name of AI tool and provide the website address] to prepare my
report.
I chose to use AI because [insert reason(s) for using AI].
I used AI to [list the tasks you used AI for, e.g. to brainstorm ideas].
Screenshots of all AI prompts I used and the output generated are provided in the Appendix.
2. In the Appendix you must provide screenshots of all prompts you used and the output
generated by the AI tool.

• You should note that the material generated by AI programs may be inaccurate, incomplete, or
otherwise problematic. Thus use of AI may result in a lower quality product with AI unable to
produce the sophistication required for this task. AI tools should be used with caution and proper
citation. AI is not a replacement for your own thinking and research.

• It is very important that you do not use AI to merely ‘do’ your assignment for you. Submissions
that have been generated entirely by AI are not permitted and will be treated as plagiarism and a
breach of ANU’s Academic Integrity Rule.

• Failure to properly cite your use of AI as described above is also in violation of the ANU Academic
Integrity Rule.

Page 5 of 6
MARKING GUIDELINES
The assignment will be marked according to the following rubric:

Item                 Total marks  Notes
                     available
Clarity              2            Is your report logically structured, easy to follow and
                                  neatly presented? Are your research objectives clearly
                                  stated and justified? Have you included a project title?
                                  Are your graphs clear and well labelled?
Interest             2            Did you state why your data set is of personal interest
                                  to you? Did you describe the practical importance of your
                                  research questions? Did your data set have a mix of
                                  different variable types? Is the sample size big enough?
Dataset description  3            Did you answer all the questions in this section as
                                  required?
Data summaries       4            Did you calculate a variety of numerical descriptive
                                  measures? Have you included some summary tables and
                                  graphs? Have you provided some commentary and/or
                                  interpretation of your summary tables and graphs? Have
                                  you submitted a copy of your data set as an Excel file
                                  that also shows the data summaries you generated?
Hypothesis test      4            Are your null and alternative hypotheses clearly stated
                                  using the statistical notation shown in lectures? Did you
                                  provide step-by-step calculation details including the
                                  mathematical expressions to calculate your test statistic
                                  and p-value? Did you calculate an exact p-value using the
                                  Excel function T.DIST(...)? Did you justify the
                                  assumptions required for the statistical methods you used?
Discussion           3            Did you interpret your hypothesis test results in relation
                                  to your research questions? Do your results confirm your
                                  initial hypothesis or are they counterintuitive? Did you
                                  provide some practical or intuitive explanation for your
                                  results?
Conclusions          2            Did you discuss the limitations of your analysis? Have
                                  you discussed potential future improvements to your study?
Total                20

Some tips:
• Choose a dataset you have a personal interest in.
• Consult with the teaching staff for advice and to check whether your chosen dataset is appropriate
or not.
• There is no need to use statistical methods that we haven’t discussed in class yet.
Additional notes:
• For copyright reasons and to avoid plagiarism, sample assignment reports cannot be made available
to students.
• The teaching staff will be happy to answer specific questions or concerns you have related to your
project write-up. However, the teaching staff will not be available to read full drafts of your
assignment or provide detailed feedback before submission.

Page 6 of 6
