Mathematics Standard 2 Year 12 Topic Guide Statistical Analysis
Topic Guides provide support for the Mathematics Stage 6 courses. They contain information
organised under the following headings: Terminology; Use of technology; Background
information; General comments; Future study; Considerations and teaching strategies;
Suggested applications and exemplar questions.
Topic Guides illustrate ways to explore syllabus-related content and consequently do not
define the scope of problems or learning experiences that students may encounter through
their study of a topic. The terminology list contains terms that may be used in the teaching and
learning of the topic. The list is not exhaustive and is provided simply to aid discussion.
Please provide any feedback to the Mathematics and Numeracy Curriculum Inspector.
Terminology
Use of technology
Background information
General comments
Future study
Subtopics
  Subtopic focus
  Subtopic focus
Mathematics Standard 2 Year 12 Topic guide: Statistical analysis, updated December 2018 Page 2 of 9
Topic focus
Statistical Analysis involves the collection, display, analysis and interpretation of data to
identify and communicate key information.
Knowledge of statistical analysis enables the careful interpretation of situations and raises
awareness of contributing factors when presented with information by third parties, including
the possible misrepresentation of information.
Terminology
bell-shaped; bias; biometric data; bivariate; bivariate dataset; continuous random variable;
correlation; dataset; dependent variable; empirical rule; event; extrapolation;
frequency distribution; histogram; independent variable; intercept; interpolation;
least-squares regression line; line of best fit; linear; linear association;
linear relationship; mean; median; non-linear; normal curve; normal distribution;
normally distributed; numerical variable; Pearson’s correlation coefficient; probability;
probability density function; random variable; sample; scatterplot; slope;
standard deviation; standardised score; statistical investigation process; trendline; z-score
Use of technology
Appropriate technology should be used to construct a line of fit and a least-squares line of
best fit, to determine their equations, and to calculate correlation coefficients.
Teachers should demonstrate the least-squares regression line on a spreadsheet and then
have students explore the function with their own sets of data.
Graphing software can be used to fit a line of best fit to data and make predictions by
interpolation or extrapolation.
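As one possible illustration of the technology described above, the following minimal Python sketch calculates a least-squares line of best fit and Pearson’s correlation coefficient from their defining formulas, then makes a prediction by interpolation. The dataset (hours of study and test marks) and the function names are hypothetical and are not taken from the syllabus; a spreadsheet or graphing package would serve equally well.

```python
from math import sqrt


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared deviations in x
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient r for paired numerical data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical dataset, invented for illustration only
hours = [1, 2, 3, 4, 5]
marks = [52, 55, 61, 64, 68]

slope, intercept = least_squares_line(hours, marks)
r = pearson_r(hours, marks)

# 3.5 hours lies inside the range of the data, so this prediction is an interpolation
predicted = slope * 3.5 + intercept

print(f"Line of best fit: y = {slope:.2f}x + {intercept:.2f}")
print(f"Pearson's r = {r:.3f}")
print(f"Predicted mark for 3.5 hours of study: {predicted:.1f}")
```

Students could replace the two lists with their own data and compare the result with the trendline produced by a spreadsheet. A prediction for a value outside the range of the data (for example 10 hours) would be an extrapolation and should be treated with more caution.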
Real data that is relevant to students’ experience and interest areas can be sourced online.
Online data sources include the Australian Bureau of Statistics (ABS), the Australian Bureau
of Meteorology (BOM), the Australian Sports Commission and the Australian Institute of
Health and Welfare (AIHW) websites.
Background information
The concept of correlation originated in the 1880s with the works of Sir Francis Galton,
Charles Darwin’s cousin. He produced the first bivariate scatterplot, which showed a
correlation between children’s heights and their parents’ heights. A decade later, the British
statistician Karl Pearson introduced a powerful idea in mathematics: that a relationship
between two variables could be characterised according to its strength and expressed in
numbers, leading to the development of Pearson’s correlation coefficient, r. This then raised
the issue of how to interpret the data in a way that is helpful, rather than misleading. When
correlation is mistaken for causation, a cause is identified that does not exist. As
science grows more powerful and governments rely increasingly on big data, the consequences
of misleading relationships become more serious. There are many humorous examples on the
internet, for example the book and website entitled Spurious Correlations.
The normal distribution was developed from a model originally propounded by Abraham de
Moivre, an 18th-century statistician and consultant to gamblers. The normal distribution curve
is sometimes called the ‘bell curve’ or the ‘Gaussian curve’ after the mathematician Gauss,
who played an important role in its development. The normal distribution is the most important
and widely used distribution in business, statistics and government. Its importance
stems primarily from the fact that many natural phenomena are at least approximately
normally distributed.
General comments
Materials used for teaching, learning and assessment should use or include current
information from a range of sources, including, but not limited to, newspapers, journals,
magazines, real bills and receipts, and the internet.
Students need access to real data sets and contexts. They can also develop their own data
sets for analysis or use some of the data sets available online.
Suitable data sets for statistical analysis could include, but are not limited to, home versus
away sports scores, male versus female data (for example height), young people versus older
people data (for example blood pressure), population pyramids of countries over time,
customer waiting times at fast-food outlets at different times of the day, and monthly rainfall for
different cities or regions.
This topic provides students with the opportunity to explore aspects of Mathematics involved in
any area of special interest to them.
Future study
Students may be asked to analyse data and produce a report in subjects that they are
studying for the HSC or in post-school contexts and training areas. This topic will set a good
baseline for knowledge, understanding and skills in statistical analysis.
The ability to analyse and critically evaluate statistical information will provide students with the
confidence and skills that help them become discerning citizens.
Subtopics
MS-S4: Bivariate Data Analysis
MS-S5: The Normal Distribution
MS-S4: Bivariate Data Analysis
Subtopic focus
The principal focus of this subtopic is to introduce students to a variety of methods for
identifying, analysing and describing associations between pairs of numerical variables.
Students develop the ability to display, interpret and analyse statistical relationships related to
bivariate numerical data analysis and use this ability to make informed decisions.
years. He created a scatterplot of the data and constructed a line of best fit to model the
relationship between the age and height of males.
(a) Determine the gradient of the line of best fit shown on the graph.
(b) Explain the meaning of the gradient in the context of the data.
(c) Determine the equation of the line of best fit shown on the graph.
(d) Use the line of best fit to predict the height of a typical 17-year-old male.
(e) Why would this model not be useful for predicting the height of a typical 45-year-old
male?
The height and length of the right foot of 10 high school students were measured. The
results were tabulated as follows:
Height (cm) 165 153 146 138 149 172 170 158 163 154
(a) Using technology, calculate the Pearson correlation coefficient for the data.
(b) Describe the strength of the association between height and length of the right foot for
this dataset.
MS-S5: The Normal Distribution
Subtopic focus
The principal focus of this subtopic is to introduce students to a variety of methods for
identifying, analysing and describing associations between pairs of numerical variables.
Students develop the ability to display, interpret and analyse statistical relationships related to
bivariate numerical data analysis and use this ability to make informed decisions.
Students should understand that the normal distribution table lists areas under the bell
curve to the left of different values of z-score, as illustrated in the previous diagram, and
should know how to find the probability for a given or calculated z-score using the normal
distribution table.
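The values in a normal distribution table can be checked with technology. The minimal sketch below assumes Python’s standard-library `statistics.NormalDist` class as a stand-in for the printed table; the helper names `z_score` and `area_left_of` are illustrative only and do not come from the syllabus.

```python
from statistics import NormalDist

# The standard normal distribution has mean 0 and standard deviation 1
standard_normal = NormalDist(mu=0, sigma=1)


def z_score(x, mean, sd):
    """Convert a raw score x to a z-score."""
    return (x - mean) / sd


def area_left_of(z):
    """Area under the standard normal curve to the left of z (the table entry for z)."""
    return standard_normal.cdf(z)


# Print a short extract in the same style as a normal distribution table
for z in (-1.0, 0.0, 1.0, 2.0):
    print(f"z = {z:+.1f}: area to the left = {area_left_of(z):.4f}")
```

The area to the left of z = 0 is exactly half of the total area, which gives students a quick check that they are reading the table in the correct direction.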
Teachers should briefly explain the application of the normal distribution to quality control
and the benefits to consumers of goods and services. Reference should be made to
situations where quality control guidelines need to be very accurate, for example the
manufacturing of medications.
[Diagram: z-score axis marked at 0 and 1]
(a) What percentage of packets will have a mass less than 1.02 kg?
(b) What percentage of packets will have a mass between 1.00 and 1.04 kg?
(c) What percentage of packets will have a mass between 1.00 and 1.02 kg?
(d) What percentage of packets will have a mass less than the labelled mass?
A machine is set for the production of cylinders of mean diameter 5.00 cm, with standard
deviation 0.020 cm. Assuming a normal distribution, between which values will 95% of the
diameters lie? If a cylinder, randomly selected from this production, has a diameter of
5.070 cm, what conclusion could be drawn?
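One way to reason through the cylinder question is sketched below in Python. It assumes the empirical rule (about 95% of values lie within 2 standard deviations of the mean); the variable names are illustrative only.

```python
mean_diameter = 5.00   # cm, from the question
sd_diameter = 0.020    # cm, from the question

# Empirical rule: about 95% of values lie within 2 standard deviations of the mean
lower = mean_diameter - 2 * sd_diameter
upper = mean_diameter + 2 * sd_diameter
print(f"About 95% of diameters lie between {lower:.2f} cm and {upper:.2f} cm")

# z-score of the randomly selected cylinder
observed = 5.070       # cm, from the question
z = (observed - mean_diameter) / sd_diameter
print(f"z-score of a {observed:.3f} cm cylinder: {z:.1f}")
```

The selected cylinder lies more than 3 standard deviations above the mean, where the empirical rule places only about 0.3% of values beyond 3 standard deviations in either direction. Students might conclude that such a diameter is very unlikely under the stated settings, so the machine may need to be checked or reset.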
Students could investigate whether the results of a particular experiment are normally
distributed.
A test has a mean score of 100 and a standard deviation of 15. Find the probability that a
person selected at random from those who took the test will have a score:
(a) between 100 and 120
(b) of at least 120
(c) of greater than 120
Note: Normal distribution tables would be used to answer this question.
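Answers found from the tables can be checked with technology. The sketch below assumes that the test scores are normally distributed and uses Python’s standard-library `statistics.NormalDist` in place of the printed table; the variable names are illustrative only.

```python
from statistics import NormalDist

# Assumption: test scores are normally distributed with mean 100 and standard deviation 15
scores = NormalDist(mu=100, sigma=15)

# (a) P(100 < X < 120): area to the left of 120 minus area to the left of 100
p_between = scores.cdf(120) - scores.cdf(100)

# (b) P(X >= 120) and (c) P(X > 120): area to the right of 120
p_at_least = 1 - scores.cdf(120)
p_greater = 1 - scores.cdf(120)

print(f"(a) P(100 < X < 120) = {p_between:.4f}")
print(f"(b) P(X >= 120)      = {p_at_least:.4f}")
print(f"(c) P(X > 120)       = {p_greater:.4f}")
```

Parts (b) and (c) give the same value because, for a continuous random variable, the probability of any single exact score is zero. Results read from a table with z rounded to two decimal places may differ slightly in the third or fourth decimal place.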
The lifetime of a particular make of lightbulb is normally distributed with mean 1020 hours
and standard deviation 85 hours. Find the probability that a lightbulb of the same make
chosen at random has a lifetime between 1003 and 1088 hours.
Note: Normal distribution tables would be used to answer this question.
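The lightbulb question can be checked in the same way. The sketch below converts each boundary to a z-score and subtracts the two areas to the left, mirroring the table method; it uses Python’s standard-library `statistics.NormalDist` as a substitute for the printed table, and the variable names are illustrative only.

```python
from statistics import NormalDist

mean_life = 1020   # hours, from the question
sd_life = 85       # hours, from the question

# Convert each boundary to a z-score
z_low = (1003 - mean_life) / sd_life    # -0.2
z_high = (1088 - mean_life) / sd_life   # 0.8

# Areas to the left of each z-score, as a normal distribution table would list them
standard_normal = NormalDist(mu=0, sigma=1)
area_low = standard_normal.cdf(z_low)
area_high = standard_normal.cdf(z_high)

# The required probability is the area between the two z-scores
probability = area_high - area_low
print(f"z-scores: {z_low:.1f} and {z_high:.1f}")
print(f"P(1003 < lifetime < 1088) = {probability:.4f}")
```

Because one boundary lies below the mean and the other above it, students using a table need to take care with the negative z-score when reading the area to the left.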