Level 1 Multivariate Workbook Answers
● Direction,
Example:
I wonder if the travel time (minutes) for students who catch the bus tends to be
longer than the travel time (minutes) for students who walk, for ALL high school
students in NZ, for data from Census at School in 2015.
Exercise:
Students who catch the bus, and students who walk to school
b. What is the numerical (measurement or count) variable? And its units?
ALL high school students in NZ, for data from Census at School in
2015.
d. What is the direction?
Rewritten question:
Does the height of 8 year old boys tend to be taller (cm) than the
height of 8 year old girls in NZ?
b) Do all 18 year old males tend to have a longer right foot than all 18 year old
females in NZ?
Numerical variable and units: Right foot length (cm); the units are
missing
Median: yes / no
Direction: yes / no males have LONGER foot length than females
Rewritten question:
Do all 18 year old males tend to have a longer right foot length (cm)
than all 18 year old females in NZ?
Numerical variable and units: School bag weight (kg); the units are missing
Median: yes / no
Direction: yes / no
Rewritten question:
I wonder if the bag weight (kg) for girls tends to be heavier than the bag
weight for boys, for high school students in NZ?
d) How does the number of text messages all teenage girls send daily compare with
the number of text messages all teenage boys send daily, in Auckland?
Rewritten question:
Does the number of text messages sent each day by ALL teenage girls in
Auckland tend to be higher than the number of text messages sent each day
by teenage boys?
Variable Description
StridelengthCM The person's average stride length over the marathon, in cm.
There are four possible comparison questions that you can write from the dataset
above. Write all four questions below.
I wonder if the stride length (cm) for males tends to be longer than the
stride length for females, for athletes doing marathons in NZ?
I wonder if the stride length (cm) for younger (under 40) athletes tends
to be longer than the stride length (cm) for older (over 40) athletes
running marathons in NZ?
Variable Description
There are four possible comparison questions that you can write from the dataset
above. Write all four questions below.
Data
Add summary statistics and a box plot to your graph.
You need to take a sample of between 20 and 40 in each group and change the point
size (to better see the shape).
There are 3 measures of center - mean, median and mode. For this assessment, we
are going to focus only on the median.
Example:
Median
Put the numbers in order: 1, 3, 3, 6, 8, 9
Find the number(s) in the middle: 1, 3, 3, 6, 8, 9
Find the median = (3 + 6) / 2 = 4.5
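The three steps in this example can be checked with a short Python sketch. It uses `statistics.median` from the standard library; the variable names are only illustrative.

```python
from statistics import median

data = [1, 3, 3, 6, 8, 9]

# Step 1: put the numbers in order.
ordered = sorted(data)

# Step 2: find the middle value(s). With an even count there are two.
n = len(ordered)
if n % 2 == 0:
    middle = ordered[n // 2 - 1 : n // 2 + 1]
else:
    middle = [ordered[n // 2]]

# Step 3: the median is the average of the middle value(s).
by_hand = sum(middle) / len(middle)

print(middle)        # [3, 6]
print(by_hand)       # 4.5
print(median(data))  # 4.5
```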
Median = 4
Data: 4, 6, 3, 8, 2, 4, 9
Median = 3.5
Median = 29
That means the spread of the middle 50% of the data (the box part of the box and
whisker graph).
IQR = UQ - LQ
where UQ = Upper Quartile = the number where one quarter of the data lies above
it,
and LQ = Lower Quartile = the number where one quarter of the data lies below it.
Example:
Minimum = 0
LQ = 165
Median = 230
UQ = 315
Maximum = 650
Minimum = 76
LQ = 95
Median = 102
UQ = 115
Maximum = 140
Minimum = 760
LQ = 1170
Median = 1390
UQ = 1600
Maximum = 1870
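As a check on IQR = UQ - LQ, here is a minimal Python sketch that applies the formula to the three five-number summaries listed above. The function name `iqr` is just a label chosen for this sketch, and units are left off because the summaries do not state them.

```python
# Five-number summaries copied from the lists above:
# (minimum, LQ, median, UQ, maximum)
summaries = [
    (0, 165, 230, 315, 650),
    (76, 95, 102, 115, 140),
    (760, 1170, 1390, 1600, 1870),
]

def iqr(summary):
    """Return IQR = UQ - LQ for one five-number summary."""
    _minimum, lq, _median, uq, _maximum = summary
    return uq - lq

for s in summaries:
    print(f"LQ = {s[1]}, UQ = {s[3]}, IQR = {iqr(s)}")
```

Only the LQ and UQ are used; the minimum, median and maximum do not affect the IQR.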
1. Shape Exercise
Normal distribution
(hill/mound shaped, symmetric, bell shaped curve)
Left skewed
(Tail is on the left hand side)
Right skewed
(tail is on the right hand side)
Bimodal
(there are two peaks)
Uniform
(the sides are straight and it looks like a box)
Example:
Sketch over the top of each graph and then state what shape it most closely matches.
[Graphs 1 to 12]
Thanks to Dr Pip Arnold for the graphs.
© Liz Sneddon 2020
Exercise:
For each graph, select the correct symmetry, peaks and tail description.
Number of peaks = 1
Number of peaks = 1
Number of peaks = 2
Number of peaks = 1
Number of peaks = 1
Number of peaks = 1
Number of peaks = 1
= 81.85kg - 68.45kg
= 13.4kg
The median weight for my sample of the males is heavier than for my sample of
females by 13.4kg.
= 185.45 - 174.7
= 10.75cm
The median height for the sample of males is taller than the median height for my
sample of females by 10.75cm.
The median amount of money the sample of girls spent for the school ball is $310.
The median amount of money the sample of boys spent for the school ball is $200.
= $310 - $200
= $110
In the sample, the median amount of money girls spent for the school ball is greater
than the median amount of money boys spent by $110.
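The differences between sample medians worked out above can be reproduced with a small Python sketch. The dictionary labels and the helper `median_difference` are names made up for this sketch; the numbers are the medians quoted in the answers.

```python
# Sample medians quoted in the answers above: (larger, smaller, unit).
comparisons = {
    "weight (males vs females)": (81.85, 68.45, "kg"),
    "height (males vs females)": (185.45, 174.7, "cm"),
    "ball spending (girls vs boys)": (310, 200, "$"),
}

def median_difference(larger, smaller):
    """Difference between two sample medians, rounded to 2 decimal
    places to hide floating-point noise."""
    return round(larger - smaller, 2)

for label, (larger, smaller, unit) in comparisons.items():
    print(label, median_difference(larger, smaller), unit)
```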
Find the IQR, and tell me which group’s spread is bigger and by how much.
For Merit, you need to add the justification and evidence.
Compare how wide the boxes are on the box plot. Is one group wider than the other?
IQR (females) = UQ - LQ
= 5.5 - 2
= 3.5kg
The IQR for bag weights for females is 3.5kg.
IQR (males) = UQ - LQ
= 5.1 - 2
= 3.1kg
The IQR for bag weights for males is 3.1kg.
In the sample, the spread of the middle 50% of bag weights for females is a little
wider than the spread of the middle 50% of bag weights for males.
IQR (females) = UQ - LQ
= 179.7 - 170.8
= 8.9cm
IQR (males) = UQ - LQ
= 191.5 - 179.6
= 11.9 cm
In the sample, the spread of the middle 50% of heights for females is smaller than
the spread of the middle 50% of heights for males.
IQR (girls) = UQ - LQ
= $580 - $210
= $370
The IQR for amount of money spent for the school ball for girls is $370
IQR (boys) = UQ - LQ
= $260 - $150
= $110
The IQR for amount of money spent for the school ball for boys is $110
In the sample, the spread of the middle 50% of how much girls spend for the school
ball is over 3 times larger than the spread of the middle 50% of boys spending for
the school ball.
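The spread comparisons above can be checked in one pass with a Python sketch. The quartiles are the ones used in the working; the dictionary layout and the helper `iqr` are assumptions made for this sketch.

```python
# (LQ, UQ) pairs from the working above.
groups = {
    "bag weight (kg)": {"females": (2, 5.5), "males": (2, 5.1)},
    "height (cm)": {"females": (170.8, 179.7), "males": (179.6, 191.5)},
    "ball spending ($)": {"girls": (210, 580), "boys": (150, 260)},
}

def iqr(lq, uq):
    """IQR = UQ - LQ, rounded to hide floating-point noise."""
    return round(uq - lq, 2)

for variable, pair in groups.items():
    spreads = {name: iqr(lq, uq) for name, (lq, uq) in pair.items()}
    wider = max(spreads, key=spreads.get)
    narrower = min(spreads, key=spreads.get)
    ratio = spreads[wider] / spreads[narrower]
    print(f"{variable}: {spreads}; {wider} wider by a factor of {ratio:.2f}")
```

For ball spending the ratio is 370 / 110, which is a little over 3 and matches the "over 3 times larger" statement.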
Problem:
Is the median weight of girls’ school bags greater than the median weight of boys’
school bags, for ALL students at Intermediate schools in NZ?
● The female and male bag weights have the same right skewed shape. Both are
right skewed because they have one peak on the left hand side, are asymmetric,
and have a longer tail on the right hand side.
● The median bag weight for females is a little heavier than the median bag
weight for males, by 0.5kg. My evidence is that the median bag weight for
females is around 3.8kg while the median bag weight for males is around 3.3kg.
● The spread of the middle 50% of female bag weights is slightly larger than the
spread of the middle 50% of male bag weights, because the IQR for females is
approximately 3.5kg compared to the IQR for males of 3.1kg.
1. I wonder if the median weight of male kiwi birds in NZ is heavier (kg) than the
median weight of female kiwis, for ALL kiwi birds in NZ.
The shape for both bus and walk students is right skewed
because there is a tail on the right side and the data is piled up on
the left side.
The median travel time for bus students is 25 minutes and for walking
students it is 10 minutes, so the median for walking students is 15
minutes less.
The spread of the middle 50% of travel times for bus students is wider
than the spread of the middle 50% of travel times for walking students,
because the interquartile range for bus students is 25 minutes and for
walking students it is 15 minutes.
The shape of ages of students who DO NOT have their own device
is normal, because the shape is symmetric, one peak, and the
tails are both similar. The shape of ages of students who DO have
a device is left skewed because there is one peak, no symmetry,
and a longer tail on the left hand side.
The median age of students who DO NOT have their own device is
11 years old, while the median age of students who DO have their
own device is 14 years old. This shows that the median age of
students who DO have their own device is 3 years older than the
median age of students who DO NOT have their own device.
The spread of the middle 50% of ages for students who DO have
their own device is smaller than the spread of the middle 50% of
ages of students who DO NOT have their own device.
IQR (No device) = UQ - LQ = 12.5 - 10 = 2.5 years
The shape of the memory test result for both males and females is
right skewed, because the shape is not symmetric, there is one
peak and a longer tail on the right hand side.
The median memory test result for females is 47% and the median test
result for males is 50%. This shows that males have a higher
median test mark by 3%.
The spread of the middle 50% of memory test results for females
is about the same as the middle 50% of memory test results for
males.
IQR (female) = UQ - LQ = 57 - 40 = 17%
IQR (male) = UQ - LQ = 61 - 42 = 19%
Evidence:
If there is no overlap, then the results tend to be higher for one group.
75% of the data in one group is not bigger than 50% of the second group, so I do
not have enough evidence that one group tends to be larger than the second group.
75% of the data in one group is bigger than 50% of the second group, so I do
have enough evidence that one group tends to be larger than the second group.
The median of both groups is inside the box of the other group, so I do not
have enough evidence that one group tends to be larger than the second group.
The median of both groups is outside the box of the other group, so I do
have enough evidence that one group tends to be larger than the second group.
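The "median outside the box" check can be sketched in Python as below. This is a rough sketch, not the workbook's own method: it assumes the "one or both medians lie outside the other box" wording used in the conclusion prompts, and the function names are made up. The example values are the approximate sample quartiles and medians quoted elsewhere in these answers.

```python
def median_outside_box(median, other_lq, other_uq):
    """True if a median lies outside the other group's box (LQ to UQ)."""
    return median < other_lq or median > other_uq

def can_make_the_call(group_a, group_b):
    """Each group is (LQ, median, UQ). The call is made when one or both
    medians lie outside the other group's box."""
    lq_a, med_a, uq_a = group_a
    lq_b, med_b, uq_b = group_b
    return (median_outside_box(med_a, lq_b, uq_b)
            or median_outside_box(med_b, lq_a, uq_a))

# Sample values quoted elsewhere in these answers: (LQ, median, UQ).
bags_females, bags_males = (2, 3.8, 5.5), (2, 3.3, 5.1)
ball_girls, ball_boys = (210, 310, 580), (150, 200, 260)

print(can_make_the_call(bags_females, bags_males))  # False
print(can_make_the_call(ball_girls, ball_boys))     # True
```

For the bag weights both medians sit inside the other group's box, so there is not enough evidence to make the call. For ball spending the girls' median of $310 is above the boys' UQ of $260, so the call can be made.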
Problem:
I wonder if the median weight of babies born to mothers who smoked is smaller than
the median weight of babies born to mothers who didn’t smoke, for ALL participants
at Baystate Medical Center, Springfield, Mass. during 1986.
Conclusion:
(You can use either method 1 or method 2, you don’t need both)
Method 1:
50% of the weights of babies born to mothers who don’t smoke are larger than 75%
of the weights of babies born to smoking mothers, so I have enough evidence to
make the call.
Method 2:
The median weight of babies born to mothers who don’t smoke is OUTSIDE the box
of the baby weights for mothers who smoke, so I have enough evidence to make the
call.
Inference:
I can make the call, so I DO have enough evidence that the weight of babies born to
mothers who smoked tends to be smaller than the weight of babies born to mothers
who didn’t smoke, for ALL participants at Baystate Medical Center, Springfield, Mass
during 1986.
1) Here is a sample of the weights of male and female kiwi birds from around NZ.
Problem:
I wonder if the weight of female kiwis tends to be heavier than the weight for male
kiwis, for ALL kiwi birds from around NZ?
Conclusion:
Method 2: Do one or both medians lie outside the other box? Yes / No
I do / don’t have enough evidence that the weight for female kiwis tends to be
heavier than the weight for male kiwis, for ALL kiwi birds from around NZ.
Problem:
I wonder if the height for males tends to be taller than the height for females,
for all adults in NZ?
Conclusion:
Method 2: Do one or both medians lie outside the other box? Yes / No
I do / don’t have enough evidence that the height for males tends to be taller
than the height for females, for all adults in NZ.
Problem:
I wonder if the amount of money that female students tend to spend on ball wear is
more than the amount of money males spend, for all high school students in NZ?
Conclusion:
Method 2: Do one or both medians lie outside the other box? Yes / No
Problem:
I wonder if the time it takes older people to complete a marathon tends to be longer
than the time it takes younger people to complete a marathon, for all marathon
runners in NZ?
Conclusion:
Method 2: Do one or both medians lie outside the other box? Yes / No
Exercise
1) If each blue dot represents the height of a girl aged 12, why do the dots keep
changing each time another sample is taken?
Exercise
1) Look at the median (the line in the middle of the box). Notice how it changes
every time another sample is taken. Explain why this happens.
2) Complete the following sentence. Think about the data and the medians.
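The sampling variation these exercises ask about can be seen in a small Python simulation. This is only a sketch: the population is made up (heights of girls aged 12 with an assumed centre of 152cm and spread of 7cm), and the sample size of 30 is chosen to fall in the 20 to 40 range used earlier.

```python
import random
from statistics import median

random.seed(1)  # fixed seed so the sketch is repeatable

# Made-up population: heights (cm) of 1000 girls aged 12.
# The centre of 152 and spread of 7 are assumptions for illustration only.
population = [random.gauss(152, 7) for _ in range(1000)]

# Take five samples of 30 girls and record each sample median.
sample_medians = []
for _ in range(5):
    sample = random.sample(population, 30)
    sample_medians.append(round(median(sample), 1))

print(sample_medians)
print("Population median:", round(median(population), 1))
```

Each sample picks different girls from the same population, so the dots and the sample median change from sample to sample, while the population median stays the same.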