Tutorial 10
1. Classify the following variables as either categorical or quantitative. For those variables
that are quantitative, state whether the variable is discrete or continuous.
(a) Type of pet
(b) Shoe size
(c) Time
(d) ‘Best 4’ score at Trinity
(e) Favourite fruit
(f) Height (cm)
(g) Height (to the nearest cm)
(h) Height (short/average/tall)
Number Frequency
1 163
2 176
3 155
4 161
5 181
6 164
Construct a relative frequency histogram for this data and describe the shape of the distribution.
Page 1 of 17
4. On a particular day, Hamed’s Hamburger joint receives 48 customers. The times the customers
arrive (given in 24-hour format) are given below. Construct a histogram to display this
data and describe the shape of the distribution.
1019, 1034, 1128, 1130, 1139, 1149, 1155, 1211, 1218, 1226, 1243, 1258,
1304, 1306, 1307, 1312, 1316, 1316, 1317, 1322, 1341, 1358, 1414, 1424,
1549, 1603, 1658, 1700, 1709, 1719, 1725, 1741, 1748, 1756, 1813, 1828,
1834, 1836, 1837, 1842, 1846, 1846, 1847, 1852, 1911, 1928, 1944, 1954
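Before drawing the histogram for question 4, the arrival times need to be grouped into bins. The following is a minimal Python sketch that counts arrivals per hour. The choice of one-hour bins is an assumption (the question does not fix a bin width), and the hour is taken as the first two digits of each time because values such as 1060 to 1099 never occur in 24-hour format.

```python
from collections import Counter

# Arrival times from question 4, written as HHMM integers.
arrivals = [
    1019, 1034, 1128, 1130, 1139, 1149, 1155, 1211, 1218, 1226, 1243, 1258,
    1304, 1306, 1307, 1312, 1316, 1316, 1317, 1322, 1341, 1358, 1414, 1424,
    1549, 1603, 1658, 1700, 1709, 1719, 1725, 1741, 1748, 1756, 1813, 1828,
    1834, 1836, 1837, 1842, 1846, 1846, 1847, 1852, 1911, 1928, 1944, 1954,
]

# Integer division by 100 drops the minutes and leaves the hour.
hourly = Counter(t // 100 for t in arrivals)

# Print a rough text histogram, one row per hour (assumed bin width).
for hour in range(10, 20):
    count = hourly[hour]
    print(f"{hour:02d}:00-{hour:02d}:59  {'#' * count} ({count})")
```

With these one-hour bins the counts are highest in the 13:00 and 18:00 hours and lowest around 15:00; a different bin width would change the detail but should be checked the same way.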
6. The following table gives the frequency of goals scored per game during the 135 matches
played in the 2015/2016 A-League season.
Goals 0 1 2 3 4 5 6 7 8 9
Frequency 10 17 23 29 27 17 8 3 0 1
(a) Construct a frequency histogram of the goals scored per game and comment on the shape.
(b) Find the median and average number of goals scored per game.
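For question 6(b), the mean and median have to be computed from a frequency table rather than from a raw list. A minimal Python sketch of one way to do this is given below; the variable names are illustrative only.

```python
from statistics import median

# Frequency table from question 6: goals per game and number of games.
goals = list(range(10))
freq = [10, 17, 23, 29, 27, 17, 8, 3, 0, 1]

n = sum(freq)  # total number of matches

# Mean: weight each goal count by how often it occurred.
mean_goals = sum(g * f for g, f in zip(goals, freq)) / n

# Median: expand the table back into one value per match, then take the middle one.
expanded = [g for g, f in zip(goals, freq) for _ in range(f)]
median_goals = median(expanded)

print(n, round(mean_goals, 2), median_goals)
```

Checking that n equals the 135 matches stated in the question is a quick way to confirm the table was entered correctly.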
7. At a cafe, the number of lunchtime customers is recorded on each of 100 consecutive days
and the frequency distribution summarised in the following table.
Number     111-120  121-130  131-140  141-150  151-160  161-170  171-180  181-190  191-200  201-210
Frequency     1        5        9       13       22       20       14        8        6        2
Construct a relative frequency histogram of the data and comment on the shape.
8. For each scenario below, define a random variable, classify it as either discrete or continuous,
and give a suitable range of values.
(a) A football team plays 7 games in a season. Each game results in either a win or a loss.
(b) A researcher is interested in the amount of sleep a student gets each night between the hours
of 9pm and 5am.
9. If X and Y are independent random variables with E(X) = 12, Var(X) = 4.2, E(Y) = 8.5,
Var(Y) = 2.3, find the mean and variance of 4X - 2Y.
10. Suppose a random variable X has mean μ and standard deviation σ and that Z = (X - μ)/σ.
Show using the formulae on page 15 of Chapter 10 that E(Z) = 0 and Var(Z) = 1.
11. The heights of horses who compete in the Melbourne Derby are normally distributed with
a mean of 66 inches and a standard deviation of 3 inches.
(a) If a horse is randomly selected from this race, what height would you expect it to be?
(b) What percentage of horses who compete in the Melbourne Derby have heights between 60
and 72 inches?
(c) In a particular year there are 25 horses running in the Melbourne Derby. How many of these
would you expect to have a height of less than 63 inches?
12. The weight of apples a particular fruit picker is able to harvest each day has a mean of
1500kg with a standard deviation of 200kg. The fruit picker is paid $50 per day, plus $0.10 for
each kilogram of apples he harvests. Calculate the mean and standard deviation of the fruit picker’s
daily earnings.
13. A ferry transports vehicles across a lake. The number of occupants, X, per car (including
the driver) has an expected value of 2.3 and standard deviation of 0.63. Each vehicle is charged
a fee of $10 plus an additional $6 for each occupant.
(a) Find the expected value and standard deviation of the fee per vehicle.
(b) Each trip costs the ferry operators $15 per vehicle in operating expenses (such as boat fuel).
What is the expected daily profit made by the ferry operators if the ferry transports 80 vehicles
per day?
14. An insurance company sells flood insurance policies, each of which has an expected payout
of $240 with standard deviation of $500.
(a) The policy is sold to 100 customers whose properties are widely dispersed geographically
so that the individual payouts are independent random variables. Calculate the expected value
and standard deviation of the total payout and comment on the level of risk involved.
(b) Suppose instead that the 100 properties are located on the same small section of flood plain
so that if one property is flooded, then all of them are flooded. In this case the individual payouts
will be highly dependent so that the total payout is effectively 100X where X is the payout for
any one policy. Calculate the expected value and standard deviation of 100X and comment on
the level of risk involved. (This question illustrates the problem of correlation in financial risk
and insurance).
15. A particular student borrows 23 books from the library throughout the course of the year.
This number is 4 standard deviations above the average of 2.6 books per year. What is the standard
deviation?
16. A random variable Z has a normal distribution with mean 0 and standard deviation 1. Find
the probability that Z is:
(a) less than 1.36 (b) greater than 1.36 (c) less than -2.12
(d) between -2.12 and 1.36
17. Phil’s fruit stall sells apples and oranges. The number of apples sold per day is normally
distributed with a mean of 53 and a standard deviation of 10. The number of oranges sold per
day is normally distributed with a mean of 32 and a standard deviation of 5. One day, Phil sells
42 apples and 41 oranges. Which was more unusual: the number of apples sold that day, or the
number of oranges?
18. The scores on a certain IQ test are normally distributed with a mean of 100 and a standard
deviation of 20.
(a) What proportion of people score less than 70 on this test?
(b) What proportion of people score between 125 and 135?
(c) What score would a person need to achieve on this test to be in the top 2.5%?
19. Weights of apples (in grams) are normally distributed with μ = 85 and σ = 6. A supermarket
will not stock apples which are too heavy (>97 grams) or too light (<70 grams). What
percentage of apples will be rejected by the supermarket?
20. At a large University, scores in the first-year subject Mathematics 1 are normally distributed
with μ = 68 and σ = 12. Students who score at least 50% pass the subject but only
students who score at least 60% are allowed to enrol in the second year subject Mathematics 2.
(a) What proportion of students pass Mathematics 1?
(b) What proportion of students pass Mathematics 1 but are not allowed to enrol in Mathematics 2?
(c) If the top 10% of Mathematics 1 students are allowed to enrol in the Advanced stream in
second year Mathematics, what is the cut-off score for entry into the Advanced stream?
21. A normal random variable, X, has a standard deviation of 10. The probability that X is less
than 43 is 0.1539. Find the mean of X.
22. If a random sample of n observations is taken from a distribution or population with mean
μ and standard deviation σ, use the formulae on pages 15 and 16 of Chapter 10 to show that
E(X̄) = μ and Var(X̄) = σ²/n.
23. From 1937-1970 inclusive, Disney released a total of 92 feature films. These had an average
running time of 93 minutes, with a standard deviation of 20 minutes. 12 Disney films chosen
at random from this era had an average running time of 106 minutes with a standard deviation
of 18 minutes.
(a) What is the population mean, μ?
(b) What is the sample mean, x̄?
(c) What is the population standard deviation, σ?
(d) What is the sample standard deviation, s?
(e) What is the sample size, n?
(f) Consider the distribution of sample means taken from a large number of samples of size
12. What is the mean and standard deviation of this distribution?
24. The distribution of the number of claims made by customers throughout the life of their
insurance policies is strongly skewed to the right. Which of the following statements about this
distribution is correct?
(a) The Central Limit Theorem tells us that as we look at more and more customers, their average
number of claims gets closer to the population mean.
(b) The Law of Large Numbers tells us that if we consider a large sample of customers, the
distribution of the average number of claims will be approximately normal.
(c) The Central Limit Theorem tells us that if we consider a small sample of customers, the
distribution of the average number of claims will not be approximately normal.
25. Sasha’s Sashimi Shop prides itself on its quick service. The length of time a customer waits
in line before being served is normally distributed with an average of 63 seconds and a standard
deviation of 20 seconds.
(a) Calculate the probability that an individual customer waits more than 90 seconds to be
served.
(b) Calculate the probability that the average waiting time of a sample of 4 customers is more
than 90 seconds.
26. At Louiser’s Palace Casino, each patron loses an average of $200, with a standard deviation
of $5,000. Consider all patrons to be independent of one another.
(a) Calculate the Casino’s expected profit and standard deviation on a night when 4500 people
visit.
(b) What is the minimum number of patrons required to ensure that the Casino’s expected profit
minus three standard deviations is greater than zero?
27. Weights of strawberries (in grams) are normally distributed with μ = 12 and σ = 1.2.
Suppose strawberries are packed into boxes of 16 and that a box is rejected if it weighs less than
184 grams. Assuming the weight of packaging to be negligible, what proportion of boxes are
rejected?
28. An insurance company sells flood insurance at a price of $300 per policy. The expected
payout per policy, based on many years of records, is $240 with a standard deviation of $500.
If n policies are sold then the company makes a loss if the average payout per policy, X̄, exceeds
$300.
(a) What is the probability of a loss if 100 policies are sold?
(b) What is the probability of a loss if 400 policies are sold?
(c) If the company wants the probability of a loss to be less than 2%, then how many policies
should be sold?
29. Gary gambles regularly at a casino, occasionally winning a lot of money but usually losing.
In the long-term, Gary loses an average of $800 per game with a standard deviation of $21000.
(a) If Gary plays 160 games, what is the probability that he will make a profit (meaning his
average amount of winnings X̄ is greater than 0)?
(b) Suppose the casino wants the probability of Gary making a profit on n games to be less than
4%. How many games should Gary play?
31. A researcher records the weights of 160 randomly chosen men in a large town and finds
the sample mean x̄ to be 76.32 kg with s = 4.2 kg. Find a 95% confidence interval for the
average weight of men in the town.
32. In 2017, the sale price (in thousands of dollars) of thirty randomly selected 2-bedroom
dwellings in the South Australian town of Coober Pedy are listed below.
260, 85, 125, 53, 57, 89.5, 170, 47.5, 150, 65, 119, 50, 65, 119, 64
210, 99, 260, 72, 55, 52.5, 130, 89.5, 170, 45, 115, 67, 50, 120, 78
Calculate a 95% confidence interval for the average sale price of a 2-bedroom dwelling in
Coober Pedy in 2017.
33. An orange orchard harvests several thousand oranges per season whose weights are normally
distributed with mean μ grams. A random sample of 100 oranges, taken just before harvest
time, yields a mean weight of x̄ = 88.32 grams with s = 6.4 grams. Find a 95% confidence
interval for μ.
34. Elspeth is interested in the average age of elm trees in a particular region. She selects 36
elm trees at random and has them dated using a technique known as dendrochronology. The
sample mean is found to be 43 years, and the sample standard deviation is 5 years.
(a) Calculate a 90% confidence interval for the average age of Elspeth’s elm trees.
(b) What sample size would be required to halve the width of the confidence interval? (Assume
the standard deviation remains constant).
35. A license is required to fish in a particular river catchment. At the end of the fishing season,
a random survey of 80 licensed fishermen shows an average catch of 36.4 fish per license with
standard deviation 5.6.
(a) Find a 99% confidence interval for the average catch for all licensed fishermen.
(b) Find a 99% confidence interval (to the nearest integer) for the total number of fish caught
in the catchment if there are 2000 licensed fishermen.
36. Form appropriate null and alternative hypotheses for each of the scenarios below.
Remember to define the parameter.
(a) A ski resort manager believes that the average number of skiers per day attempting the
double black diamond run has increased from its long term average of 25.
(b) Felix can solve a Rubik’s cube in an average of 52 seconds. He purchases a new brand of
Rubik’s cube and wonders whether this will affect his average time.
(c) A study aims to test the claim that the average Australian household has less than two pets.
37. Which of the following could be the null and alternative hypotheses for a study? Explain
what is wrong with each of the other options.
(a) H0 = 700, HA ≠ 700
(b) H0: μ = 700, HA: μ ≠ 700
(c) H0: x̄ = 700, HA: x̄ ≠ 700
(d) H0: μ ≠ 700, HA: μ = 700
In the following questions p denotes the P-value.
41. The average age at death in a certain country is 75 years, with a standard deviation of 10
years. Researchers believe the average age at death of the country’s indigenous population is
significantly lower. A random sample of 50 records gives a sample mean of 72.4 years as the
age at death for the country’s indigenous population.
(a) Form an appropriate null and alternative hypothesis for this study.
(b) Calculate the z-score.
(c) Sketch a Standard Normal curve, position the sample on it and shade the area corresponding
to the P-value.
(d) Find the P-value.
(e) Are the results significant at the 5% level? Write your conclusions in the context of the
question.
42. A food processing company manufactures health bars with an average vitamin Z content
of 164 units per gram. A sample of 120 bars, manufactured using a new process designed to increase
the vitamin Z content, has an average vitamin Z content of 164.8 units per gram with
standard deviation 6.2. State the null and alternative hypotheses, calculate the P-value and test
the null hypothesis at the 5% significance level.
43. A railway line for many years has had an average of 7321.4 passengers per day. The line
operators are concerned that a change of timetable may reduce patronage (reducing profits) or
increase patronage (resulting in possible over-crowding and a breach of safety standards). The
60 days after the change is implemented have an average of x̄ = 7277.2 passengers daily with
s = 180.2. State the null and alternative hypotheses, calculate the P-value and test the null
hypothesis at the 5% significance level.
44. In a particular country, recent University graduates have an average starting salary of
$60235. One University claims that it produces graduates with superior earning capabilities, citing
the fact that its recent cohort of 1323 graduates have an average starting salary of $63726
with standard deviation of $10003. State the null and alternative hypotheses, calculate the P-value
and comment on the University’s claim.
45. The average recovery time for a particular type of muscle injury (using standard treatments)
is 10.42 days. A physiotherapist claims that a new treatment will significantly reduce
recovery times. 50 patients with this particular muscle injury are given the new treatment and
the recovery times average x̄ = 10.01 days with standard deviation s = 1.1.
Find the P-value and test the physiotherapist’s claim at the 1% significance level.
46. The time taken to produce a satchel in Leigh’s leathergoods factory is normally distributed
with a long-term average of 53 minutes and a standard deviation of 3 minutes. After replacing a
machine which breaks down one day, Leigh wonders whether this average time has changed.
He records the times taken to produce 9 satchels, chosen at random, as follows:
49.1, 48.7, 57.2, 53.3, 51.8, 47.6, 50.0, 55.9, 49.3
(a) Conduct a hypothesis test at the 10% significance level to investigate.
(b) The sample size for this study is relatively small. Does this violate the conditions of the
hypothesis test? Why or why not?
ANSWERS:
1. (a) Categorical. (b) Quantitative, discrete. (c) Quantitative, continuous. (d) Quantitative,
discrete. (e) Categorical. (f) Quantitative, continuous. (g) Quantitative, discrete. (h) Categorical.
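2. (a) [Figure: frequency histogram; vertical axis: Frequency.]
3. [Figure: relative frequency histogram; vertical axis: Relative frequency (%), horizontal axis: Number (1 to 6).]
Approximately uniform.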
4. [Figure: frequency histogram; vertical axis: Frequency, horizontal axis: Time (1000 to 2000).]
5. (a) [Figure: frequency histogram with the median and mean marked; vertical axis: Frequency, horizontal axis: Score (20 to 100).]
6. (a) [Figure: frequency histogram; vertical axis: Frequency, horizontal axis: Goals (0 to 9).]
Slightly right-skewed.
7. [Figure: relative frequency histogram; vertical axis: Relative frequency, horizontal axis: Customers (110 to 210).]
8. (a) Let X = the number of wins in the season. Then X ∈ {0, 1, 2, ..., 7} is a discrete random
variable. (b) Let X = the number of hours of sleep a student gets between 9pm and 5am. Then
X is a continuous random variable with 0 ≤ X ≤ 8.
9. 31, 76.4
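The two values in answer 9 follow from E(aX + bY) = aE(X) + bE(Y) and, for independent X and Y, Var(aX + bY) = a²Var(X) + b²Var(Y). A minimal Python sketch of that arithmetic is shown below; the function name is illustrative and not part of the course material.

```python
def linear_combination(a, mean_x, var_x, b, mean_y, var_y):
    """Mean and variance of aX + bY, assuming X and Y are independent."""
    mean = a * mean_x + b * mean_y
    # Coefficients are squared, so the minus sign on b does not reduce the variance.
    var = a**2 * var_x + b**2 * var_y
    return mean, var

# Question 9: 4X - 2Y with E(X) = 12, Var(X) = 4.2, E(Y) = 8.5, Var(Y) = 2.3.
mean, var = linear_combination(4, 12, 4.2, -2, 8.5, 2.3)
print(round(mean, 2), round(var, 2))
```

The point worth noticing is that the variances add even though Y is subtracted.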
10. E(Z) = (μ - μ)/σ = 0, Var(Z) = (σ² + 0)/σ² = 1
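The steps behind answer 10 can be written out as follows. This sketch assumes that the formulae referred to in the question are the usual rules E(aX + b) = aE(X) + b and Var(aX + b) = a²Var(X), applied with a = 1/σ and b = -μ/σ.

```latex
\begin{align*}
E(Z) &= E\!\left(\frac{X-\mu}{\sigma}\right)
      = \frac{E(X)-\mu}{\sigma}
      = \frac{\mu-\mu}{\sigma} = 0, \\
\operatorname{Var}(Z) &= \operatorname{Var}\!\left(\frac{X-\mu}{\sigma}\right)
      = \frac{1}{\sigma^{2}}\operatorname{Var}(X-\mu)
      = \frac{\sigma^{2}}{\sigma^{2}} = 1.
\end{align*}
```

Subtracting the constant μ shifts the distribution without changing its spread, which is why Var(X - μ) = Var(X) = σ².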
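36. (a) H0: μ = 25, HA: μ > 25 where μ is the average number of skiers per day attempting
the double black diamond run.
(b) H0: μ = 52, HA: μ ≠ 52 where μ is Felix’s average time (in seconds) to solve the new brand
of Rubik’s Cube.
(c) H0: μ = 2, HA: μ < 2 where μ is the average number of pets per household in Australia.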
37. (a) is incorrect. H0 and HA are not numbers, they are statements.
(b) is written correctly. These could be the null and alternative hypotheses of a study.
(c) is incorrect. H0 and HA are statements about the population mean, not the sample
mean.
(d) is incorrect. H0 must contain an equality, HA must contain an inequality.
38. (b)
39. (c)
40. (b), (d).
41. (a) H0: μ = 75, HA: μ < 75 where μ is the average age at death of the country’s indigenous
population.
(b) z = -1.84
(c) [Sketch: standard normal curve with z = -1.84 marked to the left of 0 and the area to its left shaded.]
(d) P = 0.0329
(e) The results are statistically significant at the 5% level. There is evidence that the average
age at death of the country’s indigenous population is less than 75.
42. H0: μ = 164 versus HA: μ > 164, P-value = 0.0793, accept (do not reject) H0 at the 5% level
of significance.
43. H0: μ = 7321.4 versus HA: μ ≠ 7321.4, P-value = 0.0574, accept (do not reject) H0 at the
5% level of significance.
44. H0: μ = 60235 versus HA: μ > 60235, Z value is 12.69 which means the P-value is approximately
0. There is very strong evidence for rejecting H0 and accepting the University’s
claim that its recent graduates have higher earning capability.
45. P-value = 0.0041 so reject the null hypothesis (μ = 10.42) at the 1% significance level and
conclude that the new treatment reduces recovery times.
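Answers 41 to 45 all use the same calculation: a z-score for the sample mean, followed by a tail area under the standard normal curve. The following Python sketch reproduces that calculation with the standard library. The helper name `z_test` and its `alternative` labels are illustrative assumptions, and for questions 42 to 45 the sample standard deviation s is used in place of σ, which relies on the samples being large.

```python
from math import sqrt
from statistics import NormalDist

def z_test(sample_mean, mu0, sd, n, alternative):
    """Return (z, P-value) for a one-sample z-test of H0: mu = mu0."""
    z = (sample_mean - mu0) / (sd / sqrt(n))
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "less":
        p = std_normal.cdf(z)               # lower tail
    elif alternative == "greater":
        p = 1 - std_normal.cdf(z)           # upper tail
    elif alternative == "two-sided":
        p = 2 * std_normal.cdf(-abs(z))     # both tails
    else:
        raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")
    return z, p

# (sample mean, hypothesised mean, standard deviation, sample size, alternative)
cases = {
    41: (72.4, 75, 10, 50, "less"),
    42: (164.8, 164, 6.2, 120, "greater"),
    43: (7277.2, 7321.4, 180.2, 60, "two-sided"),
    44: (63726, 60235, 10003, 1323, "greater"),
    45: (10.01, 10.42, 1.1, 50, "less"),
}

results = {q: z_test(*args) for q, args in cases.items()}
for q, (z, p) in results.items():
    print(f"Q{q}: z = {z:.2f}, P-value = {p:.4f}")
```

The printed P-values may differ from the answers above in the last decimal place, which would be expected if the answers were obtained by rounding z to two decimal places before reading a table.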
46. (a) Z value is -1.57, P-value is 0.1164. Fail to reject H0. There is insufficient evidence to
conclude that the new machine has had an effect on the average time taken to produce a satchel.
(b) No. The population distribution is normal, so the sampling distribution of the sample mean
is normal regardless of the sample size (Central Limit Theorem).
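Question 46 differs from the earlier tests in that the sample mean must first be computed from raw data, and the population standard deviation (3 minutes) is known. A minimal Python sketch of the two-sided test under those assumptions is shown below; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist, mean

# Production times (minutes) for the 9 satchels in question 46.
times = [49.1, 48.7, 57.2, 53.3, 51.8, 47.6, 50.0, 55.9, 49.3]

mu0 = 53      # long-term average under H0
sigma = 3     # known population standard deviation
alpha = 0.10  # significance level stated in the question

x_bar = mean(times)
z = (x_bar - mu0) / (sigma / sqrt(len(times)))

# Two-sided test: Leigh only asks whether the average has changed.
p_value = 2 * NormalDist().cdf(-abs(z))
reject_h0 = p_value < alpha

print(f"x_bar = {x_bar:.2f}, z = {z:.2f}, P-value = {p_value:.4f}, reject H0: {reject_h0}")
```

The P-value exceeds 0.10, so the sketch agrees with the answer above that H0 is not rejected.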