AS Exercise 2021

1. The document describes 6 sampling exercises involving different sampling methods like simple random sampling, stratified sampling, systematic sampling, and cluster sampling. It also contains exercises on descriptive statistics such as measures of central tendency, dispersion, percentiles. 2. Quantitative and qualitative variables from health surveys are identified as being discrete or continuous. Measures of location and dispersion are calculated for various data sets. 3. Skewness of distributions are determined by comparing quartiles from sample data on expenses, salaries, and other variables. Measures include mean, median, range, IQR, standard deviation, percentiles.

Uploaded by

Pik ki Wong
Copyright
© All Rights Reserved

AS 2021-22

Applied Statistics – Exercise

Chapter 1 – Sampling Methods

1. A multinational company has offices in five Asian countries. The company's human resources
executive decides to survey employees' opinions about the existing insurance policy. Since there
is insufficient time to conduct a census in all countries, it is decided to conduct a survey of
140 employees in total. The numbers of employees in Singapore, Korea, China, Malaysia and
Thailand are 2000, 3000, 5000, 1500 and 2500 respectively. The sample is required to be a random
representation without bias towards any country.

(a) Which sampling method should be used?


(b) How many representatives should be taken from each country?
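
As a rough illustration of proportional allocation in stratified sampling, the sketch below gives each stratum the same share of the sample as it has of the population. The figures are those of Question 1; the variable names are illustrative, and the integer division only works here because every share happens to be a whole number.

```python
# Proportional allocation for stratified sampling (figures from Question 1).
population = {
    "Singapore": 2000,
    "Korea": 3000,
    "China": 5000,
    "Malaysia": 1500,
    "Thailand": 2500,
}
sample_size = 140

total = sum(population.values())  # all employees across the five countries

# Each stratum gets sample_size * (stratum size / population size) units.
# Assumption: the shares are whole numbers; otherwise a rounding rule is needed.
allocation = {
    country: sample_size * count // total
    for country, count in population.items()
}

for country, n in allocation.items():
    print(f"{country}: {n}")
```

When the shares are not whole numbers, the allocations are usually rounded and then adjusted so that they still add up to the required sample size.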

2. The Hong Kong Federation of Restaurants & Related Trades is conducting a survey about the
working environment in restaurants. Among all registered restaurants in Hong Kong, 10
restaurants will be randomly selected for the survey. All full-time staff in the selected
restaurants will be invited to fill in the questionnaire.

(a) Describe the population of the study.


(b) What is the name of this sampling method?


3. A supermarket chain with 1500 supermarkets wishes to undertake a survey to understand the
safety facilities in its stores. Questionnaires will be sent to a sample of 60 supermarkets, and each
store manager will be responsible for filling in the questionnaire. Company records show the
locations of the 1500 supermarkets as follows:

District                  Central   South   North   East   West
Number of supermarkets      550      250     200    350    150

(a) Describe the population of the study.


(b) Suppose the sample is selected by the stratified sampling method, with the five districts as five strata.
How many supermarkets should be selected from each district?
(c) Suppose the sample is selected by the systematic sampling method instead of the stratified sampling
method. What is the advantage of using the systematic sampling method?
(d) When using the systematic sampling method, a unique identity number should be assigned to each
supermarket. Suppose the supermarket with identity number 0037 is included in the sample.
Write down the identity numbers of the next three selected supermarkets.
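
The mechanics of systematic sampling can be sketched as follows: the sampling interval is the population size divided by the sample size, and later units are found by stepping through the list at that interval. The sketch assumes identity numbers run from 0001 to 1500 in list order; the function name is illustrative.

```python
# Systematic sampling: step through the list at a fixed interval.
population_size = 1500
sample_size = 60
interval = population_size // sample_size  # every 25th supermarket


def next_selected(current_id, count, step=interval, size=population_size):
    """Return up to `count` identity numbers selected after `current_id`."""
    ids = []
    for i in range(1, count + 1):
        candidate = current_id + i * step
        if candidate > size:  # stop at the end of the list
            break
        ids.append(candidate)
    return ids


# Identity numbers are printed with four digits, matching the 0037 format.
print([f"{i:04d}" for i in next_selected(37, 3)])
```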

4. Classify each of the following samples as simple random, systematic, stratified, or cluster:
(a) Every 8th student entering a school is asked to name his / her favourite teacher.
(b) Every nurse working in four randomly selected hospitals is asked about his / her total working hours
on 1 September 2019.
(c) Immigrants are selected based on random numbers to take part in a survey about education level.
(d) In a private estate with 20 buildings, the property management team wants to conduct a survey to
investigate residents' level of satisfaction with the service. Two buildings will be selected randomly
and all apartments in these two buildings will receive a questionnaire.
(e) Every 1 out of 75 visitors leaving the exhibition center is invited to join a survey about their
opinion on the exhibition.
(f) 5, 8, and 3 students are randomly selected from the Korean, Japanese, and French clubs respectively,
as fair representatives of a university language center, to take part in a Foreign Language
Competency Test.


5. A questionnaire is designed for a telecommunication company to study household
long-distance call usage. Determine whether each of the following random variables is
quantitative or qualitative. If quantitative, determine whether the variable is discrete or
continuous.

(a) Number of telephones (fixed line) in the household


(b) Number of long-distance calls made in August 2019
(c) Length (in minutes) of the longest long-distance call made in August 2019
(d) Monthly charges (in dollars and cents) for long-distance calls made in August 2019
(e) Are you satisfied with the long-distance calls service?
- very satisfied / satisfied / not satisfied / not at all satisfied

6. Below are a few questions extracted from the Health Survey questionnaire prepared by the
Department of Health. Determine whether each of the following random variables is quantitative
or qualitative. If quantitative, determine whether the variable is discrete or continuous.

(a) Gender (M / F)
(b) Weight (kg)
(c) Height (cm)
(d) Total number of residents in your household
(e) Do you smoke? (Yes / No)


Chapter 2 – Statistical Measures and Data Presentation

1. Below are the numbers of minutes (rounded to the nearest minute) that a sample of 20 employees in a
company spent out at lunch on 2 July 2021:

39, 41, 45, 52, 53, 55, 58, 59, 60, 61, 62, 62, 63, 65, 65, 65, 69, 70, 72, 77

(a) Find the mean, mode, median, the first quartile, third quartile, 17th percentile, 87th percentile of the
data.
(b) Find the standard deviation and variance of the data.
(c) Comment on the skewness of the above data by comparing the quartiles. State your reason.
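
A minimal sketch of these summary measures using Python's standard `statistics` module, applied to the lunch data above. Quartile and percentile conventions differ between textbooks; the `method="exclusive"` option used here places the p-th quantile at position p(n + 1) and interpolates linearly, which is an assumption and may not match the convention in the lecture notes.

```python
from statistics import mean, median, mode, quantiles, stdev, variance

minutes = [39, 41, 45, 52, 53, 55, 58, 59, 60, 61,
           62, 62, 63, 65, 65, 65, 69, 70, 72, 77]

avg = mean(minutes)
mid = median(minutes)
most_common = mode(minutes)

# Quartiles and percentiles under the "exclusive" convention (an assumption).
q1, q2, q3 = quantiles(minutes, n=4, method="exclusive")
percentiles = quantiles(minutes, n=100, method="exclusive")
p17, p87 = percentiles[16], percentiles[86]

# Sample (n - 1 divisor) standard deviation and variance.
sample_sd = stdev(minutes)
sample_var = variance(minutes)

# Compare the two halves of the interquartile range to judge skewness.
if q3 - q2 > q2 - q1:
    shape = "skewed to the right"
elif q3 - q2 < q2 - q1:
    shape = "skewed to the left"
else:
    shape = "approximately symmetric"

print(avg, mid, most_common, q1, q3, p17, p87, sample_sd, sample_var, shape)
```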

2. The spending ($) of a random sample of 10 people in a restaurant is shown below:

95, 60, 120, 75, 60, 70, 40, 35, 115, 60

From the above data, find the following measures of location and dispersion.

(a) Mean;
(b) Mode;
(c) Median;
(d) First quartile;
(e) Third quartile;
(f) Range;
(g) IQR; and
(h) Standard deviation.
(i) Describe the skewness of the data by comparing the quartiles.


3. The following are the daily expenses ($) on 1 July 2021 of a sample of 12 tourists from Singapore.

873 2460 951 730 327 1214
768 5293 662 591 820 4260

(a) Compute the sample mean, sample median, first quartile, third quartile, and the 10th percentile.
(b) Compute the range, inter-quartile range, and sample standard deviation.
(c) How do you describe the shape of the distribution? State your reason.

4. The following are the bonuses paid in January to sales staff employed by two companies,
according to their performance last year.

Company A: 23000, 30000, 28000, 31000, 29000, 26000,
           34000, 36000, 28000, 32000, 25000, 26000
Company B: 53000, 65000, 70000, 50000, 62000, 58000,
           52000, 63000, 69000, 72000, 64000, 54000

Find the population mean, median, first quartile, third quartile, variance and standard deviation for
company A and company B respectively. Present the results in a table for easy comparison.


5. Peter is working in the Labor Department. He is working on research which studies the daily
expense of laborers who are working in different districts. He samples 20 salespersons, 8
working in Yuen Long and 12 working in Causeway Bay. He records their spending on lunch on
1 September 2021 and the results are as follows:

Yuen Long: 33 35 38 40
48 53 55 62

Causeway Bay: 43 48 50 54
58 63 74 80
82 88 89 93

(a) Find the sample mean and sample standard deviation of spending on lunch on 1 September 2021 for
those working in Yuen Long and Causeway Bay respectively.
(b) Give a simple comparison of the spending on lunch by workers in Yuen Long and
Causeway Bay, using the results in part (a).
(c) For the combined data, find the sample mean, sample standard deviation, and 82nd percentile of
spending on lunch on 1 September 2021.
(d) Use quartiles to comment on the skewness of the combined data. Show your calculation.

6. The salaries of a sample of four salespersons are as follows: $48000, $55000, $51000 and $50000.

(a) Find the mean and the standard deviation of these four sample values.

There are two suggestions for salary increment:

(b) Method A: Each person has an increment of $5000.
Calculate the mean and the standard deviation of these four sample values after the salary
increment by
(i) calculating the salary of each person
(ii) defining the new salary (Y) as a linear function of the original salary (X)

(c) Method B: Each person has a 20% increment in salary.
Calculate the mean and the standard deviation of these four sample values after the salary
increment by
(i) calculating the salary of each person
(ii) defining the new salary (W) as a linear function of the original salary (X)
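
The two routes in parts (i) and (ii) can be compared with a short sketch. It relies on the standard result that if Y = a + bX, then the mean of Y is a + b times the mean of X and the standard deviation of Y is |b| times the standard deviation of X. The helper name `transformed_summary` is illustrative.

```python
from statistics import mean, stdev

salaries = [48000, 55000, 51000, 50000]


def transformed_summary(values, a, b):
    """Mean and sample SD of Y = a + b*X, using only the summary of X."""
    return a + b * mean(values), abs(b) * stdev(values)


# Method A: Y = X + 5000 (a = 5000, b = 1).
direct_a = [x + 5000 for x in salaries]
formula_a = transformed_summary(salaries, 5000, 1)

# Method B: W = 1.2 X (a = 0, b = 1.2).
direct_b = [1.2 * x for x in salaries]
formula_b = transformed_summary(salaries, 0, 1.2)

print("Method A:", mean(direct_a), stdev(direct_a), formula_a)
print("Method B:", mean(direct_b), stdev(direct_b), formula_b)
```

Adding a constant shifts the mean but leaves the spread unchanged, while multiplying by a constant scales both.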


7. There are 300 overweight patients joining a weight reduction program which aims to help patients
reduce their body weight by 10%. A random sample of 16 patients is selected to monitor the
progress of the program. Below are the weights (rounded to the nearest kg) of these 16 patients
before joining the program.

86 85 93 108 84 84 99 92
91 91 87 103 113 98 118 106

(a) Find the sample mean and sample standard deviation for the data.
(b) Find the first quartile, median, and third quartile for the data.
(c) Find the 10th percentile and 90th percentile.
(d) The target of the program is to reduce the body weight by 10%. If all patients exactly meet the
target, what are the sample mean and sample standard deviation of the body weight of these 16
patients after the program?
(e) Report the sample mean and sample standard deviation of the body weight of these 16 patients
after the program in terms of pounds.

8. A sample of 10 headphones is selected from an online store and the selling prices (in US$) are
presented in an ordered array as:

16, 25, 75, 125, 150,
185, k, 190, 200, 260

(a) It is known that the sample mean selling price of the above data is US$ 141.1, find the value of k.

(b) Find the sample standard deviation and sample variance of the above data.
(c) Find the first quartile, median and third quartile of the above data.
(d) Customers can place overseas orders with the online store. The currency exchange rate is US$ 1
to HK$ 7.75. When an order is made in Hong Kong, there is a service charge of HK$ 35 for
each item. Find the sample mean and sample standard deviation of the payment for ordering a
headphone in HK$.


9. Identify each of the following symbols used in the lecture notes by writing down the summary
statistic it represents.
(a) µ
(b) σ
(c) σ²
(d) x̄
(e) s²
(f) s
(g) IQR
(h) Q1
(i) Q2
(j) Q3


Chapter 3 – Probability

1. A travel agency is conducting research to study the number of trips a customer takes in a
year. Below is the frequency table:

Number of trips        1    2    3    4    5    6
Number of customers   43   89   56   12    4    2

Convert the above frequency table to the probability table of the number of trips a customer takes
in a year.

2. An American restaurant is conducting research to collect customers' opinions on its food. One
question in the questionnaire asks the customer to choose his / her favourite pizza offered
by the restaurant. 60 customers are classified according to their age group and their
favourite pizza:

                        Favourite pizza
Age group      Hawaiian   California   New York City   Los Angeles
18 – 25            7           6              6              3
26 – 45            1           9              0             10
46 or above        3           2              5              8

(a) In general, which pizza is the most popular choice among all customers? What percentage of
customers like this type of pizza?
(b) Which pizza is the most popular choice among customers aged between 18 and 25? What
proportion of customers aged between 18 and 25 like this type of pizza?
(c) Which pizza is the most popular choice among customers aged between 26 and 45? What is the
probability that a randomly selected customer aged between 26 and 45 likes this type of pizza?
(d) Which pizza is the most popular choice among customers aged 46 or above? What
percentage of customers aged 46 or above like this type of pizza?


3. A company is reviewing the travel expenses incurred by its staff. The table below is the
frequency table of the number of business trips made in a month by senior managers from
different departments:

                 Number of business trips made in a month, X
Department     x=0   x=1   x=2   x=3   x=4   x=5
Marketing       10    15     8     4     2     2
Finance          8    10    18     4     3     0
Research        30    12    10     0     0     0

(a) If one senior manager is selected randomly from the company, compute the following
probabilities: P(X = 0), P(X = 1), P(X = 2), P(X = 3), P(X = 4), and P(X = 5), where
X is the number of business trips the selected senior manager makes in a month.
(b) Recalculate the probabilities P(X = 0), P(X = 1), P(X = 2), P(X = 3), P(X = 4), and P(X = 5),
given that the senior manager works in the marketing department.
(c) Recalculate the probabilities P(X = 0), P(X = 1), P(X = 2), P(X = 3), P(X = 4), and P(X = 5),
given that the senior manager works in the finance department.
(d) Recalculate the probabilities P(X = 0), P(X = 1), P(X = 2), P(X = 3), P(X = 4), and P(X = 5),
given that the senior manager works in the research department.
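
The difference between the unconditional probabilities in part (a) and the conditional ones in parts (b) to (d) comes down to the denominator: the grand total for the former, the department total for the latter. A minimal sketch using exact fractions, with illustrative variable names:

```python
from fractions import Fraction

# Frequency table from Question 3: counts for x = 0, 1, ..., 5 by department.
counts = {
    "Marketing": [10, 15, 8, 4, 2, 2],
    "Finance": [8, 10, 18, 4, 3, 0],
    "Research": [30, 12, 10, 0, 0, 0],
}

grand_total = sum(sum(row) for row in counts.values())

# P(X = x): column total divided by the grand total.
marginal = [
    Fraction(sum(row[x] for row in counts.values()), grand_total)
    for x in range(6)
]

# P(X = x | department): cell count divided by the department (row) total.
conditional = {
    dept: [Fraction(c, sum(row)) for c in row]
    for dept, row in counts.items()
}

print("P(X = x):", [str(p) for p in marginal])
for dept, probs in conditional.items():
    print(dept, [str(p) for p in probs])
```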

4. A publisher conducted a survey of a sample of 600 subscribers. In the survey, each subscriber was
asked to place himself/herself in the most appropriate of four categories: full-time student,
full-time employment, part-time employment, and self-employed. Among all 600 subscribers,
the numbers falling in the four categories are as follows: 120 are full-time students, 350
are in full-time employment, 80 are in part-time employment and 50 are self-employed. It is
also known that there are 250 male subscribers in the survey, and the numbers of them in the four
categories are 40, 180, 20, and 10 respectively.

(a) If one subscriber is selected randomly from this sample, what are the probabilities that the
selected subscriber is a full-time student, in full-time employment, in part-time employment, and
self-employed respectively?
(b) If one male subscriber is selected randomly from this sample, what are the probabilities that the
selected male subscriber is a full-time student, in full-time employment, in part-time employment, and
self-employed respectively?
(c) May is one of the female subscribers. What are the probabilities that May is a full-time student,
in full-time employment, in part-time employment, and self-employed respectively?


5. “Walk with me” has recently conducted research on family relationships. In the survey,
there were 900 interviewees: 500 were elderly and 400 were teenagers. Each interviewee was
asked “How often do you have dinner with your family?” The interviewee would choose
between “seldom: < 7 days in a month”, “sometimes: 7 days to 14 days in a month”, “quite
often: 15 days to 21 days in a month”, and “always: more than 21 days in a month”. Among all
interviewees, there were 240, 350, 60 and 250 responses of seldom, sometimes, quite often,
and always respectively. Among the elderly, 210 seldom have dinner with family and
180 sometimes have dinner with family. Among the teenagers, 40 quite often
have dinner with family and 160 always have dinner with family.

(a) What is the probability that a randomly selected person from the survey always has dinner
with family?
(b) What is the probability that a randomly selected elderly person from the survey seldom has dinner
with family?
(c) What is the probability that a randomly selected elderly person from the survey always has dinner
with family?
(d) If an interviewee is known to seldom have dinner with family, what is the probability that
he / she is a teenager?
(e) If an interviewee is known to always have dinner with family, what is the probability that he
/ she is an elderly person?


Chapter 4 – Probability Distribution

1. A marketing assistant is observing customers in a supermarket. He randomly observes a sample
of 150 customers and records the number of cans of cola they purchase: 4 cans, 6 cans or 9 cans.
In the sample, there are 60 male customers and 90 female customers. His observations are shown
below:

Male         Purchasing 4 cans   Purchasing 6 cans   Purchasing 9 cans
Frequency           15                  28                  17

Female       Purchasing 4 cans   Purchasing 6 cans   Purchasing 9 cans
Frequency           51                  32                   7

Using the combined result of all 150 customers, and letting X represent the number of cans of
cola purchased by a customer, present the probability distribution function of X.

2. The following is the probability distribution function of the number of tutorial classes (X) a
secondary school student attends in a week.

x 0 1 2 3 4 5
P(X = x) k k 0.4 0.15 2k 2k

(a) What is the value of k?


(b) Most likely, how many tutorial classes does a secondary school student attend in a week? What is the
corresponding probability?
(c) What is the probability that a secondary school student attends at least three tutorial classes in a
week?
(d) What is the probability that a secondary school student attends at most four tutorial classes in a
week?
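
Questions of this type rest on the fact that the probabilities in a distribution must add up to 1. The sketch below applies that to the table in Question 2 by storing each probability as a coefficient of k plus a constant; this representation is just one convenient assumption, not a method prescribed by the exercise.

```python
from fractions import Fraction

# Each entry is (coefficient of k, constant term): "k" is (1, 0), "0.4" is (0, 2/5).
table = {
    0: (1, Fraction(0)),
    1: (1, Fraction(0)),
    2: (0, Fraction(2, 5)),
    3: (0, Fraction(3, 20)),
    4: (2, Fraction(0)),
    5: (2, Fraction(0)),
}

coef_sum = sum(c for c, _ in table.values())   # total multiple of k
const_sum = sum(v for _, v in table.values())  # total of the known probabilities

# The probabilities sum to 1, so coef_sum * k + const_sum = 1.
k = (1 - const_sum) / coef_sum

pmf = {x: c * k + v for x, (c, v) in table.items()}

at_least_three = sum(p for x, p in pmf.items() if x >= 3)
at_most_four = sum(p for x, p in pmf.items() if x <= 4)

print("k =", k)
print({x: float(p) for x, p in pmf.items()})
print(float(at_least_three), float(at_most_four))
```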


3. The following is the probability distribution function of the number of notebooks sold (X) by Alex
in a day.

x 0 1 2 3 4
P(X = x) 4k 10k 3k 2k k

(a) What is the value of k?


(b) Most likely, how many notebooks are sold by Alex in a day? What is the corresponding
probability?
(c) Alex gets a daily bonus if he sells more than 3 notebooks in a day. What is the probability
that Alex gets the daily bonus?

4. Based on recent records, the manager of a car painting center has determined the following
probability distribution for the number of service requests per day (X). Suppose the center has the
capacity to serve two customers per day.

x 0 1 2 3 4 5
P(X = x) 0.05 0.20 0.30 0.25 0.15 0.05

(a) Most likely, how many service requests would there be in a day?
(b) What is the probability that there are fewer than 2 requests in a day?
(c) What is the probability that there are more than 2 requests in a day?
(d) By at least how much must the capacity be increased so that the probability of turning a request away
is no more than 0.1?


5. The tables below are the probability distribution functions of the number of sick leaves taken in a
month by male and female employees in a large company.

X: number of sick leaves taken by a male employee

x          0     1     2     3     4     5
P(X = x)   0.3   0.29  0.24  0.12  0.03  0.02

Y: number of sick leaves taken by a female employee

y          0     1     2     3     4     5
P(Y = y)   ?     0.32  0.34  0.06  0.04  0.03

(a) What are the expected value, variance and standard deviation of the number of sick leaves taken in a month
by male employees?
(b) What are the expected value, variance and standard deviation of the number of sick leaves taken in a month
by female employees?
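
For a discrete random variable, the expected value is the sum of x·P(X = x) and the variance is the sum of (x − E(X))²·P(X = x). A minimal sketch applied to the male distribution, with the missing female probability recovered from the sum-to-one condition; the function name `summarize` is illustrative.

```python
from math import sqrt


def summarize(pmf):
    """Return (expected value, variance, standard deviation) of a discrete pmf."""
    expected = sum(x * p for x, p in pmf.items())
    var = sum((x - expected) ** 2 * p for x, p in pmf.items())
    return expected, var, sqrt(var)


# X: sick leaves taken by a male employee (table from Question 5).
male = {0: 0.3, 1: 0.29, 2: 0.24, 3: 0.12, 4: 0.03, 5: 0.02}
male_mean, male_var, male_sd = summarize(male)

# Y: the "?" entry follows from the probabilities summing to 1.
female_known = {1: 0.32, 2: 0.34, 3: 0.06, 4: 0.04, 5: 0.03}
female = {0: 1 - sum(female_known.values()), **female_known}
female_mean, female_var, female_sd = summarize(female)

print(male_mean, male_var, male_sd)
print(female[0], female_mean, female_var, female_sd)
```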

6. The following is the probability distribution function of the projected profit, X, of a stock.

x          -10000   -2000   1000   5600   20000   25000
P(X = x)     0.15     0.1    0.2   0.25       m   0.3 – m

(a) What is the range of the possible values of m?
(b) What are E(X), Var(X), and σ(X) if m = 0.2?
(c) What should be the value of m so that the expected profit of this investment is $6900?
(d) What is the maximum possible expected profit?

7. The following is the probability distribution function of the revenue (X) from investing $10,000 in a
particular stock.

x       8000   9000   12000   18000
p(x)     0.2    0.5     0.2       k

(a) Find the value of k.
(b) Find E(X) and Var(X).
(c) Define profit Y = X – 10000. Calculate the expected profit.


8. The following is the probability distribution function of the number of job orders Amy gets in a
day (X).

x 0 1 2 3 4
P(X=x) 2a a a 3a 8a

(a) What is the value of a?


(b) What are the expectation and standard deviation of the number of job orders Amy gets in a day?
(c) Suppose Amy’s daily salary is calculated as Y = 150 + 80X, what are the expectation and standard
deviation of Amy's daily salary?

9. The daily income of a tourist guide is calculated with the formula Y = 300 + 75X, where
X is the number of tourists in the group. Suppose the distribution function of the number of
tourists in a group is as follows:

x 11 12 13 14 15 16
P(X=x) a a 3a 3a 2a 2a

(a) Calculate the expectation and standard deviation of the number of tourists in a group.
(b) Calculate the expectation and standard deviation of the daily income of a tourist guide.

10. Mary is an online trader who helps customers purchase handbags from the United States. The
expected number of purchases in a week is 4.6, with a standard deviation of 1.1. Each handbag
costs her US$1020 and she sells it at HK$10500. Suppose the fixed weekly cost to run the online
business is $520.

(a) Use X to denote the number of purchases in a week and Y to denote the profit (HK$) she earns in a
week. Express Y in terms of X. (US$1 converts to HK$7.8)
(b) Find the expectation and standard deviation of Mary’s weekly profit in HK$.


11. The following table represents the probability distribution function of the number of complaints
(X) received by a customer service desk in a day.

x 0 1 2 3 4
P(X = x) k 2k 4k 2.5k 0.5k

(a) What is the probability that there is no complaint received in a day?


(b) Find E(X), Var(X) and σ(X).

Currently there is only one customer service officer responsible for handling complaints received
at the customer service desk. The senior management is discussing if extra manpower is needed.

Assume it takes 1.5 hours to handle each complaint.

(c) Find the expected value and standard deviation of the number of hours needed for handling
complaints in a day.
(d) What is the probability that it takes more than 5 hours to handle complaints in a day?

12. The Department of Health has conducted a survey about regular body check-ups. According to
the report, an adult aged between 30 and 40 has, on average, had 1.7 regular body check-ups
in the last five years, with a standard deviation of 0.82. Another survey from the Dental
Association reports that an adult aged between 30 and 40 has, on average, had 1.05 dental
check-ups in the last five years, with a standard deviation of 0.42. Suppose the number of
regular body check-ups and the number of dental check-ups taken by an adult are independent.
Find the expectation and standard deviation of the total number of check-ups (regular body
check-ups plus dental check-ups) an adult aged between 30 and 40 has had in the last five years.
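
When two random variables are independent, their means add and their variances add, so the standard deviation of the total is the square root of the sum of the squared standard deviations (not the sum of the standard deviations). A short sketch with the figures from Question 12; the function name is illustrative.

```python
from math import sqrt


def sum_of_independent(mean_a, sd_a, mean_b, sd_b):
    """Mean and SD of A + B, assuming A and B are independent."""
    total_mean = mean_a + mean_b
    total_var = sd_a ** 2 + sd_b ** 2  # variances add under independence
    return total_mean, sqrt(total_var)


# Regular body check-ups: mean 1.7, SD 0.82; dental check-ups: mean 1.05, SD 0.42.
total_mean, total_sd = sum_of_independent(1.7, 0.82, 1.05, 0.42)

print(f"E(total) = {total_mean:.2f}, SD(total) = {total_sd:.4f}")
```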


13. The following is the probability distribution function of the number of visits to Ocean Park in a
year for a customer who has the annual pass:

x 0 1 2 3 4 5 6 7 >7
P(X = x) 0 0.02 0.19 0.34 0.35 0.07 0.02 0.01 0

(a) Calculate the (i) expectation and (ii) standard deviation of the number of visits to Ocean Park in a
year for a customer who has the annual pass.
(b) Use Y to denote the number of visits to Disneyland in a year for a customer who has the annual pass.
It is known that E(Y) = 3.67 days and σ(Y) = 1.28 days. For the group of individuals who
have both an Ocean Park annual pass and a Disneyland annual pass, and assuming that the numbers of visits
to Ocean Park and Disneyland are independent, find (i) the expectation, (ii) the variance, and
(iii) the standard deviation of the total number of visits to the two theme parks in a year.

14. Lisa is a graphic designer. She works for an advertising firm as a part-time staff member. The number
of jobs she works for the firm in a month has the following probability distribution function:

x 3 4 5 6 7 8
P(X = x) 0.1 k 3k 4k k k

(a) What is the value of k?


(b) Calculate the expectation, variance, and standard deviation of the number of jobs she works for
the firm in a month.
(c) Suppose she gets a commission of $2500 for every job. What is the probability that she gets at
least $15000 in commission in a particular month?
(d) Use Y to denote the monthly commission she earns from this advertising firm. Calculate E(Y)
and σ(Y).
(e) Lisa also works as a part-time teacher in a design school. Her monthly income from the school
has an average of $18000 with a standard deviation of $5500. Suppose her incomes from the
advertising firm and the design school are independent. Use T to denote her total monthly
income from the advertising firm and the design school. Find E(T) and σ(T).


15. Tim is a social worker serving the government schools on Hong Kong Island. He provides
consultation to four new students every week. Each student needs to be referred to the senior
social worker with probability 0.3. Assume all cases are handled independently. Use X to
denote the number of cases that need to be referred to the senior social worker in a week,
X ~ Bin(n, p).

(a) What are the values of n and p?


(b) Tabulate the probability distribution function of X.
(c) Most likely, how many cases need to be referred to the senior social worker in a week?
(d) Find E(X) and σ(X).
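
For X ~ Bin(n, p), the probability of exactly x successes is C(n, x)·p^x·(1 − p)^(n − x), with E(X) = np and σ(X) = √(np(1 − p)). The sketch below tabulates the distribution for the values stated in Question 15 (four students, referral probability 0.3); the function name is illustrative.

```python
from math import comb, sqrt


def binomial_pmf(n, p):
    """Return [P(X = 0), ..., P(X = n)] for X ~ Bin(n, p)."""
    return [comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(n + 1)]


n, p = 4, 0.3
pmf = binomial_pmf(n, p)

# The most likely value is the x with the largest probability.
most_likely = max(range(n + 1), key=lambda x: pmf[x])

expected = n * p
sd = sqrt(n * p * (1 - p))

for x, prob in enumerate(pmf):
    print(f"P(X = {x}) = {prob:.4f}")
print(most_likely, expected, round(sd, 4))
```

The same function can be reused for the later binomial questions by changing n and p.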

16. In a café, there are two types of coffee, type A and type B. Type A is more popular than type
B: 75% of the customers would choose type A coffee. Suppose now there are 12
customers in the line. Use X to denote the number of customers in the line who would choose type
A coffee,
X ~ Bin(n, p).

(a) What are the values of n and p?


(b) Calculate E(X).
(c) Hence, most likely, how many customers would choose type A coffee?
(d) Calculate P(X = 7), P(X = 8), P(X = 9), P(X = 10) and P(X = 11) respectively.

17. According to the results of a recent survey, 10% of the teenagers in a city are habitual smokers.
Now 20 teenagers are selected randomly and independently from the city.

(a) Find the probability that exactly 5 teenagers in the sample are habitual smokers.
(b) Find the probability that exactly 18 teenagers in the sample are not habitual smokers.
(c) Find the probability that at least 3 teenagers in the sample are habitual smokers.


18. It is known that 3% of the light bulbs in the production line are defective. A customer
complained about the poor quality of the light bulbs and asked for a check. A random sample of
20 light bulbs is selected and will be sent out for inspection.

(a) Most likely, how many defective light bulbs are there in the sample?
(b) What is the probability that there are exactly two defective light bulbs in the sample?
(c) What is the probability that there are at least 3 defective light bulbs in the sample?

19. Every day, the first 25 visitors to the tourist information centre take part in a lucky draw. Each of
them has a probability of 0.7 of winning a souvenir. Whether each visitor wins a souvenir is
independent of the others.

(a) What are the expectation and standard deviation of the number of visitors who get a souvenir in a day?
(b) Most likely, how many visitors get a souvenir in a day?
(c) What is the probability that more than 22 visitors get a souvenir in a day?

20. Mary has three tickets so she can join the lucky draw three times. The probability of getting a
present from each round of lucky draw is 0.15.

(a) Use X to denote the number of presents she will get after three rounds of lucky draw. Fill in the
probability distribution function with calculation.

x 0 1 2 3
P(X = x) (i) (ii) (iii) (iv)

(b) Suppose the value of each present is $40. Calculate the expectation of the total value of all
presents she will get.


21. According to medical records, 65% of patients who have been diagnosed with disease X
recover within one week. In an elderly care centre, there are 22 residents diagnosed with
disease X. The recovery time of each affected resident is independent.

(a) Most likely, how many affected residents will recover within one week? What is the
corresponding probability? (Justify your answer by calculating the probabilities of the
two most likely possibilities.)
(b) What is the probability that at most 20 of them recover within one week?
(c) The centre can apply for a medical allowance from the government for treating each affected resident.
The allowance is $800 for a patient who recovers within one week and $1200
for a patient who takes more than one week to recover. Project the total allowance for the 22
affected residents by calculating the expected total allowance.

22. According to the statistics of an education organization which assists secondary school graduates
in applying for student visas for overseas study, most graduates go to either the USA or the UK.
The company “StudyFree” regularly organizes briefing sessions for graduates to explain the
process of getting a USA student visa and a UK student visa. Each briefing session has 15
graduates, and by the end of the session each graduate has to confirm which country he / she will
apply for. The percentages of graduates interested in these two countries are as follows.

USA UK
30% 70%

Assume the choice of destination of each graduate is independent.

(a) What are the expectation and standard deviation of the number of graduates going to the UK in a
briefing session?
(b) Most likely, how many graduates in a briefing session would go to the UK? What is the
corresponding probability?
(c) What is the chance of having at least one graduate going to the USA and at least one graduate going
to the UK in a briefing session?
(d) The service charge for a USA student visa application is $1800 and the service charge for a
UK student visa application is $2200. What are the (i) expectation and (ii) standard
deviation of the total service charge collected in a briefing session?


23. Peter is a waiter in a café in Central. According to his observation, 80% of customers order
breakfast A, while the other customers order breakfast B. At this moment, all six tables in the café
are occupied, each with one customer. He is going to take an order from each customer.

(a) Use X to denote the number of customers who will order breakfast B, so that X ~ Bin(n, p). What are
the values of n and p?
(b) Most likely, how many customers will order breakfast B? Calculate this probability.
(c) What is the probability that at least two customers will order breakfast B?
(d) The price of breakfast A is $45 and the price of breakfast B is $38. Calculate the (i)
expectation and (ii) standard deviation of the revenue for repeated samples of six individual
customers.


Chapter 5 – Normal Distribution

1. Practice the use of the standard normal table

Using the standard normal table, find the following probabilities:

(a) P(0 < Z < 2)


(b) P(Z < 1.86)
(c) P(-0.24 < Z < 0)
(d) P(-0.24 < Z < 2.40)
(e) P(-1.79 < Z < -1.30)
(f) P(Z < -1.58)

2. Practice the use of the standard normal table

Using the standard normal table, find the value of a:

(a) P(0 < Z < a) = 0.32


(b) P(Z > a) = 0.35
(c) P(Z > a) = 0.825
(d) P(Z < a) = 0.15
(e) P(Z < a) = 0.65
(f) P(-a < Z < a) = 0.4568

3. The length of time a patient waits in Dr. Chan’s waiting room is known to be normally distributed
with mean 14 minutes, and standard deviation 4 minutes.

(a) Find the probability that a patient will wait for more than 20 minutes to see the doctor.
(b) What proportion of patients will wait for more than 10 minutes?

4. Suppose you must establish regulations concerning the maximum number of people who can
occupy an elevator. A study of elevator occupancies indicates that if eight people occupy an elevator, the
probability distribution of the total weight of the eight people is a normal distribution with mean of
1200 pounds and standard deviation of √9800 pounds.

(a) What is the probability that the total weight of eight people exceeds 1300 pounds?
(b) What is the probability that the total weight of eight people exceeds 1500 pounds?


5. According to past experience, the average arrival time of a flight is 18:10. Let X be the
number of minutes by which the flight is delayed. X has a normal distribution with mean 0 and standard
deviation of 10 minutes.

(a) What is the probability that the flight arrives before 18:00?
(b) Passengers must check in for a connecting flight by 18:30 at the latest. What is the probability that
passengers from the first flight arrive too late for the connecting flight? (Assume no traveling time
from aircraft to check-in)

6. In a very large class in world history, the final examination scores have a mean of 66.5 and a
standard deviation of 12.6. Assume the scores are normally distributed. The teachers are
discussing which method should be used as the grading criteria.

(a) Method 1, standard grading. Grade A is awarded to those who score more than 78 marks. What
percentage of the students should get grade A?
(b) Method 2, relative grading. If grade A is given to the top 11.7% of the students, what is the
minimum score to get grade A?

7. The amount a customer spends on a single visit to Park & Save Supermarkets has a normal
distribution with mean $75 and standard deviation $21. Park & Save Supermarkets wishes to
introduce a minimum amount for which credit cards may be used, which enables 80% of
customers to pay by credit card. At what amount should this minimum spending be set?

8. The price of an air ticket to European countries follows a normal distribution with mean $5200 and
standard deviation $740. Suppose 95% of customers spend between $(5200 - k) and $(5200 + k) on an air ticket
to European countries. What is the value of k?

9. A survey reports that the spending on an online order in supermarket SMART follows a normal
distribution with mean $360 and standard deviation $80.

(a) What is the probability that the spending of an online order is less than $428?
(b) 85% of online orders have spending between $(360 – M) and $(360 + M). What is the value
of M?


10. Peter is a minibus driver who drives between Mong Kok and Kwun Tong. It may be assumed that
the journey time for each ride is normally distributed with mean of 30 minutes and standard
deviation of 5 minutes.

(a) Peter leaves Mong Kok at 9:00 a.m. What is the probability that he arrives at Kwun Tong after
9:20 a.m.?
(b) If there is a 93.7% chance that Peter would spend less than k minutes on one ride, what is the value
of k?

11. The monthly salary of an employee in ABC company is normally distributed with mean $12000
and standard deviation $1000. There is a 5% salary increment for every employee after the New
Year.

(a) What are the mean and standard deviation of the monthly salary after the salary adjustment?
(b) If 15% of the employees earn less than $k per month after the salary adjustment, what is
the value of k?

12. May and Sam own a cafe together. The monthly revenue of the cafe follows a normal
distribution with mean $45,000 and standard deviation $8,000.

(a) What is the probability that the monthly revenue of the cafe is between $35,000 and $41,800 in a
month?
(b) There is 67% chance that the monthly revenue of the cafe is less than $K. Find the value of K.
(c) In each month, besides a basic salary of $7000, 30% of the revenue of the cafe goes to May's
salary. Find the expectation and variance of May’s monthly salary.
(d) What is the probability that May's monthly salary will be less than $17,000 in a month?


13. The manager of a local logistics company is reviewing the cost and the service charge of the
delivery service. Packages are classified as small size, middle size, and large size and the
delivery cost is calculated based on the weight of the package and the traveling distance.
According to the record, the delivery cost of a small size package follows a normal distribution
with mean $60 and standard deviation $12.

(a) What is the probability that the delivery cost of a small size package is $40 or more?
(b) 87.9% of small size packages have a delivery cost of more than $M. Find the value of M.
(c) The service charge is currently calculated by the formula, Y = 200 + 1.8X, where X is the delivery
cost for the package. Find the mean and the standard deviation of the service charge of the
delivery of a small size package.
(d) Instead of calculating the service charge by the original formula, the manager wants to fix the
service charge of delivering a small size package at $350. What proportion of small size
packages will be charged more by the new pricing system than by the original pricing system?

14. The monthly salary of the employees in a company follows a normal distribution with mean
$15,000 and standard deviation $500.

(a) What is the probability that the monthly salary of an employee is higher than $16,000?
(b) 85% of all employees have a monthly salary less than $t. Find the value of t.
(c) The salary of each employee will be increased by 10%. Find the (i) mean and (ii) standard
deviation of the adjusted monthly salary.
(d) The manager of the company claims that over 20% of the employees would have monthly salary
more than $17,000 after the adjustment. Is it true? Support your answer with calculation.


15. A travel agency has recently conducted a study in order to understand customers’
expectations of cruise tours. One major topic in the study is the budget a
customer would be willing to spend on a 5-day tour to Korea. The results show that a customer
would be willing to spend an average of $8,000 with a standard deviation of $1,200. It is
assumed that the spending is normally distributed.

(a) What is the probability that a customer would be willing to spend at least $7,000 on a 5-day tour
to Korea?
(b) There is an 85% chance that a customer would be willing to spend at least $K on a 5-day tour to
Korea. What is the value of K?
(c) The study also reports that people are willing to pay 20% more if the destination is changed to
Japan. What are the (i) mean and (ii) standard deviation of the budget for a 5-day tour to Japan?
(d) Suppose $(L1, L2) is the budget range that the middle 92% of customers are willing to spend on a 5-day
tour to Japan. What are the values of (i) L1 and (ii) L2?

16. May owns a small store selling handmade accessories. The monthly income earned by selling
earrings and the monthly income earned by selling wallets are independent normal variables. The
mean and standard deviation of the monthly income earned by selling wallets are $20000 and
$5000, while the mean and standard deviation of the monthly income earned by selling earrings
are $15000 and $3300.

(a) What are the mean and standard deviation of the total monthly income earned by selling these two
products?
(b) She will not run the business anymore if the probability that she earns less than $30000 is higher
than 0.3. Should she quit the business? Justify your answer with calculation.

17. The lifetime of a watch battery is normally distributed with mean 5400 hours and standard
deviation 40 hours. Suppose every 2 batteries are packed together. Use T to denote the total
lifetime of the 2 batteries.

(a) What are the mean and standard deviation of total lifetime?
(b) What is the probability that the total lifetime of 2 batteries is less than 10900 hours?
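
When independent normal variables are added, the means add and the variances add, and the total is again normal. The sketch below applies this to the figures in Exercise 17. It assumes the two batteries in a pack have independent lifetimes (the exercise does not say so explicitly) and uses `statistics.NormalDist` from Python 3.8 or later; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 5400, 40, 2   # lifetime mean, standard deviation, batteries per pack

# Assumption: independent lifetimes, so Var(T) = n * sigma**2.
mean_T = n * mu
sd_T = sqrt(n) * sigma       # square root of n * sigma**2

T = NormalDist(mean_T, sd_T)
p_less = T.cdf(10900)        # P(T < 10900)

print(mean_T, round(sd_T, 2), round(p_less, 4))
```

Note that T is the sum of two separate batteries, not twice one battery, so under independence its standard deviation is √2 × 40 rather than 2 × 40.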


18. Eva goes to Jennifer’s beauty salon every Sunday. According to her experience, the waiting time
to see Jennifer is normally distributed with mean 10 minutes and standard deviation 4 minutes.
The facial treatment time is normally distributed with mean 55 minutes and standard deviation 5
minutes. The waiting time and facial treatment time are independent.

(a) What is the probability that Eva waits for less than 5 minutes to see Jennifer?
(b) Using X to represent the waiting time and Y to represent the facial treatment time, write down the
distribution of the total time (waiting time plus facial treatment time) Eva spends in Jennifer’s beauty
salon.
(c) Eva arrives at the salon at 6:30 p.m. What is the probability that she leaves the salon before 7:30
p.m.?

19. John is an office assistant working in a large law firm. Every morning, he handles the
mail. On one hand, he collects mail from colleagues; on the other hand, he distributes the
incoming mail to colleagues. Suppose the time he spends on collecting mail each morning
follows a normal distribution with mean 40 minutes and standard deviation 12 minutes, while the
time he spends on delivering mail follows a normal distribution with mean 65 minutes and
standard deviation 8 minutes. It is reasonable to assume that the time he spends on collecting
mail and the time he spends on delivering mail are independent.

(a) What is the probability that he finishes the two jobs within 2 hours in a morning?
(b) There is a 10% chance that he would use more than M minutes to finish the two jobs. What is the
value of M?


20. A survey was conducted in a secondary school to investigate students’ activities during lunch time.
The survey particularly focuses on two items, (i) the total traveling time a student spends on walking
from school to the nearby restaurant and then walking back to school after lunch, and (ii) the time
a student spends on having lunch. The total traveling time follows a normal distribution with
mean 15 minutes and standard deviation 2 minutes, while the time a student spends on having
lunch follows a normal distribution with mean 22 minutes and standard deviation 5 minutes.

(a) What is the probability that a randomly selected student in the school spends more than 12
minutes on traveling?
(b) 62.5% of the students spend less than k minutes on traveling. What is the value of k?
(c) Use T to denote the total number of minutes that a student spends outside school during lunch time,
which includes the total traveling time and the time a student spends on having lunch. Assume
the traveling time and the time for lunch are independent. What are the (i) mean and (ii) standard
deviation of T?
(d) Find P(T > 35).

21. The weight of a box of chocolates of a certain brand follows a normal distribution with a mean
of 85 grams and a standard deviation of 7 grams. During the promotion period, there are
special sets, each including two randomly selected boxes of chocolates. Use T to denote the total
weight of chocolates in a special set.

(a) Find the expectation, median, and standard deviation of T.
(b) What is the probability that the weight of chocolates in a special set is less than 160 grams?


Chapter 6 – Sampling Distribution and Central Limit Theorem

1. Random samples are repeatedly taken from the distribution of X, where X ~ N(60, 4²), and the
sample means are calculated. What are the expectation and standard error of the sample mean if

(a) Sample size = 5
(b) Sample size = 10
(c) Sample size = 15

2. The weight of a can of soup follows a normal distribution with population mean of 375 grams and
standard deviation of 4 grams. Every six cans of soup are packed together randomly as a value
pack. Suppose the average weight of six cans of soup in each value pack is recorded.
What are the (a) expectation and (b) standard error of the sample mean?
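
For a sample mean of n independent observations from a population with mean μ and standard deviation σ, the expectation is μ and the standard error is σ/√n. A minimal Python sketch using the figures of Exercise 2; the function name `sample_mean_distribution` is illustrative and not part of the original sheet.

```python
from math import sqrt

def sample_mean_distribution(mu, sigma, n):
    """Return (expectation, standard error) of the mean of n independent draws."""
    return mu, sigma / sqrt(n)

# Exercise 2: cans with mean 375 g and standard deviation 4 g, six cans per pack.
expectation, std_error = sample_mean_distribution(mu=375, sigma=4, n=6)

print(expectation, round(std_error, 3))
```

The same function can be reused for the other sample-mean exercises in this chapter by changing the three arguments.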

3. The weight of a large luggage follows a normal distribution with mean 24 kg and standard
deviation 5 kg. Random samples of 5 pieces of large luggage are selected and the average weight for each
sample is recorded. What are the (a) expectation and (b) standard error of the distribution of the
average weight?

4. The lifetime of a new brand of clock battery is normally distributed with mean 8200 hours and
standard deviation 50 hours. Every 40 batteries of this brand are packed and sold in
supermarkets. What are the (a) expectation and (b) standard error of the average lifetime of a
pack of batteries?

5. In a factory, it is known that 9% of the products are defective. Random samples of 80 items are
selected regularly for inspection. What are the (a) expectation and (b) standard error of the
sample proportion of defective items?
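
For a sample proportion based on n independent observations with population proportion p, the expectation is p and the standard error is √(p(1 - p)/n). A minimal Python sketch with the figures of Exercise 5; the function name `sample_proportion_distribution` is illustrative.

```python
from math import sqrt

def sample_proportion_distribution(p, n):
    """Return (expectation, standard error) of the sample proportion."""
    return p, sqrt(p * (1 - p) / n)

# Exercise 5: 9% defective, samples of 80 items.
expectation, std_error = sample_proportion_distribution(p=0.09, n=80)

print(expectation, round(std_error, 4))
```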

6. In a city, it is known that 20% of the residents are left-handed. What are (a) the expectation and (b)
the standard error of the sample proportion of left-handed residents for many samples with sample
size 75?


7. In a survey conducted by the credit card company, it is shown that 70% of the customers would
not pay the bill by monthly instalment if the credit amount is less than $10,000. Many samples,
each with sample size 40, are selected and the sample proportion of customers not paying the bill
by monthly instalment in each sample is calculated. What are the (a) expectation and (b)
standard error of the sample proportions?

8. BIG Bus Corporation has to conduct surveys regularly to evaluate its service quality. According
to previous studies, 87% of the passengers refuse the invitation to take part in such surveys.
Recently, it is planned to invite 350 passengers to take part in the survey. What are the (a)
expectation and (b) standard error of sample proportions of passengers who refuse the invitation to
take part in the survey, if every time 350 passengers are invited?


Chapter 7 – Estimation

1. In order to estimate the population mean amount spent for textbooks per student during the fall
semester at a large community college, a random sample of 75 students is surveyed. It is
assumed that the spending follows a normal distribution with unknown population mean and the
population standard deviation is $35. The sample mean spending of the 75 surveyed students is
$158.30.

(a) Give a point estimate for the population mean cost per student.
(b) Find the sampling error at 90% confidence level.
(c) Construct the 90% confidence interval estimate for the mean cost per student for all students.
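
When the population standard deviation σ is known, the sampling error at confidence level 1 - α is z × σ/√n, where z is the standard normal value with upper-tail area α/2, and the interval is the sample mean plus or minus that error. The sketch below applies this to the figures of Exercise 1. It assumes Python 3.8 or later; the function name `z_interval` is illustrative. The exact z value is used, so the result can differ slightly from one based on a rounded table value such as 1.645.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence):
    """Confidence interval for a mean with known population standard deviation."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided critical value
    margin = z * sigma / sqrt(n)                        # sampling error
    return x_bar - margin, x_bar + margin, margin

# Exercise 1: n = 75, sigma = 35, sample mean = 158.30, 90% confidence.
lower, upper, margin = z_interval(x_bar=158.30, sigma=35, n=75, confidence=0.90)

print(round(margin, 2), round(lower, 2), round(upper, 2))
```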

2. The quality control manager at a light bulb factory needs to estimate the average lifetime of a light
bulb in a large shipment. The process standard deviation is known to be 100 hours. A random
sample of 80 light bulbs indicated a sample average lifetime of 3500 hours.

(a) Give a point estimate for the population mean lifetime.
(b) Find the sampling error at 95% confidence level.
(c) Set up a 95% confidence interval estimate of the population average lifetime of light bulbs in this
shipment.

3. Suppose that a paint supply store wants to estimate the correct amount of paint contained in
one-gallon cans purchased from a nationally known manufacturer. It is known from the
manufacturer’s specifications that the standard deviation of the amount of paint is equal to 0.02
gallon. A random sample of 50 cans is selected, and the average amount of paint per one-gallon
can is 0.995 gallon. Set up a 99% confidence interval estimate of the population mean amount of
paint.

4. A travel agency frequently arranges seminars on different topics for promotional purposes. The
manager wants to estimate the population mean time for one seminar. A random sample of 40
seminars has a sample mean of 75 minutes. It is believed that the population standard deviation
of the time required for a seminar is 10 minutes. Construct a 95% confidence interval estimate of
the population mean time required for one seminar.


5. A travel agency conducts a survey in order to study customers’ spending on a 2-week holiday
in Europe. It is assumed the spending follows a normal distribution with unknown mean and
population standard deviation of $5500. A random sample of 25 customers is selected for
investigation and the data ($) are as follows:
58000 60000 43000 55000 50000
62000 47000 66000 62000 51000
49000 47000 53000 54000 49000
52000 50000 40000 32000 48000
52000 53000 47000 46000 49000

(a) Calculate the 98% sampling error for the estimation of the population mean spending.
(b) Construct the 98% confidence interval for the estimation of the population mean spending.

6. The monthly working hours of a sample of 13 part-time workers from a company were
83 58 70 56 76 64 80 76 70 97 68 78 108
It is assumed that the monthly working hours are normally distributed.

(a) Find the sample mean and standard deviation.
(b) Construct the 95% confidence interval for the estimate of the mean monthly working hours of all
part-time workers in the company.

7. Ten randomly selected i-cable TV customers were each asked how many hours of
television they watched per week. The results are:

82 66 90 84 75 88 80 94 110 91

Determine the 90% confidence interval estimate for the mean number of hours of television
watched per week by i-cable TV customers. Assume the number of hours is normally
distributed.


8. A random sample of 20 babies is taken from the newborn babies at Northside Hospital
during the year 2015. The sample mean and standard deviation of the weight of a baby are 6.87 lb
and 1.76 lb respectively. Based on past information, it is assumed that the weight of a newborn baby
follows a normal distribution. Construct the 95% confidence interval estimate for the mean
weight of all babies born in this hospital in 2015.

9. A stationery store wants to estimate the mean retail value of greeting cards that it has in its
inventory. A random sample of 40 greeting cards indicates an average value of $16.7 and a
standard deviation of $3.2. Assuming the value of greeting cards follows a normal distribution, set
up a 95% confidence interval estimate of the mean value of all greeting cards in the store’s
inventory.

10. A software company is organizing a competition which invites secondary school students to
produce an animation movie by using its software. The organizer takes a random sample of 12
movies and reviews the duration of each movie. The lengths of these 12 movies (in minutes) are:

71 91 64 83 73 77 82 93 65 84 89 69

It is assumed that the length of a movie is normally distributed with population standard deviation of
8 minutes. Construct a 95% confidence interval estimate for the population mean length of a
movie.


11. In a beverage manufacturing plant, a production line operates to fill the containers with 16 ounces
of cola. A quality control inspector checks a random sample of 25 bottles. The weight of cola
(ounces) in each bottle is recorded as follows:

15.95 16.07 16.11 15.93 16.08
16.02 15.87 16.12 16.02 16.08
15.88 16.03 16.09 15.93 15.99
16.08 16.11 16.01 16.04 16.12
15.97 16.22 15.89 16.12 16.04

Assume the weight of cola in a bottle is normally distributed.

(a) Construct a 98% confidence interval estimate of the population mean weight of cola in a bottle.
(b) Random samples, each with sample size 25, will be selected periodically. Suppose a 98%
confidence interval estimate of the population mean weight of cola in a bottle has been
constructed for each of 300 independent samples. Approximately how many such intervals can be
expected to include the true unknown population mean?

12. Joey is asked to estimate the proportion of people driving “Benz” in a commercial building in
Central. She randomly identified 200 cars in the parking lot, of which she found 17 to be Benz.

(a) Find the point estimate for the population proportion of people driving “Benz” in this building.
(b) Calculate the sampling error at 90% confidence level.
(c) Construct the 90% confidence interval for the proportion of people driving “Benz” in this building.

13. A telephone survey was conducted to estimate the proportion of households with a personal
computer. Of the 380 households surveyed, 290 had a personal computer.

(a) Find the point estimate for the population proportion of households with a personal computer.
(b) Calculate the sampling error at 95% confidence level.
(c) Construct the 95% confidence interval estimate for the population proportion of households with a
personal computer.


14. In a sample of 60 randomly selected residents, the following are the candidates they are going to
vote for:

Tim Tim Mike Mike Mike Julia Mike Mike Tim Tim
Mike Julia Tim Tim Mike Mike Julia Julia Tim Tim
Tim Mike Mike Tim Mike Tim Mike Julia Tim Tim
Mike Julia Tim Mike Tim Julia Tim Tim Tim Mike
Tim Mike Julia Tim Julia Mike Tim Mike Tim Tim
Tim Julia Mike Tim Mike Tim Mike Tim Mike Tim

Construct a 90% confidence interval for the proportion of all residents who support Mike to be the
next president.

15. It is known that the weight of a melon from a farm is normally distributed. It is believed that the
population standard deviation of the weight of a melon is 0.9 kg while the population mean is
unknown. In order to estimate the population mean, a random sample of 120 melons is taken
and the sample mean is 4 kg.

(a) Construct a 95% confidence interval estimate for the population mean weight of melons.

Another random sample is taken in order to estimate the population proportion of melons which
weigh heavier than 4.2 kg. In a sample with 200 randomly selected melons, 30 of them weigh
heavier than 4.2 kg.

(b) Construct a 90% confidence interval estimate for the population proportion of melons which
weigh heavier than 4.2 kg.


16. A survey is carried out to study whether the customers are satisfied with the service provided by
the shop. Out of a total of 300 randomly selected customers, 168 customers are satisfied with the
service. Furthermore, among these 300 customers, the sample mean spending on one visit to the
shop is $820 and the sample standard deviation is $165.

(a) Construct a 98% confidence interval estimate for the population proportion of customers who are
satisfied with the service.

(b) Construct a 98% confidence interval estimate for the population mean spending on one visit to the
shop.

17. A survey is being conducted in a theme park. One of the objectives is to review the design of a game
counter.

(a) The following are the scores obtained by a sample of 10 participants at the game counter:
97 117 140 78 99
148 108 135 126 121

(ai) Assume the score follows a normal distribution. Construct a 95% confidence interval estimate
for the population mean score.
(aii) Random samplings are conducted repeatedly and for each random sample a 95% confidence
interval is constructed for the population mean. After 200 samplings, each with sample size 10,
there are 200 confidence intervals. About how many such confidence intervals would cover the
true population mean?

(b) The survey also shows that 360 out of 500 randomly selected participants enjoy the game.
Construct a 90% confidence interval estimate for the population proportion of participants who enjoy
the game.


18. A research study is being conducted on university students’ credit card usage.

(a) The first objective of the research is to estimate the average monthly spending with credit card.
A random sample of 45 students is taken and the sample mean spending in a month is $4,600 with
a sample standard deviation of $1,400. Assuming the amount of monthly spending with credit
card follows a normal distribution, construct a 95% confidence interval estimate for the
population average monthly spending with credit card.

(b) Another objective of the research is to study whether students would make full payment of the loan.
Among these 45 students, 30 of them always pay the full payment before the deadline. Construct a
90% confidence interval estimate for the population proportion of students who always pay the full
payment before the deadline.


Chapter 8 – Hypothesis Testing

1. The manager at Air Express feels that the weights of packages shipped recently are less than in the
past. Records show that in the past, packages had a mean weight of 36.7 lb and a standard
deviation of 14.2 lb. A random sample of 64 packages taken today yielded a mean weight of 32.1
lb. Is this sufficient evidence to conclude that weights of packages are less than in the past? Test
the hypothesis at 5% significance level. (Assume the standard deviation is unchanged.)

2. The director of manufacturing at a clothing factory needs to determine whether a new machine is
producing the cloth according to the standard of the mean breaking strength of 70 pounds. The
population standard deviation of the breaking strength is known to be 3.5 pounds. A sample of
49 pieces of cloth reveals a sample mean breaking strength of 69.1 pounds. Is there evidence that
the machine is not meeting the specification? (That is, is the population mean breaking strength
different from 70 pounds?) Test the hypothesis at 5% significance level.

3. A salad dressing machine is working properly when 8 ounces of salad dressing are dispensed into a bottle.
The standard deviation of the process is 0.15 ounce. A sample of 50 bottles is selected
periodically and the filling is stopped if there is evidence that the mean amount dispensed is
different from 8 ounces. Suppose that the mean amount dispensed in a sample of 50 bottles is
7.983 ounces. Is there evidence that the population average amount is different from 8 ounces?
Test the hypothesis at 10% level of significance.

4. The policy of a particular bank branch is that its ATMs must be stocked with enough cash to satisfy
customers making withdrawals over an entire weekend. The expected mean amount of money
withdrawn from the North Point branch per customer transaction over the weekend is $1600 with
an expected standard deviation of $300. In order to check if the customers’ average withdrawal
has increased, a sample of 36 transactions is examined. The sample mean withdrawal is
$1680. Is there evidence to believe that the true mean withdrawal is greater than the expectation?
Test the hypothesis at 5% level of significance.


5. The director of admission team of a large university advises parents of incoming students about
the cost of textbooks during a semester. A sample of 100 students enrolled in the university
indicates a sample mean cost of $3154 with a sample standard deviation of $432. Test, at the
0.05 level of significance, whether there is evidence that the population mean cost of textbooks is above
$3000.

6. A manufacturer of flashlight batteries took a sample of 13 batteries from a day’s production and
used them continuously until they failed to work. The lifetime (hours) until failure was:

342 426 317 545 264 451 1,049
631 512 266 492 562 298

At the 0.05 level of significance, is there evidence that the mean life of the batteries is different
from 400 hours? Assume the lifetime of the batteries is normally distributed.

7. In order to test the null hypothesis that “the mean weight of adult males equals 160 lb”, the
weights of 16 males were collected with the following results:

173 178 145 146 157 175 173 137
152 171 163 170 135 159 199 131

Assuming normality, test, at the 0.05 level of significance, whether there is evidence to reject the null
hypothesis in favour of the alternative hypothesis that “the mean weight for adult males is different from
160 lb”.


8. In a city, according to a 1990 demographic report based on the census result, the average daily
spending on food per person is $75 with standard deviation of $15. In 2020, a random sample of
28 persons in the same city is selected and the daily spending ($) on food for each interviewee is
recorded as follows:

55.0 59.5 62.5 65.5 68.5 69.0 70.0 70.0 82.0 82.5
83.5 85.0 86.0 86.5 86.5 87.0 87.0 87.0 89.0 92.5
93.0 94.5 94.5 96.0 98.0 102.5 108.5 110.5

Is there sufficient evidence to conclude that the population mean daily spending on food per
person in 2020 is higher than in 1990? Test at 1% level of significance with the assumption that
the 2020 daily spending on food is normally distributed with the same population standard deviation
as in 1990.

9. It is claimed that the students at a certain university will score an average of 35 on a given test. Is
the claim reasonable if a random sample of test scores from this university yields 33, 42, 38, 37, 30,
42? Complete a hypothesis test using α = 0.1. Assume test results are normally distributed.

10. The Better Sleep Council reports that 61% of the residents get more than seven hours of sleep per
night on the weekend. A random sample of 350 adults found that 235 had more than seven hours of
sleep last weekend. At the 0.05 level of significance, does this evidence show that the
proportion of the residents who get more than seven hours of sleep per night on the weekend is more
than 61%?

11. A politician claims that she will receive more than 60% of the vote in an upcoming election. The
results of a properly designed random sample of 100 voters showed that 65 of those sampled will
vote for her. Test, at the 0.05 level of significance, whether her support rate is more than 60%.

12. A county judge has agreed that he will give up his county judgeship and run for the state
judgeship if there is evidence that more than 25% of his fellow party members oppose him. A random
sample of 800 party members indicated that 217 of them opposed him. Does this sample suggest
that he should give up his county judgeship and run for the state judgeship? Carry out this
hypothesis test by using α = 0.10.


13. With respect to a statement claimed by a magazine, “More men work at home than women”, a
women’s rights organization has conducted a survey to test it. The study of 899 home-based
businesses reported that 369 were owned by women. Does the finding provide sufficient evidence
to support that the proportion of home-based businesses owned by women is less than 0.5? Test the
hypothesis at 5% significance level.

14. A bank offers three types of welcoming gifts to VISA card applicants.
Gift A: $400 supermarket coupon
Gift B: a 2-person tea set voucher
Gift C: 8 sets of movie tickets

The choices made by a random sample of 40 applicants are as follows:

B A B A A A C A
A B A A B A A B
B B A A B A B A
A B A C A C A C
A A A A A A B A

Based on the table above, test, at the 0.01 level of significance, whether there is sufficient evidence to
conclude that more than half of the applicants choose Gift A.

15. A test was conducted to compare the wearing quality of the tires produced by two tire companies.
All the factors are controlled on both brands of tires, car by car. The following is the summary of
the amount of wear (in thousandths of an inch) of six test cars:

Car 1 2 3 4 5 6
Brand A 125 64 94 38 90 106
Brand B 133 65 103 37 102 115

Test, at the 0.05 level of significance, whether there is evidence that the amount of wear of the two brands
of tire is different. Assume the amount of wear is normally distributed.
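
Because both brands are tested on the same car, the data in Exercise 15 are paired: the test works on the car-by-car differences d = A - B and uses t = d̄ / (s_d/√n) with n - 1 degrees of freedom. The sketch below computes only the test statistic, using the Python standard library. The critical value still has to be read from a t table, since the standard library has no t distribution. The variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

brand_a = [125, 64, 94, 38, 90, 106]
brand_b = [133, 65, 103, 37, 102, 115]

# Work on the paired differences, one per car.
diffs = [a - b for a, b in zip(brand_a, brand_b)]

n = len(diffs)
d_bar = mean(diffs)               # mean difference
s_d = stdev(diffs)                # sample standard deviation (divisor n - 1)
t_stat = d_bar / (s_d / sqrt(n))  # paired t statistic
df = n - 1                        # degrees of freedom for the t table lookup

print(diffs, round(d_bar, 4), round(s_d, 4), round(t_stat, 3), df)
```

For a two-sided test at the 0.05 level, |t| would be compared with the table value for an upper-tail area of 0.025 and 5 degrees of freedom.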


16. The following is the data obtained from an experiment designed to estimate the reduction in
diastolic blood pressure as a result of following a salt-free diet for two weeks. Assume diastolic
readings to be normally distributed.

Before   93  106   87   92  102   95   88  110
After    92  102   89   92  101   96   88  105

Is there a significant reduction in diastolic blood pressure after following a salt-free diet? (Use α =
0.05).

17. The table below shows the weights (in pounds) of 8 adult participants, measured before and after
joining the weight control program.

Before  122  130  114  139  150  147  155  153
After   123  125  116  132  141  138  158  152

Test, at the 5% significance level, whether there is any evidence showing that the weight control
program can effectively reduce weight.

18. A series of promotion programs was conducted in January and February in a supermarket chain.
The average spending of a customer on 1 January and on 1 March was recorded for a sample of
9 supermarkets, and the results are as follows:

Supermarket                                    1    2    3    4    5    6    7    8    9
Average spending of a customer on 1 January    258  304  188  194  225  179  251  294  174
Average spending of a customer on 1 March      262  312  196  187  250  182  247  274  190

Test, at the 1% significance level, whether there is an increase in the average spending of a
customer from 1 January to 1 March.


19. Before buying a new milling machine, the purchasing director would like to check if there is
evidence that the parts produced by the new machine have a significantly higher average breaking
strength than those from the old machine. The process standard deviation of the breaking
strength for the old machine is 10 kilograms and for the new machine is 9 kilograms. A sample
of 100 parts taken from the old machine indicates a sample mean of 65 kilograms, while a sample
of 100 parts taken from the new machine indicates a sample mean of 72 kilograms. Is there
evidence to support the director's purchase of the new machine? Write a test report using the
critical value approach, based on the following Excel output generated at the 0.01 level of
significance.

z-Test: Two Sample for Means

Old New
Mean 65 72
Known Variance 100 81
Observations 100 100
Hypothesized Mean Difference 0
z -5.2031
P(Z<=z) one-tail 0.0000
z Critical one-tail 2.3263
P(Z<=z) two-tail 0.0000
z Critical two-tail 2.5758
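
The z statistic in the output above can be reproduced from the summary figures with a short sketch such as the following. It assumes the difference is taken as old minus new (which matches the negative z reported by Excel), so the rejection region for "new is stronger" lies in the lower tail; the function name `two_sample_z` is illustrative.

```python
import math


def two_sample_z(mean1, var1, n1, mean2, var2, n2):
    """z statistic for H0: mu1 - mu2 = 0 with known population variances."""
    standard_error = math.sqrt(var1 / n1 + var2 / n2)
    return (mean1 - mean2) / standard_error


# Old machine: mean 65, variance 10**2; new machine: mean 72, variance 9**2.
z = two_sample_z(65, 10 ** 2, 100, 72, 9 ** 2, 100)

# H1: mu_old < mu_new, so reject H0 when z falls below -2.3263 (alpha = 0.01).
z_critical = -2.3263
reject_h0 = z < z_critical
print(f"z = {z:.4f}, reject H0: {reject_h0}")
```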


20. An experiment is designed to compare the average surface hardness of two modified materials,
A and B. Based on past experience, it is believed that the standard deviations in surface
hardness are 10.2 and 6.4 for materials A and B respectively. In the experiment, 60 items are
selected: 30 items of material A and 30 items of material B. If the sample mean hardness values
of materials A and B are 163.4 and 156.9 respectively, use a 0.05 level of significance to
determine whether there is evidence of a difference between the hardness of the two materials.
Write a test report using the p-value approach, based on the following Excel output.

z-Test: Two Sample for Means

Material A Material B
Mean 163.4 156.9
Known Variance 104.04 40.96
Observations 30 30
Hypothesized Mean Difference 0
z 2.9566
P(Z<=z) one-tail 0.0016
z Critical one-tail 1.6449
P(Z<=z) two-tail 0.0031
z Critical two-tail 1.9600
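
For the p-value approach, the two-tailed p-value can be recovered from the z statistic using the standard normal distribution. The sketch below builds the normal CDF from `math.erfc`, which is a standard identity; the helper name `normal_cdf` is illustrative.

```python
import math


def normal_cdf(z):
    """Standard normal CDF expressed through the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


# Summary figures from the exercise (variances are 10.2**2 and 6.4**2).
z = (163.4 - 156.9) / math.sqrt(104.04 / 30 + 40.96 / 30)
p_two_tail = 2 * (1 - normal_cdf(abs(z)))

alpha = 0.05
reject_h0 = p_two_tail < alpha
print(f"z = {z:.4f}, two-tailed p-value = {p_two_tail:.4f}, reject H0: {reject_h0}")
```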


21. Two samples of burgers are selected from two branches of Macy Restaurants in order to test
whether the average fat contents (in grams) of burgers from the two branches are the same. The
results are as follows:
Branch A 33.7 21.6 32.1 38.2 33.2 35.9 34.1 39.8
23.5 21.2 23.3 18.9 30.3
Branch B 28.0 29.9 22.3 23.3 33.6 24.1 16.9 14.4
30.2 23.1 13.9 19.7 16.6 13.8 42.1 28.1

An Excel report is generated for the test at the 0.05 level of significance:

t-Test: Two-Sample Assuming Equal Variances

Branch A Branch B
Mean 29.6769 23.75
Variance 50.0269 63.3933
Observations 13 16
Pooled Variance 57.4527
Hypothesized Mean Difference 0
df 27
t Stat 2.0941
P(T<=t) one-tail 0.0229
t Critical one-tail 1.7033
P(T<=t) two-tail 0.0458
t Critical two-tail 2.0518

(a) Find the sample mean fat contents of burgers in Branch A and Branch B respectively.
(b) State the null hypothesis and alternative hypothesis of the test.
(c) Report the p-value of the test.
(d) Is there evidence that the average fat contents of burgers from the two branches are different
at the 5% significance level? Explain your answer by using the p-value in part (c).
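
The pooled variance and t statistic in the Excel report can be cross-checked from the raw data. The following is a minimal sketch assuming equal population variances, as the report title states; the variable names are illustrative.

```python
import math
import statistics

branch_a = [33.7, 21.6, 32.1, 38.2, 33.2, 35.9, 34.1, 39.8,
            23.5, 21.2, 23.3, 18.9, 30.3]
branch_b = [28.0, 29.9, 22.3, 23.3, 33.6, 24.1, 16.9, 14.4,
            30.2, 23.1, 13.9, 19.7, 16.6, 13.8, 42.1, 28.1]

n1, n2 = len(branch_a), len(branch_b)
mean1, mean2 = statistics.mean(branch_a), statistics.mean(branch_b)
var1, var2 = statistics.variance(branch_a), statistics.variance(branch_b)

# Pooled variance weights each sample variance by its degrees of freedom.
pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
t_stat = (mean1 - mean2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))

print(f"means: {mean1:.4f}, {mean2:.4f}")
print(f"pooled variance = {pooled_var:.4f}, t = {t_stat:.4f}, df = {n1 + n2 - 2}")
```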


22. Twenty laboratory mice were randomly divided into two groups of 10. Each group was fed
according to a prescribed diet. At the end of three weeks, the weight gained (in grams) by each
animal was recorded. Do the data in the following table justify the conclusion that diet B has
a stronger effect on increasing the mice's weight than diet A?

Sample A 5 14 7 9 11
7 13 14 12 8
Sample B 10 21 16 23 24
16 13 19 19 21

An Excel report is generated for the test at the 0.01 level of significance.

t-Test: Two-Sample Assuming Equal Variances

Diet A Diet B
Mean 10 18.2
Variance 10.4444 19.7333
Observations 10 10
Pooled Variance 15.0889
Hypothesized Mean Difference 0
df 18
t Stat -4.7203
P(T<=t) one-tail 0.0001
t Critical one-tail 2.5524
P(T<=t) two-tail 0.0002
t Critical two-tail 2.8784

(a) Find the sample mean weight gained by the mice on diet A and on diet B respectively.
(b) Write a test report using the critical value approach to conclude whether there is evidence
that diet B has a stronger effect on increasing the mice's weight than diet A.


23. The personnel department of a company decided to investigate whether the age of an employee
had any effect on learning new computing skills. Among those employees who had not
attended the company computing program, 8 employees aged below 40 and 10 employees aged
over 40 were selected randomly and then were given the same computing training course. At
the end of the course, the 18 employees were given a test. The results of the test were given as
follows:

Scores obtained by employees aged below 40: 38 47 88 66 44 68 41 55
Scores obtained by employees aged over 40: 70 39 56 72 34 58 60 46 49 50

Below is the Excel output generated at the 5% level of significance:

t-Test: Two-Sample Assuming Equal Variances

Employees aged below 40   Employees aged over 40
Mean 55.875 53.4
Variance 291.8393 151.3778
Observations 8 10
Pooled Variance 212.8297
Hypothesized Mean Difference 0
df 16
t Stat 0.3577
P(T<=t) one-tail 0.3626
t Critical one-tail 1.7459
P(T<=t) two-tail 0.7253
t Critical two-tail 2.1199

(a) Find the sample mean scores obtained by employees aged below 40 and employees aged above 40
respectively.

(b) Write a test report using the p-value approach to conclude whether the age of an employee had
any effect on learning new computing skills at the 5% significance level.


24. A salesman claims that the percentage of defective mobile phones he sells is no higher than
that of a similar model from his competitor in the promotion program. To test this claim, the
retailer took random samples of each manufacturer's product.

Product        Number of Defective   Number Checked
Salesman's     15                    150
Competitor's   6                     150

Below is the Excel output generated at the 5% level of significance:

z-Test: Two Sample for Proportions

Salesman Competitor
Proportion 0.1 0.04
Variance 0.0651 0.0651
Observations 150 150
Hypothesized Proportion Difference 0
z 2.0365
P(Z<=z) one-tail 0.0208
z Critical one-tail 1.6449
P(Z<=z) two-tail 0.0417
z Critical two-tail 1.9600

(a) Find the sample proportion of defective items among the salesman's mobile phones, the sample
proportion of defective items among the competitor's mobile phones, and the pooled sample
proportion of defective items.
(b) Write a test report using the critical value approach to conclude whether there is any evidence
that the defective rate of the salesman's mobile phones is actually higher than his competitor's.
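
The proportions, pooled proportion, and z statistic shown in the output can be reproduced with a sketch like the one below. It assumes the usual pooled standard error under H0 (equal proportions) and an upper-tail test, since the alternative is that the salesman's defective rate is higher; variable names are illustrative.

```python
import math

# Counts from the table: defectives and sample sizes.
x1, n1 = 15, 150  # salesman's product
x2, n2 = 6, 150   # competitor's product

p1 = x1 / n1
p2 = x2 / n2
pooled = (x1 + x2) / (n1 + n2)  # pooled proportion under H0: p1 = p2

standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
z = (p1 - p2) / standard_error

z_critical = 1.6449  # upper-tail value for alpha = 0.05, as in the Excel output
reject_h0 = z > z_critical
print(f"p1 = {p1:.2f}, p2 = {p2:.2f}, pooled = {pooled:.2f}, z = {z:.4f}")
```

The "Variance" row in the Excel output corresponds to `pooled * (1 - pooled)`, which is 0.0651 here.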


25. A survey invited 200 men and 200 women to taste a brand of cola. Twenty-nine percent of
the men and 24% of the women responded positively to the cola. Based on this survey, can we
conclude that there is a significant difference in the proportions of men and women who respond
positively to the cola at the 0.02 level of significance? Write a test report using the p-value
approach, based on the Excel output below:

z-Test: Two Sample for Proportions

Men Women
Proportion 0.29 0.24
Variance 0.1948 0.1948
Observations 200 200
Hypothesized Proportion Difference 0
z 1.1329
P(Z<=z) one-tail 0.1286
z Critical one-tail 2.0537
P(Z<=z) two-tail 0.2572
z Critical two-tail 2.3263


26. The table below is a summary of a hotel guest satisfaction study.

Hotel
Visit again? Westwind Goodview
Yes 163 154
No 64 108

A test is conducted to check whether there is any evidence that a greater proportion of Westwind
guests than Goodview guests are likely to revisit. The Excel output is as follows:

z-Test: Two Sample for Proportions

Westwind Goodview
Proportion 0.7181 0.5878
Variance 0.2280 0.2280
Observations 227 262
Hypothesized Proportion Difference 0
z 3.0088
P(Z<=z) one-tail 0.0013
z Critical one-tail 1.6449
P(Z<=z) two-tail 0.0026
z Critical two-tail 1.9600

(a) Find the sample proportions of guests who would be likely to revisit Westwind Hotel and
Goodview Hotel respectively.
(b) State the null hypothesis and alternative hypothesis of the test.
(c) Report the p-value of the test.
(d) Is there evidence that a greater proportion of Westwind guests than Goodview guests are likely
to revisit at the 5% significance level? Explain your answer by using the p-value in part (c).


Chapter 9 – Analysis of Variance

1. Suppose that we want to compare the price of pork sold in wet markets in different districts.
Random samples of size 4 are taken from Mong Kok, Wan Chai and Tai Po, and the prices ($) of
400 grams of pork are as follows:

Mong Kok 71, 75, 65, 69
Wan Chai 90, 80, 86, 84
Tai Po 72, 77, 76, 79

(a) Report the sample mean price of 400 grams of pork for each district and the combined
sample mean.
(b) Write a test report using the critical value approach, at the 5% significance level, on whether
the mean prices of pork in the three districts are not all the same, referring to the following
Excel output:

SUMMARY
Groups Count Sum Average Variance
Mong Kok 4 280 70 17.3333
Wan Chai 4 340 85 17.3333
Tai Po 4 304 76 8.6667

ANOVA
Source of Variation SS df MS F P-value F crit
Between Groups 456 2 228 15.7846 0.0011 4.2565
Within Groups 130 9 14.4444

Total 586 11
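
The sums of squares and F statistic in the ANOVA table can be recomputed directly from the raw prices. The following is a minimal one-way ANOVA sketch; the dictionary layout and variable names are illustrative choices, not part of the exercise.

```python
# Prices ($) of 400 grams of pork, from the exercise.
groups = {
    "Mong Kok": [71, 75, 65, 69],
    "Wan Chai": [90, 80, 86, 84],
    "Tai Po": [72, 77, 76, 79],
}

all_values = [x for values in groups.values() for x in values]
grand_mean = sum(all_values) / len(all_values)
group_means = {name: sum(v) / len(v) for name, v in groups.items()}

# Between-groups: spread of group means around the grand mean.
ss_between = sum(len(v) * (group_means[name] - grand_mean) ** 2
                 for name, v in groups.items())
# Within-groups: spread of observations around their own group mean.
ss_within = sum((x - group_means[name]) ** 2
                for name, v in groups.items() for x in v)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)
f_stat = (ss_between / df_between) / (ss_within / df_within)

print(f"grand mean = {grand_mean}, SSB = {ss_between}, SSW = {ss_within}")
print(f"F = {f_stat:.4f} on ({df_between}, {df_within}) degrees of freedom")
```

Comparing `f_stat` with the tabulated "F crit" value gives the critical value decision; the same code works for unequal group sizes, as in the later exercises.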


2. The table below shows the scores for crunchiness of four competing potato crisps. Each type of
crisp was assessed by several testers.

Crisp 1: 13.4 12.2 12.4 12.8 12.2
Crisp 2: 9.3 10.8 8.4 9.7 9.5 7.9 9.5
Crisp 3: 12.5 14.7 12.9 11.8
Crisp 4: 14.0 15.6 14.1

(a) Write a test report using the p-value approach to conclude, at the 5% level of significance,
whether the differences in crunchiness between the four crisps are significant, based on the
following Excel output:

SUMMARY
Groups Count Sum Average Variance
Crisp 1: 5 63 12.6 0.26
Crisp 2: 7 65.1 9.3 0.8767
Crisp 3: 4 51.9 12.975 1.5292
Crisp 4: 3 43.7 14.5667 0.8033

ANOVA
Source of Variation SS df MS F P-value F crit
Between Groups 75.4227 3 25.1409 30.1832 0.0000 3.2874
Within Groups 12.4942 15 0.8329

Total 87.9168 18

(b) Report the sample mean crunchiness of the four crisps and the combined sample mean.


3. A joint project of the Social Welfare Department and the Hospital Authority is investigating
the cognitive development of children of different ages. Fifteen children studying in K1, K2, and
K3 joined the project, and the time each child took to finish a 50-piece puzzle was recorded.
Below are the finishing times (in minutes):

K1: 21.1, 17.8, 18.6, 20.8, 17.9, 19.0
K2: 18.0, 16.4, 15.7, 19.6, 16.5, 18.2
K3: 16.5, 17.8, 16.1

A test is conducted to verify whether the cognitive development of K1, K2, and K3 children differs
by testing whether the mean finishing times of the three groups are not all the same. Below is the
Excel output of an ANOVA test conducted at the 0.01 level of significance:

SUMMARY
Groups Count Sum Average Variance
K1: 6 115.2 19.2 2.044
K2: 6 104.4 17.4 2.108
K3: 3 50.4 16.8 0.79

ANOVA
Source of Variation SS df MS F P-value F crit
Between Groups 15.12 2 7.56 4.0609 0.0450 6.9266
Within Groups 22.34 12 1.8617

Total 37.46 14

(a) State the null hypothesis and the alternative hypothesis of the ANOVA test.
(b) Report the p-value of the test.
(c) Should the cognitive development of K1, K2, and K3 children be concluded to be not all the
same? Give a reason for your answer.


4. To study the effectiveness of five different kinds of packaging, a processor of breakfast foods
obtained the following data on the numbers of sales on five different days:

Packaging I: 60, 52, 56, 52, 65
Packaging II: 54, 64, 66, 54, 57
Packaging III: 55, 66, 68, 57, 55
Packaging IV: 55, 56, 70, 58, 56
Packaging V: 71, 65, 60, 59, 62

Below is the output of the ANOVA test, conducted in Excel at the 5% significance level:

SUMMARY
Groups Count Sum Average Variance
Packaging I 5 285 57 31
Packaging II 5 295 59 32
Packaging III 5 301 60.2 39.7
Packaging IV 5 295 59 39
Packaging V 5 317 63.4 23.3

ANOVA
Source of Variation SS df MS F P-value F crit
Between Groups 111.04 4 27.76 0.841212 0.515328 2.866081
Within Groups 660 20 33

Total 771.04 24

(a) State the null hypothesis and the alternative hypothesis of the ANOVA test.
(b) Report the critical value of the test.
(c) Report the F statistic of the test.
(d) Should the effectiveness of the five different kinds of packaging be considered all the same?
Give a reason for your answer.


Chapter 10 – Chi Square Test

(Remark: expected frequencies should be calculated correct to at least 2 d.p.)

1. According to genetic theory, the colour strains pink, white, and blue of a certain flower
should appear in the ratio 3:2:5. For 100 plants randomly selected from the garden, the results
were as follows:

Colour Pink White Blue Total
Number of plants 24 14 62 100

Test, at the 1% significance level, whether the differences between the observed and expected
frequencies are significant.
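
A goodness-of-fit calculation of this kind can be sketched as follows. Expected frequencies come from the 3:2:5 ratio, and the critical value 9.210 (alpha = 0.01, df = 2) is an assumed lookup from a chi-square table; variable names are illustrative.

```python
observed = [24, 14, 62]  # pink, white, blue
ratio = [3, 2, 5]        # theoretical ratio from the exercise

total = sum(observed)
expected = [total * r / sum(ratio) for r in ratio]

# Chi-square statistic: sum of (O - E)^2 / E over the categories.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

df = len(observed) - 1
chi_critical = 9.210  # assumed table value for alpha = 0.01, df = 2
reject_h0 = chi_sq > chi_critical
print(f"expected = {expected}, chi-square = {chi_sq:.2f}, df = {df}, "
      f"reject H0: {reject_h0}")
```

The later goodness-of-fit exercises in this chapter follow the same pattern with different observed counts and hypothesized proportions.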

2. An Italian gelato shop opens its first shop in Hong Kong and offers three flavors of ice-cream,
namely vanilla, mango and strawberry. According to the previous study conducted in Italy, 20%
of the customers choose vanilla, 40% of the customers choose mango, and 40% of the customers
choose strawberry. Below is the record of the choices of these three flavors of ice-cream by
customers in the Hong Kong shop on the opening day.

Flavor
Vanilla Mango Strawberry
250 500 450

Test, at the 0.05 level of significance, if there is a significant difference in the preference on the 3
ice-cream flavors between the customers in Hong Kong and those in Italy.

3. 400 students were randomly selected and asked whom they would vote for as chairman of the
incoming student union. The results are given below:

Candidate Mary John Peter May
Votes 131 121 99 49

Can you argue that the four candidates command different levels of support? Test the hypothesis
at 5% level of significance.


4. The following sample data represents the quality of the shipments received by a large firm from
three different vendors:

           Number of rejected   Number of imperfect but   Number of perfect   Total
           shipments            acceptable shipments      shipments
Vendor A   15                   25                        90                  130
Vendor B   7                    18                        65                  90
Vendor C   22                   33                        125                 180
Total      44                   76                        280                 400

Vendor B charges much more for shipments than the other two vendors. The senior management
wants a detailed report on the quality of the shipments received from Vendor B. Referring to
the number of shipments received from Vendor B in the above table, test, at the 5% level of
significance, whether the ratio of "rejected shipments : imperfect but acceptable shipments :
perfect shipments" equals "1 : 1 : 8".


Chapter 8 to 10 – What test should be conducted?

Suggest the most suitable test (z-test, t-test, ANOVA, χ² test) for each of the following cases:

(a) Test whether the average spending on lunch differs between male and female students (all data
are collected from a survey).

(b) Test whether emotional problems differ between primary and secondary school students by
considering the proportion of students suffering from insomnia.

(c) Test whether the average daily income of Judy Restaurant on Monday, Tuesday, Wednesday, and
Thursday is the same for all four days.

(d) Last year, the population mean working time of a nurse was 8.5 hours a day with a standard
deviation of 2.1 hours. Test whether the mean working time has increased this year, assuming
that the standard deviation is unchanged.

(e) A report claims that 15%, 65%, and 20% of travellers arrive at Hong Kong International Airport
by taxi, bus, and Airport Express respectively. Test whether the report is correct.

(f) A researcher wants to test if there is any difference between the mean processing times when
customers pay by VISA, pay by Octopus card, and pay by cash.

(g) A researcher wants to test if more than 30% of the car accidents are due to drunk-driving.

(h) An education researcher wants to compare the teaching effectiveness of four different teaching
methods. Final year students are randomly assigned to one of the four groups. The marks
obtained by the students in the final examination would be used for the test.


Chapter 11 – Linear Regression and Correlation

(Remark: Correct to at least 4 d.p. in your calculation)

1. The following data have been collected regarding sales and advertising expenditure of six
products:

Sales (dollars in millions), x   Advertising Expenditure (dollars in thousands), y
8.5 210
9.2 250
7.9 290
8.6 330
9.4 370
10.1 410

(a) Fit the regression equation, y = a + bx, for the above data.
(b) Interpret the values of a and b in the regression equation in part (a).
(c) Calculate the coefficient of correlation and comment on it.
(d) Estimate the advertising expenditure, and comment on the reliability of the estimate, when
sales are:
i. 6 million
ii. 9 million


2. The following data were obtained in a study conducted in a secondary school. The number of days
being late to school (X) and the examination scores in General Education (Y) of seven randomly
selected students were as follows:

Number of days being late to school, X       6   2   15  9   12  5   8
Examination scores in General Education, Y   82  86  43  74  58  90  78

(a) Present the relationship between the number of days being late to school and the examination
scores in General Education by calculating the coefficient of correlation of the above data and
comment on it.
(b) Present the relationship between the number of days being late to school and the examination
scores in General Education by estimating the values of a and b in the linear regression line,
y = a + bx.
(c) Explain the values of a and b of the regression line in part (b).
(d) Estimate the examination scores in General Education for a student who has been late to school for
11 days and comment on the reliability of the estimation.

3. The following table shows the amount of water, in centimetres, applied to six similar plots on an
experimental farm. It also shows the yield of hay in tons per acre.

Amount of water (x) Yield of hay (y)
30 4.85
45 5.20
60 5.76
75 6.60
105 7.35
120 7.77

(a) Fit the regression equation, y = a + bx, for the above data.
(b) Interpret the values of a and b in the regression equation.
(c) Calculate the coefficient of correlation and comment on it.
(d) Determine the expected yield, and comment on the reliability of the estimate, when the amount
of water (in centimetres) is:
i. 90
ii. 150
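
The least-squares line and correlation coefficient for this data set can be computed with a short sketch such as the one below, using the usual formulas b = Sxy / Sxx, a = ȳ - b·x̄, and r = Sxy / sqrt(Sxx·Syy). The function name `fit_line` and the range check used to flag extrapolation are illustrative assumptions, not part of the exercise.

```python
import math


def fit_line(x, y):
    """Return (a, b, r) for the least-squares line y = a + b*x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    r = s_xy / math.sqrt(s_xx * s_yy)
    return a, b, r


water = [30, 45, 60, 75, 105, 120]              # amount of water (cm)
hay = [4.85, 5.20, 5.76, 6.60, 7.35, 7.77]      # yield of hay (tons per acre)

a, b, r = fit_line(water, hay)
print(f"a = {a:.4f}, b = {b:.4f}, r = {r:.4f}")

for x_new in (90, 150):
    estimate = a + b * x_new
    # Estimates outside the observed x range are extrapolations and less reliable.
    extrapolated = not (min(water) <= x_new <= max(water))
    print(f"x = {x_new}: estimated yield = {estimate:.4f}, "
          f"extrapolation: {extrapolated}")
```

The same `fit_line` helper applies to the other regression exercises in this chapter by swapping in their x and y columns.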


4. James is a PhD student. He is studying the characteristics of a list of companies that go public for
the first time. He is particularly interested in the relationship between the size of the offering and
the price per share. A sample of 10 companies that recently went public revealed the following
information.

Company   Size (in millions), x   Price per share (in dollars), y
1 9.0 10.8
2 94.4 11.3
3 27.3 11.2
4 179.2 11.1
5 71.9 11.1
6 97.9 11.2
7 93.5 11.0
8 70.0 10.7
9 160.7 11.3
10 96.5 10.6

(a) Calculate the correlation coefficient of the above data and comment on it.
(b) Fit the regression equation, y = a + bx for the above data.
(c) Interpret the values of a and b in the regression equation.
(d) Estimate the price per share of the company if the size of the offering is 10 million and
comment on its reliability.


5. A salesperson has recently been reviewing customers' profiles and wants to study whether there
is any relationship between a customer's monthly income and the price of the car the customer
purchases. Below is the information for a random sample of 7 customers:

x: monthly income ($) y: price of the car ($)
68000 250000
59000 230000
75000 300000
57000 280000
66000 320000
80000 330000
48000 220000

(a) Present the relationship between the monthly income and the price of the car by calculating the
coefficient of correlation of the above data and comment on it.
(b) Present the relationship between the monthly income and the price of the car by estimating the
values of a and b in the linear regression line, y = a + bx.
(c) Explain the values of a and b of the regression line in part (b).
(d) Estimate the price of the car that a customer whose monthly income is $90000 would purchase,
and comment on the reliability of the estimation.

6. Two judges, Peter and Tom, rank the eight photographs in a competition as follows:
Photograph A B C D E F G H
Rank by Peter 2 5 3 6 1 4 7 8
Rank by Tom 4 3 2 6 1 5 8 7

(a) Calculate the coefficient of rank correlation for the data.


(b) Do the two judges have similar or different judging criteria? Explain your answer by using the
result obtained in part (a).
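
With no tied ranks, the coefficient of rank correlation can be obtained from Spearman's formula r_s = 1 - 6Σd² / (n(n² - 1)). The sketch below applies it to the two judges' rankings; variable names are illustrative.

```python
# Ranks given to photographs A to H by each judge.
peter = [2, 5, 3, 6, 1, 4, 7, 8]
tom = [4, 3, 2, 6, 1, 5, 8, 7]

n = len(peter)
# Sum of squared rank differences for each photograph.
sum_d_squared = sum((p - t) ** 2 for p, t in zip(peter, tom))

# Spearman's formula, valid here because neither judge gives tied ranks.
r_s = 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))
print(f"sum of d^2 = {sum_d_squared}, r_s = {r_s:.4f}")
```

A value of r_s close to 1 indicates that the two rankings largely agree, while a value near 0 or below indicates little agreement or opposing rankings.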
