AS Exercise 23-24

Uploaded by

Lok Hang Li
AS 2023-24

Applied Statistics – Exercise

Chapter 1 – Sampling Methods

1. A multinational company has offices in five Asian countries. The company's human resources
executive decides to study employees' opinions about the existing insurance policy.
Since there is insufficient time to conduct a census of all 14000 employees located in
different countries, it is decided to conduct a survey of 140 employees in total.
According to the records, the numbers of employees in Singapore, Korea, China, Malaysia, and
Thailand are 2000, 3000, 5000, 1500 and 2500 respectively.

(a) Describe the objective of this study.
(b) Describe the population in this study.
(c) How large is the population size?
(d) Which sampling method should be used if the sample is required to be a random representation
without bias towards any country?
(e) How many representatives should be taken from each country?
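
A minimal sketch of proportional allocation, assuming the intended design is stratified sampling with each country as a stratum. The function name `proportional_allocation` is hypothetical; the stratum sizes and the sample size of 140 come from the question.

```python
# Proportional allocation for a stratified sample (illustrative sketch).
# Assumption: each country is treated as one stratum.

def proportional_allocation(strata, sample_size):
    """Allocate sample_size across strata in proportion to stratum size."""
    total = sum(strata.values())
    # Each stratum gets (stratum size / population size) * sample size.
    return {name: size * sample_size / total for name, size in strata.items()}

employees = {
    "Singapore": 2000,
    "Korea": 3000,
    "China": 5000,
    "Malaysia": 1500,
    "Thailand": 2500,
}

allocation = proportional_allocation(employees, 140)
for country, n in allocation.items():
    print(f"{country}: {n:g}")
```

The same function can be reused for any question that gives stratum sizes and a total sample size. When the allocation is not a whole number, a rounding rule has to be chosen separately.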

2. The Hong Kong Federation of Restaurants & Related Trades is conducting a survey about the
working environment of restaurants in Hong Kong. Among all registered restaurants in Hong
Kong, 10 restaurants will be randomly selected for the survey. All full-time employed staff in the
selected restaurants will be invited to fill in the questionnaire.

(a) Describe the population of the study.
(b) What is the name of this sampling method?


3. A supermarket chain with 1500 supermarkets wishes to undertake a survey to understand the
safety facilities in its stores. Questionnaires will be sent to a sample of 60 supermarkets and the
store manager will be responsible for filling in the questionnaire. One of the company records
indicates the locations of the 1500 supermarkets as follows:

Central  South  North  East  West
550      250    200    350   150

(a) Describe the population of the study.
(b) Suppose the sample is selected by the stratified sampling method with the five districts as five strata.
How many supermarkets should be selected from each district?
(c) Suppose the sample is selected by the systematic sampling method instead of the stratified sampling
method. What is the advantage of using the systematic sampling method?
(d) When using the systematic sampling method, a unique identity number should be assigned to each
supermarket. Suppose the supermarket with identity number 0037 is included in the sample.
Write down the identity numbers of the next three selected supermarkets.
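
A minimal sketch of the systematic sampling step in part (d). It assumes the supermarkets are numbered 0001 to 1500 in list order and that every k-th unit is taken with k = N / n. The helper name `next_selected` is hypothetical.

```python
# Systematic sampling sketch: fixed interval k = N / n, then every k-th unit.
# Assumption: identity numbers run from 0001 to 1500 in list order.
population_size = 1500   # supermarkets in the chain (from the question)
sample_size = 60         # supermarkets to be surveyed (from the question)

interval = population_size // sample_size   # sampling interval k

def next_selected(current_id, count, k, n_units):
    """Return up to `count` identity numbers after current_id, spaced k apart."""
    ids = []
    for step in range(1, count + 1):
        candidate = current_id + step * k
        if candidate > n_units:      # stop at the end of the list
            break
        ids.append(candidate)
    return ids

following = next_selected(37, 3, interval, population_size)
print(interval)
print([f"{i:04d}" for i in following])
```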

4. Classify each of the following samples as simple random, systematic, stratified, or cluster:
(a) Every 8th student entering a school is asked to name his / her favourite teacher.
(b) Every nurse working in four randomly selected hospitals is asked about the total working hours
on 1 September 2019.
(c) Immigrants are selected based on random numbers to take part in a survey about education level.
(d) In a private estate with 20 buildings, the property management team wants to conduct a survey to
investigate residents' satisfaction with the service. Two buildings will be selected randomly
and all apartments in these two buildings will receive a questionnaire.
(e) Every 1 out of 75 visitors leaving the exhibition center is invited to join a survey about their
opinion on the exhibition.
(f) 5, 8, and 3 students are randomly selected from the Korean, Japanese, and French clubs respectively,
as fair representatives of a University language center to take part in the Foreign Language
Competency Test.


5. A questionnaire is designed for a telecommunication company to study household long-distance
call usage. Determine whether each of the following random variables is quantitative or
qualitative. If quantitative, determine whether the variable is discrete or continuous.

(a) Number of telephones (fixed line) in the household
(b) Number of long-distance calls made in August 2019
(c) Length (in minutes) of the longest long-distance call made in August 2019
(d) Monthly charges (in dollars and cents) for long-distance calls made in August 2019
(e) Are you satisfied with the long-distance call service?
- very satisfied / satisfied / not satisfied / not at all satisfied

6. Below are a few questions extracted from the health survey questionnaire prepared by the
Department of Health. Determine whether each of the following random variables is quantitative
or qualitative. If quantitative, determine whether the variable is discrete or continuous.

(a) Gender (M / F)
(b) Weight (kg)
(c) Height (cm)
(d) Total number of residents in your household
(e) Do you smoke? (Yes / No)


Chapter 2 – Statistical Measures and Data Presentation

1. Below are the numbers of minutes (rounded to the nearest minute) that a sample of 20 employees in a
company spent out for lunch on 2 July 2021:

39, 41, 45, 52, 53, 55, 58, 59, 60, 61, 62, 62, 63, 65, 65, 65, 69, 70, 72, 77

(a) Find the mean, mode, median, the first quartile, third quartile, 17th percentile, 87th percentile of
the data.
(b) Find the standard deviation and variance of the data.
(c) Comment on the skewness of the above data by comparing the quartiles. State your reason.
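
A minimal sketch of the basic summary statistics for this data set, using Python's standard `statistics` module. Quartiles and percentiles are left out on purpose, because their values depend on the positioning convention adopted in the lecture notes, which is not stated here.

```python
import statistics

# Lunch-break durations in minutes for the 20 sampled employees (from the question).
minutes = [39, 41, 45, 52, 53, 55, 58, 59, 60, 61,
           62, 62, 63, 65, 65, 65, 69, 70, 72, 77]

mean = statistics.mean(minutes)
median = statistics.median(minutes)      # average of the 10th and 11th ordered values
mode = statistics.mode(minutes)          # most frequent value
sample_var = statistics.variance(minutes)  # sample variance, divides by n - 1
sample_sd = statistics.stdev(minutes)      # square root of the sample variance

print(f"mean = {mean}, median = {median}, mode = {mode}")
print(f"sample variance = {sample_var:.4f}, sample sd = {sample_sd:.4f}")
```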

2. The amounts spent ($) by a random sample of 10 people in a restaurant are shown below:

95, 60, 120, 75, 60, 70, 40, 35, 115, 60

From the above data, find the following measures of location and dispersion.

(a) Mean;
(b) Mode;
(c) Median;
(d) First quartile;
(e) Third quartile;
(f) Range;
(g) IQR; and
(h) Standard deviation.
(i) Describe the skewness of the data by comparing the quartiles.


3. The following are the daily expenses ($) on 1 July 2021 of a sample of 12 tourists from Singapore.

873  2460  951  730  327  1214
768  5293  662  591  820  4260

(a) Compute the sample mean, sample median, first quartile, third quartile, and the 10th percentile.
(b) Compute the range, inter-quartile range, and sample standard deviation.
(c) How do you describe the shape of the distribution? State your reason.

4. The following are the bonuses paid to sales staff employed by two companies in January
according to their performance last year.

Company A: 23000, 30000, 28000, 31000, 29000, 26000,
           34000, 36000, 28000, 32000, 25000, 26000
Company B: 53000, 65000, 70000, 50000, 62000, 58000,
           52000, 63000, 69000, 72000, 64000, 54000

Find the population mean, median, first quartile, third quartile, variance and standard deviation
for company A and company B respectively. Present the results in a table for simple comparison.


5. Peter works in the Labor Department. He is conducting research on the daily
expenses of laborers working in different districts. He samples 20 salespersons, 8 working
in Yuen Long and 12 working in Causeway Bay. He records their spending on lunch on 1
September 2021 and the results are as follows:

Yuen Long:     33  35  38  40
               48  53  55  62

Causeway Bay:  43  48  50  54
               58  63  74  80
               82  88  89  93

(a) Find the sample mean and sample standard deviation of spending on lunch on 1 September 2021 for
those working in Yuen Long and Causeway Bay respectively.
(b) Give a simple comparison of the spending on lunch by workers in Yuen Long and
Causeway Bay using the results in part (a).
(c) For the combined data, find the sample mean, sample standard deviation, and 82nd percentile of
spending on lunch on 1 September 2021.
(d) Use quartiles to comment on the skewness of the combined data. Show your calculation.

6. The salaries of a sample of four salespersons are as follows: $48000, $55000, $51000 and $50000.

(a) Find the mean and the standard deviation of these four sample values.

There are two suggestions for salary increment:

(b) Method A: Each person has an increment of $5000.
Calculate the mean and the standard deviation of these four sample values after the salary
increment by
(i) calculating the salary of each person
(ii) defining the new salary (Y) as a linear function of the original salary (X)

(c) Method B: Each person has a 20% increment in salary.
Calculate the mean and the standard deviation of these four sample values after the salary
increment by
(i) calculating the salary of each person
(ii) defining the new salary (W) as a linear function of the original salary (X)
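
A minimal sketch comparing the two routes in parts (b) and (c). It assumes Method A corresponds to Y = X + 5000 and Method B to W = 1.2X, and uses the standard results that a + bX has mean a + b·(mean of X) and standard deviation |b|·(standard deviation of X). The helper name `transform_summary` is hypothetical.

```python
import math
import statistics

salaries = [48000, 55000, 51000, 50000]   # original salaries X (from the question)
mean_x = statistics.mean(salaries)
sd_x = statistics.stdev(salaries)          # sample standard deviation (n - 1)

def transform_summary(a, b, mean, sd):
    """Mean and standard deviation of a + b*X, given the mean and sd of X."""
    return a + b * mean, abs(b) * sd

# Route (ii): apply the linear-transformation rules.
mean_y, sd_y = transform_summary(5000, 1.0, mean_x, sd_x)   # Method A: Y = X + 5000
mean_w, sd_w = transform_summary(0, 1.2, mean_x, sd_x)      # Method B: W = 1.2 X

# Route (i): recompute each salary, then summarise directly.
direct_y = [x + 5000 for x in salaries]
direct_w = [1.2 * x for x in salaries]

print(mean_y, round(sd_y, 2), math.isclose(sd_y, statistics.stdev(direct_y)))
print(mean_w, round(sd_w, 2), math.isclose(sd_w, statistics.stdev(direct_w)))
```

Adding a constant shifts the mean but leaves the spread unchanged, while multiplying by a constant scales both.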


7. There are 300 overweight patients joining a weight reduction program which aims to help patients
reduce their body weight by 10%. A random sample of 16 patients is selected to monitor the
progress of the program. Below are the weights (rounded to the nearest kg) of these 16 patients
before joining the program.

86 85 93 108 84 84 99 92
91 91 87 103 113 98 118 106

(a) Find the sample mean and sample standard deviation for the data.
(b) Find the first quartile, median, and third quartile for the data.
(c) Find the 10th percentile and 90th percentile.
(d) The target of the program is to reduce the body weight by 10%. If all patients exactly meet the
target, what are the sample mean and sample standard deviation of the body weight of these 16
patients after the program?
(e) Report the sample mean and sample standard deviation of the body weight of these 16 patients
after the program in terms of pounds.

8. A sample of 10 headphones is selected from an online store and the selling prices (in US$) are
presented in an ordered array as:

16, 25, 75, 125, 150,
185, k, 190, 200, 260

(a) It is known that the sample mean selling price of the above data is US$ 141.1. Find the value of k.
(b) Find the sample standard deviation and sample variance of the above data.
(c) Find the first quartile, median and third quartile of the above data.
(d) Customers can place overseas orders with the online store. The currency exchange rate is US$ 1
to HK$ 7.75. When an order is made in Hong Kong, there is a service charge of HK$ 35
for each item. Find the sample mean and sample standard deviation of the payment in HK$ for
ordering a headphone.


9. Identify each of the following symbols used in the lecture notes by writing down the summary
statistic it represents.
(a) µ
(b) σ
(c) σ²
(d) x̄
(e) s²
(f) s
(g) IQR
(h) Q1
(i) Q2
(j) Q3


Chapter 3 – Probability

1. A travel agency is conducting research on the number of trips a customer takes in a
year. Below is the frequency table:

Number of trips      1   2   3   4   5  6
Number of customers  43  89  56  12  4  2

Convert the above frequency table to the probability table of the number of trips a customer takes
in a year.
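
A minimal sketch of the conversion, assuming each probability is estimated by the relative frequency (count divided by the total number of customers). Exact fractions are used so that the probabilities sum to exactly 1.

```python
from fractions import Fraction

trips = [1, 2, 3, 4, 5, 6]             # number of trips (from the table)
customers = [43, 89, 56, 12, 4, 2]     # number of customers (from the table)

total = sum(customers)

# Relative frequency of each value: count / total.
probabilities = {x: Fraction(f, total) for x, f in zip(trips, customers)}

for x, p in probabilities.items():
    print(f"P(X = {x}) = {p} (about {float(p):.4f})")
```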

2. An American restaurant is conducting research to collect customers' opinions on its food. One
question in the questionnaire asks the customer to choose their favourite pizza offered
by the restaurant. 60 customers are classified according to their age group and their
favourite pizza:

                 Favourite pizza
Age group     Hawaiian  California  New York City  Los Angeles
18 – 25       7         6           6              3
26 – 45       1         9           0              10
46 or above   3         2           5              8

(a) In general, which pizza is the most popular choice among all customers? What percentage of
customers like this type of pizza?
(b) Which pizza is the most popular choice among customers aged between 18 and 25? What
proportion of customers aged between 18 and 25 like this type of pizza?
(c) Which pizza is the most popular choice among customers aged between 26 and 45? What is the
probability that a randomly selected customer aged between 26 and 45 likes this type of pizza?
(d) Which pizza is the most popular choice among customers aged 46 or above? What percentage
of customers aged 46 or above like this type of pizza?


3. A company is reviewing the travelling expenses incurred by its staff. The table below is the frequency
table of the number of business trips in a month made by senior managers from different
departments:

              Number of business trips made in a month, X
Department    x=0  x=1  x=2  x=3  x=4  x=5
Marketing     10   15   8    4    2    2
Finance       8    10   18   4    3    0
Research      30   12   10   0    0    0

(a) If one senior manager is selected randomly from the company, compute the following probabilities
respectively: P(X = 0), P(X = 1), P(X = 2), P(X = 3), P(X = 4), and P(X = 5), where X is the
number of business trips the selected senior manager makes in a month.
(b) Recalculate the probabilities P(X = 0), P(X = 1), P(X = 2), P(X = 3), P(X = 4), and P(X = 5),
conditional on the senior manager working in the marketing department.
(c) Recalculate the probabilities P(X = 0), P(X = 1), P(X = 2), P(X = 3), P(X = 4), and P(X = 5),
conditional on the senior manager working in the finance department.
(d) Recalculate the probabilities P(X = 0), P(X = 1), P(X = 2), P(X = 3), P(X = 4), and P(X = 5),
conditional on the senior manager working in the research department.
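
A minimal sketch of how marginal and conditional probabilities can be read off a two-way frequency table. It assumes every senior manager is equally likely to be selected. The function names `marginal` and `conditional` are hypothetical.

```python
from fractions import Fraction

# Frequencies for x = 0, 1, ..., 5 business trips, by department (from the table).
table = {
    "Marketing": [10, 15, 8, 4, 2, 2],
    "Finance":   [8, 10, 18, 4, 3, 0],
    "Research":  [30, 12, 10, 0, 0, 0],
}

grand_total = sum(sum(row) for row in table.values())

def marginal(x):
    """P(X = x): column total divided by the grand total."""
    return Fraction(sum(row[x] for row in table.values()), grand_total)

def conditional(x, department):
    """P(X = x | department): cell count divided by the row total."""
    row = table[department]
    return Fraction(row[x], sum(row))

for x in range(6):
    print(x, marginal(x), conditional(x, "Marketing"))
```

Conditioning on a department restricts the sample space to that row, so the denominator changes from the grand total to the row total.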

4. A publisher conducted a survey of a sample of 600 subscribers. In the survey, each subscriber was
asked to place himself/herself in the most appropriate of four categories: full-time student,
full-time employment, part-time employment, and self-employed. Among all 600 subscribers,
the numbers of subscribers falling in the four categories are as follows: 120 are full-time students, 350
are in full-time employment, 80 are in part-time employment and 50 are self-employed. It is
also known that there are 250 male subscribers in the survey, and the numbers of them in the four
categories are 40, 180, 20, and 10 respectively.

(a) If one subscriber is selected randomly from this sample, what are the probabilities that the selected
subscriber is a full-time student, in full-time employment, in part-time employment, and self-employed
respectively?
(b) If one male subscriber is selected randomly from this sample, what are the probabilities that the
selected male subscriber is a full-time student, in full-time employment, in part-time employment, and
self-employed respectively?
(c) May is one of the female subscribers. What are the probabilities that May is a full-time student,
in full-time employment, in part-time employment, and self-employed respectively?


5. “Walk with me” has recently conducted research on family relationships. In the survey, there
were 900 interviewees, of whom 500 were elderly and 400 were teenagers. Each interviewee was asked
“How often do you have dinner with your family?” and chose one of
“seldom: < 7 days in a month”, “sometimes: 7 days to 14 days in a month”, “quite often: 15
days to 21 days in a month”, and “always: more than 21 days in a month”. Among all interviewees,
there were 240, 350, 60 and 250 responses of seldom, sometimes, quite often, and always
respectively. Among the elderly, 210 seldom have dinner with family and 180
sometimes have dinner with family. Among the teenagers, 40 quite often have dinner with
family and 160 always have dinner with family.

(a) What is the probability that a randomly selected person from the survey always has dinner with
family?
(b) What is the probability that a randomly selected elderly person from the survey seldom has dinner
with family?
(c) What is the probability that a randomly selected elderly person from the survey always has dinner with
family?
(d) If an interviewee is known to seldom have dinner with family, what is the probability that
he / she is a teenager?
(e) If an interviewee is known to always have dinner with family, what is the probability that he /
she is elderly?


Chapter 4 – Probability Distribution

1. A marketing assistant is observing customers in a supermarket. He randomly observes a sample
of 150 customers and records the number of cans of cola they purchase: 4 cans, 6 cans or 9 cans.
In the sample, there are 60 male customers and 90 female customers. His observations are shown
below:

Male        Purchasing 4 cans  Purchasing 6 cans  Purchasing 9 cans
Frequency   15                 28                 17

Female      Purchasing 4 cans  Purchasing 6 cans  Purchasing 9 cans
Frequency   51                 32                 7

Using the combined results of all 150 customers, and letting X represent the number of cans of cola
purchased by a customer, present the probability distribution function of X.

2. The following is the probability distribution function of the number of tutorial classes (X) a
secondary school student attends in a week.

x 0 1 2 3 4 5
P(X = x) k k 0.4 0.15 2k 2k

(a) What is the value of k?
(b) Most likely, how many tutorial classes does a secondary school student attend in a week? What is the
corresponding probability?
(c) What is the probability that a secondary school student attends at least three tutorial classes in a
week?
(d) What is the probability that a secondary school student attends at most four tutorial classes in a
week?
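
A minimal sketch of how an unknown constant in a probability table can be found, relying only on the fact that the probabilities must sum to 1. Each entry is written as a pair (coefficient of k, constant part); this representation and the variable names are illustrative assumptions.

```python
from fractions import Fraction

# P(X = x) for x = 0..5, each entry written as (coefficient of k, constant part).
entries = [
    (1, 0),                   # k
    (1, 0),                   # k
    (0, Fraction("0.4")),     # 0.4
    (0, Fraction("0.15")),    # 0.15
    (2, 0),                   # 2k
    (2, 0),                   # 2k
]

# The probabilities sum to 1: (sum of coefficients) * k + (sum of constants) = 1.
coefficient_sum = sum(a for a, _ in entries)
constant_sum = sum(c for _, c in entries)
k = (1 - constant_sum) / coefficient_sum

pmf = [a * k + c for a, c in entries]
p_at_least_3 = sum(pmf[3:])      # P(X >= 3)
p_at_most_4 = sum(pmf[:5])       # P(X <= 4)

print(float(k), [float(p) for p in pmf])
print(float(p_at_least_3), float(p_at_most_4))
```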


3. The following is the probability distribution function of the number of notebooks sold (X) by Alex
in a day.

x 0 1 2 3 4
P(X = x) 4k 10k 3k 2k k

(a) What is the value of k?
(b) Most likely, how many notebooks are sold by Alex in a day? What is the corresponding
probability?
(c) Alex gets a daily bonus if he sells more than 3 notebooks in a day. What is the probability
that Alex gets the daily bonus?

4. Based on recent records, the manager of a car painting center has determined the following
probability distribution for the number of service requests per day (X). Suppose the center has the
capacity to serve two customers per day.

x         0     1     2     3     4     5
P(X = x)  0.05  0.20  0.30  0.25  0.15  0.05

(a) Most likely, how many service requests would there be in a day?
(b) What is the probability that there are fewer than 2 requests in a day?
(c) What is the probability that there are more than 2 requests in a day?
(d) By at least how many must the capacity be increased so that the probability of turning a request away
is no more than 0.1?


5. The tables below are the probability distribution functions of the number of sick leaves taken in a
month by male and female employees in a large company.

X: number of sick leaves taken by a male employee

x         0    1     2     3     4     5
P(X = x)  0.3  0.29  0.24  0.12  0.03  0.02

Y: number of sick leaves taken by a female employee

y         0  1     2     3     4     5
P(Y = y)  ?  0.32  0.34  0.06  0.04  0.03

(a) What are the expected value, variance and standard deviation of the number of sick leaves taken in a
month by male employees?
(b) What are the expected value, variance and standard deviation of the number of sick leaves taken in a
month by female employees?

6. The following is the probability distribution function of the projected profit, X, of a stock.

x         -10000  -2000  1000  5600  20000  25000
P(X = x)  0.15    0.1    0.2   0.25  m      0.3 – m

(a) What is the range of the possible values of m?
(b) What are E(X), Var(X), and σ(X) if m = 0.2?
(c) What should be the value of m so that the expected profit of this investment is $6900?
(d) What is the maximum possible expected profit?

7. The following is the probability distribution function of the revenue (X) from investing $10,000 in a
particular stock.

x     8000  9000  12000  18000
p(x)  0.2   0.5   0.2    k

(a) Find the value of k.
(b) Find E(X) and Var(X).
(c) Define profit Y = X – 10000. Calculate the expected profit.


8. The following is the probability distribution function of the number of job orders Amy gets in a
day (X).

x 0 1 2 3 4
P(X=x) 2a a a 3a 8a

(a) What is the value of a?
(b) What are the expectation and standard deviation of the number of job orders Amy gets in a day?
(c) Suppose Amy’s daily salary is calculated as Y = 150 + 80X, what are the expectation and standard
deviation of Amy's daily salary?
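
A minimal sketch of the calculation pattern for this question: find the constant from the sum-to-1 condition, compute E(X) and Var(X) = E(X²) − [E(X)]², then apply the linear-transformation rules to Y = 150 + 80X. The variable names are illustrative.

```python
import math
from fractions import Fraction

# P(X = x) is given as a multiple of a; store the multiples only.
multiples = {0: 2, 1: 1, 2: 1, 3: 3, 4: 8}

a = Fraction(1, sum(multiples.values()))       # probabilities must sum to 1
pmf = {x: m * a for x, m in multiples.items()}

mean_x = sum(x * p for x, p in pmf.items())                     # E(X)
var_x = sum(x**2 * p for x, p in pmf.items()) - mean_x**2       # E(X^2) - E(X)^2
sd_x = math.sqrt(var_x)

# Daily salary Y = 150 + 80X: the constant shifts the mean only,
# the factor 80 scales both the mean and the standard deviation.
mean_y = 150 + 80 * mean_x
sd_y = 80 * sd_x

print(a, mean_x, var_x, round(sd_x, 4))
print(float(mean_y), round(sd_y, 2))
```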

9. The daily income of a tourist guide is calculated with the formula Y = 300 + 75X, where
X is the number of tourists in the group. Suppose the distribution function of the number of
tourists in a group is as follows:

x 11 12 13 14 15 16
P(X=x) a a 3a 3a 2a 2a

(a) Calculate the expectation and standard deviation of the number of tourists in a group.
(b) Calculate the expectation and standard deviation of the daily income of a tourist guide.

10. Mary is an online trader who helps customers purchase handbags from the United States. The
expected number of purchases in a week is 4.6 with a standard deviation of 1.1. Each handbag costs
her US$1020 and she sells it at HK$10500. Suppose the fixed weekly cost to run the online
business is $520.

(a) Use X to denote the number of purchases in a week and Y to denote the profit (HK$) she earns in
a week. Express Y in terms of X. (US$1 converts to HK$7.8)
(b) Find the expectation and standard deviation of Mary’s weekly profit in HK$.


11. The following table represents the probability distribution function of the number of complaints
(X) received by a customer service desk in a day.

x 0 1 2 3 4
P(X = x) k 2k 4k 2.5k 0.5k

(a) What is the probability that there is no complaint received in a day?
(b) Find E(X), Var(X) and σ(X).

Currently there is only one customer service officer responsible for handling complaints received
at the customer service desk. The senior management is discussing if extra manpower is needed.

Assume it takes 1.5 hours to handle each complaint.

(c) Find the expected value and standard deviation of the number of hours needed for handling
complaints in a day.
(d) What is the probability that it takes more than 5 hours to handle complaints in a day?

12. The Department of Health has conducted a survey about regular body check-ups. According to
the report, on average, an adult aged between 30 and 40 years old has had 1.7 regular
body check-ups in the last five years, with a standard deviation of 0.82. Another survey
from the Dental Association reports that, on average, an adult aged between 30 and 40 years old
has had 1.05 dental check-ups in the last five years, with a standard deviation of 0.42.
Suppose the number of regular body check-ups and the number of dental check-ups taken by an
adult are independent. Find the expectation and standard deviation of the total number of
check-ups (regular body check-ups plus dental check-ups) an adult aged between 30 and 40 years old has
taken in the last five years.
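
A minimal sketch of the rule for a sum of two independent random variables: the expectations add, and the variances (not the standard deviations) add. The variable names are illustrative; the four input values come from the question.

```python
import math

mean_body, sd_body = 1.7, 0.82        # regular body check-ups
mean_dental, sd_dental = 1.05, 0.42   # dental check-ups

# E(X + Y) = E(X) + E(Y) holds in general.
mean_total = mean_body + mean_dental

# Var(X + Y) = Var(X) + Var(Y) requires independence, as stated in the question.
var_total = sd_body**2 + sd_dental**2
sd_total = math.sqrt(var_total)

print(round(mean_total, 2), round(var_total, 4), round(sd_total, 4))
```

Adding the two standard deviations directly (0.82 + 0.42) would overstate the spread of the total.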


13. The following is the probability distribution function of the number of visits to Ocean Park in a
year for a customer who has the annual pass:

x 0 1 2 3 4 5 6 7 >7
P(X = x) 0 0.02 0.19 0.34 0.35 0.07 0.02 0.01 0

(a) Calculate the (i) expectation and (ii) standard deviation of the number of visits to Ocean Park in
a year for a customer who has the annual pass.
(b) Use Y to denote the number of visits to Disneyland in a year for a customer who has the annual
pass. It is known that E(Y) = 3.67 days and σ(Y) = 1.28 days. Consider the group of individuals
who have both an Ocean Park annual pass and a Disneyland annual pass, and assume that the numbers
of visits to Ocean Park and Disneyland are independent. Find (i) the expectation, (ii) the
variance, and (iii) the standard deviation of the total number of visits to the two theme parks in a year.

14. Lisa is a graphic designer. She works for an advertising firm as a part-time staff member. The number of
jobs she does for the firm in a month has the following probability distribution function:

x         3    4  5   6   7  8
P(X = x)  0.1  k  3k  4k  k  k

(a) What is the value of k?
(b) Calculate the expectation, variance, and standard deviation of the number of jobs she does for
the firm in a month.
(c) Suppose she gets a commission of $2500 for every job. What is the probability that she gets at
least $15000 in commission in a particular month?
(d) Use Y to denote the monthly commission she earns from this advertising firm. Calculate E(Y)
and σ(Y).
(e) Lisa also works as a part-time teacher in a design school. Her monthly income from the school
has an average of $18000 with a standard deviation of $5500. Suppose her incomes from the
advertising firm and the design school are independent. Use T to denote her total monthly
income from the advertising firm and the design school. Find E(T) and σ(T).


15. Tim is a social worker serving the government schools on Hong Kong Island. He provides
consultation to four new students every week. Each student needs to be referred to the senior
social worker with probability 0.3. Assume all cases are handled independently. Use X to denote
the number of cases that need to be referred to the senior social worker in a week, X ~ Bin(n, p).

(a) What are the values of n and p?
(b) Tabulate the probability distribution function of X.
(c) Most likely, how many cases need to be referred to the senior social worker in a week?
(d) Find E(X) and σ(X).
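
A minimal sketch of tabulating a binomial distribution, assuming n = 4 and p = 0.3 as read from the question. It uses the standard formulas P(X = x) = C(n, x) pˣ (1 − p)ⁿ⁻ˣ, E(X) = np and σ(X) = √(np(1 − p)). The function name `binomial_pmf` is hypothetical.

```python
import math

def binomial_pmf(n, p):
    """Return [P(X = 0), ..., P(X = n)] for X ~ Bin(n, p)."""
    return [math.comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]

n, p = 4, 0.3          # four students per week, referral probability 0.3
pmf = binomial_pmf(n, p)

mean = n * p
sd = math.sqrt(n * p * (1 - p))
most_likely = pmf.index(max(pmf))   # value of x with the largest probability

for x, prob in enumerate(pmf):
    print(f"P(X = {x}) = {prob:.4f}")
print(round(mean, 4), round(sd, 4), most_likely)
```

The same function applies to the other binomial questions in this chapter by changing `n` and `p`.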

16. In a café, there are two types of coffee, type A and type B. Type A is more popular than type B:
75% of the customers would choose type A coffee. Suppose now there are 12 customers in
the line. Use X to denote the number of customers in the line who would choose type A coffee,
X ~ Bin(n, p).

(a) What are the values of n and p?
(b) Calculate E(X).
(c) Hence, most likely, how many customers would choose type A coffee?
(d) Calculate P(X = 7), P(X = 8), P(X = 9), P(X = 10) and P(X = 11) respectively.

17. According to the results of a recent survey, 10% of the teenagers in a city are habitual smokers.
Now 20 teenagers are selected randomly and independently from the city.

(a) Find the probability that exactly 5 teenagers in the sample are habitual smokers.
(b) Find the probability that exactly 18 teenagers in the sample are not habitual smokers.
(c) Find the probability that at least 3 teenagers in the sample are habitual smokers.


18. It is known that 3% of the light bulbs in the production line are defective. A customer complained
about the poor quality of the light bulbs and asked for a check. A random sample of 20 light bulbs
is selected and will be sent out for inspection.

(a) Most likely, how many defective light bulbs are there in the sample?
(b) What is the probability that there are exactly two defective light bulbs in the sample?
(c) What is the probability that there are at least 3 defective light bulbs in the sample?

19. Every day, the first 25 visitors to the tourist information centre take part in a lucky draw. Each of
them has a probability of 0.7 of winning a souvenir. Whether each visitor gets a souvenir is
independent of the others.

(a) What are the expectation and standard deviation of the number of visitors who get a souvenir in a day?
(b) Most likely, how many visitors get a souvenir in a day?
(c) What is the probability that more than 22 visitors get a souvenir in a day?

20. Mary has three tickets so she can join the lucky draw three times. The probability of getting a
present from each round of lucky draw is 0.15.

(a) Use X to denote the number of presents she will get after three rounds of lucky draw. Fill in the
probability distribution function with calculation.

x 0 1 2 3
P(X = x) (i) (ii) (iii) (iv)

(b) Suppose the value of each present is $40. Calculate the expectation of the total value of all
presents she will get.


21. According to medical records, 65% of patients who have been diagnosed with disease X
recover within one week. In an elderly care centre, there are 22 residents diagnosed with
disease X. The recovery time of each affected resident is independent.

(a) Most likely, how many affected residents can recover within one week? What is the
corresponding probability? (Justify your answer by calculating the probabilities of the
two most likely possibilities.)
(b) What is the probability that at most 20 of them can recover within one week?
(c) The centre can apply for a medical allowance from the government for treating each affected resident.
The allowance is $800 for a patient who recovers within one week and $1200
for a patient who takes more than one week to recover. Project the total allowance for the 22
affected residents by calculating the expected total allowance.

22. According to the statistics of an education organization which assists secondary school graduates
applying for student VISA for overseas study, most graduates go to either USA or UK. Company
“StudyFree” organizes briefing sessions to the graduates regularly to explain the process of
getting USA student VISA and UK student VISA. Each briefing session has 15 graduates and by
the end of the session each graduate has to confirm which country he / she would apply for.
Following are the percentages of graduates who are interested in these two countries.

USA UK
30% 70%

Assume the choice of destination of each graduate is independent.

(a) What are the expectation and standard deviation of the number of graduates going to the UK in a
briefing session?
(b) Most likely, how many graduates in a briefing session would go to the UK? What is the
corresponding probability?
(c) What is the chance of having at least one graduate going to the USA and at least one graduate going
to the UK in a briefing session?
(d) The service charge for the application of a USA student VISA is $1800 and the service charge for the
application of a UK student VISA is $2200. What are the (i) expectation and (ii) standard
deviation of the total service charge collected in a briefing session?


23. Peter is a waiter at a café in Central. According to his observation, 80% of customers order breakfast
A, while the other customers order breakfast B. At this moment, all six tables in the café are occupied,
each table with one customer. He is going to take an order from each customer.

(a) Use X to denote the number of customers who will order breakfast B, so that X ~ Bin(n, p). What are
the values of n and p?
(b) Most likely, how many customers will order breakfast B? Calculate this probability.
(c) What is the probability that at least two customers will order breakfast B?
(d) The price of breakfast A is $45 and the price of breakfast B is $38. Calculate the (i) expectation
and (ii) standard deviation of the revenue for repeated samples of six individual customers.


Chapter 5 – Normal Distribution

1. Practice using the standard normal table.
Using the standard normal table, find the following probabilities:

(a) P(0 < Z < 2)
(b) P(Z < 1.86)
(c) P(-0.24 < Z < 0)
(d) P(-0.24 < Z < 2.40)
(e) P(-1.79 < Z < -1.30)
(f) P(Z < -1.58)

2. Practice using the standard normal table.
Using the standard normal table, find the value of a:

(a) P(0 < Z < a) = 0.32
(b) P(Z > a) = 0.35
(c) P(Z > a) = 0.825
(d) P(Z < a) = 0.15
(e) P(Z < a) = 0.65
(f) P(-a < Z < a) = 0.4568

3. The length of time a patient waits in Dr. Chan’s waiting room is known to be normally distributed
with mean 14 minutes, and standard deviation 4 minutes.

(a) Find the probability that a patient will wait for more than 20 minutes to see the doctor.
(b) What proportion of patients will wait for more than 10 minutes?
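
A minimal sketch of checking normal probabilities numerically with `statistics.NormalDist` from the Python standard library, as an alternative to the printed table. Results from the table may differ slightly in the last decimal place because z-values in a table are rounded to two decimals.

```python
from statistics import NormalDist

# Waiting time in minutes: normal with mean 14 and standard deviation 4 (from the question).
waiting = NormalDist(mu=14, sigma=4)

# P(X > 20) = 1 - P(X <= 20), which corresponds to z = (20 - 14) / 4 = 1.5.
p_more_than_20 = 1 - waiting.cdf(20)

# P(X > 10) = 1 - P(X <= 10), which corresponds to z = (10 - 14) / 4 = -1.
p_more_than_10 = 1 - waiting.cdf(10)

print(round(p_more_than_20, 4), round(p_more_than_10, 4))
```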

4. Suppose you must establish regulations concerning the maximum number of people that can occupy
an elevator. A study of elevator occupancies indicates that if eight people occupy an elevator, the
probability distribution of the total weight of the eight people is a normal distribution with a mean
of 1200 pounds and a standard deviation of √9800 pounds.

(a) What is the probability that the total weight of eight people exceeds 1300 pounds?
(b) What is the probability that the total weight of eight people exceeds 1500 pounds?


5. According to past experience, the average arrival time of a flight is 18:10. Let X be the
number of minutes by which the flight is delayed. X has a normal distribution with mean 0 and standard
deviation of 10 minutes.

(a) What is the probability that the flight arrives before 18:00?
(b) Passengers must check in for a connecting flight by 18:30 at the latest. What is the probability that
passengers from the first flight arrive too late for the connecting flight? (Assume no traveling time
from aircraft to check-in)

6. In a very large class in world history, the final examination scores have a mean of 66.5 and a
standard deviation of 12.6. Assume the scores are normally distributed. The teachers are
discussing which method should be used as the grading criteria.

(a) Method 1, standard grading. Grade A is awarded to those who score more than 78 marks. What
percentage of the students should get grade A?
(b) Method 2, relative grading. If grade A is given to the top 11.7% of the students, what is the
minimum score to get grade A?

7. The amount a customer spends on a single visit to Park & Save supermarkets has a normal
distribution with mean $75 and standard deviation $21. Park & Save supermarkets wish to
introduce a minimum amount for which credit cards may be used, which enables 80% of
customers to pay by credit card. At what amount should this minimum spending be set?
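In Question 7 the key step is deciding which tail the 80% refers to. The sketch below, assuming Python 3.8 or later, reads the requirement as "80% of customers spend more than the minimum", which makes the minimum the 20th percentile of spending; the variable names are illustrative.

```python
from statistics import NormalDist

spending = NormalDist(mu=75, sigma=21)  # parameters from Question 7

# 80% of customers must spend MORE than the minimum,
# so the minimum is the value with 20% of spending below it.
minimum = spending.inv_cdf(0.20)

print(round(minimum, 2))
```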

8. The price of an air ticket to European countries follows a normal distribution with mean $5200 and
standard deviation $740. Suppose 95% of customers spend between $(5200 - k) and $(5200 + k) on an
air ticket to European countries. What is the value of k?

9. A survey reports that the spending on an online order in supermarket SMART follows a normal
distribution with mean $360 and standard deviation $80.

(a) What is the probability that the spending of an online order is less than $428?
(b) 85% of online orders have spending between $(360 – M) and $(360 + M). What is the value
of M?


10. Peter is a minibus driver who drives between Mong Kok and Kwun Tong. It may be assumed that
the journey time for each ride is normally distributed with mean 30 minutes and standard
deviation 5 minutes.

(a) Peter leaves Mong Kok at 9:00 a.m. What is the probability that he arrives at Kwun Tong after
9:20 a.m.?
(b) If there is a 93.7% chance that Peter spends less than k minutes on one ride, what is the value
of k?

11. The monthly salary of an employee in ABC company is normally distributed with mean $12000
and standard deviation $1000. There is a 5% salary increment for every employee after the New
Year.

(a) What are the mean and standard deviation of the monthly salary after the salary adjustment?
(b) If 15% of the employees earn less than $k per month after the salary adjustment, what
is the value of k?
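Question 11 relies on the fact that multiplying a normal variable by a constant c multiplies both its mean and its standard deviation by c. A short sketch of this, assuming Python 3.8 or later (where NormalDist objects can be scaled by a number), might look as follows; the names salary and raised are illustrative.

```python
from statistics import NormalDist

salary = NormalDist(mu=12000, sigma=1000)  # before the adjustment

# A 5% increment multiplies every salary by 1.05,
# so the mean and the standard deviation are both scaled by 1.05.
raised = salary * 1.05

print(raised.mean, raised.stdev)

# Part (b): the salary level with 15% of employees below it.
k = raised.inv_cdf(0.15)
print(round(k, 2))
```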

12. May and Sam own a cafe together. The monthly revenue of the cafe follows a normal distribution
with mean $45,000 and standard deviation $8,000.

(a) What is the probability that the monthly revenue of the cafe is between $35,000 and $41,800 in a
month?
(b) There is a 67% chance that the monthly revenue of the cafe is less than $K. Find the value of K.
(c) Each month, besides a basic salary of $7000, 30% of the revenue of the cafe goes to May's
salary. Find the expectation and variance of May’s monthly salary.
(d) What is the probability that May's monthly salary will be less than $17,000 in a month?


13. The manager of a local logistics company is reviewing the cost and the service charge of the
delivery service. Packages are classified as small size, middle size, and large size and the delivery
cost is calculated based on the weight of the package and the traveling distance. According to the
record, the delivery cost of a small size package follows a normal distribution with mean $60 and
standard deviation $12.

(a) What is the probability that the delivery cost of a small size package is $40 or more?
(b) The delivery cost of 87.9% of small size packages is more than $M. Find the value of
M.
(c) The service charge is currently calculated by the formula, Y = 200 + 1.8X, where X is the delivery
cost for the package. Find the mean and the standard deviation of the service charge of the
delivery of a small size package.
(d) Instead of calculating the service charge by the original formula, the manager wants to fix the
service charge of delivering a small size package at $350. What proportion of small size packages
will be charged more by the new pricing system than the original pricing system?

14. The monthly salary of the employees in a company follows a normal distribution with mean
$15,000 and standard deviation $500.

(a) What is the probability that the monthly salary of an employee is higher than $16,000?
(b) 85% of all employees have a monthly salary less than $t. Find the value of t.
(c) The salary of each employee will be increased by 10%. Find the (i) mean and (ii) standard
deviation of the adjusted monthly salary.
(d) The manager of the company claims that over 20% of the employees would have monthly salary
more than $17,000 after the adjustment. Is it true? Support your answer with calculation.


15. A travel agency has recently conducted research in order to understand customers’
expectations of cruise tours. One major topic in the research is the budget a customer
would be willing to spend on a 5-day tour to Korea. The results show that a customer would be
willing to spend an average of $8,000 with a standard deviation of $1,200. It is assumed
that the spending is normally distributed.

(a) What is the probability that a customer would be willing to spend at least $7,000 on a 5-day tour
to Korea?
(b) There is an 85% chance that a customer would be willing to spend at least $K on a 5-day tour to
Korea. What is the value of K?
(c) The research also reports that people are willing to pay 20% more if the destination is changed to
Japan. What are the (i) mean and (ii) standard deviation of the budget for a 5-day tour to Japan?
(d) Suppose $(L1, L2) is the budget range that the middle 92% of customers are willing to spend on a
5-day tour to Japan. What are the values of (i) L1 and (ii) L2?

16. May owns a small store selling handmade accessories. The monthly income earned by selling
earrings and the monthly income earned by selling wallets are independent normal variables. The
mean and standard deviation of the monthly income earned by selling wallets are $20000 and
$5000, while the mean and standard deviation of the monthly income earned by selling earrings
are $15000 and $3300.

(a) What are the mean and standard deviation of the total monthly income earned by selling these
two products?
(b) She will not run the business anymore if the probability that she earns less than $30000 is higher
than 0.3. Should she quit the business? Justify your answer with calculation.
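Question 16 uses the rule that, for independent normal variables, the means add and the variances (not the standard deviations) add. The sketch below, assuming Python 3.8 or later (where two NormalDist objects can be added as independent variables), illustrates the rule; the variable names are illustrative.

```python
from statistics import NormalDist

wallet = NormalDist(mu=20000, sigma=5000)
earring = NormalDist(mu=15000, sigma=3300)

# Sum of independent normals: mean = 20000 + 15000,
# variance = 5000**2 + 3300**2, standard deviation = square root of that.
total = wallet + earring

p_below = total.cdf(30000)   # P(total monthly income < 30000)
quit_business = p_below > 0.3

print(total.mean, round(total.stdev, 2), round(p_below, 4), quit_business)
```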

17. The lifetime of a watch battery is normally distributed with mean 5400 hours and standard
deviation 40 hours. Suppose every 2 batteries are packed together. Use T to denote the total
lifetime of the 2 batteries.

(a) What are the mean and standard deviation of total lifetime?
(b) What is the probability that the total lifetime of 2 batteries is less than 10900 hours?


18. Eva goes to Jennifer’s beauty salon every Sunday. According to her experience, the waiting time
to see Jennifer is normally distributed with mean 10 minutes and standard deviation 4 minutes.
The facial treatment time is normally distributed with mean 55 minutes and standard deviation 5
minutes. The waiting time and facial treatment time are independent.

(a) What is the probability that Eva waits for less than 5 minutes to see Jennifer?
(b) Using X to represent the waiting time and Y to represent the facial treatment time, write down the
distribution of the total time (waiting time plus facial treatment time) Eva spends in Jennifer’s beauty
salon.
(c) Eva arrives at the salon at 6:30 p.m. What is the probability that she leaves the salon before 7:30
p.m.?

19. John is an office assistant working in a large law firm. Every morning, he handles the
mail. On one hand, he collects outgoing mail from colleagues; on the other hand, he distributes the
incoming mail to colleagues. Suppose the time he spends on collecting mail each morning
follows a normal distribution with mean 40 minutes and standard deviation 12 minutes, while the
time he spends on delivering mail follows a normal distribution with mean 65 minutes and
standard deviation 8 minutes. It is reasonable to assume that the time he spends on collecting mail
and the time he spends on delivering mail are independent.

(a) What is the probability that he finishes the two jobs within 2 hours in a morning?
(b) There is a 10% chance that he would use more than M minutes to finish the two jobs. What is the
value of M?


20. A survey was conducted in a secondary school to investigate students’ activities during lunch time.
The survey focuses particularly on two items: (i) the total traveling time a student spends on walking
from school to the nearby restaurant and then walking back to school after lunch, and (ii) the time
a student spends on having lunch. The total traveling time follows a normal distribution with
mean 15 minutes and standard deviation 2 minutes, while the time a student spends on having
lunch follows a normal distribution with mean 22 minutes and standard deviation 5 minutes.

(a) What is the probability that a randomly selected student in the school spends more than 12 minutes
on traveling?
(b) 62.5% of the students spend less than k minutes on traveling. What is the value of k?
(c) Let T denote the total number of minutes that a student spends outside school during lunch time,
which includes the total traveling time and the time a student spends on having lunch. Assume
the traveling time and the time for lunch are independent. What are the (i) mean and (ii) standard
deviation of T?
(d) Find P(T > 35).

21. The weight of a box of chocolates of a certain brand follows a normal distribution with a mean of
85 grams and a standard deviation of 7 grams. During the promotion period, there are special sets,
each including two randomly selected boxes of chocolates. Use T to denote the total weight of
chocolates in a special set.

(a) Find the expectation, median, and standard deviation of T.
(b) What is the probability that the weight of chocolates in a special set is less than 160 grams?


Chapter 6 – Sampling Distribution and Central Limit Theorem

1. Random samples are repeatedly taken from the distribution of X, where X ~ N(60, 4^2), and the
sample means are calculated. What are the expectation and standard error of the sample mean if

(a) Sample size = 5
(b) Sample size = 10
(c) Sample size = 15

2. The weight of a can of soup follows a normal distribution with population mean of 375 grams and
standard deviation of 4 grams. Every six cans of soup are packed together randomly as a value
pack. Suppose the average weight of six cans of soup in each value pack is recorded.
What are the (a) expectation and (b) standard error of the sample mean?
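The sample-mean questions in this chapter all use the same two facts: the expectation of the sample mean equals the population mean, and its standard error is sigma / sqrt(n). A minimal sketch for Question 2 follows; the function name standard_error_of_mean is introduced here for illustration.

```python
from math import sqrt


def standard_error_of_mean(sigma, n):
    """Standard error of the sample mean for samples of size n."""
    return sigma / sqrt(n)


expectation = 375                             # equals the population mean (grams)
se = standard_error_of_mean(sigma=4, n=6)     # six cans per value pack

print(expectation, round(se, 4))
```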

3. The weight of a large piece of luggage follows a normal distribution with mean 24 kg and standard
deviation 5 kg. Random samples of 5 large pieces of luggage are selected and the average weight for each
sample is recorded. What are the (a) expectation and (b) standard error of the distribution of the
average weight?

4. The lifetime of a new brand of clock battery is normally distributed with mean 8200 hours and
standard deviation 50 hours. Every 40 batteries of this brand are packed and sold in supermarkets.
What are the (a) expectation and (b) standard error of the average lifetime of a pack of batteries?

5. In a factory, it is known that 9% of the products are defective. Random samples of 80 items are
selected regularly for inspection. What are the (a) expectation and (b) standard error of the sample
proportion of defective items?
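For the sample-proportion questions, the expectation of the sample proportion is the population proportion p, and the standard error is sqrt(p(1 - p) / n). The sketch below applies this to Question 5; the function name standard_error_of_proportion is an illustrative choice.

```python
from math import sqrt


def standard_error_of_proportion(p, n):
    """Standard error of the sample proportion for samples of size n."""
    return sqrt(p * (1 - p) / n)


p, n = 0.09, 80                      # 9% defective, samples of 80 items
se = standard_error_of_proportion(p, n)

print(p, round(se, 4))
```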

6. In a city, it is known that 20% of the residents are left-handed. What are (a) the expectation and (b)
the standard error of the sample proportion of left-handed residents for many samples with sample
size 75?

7. In a survey conducted by the credit card company, it is shown that 70% of the customers would
not pay the bill by monthly instalment if the credit amount is less than $10,000. Many samples,
each with sample size 40, are selected and the sample proportion of customers not paying the bill
by monthly instalment in each sample is calculated. What are the (a) expectation and (b) standard
error of the sample proportions?

8. BIG Bus Corporation has to conduct surveys regularly to evaluate its service quality. According
to previous studies, 87% of the passengers refuse the invitation to take part in such surveys.
Recently, it is planned to invite 350 passengers to take part in the survey. What are the (a)
expectation and (b) standard error of sample proportions of passengers who refuse the invitation
to take part in the survey, if every time 350 passengers are invited?


Chapter 7 – Estimation

1. In order to estimate the population mean amount spent for textbooks per student during the fall
semester at a large community college, a random sample of 75 students is surveyed. It is assumed
that the spending follows a normal distribution with unknown population mean and the population
standard deviation is $35. The sample mean spending of the 75 surveyed students is $158.30.

(a) Give a point estimate for the population mean cost per student.
(b) Find the sampling error at 90% confidence level.
(c) Construct the 90% confidence interval estimate for the mean cost per student for all students.
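When the population standard deviation is known, the sampling error (margin of error) is z * sigma / sqrt(n) and the interval is the sample mean plus or minus that amount. The sketch below is one way to check Question 1, assuming Python 3.8 or later; the function name z_interval is hypothetical, and it uses the exact z value rather than a rounded table value, so the last digit may differ slightly from a hand calculation.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, confidence):
    """Return (lower, upper, margin) for a mean with known sigma."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided critical value
    margin = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin, margin


lower, upper, margin = z_interval(xbar=158.30, sigma=35, n=75, confidence=0.90)
print(round(margin, 2), round(lower, 2), round(upper, 2))
```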

2. The quality control manager at a light bulb factory needs to estimate the average lifetime of a light
bulb in a large shipment. The process standard deviation is known to be 100 hours. A random
sample of 80 light bulbs indicated a sample average lifetime of 3500 hours.

(a) Give a point estimate for the population mean lifetime.
(b) Find the sampling error at 95% confidence level.
(c) Set up a 95% confidence interval estimate of the population average lifetime of light bulbs in this
shipment.

3. Suppose that a paint supply store wants to estimate the correct amount of paint contained in one-
gallon cans purchased from a nationally known manufacturer. It is known from the
manufacturer’s specifications that the standard deviation of the amount of paint is equal to 0.02
gallon. A random sample of 50 cans is selected, and the average amount of paint per one-gallon
can is 0.995 gallon. Set up a 99% confidence interval estimate of the population mean amount of
paint.

4. A travel agency frequently arranges seminars in different topics for promotion purpose. The
manager wants to estimate the population mean time for one seminar. A random sample of 40
seminars has the sample mean of 75 minutes. It is believed that the population standard deviation
of time required for a seminar is 10 minutes. Construct a 95% confidence interval estimate of the
population mean time required for one seminar.


5. A travel agency conducts a survey in order to study customers’ spending on a 2-week holiday
in Europe. It is assumed the spending follows a normal distribution with unknown mean and
population standard deviation of $5500. A random sample of 25 customers is selected for
investigation and the data ($) is as follows:
58000 60000 43000 55000 50000
62000 47000 66000 62000 51000
49000 47000 53000 54000 49000
52000 50000 40000 32000 48000
52000 53000 47000 46000 49000

(a) Calculate the 98% sampling error for the estimation of the population mean spending.
(b) Construct the 98% confidence interval for the estimation of the population mean spending.

6. The monthly working hours of a sample of 13 part-time workers from a company were
83 58 70 56 76 64 80 76 70 97 68 78 108
It is assumed that the monthly working hours is normally distributed.

(a) Find the sample mean and standard deviation.
(b) Construct the 95% confidence interval for the estimate of the mean monthly working hours of all
part-time workers in the company.
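When the population standard deviation is unknown and the sample is small, the interval uses the sample standard deviation and a t critical value with n - 1 degrees of freedom. The sketch below works through Question 6 under that approach. Python's standard library has no t distribution, so the critical value 2.179 (95% confidence, 12 degrees of freedom) is typed in from a t table; that constant and the variable names are assumptions of this sketch.

```python
from math import sqrt
from statistics import mean, stdev

hours = [83, 58, 70, 56, 76, 64, 80, 76, 70, 97, 68, 78, 108]

n = len(hours)
xbar = mean(hours)
s = stdev(hours)      # sample standard deviation (divisor n - 1)

T_CRIT = 2.179        # from a t table: 95% confidence, n - 1 = 12 degrees of freedom
margin = T_CRIT * s / sqrt(n)
lower, upper = xbar - margin, xbar + margin

print(round(xbar, 2), round(s, 2), round(lower, 2), round(upper, 2))
```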

7. Ten randomly selected i-cable TV customers were each asked how many hours of
television they watched per week. The results are:

82 66 90 84 75 88 80 94 110 91

Determine the 90% confidence interval estimate for the mean number of hours of television
watched per week by i-cable TV customers. Assume the number of hours is normally distributed.


8. A random sample of 20 babies is taken from the newborn babies at Northside Hospital
during the year 2015. The sample mean and standard deviation of the weight of a baby is 6.87 lb
and 1.76 lb respectively. Based on past information, it is assumed that weight of newborn baby
follows a normal distribution. Construct the 95% confidence interval estimate for the mean weight
of all babies born in this hospital in 2015.

9. A stationery store wants to estimate the mean retail value of greeting cards that it has in its
inventory. A random sample of 40 greeting cards indicates an average value of $16.7 and a
standard deviation of $3.2. Assuming the value of greeting cards follows a normal distribution, set
up a 95% confidence interval estimate of the mean value of all greeting cards in the store’s
inventory.

10. A software company is organizing a competition which invites secondary school students to
produce an animation movie by using its software. The organizer takes a random sample of 12
movies and reviews the duration of each movie. The lengths of these 12 movies (in minutes) are:

71 91 64 83 73 77 82 93 65 84 89 69

It is assumed that the length of a movie is normally distributed with population standard deviation of
8 minutes. Construct a 95% confidence interval estimate for the population mean length of a movie.


11. In a beverage manufacturing plant, a production line operates to fill the containers with 16 ounces
of cola. A quality control inspector checks a random sample of 25 bottles. The weight of cola
(ounces) in each bottle is recorded as follows:

15.95 16.07 16.11 15.93 16.08
16.02 15.87 16.12 16.02 16.08
15.88 16.03 16.09 15.93 15.99
16.08 16.11 16.01 16.04 16.12
15.97 16.22 15.89 16.12 16.04

Assume the weight of cola in a bottle is normally distributed.

(a) Construct a 98% confidence interval estimate of the population mean weight of cola in a bottle.
(b) Random samples, each with sample size 25, will be selected periodically. Suppose the 98%
confidence interval estimate of the population mean weight of cola in a bottle has been
constructed for each of 300 independent samples. Approximately how many such intervals can be
expected to include the true unknown population mean?

12. Joey is asked to estimate the proportion of people driving “Benz” in a commercial building in
Central. She randomly identified 200 cars in the parking lot, of which she found 17 to be Benz.

(a) Find the point estimate for the population proportion of people driving “Benz” in this building.
(b) Calculate the sampling error at 90% confidence level.
(c) Construct the 90% confidence interval for the proportion of people driving “Benz” in this building.
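For a proportion, the point estimate is the sample proportion and the sampling error is z * sqrt(p_hat(1 - p_hat) / n). The sketch below applies this to Question 12, assuming Python 3.8 or later; the function name proportion_interval is hypothetical, and the exact z value is used in place of a rounded table value.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(successes, n, confidence):
    """Return (p_hat, margin) for a large-sample proportion interval."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, margin


p_hat, margin = proportion_interval(successes=17, n=200, confidence=0.90)
print(p_hat, round(margin, 4), round(p_hat - margin, 4), round(p_hat + margin, 4))
```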

13. A telephone survey was conducted to estimate the proportion of households with a personal
computer. Of the 380 households surveyed, 290 had a personal computer.

(a) Find the point estimate for the population proportion of households with a personal computer.
(b) Calculate the sampling error at 95% confidence level.
(c) Construct the 95% confidence interval estimate for the population proportion of households with a
personal computer.


14. In a sample of 60 randomly selected residents, the candidates they are going to
vote for are as follows:

Tim Tim Mike Mike Mike Julia Mike Mike Tim Tim
Mike Julia Tim Tim Mike Mike Julia Julia Tim Tim
Tim Mike Mike Tim Mike Tim Mike Julia Tim Tim
Mike Julia Tim Mike Tim Julia Tim Tim Tim Mike
Tim Mike Julia Tim Julia Mike Tim Mike Tim Tim
Tim Julia Mike Tim Mike Tim Mike Tim Mike Tim

Construct a 90% confidence interval for the proportion of all residents who support Mike to be
the next president.

15. It is known that the weight of a melon from a farm is normally distributed. It is believed that the
population standard deviation of the weight of a melon is 0.9 kg while the population mean is
unknown. In order to estimate the population mean, a random sample of 120 melons is taken and
the sample mean is 4 kg.

(a) Construct a 95% confidence interval estimate for the population mean weight of melons.

Another random sample is taken in order to estimate the population proportion of melons which
weigh heavier than 4.2 kg. In a sample with 200 randomly selected melons, 30 of them weigh
heavier than 4.2 kg.

(b) Construct a 90% confidence interval estimate for the population proportion of melons which
weigh heavier than 4.2 kg.


16. A survey is carried out to study whether the customers are satisfied with the service provided by
the shop. Out of a total of 300 randomly selected customers, 168 customers are satisfied with the
service. Furthermore, among these 300 customers, the sample mean spending on one visit to the
shop is $820 and the sample standard deviation is $165.

(a) Construct a 98% confidence interval estimate for the population proportion of customers who are
satisfied with the service.

(b) Construct a 98% confidence interval estimate for the population mean spending on one visit to
the shop.

17. A survey is being conducted in a theme park. One of the objectives is to review the design of a game
counter.

(a) The following are the scores obtained by a sample of 10 participants at the game counter:
97 117 140 78 99
148 108 135 126 121

(ai) Assume the score follows a normal distribution. Construct a 95% confidence interval estimate for
the population mean score.
(aii) Random samplings are conducted repeatedly and for each random sample a 95% confidence
interval is constructed for the population mean. After 200 samplings, each with sample size 10,
there are 200 confidence intervals. About how many such confidence intervals would cover the
true population mean?

(b) The survey also shows that 360 out of 500 randomly selected participants enjoy the game.
Construct a 90% confidence interval estimate for the population proportion of participants who enjoy
the game.


18. A study is being conducted on university students’ credit card usage.

(a) The first objective of the research is to estimate the average monthly spending with credit card.
A random sample of 45 students is taken; the sample mean spending in a month is $4,600 and
the sample standard deviation is $1,400. Assuming the amount of monthly spending with credit
card follows a normal distribution, construct a 95% confidence interval estimate for the
population average monthly spending with credit card.

(b) Another objective of the research is to study whether students make full payment of the loan.
Among these 45 students, 30 of them always pay the full payment before the deadline. Construct a
90% confidence interval estimate for the population proportion of students who always pay the full
payment before the deadline.


Chapter 8 – Hypothesis Testing

1. The manager at Air Express feels that the weights of packages shipped recently are less than in the
past. Records show that in the past, packages had a mean weight of 36.7 lb and a standard deviation
of 14.2 lb. A random sample of 64 packages taken today yielded a mean weight of 32.1 lb. Is this
sufficient evidence to conclude that weights of packages are less than in the past? Test the
hypothesis at 5% significance level. (Assume the standard deviation is unchanged.)
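Question 1 is a lower-tailed z test (H0: mean = 36.7 against H1: mean < 36.7) with known standard deviation. The sketch below shows how the test statistic and the critical value might be computed and compared, assuming Python 3.8 or later; the function name z_statistic is introduced for illustration, and writing up the hypotheses and conclusion is still left to the reader.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(xbar, mu0, sigma, n):
    """Test statistic for a one-sample z test with known sigma."""
    return (xbar - mu0) / (sigma / sqrt(n))


z_stat = z_statistic(xbar=32.1, mu0=36.7, sigma=14.2, n=64)
z_crit = NormalDist().inv_cdf(0.05)   # lower-tail critical value at the 5% level
reject_h0 = z_stat < z_crit           # reject when the statistic falls below it

print(round(z_stat, 4), round(z_crit, 4), reject_h0)
```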

2. The director of manufacturing at a clothing factory needs to determine whether a new machine is
producing the cloth according to the standard of the mean breaking strength of 70 pounds. The
population standard deviation of the breaking strength is known to be 3.5 pounds. A sample of
49 pieces of cloth reveals a sample mean breaking strength of 69.1 pounds. Is there evidence that
the machine is not meeting the specification? (In terms of the population mean breaking strength
is different from 70 pounds.) Test the hypothesis at 5% significance level.

3. A salad dressing machine is working properly when 8 ounces of salad are dispensed into a bottle.
The standard deviation of the process is 0.15 ounce. A sample of 50 bottles is selected periodically
and the filling is stopped if there is evidence that the mean amount dispensed is different from 8
ounces. Suppose that the mean amount dispensed in a sample of 50 bottles is 7.983 ounces. Is
there evidence that the population average amount is different from 8 ounces? Test the hypothesis
at 10% level of significance.

4. The policy of a particular bank branch is that its ATMs must be stocked with enough cash to satisfy
customers making withdrawals over an entire weekend. The expected mean amount of money
withdrawn from the North Point branch per customer transaction over the weekend is $1600 with
an expected standard deviation of $300. In order to check if the customers’ average withdrawal
has been increased, a sample of 36 transactions is examined. The sample mean withdrawal is
$1680. Is there evidence to believe that the true mean withdrawal is greater than the expectation?
Test the hypothesis at 5% level of significance.


5. The director of admission team of a large university advises parents of incoming students about
the cost of textbooks during a semester. A sample of 100 students enrolled in the university
indicates a sample mean cost of $3154 with a sample standard deviation of $432. Test, at the 0.05
level of significance, is there evidence that the population mean cost of textbooks is above $3000.

6. A manufacturer of flashlight batteries took a sample of 13 batteries from a day’s production and
used them continuously until they failed to work. The lifetime (hours) until failure was:

342 426 317 545 264 451 1,049
631 512 266 492 562 298

At the 0.05 level of significance, is there evidence that the mean life of the batteries is different
from 400 hours? Assume the lifetime of the batteries is normally distributed.

7. In order to test the null hypothesis that “the mean weight of adult males equals to 160 lb”, the
weights of 16 males were collected with the following results:

173 178 145 146 157 175 173 137
152 171 163 170 135 159 199 131

Assuming normality, test at the 0.05 level of significance, is there evidence to reject the null
hypothesis for the alternative hypothesis that “the mean weight for adult males is different from
160 lb”.


8. In a city, according to a 1990 demographic report based on the census result, the average daily
spending on food per person is $75 with standard deviation of $15. In 2020, a random sample of
28 persons in the same city is selected and the daily spending ($) on food for each interviewee is
recorded as follows:

55.0 59.5 62.5 65.5 68.5 69.0 70.0 70.0 82.0 82.5
83.5 85.0 86.0 86.5 86.5 87.0 87.0 87.0 89.0 92.5
93.0 94.5 94.5 96.0 98.0 102.5 108.5 110.5

Is there sufficient evidence to conclude that the population mean daily spending on food per
person in 2020 is higher than 1990? Test at 1% level of significance with the assumption that
2020 daily spending on food is normally distributed with the same population standard deviation
as in 1990.

9. It is claimed that the students at a certain university will score an average of 35 on a given test. Is
the claim reasonable if a random sample of test scores from this university yields 33, 42, 38, 37,
30, 42? Complete a hypothesis test using α = 0.1. Assume test results are normally distributed.

10. The Better Sleep Council reports that 61% of the residents get more than seven hours of sleep per
night on the weekend. A random sample of 350 adults found that 235 had more than seven hours
of sleep last weekend. At the 0.05 level of significance, does this evidence show that the
proportion of the residents who get more than seven hours of sleep per night on the weekend is more
than 61%?
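For a test about a proportion, the standard error in the test statistic is computed from the hypothesised proportion p0, not from the sample proportion. The sketch below illustrates this for Question 10 as an upper-tailed test (H1: p > 0.61), assuming Python 3.8 or later; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.61                 # hypothesised proportion under H0
n, successes = 350, 235
p_hat = successes / n

# The standard error uses p0 because the statistic is computed assuming H0 is true.
z_stat = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
z_crit = NormalDist().inv_cdf(1 - 0.05)   # upper-tail critical value at the 5% level
reject_h0 = z_stat > z_crit

print(round(p_hat, 4), round(z_stat, 3), round(z_crit, 4), reject_h0)
```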

11. A politician claims that she will receive more than 60% of the vote in an upcoming election. The
results of a properly designed random sample of 100 voters showed that 65 of those sampled will
vote for her. Test, at the 0.05 level of significance, whether her support rate is more than 60%.

12. A county judge has agreed that he will give up his county judgeship and run for the state
judgeship if there is evidence that more than 25% of his fellow party members oppose him. A random
sample of 800 party members indicated that 217 of them opposed him. Does this sample suggest
that he should give up his county judgeship and run for the state judgeship? Carry out this
hypothesis test by using α = 0.10.


13. With respect to the statement claimed by a magazine, “More men work at home than women”, a
women’s rights organization has conducted a survey to test it. The study of 899 home-based
businesses reported that 369 were owned by women. Does the finding provide sufficient evidence
to support that the proportion of home-based businesses owned by women is less than 0.5? Test the
hypothesis at 5% significance level.

14. A bank offers three types of welcoming gifts to VISA card applicants.
Gift A: $400 supermarket coupon
Gift B: a 2-person tea set voucher
Gift C: 8 sets of movie tickets

The choices made by a random sample of 40 applicants are as follows:

B A B A A A C A
A B A A B A A B
B B A A B A B A
A B A C A C A C
A A A A A A B A

Based on the table above, test, at the 0.01 level of significance, is there sufficient evidence to
conclude that more than half of the applicants choose Gift A?

15. A test was conducted to compare the wearing quality of the tires produced by two tire companies.
All the factors are controlled on both brands of tires, car by car. The following is the summary of
the amount of wear (in thousandths of an inch) of six test cars:

Car 1 2 3 4 5 6
Brand A 125 64 94 38 90 106
Brand B 133 65 103 37 102 115

Test, at the 0.05 level of significance, is there evidence that the amount of wear of the two brands
of tire is different? Assume the amount of wear is normally distributed.
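Because each car carries both brands, Question 15 is a paired design: the test is carried out on the six per-car differences rather than on two independent samples. The sketch below computes the paired t statistic. The critical value 2.571 (two-tailed, 5% level, 5 degrees of freedom) is typed in from a t table because Python's standard library has no t distribution; that constant and the variable names are assumptions of this sketch.

```python
from math import sqrt
from statistics import mean, stdev

brand_a = [125, 64, 94, 38, 90, 106]
brand_b = [133, 65, 103, 37, 102, 115]

# Work with the difference for each car (Brand A minus Brand B).
diffs = [a - b for a, b in zip(brand_a, brand_b)]
n = len(diffs)

t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))   # H0: mean difference = 0

T_CRIT = 2.571   # from a t table: two-tailed, alpha = 0.05, n - 1 = 5 degrees of freedom
reject_h0 = abs(t_stat) > T_CRIT

print(diffs, round(t_stat, 3), reject_h0)
```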


16. The following is the data obtained from an experiment designed to estimate the reduction in
diastolic blood pressure as a result of following a salt-free diet for two weeks. Assume diastolic
readings to be normally distributed.

Before 93 106 87 92 102 95 88 110
After 92 102 89 92 101 96 88 105

Is there a significant reduction in diastolic blood pressure after following a salt-free diet? (Use α
= 0.05).

17. The table below shows the weights (in pounds) of 8 adult participants, measured before and after
joining the weight control program.

Before 122 130 114 139 150 147 155 153
After 123 125 116 132 141 138 158 152

Test, at the 5% significance level, whether there is any evidence showing that the weight control
program can effectively reduce weight.

18. A series of promotion programs is conducted in January and February in a supermarket chain. The
average spending of a customer on 1 January and on 1 March in a sample of 9 supermarkets is
recorded and the results are as follows:

Supermarket                                   1   2   3   4   5   6   7   8   9
Average spending of a customer on 1 January   258 304 188 194 225 179 251 294 174
Average spending of a customer on 1 March     262 312 196 187 250 182 247 274 190

Test, at the 1% significance level, if there is an increase in the average spending of a customer
from 1 January to 1 March.


19. Before buying a new milling machine, the purchasing director would like to check if there is
evidence that the parts produced by the new machine have a significantly higher average breaking
strength than those from the old machine. The process standard deviation of the breaking strength
for the old machine is 10 kilograms and for the new machine is 9 kilograms. A sample of 100
parts taken from the old machine indicates a sample mean of 65 kilograms, while a sample of 100
parts taken from the new machine indicates a sample mean of 72 kilograms. Is there evidence to
support the director's purchase of the new machine? Write a test report using the critical value
approach, based on the following Excel output generated at the 0.01 level of significance.

z-Test: Two Sample for Means

Old New
Mean 65 72
Known Variance 100 81
Observations 100 100
Hypothesized Mean Difference 0
z -5.2031
P(Z<=z) one-tail 0.0000
z Critical one-tail 2.3263
P(Z<=z) two-tail 0.0000
z Critical two-tail 2.5758
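
The z statistic and the one-tail critical value in the output above can be reproduced from the summary figures in the question. This is a minimal Python sketch using only the standard library; the variable names are illustrative, and the difference is taken as Old minus New to match the sign in the Excel output.

```python
from math import sqrt
from statistics import NormalDist

# Summary figures from the question (old vs new machine)
mean_old, mean_new = 65, 72
var_old, var_new = 10 ** 2, 9 ** 2    # known process variances
n_old, n_new = 100, 100

# z statistic for the difference in means with known variances
z = (mean_old - mean_new) / sqrt(var_old / n_old + var_new / n_new)

# Critical value for a one-tailed test at the 0.01 level
z_crit = NormalDist().inv_cdf(1 - 0.01)

print(f"z = {z:.4f}, one-tail critical value = {z_crit:.4f}")
```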


20. An experiment is designed to compare the differences in average surface hardness of two
modified materials, A and B. Based on past experience, it is believed that the standard deviations
in surface hardness are 10.2 and 6.4 for materials A and B respectively. In the experiment, 60
items are selected, 30 items are material A and 30 items are material B. If the sample means
hardness of materials A and B are 163.4 and 156.9, use a 0.05 level of significance to determine
whether there is evidence of a difference between the hardness of the two materials. Write a test
report in p-value approach based on the following Excel output.

z-Test: Two Sample for Means

Material A Material B
Mean 163.4 156.9
Known Variance 104.04 40.96
Observations 30 30
Hypothesized Mean Difference 0
z 2.9566
P(Z<=z) one-tail 0.0016
z Critical one-tail 1.6449
P(Z<=z) two-tail 0.0031
z Critical two-tail 1.9600


21. Two samples of burgers are selected from two branches of Macy Restaurants in order to test if the
average fat contents (in grams) of burgers from the two branches are the same. The results are as
follows:

Branch A 33.7 21.6 32.1 38.2 33.2 35.9 34.1 39.8
         23.5 21.2 23.3 18.9 30.3
Branch B 28.0 29.9 22.3 23.3 33.6 24.1 16.9 14.4
         30.2 23.1 13.9 19.7 16.6 13.8 42.1 28.1

An Excel report is generated for the test at the 0.05 level of significance:

t-Test: Two-Sample Assuming Equal Variances

Branch A Branch B
Mean 29.6769 23.75
Variance 50.0269 63.3933
Observations 13 16
Pooled Variance 57.4527
Hypothesized Mean Difference 0
df 27
t Stat 2.0941
P(T<=t) one-tail 0.0229
t Critical one-tail 1.7033
P(T<=t) two-tail 0.0458
t Critical two-tail 2.0518

(a) Find the sample mean fat contents of burgers in Branch A and Branch B respectively.
(b) State the null hypothesis and alternative hypothesis of the test.
(c) Report the p-value of the test.
(d) Is there evidence that the average fat contents of burgers from the two branches are different at
the 5% significance level? Explain your answer by using the p-value in part (c).


22. Twenty laboratory mice were randomly divided into two groups of 10. Each group was fed
according to a prescribed diet. At the end of three weeks, the weight gained (in grams) by each
animal was recorded. Do the data in the following table justify the conclusion that diet B has
a stronger effect on increasing the mice’s weight than diet A?

Sample A 5 14 7 9 11
7 13 14 12 8
Sample B 10 21 16 23 24
16 13 19 19 21

An Excel report is generated for the test at the 0.01 level of significance.

t-Test: Two-Sample Assuming Equal Variances

Diet A Diet B
Mean 10 18.2
Variance 10.4444 19.7333
Observations 10 10
Pooled Variance 15.0889
Hypothesized Mean Difference 0
df 18
t Stat -4.7203
P(T<=t) one-tail 0.0001
t Critical one-tail 2.5524
P(T<=t) two-tail 0.0002
t Critical two-tail 2.8784

(a) Find the sample mean weight gained by the mice on diet A and on diet B respectively.
(b) Write a test report using the critical value approach to conclude whether there is evidence that
diet B has a stronger effect on increasing the mice’s weight than diet A.
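
The pooled variance and t statistic in the Excel output can be recomputed from the raw weight gains. This is a minimal Python sketch using only the standard library, assuming equal population variances as the output does; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, variance

# Weight gains (grams) from the table above
diet_a = [5, 14, 7, 9, 11, 7, 13, 14, 12, 8]
diet_b = [10, 21, 16, 23, 24, 16, 13, 19, 19, 21]

n_a, n_b = len(diet_a), len(diet_b)
var_a, var_b = variance(diet_a), variance(diet_b)   # sample variances

# Pooled variance: weighted average of the two sample variances
pooled = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)

# t statistic (A minus B) with n_a + n_b - 2 degrees of freedom
t_stat = (mean(diet_a) - mean(diet_b)) / sqrt(pooled * (1 / n_a + 1 / n_b))

print(f"pooled variance = {pooled:.4f}, t = {t_stat:.4f}, df = {n_a + n_b - 2}")
```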


23. The personnel department of a company decided to investigate whether the age of an employee
had any effect on learning new computing skills. Among those employees who had not attended
the company computing program, 8 employees aged below 40 and 10 employees aged over 40
were selected randomly and then were given the same computing training course. At the end of
the course, the 18 employees were given a test. The results of the test were given as follows:

Scores obtained by employees aged below 40: 38 47 88 66 44 68 41 55
Scores obtained by employees aged over 40:  70 39 56 72 34 58 60 46 49 50

Below is the Excel output generated at 5% level of significance:

t-Test: Two-Sample Assuming Equal Variances

             Employees aged below 40   Employees aged over 40
Mean         55.875                    53.4
Variance     291.8393                  151.3778
Observations 8                         10
Pooled Variance 212.8297
Hypothesized Mean Difference 0
df 16
t Stat 0.3577
P(T<=t) one-tail 0.3626
t Critical one-tail 1.7459
P(T<=t) two-tail 0.7253
t Critical two-tail 2.1199

(a) Find the sample mean scores obtained by employees aged below 40 and employees aged above
40 respectively.

(b) Write a test report in p-value approach to conclude whether the age of an employee had any effect
on learning new computing skills at the 5% significance level.


24. A salesman claims that the percentage of defective mobile phones is no higher than that for a
similar model from his competitor in the promotion program. To test this statement, the retailer
took random samples from each manufacturer’s product.

Product        Number of Defective   Number Checked
Salesman’s     15                    150
Competitor’s   6                     150

Below is the Excel output generated at 5% level of significance:

z-Test: Two Sample for Proportions

Salesman Competitor
Proportion 0.1 0.04
Variance 0.0651 0.0651
Observations 150 150
Hypothesized Proportion Difference 0
z 2.0365
P(Z<=z) one-tail 0.0208
z Critical one-tail 1.6449
P(Z<=z) two-tail 0.0417
z Critical two-tail 1.9600

(a) Find the sample proportion of defective items among the salesman’s mobile phones, the sample
proportion of defective items among the competitor’s mobile phones, and the pooled sample
proportion of defective items respectively.
(b) Write a test report using the critical value approach to conclude whether there is any evidence
that the defective rate of the salesman’s mobile phones is actually higher than that of his competitor’s.
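
The sample proportions, the pooled proportion and the z statistic in the output can be checked from the counts in the table. This is a minimal Python sketch with illustrative variable names, using the pooled standard error under the null hypothesis of equal proportions.

```python
from math import sqrt

x1, n1 = 15, 150   # salesman's product: defectives, number checked
x2, n2 = 6, 150    # competitor's product: defectives, number checked

p1, p2 = x1 / n1, x2 / n2
p_pooled = (x1 + x2) / (n1 + n2)     # pooled sample proportion

# Standard error under the null hypothesis of equal proportions
se = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se

print(f"p1 = {p1}, p2 = {p2}, pooled = {p_pooled}, z = {z:.4f}")
```

The "Variance" row in the Excel output corresponds to p_pooled × (1 − p_pooled) in this sketch.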


25. A survey has invited 200 men and 200 women to taste a brand of cola. Twenty-nine percent of
the men and 24% of the women responded positively to the cola. Based on this survey, can we
conclude that there is a significant difference between the proportions of men and women who respond
positively to the cola at the 0.02 level of significance? Write a test report using the p-value approach
based on the Excel output below:

z-Test: Two Sample for Proportions

Men Women
Proportion 0.29 0.24
Variance 0.1948 0.1948
Observations 200 200
Hypothesized Proportion Difference 0
z 1.1329
P(Z<=z) one-tail 0.1286
z Critical one-tail 2.0537
P(Z<=z) two-tail 0.2572
z Critical two-tail 2.3263


26. The table below is a summary of a hotel guest satisfaction study.

Hotel
Visit again? Westwind Goodview
Yes 163 154
No 64 108

A test is conducted to check whether there is any evidence that a greater proportion of guests of
Westwind are likely to revisit than that of Goodview, and the Excel output is as follows:

z-Test: Two Sample for Proportions

Westwind Goodview
Proportion 0.7181 0.5878
Variance 0.2280 0.2280
Observations 227 262
Hypothesized Proportion Difference 0
z 3.0088
P(Z<=z) one-tail 0.0013
z Critical one-tail 1.6449
P(Z<=z) two-tail 0.0026
z Critical two-tail 1.9600

(a) Find the sample proportions of guests who would be likely to revisit Westwind Hotel and Goodview
Hotel respectively.
(b) State the null hypothesis and alternative hypothesis of the test.
(c) Report the p-value of the test.
(d) Is there evidence to say that a greater proportion of guests of Westwind are likely to revisit than
that of Goodview at 5% significance level? Explain your answer by using the p-value in part (c).


Chapter 9 – Analysis of Variance

1. Suppose that we want to compare the prices of pork sold in wet markets in different districts.
Random samples of size 4 are taken from Mong Kok, Wan Chai and Tai Po, and the prices ($) of
400 grams of pork are as follows:

Mong Kok 71, 75, 65, 69
Wan Chai 90, 80, 86, 84
Tai Po 72, 77, 76, 79

(a) Report the sample mean price of 400 grams of pork for each district and the combined
sample mean.
(b) Write a test report using the critical value approach, at the 5% significance level, on whether the
mean prices of pork in the three districts are not all the same, referring to the following Excel output:

SUMMARY
Groups Count Sum Average Variance
Mong Kok 4 280 70 17.3333
Wan Chai 4 340 85 17.3333
Tai Po 4 304 76 8.6667

ANOVA
Source of Variation SS df MS F P-value F crit
Between Groups 456 2 228 15.7846 0.0011 4.2565
Within Groups 130 9 14.4444

Total 586 11
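
The sums of squares and the F statistic in the ANOVA table can be rebuilt from the raw prices. This is a minimal Python sketch of the one-way ANOVA arithmetic using only the standard library; the dictionary layout and names are illustrative.

```python
from statistics import mean

# Prices ($) of 400 grams of pork from the table above
prices = {
    "Mong Kok": [71, 75, 65, 69],
    "Wan Chai": [90, 80, 86, 84],
    "Tai Po": [72, 77, 76, 79],
}

all_values = [v for group in prices.values() for v in group]
grand_mean = mean(all_values)        # combined sample mean

# Between-groups and within-groups sums of squares
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in prices.values())
ss_within = sum((v - mean(g)) ** 2 for g in prices.values() for v in g)

df_between = len(prices) - 1
df_within = len(all_values) - len(prices)

# F statistic = mean square between / mean square within
f_stat = (ss_between / df_between) / (ss_within / df_within)
print(f"SSB = {ss_between}, SSW = {ss_within}, F = {f_stat:.4f}")
```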


2. The table below shows the scores for crunchiness of four competing potato crisps. Each type of
crisp was assessed by several testers.

Crisp 1: 13.4 12.2 12.4 12.8 12.2
Crisp 2: 9.3 10.8 8.4 9.7 9.5 7.9 9.5
Crisp 3: 12.5 14.7 12.9 11.8
Crisp 4: 14.0 15.6 14.1

(a) Write a test report using the p-value approach to conclude, at the 5% level of significance,
whether the differences in crunchiness between the four crisps are significant, based on the
following Excel output:

SUMMARY
Groups Count Sum Average Variance
Crisp 1: 5 63 12.6 0.26
Crisp 2: 7 65.1 9.3 0.8767
Crisp 3: 4 51.9 12.975 1.5292
Crisp 4: 3 43.7 14.5667 0.8033

ANOVA
Source of Variation SS df MS F P-value F crit
Between Groups 75.4227 3 25.1409 30.1832 0.0000 3.2874
Within Groups 12.4942 15 0.8329

Total 87.9168 18

(b) Report the sample mean crunchiness of the four crisps and the combined sample mean.


3. A joint project of the social welfare department and the hospital authority is investigating the
cognitive development of kids at different ages. 15 kids studying at K1, K2, and K3 have joined
the project, and the time each kid takes to finish a 50-piece puzzle is recorded. Below are the
finishing times (in minutes):

K1: 21.1, 17.8, 18.6, 20.8, 17.9, 19.0
K2: 18.0, 16.4, 15.7, 19.6, 16.5, 18.2
K3: 16.5, 17.8, 16.1

A test is conducted to verify whether the cognitive development of K1, K2, and K3 kids is
different by testing whether the mean finishing times of the three groups of kids are not all the same.
Below is the Excel output of an ANOVA test conducted at the 0.01 level of significance:

SUMMARY
Groups Count Sum Average Variance
K1: 6 115.2 19.2 2.044
K2: 6 104.4 17.4 2.108
K3: 3 50.4 16.8 0.79

ANOVA
Source of Variation SS df MS F P-value F crit
Between Groups 15.12 2 7.56 4.0609 0.0450 6.9266
Within Groups 22.34 12 1.8617

Total 37.46 14

(a) State the null hypothesis and the alternative hypothesis of the ANOVA test.
(b) Report the p-value of the test.
(c) Should the cognitive development of K1, K2, and K3 kids be concluded to be not all the same?
Give a reason for your answer.


4. To study the effectiveness of five different kinds of packaging, a processor of breakfast foods
obtained the following data on the numbers of sales on five different days:

Packaging I: 60, 52, 56, 52, 65
Packaging II: 54, 64, 66, 54, 57
Packaging III: 55, 66, 68, 57, 55
Packaging IV: 55, 56, 70, 58, 56
Packaging V: 71, 65, 60, 59, 62

Below is the output result of the ANOVA test, conducted by Excel at the 5% significance level:

SUMMARY
Groups Count Sum Average Variance
Packaging I 5 285 57 31
Packaging II 5 295 59 32
Packaging III 5 301 60.2 39.7
Packaging IV 5 295 59 39
Packaging V 5 317 63.4 23.3

ANOVA
Source of Variation SS df MS F P-value F crit
Between Groups 111.04 4 27.76 0.841212 0.515328 2.866081
Within Groups 660 20 33

Total 771.04 24

(a) State the null hypothesis and the alternative hypothesis of the ANOVA test.
(b) Report the critical value of the test.
(c) Report the F statistics of the test.
(d) Should the five different kinds of packaging be considered equally effective? Give a reason for
your answer.


Chapter 10 – Chi Square Test

(Remark: calculation of expected frequency: to be correct to at least 2 d.p.)

1. According to the genetic theory the number of colour-strains pink, white, blue in a certain flower
should appear in the ratio 3:2:5. For 100 plants randomly selected from the garden, the results
were as follows:

Colour             Pink   White   Blue   Total
Number of plants   24     14      62     100

Test, at the 1% significance level, whether the differences between the observed and expected
frequencies are significant.
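
The expected frequencies under the 3:2:5 ratio and the chi-square statistic can be worked out as follows. This is a minimal Python sketch with illustrative names; it computes the test statistic only, and the critical value still has to be read from a chi-square table with 2 degrees of freedom.

```python
# Observed counts and the theoretical ratio from the question
observed = {"Pink": 24, "White": 14, "Blue": 62}
ratio = {"Pink": 3, "White": 2, "Blue": 5}

n = sum(observed.values())
ratio_total = sum(ratio.values())

# Expected frequency for each colour under the 3:2:5 theory
expected = {colour: n * r / ratio_total for colour, r in ratio.items()}

# Chi-square statistic: sum of (O - E)^2 / E over all categories
chi_sq = sum((observed[c] - expected[c]) ** 2 / expected[c] for c in observed)

print(expected, f"chi-square = {chi_sq:.2f}, df = {len(observed) - 1}")
```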

2. An Italian gelato shop opens its first shop in Hong Kong and offers three flavors of ice-cream,
namely vanilla, mango and strawberry. According to the previous study conducted in Italy, 20%
of the customers choose vanilla, 40% of the customers choose mango, and 40% of the customers
choose strawberry. Below is the record of the choices of these three flavors of ice-cream by
customers in the Hong Kong shop on the opening day.

Flavor
Vanilla Mango Strawberry
250 500 450

Test, at the 0.05 level of significance, if there is a significant difference in the preference on the
3 ice-cream flavors between the customers in Hong Kong and those in Italy.

3. 400 students were randomly selected and asked who they will vote for as the chairman of the
coming student union. The results are given below:

Candidate   Mary   John   Peter   May
Votes       131    121    99      49

Can you argue that the four candidates command different levels of support? Test the hypothesis
at 5% level of significance.


4. The following sample data represents the quality of the shipments received by a large firm from
three different vendors:

           Number of            Number of imperfect but   Number of
           rejected shipments   acceptable shipments      perfect shipments   Total
Vendor A   15                   25                        90                  130
Vendor B   7                    18                        65                  90
Vendor C   22                   33                        125                 180
Total      44                   76                        280                 400

Vendor B charges much more for its shipments than the other two vendors. The senior
management wants to have a detailed report about the quality of the shipments received from
Vendor B. Referring to the number of shipments received from Vendor B in the above table, test,
at the 5% level of significance, whether the ratio of “rejected shipments : imperfect but acceptable
shipments : perfect shipments” equals “1 : 1 : 8”.


Chapter 8 to 10 – What test should be conducted?

Suggest the most suitable test (z-test, t-test, ANOVA, χ² test) for the following cases:

(a) Test whether the average spending on lunch differs between male and female students (all data
are collected from a survey).

(b) Test whether emotional problems differ between primary and secondary school kids by considering
the proportion of kids suffering from insomnia.

(c) Test whether the average daily incomes of Judy Restaurant on Monday, Tuesday, Wednesday, and
Thursday are all the same.

(d) Last year, the population mean working hours of a nurse were 8.5 hours a day, with a standard
deviation of 2.1 hours. Test whether the mean working hours have increased this year, assuming
that the standard deviation is unchanged.

(e) A report claims that 15%, 65%, and 20% of travellers arrive at Hong Kong International Airport by
taxi, bus, and Airport Express respectively. Test whether the report is correct.

(f) A researcher wants to test if there is any difference between the mean processing times when
customers pay by VISA, pay by Octopus card, and pay by cash.

(g) A researcher wants to test if more than 30% of the car accidents are due to drunk-driving.

(h) An education researcher wants to compare the teaching effectiveness of four different teaching
methods. Final year students are randomly assigned to one of the four groups. The marks
obtained by the students in the final examination would be used for the test.


Chapter 11 – Linear Regression and Correlation

(Remark: Correct to at least 4 d.p. in your calculation)

1. The following data have been collected regarding sales and advertising expenditure of six products:

Sales (dollars in millions), x   Advertising Expenditure (dollars in thousands), y
8.5 210
9.2 250
7.9 290
8.6 330
9.4 370
10.1 410

(a) Fit the regression equation, y = a + bx, for the above data.
(b) Interpret the value of a and b in the regression equation in part (a).
(c) Calculate the coefficient of correlation and comment on it.
(d) Determine the advertising expenditure and comment on the reliability of the estimation when the
sales are:
i. 6 million
ii. 9 million


2. The following data were obtained in a study conducted in a secondary school. The number of days
being late to school (X) and the examination scores in General Education (Y) of seven randomly
selected students were as follows:

Number of days being late to school, X        6  2  15 9  12 5  8
Examination scores in General Education, Y    82 86 43 74 58 90 78

(a) Present the relationship between the number of days being late to school and the examination
scores in General Education by calculating the coefficient of correlation of the above data and
comment on it.
(b) Present the relationship between the number of days being late to school and the examination
scores in General Education by estimating the values of a and b in the linear regression line,
y = a + bx.
(c) Explain the values of a and b of the regression line in part (b).
(d) Estimate the examination scores in General Education for a student who has been late to school
for 11 days and comment on the reliability of the estimation.

3. The following table shows the amount of water, in centimetres, applied to six similar plots on an
experimental farm. It also shows the yield of hay in tons per acre.

Amount of water (x) Yield of hay (y)
30 4.85
45 5.20
60 5.76
75 6.60
105 7.35
120 7.77

(a) Fit the regression equation, y = a + bx, for the above data.
(b) Interpret the value of a and b in the regression equation.
(c) Calculate the coefficient of correlation and comment on it.
(d) Determine the expected yield and comment on the reliability when the amount of water (in
centimeters) is:
i. 90
ii. 150
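
The least-squares estimates of a and b and the coefficient of correlation can be computed from the corrected sums of squares and cross-products. This is a minimal Python sketch for the water and hay data in this question, using only the standard library; the names s_xx, s_yy and s_xy are illustrative.

```python
from math import sqrt

water = [30, 45, 60, 75, 105, 120]            # x, centimetres
hay = [4.85, 5.20, 5.76, 6.60, 7.35, 7.77]    # y, tons per acre

n = len(water)
x_bar = sum(water) / n
y_bar = sum(hay) / n

# Corrected sums of squares and cross-products
s_xx = sum((x - x_bar) ** 2 for x in water)
s_yy = sum((y - y_bar) ** 2 for y in hay)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(water, hay))

b = s_xy / s_xx                  # slope
a = y_bar - b * x_bar            # intercept
r = s_xy / sqrt(s_xx * s_yy)     # coefficient of correlation

print(f"y = {a:.4f} + {b:.4f}x, r = {r:.4f}")
```

A prediction at x = 90 lies inside the observed range of 30 to 120, while x = 150 lies outside it, which is what the reliability comments in part (d) turn on.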

4. James is a PhD student. He is studying the characteristics of a list of companies that go public for
the first time. He is particularly interested in the relationship between the size of the offering and
the price per share. A sample of 10 companies that recently went public revealed the following
information.

Company   Size (in millions), x   Price per share (in dollars), y
1 9.0 10.8
2 94.4 11.3
3 27.3 11.2
4 179.2 11.1
5 71.9 11.1
6 97.9 11.2
7 93.5 11.0
8 70.0 10.7
9 160.7 11.3
10 96.5 10.6

(a) Calculate the correlation coefficient of the above data and comment on it.
(b) Fit the regression equation, y = a + bx for the above data.
(c) Interpret the value of a and b in the regression equation.
(d) Estimate the price per share of the company if the size of the offering is 10 million and comment
on its reliability.


5. A salesperson is reviewing customers’ profiles and wants to study whether there is any relationship
between a customer’s monthly income and the price of the car he or she purchases. Below is the
information of a random sample of 7 customers:

x: monthly income ($) y: price of the car ($)
68000 250000
59000 230000
75000 300000
57000 280000
66000 320000
80000 330000
48000 220000

(a) Present the relationship between the monthly income and the price of the car by calculating the
coefficient of correlation of the above data and comment on it.
(b) Present the relationship between the monthly income and the price of the car by estimating the
values of a and b in the linear regression line, y = a + bx.
(c) Explain the values of a and b of the regression line in part (b).
(d) Estimate the price of the car that a customer whose monthly income is $90000 would purchase,
and comment on the reliability of the estimation.

6. Two judges, Peter and Tom, rank the eight photographs in a competition as follows:
Photograph A B C D E F G H
Rank by Peter 2 5 3 6 1 4 7 8
Rank by Tom 4 3 2 6 1 5 8 7

(a) Calculate the coefficient of rank correlation for the data.
(b) Do the two judges have similar or different judging criteria? Explain your answer by using the
result obtained in part (a).
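
For ranks without ties, the coefficient of rank correlation can be computed from the squared rank differences. This is a minimal Python sketch assuming Spearman's formula r_s = 1 − 6Σd²/(n(n² − 1)); the variable names are illustrative.

```python
# Ranks given to photographs A to H by the two judges
peter = [2, 5, 3, 6, 1, 4, 7, 8]
tom = [4, 3, 2, 6, 1, 5, 8, 7]

n = len(peter)

# Sum of squared rank differences
sum_d_sq = sum((p - t) ** 2 for p, t in zip(peter, tom))

# Spearman's coefficient of rank correlation (valid when there are no tied ranks)
r_s = 1 - 6 * sum_d_sq / (n * (n ** 2 - 1))

print(f"sum of d^2 = {sum_d_sq}, r_s = {r_s:.4f}")
```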


Applied Statistics – Exercise Solution

Chapter 1 – Sampling Methods

1.
(a) To study employees’ opinions about the existing insurance policy.
(b) All employees working in this company.
(c) 14000
(d) Stratified sampling method
(e) Sample size for each country should be:
Singapore: 2000 × 140/14000 = 20
Korea: 3000 × 140/14000 = 30
China: 5000 × 140/14000 = 50
Malaysia: 1500 × 140/14000 = 15
Thailand: 2500 × 140/14000 = 25
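
The proportional allocation in part (e) can be sketched in a few lines of Python. The dictionary and variable names are illustrative; here every allocation happens to be a whole number, whereas in general the results would need rounding.

```python
# Number of employees in each country (the strata)
employees = {
    "Singapore": 2000, "Korea": 3000, "China": 5000,
    "Malaysia": 1500, "Thailand": 2500,
}
sample_size = 140
population = sum(employees.values())

# Each stratum's share of the sample equals its share of the population
allocation = {c: n * sample_size / population for c, n in employees.items()}
print(population, allocation)
```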

2.
(a) All full-time employed staff working in restaurants.
(b) Cluster sampling
The subject of the survey is individual full-time employed staff working in restaurants. Instead
of preparing a sampling frame which identifies all full-time employed staff working in restaurants,
a list of registered restaurants is prepared and samples of restaurant are selected. All full-time
employed staff working in the selected restaurants will be surveyed. This procedure is classified
as cluster sampling.


3.
(a) All (1500) supermarkets of this supermarket chain.
(b) Sample size for each district should be:
Central: 550 × 60/1500 = 22
South: 250 × 60/1500 = 10
North: 200 × 60/1500 = 8
East: 350 × 60/1500 = 14
West: 150 × 60/1500 = 6

(c) It is fast to generate the sample.

(d) For systematic sampling, we need to choose 1 supermarket from every k supermarkets, where
k = 1500/60 = 25.

As the first identity number is 0037, the next three identity numbers are:
0037 + 25 = 0062,
0062 + 25 = 0087,
0087 + 25 = 0112
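
The same systematic selection can be sketched in Python, assuming the identity numbers are zero-padded to four digits as in the solution; the variable names are illustrative.

```python
population_size = 1500
sample_size = 60
first_id = 37                          # randomly chosen starting identity number

k = population_size // sample_size     # sampling interval

# First four identity numbers in the systematic sample, zero-padded to 4 digits
ids = [f"{first_id + i * k:04d}" for i in range(4)]
print(k, ids)
```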

4.
(a) Systematic
(b) Cluster
(c) Simple random
(d) Cluster
(e) Systematic
(f) Stratified

5.
(a) Quantitative (discrete)
(b) Quantitative (discrete)
(c) Quantitative (continuous)
(d) Quantitative (continuous)
(e) Qualitative


6.
(a) Qualitative
(b) Quantitative (Continuous)
(c) Quantitative (Continuous)
(d) Quantitative (Discrete)
(e) Qualitative


Chapter 2 - Statistical Measures and Data Presentation

1. The ordered array: 39, 41, 45, 52, 53, 55, 58, 59, 60, 61, 62, 62, 63, 65, 65, 65, 69, 70, 72, 77
* Data must be rearranged as ordered array before finding any percentile.

(a) mean = 59.65 minutes (from calculator)
mode = 65 minutes
median = 61.5 minutes (n = 20, i = 20 × 50/100 = 10, median = (61 + 62)/2)
Q1 = 54 minutes (n = 20, i = 20 × 25/100 = 5, Q1 = (53 + 55)/2)
Q3 = 65 minutes (n = 20, i = 20 × 75/100 = 15, Q3 = (65 + 65)/2)
17th percentile = 52 minutes (n = 20, i = 20 × 17/100 = 3.4 ↑ 4, 17th percentile = 52)
87th percentile = 70 minutes (n = 20, i = 20 × 87/100 = 17.4 ↑ 18, 87th percentile = 70)

(b) sample standard deviation = 9.9328 minutes (from calculator)
sample variance = 9.9328² = 98.6605
(c) Left-skewed distribution as Q2 – Q1 (61.5 – 54 = 7.5) > Q3 – Q2 (65 – 61.5 = 3.5).
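
The percentile rule used in part (a) (compute i = n × p/100, average the i-th and (i + 1)-th values when i is a whole number, otherwise round i up) can be sketched as a small Python function. The function name is illustrative, and the sketch assumes 0 < p < 100 and data already sorted in ascending order.

```python
from math import ceil

def percentile(ordered, p):
    """p-th percentile of an ordered array using the index rule i = n * p / 100."""
    n = len(ordered)
    i = n * p / 100
    if i == int(i):
        # Whole-number index: average the i-th and (i + 1)-th values
        i = int(i)
        return (ordered[i - 1] + ordered[i]) / 2
    # Otherwise round the index up and take that value
    return ordered[ceil(i) - 1]

data = [39, 41, 45, 52, 53, 55, 58, 59, 60, 61,
        62, 62, 63, 65, 65, 65, 69, 70, 72, 77]

for p in (25, 50, 75, 17, 87):
    print(p, percentile(data, p))
```

This rule is the one used throughout these solutions; other software may use interpolation and give slightly different quartiles.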

2. The raw data: 95, 60, 120, 75, 60, 70, 40, 35, 115, 60
The ordered array: 35, 40, 60, 60, 60, 70, 75, 95, 115, 120

(a) $73 (from calculator)
(b) $60
(c) $65 (n = 10, i = 10 × 50/100 = 5, median = (60 + 70)/2)
(d) $60 (n = 10, i = 10 × 25/100 = 2.5 ↑ 3, Q1 = 60)
(e) $95 (n = 10, i = 10 × 75/100 = 7.5 ↑ 8, Q3 = 95)
(f) $85 (120 – 35 = 85)
(g) $35 (Q3 – Q1 = 95 – 60 = 35)
(h) $28.8868 (from calculator)
(i) It is a right-skewed distribution as Q2 – Q1 (65 – 60 = 5) < Q3 – Q2 (95 – 65 = 30).


3. The ordered array of the dataset:

327 591 662 730 768 820
873 951 1214 2460 4260 5293

(a) sample mean = $1579.1 (from calculator)
sample median = (820 + 873)/2 = $846.5 (n = 12, i = 12 × 0.5 = 6)
Q1 = (662 + 730)/2 = $696 (n = 12, i = 12 × 0.25 = 3)
Q3 = (1214 + 2460)/2 = $1837 (n = 12, i = 12 × 0.75 = 9)
10th percentile = $591 (n = 12, i = 12 × 0.1 = 1.2 ↑ 2)

(b) range = 5293 – 327 = $4966
IQR = 1837 – 696 = $1141
Sample standard deviation = $1598.9 (from calculator)
(c) Skewed to the right, as Q2 – Q1 = 150.5 < Q3 – Q2 = 990.5

4.
Company A Company B
Mean $29000 $61000
Median $28500 $62500
Q1 $26000 $53500
Q3 $31500 $67000
variance 13333333.33 51666666.67
standard deviation $3651.48 $7187.95


5.
(a) Yuen Long: sample mean = $45.5, sample standard deviation = $10.54
Causeway Bay: sample mean = $68.5, sample standard deviation = $17.86
(b) On the average workers working at Causeway Bay spend more money on lunch than those
working at Yuen Long. The deviation of spending among workers working at Causeway Bay is
larger than the deviation of the spending among workers working at Yuen Long.

Ordered array of the combined data:

33 35 38 40 43 48 48 50 53 54
55 58 62 63 74 80 82 88 89 93

(c) sample mean = $59.3,
sample standard deviation = $18.95
82nd percentile = $82 (i = 20(0.82) = 16.4 ↑ 17)
(d) First quartile = (43 + 48)/2 = $45.5 (i = 20(0.25) = 5)
Median = (54 + 55)/2 = $54.5 (i = 20(0.5) = 10)
Third quartile = (74 + 80)/2 = $77 (i = 20(0.75) = 15)
With Q2 – Q1 = $9 < Q3 – Q2 = $22.5, it is a right-skewed distribution.

6.
(a) Mean = $51000, sample standard deviation = $2943.9203
(b) (i) New salary: $53000, $60000, $56000, $55000
New mean = $56000, new sample standard deviation = $2943.9203
(ii) Use X to denote the original salary and Y to denote the salary after an increment of $5000,
Y = X + 5000,
Mean of Y = Mean of X + 5000 = 51000 + 5000 = $56000,
Sample standard deviation of Y = sample standard deviation of X = $2943.9203;
(c) (i) New salary: $57600, $66000, $61200, $60000
New mean = $61200, new sample standard deviation = $3532.7043
(ii) Use X to denote the original salary and W to denote the salary after 20% increment,
W = 1.2(X)
Mean of W = (1.2) (Mean of X) = 1.2(51000) = $61200,
Sample standard deviation of W = (1.2) (sample standard deviation of X)
= 1.2 (2943.9203) = $3532.7044
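
The two rules used in this question (adding a constant shifts the mean but leaves the standard deviation unchanged; multiplying by a constant scales both) can be checked numerically. This is a minimal Python sketch in which the original salaries are an assumption, recovered by subtracting $5000 from the new salaries listed in (b)(i).

```python
from statistics import mean, stdev

# Original salaries, assumed here by subtracting 5000 from the values in (b)(i)
salary = [48000, 55000, 51000, 50000]

plus_5000 = [x + 5000 for x in salary]    # Y = X + 5000
times_1_2 = [1.2 * x for x in salary]     # W = 1.2X

print(mean(salary), round(stdev(salary), 4))
print(mean(plus_5000), round(stdev(plus_5000), 4))              # mean shifts, spread unchanged
print(round(mean(times_1_2), 4), round(stdev(times_1_2), 4))    # both scale by 1.2
```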


7.
(a) Sample mean = (86 + 85 + 93 + 108 + ⋯ + 106)/16 = 96.125 kg
Sample standard deviation = √{[(86 − 96.125)² + (85 − 96.125)² + ⋯ + (106 − 96.125)²]/(16 − 1)} = 10.782 kg

(b) Ordered array:

84 84 85 86 87 91 91 92
93 98 99 103 106 108 113 118

Q1 = (86 + 87)/2 = 86.5 kg (i = 16(0.25) = 4)
Median = (92 + 93)/2 = 92.5 kg (i = 16(0.5) = 8)
Q3 = (103 + 106)/2 = 104.5 kg (i = 16(0.75) = 12)

(c) 10th percentile = 84 kg (i = 16(0.1) = 1.6 ↑ 2)
90th percentile = 113 kg (i = 16(0.9) = 14.4 ↑ 15)
(d) Use X to denote the body weight before joining the program, Y to denote the body weight after
joining the program,
Y = 0.9X
Sample mean of Y = 0.9(sample mean of X) = 0.9(96.125) = 86.513 kg
Sample standard deviation of Y = 0.9(sample standard deviation of X) = 0.9(10.782) = 9.704 kg
(e) Use W to denote the body weight after joining the program in lb,
W = 2.2Y
Sample mean of W = 2.2(sample mean of Y) = 2.2(86.513) = 190.329 lb
Sample standard deviation of W = 2.2(sample standard deviation of Y) = 2.2(9.704) = 21.349 lb


8.
(a) Sample mean = (16 + 25 + ⋯ + 260)/10 = US$ 141.1
k = 185
(b) Sample standard deviation = US$ 80.0617 (from calculator)
Sample variance = (80.0617)² = 6409.8778 (US$)²
(c) First quartile = US$ 75 (i = 10(0.25) = 2.5 ↑ 3)
Median = (150 + 185)/2 = US$ 167.5 (i = 10(0.5) = 5)
Third quartile = US$ 190 (i = 10(0.75) = 7.5 ↑ 8)
(d) Use X to denote the selling price of a headphone and Y to denote the payment for ordering the
headphone
(i) Y = 7.75X + 35
(ii) Sample mean of Y = 7.75 (sample mean of X) + 35 = 7.75(141.1) + 35 = HK$ 1128.53
Sample standard deviation of Y = 7.75 (sample standard deviation of X)
= 7.75(80.0617) = HK$ 620.4782

9.
(a) population mean
(b) population standard deviation
(c) population variance
(d) sample mean
(e) sample variance
(f) sample standard deviation
(g) interquartile range
(h) first quartile
(i) second quartile / median
(j) third quartile


Chapter 3 – Probability

1.
Number of trips 1 2 3 4 5 6
Probability 0.2087 0.4320 0.2718 0.0583 0.0194 0.0097

2.
(a) “Los Angeles” is the most popular choice of pizza among all customers. 35% (21/60 × 100%) of
customers like “Los Angeles” pizza.
(b) “Hawaiian” is the most popular choice of pizza among customers aged between 18 and 25.
7/22 of those customers aged between 18 and 25 like “Hawaiian” pizza.
(c) “Los Angeles” is the most popular choice of pizza among customers aged between 26 and 45.
The probability that a randomly selected customer aged between 26 and 45 likes “Los Angeles”
pizza = 10/20 = 0.5
(d) “Los Angeles” is the most popular choice of pizza among customers aged 46 or above.
44.44% (8/18 × 100%) of customers aged 46 or above like “Los Angeles” pizza.

3.
(a)
Whole Company: number of business trips taken in a month, X
x 0 1 2 3 4 5
Probability 0.3529 0.2721 0.2647 0.0588 0.0368 0.0147

(b)
Marketing department: number of business trips taken in a month, X
x 0 1 2 3 4 5
Probability 0.2439 0.3659 0.1951 0.0976 0.0488 0.0488

(c)
Finance department: number of business trips taken in a month, X
x 0 1 2 3 4 5
Probability 0.1860 0.2326 0.4186 0.0930 0.0698 0

(d)
Research department: number of business trips taken in a month, X
x 0 1 2 3 4 5
Probability 0.5769 0.2308 0.1923 0 0 0


4. Regenerate the contingency table from the given information:

                       Male   Female   Total
Full-time student      40     80       120
Full-time employment 180 170 350
Part-time employment 20 60 80
Self-employed 10 40 50
Total 250 350 600

(a) P(full-time student) = 120/600 = 0.2
P(full-time employment) = 350/600 = 0.5833
P(part-time employment) = 80/600 = 0.1333
P(self-employed) = 50/600 = 0.0833

(b) P(full-time student | male) = 40/250 = 0.16
P(full-time employment | male) = 180/250 = 0.72
P(part-time employment | male) = 20/250 = 0.08
P(self-employed | male) = 10/250 = 0.04

(c) P(full-time student | May is a female) = 80/350 = 0.2286
P(full-time employment | May is a female) = 170/350 = 0.4857
P(part-time employment | May is a female) = 60/350 = 0.1714
P(self-employed | May is a female) = 40/350 = 0.1143
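
The marginal and conditional probabilities above all come from dividing a cell or row total of the contingency table by the appropriate total. This is a minimal Python sketch with an illustrative dictionary layout.

```python
# Rows: employment status; values: (male, female) counts from the contingency table
table = {
    "Full-time student": (40, 80),
    "Full-time employment": (180, 170),
    "Part-time employment": (20, 60),
    "Self-employed": (10, 40),
}

total_male = sum(m for m, _ in table.values())
total_female = sum(f for _, f in table.values())
total = total_male + total_female

for status, (m, f) in table.items():
    marginal = (m + f) / total          # P(status)
    given_male = m / total_male         # P(status | male)
    given_female = f / total_female     # P(status | female)
    print(f"{status}: {marginal:.4f}, {given_male:.4f}, {given_female:.4f}")
```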


5.
How often do you have dinner with your family?   Elderly          Teenager          Total
Seldomly                                          210              240 – 210 = 30    240
Sometimes                                         180              350 – 180 = 170   350
Quite Often                                       60 – 40 = 20     40                60
Always                                            250 – 160 = 90   160               250
Total                                             500              400               900

(a) P(always have dinner with family) = 250/900 = 0.2778
(b) P(seldomly have dinner with family | elderly) = 210/500 = 0.42
(c) P(always have dinner with family | elderly) = 90/500 = 0.18
(d) P(teenager | seldomly have dinner with family) = 30/240 = 0.125
(e) P(elderly | always have dinner with family) = 90/250 = 0.36


Chapter 4 – Probability Distribution

1. The whole sample has 150 customers: 66 customers purchased 4 cans of cola, 60 customers
purchased 6 cans of cola, and 24 customers purchased 9 cans of cola.
P(purchasing 4 cans of cola) = 66/150 = 0.44
P(purchasing 6 cans of cola) = 60/150 = 0.4
P(purchasing 9 cans of cola) = 24/150 = 0.16

The probability distribution function of X (X: number of cans of cola purchased)

x 4 6 9
P(X = x) 0.44 0.4 0.16

2.
(a) As total probability = 1, k + k + 0.4 + 0.15 + 2k + 2k = 1, k = 0.075
(b) Most likely, a secondary school student attends 2 tutorial classes in a week, with the probability
of 0.4.
(c) P(X ≥ 3) = p(3) + p(4) + p(5) = 0.15 + 2(0.075) + 2(0.075) = 0.45
(d) P(X ≤ 4) = p(0) + p(1) + p(2) + p(3) + p(4) = (0.075) + (0.075) + 0.4 + 0.15 + 2(0.075) = 0.85

3.

(a) 4k + 10k + 3k + 2k + k = 1, k = 0.05
(b) Most likely, Alex can sell 1 notebook in a day. The probability of selling 1 notebook in a day
is 10(0.05) = 0.5.
(c) P(X > 3) = p(4)= 0.05

4.
(a) Most likely, there are 2 requests in a day, with the probability of 0.3.
(b) P(X < 2) = 0.20 + 0.05 = 0.25
(c) P(X > 2) = 0.25 + 0.15 + 0.05 = 0.45
(d) In order to have the probability of turning a request away to be less than 0.1, the center needs to
have the capacity to serve 4 customers, P(X > 4) = 0.05 < 0.1. So the capacity must be
increased by 2.


5.
(a) For male employees,
Expectation = E(X) = 0(0.3) + 1(0.29) + 2(0.24) + 3(0.12) + 4(0.03) + 5(0.02) = 1.35 days
Variance = Var(X) = 0²(0.3) + 1²(0.29) + 2²(0.24) + 3²(0.12) + 4²(0.03) + 5²(0.02) − 1.35²
= 1.4875 (days²)
Standard deviation = σ(X) = √Var(X) = 1.2196 days

(d) For female employees,
P(Y = 0) = 1 – 0.32 – 0.34 – 0.06 – 0.04 – 0.03 = 0.21
Expectation = E(Y) = 0(0.21) + 1(0.32) + 2(0.34) + 3(0.06) + 4(0.04) + 5(0.03) = 1.49 days
Variance = Var(Y) = 0²(0.21) + 1²(0.32) + 2²(0.34) + 3²(0.06) + 4²(0.04) + 5²(0.03) − 1.49²
= 1.3899 (days²)
Standard deviation = σ(Y) = √Var(Y) = 1.1789 days
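
The expectation, variance and standard deviation of a discrete distribution follow the same three steps in both cases, so they can be wrapped in one helper. This is a minimal Python sketch; the function name and dictionary layout are illustrative.

```python
from math import sqrt

def summarize(dist):
    """Return (mean, variance, standard deviation) of a discrete distribution."""
    mean = sum(x * p for x, p in dist.items())                    # E(X)
    var = sum(x ** 2 * p for x, p in dist.items()) - mean ** 2    # E(X^2) - E(X)^2
    return mean, var, sqrt(var)

# Probability distributions from the solution above (value: probability)
male = {0: 0.30, 1: 0.29, 2: 0.24, 3: 0.12, 4: 0.03, 5: 0.02}
female = {0: 0.21, 1: 0.32, 2: 0.34, 3: 0.06, 4: 0.04, 5: 0.03}

for label, dist in (("male", male), ("female", female)):
    m, v, s = summarize(dist)
    print(f"{label}: E = {m:.2f}, Var = {v:.4f}, SD = {s:.4f}")
```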

6.
(a) As each p(x) ≥ 0 and ∑ p(x) = 1,
∴ 0 ≤ m ≤ 0.3

(b) When m = 0.2,


E(X) = (-10000)(0.15) + (-2000)(0.1) + 1000(0.2) + 5600(0.25) +20000(0.2) + 25000(0.1)
= $6400
Var(X) = E(X²) - [E(X)]² = (-10000)²(0.15) + (-2000)²(0.1) + 1000²(0.2) + 5600²(0.25)
+ 20000²(0.2) + 25000²(0.1) - 6400² = 124980000 ($²)
σ(X) = √Var(X) = $11179.45

(c) Set E(X) = 6900


(-10000)(0.15) + (-2000)(0.1) + 1000(0.2) + 5600(0.25) + 20000(m) + 25000(0.3 - m) = 6900
m = 0.1

(d) The expected profit is maximum when m = 0,


E(X) = (-10000)(0.15) + (-2000)(0.1) + 1000(0.2) + 5600(0.25) + 25000(0.3) = $7400


7.
(a) k = 1 – 0.2 – 0.5 – 0.2 = 0.1
(b) E(X) = 8000(0.2) + 9000(0.5) + 12000(0.2) + 18000(0.1) = $10300
Var(X) = 8000²(0.2) + 9000²(0.5) + 12000²(0.2) + 18000²(0.1) - 10300² = 8410000 ($²)
(c) E(profit) = E(revenue – Cost) = E(revenue) – E(Cost) = 10300 – 10000 = $300

8.
(a) As total probability = 1, 2a + a + a + 3a + 8a = 1, a = 1/15
(b) E(X) = 0(2/15) + 1(1/15) + 2(1/15) + 3(3/15) + 4(8/15) = 2.9333 orders,
Var(X) = 0²(2/15) + 1²(1/15) + 2²(1/15) + 3²(3/15) + 4²(8/15) - 2.9333² = 2.0624
σ(X) = √2.0624 = 1.4361 orders
(c) Y = 150 + 80X,
E(Y) = 150 + 80E(X) = $384.664,
σ(Y) = 80σ(X) = $114.888

9. As total probability = 1, a + a + 3a + 3a + 2a + 2a = 1, a = 1/12

x          11     12     13     14     15     16
P(X = x)   1/12   1/12   3/12   3/12   2/12   2/12

(a) E(X) = 11(1/12) + 12(1/12) + 13(3/12) + 14(3/12) + 15(2/12) + 16(2/12) = 13.8333 customers

Var(X) = 11²(1/12) + 12²(1/12) + 13²(3/12) + 14²(3/12) + 15²(2/12) + 16²(2/12) - 13.8333² = 2.1398

σ(X) = √2.1398 = 1.4628 customers


(b) Denote the daily income as Y, then Y = 300 + 75X
E(Y) = 300 + 75E(X) = $1337.50
σ(Y) = 75σ(X) = $109.71
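
The rule E(a + bX) = a + bE(X) and σ(a + bX) = |b|σ(X) used in part (b) can be sketched in Python as follows. The variable names are illustrative assumptions. Exact fractions are carried through, so the last digits differ slightly from the hand calculation, which squares the rounded mean 13.8333: the exact variance is 77/36 ≈ 2.1389 and σ(Y) ≈ $109.69.

```python
from fractions import Fraction
from math import sqrt

# Distribution of X from Question 9, with a = 1/12
pmf = {
    11: Fraction(1, 12), 12: Fraction(1, 12), 13: Fraction(3, 12),
    14: Fraction(3, 12), 15: Fraction(2, 12), 16: Fraction(2, 12),
}

mean_x = sum(x * p for x, p in pmf.items())
var_x = sum(x * x * p for x, p in pmf.items()) - mean_x ** 2

# Daily income Y = 300 + 75X
a, b = 300, 75
mean_y = a + b * mean_x          # E(Y) = a + b E(X)
sd_y = abs(b) * sqrt(var_x)      # sigma(Y) = |b| sigma(X); the constant a has no effect

print(float(mean_x), float(var_x))
print(float(mean_y), round(sd_y, 2))
```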

10. With profit = revenue – cost


(a) Y = 10500X – (1020)(7.8)X - 520,
Y = 2544X - 520
(b) E(Y) = 2544E(X) - 520 = 2544(4.6) - 520 = $11182.4
σ(Y) = 2544σ(X) = 2544(1.1) = $2798.4


11.
(a) As total probability = 1, k + 2k + 4k + 2.5k + 0.5k = 1, 10k = 1, k = 0.1
Probability that there is no complaint is 0.1
(b) E(X) = 0(0.1) + 1(0.2) + 2(0.4) + 3(0.25) + 4(0.05) = 1.95 complaints
Var(X) = 0²(0.1) + 1²(0.2) + 2²(0.4) + 3²(0.25) + 4²(0.05) - 1.95² = 1.0475
σ(X) = √1.0475 = 1.0235 complaints
(c) Use Y to represent the number of hours needed for handling complaints in a day
Y = 1.5X
E(Y) = 1.5E(X) = 2.925 hours
σ(Y) = 1.5σ(X) = 1.5(1.0235) = 1.5353 hours
(d) P(Y > 5) = P(1.5X > 5) = P(X > 3.33) = P(X = 4) = 0.5(0.1) = 0.05

12. Use X to denote the number of regular body check-up, Y to denote the number of dental check-
up, and T to denote the total number of check-up.
T=X+Y
E(T) = E(X) + E(Y) = 1.7 + 1.05 = 2.75 times
Var(T) = Var(X) + Var(Y) = 0.82² + 0.42² = 0.8488
σ(T) = √0.8488 = 0.9213 times

13.
(a) E(X) = 1(0.02) + 2(0.19) + 3(0.34) + 4(0.35) + 5(0.07) + 6(0.02) + 7(0.01) = 3.36 days
Var(X) = 1²(0.02) + 2²(0.19) + 3²(0.34) + 4²(0.35) + 5²(0.07) + 6²(0.02) + 7²(0.01) - 3.36²
= 1.1104
σ(X) = 1.0538 days
(b) Use T to denote the total number of visits to the two theme parks
T=X+Y
E(T) = E(X) + E(Y) = 3.36 + 3.67 = 7.03 days
Var(T) = Var(X) + Var(Y) = 1.1104 + 1.28² = 2.7488
σ(T) = 1.6580 days


14.
(a) As total probability = 1,
0.1 + k + 3k + 4k + k + k = 1, k = 0.09
(b) E(X) = 3(0.1) + 4(0.09) + 5(0.27) + 6(0.36) + 7(0.09) + 8(0.09) = 5.52 jobs
Var(X) = 3²(0.1) + 4²(0.09) + 5²(0.27) + 6²(0.36) + 7²(0.09) + 8²(0.09) - 5.52² = 1.7496
σ(X) = √1.7496 = 1.3227 jobs
(c) P(at least $15000 commission) = P(X ≥ 6) = 6k = 0.54
(d) Let Y be the monthly commission, Y = 2500X
E(Y) = 2500E(X) = $13800
σ(Y) = 2500σ(X) = $3306.75
(e) With Y as the monthly commission from the advertising firm, W as the monthly earning from
the design school, and T as the total monthly income
T=Y+W
E(T) = E(Y) + E(W) = 13800 + 18000 = $31800
Var(T) = Var(Y) + Var(W) = 3306.75² + 5500² = 41184595.56
σ(T) = √41184595.56 = $6417.52

15.
(a) Use X to denote the number of referred cases in a week
Tim provides consultation to four new students in a week and the chance to refer a case is 0.3,
n = 4, p = 0.3, X ~ Bin(4, 0.3)
(b) With p(x) = 4Cx (0.3)ˣ(0.7)⁴⁻ˣ, for x = 0, 1, 2, 3, 4

x 0 1 2 3 4
p(x) 0.2401 0.4116 0.2646 0.0756 0.0081

(c) Most likely, 1 case will be referred to the senior social worker in a week, with probability =
0.4116
(d) E(X) = 0(0.2401) + 1(0.4116) + 2(0.2646) + 3(0.0756) + 4(0.0081) = 1.2 cases
Var(X) = 0²(0.2401) + 1²(0.4116) + 2²(0.2646) + 3²(0.0756) + 4²(0.0081) - 1.2² = 0.84
σ(X) = √0.84 = 0.9165 cases

Another way to find the expectation and standard deviation for Binomial variable is
E(X) = np = 4(0.3) = 1.2 cases
Var(X) = np(1 - p) = 4(0.3)(0.7) = 0.84
σ(X) = √(np(1 - p)) = √(4(0.3)(0.7)) = 0.9165 cases


16.
(a) Let X be the number of customers in the line who would choose type A coffee.
There are 12 customers in the line and the chance for each customer to choose type A coffee is
0.75,
n = 12, p = 0.75, X ~ Bin(12, 0.75).
(b) E(X) = 12(0.75) = 9 customers
(c) With E(X) = 9, most likely nine customers among 12 would choose type A coffee.
(d) P(X = 7) = 12C7 (0.75)⁷(0.25)⁵ = 0.1032
P(X = 8) = 12C8 (0.75)⁸(0.25)⁴ = 0.1936
P(X = 9) = 12C9 (0.75)⁹(0.25)³ = 0.2581
P(X = 10) = 12C10 (0.75)¹⁰(0.25)² = 0.2323
P(X = 11) = 12C11 (0.75)¹¹(0.25)¹ = 0.1267
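
The binomial probabilities in part (d) follow P(X = x) = nCx pˣ(1 - p)ⁿ⁻ˣ. A minimal Python sketch is given below; binom_pmf is a hypothetical helper name, and math.comb supplies nCx.

```python
from math import comb

def binom_pmf(n, p, x):
    """P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 12, 0.75
# Probabilities requested in Question 16(d)
table = {x: round(binom_pmf(n, p, x), 4) for x in range(7, 12)}
# E(X) computed from the full distribution; it should agree with np
mean = sum(x * binom_pmf(n, p, x) for x in range(n + 1))

print(table)
print(round(mean, 4))
```

Summing x·P(X = x) over all x reproduces the shortcut E(X) = np used in part (b).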

17.
(a) The variable is the number of habitual smokers in the sample.
Let X be the number of habitual smokers. There are 20 teenagers and the chance for each one
to be a habitual smoker is 0.1, X ~ Bin(20, 0.1)
P(X = 5) = 20C5 (0.1)⁵(0.9)¹⁵ = 0.0319
(b) 18 non-habitual smokers means 20 - 18 = 2 habitual smokers
P(X = 2) = 20C2 (0.1)²(0.9)¹⁸ = 0.2852
(c) P(X ≥ 3) = 1 - P(X = 0) - P(X = 1) - P(X = 2)
= 1 - (0.9)²⁰ - 20C1 (0.1)¹(0.9)¹⁹ - 20C2 (0.1)²(0.9)¹⁸ = 0.3231

18.
(a) The variable is the number of defective light bulbs in the sample.
Let X be the number of defective light bulbs. There are 20 light bulbs and the chance for each
light bulb to be defective is 0.03, X ~ Bin(20, 0.03)
E(X) = np = 20(0.03) = 0.6 defective light bulbs
As E(X) = 0.6 is not an integer, we compare P(X = 0) and P(X = 1) to find the mode.
P(X = 0) = (0.97)²⁰ = 0.5438
P(X = 1) = 20C1 (0.03)¹(0.97)¹⁹ = 0.3364
As P(X = 0) > P(X = 1), most likely, there is no defective light bulb in the sample.
(b) P(X = 2) = 20C2 (0.03)²(0.97)¹⁸ = 0.0988
(c) P(X ≥ 3) = 1 - P(X = 0) - P(X = 1) - P(X = 2)
= 1 - (0.97)²⁰ - 20C1 (0.03)¹(0.97)¹⁹ - 20C2 (0.03)²(0.97)¹⁸ = 0.0210


19.
(a) The variable is the number of visitors who can get a souvenir.
Let X be the number of visitors who can get a souvenir. There are 25 visitors in a day who play
the lucky draw and the chance for each visitor to get a souvenir is 0.7, X ~ Bin(25, 0.7)
E(X) = 25(0.7) = 17.5 visitors
σ(X) = √(25(0.7)(1 - 0.7)) = 2.2913 visitors
(b) As E(X) = 17.5 is not an integer, we compare P(X = 17) and P(X = 18) to find the mode.
P(X = 17) = 25C17 (0.7)¹⁷(0.3)⁸ = 0.1651
P(X = 18) = 25C18 (0.7)¹⁸(0.3)⁷ = 0.1712
As P(X = 17) < P(X = 18), most likely, 18 visitors can get a souvenir in a day.
(c) P(X > 22) = p(23) + p(24) + p(25) = 25C23 (0.7)²³(0.3)² + 25C24 (0.7)²⁴(0.3)¹ + (0.7)²⁵
= 0.0090
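
Comparing the two probabilities on either side of np, as done in part (b), can be cross-checked by scanning the whole distribution. The sketch below is illustrative only (binom_mode is a hypothetical name). It also evaluates the standard shortcut floor((n + 1)p), which gives the binomial mode whenever (n + 1)p is not an integer; that shortcut is not part of the solution above.

```python
from math import comb, floor

def binom_pmf(n, p, x):
    """P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def binom_mode(n, p):
    """Most likely value of X ~ Bin(n, p), found by scanning x = 0..n."""
    return max(range(n + 1), key=lambda x: binom_pmf(n, p, x))

n, p = 25, 0.7
mode = binom_mode(n, p)
shortcut = floor((n + 1) * p)   # valid when (n + 1)p is not an integer
# P(X > 22) from part (c)
upper_tail = sum(binom_pmf(n, p, x) for x in range(23, n + 1))

print(mode, shortcut, round(upper_tail, 4))
```

The same scan gives 0 for Bin(20, 0.03), 14 for Bin(22, 0.65) and 11 for Bin(15, 0.7), matching Questions 18, 21 and 22.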

20. The variable is the number of presents she will get.


Let X be the number of presents she will get. There are 3 rounds of lucky draw and the chance
for each round she will get a present is 0.15, X ~ Bin(3, 0.15)
(a)
x 0 1 2 3
P(X = x) (i) 0.6141 (ii) 0.3251 (iii) 0.0574 (iv) 0.0034

(i) P(X = 0) = (0.85)³ = 0.6141
(ii) P(X = 1) = 3C1 (0.15)¹(0.85)² = 0.3251
(iii) P(X = 2) = 3C2 (0.15)²(0.85)¹ = 0.0574
(iv) P(X = 3) = (0.15)³ = 0.0034
(b) E(X) = 3(0.15) = 0.45
Let Y be the value of all presents she will get, Y = 40X,
E(Y) = 40E(X) = $18


21. Let Y be the number of patients who can recover within one week, Y ~ Bin(22, 0.65).
(a) E(Y) = 22(0.65) = 14.3 patients
P(Y = 14) = 22C14 (0.65)¹⁴(0.35)⁸ = 0.1730
P(Y = 15) = 22C15 (0.65)¹⁵(0.35)⁷ = 0.1714
With E(Y) = 14.3, most likely about 14 or 15 patients can recover within one week.
According to the above calculation, most likely 14 patients can recover within one
week, with probability 0.1730.
(b) P(at most 20 recover) = P(Y ≤ 20) = 1 - P(Y = 21) - P(Y = 22)
= 1 - 22C21 (0.65)²¹(0.35)¹ - 22C22 (0.65)²²(0.35)⁰ = 0.9990
(c) Use W to denote the total allowance. With Y as the number of affected residents recover within
one week, there are the other (22 – Y) residents take more than one week to recover.
So that, W = 800Y + 1200(22 – Y) = 26400 – 400Y
E(W) = 26400 – 400E(Y) = 26400 – 400(14.3) = $20680

22.
(a) Use X to denote the number of graduates going to UK in a briefing session. As there are 15
graduates and the chance of each of them going to UK is 0.7, X ~ Bin(15, 0.7)
E(X) = 15(0.7) = 10.5 persons
σ(X) = √(15(0.7)(0.3)) = 1.7748 persons
(b) With E(X) = 10.5, most likely there would be 10 or 11 persons going to UK.
P(X = 10) = 15C10 (0.7)¹⁰(0.3)⁵ = 0.2061
P(X = 11) = 15C11 (0.7)¹¹(0.3)⁴ = 0.2186
So, most likely 11 graduates in a briefing session will go to UK, with probability 0.2186.
(c) P(at least one graduate going to USA and at least one graduate going to UK)
= P(1 ≤ X ≤ 14)
= 1 - P(X = 0) - P(X = 15)
= 1 - (0.3)¹⁵ - (0.7)¹⁵ = 0.9953
(d) Use Y to denote the service charge collected in a briefing session.
With X as the number of graduates would go to UK and the service charge for each of them is
$2200, the other (15 - X) graduates would go to USA and the service charge for each of them is
$1800,
Y = 2200X + 1800(15 – X) = 27000 + 400X
E(Y) = 27000 + 400E(X) = 27000 + 400(10.5) = $31200
σ(Y) = 400σ(X) = 400(1.7748) = $709.92


23.
(a) The variable is the number of customers who will order breakfast B. There are 6 customers
and the probability for each customer ordering breakfast B is 1 – 0.8 = 0.2.
n = 6, p = 0.2
(b) With E(X) = np = 6(0.2) = 1.2, most likely 1 or 2 customers will order breakfast B
P(X = 1) = 6C1 (0.2)¹(0.8)⁵ = 0.3932
P(X = 2) = 6C2 (0.2)²(0.8)⁴ = 0.2458
So, most likely, there will be 1 customer ordering breakfast B. This probability is 0.3932.
(c) P(X ≥ 2) = 1 - P(X = 0) - P(X = 1) = 1 - (0.8)⁶ - 6C1 (0.2)¹(0.8)⁵ = 0.3446
(d) With X be the number of customers ordering breakfast B, (6 – X) would be the number of
customers ordering breakfast A. Use W to denote the revenue, W = 45(6 – X) + 38X,
W = 270 – 7X
E(W) = 270 – 7E(X) = 270 – 7(1.2) = $261.6
σ(W) = 7σ(X) = 7√6(0.2)(0.8) = $6.8586


Chapter 5 – Normal Distribution

1.
(a) 0.4772
(b) P(Z < 0) + P(0 < Z < 1.86) = 0.5 + 0.4685 = 0.9685
(c) 0.0948
(d) P(-0.24 < Z < 0) + P(0 < Z < 2.40) = 0.0948 + 0.4918 = 0.5866
(e) P(-1.79 < Z < 0) – P(-1.30 < Z < 0) = 0.4633 – 0.4032 = 0.0601
(f) P(Z < 0) – P(-1.58 < Z < 0) = 0.5 – 0.4429 = 0.0571

2.
(a) 0.92
(b) 0.39 as P(0 < Z < 0.39) ~ 0.15 from table
(c) -0.93 as P(-0.93 < Z < 0) ~ 0.325 from table
(d) -1.04 as P(-1.04 < Z < 0) ~ 0.35 from table
(e) 0.39 as P(0 < Z < 0.39) ~ 0.15 from table
(f) 0.61 as P(0 < Z < 0.61) ~ 0.2284 from table

3. Let X be the length of time a patient waits in Dr. Chan’s waiting room.
X ~ N(14, 4²).
(a) P(X > 20) = P(Z > (20 - 14)/4) = P(Z > 1.5) = 0.5 - 0.4332 = 0.0668
(b) P(X > 10) = P(Z > (10 - 14)/4) = P(Z > -1) = 0.5 + 0.3413 = 0.8413
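
The table look-ups above can be cross-checked numerically. This is a minimal sketch using NormalDist from the Python standard library (available from Python 3.8); the variable names are illustrative assumptions.

```python
from statistics import NormalDist

# Waiting time in Question 3: X ~ N(14, 4^2)
wait = NormalDist(mu=14, sigma=4)

# cdf(x) returns P(X <= x), so an upper-tail probability is 1 - cdf(x)
p_more_than_20 = 1 - wait.cdf(20)
p_more_than_10 = 1 - wait.cdf(10)

print(round(p_more_than_20, 4), round(p_more_than_10, 4))
```

Because the software does not round z to two decimal places, its answers can differ from table-based answers in the third or fourth decimal place in other questions.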

4. Let X be the total weight of eight people. X ~ N(1200, (√9800)²).

(a) P(X > 1300) = P(Z > 1.01) = 0.5 - 0.3438 = 0.1562

(b) P(X > 1500) = P(Z > 3.03) = 0.5 - 0.4988 = 0.0012


5. Let X be the number of minutes a flight is delayed, X ~ N(0, 10²)

(a) For a flight that arrives before 18:00, that means X < -10,
P(X < -10) = P(Z < (-10 - 0)/10) = P(Z < -1) = 0.5 - 0.3413 = 0.1587

(b) For the passenger to be late, that means X > 20
P(X > 20) = P(Z > (20 - 0)/10) = P(Z > 2) = 0.5 - 0.4772 = 0.0228

6. Let X be the score, X ~ N(66.5, 12.6²)

(a) P(X > 78) = P(Z > (78 - 66.5)/12.6) = P(Z > 0.91) = 0.5 - 0.3186 = 0.1814 = 18.14%
(b) Let k be the minimum score to get A,
P(X > k) = 0.117
P(66.5 < X < k) = 0.5 - 0.117 = 0.383
As P(0 < Z < 1.19) = 0.383 from table
(k - 66.5)/12.6 = 1.19

k = 66.5 + 12.6(1.19) = 81.49 marks
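
Finding a cut-off score from a given tail probability is the inverse of the table look-up. A minimal Python sketch with the standard-library NormalDist is shown below; the names scores, p_above_78 and k are illustrative assumptions.

```python
from statistics import NormalDist

# Scores in Question 6: X ~ N(66.5, 12.6^2)
scores = NormalDist(mu=66.5, sigma=12.6)

# (a) Proportion scoring above 78
p_above_78 = 1 - scores.cdf(78)

# (b) P(X > k) = 0.117 is the same as P(X <= k) = 1 - 0.117
k = scores.inv_cdf(1 - 0.117)

print(round(p_above_78, 4), round(k, 2))
```

The printed values agree with the table-based answers up to the rounding of z to two decimal places.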

7. Let X be the amount a customer spends on a single visit to Park & Save supermarkets and K
be the minimum amount for which a credit card may be used. X ~ N(75, 21²).
P(X > K) = 0.8
P(K < X < 75) = 0.8 - 0.5 = 0.3
As P(-0.84 < Z < 0) = 0.3 from table
(K - 75)/21 = -0.84
K = 75 + (-0.84)(21) = 57.36


8. X ~ N(5200, 740²)
P(5200 - k < X < 5200 + k) = 0.95
P(5200 < X < 5200 + k) = 0.95/2 = 0.475

As P(0 < Z < 1.96) = 0.475 from table

(5200 + k - 5200)/740 = 1.96

k = 1.96(740) = 1450.4

9. Let X be the spending of an online order, X ~ N(360, 80²)

(a) P(X < 428) = P(Z < 0.85) = 0.5 + 0.3023 = 0.8023
(b) P(360 - M < X < 360 + M) = 0.85
P(360 < X < 360 + M) = 0.85/2 = 0.425

As P(0 < Z < 1.44) = 0.425 from table

(360 + M - 360)/80 = 1.44

M = 1.44(80) = 115.2

10. Let X (in minutes) be the journey time on a ride, X ~ N(30, 5²)
(a) P(X > 20) = P(Z > (20 - 30)/5) = P(Z > -2) = 0.5 + 0.4772 = 0.9772

(b) P(X < k) = 0.937
P(30 < X < k) = 0.937 - 0.5 = 0.437
As P(0 < Z < 1.53) = 0.437 from table
(k - 30)/5 = 1.53,

k = 30 + 1.53(5) = 37.65


11. Let X be the monthly salary of an employee, X ~ N(12000, 1000²)

Let Y be the monthly salary of an employee after the salary adjustment.
With Y = (1.05)X
(a) Mean of Y = 1.05(12000) = $12600
Standard deviation of Y = 1.05(1000) = $1050
(b) P(Y < k) = 0.15
P(k < Y < 12600) = 0.5 - 0.15 = 0.35
As P(-1.04 < Z < 0) = 0.35 (from table)
(k - 12600)/1050 = -1.04

k = 12600 - (1.04)(1050) = 11508

12. Let X be the monthly revenue of the café, X ~ N(45000, 8000²)

(a) P(35000 < X < 41800) = P((35000 - 45000)/8000 < Z < (41800 - 45000)/8000) = P(-1.25 < Z < -0.4)
= 0.3944 - 0.1554 = 0.239

(b) P(X < K) = 0.67,
P(45000 < X < K) = 0.67 - 0.5 = 0.17,
As P(0 < Z < 0.44) = 0.17 (from table)
(K - 45000)/8000 = 0.44,

K = 48520
(c) Use M to denote May’s monthly salary, M = 7000 + 0.3X
E(M) = 7000 + (0.3)E(X) = 20500
Var(M) = 0.3² Var(X) = 5760000
(d) P(M < 17000) = P(Z < (17000 - 20500)/√5760000) = P(Z < -1.46) = 0.5 - 0.4279 = 0.0721


13. Let X be the delivery cost of a small package, X ~ N(60, 12²)

(a) P(X ≥ 40) = P(Z > (40 - 60)/12) = P(Z > -1.67) = 0.5 + 0.4525 = 0.9525

(b) P(X > M) = 0.879
P(M < X < 60) = 0.879 - 0.5 = 0.379
P(-1.17 < Z < 0) = 0.379 from table
M = 60 + 12(-1.17) = 45.96
(c) Y = 200 + 1.8X
Mean of Y = 200 + 1.8(60) = $308
Standard deviation of Y = 1.8(12) = $21.6
(d) P(A package will be charged more by the new system)
= P(Y < 350) = P(Z < (350 - 308)/21.6) = P(Z < 1.94) = 0.5 + 0.4738 = 0.9738

14. Let X be the monthly salary, X ~ N(15000, 500²)

(a) P(X > 16000) = P(Z > (16000 - 15000)/500) = P(Z > 2) = 0.5 - 0.4772 = 0.0228

(b) P(X < t) = 0.85
P(15000 < X < t) = 0.85 - 0.5 = 0.35
P(0 < Z < 1.04) = 0.35 (from table)
(t - 15000)/500 = 1.04

t = 15520
(c) Let M be the adjusted salary, M = 1.1X
E(M) = (1.1)E(X) = 16500
σ(M) = (1.1)σ(X) = 550
(d) P(M > 17000) = P(Z > (17000 - 16500)/550) = P(Z > 0.91) = 0.5 - 0.3186 = 0.1814

So the statement is incorrect.


15. Use X to denote the budget a customer would be willing to spend on a 5-day tour to Korea,
X ~ N(8000, 1200²)
(a) P(X ≥ 7000) = P(Z ≥ (7000 - 8000)/1200) = P(Z ≥ -0.83) = 0.5 + 0.2967 = 0.7967

(b) P(X ≥ K) = 0.85
P(K < X < 8000) = 0.85 - 0.5 = 0.35
P(-1.04 < Z < 0) = 0.35 from table
(K - 8000)/1200 = -1.04, K = 6752

(c) Use Y to denote the budget a customer would be willing to spend on a 5-day tour to Japan,
Y = 1.2X
E(Y) = 1.2(8000) = $9,600
σ(Y) = 1.2(1200) = $1440
(d) P(L1 < Y < L2) = 0.92
P(L1 < Y < 9600) = 0.92/2 = 0.46 = P(9600 < Y < L2)

P(-1.75 < Z < 0) = 0.46 = P(0 < Z < 1.75) from table
L1 = 9600 - 1.75(1440) = 7,080
L2 = 9600 + 1.75(1440) = 12,120

16. Use X to denote the monthly income earned by selling wallet, Y to denote the monthly income
earned by selling earring, T to denote the total monthly income.
T=X+Y
(a) E(T) = E(X) + E(Y) = 20000 + 15000 = $35000
Var(T) = Var(X) + Var(Y) = 5000² + 3300² = 35890000
σ(T) = √35890000 = $5990.83
(b) P(T < 30000) = P(Z < (30000 - 35000)/5990.83) = P(Z < -0.83) = 0.5 - 0.2967 = 0.2033

With the probability of earning less than $30000 in a month is 0.2033, which is less than 0.3,
May should not quit the business.


17. Let X be the lifetime of a battery, X ~ N(5400, 40²)

Let T be the total lifetime of 2 batteries,
T = X1 + X2, T ~ N(5400 + 5400, 40² + 40²)
(a) Mean of T = 10800 hours
Standard deviation of T = √(40² + 40²) = 56.5685 hours
(b) P(T < 10900) = P(Z < (10900 - 10800)/56.5685) = P(Z < 1.77) = 0.5 + 0.4616 = 0.9616

18. Let X be the waiting time and Y be the treatment time,

X ~ N(10, 4²)
Y ~ N(55, 5²)
(a) P(X < 5) = P(Z < (5 - 10)/4) = P(Z < -1.25) = 0.5 - 0.3944 = 0.1056

(b) T = X + Y,
T ~ N(10 + 55, 4² + 5²), T ~ N(65, 41), i.e. T ~ N(65, (√41)²)

(c) P(T < 60) = P(Z < (60 - 65)/√41) = P(Z < -0.78) = 0.5 - 0.2823 = 0.2177

19. Use X to denote time he spends on collecting post and Y to denote time he spends on delivering
post,
X ~ N(40, 12²)
Y ~ N(65, 8²)
Let T be the total time he spends on two jobs,
E(T) = 40 + 65 = 105, Var(T) = 12² + 8² = 208, i.e. T ~ N(105, (√208)²)

(a) P(T < 120) = P(Z < (120 - 105)/√208) = P(Z < 1.04) = 0.5 + 0.3508 = 0.8508

(b) P(T > M) = 0.1
P(105 < T < M) = 0.5 - 0.1 = 0.4
As P(0 < Z < 1.28) = 0.4 from table
M = 105 + 1.28(√208) = 123.46
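
For a sum of two independent normal variables the means add and the variances add. The sketch below illustrates this with the standard-library NormalDist, whose + operator treats the two operands as independent normal variables; the variable names are illustrative assumptions, and independence of the two job times is assumed as in the solution.

```python
from statistics import NormalDist

collect = NormalDist(mu=40, sigma=12)   # time spent collecting post
deliver = NormalDist(mu=65, sigma=8)    # time spent delivering post

# Adding two NormalDist objects assumes independence:
# the mean is 40 + 65 and the variance is 12^2 + 8^2
total = collect + deliver

p_under_120 = total.cdf(120)     # part (a)
m = total.inv_cdf(1 - 0.1)       # part (b): P(T > M) = 0.1

print(total.mean, round(total.variance, 4))
print(round(p_under_120, 4), round(m, 2))
```

The value of M printed here is slightly larger than 123.46 because the table value 1.28 is a rounded z-score.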


20. Use X to denote the total traveling time, X ~ N(15, 2²)

(a) P(X > 12) = P(Z > -1.5) = 0.5 + 0.4332 = 0.9332
(b) P(X < k) = 0.625
P(15 < X < k) = 0.625 - 0.5 = 0.125
With P(0 < Z < 0.32) = 0.125 from table
k = 15 + 2(0.32) = 15.64
(c) E(T) = 15 + 22 = 37 minutes
Var(T) = 2² + 5² = 29
σ(T) = √29 = 5.3852 minutes
(d) P(T > 35) = P(Z > (35 - 37)/√29) = P(Z > -0.37) = 0.1443 + 0.5 = 0.6443

21. Let T be the weight of a special set, T ~ N(85 + 85, 49 + 49); T ~ N(170, 98)
(a) expectation = 85 + 85 = 170 grams
median = expectation = 170 grams
standard deviation = √(49 + 49) = 9.8995 grams
(b) P(T < 160) = P(Z < (160 - 170)/√98) = P(Z < -1.01) = 0.5 - 0.3438 = 0.1562


Chapter 6 – Sampling Distribution and Central Limit Theorem

1. For X ~ N(60, 4²), X̄ ~ N(60, 4²/n)

(a) Mean of sample mean E(X̄) = 60
Standard error of sample mean SE(X̄) = 4/√5 = 1.7889

(b) Mean of sample mean E(X̄) = 60
Standard error of sample mean SE(X̄) = 4/√10 = 1.2649

(c) Mean of sample mean E(X̄) = 60
Standard error of sample mean SE(X̄) = 4/√15 = 1.0328

2. Let X be the weight of a can of soup and X̄ be the sample mean weight of 6 cans of soup
As μ = 375, σ(X) = 4, and n = 6, then
(a) Mean of sample mean E(X̄) = 375 grams
(b) Standard error of sample mean SE(X̄) = 4/√6 = 1.6330 grams

3. Let X be the weight of a luggage, X ~ N(24, 5²),

For n = 5, the average weight of 5 luggage, X̄ ~ N(24, 5²/5)

(a) expectation = 24 kg,

(b) standard error = 5/√5 = 2.2361 kg

4. Let X be the lifetime, X ~ N(8200, 50²)

For n = 40, the average lifetime of 40 batteries, X̄ ~ N(8200, 50²/40)

(a) expectation = 8200 hours,

(b) standard error = 50/√40 = 7.9057 hours


5. Let p be the population proportion of defective item, p = 0.09,

For n = 80, p̂ ~ N(0.09, 0.09(0.91)/80)

(a) expectation E(p̂) = 0.09

(b) standard error SE(p̂) = √(0.09(0.91)/80) = 0.0320

6. Let p be the proportion of left-handed resident, p = 0.2

For n = 75, p̂ ~ N(0.2, 0.2(0.8)/75)

(a) expectation E(p̂) = 0.2

(b) standard error SE(p̂) = √(0.2(0.8)/75) = 0.0462

7. Let p be the proportion of customers who would not pay the bill by monthly instalment if the
credit amount is less than $10,000, p = 0.7
For n = 40, p̂ ~ N(0.7, 0.7(0.3)/40)

(a) expectation of sample proportions E(p̂) = 0.7

(b) standard error of sample proportions SE(p̂) = √(0.7(0.3)/40) = 0.0725

8. Let p be the proportion of passengers who refuse the invitation, p = 0.87

For n = 350, p̂ ~ N(0.87, 0.87(0.13)/350)

(a) expectation E(p̂) = 0.87

(b) standard error SE(p̂) = √(0.87(0.13)/350) = 0.01798
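
The two standard-error formulas used throughout this chapter, σ/√n for a sample mean and √(p(1 - p)/n) for a sample proportion, can be sketched as two small Python functions. The function names are illustrative assumptions.

```python
from math import sqrt

def se_mean(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)

def se_proportion(p, n):
    """Standard error of the sample proportion: sqrt(p(1 - p) / n)."""
    return sqrt(p * (1 - p) / n)

# Question 1(a), Question 4(b), Question 5(b), Question 8(b)
print(round(se_mean(4, 5), 4))
print(round(se_mean(50, 40), 4))
print(round(se_proportion(0.09, 80), 4))
print(round(se_proportion(0.87, 350), 5))
```

Both functions show the same pattern: the standard error shrinks in proportion to 1/√n, so quadrupling the sample size halves it.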


Chapter 7 – Estimation

1. For the estimation of the population mean amount spent for textbook
Given σ = 35, n = 75, x̄ = $158.3, z0.05 = 1.645
(a) point estimate: $158.30
(b) Sampling error at 90% confidence level = 1.645 × 35/√75 = $6.6482

(c) 90% C.I. = (158.3 - 1.645 × 35/√75, 158.3 + 1.645 × 35/√75) = $(151.6518, 164.9482)
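
A confidence interval for a mean with known σ is x̄ ± z·σ/√n. The following Python sketch reproduces Question 1; z_interval is a hypothetical helper name, and the critical value is obtained from the standard-library NormalDist instead of the table, so the last digits can differ slightly from answers based on z = 1.645.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence):
    """Return (lower, upper, margin) of a z-interval for the population mean."""
    # Two-sided critical value, e.g. about 1.645 for 90% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin, margin

lower, upper, margin = z_interval(xbar=158.3, sigma=35, n=75, confidence=0.90)
print(round(margin, 4), round(lower, 4), round(upper, 4))
```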

2. For the estimation of the population mean lifetime of a light bulb


Given σ = 100, n = 80, x̄ = 3500 hours, z0.025 = 1.96
(a) Point estimate: 3500 hours
(b) Sampling error at 95% confidence level = 1.96 × 100/√80 = 21.9135 hours

(c) 95% C.I. = (3500 - 1.96 × 100/√80, 3500 + 1.96 × 100/√80) = (3478.0865, 3521.9135) hours

3. For the estimation of the population mean amount of paint in a one-gallon can
Given σ = 0.02, n = 50, x̄ = 0.995 gallon, z0.005 = 2.575
Point estimate: 0.995 gallon
Sampling error at 99% confidence level = 2.575 × 0.02/√50 = 0.0073 gallon

99% C.I. = (0.995 - 2.575 × 0.02/√50, 0.995 + 2.575 × 0.02/√50) = (0.9877, 1.0023) gallon

4. For the estimation of the population mean time required for one seminar
Given x̄ = 75, σ = 10, n = 40, z0.025 = 1.96
point estimate = 75 minutes
sampling error at 95% confidence level = 1.96 × 10/√40 = 3.099 minutes

95% C.I. for μ = (75 - 3.099, 75 + 3.099) = (71.901, 78.099) minutes


5. For the estimation of the population mean spending


Given σ = 5500, n = 25, z0.01 = 2.33
(a) 98% sampling error = 2.33 × 5500/√25 = $2563

(b) With x̄ = $51000 (from calculator)
98% C.I. = (51000 - 2563, 51000 + 2563) = $(48437, 53563)

6.
(a) x̄ = (83 + 58 + ⋯ + 108)/13 = 75.6923 hours

s = √([(83 - 75.6923)² + (58 - 75.6923)² + ⋯ + (108 - 75.6923)²]/(13 - 1)) = 14.5395 hours

(b) For the estimation of the population mean monthly working hours of part-time worker
x̄ = 75.6923, s = 14.5395, n = 13, t(12, 0.025) = 2.179,
95% C.I. = (75.6923 - 2.179 × 14.5395/√13, 75.6923 + 2.179 × 14.5395/√13)

= (66.9054, 84.4792) hours

7. For the estimation of the population mean number of hours of television watched per week
x̄ = (82 + 66 + ⋯ + 91)/10 = 86 hours

s = √([(82 - 86)² + (66 - 86)² + ⋯ + (91 - 86)²]/(10 - 1)) = 11.8415 hours

n = 10, t(9, 0.05) = 1.833

90% C.I. = (86 - 1.833 × 11.8415/√10, 86 + 1.833 × 11.8415/√10) = (79.1361, 92.8639) hours

8. For the estimation of the population mean weight of a new born baby
Given x̄ = 6.87 lb, s = 1.76 lb, n = 20, t(19, 0.025) = 2.093
95% C.I. = (6.87 - 2.093 × 1.76/√20, 6.87 + 2.093 × 1.76/√20) = (6.0463, 7.6937) lb


9. For the estimation of the population mean value of a greeting card


Given x̄ = 16.7, s = 3.2, n = 40, t(39, 0.025) = 1.96
95% C.I. = (16.7 - 1.96 × 3.2/√40, 16.7 + 1.96 × 3.2/√40) = $(15.7083, 17.6917)

10. For the estimation of the population mean length of a movie

x̄ = (71 + 91 + ⋯ + 69)/12 = 78.4167 minutes
Given that σ = 8, n = 12, z0.025 = 1.96
95% C.I. for μ = (78.4167 - 1.96 × 8/√12, 78.4167 + 1.96 × 8/√12)

= (73.8903, 82.9431) minutes

11. (a) For the estimation of the population mean weight of cola in a bottle,
with x̄ = 16.0308, s = 0.0880, n = 25, t(24, 0.01) = 2.492
98% CI = (16.0308 - 2.492 × 0.0880/√25, 16.0308 + 2.492 × 0.0880/√25)

= (15.9869, 16.0747) grams

(b) About 300(0.98) = 294 intervals can successfully cover the population mean.

12. For the estimation of the population proportion of people driving Benz in the building
(a) point estimate of p = 17/200 = 0.085

(b) sampling error at 90% confidence level = 1.645√(0.085(0.915)/200) = 0.0324

(c) 90% C.I. of p = (0.085 - 1.645√(0.085(0.915)/200), 0.085 + 1.645√(0.085(0.915)/200))

= (0.0526, 0.1174)
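
The interval for a proportion follows the same pattern, p̂ ± z√(p̂(1 - p̂)/n). A minimal Python sketch for Question 12 is given below; proportion_interval is a hypothetical helper name and the critical value comes from the standard-library NormalDist rather than the table.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, confidence):
    """Return (p_hat, lower, upper) of a large-sample z-interval for a proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, p_hat - margin, p_hat + margin

p_hat, lower, upper = proportion_interval(successes=17, n=200, confidence=0.90)
print(round(p_hat, 3), round(lower, 4), round(upper, 4))
```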


13. For the estimation of the population proportion of people driving Benz in the building
(a) point estimate of p = 290/380 = 0.7632

(b) sampling error at 95% confidence level = 1.96√(0.7632(0.2368)/380) = 0.0427

(c) 95% C.I. of p = (0.7632 - 1.96√(0.7632(0.2368)/380), 0.7632 + 1.96√(0.7632(0.2368)/380))

= (0.7205, 0.8059)

14. For the estimation of the population proportion of residents supporting Mike as the next president
n = 60, p̂ = 22/60 = 0.3667, z0.05 = 1.645

90% C.I. of p = (0.3667 - 1.645√(0.3667(0.6333)/60), 0.3667 + 1.645√(0.3667(0.6333)/60))

= (0.2644, 0.4690)

15.
(a) For the estimation of the population mean weight of a melon
n = 120, x̄ = 4, σ = 0.9, z0.025 = 1.96,
95% CI for μ = (4 - 1.96 × 0.9/√120, 4 + 1.96 × 0.9/√120) = (3.8390, 4.1610) kg

(b) For the estimation of the population proportion of melons which weigh heavier than 4.2 kg
n = 200, p̂ = 30/200 = 0.15, z0.05 = 1.645

90% CI for p = (0.15 - 1.645√(0.15(0.85)/200), 0.15 + 1.645√(0.15(0.85)/200)) = (0.1085, 0.1915)


16.
(a) For the estimation of the population proportion of customers who are satisfied with the service
p̂ = 168/300 = 0.56, n = 300, z0.01 = 2.33
98% C.I. for p = (0.56 - 2.33√(0.56(1 - 0.56)/300), 0.56 + 2.33√(0.56(1 - 0.56)/300))
= (0.4932, 0.6268)
(b) For the estimation of the population mean spending on one visit to the shop
x̄ = 820, s = 165, n = 300, d.f. = 299, t(299, 0.01) = 2.326
98% C.I. for μ = (820 - 2.326 × 165/√300, 820 + 2.326 × 165/√300) = $(797.84, 842.16)

17.
(ai) For the estimation of population mean score
x̄ = 116.9, s = 21.6972, n = 10, d.f. = 10 - 1 = 9, t(9, 0.025) = 2.262
95% CI for μ = (116.9 - 2.262 × 21.6972/√10, 116.9 + 2.262 × 21.6972/√10)

= (101.3798, 132.4202)
(aii) About 200(95%) = 190 intervals
(b) For the estimation of population proportion of participants who enjoy the game.
p̂ = 360/500 = 0.72, n = 500, z0.05 = 1.645

90% CI for p = (0.72 - 1.645√(0.72(0.28)/500), 0.72 + 1.645√(0.72(0.28)/500))

= (0.6870, 0.7530)

18.
(a) For the estimation of population average monthly spending with credit card
x̄ = 4600, s = 1400, n = 45, d.f. = 45 - 1 = 44, t(44, 0.025) = 1.96
95% C.I. for μ = (4600 - 1.96 × 1400/√45, 4600 + 1.96 × 1400/√45) = $(4190.95, 5009.05)

(b) For the estimation of population proportion of students who always pay the full payment
before deadline
p̂ = 30/45 = 0.6667, n = 45, z0.05 = 1.645

90% C.I. for p = (0.6667 - 1.645√(0.6667 × 0.3333/45), 0.6667 + 1.645√(0.6667 × 0.3333/45))

= (0.5511, 0.7823)


Chapter 8 – Hypothesis Testing

1. For the test of whether the population mean weight of a package is less than 36.7 lb
(i) H0: μ = 36.7, H1: μ < 36.7
(ii) As σ is given as 14.2, a z-test should be used.
Reject H0 when z < -1.645 (z0.05 = 1.645)
(iii) x̄ = 32.1, σ = 14.2, n = 64
z = (32.1 - 36.7)/(14.2/√64) = -2.59
(iv) As -2.59 < -1.645 (z0.05 = 1.645), H0 is rejected.
Conclusion: There is sufficient evidence to conclude that the weights of packages are less than in
the past.
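
The four steps of the z-test (hypotheses, rejection region, test statistic, decision) can be sketched in Python as below. The function z_test_mean and its tail argument are hypothetical names introduced for illustration; critical values are taken from the standard-library NormalDist rather than the table.

```python
from math import sqrt
from statistics import NormalDist

def z_test_mean(xbar, mu0, sigma, n, alpha, tail):
    """Return (z, reject) for a one-sample z-test; tail is 'left', 'right' or 'two'."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()
    if tail == "left":
        reject = z < std_normal.inv_cdf(alpha)
    elif tail == "right":
        reject = z > std_normal.inv_cdf(1 - alpha)
    elif tail == "two":
        reject = abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    else:
        raise ValueError("tail must be 'left', 'right' or 'two'")
    return z, reject

# Question 1: H0: mu = 36.7 against H1: mu < 36.7 at the 5% level
z1, reject1 = z_test_mean(32.1, 36.7, 14.2, 64, alpha=0.05, tail="left")
print(round(z1, 2), reject1)
```

Calling the same function with tail="two" or tail="right" covers the two-sided and upper-tailed z-tests in the questions that follow.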

2. For the test of whether the population mean breaking strength is different from 70 pounds
(i) H0: μ = 70, H1: μ ≠ 70
(ii) As σ is given as 3.5, a z-test should be used.
Reject H0 when z < -1.96 or z > 1.96 (z0.025 = 1.96)
(iii) x̄ = 69.1, σ = 3.5, n = 49
z = (69.1 - 70)/(3.5/√49) = -1.8
(iv) As -1.96 < -1.8 < 1.96, H0 is not rejected.
Conclusion: There is insufficient evidence to say the mean breaking strength is different from 70 pounds.

3. For the test of whether the population mean amount of salad dressing is different from 8 ounces
(i) H0: μ = 8, H1: μ ≠ 8
(ii) As σ is given as 0.15, a z-test should be used.
Reject H0 when z < -1.645 or z > 1.645 (z0.05 = 1.645)
(iii) x̄ = 7.983, σ = 0.15, n = 50
z = (7.983 - 8)/(0.15/√50) = -0.8014
(iv) As -1.645 < -0.8014 < 1.645, H0 is not rejected.
Conclusion: There is insufficient evidence to say the average amount is different from 8 ounces.


4. For the test of whether the population mean withdrawal is more than $1600
(i) H0: μ = 1600, H1: μ > 1600
(ii) As σ is given as 300, a z-test should be used.
Reject H0 when z > 1.645 (z0.05 = 1.645)
(iii) x̄ = 1680, σ = 300, n = 36
z = (1680 - 1600)/(300/√36) = 1.6

(iv) As 1.6 < 1.645 (z0.05 = 1.645), H0 is not rejected.

Conclusion: There is insufficient evidence to say that the average withdrawal is greater than the
expectation.

5. For the test of whether the population mean cost of textbooks is above $3000
(i) H0: μ = 3000, H1: μ > 3000
(ii) As σ is unknown, a t-test should be used with d.f. = 100 - 1 = 99
Reject H0 when t > 1.645 (t(∞, 0.05) = 1.645)
(iii) x̄ = 3154, s = 432, n = 100
t = (3154 - 3000)/(432/√100) = 3.5648

(iv) As 3.5648 > 1.645, H0 is rejected at 5% significance level.

Conclusion: There is sufficient evidence to conclude the population mean is above $3000.

6. For the test of whether the population mean lifetime of a battery is different from 400 hours
(i) H0: μ = 400, H1: μ ≠ 400
(ii) As σ is unknown, a t-test should be used with d.f. = 13 - 1 = 12
Reject H0 when t < -2.179 or t > 2.179 (t(12, 0.025) = 2.179)
(iii) x̄ = 473.4615, s = 210.7663, n = 13
t = (473.4615 - 400)/(210.7663/√13) = 1.2567
(iv) As -2.179 < 1.2567 < 2.179 (t(12, 0.025) = 2.179), H0 is not rejected.
Conclusion: There is insufficient evidence to say that the average lifetime is different from 400 hours.


7. For the test of whether the population mean weight of adult males is different from 160 lb
(i) H0: μ = 160, H1: μ ≠ 160
(ii) As σ is unknown, a t-test should be used with d.f. = 16 - 1 = 15
Reject H0 when t < -2.131 or t > 2.131 (t(15, 0.025) = 2.131)
(iii) x̄ = 160.25, s = 18.4878, n = 16
t = (160.25 - 160)/(18.4878/√16) = 0.0541
(iv) As -2.131 < 0.0541 < 2.131, H0 is not rejected.
Conclusion: There is insufficient evidence to reject the null hypothesis.

8. For the test of the population mean daily spending on food per person in 2020 is higher than in
1990, which is $75, a z-test should be used (as the population standard deviation is given as $15).
Define μ as the population mean daily spending on food per person in 2020.
(i) H0: μ = 75 v.s. H1: μ > 75
(ii) Reject H0 if z > 2.33 (z 0.01 = 2.33)
(iii) x̄ = 84 (from calculator)
z = (84 - 75)/(15/√28) = 3.1749

(iv) As z = 3.1749 > 2.33, H0 is rejected at 1% significance level.

There is sufficient evidence to conclude the population mean daily spending on food per person
in 2020 is higher than in 1990.

9. For the test of whether the population mean score of the test is different from 35 marks
(i) H0: μ = 35, H1: μ ≠ 35
(ii) As σ is unknown, a t-test should be used with d.f. = 6 - 1 = 5
Reject H0 when t < -2.015 or t > 2.015 (t(5, 0.05) = 2.015)
(iii) x̄ = 37, s = 4.8166, n = 6
t = (37 - 35)/(4.8166/√6) = 1.0171
(iv) As -2.015 < 1.0171 < 2.015, H0 is not rejected.
Conclusion: The assumption that the average score is 35 is reasonable.

10. For the test of whether the population proportion of residents who get more than seven hours
of sleep per night is more than 61%

Denote the proportion of residents who get more than seven hours of sleep per night as p
(i) H0: p = 0.61, H1: p > 0.61
(ii) Reject H0 when z > 1.645 (z0.05 = 1.645)
(iii) p̂ = 235/350 = 0.6714, n = 350

z = (0.6714 - 0.61)/√((0.61)(0.39)/350) = 2.36

(iv) As 2.36 > 1.645 (z0.05 = 1.645), H0 is rejected.


Conclusion: There is strong evidence to conclude that more than 61% of us sleep more than seven
hours during weekend.
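
The test statistic for a proportion uses the hypothesised value p0, not p̂, in the standard error: z = (p̂ - p0)/√(p0(1 - p0)/n). A minimal Python sketch follows; z_test_proportion is a hypothetical helper name.

```python
from math import sqrt

def z_test_proportion(successes, n, p0):
    """z statistic for H0: p = p0, with the standard error computed under H0."""
    p_hat = successes / n
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Question 10: 235 of 350 residents, H0: p = 0.61 against H1: p > 0.61
z10 = z_test_proportion(235, 350, 0.61)
reject10 = z10 > 1.645          # critical value z0.05 from the table
print(round(z10, 2), reject10)
```

Because p̂ is not rounded to four decimal places before use, the statistic can differ from the hand-computed value in the third decimal place.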

11. For the test of whether the population proportion of voters who will vote for the politician is
more than 0.60
Denote the politician’s supportive rate as p
(i) H0: p = 0.6, H1: p > 0.6
(ii) Reject H0 when z > 1.645 (z0.05 = 1.645)
(iii) p̂ = 65/100 = 0.65, n = 100

z = (0.65 - 0.6)/√(0.6(0.4)/100) = 1.0206

(iv) As 1.0206 < 1.645 (z0.05 = 1.645), H0 is not rejected.

Conclusion: There is insufficient evidence to say that her supportive rate is more than 60%.

12. For the test of whether the population proportion of his party fellows who oppose him is more
than 0.25
Denote the judge’s opposing rate as p
(i) H0: p = 0.25, H1: p > 0.25
(ii) Reject H0 when z > 1.28 (z0.10 = 1.28)
(iii) p̂ = 217/800 = 0.2713, n = 800

z = (0.2713 - 0.25)/√(0.25(0.75)/800) = 1.3913

(iv) As 1.3913 > 1.28 (z0.10 = 1.28), H0 is rejected.

Conclusion: There is sufficient evidence that more than 25% of members are against him, so he
should give up the country judgeship and run for the state judgeship.

13. For the test of whether the population proportion of home-based businesses owned by women is
less than 0.5
Denote the population proportion of home-based businesses owned by women as p
(i) H0: p = 0.5, H1: p < 0.5
(ii) Reject H0 when z < -1.645 (z0.05 = 1.645)
(iii) p̂ = 369/899 = 0.4105, n = 899

z = (0.4105 - 0.5)/√(0.5(0.5)/899) = -5.367

(iv) As -5.367 < -1.645 (z0.05 = 1.645), H0 is rejected.

Conclusion: There is sufficient evidence to support that less than 50% of home-based businesses
are owned by women.

14. For the test of whether more than half of the applicants choose Gift A
Denote the population proportion of applicants choose Gift A as p
(i) H0: p = 0.5, H1: p > 0.5
(ii) Reject H0 when z > 2.33 (z0.01 = 2.33)
(iii) 𝑝̂ = 25/40 = 0.625, n = 40

z = (0.625 − 0.5) / √(0.5(0.5)/40) = 1.5811

(iv) As 1.5811 < 2.33, H0 is not rejected.


Conclusion: There is no evidence to support that more than half of the applicants choose Gift A.
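
The critical value used in question 14 can be reproduced from the standard normal distribution instead of a table. The sketch below is only one possible check, using `statistics.NormalDist` from the Python standard library; the variable names are illustrative. It also computes the right-tailed p-value, which the solution does not report.

```python
import math
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

alpha = 0.01
critical = std_normal.inv_cdf(1 - alpha)   # right-tailed critical value, about 2.33

# Question 14: 25 of 40 applicants choose Gift A, H0: p = 0.5
p_hat, p0, n = 25 / 40, 0.5, 40
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

p_value = 1 - std_normal.cdf(z)            # area to the right of z
reject_h0 = z > critical

print(round(critical, 4), round(z, 4), reject_h0)
```

Comparing the p-value with α gives the same decision as comparing z with the critical value.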

15. For the test of whether the wear of the two brands of tire is different
Let D = brand B wearing amount – brand A wearing amount
d: 8, 1, 9, -1, 12, 9
(i) H0: D = 0 v.s. H1: D ≠ 0
(ii) Reject H0 when t < -2.5706 or t > 2.5706 (t(5, 0.025) = 2.5706)
(iii) 𝑑̅ = 6.3333, s = 5.1251, n = 6
t = 6.3333 / (5.1251/√6) = 3.0269
(iv) As 3.0269 > 2.5706, H0 is rejected.
Conclusion: There is sufficient evidence to conclude that the wear of the two brands of tires is
different.
16. For the test of a reduction in blood pressure after diet


Let D = blood pressure after diet – blood pressure before diet


d:-1, -4, 2, 0, -1, 1, 0, -5
(i) H0: D = 0 v.s. H1: D < 0
(ii) Reject H0 when t < -1.895 (t(7, 0.05) = 1.895)
(iii) 𝑑̅ = -1, s = 2.3905, n = 8
t = -1 / (2.3905/√8) = -1.1832
(iv) As -1.1832 > -1.895, H0 is not rejected.
Conclusion: There is no evidence to conclude the diastolic blood pressure is reduced after diet.

17. For the test of a reduction of weight after the program


Let D = weight after program – weight before program
d: 1, -5, 2, -7, -9, -9, 3, -1
(i) H0: D = 0 v.s. H1: D < 0
(ii) Reject H0 when t < -1.895 (t(7, 0.05) = 1.895)
(iii) 𝑑̅ = -3.125, s = 4.9696, n = 8
t = -3.125 / (4.9696/√8) = -1.7786

(iv) As -1.7786 > -1.895, H0 is not rejected.


Conclusion: There is no evidence to conclude that the weight control program can effectively
reduce weight.

18. For the test of an increase in customers spending from 1 January to 1 March
Let D = average spending of a customer on 1 March – average spending of a customer on 1
January
d: 4, 8, 8, -7, 25, 3, -4, -20, 16
(i) H0: D = 0 v.s. H1: D > 0
(ii) Reject H0 when t > 2.896 (t(8, 0.01) =2.896)
(iii) 𝑑̅ = 3.6667, s = 13.1244, n = 9
t = 3.6667 / (13.1244/√9) = 0.8381

(iv) As 0.8381 < 2.896, H0 is not rejected.


Conclusion: There is no evidence to conclude that there is an increase in the average spending of
a customer from 1 January to 1 March.
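
The paired t statistics in questions 15 to 18 can be recomputed directly from the listed differences. This is a minimal sketch, assuming the sample standard deviation uses the n − 1 divisor (as `statistics.stdev` does); the helper name `paired_t` is hypothetical.

```python
import math
from statistics import mean, stdev

def paired_t(differences):
    """t statistic for H0: mean difference = 0, with n - 1 degrees of freedom."""
    n = len(differences)
    d_bar = mean(differences)
    s = stdev(differences)          # sample standard deviation (divisor n - 1)
    return d_bar / (s / math.sqrt(n))

# Differences d as listed in the solutions to questions 15 to 18
differences_by_question = {
    15: [8, 1, 9, -1, 12, 9],
    16: [-1, -4, 2, 0, -1, 1, 0, -5],
    17: [1, -5, 2, -7, -9, -9, 3, -1],
    18: [4, 8, 8, -7, 25, 3, -4, -20, 16],
}

for question, d in differences_by_question.items():
    print(question, round(paired_t(d), 4))
```

The critical values still come from a t table, since the standard library has no t distribution.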

19. For the test of whether the new machine is better than the old machine, in terms of higher average breaking strength
(i) H0: μOld – μNew = 0 v.s. H1: μOld – μNew < 0
(ii) Reject H0 when z < -2.3263 (Critical value for one-tailed test, left tail)
(iii) The z statistic is calculated as -5.2031
(iv) As -5.2031 < -2.3263, H0 is rejected.
Conclusion: The new machine is better than the old machine so the new machine should be
purchased.

20. For the test of the average surface hardness of material A and B are different
(i) H0: μA = μB v.s. H1: μA ≠ μB
(ii) The z statistic is calculated as 2.9566
(iii) p-value = 0.0031
(iv) As 0.0031 < 0.05, H0 is rejected at 5% level of significance
Conclusion: There is sufficient evidence to say the hardness of the two materials is different.
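
For a two-tailed z test, the p-value is twice the tail area beyond |z|. The following sketch recomputes the p-value quoted in question 20 from its z statistic; `two_tailed_p` is a hypothetical helper name.

```python
from statistics import NormalDist

def two_tailed_p(z):
    """Two-tailed p-value for a standard normal test statistic."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Question 20: z = 2.9566
p20 = two_tailed_p(2.9566)
print(round(p20, 4))
```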

21.
(a) In branch A, the sample mean fat content is 29.6769 grams. In branch B, the sample mean fat content
is 23.75 grams.
(b) For the test of whether the average fat content of burgers in branch A and branch B is different:
H0: μA = μB
H1: μA ≠ μB
(c) p-value = 0.0458
(d) There is sufficient evidence to say that the fat contents of burgers from the two branches are
different as p-value = 0.0458 < 0.05.


22.
(a) Sample mean weight gained by mice taking diet A = 10 grams
Sample mean weight gained by mice taking diet B = 18.2 grams
(b) For the test of the average weight gained by mice after taking the diet B is greater than that after
taking diet A
(i) H0: μA - μB = 0 v.s. H1: μA - μB < 0
(ii) Reject H0 when t < -2.5524 (One-tailed test, left tailed)
(iii) The t statistic is calculated as -4.7203
(iv) As -4.7203 < -2.5524, H0 is rejected.
Conclusion: There is strong evidence to say that mean weight gained on diet B was greater than
the mean weight gained on diet A.

23.
(a) Sample mean score gained by employees aged below 40 is 55.875 marks.
Sample mean score gained by employees aged above 40 is 53.4 marks.
(b) For the test of whether the age of an employee had any effect on learning new computing skills,
we test if the average score obtained by employees aged below 40 is different from the average
score obtained by employees aged above 40
(i) H0: μ1 – μ2 = 0 v.s. H1: μ1 – μ2 ≠ 0
(ii) The t statistic is calculated as 0.3577
(iii) p-value = 0.7253
(iv) Since p-value = 0.7253 > 0.05, H0 is not rejected at 5% significance level.
Conclusion: There is no evidence that age has any effect on learning computing skills.


24.
(a) Sample proportion of defective items in salesman’s mobiles = 0.1
Sample proportion of defective items in competitor’s mobiles = 0.04
Pooled sample proportion of defective items = (15 + 6)/(150 + 150) = 0.07

(b) For the test of the defective rate of the salesman’s mobile phone is higher than its competitor
Denote the defective rate as p
(i) H0: psalesman - pcompetitor = 0 v.s. H1: psalesman - pcompetitor > 0
(ii) Reject H0 when z > 1.6449 (Critical value for one-tailed test, right tailed)
(iii) The z statistic is calculated as 2.0365
(iv) As 2.0365 > 1.6449, H0 is rejected.
Conclusion: There is sufficient evidence to conclude the defective rate of the salesman’s mobile
phones is higher than that of the competitor.
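
The z statistic in question 24 follows from the pooled proportion in part (a). The sketch below assumes the counts 15 of 150 and 6 of 150 shown in the pooled calculation; `two_proportion_z` is a hypothetical helper name.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-sample z statistic for H0: p1 - p2 = 0."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error, pooled

# Question 24: salesman 15/150 defective, competitor 6/150 defective
z24, pooled24 = two_proportion_z(15, 150, 6, 150)
print(round(pooled24, 2), round(z24, 4))
```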

25. For the test of the positive response rate of men and women are different
Denote the ratio of positive response as p
(i) H0: pmen - pwomen = 0 v.s. H1: pmen - pwomen ≠ 0
(ii) The z statistic is calculated as 1.1329
(iii) p-value = 0.2572
(iv) As 0.2572 > 0.02, H0 is not rejected at 2% level of significance.
Conclusion: There is no evidence to conclude that there is a difference in the proportion of men and
women responding positively.

26.
(a) Sample proportion of guests who would likely revisit Westwind Hotel is 0.7181
Sample proportion of guests who would likely revisit Goodview Hotel is 0.5878
(b) For the test of the revisit rate in Westwind is higher than that in Goodview
Denote the revisit rate as p
H0: pwestwind - pgoodview = 0
H1: pwestwind - pgoodview > 0
(c) p-value = 0.0013
(d) There is sufficient evidence to say that a greater proportion of guests at the Westwind are likely
to return than at the Goodview as p-value 0.0013 < 0.05


Chapter 9 – Analysis of Variance

1.
(a)
Group       Sample Mean
Mong Kok    $70
Wan Chai    $85
Tai Po      $76
Combined    $77

(b) To test whether the average spending in the three districts is not all the same:
(i) H0: µMong Kok = µWan Chai = µTai Po
H1: not all μ are equal
(ii) Reject H0 when F > 4.2565
(iii) F is calculated as 15.7846
(iv) As F = 15.7846 > 4.2565, H0 is rejected
Conclusion: The mean prices of pork in the three districts are concluded to be not all the
same.
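
The raw data behind the F statistic are not reproduced in this answer key, so the sketch below only illustrates how a one-way ANOVA F statistic is assembled from between-group and within-group sums of squares. The function name and the three small groups are hypothetical and are not the exercise data.

```python
from statistics import mean

def one_way_anova_f(groups):
    """F = MS(between) / MS(within) for a list of independent samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)        # k - 1 degrees of freedom
    ms_within = ss_within / (n_total - k)    # N - k degrees of freedom
    return ms_between / ms_within

# Hypothetical data for illustration only
hypothetical_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat = one_way_anova_f(hypothetical_groups)
print(round(f_stat, 4))
```

The resulting F is compared with the critical value for (k − 1, N − k) degrees of freedom, as in step (ii).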

2.
(a) To test if the differences in crunchiness between the four crisps are significant
(i) H0: µ1 = µ2 = µ3 = µ4
H1: not all μ are equal
(ii) F is calculated as 30.1832
(iii) p-value = 0.0000
(iv) As p-value = 0.0000 < 0.05, H0 is rejected at 5% level of significance.
Conclusion: Differences in crunchiness between the four crisps are significant.

(b)
Crisps Sample Mean
Crisp 1 12.6
Crisp 2 9.3
Crisp 3 12.975
Crisp 4 14.5667
Combined 11.7737


3.
(a) To test the cognitive development of K1, K2, and K3 kids by using the finishing time as the
variable,
H0: µK1 = µK2 = µK3
H1: not all μ are equal
(b) p-value = 0.0450
(c) As p-value = 0.0450 > 0.01, H0 is not rejected at the 1% level of significance. There is no
evidence that cognitive development differs among K1, K2, and K3 kids.

4.
(a) H0: µ1 = µ2 = µ3 = µ4 = µ5
H1: not all μ are equal
where µi is the average number of sales of package i, for i = 1, 2, 3, 4, 5
(b) critical value = 2.8661
(c) The F statistic is calculated as 0.8412
(d) As F = 0.8412 < 2.8661, H0 is not rejected. The five kinds of packaging are considered to be
equally effective.


Chapter 10 – Chi Square Test

1. (i) H0: pink : white : blue = 3 : 2 : 5


H1: pink : white : blue ≠ 3 : 2 : 5
(ii) Reject H0 if χ2 > 9.21 (χ2(0.01, 2) = 9.21)
(iii)
Colour Pink White Blue
Observed frequency 24 14 62
Expected frequency 30 20 50

χ2 = (24 − 30)²/30 + (14 − 20)²/20 + (62 − 50)²/50 = 5.88
(iv) χ2 = 5.88 < 9.21, H0 is not rejected.
Conclusion: The differences in the observed and expected frequencies are not significant at the
1% level.

2. (i) H0: ratio of Vanilla: Mango: Strawberry = 1 : 2 : 2


H1: ratio of Vanilla: Mango: Strawberry ≠ 1 : 2 : 2
(ii) Reject H0 when χ2 > 5.991 (χ2(0.05, 2) = 5.991)
(iii)

Flavour               Vanilla   Mango   Strawberry
Observed frequency    250       500     450
Expected frequency    240       480     480

χ2 = (250 − 240)²/240 + (500 − 480)²/480 + (450 − 480)²/480 = 3.125
(iv) As χ2 = 3.125 < 5.991, H0 is not rejected.
Conclusion: There is no significant difference between the choices of ice-cream by customers in
the Hong Kong shop compared to the result from the previous study conducted in Italy.

47
AS 2023-24

3. (i) H0 : Mary : John : Peter : May = 1 : 1 : 1 : 1


H1: Mary : John : Peter : May ≠ 1 : 1 : 1 : 1
(ii) Reject H0 if χ2 >7.81 (χ2 (0.05, 3)= 7.81)
(iii)
Candidate Mary John Peter May
Observed frequency 131 121 99 49
Expected frequency 100 100 100 100
χ2 = (131 − 100)²/100 + (121 − 100)²/100 + (99 − 100)²/100 + (49 − 100)²/100 = 40.04
(iv) χ2 = 40.04 > 7.81, H0 is rejected.
Conclusion: There is sufficient evidence to conclude that the four candidates command different
levels of support.

4. (i) H0: “rejected products” : “imperfect but acceptable products” : “perfect products” = 1 : 1 : 8
H1: “rejected products” : “imperfect but acceptable products” : “perfect products” ≠ 1 : 1 : 8
(ii) Reject H0 when χ2 > χ2(0.05, 2) = 5.991
(iii)
Vendor B              Rejected     Imperfect but acceptable   Perfect      Total
                      shipments    shipments                  shipments
Observed frequency    7            18                         65           90
Expected frequency    9            9                          72           90
χ2 = (7 − 9)²/9 + (18 − 9)²/9 + (65 − 72)²/72 = 10.125

(iv) As χ2 = 10.125 > 5.991, H0 is rejected at 5% significance level.


Conclusion: There is sufficient evidence that the ratio of “rejected products : imperfect but
acceptable products : perfect products” received from Vendor B is different from 1 : 1 : 8.
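
All four goodness-of-fit questions in this chapter use the same calculation: expected frequencies come from the hypothesised ratio, and the statistic sums (O − E)²/E. A minimal sketch, with the hypothetical helper name `chi_square_gof` and the observed counts taken from the tables above:

```python
def chi_square_gof(observed, ratio):
    """Return (chi-square statistic, expected frequencies) for a hypothesised ratio."""
    total = sum(observed)
    ratio_sum = sum(ratio)
    expected = [total * r / ratio_sum for r in ratio]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, expected

# (observed counts, hypothesised ratio) for questions 1 to 4
cases = {
    1: ([24, 14, 62], [3, 2, 5]),
    2: ([250, 500, 450], [1, 2, 2]),
    3: ([131, 121, 99, 49], [1, 1, 1, 1]),
    4: ([7, 18, 65], [1, 1, 8]),
}

for question, (observed, ratio) in cases.items():
    statistic, expected = chi_square_gof(observed, ratio)
    print(question, expected, round(statistic, 3))
```

The degrees of freedom are the number of categories minus one, which is how the critical values in step (ii) are chosen.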


Chapter 8 to 10 – What test should be conducted?

(a) t-test
Variable: spending on lunch (quantitative)
Test objective: Comparing the population mean of 2 independent populations (male, female) with
unknown population variances.

(b) z-test
Variable: suffering from insomnia (qualitative)
Test objective: Comparing the population proportion of 2 independent populations (primary
school kids, secondary school kids)

(c) ANOVA
Variable: daily income of Judy Restaurant (quantitative)
Test objective: Comparing the population mean of 4 independent populations (Monday,
Tuesday, Wednesday, Thursday)

(d) z-test
Variable: working hour in a day (quantitative)
Test objective: Test the population mean of 1 population with known population variance

(e) χ2 test
Variable: traveling method (qualitative)
Test objective: Test the ratio of people using the 3 traveling methods against the suggested ratio.

(f) ANOVA
Variable: processing time (quantitative)
Test objective: Comparing the mean processing time of 3 independent populations (paid by VISA, by
Octopus card, by cash)

(g) z-test
Variable: cause of car accident (qualitative)
Test objective: Test whether the proportion of car accidents due to drunk driving is more than 0.3

(h) ANOVA
Variable: marks in final examination (quantitative)
Test objective: Test the effectiveness of four teaching methods by comparing the marks obtained
by four groups of students.


Chapter 11 – Linear Regression and Correlation

1.
(a) The equation is y = -242.1393 + 61.6915x
(b) When the sales of the product are 0, the estimated advertising expenditure is -$242,139.3. When the sales
of the product increase by 1 million dollars, the estimated advertising expenditure increases by
$61,691.5.
(c) r = 0.6402, moderate positive relationship between sales and advertising expenditure
(d) (i) Put x = 6, y = 128.0100.
The advertising expenditure is estimated as $128,010. The prediction is unreliable as it is
extrapolation estimation.
(ii) Put x = 9, y = 313.0846.
The advertising expenditure is estimated as $313,085. The reliability is questionable as it is
interpolation estimation with moderate correlation.

2.
(a) r = -0.9442. There is a strong negative relationship between the number of days late to school and
the examination scores in General Education.
(b) The equation is y = 102.4925 – 3.6219x
(c) When a student is late for 0 days, the estimated examination mark is 102.4925. For each
extra day the student is late, the estimated mark decreases by 3.6219.
(d) Put x = 11, y = 62.6516.
The examination mark is estimated as 62.6516. The estimation is reliable as it is interpolation
estimation with strong correlation.

3.
(a) The equation is y = 3.8134 + 0.0337x
(b) When no water is added, the yield of hay is 3.8134 tons per acre. For every centimetre of water
being added, the yield of hay is increased by 0.0337 ton per acre.
(c) r = 0.9929, strong positive relationship between amount of water and yield of hay.
(d) (i) Put x = 90, y = 6.8443.
The expected yield of hay is 6.8443 tons per acre. The prediction is reliable by using
interpolation estimation with strong correlation.
(ii) Put x = 150, y = 8.8650.
The expected yield of hay is 8.8650 tons per acre. The prediction is unreliable by using
extrapolation.

50
AS 2023-24

4.
(a) r = 0.3277. There is a moderate positive relationship between size of the offering and price per
share.
(b) y = 10.8883 + 0.001574 x
(c) When the size of the offering is 0, the estimated price per share is $10.8883. For every increase of
1 million in the size of the offering, the estimated price per share increases by $0.001574.
(d) y = 10.8883 + 0.001574(10) = 10.9040.
The price per share is estimated as $10.9040. Because it is interpolation estimation with
moderate correlation, the reliability is questionable.

5.
(a) r = 0.7710, it is a strong positive correlation
(b) y = 77890.9953 + 3.0569x
(c) When the customer’s monthly income is $0, the estimated selling price of the car he buys is $77890.9953.
For every $1 increase in the customer’s monthly income, the estimated price of the car he purchases
increases by $3.0569.
(d) y = 77890.9953 + 3.0569(90000) = 353012.00
The price of the car is estimated as $353012.00. The estimation is unreliable as it is
extrapolation estimation.
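
The point predictions in this chapter come from substituting x into the fitted line y = a + bx. A small sketch, using the rounded coefficients printed in the solutions to questions 2, 4 and 5; `predict` is a hypothetical helper name.

```python
def predict(intercept, slope, x):
    """Point prediction from a fitted simple linear regression line."""
    return intercept + slope * x

# Question 2: y = 102.4925 - 3.6219x at x = 11 days late
mark_q2 = predict(102.4925, -3.6219, 11)
# Question 4: y = 10.8883 + 0.001574x at x = 10
price_q4 = predict(10.8883, 0.001574, 10)
# Question 5: y = 77890.9953 + 3.0569x at x = 90000
car_price_q5 = predict(77890.9953, 3.0569, 90000)

print(round(mark_q2, 4), round(price_q4, 4), round(car_price_q5, 2))
```

Substituting rounded coefficients can differ from the key in the last decimal places where the key was computed with unrounded values, as appears to be the case in questions 1 and 3.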

6.
Photograph      A   B   C   D   E   F   G   H
Rank by Peter   2   5   3   6   1   4   7   8
Rank by Tom     4   3   2   6   1   5   8   7
d              -2   2   1   0   0  -1  -1   1   ∑d = 0
d2              4   4   1   0   0   1   1   1   ∑d2 = 12

(a) rs = 1 − 6(12) / (8(8² − 1)) = 0.8571

(b) The two judges have similar judging criteria as the correlation between the ranking of the
photographs made by them is strong and positive.
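
The rank correlation in question 6 can be verified from the two rankings. The sketch assumes there are no tied ranks, which is when the shortcut formula rs = 1 − 6∑d²/(n(n² − 1)) applies; `spearman_rs` is a hypothetical helper name.

```python
def spearman_rs(rank_x, rank_y):
    """Spearman rank correlation via the shortcut formula (no tied ranks)."""
    n = len(rank_x)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))

# Ranks of photographs A to H from the table in question 6
rank_by_peter = [2, 5, 3, 6, 1, 4, 7, 8]
rank_by_tom = [4, 3, 2, 6, 1, 5, 8, 7]

rs = spearman_rs(rank_by_peter, rank_by_tom)
print(round(rs, 4))
```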
