Mid Apply

Uploaded by

ngocthanh2821

Date: April 6th, 2024

Dr. Tran Thanh Tu

Applied statistics
Exercises – Practice for Midterm exam
For each question: provide the equation, substitute the values, and calculate the result.

1. For the data set:

Calculate the mean, median, mode, first quartile, and third quartile for the given dataset:

Data Set: 61, 64, 61, 54, 83


Number of data points: n = 5

a. Mean, Median, and Mode:

 Mean = (61 + 64 + 61 + 54 + 83) / 5 = 323 / 5 = 64.6


 Median:
First, we arrange the numbers in ascending order: 54, 61, 61, 64, 83.
Since there are 5 numbers (an odd number), the median is the middle number, which is
61.
 Mode: The mode is the number that appears most frequently. In this dataset, 61 appears
twice, which is more frequent than any other number. So, the mode is 61.
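The three measures above can be checked with a short Python sketch. It assumes only the standard-library statistics module; the variable names are illustrative and not part of the exercise.

```python
import statistics

data = [61, 64, 61, 54, 83]

mean = statistics.mean(data)      # sum of the values divided by n
median = statistics.median(data)  # middle value of the sorted data
mode = statistics.mode(data)      # most frequent value

print(mean, median, mode)
```

Running it should reproduce the hand calculation: mean 64.6, median 61, mode 61.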

b. First and Third Quartiles:

 First Quartile (Q1): This is the value that separates the lowest 25% of the data from the
rest.
o The numbers in ascending order are: 54, 61, 61, 64, 83.
o Since there are 5 numbers, the median (61) sits in position 3. Excluding it, the lower
half of the data consists of 54 and 61, so Q1 is the median of these two numbers:
(54 + 61) / 2 = 57.5.

 Third Quartile (Q3): This is the value that separates the highest 25% of the data from
the rest.
o Excluding the median in the same way, the upper half of the data consists of 64 and 83,
so Q3 is the median of these two numbers: (64 + 83) / 2 = 73.5.

Therefore, the first quartile (Q1) is 57.5, and the third quartile (Q3) is 73.5.
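Quartile conventions differ between textbooks, so the sketch below fixes one explicitly: the median-of-halves rule with the median itself excluded from both halves. This convention is an assumption, since the exercise does not name one. As a cross-check it also calls statistics.quantiles with method="exclusive", which happens to agree for this dataset.

```python
import statistics

data = sorted([61, 64, 61, 54, 83])
n = len(data)
half = n // 2

# Split around the median; for odd n the median itself is left out of both halves.
lower = data[:half]
upper = data[half + (n % 2):]

q1 = statistics.median(lower)
q3 = statistics.median(upper)

# Cross-check with the standard library's "exclusive" quantile method.
q1_lib, q2_lib, q3_lib = statistics.quantiles(data, n=4, method="exclusive")

print(q1, q3)
print(q1_lib, q2_lib, q3_lib)
```

With the median excluded from both halves, the lower half is 54, 61 and the upper half is 64, 83, giving Q1 = 57.5 and Q3 = 73.5.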

2. For the events:


These 5 samples have the same probability of 0.2.
Determine:
a. P(A)
b. P(B)
c. P(B') (B' is complement of B)
d. P(A ∪ B)
e. P(A ∩ B)

 Sample space: Consists of 5 samples, each sample has an equal probability of 0.2.
 Event A: Occurs when the result is 2, 4 or 6.
 Event B: Occurs when the result is 3 or 6.

Solution:

a. P(A): Probability of event A occurring.

 A has 3 favorable elements (2, 4, 6) out of a total of 5 elements in the sample space.
 So P(A) = 3/5 = 0.6

b. P(B): Probability of event B occurring.

 B has 2 favorable elements (3, 6) out of a total of 5 elements in the sample space.
 So P(B) = 2/5 = 0.4

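c. P(B'): Probability of the complement of B occurring (i.e., B does not occur).

 B' consists of the elements that do not belong to B. Since B has 2 of the 5 elements, B'
has the other 3: 2, 4, and the one element that belongs to neither A nor B.
 So P(B') = 1 - P(B) = 1 - 0.4 = 3/5 = 0.6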

d. P(A ∪ B): Probability of event A or B or both occurring.

 A ∪ B consists of elements that belong to A or B or both, namely 2, 3, 4, 6.
 So P(A ∪ B) = 4/5 = 0.8

e. P(A ∩ B): Probability of both events A and B occurring at the same time.

 A ∩ B consists of elements that belong to both A and B, that is 6.


 So P(A ∩ B) = 1/5 = 0.2

Conclusion:

 The probabilities of the events are, in order:
o P(A) = 0.6
o P(B) = 0.4
o P(B') = 0.6
o P(A ∪ B) = 0.8
o P(A ∩ B) = 0.2

Note:

 A ∪ B: Represents the union of two sets A and B, that is, the set consisting of all elements
that belong to A or to B or to both.
 A ∩ B: Represents the intersection of two sets A and B, that is, the set consisting of all
elements that belong to both A and B.

Formulas used:

 Probability of an event: Number of elements favorable to the event / Total number of
elements in the sample space.
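The counting formula can be sketched with Python sets and exact fractions. The text names only four of the five outcomes (2, 3, 4, 6), so the label "other" below is a hypothetical placeholder for the fifth, unnamed outcome; the helper prob is also illustrative.

```python
from fractions import Fraction

# "other" is a hypothetical label for the fifth outcome, which the text does not name.
sample_space = {2, 3, 4, 6, "other"}
A = {2, 4, 6}
B = {3, 6}

def prob(event):
    """Classical probability: favorable outcomes / total outcomes."""
    return Fraction(len(event), len(sample_space))

p_a = prob(A)
p_b = prob(B)
p_not_b = prob(sample_space - B)  # complement of B
p_union = prob(A | B)             # A or B or both
p_inter = prob(A & B)             # both A and B

print(p_a, p_b, p_not_b, p_union, p_inter)
```

Because the complement is taken relative to the whole sample space, P(B') comes out as 3/5, consistent with 1 - P(B).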

3. For a sample survey that has the result below:

              Male   Female   Total
Employed       460      140     600
Unemployed      40      260     300
Total          500      400     900

Let A denote the event that a sample is employed, and B denote the event that a sample is
female. A' and B' are the complements of A and B, respectively.
a. Determine P(A), P(B), P(A|B), P(B|A)
b. Determine P(A ∪ B), P(A ∩ B), P(A ∩ B'), P(A' ∪ B')
c. Are events A and B independent? Why?
d. What is the probability that the sample is an unemployed female?
e. What is the probability that the sample is an employed male?
Step 1: Calculate the total number of samples

Total number of samples = 460 + 40 + 140 + 260 = 900

Step 2: Identify events and calculate probability

 Event A: The selected sample is employed.


 Event B: The selected sample is female

a. Determine the probabilities:

 P(A) = (Number of employed people) / (Total sample) = (460 + 140) / 900 = 600/900 = 2/3
 P(B) = (Number of females) / (Total sample) = (140 + 260) / 900 = 400/900 = 4/9
 P(A ∩ B) = (Number of employed women) / (Total sample) = 140 / 900 = 7/45
 P(A|B) = P(A ∩ B) / P(B) = (7/45) / (4/9) = 7/20
 P(B|A) = P(A ∩ B) / P(A) = (7/45) / (2/3) = 7/30

b. Determine the probabilities:

 P(A ∪ B) = P(A) + P(B) - P(A ∩ B) = (2/3) + (4/9) - (7/45) = (30 + 20 - 7)/45 = 43/45
 P(A ∩ B) = 140 / 900 = 7/45
 P(A ∩ B') = P(A) - P(A ∩ B) = (2/3) - (7/45) = 23/45
 P(A' ∪ B') = 1 - P(A ∩ B) = 1 - (7/45) = 38/45
 Check: P(A' ∩ B') = P((A ∪ B)') = 1 - P(A ∪ B) = 1 - (43/45) = 2/45, which matches the
40 unemployed males out of 900.

c. Check the independence of A and B: Two events A and B are independent if and only if
P(A ∩ B) = P(A) * P(B). In this case, P(A) * P(B) = (2/3) * (4/9) = 8/27 ≈ 0.296, while
P(A ∩ B) = 7/45 ≈ 0.156. Since these are not equal, A and B are not independent.

d. Probability that the sample is female and unemployed: = (Number of female and
unemployed) / (Total number of samples) = 260 / 900 = 13/45

e. Probability that the sample is male and has a job: = (Number of male employees) / (Total
number of samples) = 460 / 900 = 23/45
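The whole of question 3 can be re-derived from the four cell counts with exact fractions. In this sketch the counts come from the solution above; assigning the remaining 40 to unemployed males is inferred from the totals rather than stated, and the helper prob is illustrative.

```python
from fractions import Fraction

# Cell counts keyed by (employment status, sex).
# The 40 unemployed males are inferred: 900 - 460 - 140 - 260.
counts = {
    ("employed", "male"): 460,
    ("unemployed", "male"): 40,
    ("employed", "female"): 140,
    ("unemployed", "female"): 260,
}
total = sum(counts.values())

def prob(pred):
    """Share of the sample whose (status, sex) cell satisfies pred."""
    return Fraction(sum(c for key, c in counts.items() if pred(*key)), total)

p_a = prob(lambda status, sex: status == "employed")  # P(A)
p_b = prob(lambda status, sex: sex == "female")       # P(B)
p_a_and_b = prob(lambda status, sex: status == "employed" and sex == "female")
p_a_or_b = prob(lambda status, sex: status == "employed" or sex == "female")

p_a_given_b = p_a_and_b / p_b  # P(A|B)
p_b_given_a = p_a_and_b / p_a  # P(B|A)
p_a_and_not_b = p_a - p_a_and_b
p_not_a_or_not_b = 1 - p_a_and_b

# A and B are independent only if P(A ∩ B) equals P(A) * P(B).
independent = p_a_and_b == p_a * p_b

print(p_a, p_b, p_a_given_b, p_b_given_a)
print(p_a_or_b, p_a_and_b, p_a_and_not_b, p_not_a_or_not_b, independent)
```

Using Fraction instead of floats keeps every result as an exact ratio, so the values can be compared directly with the hand calculation.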

4. The XO Group Inc. conducted a survey of 13,000 brides and grooms married in the
United States and found that the average cost of a wedding is $29,858 (XO Group website,
January 5, 2015). Assume that the cost of a wedding is normally distributed with a mean of
$29,858 and a standard deviation of $5600.
a. What is the probability that a wedding costs less than $20,000?
b. What is the probability that a wedding costs between $20,000 and $30,000?
c. For a wedding to be among the 5% most expensive, how much would it have to cost?

Step 1: Standardize data:

To calculate the probability, we need to standardize the data to a standard normal distribution
(with a mean of 0 and a standard deviation of 1). The standardization formula is:

z = (x - μ) / σ

where:

 z: standardized value (z-score)
 x: value for which the probability needs to be calculated
 μ: mean
 σ: standard deviation

a. Probability of wedding costing less than $20,000:

 z = (20000 - 29858) / 5600 ≈ -1.76


 Using a normal distribution table or a calculator, we find P(Z < -1.76) ≈ 0.0392

So, the probability of a wedding costing less than $20,000 is about 3.92%.

b. Probability of a wedding costing between $20,000 and $30,000:

 z1 = (20000 - 29858) / 5600 ≈ -1.76


 z2 = (30000 - 29858) / 5600 ≈ 0.03
 P(20000 < X < 30000) = P(-1.76 < Z < 0.03) ≈ P(Z < 0.03) - P(Z < -1.76)
≈ 0.5120 - 0.0392 ≈ 0.4728
So, the probability of a wedding costing between $20,000 and $30,000 is about 47.28%.
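Parts a and b can be cross-checked without a table. The sketch below assumes Python's standard-library statistics.NormalDist and the mean and standard deviation given in the problem; the variable names are illustrative. Because it does not round z to two decimals, its results differ slightly from the table-based values (part b comes out near 0.471 rather than 0.4728).

```python
from statistics import NormalDist

# Wedding cost model from the problem statement: mean 29858, standard deviation 5600.
cost = NormalDist(mu=29858, sigma=5600)

# a. P(X < 20000)
p_below_20k = cost.cdf(20000)

# b. P(20000 < X < 30000) = P(X < 30000) - P(X < 20000)
p_20k_to_30k = cost.cdf(30000) - cost.cdf(20000)

print(round(p_below_20k, 4), round(p_20k_to_30k, 4))
```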

c. To be in the top 5% of most expensive weddings:

 We need to find the value of z such that P(Z > z) = 0.05, which is equivalent to P(Z < z)
= 0.95.
 From the normal distribution table or calculator, we find z ≈ 1.645.
 Using the inverse standardization formula: x = μ + zσ = 29858 + 1.645 * 5600
= 29858 + 9212 = 39070

So, a wedding would have to cost at least about $39,070 to be in the top 5% of most expensive
weddings.
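The cutoff in part c is the 95th percentile of the cost distribution. As a sketch, again assuming statistics.NormalDist from the standard library, the inverse CDF gives both the critical z value and the cost threshold directly:

```python
from statistics import NormalDist

cost = NormalDist(mu=29858, sigma=5600)

# z such that P(Z < z) = 0.95 on the standard normal distribution.
z_95 = NormalDist().inv_cdf(0.95)

# Cost threshold: x = mu + z * sigma, i.e. the 95th percentile of the cost model.
threshold = cost.inv_cdf(0.95)

print(round(z_95, 3), round(threshold))
```

The threshold agrees with x = μ + zσ to within a dollar, since inv_cdf uses a slightly more precise z than the rounded 1.645.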

Conclusion:

 Only about 3.92% of weddings cost less than $20,000.
 About 47.28% of weddings cost between $20,000 and $30,000.
 To be in the top 5% of most expensive weddings, a wedding must cost at least about
$39,070.

Note:

 For accurate calculations, you should use a normal distribution table or specialized
statistical software.
 The above results are estimates based on the assumption that the data follows a normal
distribution.

Concepts to master:

 Normal distribution: A type of continuous probability distribution, characterized by a
bell shape.
 Standardized value (z-score): A measure that tells how many standard deviations a
data point is from the mean.
 Normal distribution table: A table that gives probabilities corresponding to z values.

Application:

This problem applies knowledge of normal distribution to solve practical problems related to
statistical probability. It can be applied in many different fields such as economics, finance,
medicine, etc.
