Stat Mid - Revision - Ans


An airport gamma-ray luggage scanner coupled with a neural-net artificial intelligence program
can detect a weapon in suitcases with a false positive rate of 2 percent and a false negative
rate of 2 percent. Assume a 0.001 probability that a suitcase contains a weapon. If a suitcase
triggers the alarm, what is the probability that the suitcase contains a weapon? Explain your
reasoning.
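
No answer is recorded for this question in the sheet. The reasoning can be sketched with Bayes' theorem, reading "false negative rate of 2 percent" as P(alarm | weapon) = 0.98 and "false positive rate of 2 percent" as P(alarm | no weapon) = 0.02. The function name `posterior` and its parameter names are illustrative, not taken from the course material.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """Return P(weapon | alarm) using Bayes' theorem."""
    true_alarm = sensitivity * prior                   # P(alarm and weapon)
    false_alarm = false_positive_rate * (1 - prior)    # P(alarm and no weapon)
    return true_alarm / (true_alarm + false_alarm)


# Assumed reading of the question: sensitivity = 1 - false negative rate.
p_weapon_given_alarm = posterior(prior=0.001,
                                 sensitivity=1 - 0.02,
                                 false_positive_rate=0.02)
print(round(p_weapon_given_alarm, 4))  # 0.0468
```

Under these assumptions fewer than 5% of alarms correspond to a real weapon, because weapons are so rare that false alarms from the 99.9% of harmless suitcases dominate.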

-------------------------------------------------------------------------------------------------------------------

Stat Mid 2020 - 2021


Please round results to 4 decimal places, e.g. 0.0001

Question 1: (25 pts)


To estimate which presidential candidate will be elected in the upcoming U.S. election, BBC
News has conducted a survey of U.S. voters. BBC randomly selected 3 states in the U.S.:
New York, Washington DC, and Montana. In each selected state, BBC phoned 1,000 randomly
chosen voters and asked for the following information: their age, their income (measured as: less
than $5,000 per month, from $5,000 to $10,000 per month, or above $10,000 per month), their
job, and who they plan to vote for in the upcoming presidential election.
a) Population: all U.S. citizens who are eligible to vote in the upcoming U.S. election. Sample:
the 3,000 voters phoned, 1,000 in each of the three selected states (Washington DC, Montana, and New York).
b) The sampling method used here is two-stage cluster sampling: states are selected at random
first, then voters are selected at random within each selected state.
c)
The different types of variables are
i) Income - Quantitative (recorded here in three ordered brackets)
ii) Job - Categorical
iii) Vote Opinion - Categorical
d)
i) Income - Ratio scale (ordinal if only the bracket is recorded)
ii) Job - Can be taken as Nominal
iii) Vote Opinion - Nominal
e) Inferential, as we are using the voting intentions of the sample to estimate the voting
intentions of the overall population.
Question 2 (15pts)
Figure 1 below presents the relationship between wage (million VND per month) and
education (years of schooling) of 20 workers.

Figure 1. The relationship between wage and education

a. Comment on the relationship between wage and education in Figure 1. If you compute the
correlation between wage and education, what is the sign of the correlation coefficient (positive
or negative)? (5pts)
b. Using the above data, a researcher estimates a linear equation describing the relationship between
wage and education as follows: Wage = 5.21 + 0.74*Education. Let a = 5.21 and b = 0.74.
Explain the meaning of a and b. (5pts)
c. Suppose that worker A has 10 years of schooling while worker B has 5 years of schooling. What
is the predicted difference in monthly wage between A and B? (5pts)
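
The sheet records no answers for this question. As a minimal sketch of part c, the fitted line Wage = 5.21 + 0.74*Education can be evaluated for both workers; the intercept cancels, so the difference is just the slope times the difference in schooling. The function name `predicted_wage` is illustrative only.

```python
def predicted_wage(education_years, a=5.21, b=0.74):
    """Predicted monthly wage (million VND) from the fitted line a + b * education."""
    return a + b * education_years


wage_a = predicted_wage(10)   # worker A: 10 years of schooling
wage_b = predicted_wage(5)    # worker B: 5 years of schooling
difference = wage_a - wage_b  # equals b * (10 - 5); the intercept a cancels
print(round(difference, 2))   # 3.7
```

The predicted gap is 0.74 * 5 = 3.7 million VND per month in favor of worker A.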
Question 3 (35 points)
Suppose that you are a research assistant at a market research company and you have the
data set below:

15 23 15 45 78 89 18 21 42 15 65 120

a. Calculate: Geometric mean = 34.66; Mean = 45.5; Median = 32.5; Mode = 15; Midrange = 67.5
b. Calculate the sample standard deviation = 35.09
c. Estimate the value of the 62nd percentile = 46.2
d. Calculate Interquartile range = 59
e. According to Chebyshev’s Theorem, determine the interval that contains at least 75% of the
population data: mean ± 2 SD = 45.5 ± 2(35.0908) = (-24.6816, 115.6816)
f. Suppose the dataset is a population. Is the observation with the value “120” an outlier?
No. Its z-score is (120 - 45.5)/33.60 ≈ 2.22, which is below 3.
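
The answers to Question 3 can be cross-checked with Python's standard `statistics` module. This is a sketch that assumes percentiles and quartiles are located at position (n + 1)p, which is what `method="exclusive"` does and which reproduces the 46.2 and 59 above; other textbook conventions give slightly different values. Variable names are illustrative.

```python
import statistics

data = [15, 23, 15, 45, 78, 89, 18, 21, 42, 15, 65, 120]

geo_mean = statistics.geometric_mean(data)
mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)                 # most frequent value
midrange = (min(data) + max(data)) / 2
sample_sd = statistics.stdev(data)           # divides by n - 1

# "exclusive" puts the p-th percentile at position (n + 1) * p in the sorted data.
percentiles = statistics.quantiles(data, n=100, method="exclusive")
p62 = percentiles[61]                        # 62nd percentile
q1, _, q3 = statistics.quantiles(data, n=4, method="exclusive")
iqr = q3 - q1

# Chebyshev: at least 1 - 1/k**2 of the data lies within k SDs of the mean; k = 2 gives 75%.
k = 2
chebyshev = (mean - k * sample_sd, mean + k * sample_sd)

# Part f treats the data as a population, so use the population SD (divides by n).
z_120 = (120 - mean) / statistics.pstdev(data)

print(mean, median, mode, midrange)
print(round(geo_mean, 2), round(sample_sd, 2), round(p62, 1), iqr)
print(tuple(round(x, 4) for x in chebyshev), round(z_120, 2))
```

The Chebyshev interval here uses the sample standard deviation from part b; if the population standard deviation is intended instead, replace `sample_sd` with `statistics.pstdev(data)`.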
Question 4: (5 pts)
Suppose a certain country runs an experiment to test a Covid-19 vaccine with two groups: 70%
of participants received the vaccine (Group A) and 30% did not receive the
vaccine (Group B). After a test, the probability of “a negative result” is 90% for Group A
and 60% for Group B. If a person gets “a negative result”, what is the
probability that she/he came from Group A? (5)
Given,
P(A) = 0.7
P(B) = 0.3
Let N denote a negative test. Thus,
P(N|A) = 0.9
P(N|B) = 0.6
Using Bayes' Theorem we get,
P(A|N) = P(N|A)P(A)/[P(N|A)P(A)+P(N|B)P(B)]
P(A|N) = 0.7*0.9/(0.7*0.9+0.3*0.6)
=> P(A|N) = 0.63/0.81 = 0.7778
Question 5 : (20 pts)
The U.S. Department of Transportation reported that during November, 83.4% of Southwest
Airlines’ flights, 75.1% of US Airways’ flights, and 70.1% of JetBlue’s flights arrived on time
(USA Today, January 4, 2007). Assume that this on-time performance is applicable for flights
arriving at concourse A of the Rochester International Airport, and that 40% of the arrivals at
concourse A are Southwest Airlines flights, 35% are US Airways flights, and 25% are JetBlue
flights.
a) Develop a joint probability table with three rows (airlines) and two columns (on-time
arrivals vs. late arrivals).
Each joint probability is the airline's share of arrivals multiplied by its on-time (or late) rate:

Airline        On time    Late       Total
Southwest      0.3336     0.0664     0.40
US Airways     0.26285    0.08715    0.35
JetBlue        0.17525    0.07475    0.25
Total          0.7717     0.2283     1.00
b) An announcement has just been made that Flight 1424 will be arriving at gate 20 in
concourse A. What is the most likely airline for this arrival?
The most likely airline for this arrival is Southwest Airlines, since it accounts for the
largest share of arrivals at concourse A (40%).
c) What is the probability that Flight 1424 will arrive on time?
Summing the joint probabilities of an on-time arrival (share of arrivals times on-time rate
for each airline):
P(on time) = 0.40(0.834) + 0.35(0.751) + 0.25(0.701) = 0.3336 + 0.26285 + 0.17525 = 0.7717
Therefore, the required probability is 0.7717.
d) Suppose that an announcement is made saying that Flight 1424 will be arriving late. What
is the most likely airline for this arrival? What is the least likely airline?
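The most likely airline for this arrival is US Airways and the least likely airline is Southwest:
P(US Airways | late) = 0.08715/0.2283 = 0.3817, P(JetBlue | late) = 0.07475/0.2283 = 0.3274,
P(Southwest | late) = 0.0664/0.2283 = 0.2908.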
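
Parts a) to d) can be reproduced with a short script. This is a sketch using only the shares and on-time rates stated in the question; the dictionary and variable names are illustrative.

```python
share = {"Southwest": 0.40, "US Airways": 0.35, "JetBlue": 0.25}
on_time_rate = {"Southwest": 0.834, "US Airways": 0.751, "JetBlue": 0.701}

# Joint probabilities: (airline and on time, airline and late).
joint = {
    airline: (share[airline] * on_time_rate[airline],
              share[airline] * (1 - on_time_rate[airline]))
    for airline in share
}

p_on_time = sum(on_time for on_time, _ in joint.values())
p_late = sum(late for _, late in joint.values())

# Posterior P(airline | late) = P(airline and late) / P(late).
given_late = {airline: late / p_late for airline, (_, late) in joint.items()}

for airline, (on_time, late) in joint.items():
    print(f"{airline:<11} on time={on_time:.5f} late={late:.5f} "
          f"P(airline|late)={given_late[airline]:.4f}")
print(f"P(on time)={p_on_time:.4f}  P(late)={p_late:.4f}")
```

Note that summing the joint probabilities after rounding each to 4 decimals gives 0.7718, while summing the unrounded products gives 0.7717; the script uses the unrounded values.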

Stat Mid 2021 – 2022

Question 1

The Hawaii Visitors Bureau collects data on visitors to Hawaii. The following questions were
among 16 asked in a questionnaire handed out to passengers during incoming airline flights.

(i) This trip to Hawaii is my: 1st, 2nd, 3rd, 4th, etc.
(ii) The primary reason for this trip is: (10 categories, including vacation, convention, honeymoon)
(iii) Where I plan to stay: (11 categories, including hotel, apartment, relatives, camping)
(iv) Total days in Hawaii (days)

a. What is the population being studied? Visitors to Hawaii.


b. Is the use of a questionnaire a good way to reach the population of passengers on incoming
airline flights? Since airline flights carry the vast majority of visitors to the state, the use of
questionnaires for passengers during incoming flights is a good way to reach this population. The
questionnaire actually appears on the back of a mandatory plants and animals declaration form
that passengers must complete during the incoming flight. A large percentage of passengers
complete the visitor information questionnaire.
c. Comment on each of the four questions in terms of whether it will provide categorical or
quantitative data. Questions 1 and 4 provide quantitative data indicating the number of visits
and the number of days in Hawaii. Questions 2 and 3 provide categorical data indicating the
categories of reason for the trip and where the visitor plans to stay.
d. What is the level of measurement of each variable?

Questions 1 and 4: Ratio

Questions 2 and 3: Nominal

Question 2

The grades of 10 students on their first management test are shown below.
94 61 96 66 92
68 75 85 84 78
a. Construct a frequency distribution. Let the first class be 60 - 69.
b. Construct a cumulative frequency distribution.
c. Construct a relative frequency distribution.
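
The sheet gives no answer for this question. A minimal sketch of the three distributions, assuming classes of width 10 starting at 60 - 69 as the question asks, could look like this (variable names are illustrative):

```python
from collections import Counter

grades = [94, 61, 96, 66, 92, 68, 75, 85, 84, 78]

# Map each grade to the lower limit of its class: 61 -> 60, 75 -> 70, ...
counts = Counter((grade // 10) * 10 for grade in grades)

rows = []
cumulative = 0
for lower in sorted(counts):
    frequency = counts[lower]
    cumulative += frequency
    relative = frequency / len(grades)
    rows.append((f"{lower} - {lower + 9}", frequency, cumulative, relative))

print("Class     Freq  Cum.  Rel.")
for label, frequency, cumulative_freq, relative in rows:
    print(f"{label:<9} {frequency:<5} {cumulative_freq:<5} {relative:.1f}")
```

This yields frequencies 3, 2, 2, 3 for the classes 60 - 69 through 90 - 99, cumulative frequencies 3, 5, 7, 10, and relative frequencies 0.3, 0.2, 0.2, 0.3.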

Question 3

The number of sick days due to colds and flu last year was recorded by a sample of 9 adults. The
data are:
9 3 7 4 1 7 5 4 8
For all of your answers, please round to 2 decimal places.
a. Calculate Mean, Median, and Mode.

Mean = 5.33, Median = 5, Mode= 4, 7

b. Calculate Variance and Standard Deviation.

Variance = 6.75, SD = 2.60

c. Compute Coefficient of Variation.


Coefficient of Variation = 100*(2.598/5.333) = 48.71%

d. Find the Quartiles Q1, Q2, Q3, and compute Interquartile Range (IQR).

Q2 = Median = 5, Q1 = (3+4)/2 = 3.5, Q3 = (7+8)/2 = 7.5, IQR = Q3 – Q1 = 7.5 – 3.5 = 4

e. The amounts of time (minutes) taken to finish the midterm test for a statistics course by a
sample of 9 students are:
33 29 45 60 42 19 52 38 36
Calculate Standard Deviation and Coefficient of Variation. Does this sample have greater relative
variation than the sample of sick days? Explain.
Mean = 39.33, SD = 12.25, Coefficient of Variation = 31.14%. This sample has lower
relative variation than the sample of sick days because its coefficient of variation is lower
(31.14% versus 48.71%). Standard deviation alone cannot be used to compare variation between
these two samples because they use different units of measurement and have different means.
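
The comparison of relative variation can be checked with the standard `statistics` module. This is a sketch; the helper name `coefficient_of_variation` is illustrative.

```python
import statistics


def coefficient_of_variation(sample):
    """Sample standard deviation as a percentage of the sample mean."""
    return 100 * statistics.stdev(sample) / statistics.mean(sample)


sick_days = [9, 3, 7, 4, 1, 7, 5, 4, 8]
test_minutes = [33, 29, 45, 60, 42, 19, 52, 38, 36]

cv_sick = coefficient_of_variation(sick_days)
cv_test = coefficient_of_variation(test_minutes)

print(round(statistics.variance(sick_days), 2))  # 6.75
print(round(cv_sick, 2), round(cv_test, 2))      # 48.71 31.14
```

Because the coefficient of variation is unit-free, it allows days and minutes to be compared directly, which the two standard deviations (2.60 days and 12.25 minutes) do not.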
Question 4
The probability of an economic decline in the year 2000 is 0.23. There is a probability of 0.64
that we will elect a republican president in the year 2000. If we elect a republican president,
there is a 0.35 probability of an economic decline. Let “D” represent the event of an economic
decline, and “R” represent the event of election of a Republican president.
a. Are “R” and “D” independent events?
No, because P(D) ≠ P(D|R)
P(D) = 0.23
P(D|R) = 0.35
b. What is the probability of a Republican president and economic decline in the year 2000?
P(R and D) = P(D|R)*P(R)
Given P(D|R) = 0.35, P(R) = 0.64
Therefore P(R and D) = 0.35*0.64 = 0.224
c. If we experience an economic decline in the year 2000, what is the probability that there
will be a Republican president?
P(R|D) = P(R and D) / P(D) = 0.224/0.23 = 0.9739
d. What is the probability of economic decline or a Republican president in the year 2000?
P(D or R) = P(D) + P(R) – P(D and R) = 0.23 + 0.64 – 0.224 = 0.646.
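
The four parts of Question 4 apply the independence check, the multiplication rule, conditional probability, and the addition rule in turn. A small sketch using only the three probabilities given in the question (variable names are illustrative):

```python
import math

p_d = 0.23           # P(D): economic decline
p_r = 0.64           # P(R): Republican president
p_d_given_r = 0.35   # P(D | R)

# a. R and D are independent only if P(D | R) equals P(D).
independent = math.isclose(p_d_given_r, p_d)

# b. Multiplication rule: P(R and D) = P(D | R) * P(R).
p_r_and_d = p_d_given_r * p_r

# c. Conditional probability: P(R | D) = P(R and D) / P(D).
p_r_given_d = p_r_and_d / p_d

# d. Addition rule: P(D or R) = P(D) + P(R) - P(D and R).
p_d_or_r = p_d + p_r - p_r_and_d

print(independent)             # False
print(round(p_r_and_d, 4))     # 0.224
print(round(p_r_given_d, 4))   # 0.9739
print(round(p_d_or_r, 4))      # 0.646
```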
Link to midterm revision:

https://drive.google.com/drive/folders/1mDi71MLle7jRfU0Fg_cvCLIiq34BgwVn?
usp=share_link
