
Statistics and Probability - MC

STATISTICS AND PROBABILITY

FINAL EXAM
I. MULTIPLE CHOICE. Choose the letter of the correct answer and write your answer on your ANSWER SHEET.
1. All of the following are estimates of a parameter EXCEPT:
a. point estimate b. interval estimate c. confidence interval d. confidence level
2. The statement that “𝑃(𝐴|𝐵) = 𝑃(𝐵|𝐴) whenever 𝐴 and 𝐵 are independent events” is: Please select the best answer of those provided below.
a. Always True
b. Never True
c. Not Enough Information; we would need to know if 𝐴 and 𝐵 are disjoint events
d. Not Enough Information; we would need to know if the events are equally likely
3. The p-value in hypothesis testing represents which of the following: Please select the best answer of those provided below.
a. The probability of failing to reject the null hypothesis, given the observed results
b. The probability that the null hypothesis is true, given the observed results
c. The probability that the observed results are statistically significant, given that the null hypothesis is true
d. The probability of observing results as extreme as or more extreme than currently observed, given that the null hypothesis is true
4. Assume that the difference between the observed, paired sample values is defined in the same manner and that the specified significance level is
the same for both hypothesis tests. Using the same data, the statement that “a paired/dependent two sample t-test is equivalent to a one sample t-test
on the paired differences, resulting in the same test statistic, same p-value, and same conclusion” is: Please select the best answer of those provided
below.
a. Always True b. Never True c. Sometimes True d. Not Enough Information
5. Note for this question that the odds in favor of an event 𝐴 are defined as 𝑃(𝐴) / (1 − 𝑃(𝐴)). For fraternal twins, the odds in favor of having
children that are twins are 1/16. Based upon this information, what is the probability of a fraternal twin not having children that are twins?
a. 1/16 b. 15/16 c. 1/17 d. 16/17
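As a rough illustration of the odds definition used in Question 5, the sketch below inverts odds in favor into a probability with exact fractions. The helper name `probability_from_odds` is hypothetical and is not part of the exam.

```python
from fractions import Fraction

def probability_from_odds(odds):
    """Invert odds = P(A) / (1 - P(A)), which gives P(A) = odds / (1 + odds)."""
    return odds / (1 + odds)

odds_twins = Fraction(1, 16)                  # odds in favor, as stated in Question 5
p_twins = probability_from_odds(odds_twins)   # probability of having twins
p_not_twins = 1 - p_twins                     # complement: not having twins
print(p_twins, p_not_twins)
```

Exact fractions are used here only so the result can be compared directly with the fractional answer choices.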
6. Refer to the back-to-back stem plot at the right. Which of the following are true statements regarding the data summarized?
I. The distributions have the same mean
II. The distributions have the same range
III. The distributions have the same variance
IV. The distributions have the same coefficient of variation

a. II only b. II and III c. I, III, and IV d. II, III, and IV

For Questions 7–10, refer to the table, which relates to the possible epilepsy-depression link.

Depressive Disorder
Diagnosed Epilepsy Present (Yes) Absent (No)
Epilepsy 37 51
No Epilepsy 24 78

7. What is the probability of one randomly selected individual presenting with a depressive disorder given the individual has diagnosed epilepsy?
Round to 3 decimal places.
a. 0.421 b. 0.420 c. 0.195 d. 0.607
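A minimal sketch of the conditional probability in Question 7, assuming the table counts are read as 37 (epilepsy, depressive disorder present) and 51 (epilepsy, depressive disorder absent). The variable names are illustrative.

```python
# Counts from the "Epilepsy" row of the table
epilepsy_present, epilepsy_absent = 37, 51

# P(depressive disorder | epilepsy) = count with both / count with epilepsy
p_dep_given_epilepsy = epilepsy_present / (epilepsy_present + epilepsy_absent)
print(f"{p_dep_given_epilepsy:.3f}")
```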
8. Assume simple random sampling for the data summarized in the table above. Let 𝑝𝐸 represent the proportion of individuals with diagnosed
epilepsy (‘Epilepsy’) that present with a depressive disorder. Let 𝑝𝑁𝐸 represent the proportion of individuals without diagnosed epilepsy (‘No
Epilepsy’) that present with a depressive disorder. What is the 95% confidence interval to estimate 𝑝𝐸 − 𝑝𝑁𝐸, the difference between the
population proportions of individuals presenting with a depressive disorder among those with diagnosed epilepsy and among those without
diagnosed epilepsy? Round to 3 decimal places.
a. (0.040, 0.386) b. (0.258, 0.577) c. (-0.005, 0.142) d. (0.053, 0.317)
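One common way to build the interval in Question 8 is the unpooled (Wald) interval for a difference of two proportions. The sketch below assumes that method; the exam does not say which formula is intended, and the function name `two_proportion_ci` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_ci(x1, n1, x2, n2, level=0.95):
    """Wald interval for p1 - p2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)   # critical value, about 1.96 for 95%
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se

# Epilepsy: 37 of 88 with a depressive disorder; No Epilepsy: 24 of 102
low, high = two_proportion_ci(37, 88, 24, 102)
print(f"({low:.3f}, {high:.3f})")
```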
9. A researcher believes that the proportion of individuals with diagnosed epilepsy that present with a depressive disorder, 𝑝𝐸, is higher than the
proportion of individuals without diagnosed epilepsy that present with a depressive disorder, 𝑝𝑁𝐸. Testing this claim, what would the resulting
p-value be? Round to 3 decimal places.
a. 0.006 b. 0.069 c. 0.003 d. 0.035
10. Refer to Question 9. Using a 0.10 significance level, which of the following is the most appropriate conclusion given the results?
a. Reject the null hypothesis; there is sufficient evidence to support the researcher’s claim.
b. Fail to reject the null hypothesis; there is sufficient evidence to support the researcher’s claim.
c. Accept the null hypothesis; there is not sufficient evidence to support the researcher’s claim.
d. Accept the null hypothesis; there is sufficient evidence to support the researcher’s claim.
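For Questions 9 and 10, a right-tailed two-proportion z-test with a pooled standard error is one standard approach. The sketch below assumes that test; the variable names and the choice of pooling are assumptions, since the exam does not spell out the procedure.

```python
from math import sqrt
from statistics import NormalDist

x_e, n_e = 37, 88       # Epilepsy: depressive disorder present, row total
x_ne, n_ne = 24, 102    # No Epilepsy: depressive disorder present, row total

p_e, p_ne = x_e / n_e, x_ne / n_ne
p_pool = (x_e + x_ne) / (n_e + n_ne)               # pooled proportion under H0
se = sqrt(p_pool * (1 - p_pool) * (1 / n_e + 1 / n_ne))
z = (p_e - p_ne) / se
p_value = 1 - NormalDist().cdf(z)                  # right tail for H1: pE > pNE

alpha = 0.10
print(f"z = {z:.3f}, p-value = {p_value:.3f}, reject H0: {p_value < alpha}")
```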
11. A sociologist focusing on popular culture and media believes that the average number of hours per week (hrs/week) spent using social media is
greater for women than for men. Examining two independent simple random samples of 100 individuals each, the researcher calculates sample
standard deviations of 2.3 hrs/week and 2.5 hrs/week for women and men respectively. If the average number of hrs/week spent using social media
for the sample of women is 1 hour greater than that for the sample of men, what conclusion can be made from a hypothesis test of the following hypotheses?
𝐻0: 𝜇𝑊 − 𝜇𝑀 = 0
𝐻1: 𝜇𝑊 − 𝜇𝑀 > 0
a. The observed difference in average number of hrs/week spent using social media is not significant
b. The observed difference in average number of hrs/week spent using social media is significant
c. A conclusion is not possible without knowing the average number of hrs/week spent using social media in each sample
d. A conclusion is not possible without knowing the population sizes
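The test statistic in Question 11 depends only on the difference in sample means, the sample standard deviations, and the sample sizes. The sketch below assumes an unpooled standard error and uses a normal approximation for the p-value (the Python standard library has no t distribution; with 100 observations per group the two are close). No significance level is given in the question, so none is applied here.

```python
from math import sqrt
from statistics import NormalDist

n_w, n_m = 100, 100     # sample sizes (women, men)
s_w, s_m = 2.3, 2.5     # sample standard deviations in hrs/week
mean_diff = 1.0         # women's sample mean minus men's sample mean

std_error = sqrt(s_w**2 / n_w + s_m**2 / n_m)
test_stat = mean_diff / std_error
p_value = 1 - NormalDist().cdf(test_stat)   # right tail, normal approximation
print(f"test statistic = {test_stat:.2f}, approximate p-value = {p_value:.4f}")
```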
12. A 99% t-based confidence interval for the mean price for a gallon of gasoline (dollars) is calculated using a simple random sample of gallon
gasoline prices for 50 gas stations. Given that the 99% confidence interval is $3.32 < 𝜇 < $3.98, what is the sample mean price for a gallon of
gasoline (dollars)? Please select the best answer of those provided below.
a. $0.33
b. $3.65
c. Not Enough Information; we would need to know the variation in the sample of gallon gasoline prices
d. Not Enough Information; we would need to know the variation in the population of gallon gasoline prices
13. A quiz consists of 9 True/False questions. Assume that the questions are independent. In addition, assume that (T) and (F) are equally likely
outcomes when guessing on any one of the questions. What is the probability of guessing on each of the 9 quiz questions and getting more than one
of the True/False questions wrong? Round to 3 decimal places.
a. 0.998 b. 0.018 c. 0.020 d. 0.980
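Question 13 can be treated as a binomial problem with 9 trials and a 0.5 chance of a wrong guess on each. A short sketch using the complement rule follows; the helper `binom_pmf` is a hypothetical name.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for a binomial random variable with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p_wrong = 9, 0.5
# P(more than one wrong) = 1 - P(0 wrong) - P(1 wrong)
p_more_than_one_wrong = 1 - binom_pmf(0, n, p_wrong) - binom_pmf(1, n, p_wrong)
print(f"{p_more_than_one_wrong:.3f}")
```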
14. Five students take AP Calculus AB one year and AP Calculus BC the next year. Their overall course grades (%) are listed below for both
courses. Which of the following statistical procedures would be most appropriate to test the claim that student overall course grades are the same in
both courses? Assume that any necessary normality requirements hold.
Student 1 2 3 4 5
AP Cal AB 80.0% 72.6% 99.0% 91.3% 68.9%
AP Cal BC 85.5% 71.0% 93.2% 93.0% 74.8%
a. Two-tailed two-sample paired/dependent t-test of means
b. Two-tailed two-sample independent t-test of means
c. Two-tailed two-sample independent z-test of means
d. One-tailed two-sample z-test of proportions
15. Referring to the setting and data provided in Question 14 above, what is the test statistic for testing the claim that student overall course grades
are the same in both courses? Round to 3 decimal places.
a. -0.516
b. -0.157
c. 4.306
d. Not Enough Information; we would need to know the variation in the population
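The paired test statistic in Question 15 can be sketched as a one-sample t statistic on the differences. The code below assumes the differences are taken as AB minus BC; reversing the order only flips the sign.

```python
from math import sqrt
from statistics import mean, stdev

ab = [80.0, 72.6, 99.0, 91.3, 68.9]   # AP Calculus AB grades (%)
bc = [85.5, 71.0, 93.2, 93.0, 74.8]   # AP Calculus BC grades (%)

diffs = [a - b for a, b in zip(ab, bc)]             # paired differences, AB - BC
t_stat = mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))
print(f"t = {t_stat:.3f} with {len(diffs) - 1} degrees of freedom")
```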
16. The histogram to the right represents the hospital length of stay (in days) for patients at a nearby medical facility. How many patients are
included in the histogram?
a. 5 b. 21 c. 17 d. 9
17. Using the histogram to the right that represents the hospital lengths of stay (in days) for patients at a nearby medical facility, determine the
relationship between the mean and the median.

a. Mean = Median b. Mean ≈ Median c. Mean < Median d. Mean > Median
18. Refer to the discrete probability distribution provided in the table below.
X=x 0 1 2 3 4
P(X = x) 0.040 0.110 0.450 0.230 ?

Find the probability that x is equal to 0 or 4. Round to 3 decimal places.


a. 0.040 b. 0.210 c. 0.007 d. 1.000
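For Question 18, the missing probability follows from the requirement that a discrete distribution sums to 1. A minimal sketch, with illustrative variable names:

```python
# Known probabilities from the table; P(X = 4) is missing
known = {0: 0.040, 1: 0.110, 2: 0.450, 3: 0.230}

p_4 = 1 - sum(known.values())    # probabilities must total 1
p_0_or_4 = known[0] + p_4        # X = 0 and X = 4 are mutually exclusive
print(f"P(X = 4) = {p_4:.3f}, P(X = 0 or X = 4) = {p_0_or_4:.3f}")
```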
19. Green sea turtles have normally distributed weights, measured in kilograms, with a mean of 134.5 and a variance of 49.0. A particular green sea
turtle’s weight has a z-score of -2.4. What is the weight of this green sea turtle? Round to the nearest whole number.
a. 17 kg b. 151 kg c. 118 kg d. 252 kg
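Question 19 reverses the z-score formula: x = mean + z × standard deviation, where the standard deviation is the square root of the stated variance. A brief sketch:

```python
from math import sqrt

mean_kg = 134.5
variance = 49.0
z = -2.4

sd_kg = sqrt(variance)            # standard deviation is the square root of the variance
weight = mean_kg + z * sd_kg      # invert z = (x - mean) / sd
print(round(weight))
```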
20. What percentage of measurements in a dataset fall above the median?
a. 49% b. 50% c. 51% d. Cannot Be Determined
21. Which of the following exam scores is better relative to other students enrolled in the course?
• A psychology exam grade of 85; the mean grade for the psychology exam is 92 with a standard deviation of 3.5
• An economics exam grade of 67; the mean grade for the economics exam is 79 with a standard deviation of 8
• A chemistry exam grade of 62; the mean grade for the chemistry exam is 62 with a standard deviation of 5
a. The psychology exam score is relatively better
b. The economics exam score is relatively better
c. The chemistry exam score is relatively better
d. All of the exam scores are relatively equivalent
22. The statement “If there is sufficient evidence to reject a null hypothesis at the 10% significance level, then there is sufficient evidence to reject
it at the 5% significance level” is: Please select the best answer of those provided below.
a. Always True
b. Never True
c. Sometimes True; the p-value for the statistical test needs to be provided for a conclusion
d. Not Enough Information; this would depend on the type of statistical test used
23. Assuming weights of female athletes are normally distributed with a mean of 140 lbs and a standard deviation of 15 lbs, what is the probability
that a randomly selected female athlete weighs more than 170 lbs? Round to 3 decimal places. Also, is the probability above the same as the
probability that a randomly selected sample of size 𝑛 (where 𝑛 > 1) has a mean weight more than 170 lbs?
a. 0.023; yes, these two probabilities would be the same
b. 0.023; no, these two probabilities would not be the same
c. 0.977; yes, these two probabilities would be the same
d. 0.977; no, these two probabilities would not be the same
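The two probabilities in Question 23 can be compared numerically. The sketch below uses a sample size of n = 4 purely as a hypothetical example, since the question leaves n unspecified; the sample mean has standard deviation σ/√n.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 140, 15     # population mean and standard deviation (lbs)
cutoff = 170

# Single randomly selected athlete
p_single = 1 - NormalDist(mu, sigma).cdf(cutoff)

# Mean of a sample of size n (n = 4 is an assumed value for illustration only)
n = 4
p_mean = 1 - NormalDist(mu, sigma / sqrt(n)).cdf(cutoff)

print(f"single athlete: {p_single:.3f}, sample mean (n = {n}): {p_mean:.6f}")
```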
For Questions 24–25, refer to the relevant results from a regression analysis provided below. A simple random sample of 5k race times for 32
competitive male runners aged 15-24 years old resulted in a mean 5k race time of 16.79 minutes. The simple linear regression equation that fit the
sample data was obtained and found to be 𝑦̂ = 21.506 − 0.276𝑥 where 𝑥 represents the age of the runner in years and 𝑦 represents the 5k race time
for a competitive male runner in minutes. When testing the claim that there is a linear correlation between age and 5k race times of competitive
male runners, an observed test statistic of (𝑡 = −7.87) resulted in an approximate p-value of 0.0001.
24. The proportion of variation in 5k race times that can be explained by the variation in the age of competitive male runners was approximately
0.663. What is the value of the sample linear correlation coefficient? Round to 3 decimal places.
a. 0.663 b. 0.814 c. -0.814 d. 0.440
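In simple linear regression, the correlation coefficient is the square root of the coefficient of determination, carrying the sign of the slope. A small sketch for Question 24, using the slope from the stated regression equation:

```python
from math import copysign, sqrt

r_squared = 0.663     # proportion of variation explained
slope = -0.276        # slope from the fitted equation y-hat = 21.506 - 0.276x

# r has the same sign as the slope
r = copysign(sqrt(r_squared), slope)
print(f"{r:.3f}")
```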
25. Using all of the results provided, is it reasonable to predict the 5k race time (minutes) of a competitive male runner 73 years of age?
a. Yes; linear correlation between age and 5k race times is statistically significant
b. Yes; both the sample linear regression equation and an age in years is provided
c. No; linear correlation between age and 5k race times is not statistically significant
d. No; the age provided is beyond the scope of our available sample data
26. US Air has a flight to New York for $160 and United Airlines has a flight to New York for $200. Which statement is true?
a. The cost of the United flight is 25% more than US Air.
b. The cost of the US Air flight is 80% less than the United flight.
c. The cost of the United flight is 120% more than the US Air flight.
d. The cost of the US Air flight is 25% less than United.
27. Suppose you could take all samples of size 64 from a population with a mean of 12 and a standard deviation of 3.2. What would be the standard
deviation of the sample means?
a. 3.2 b. 0.2 c. 0.4 d. 0.3
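The standard deviation of the sample means in Question 27 (the standard error) is σ/√n. As a quick check:

```python
from math import sqrt

sigma = 3.2    # population standard deviation
n = 64         # sample size

std_error = sigma / sqrt(n)    # standard deviation of the sample means
print(std_error)
```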
28. In a normally distributed variable, a value x∗ is considered unusually large if
a. P(x ≤ x∗) < 0.05 b. P(x ≥ x∗) < 0.05 c. P(x ≤ x∗) > 0.05 d. P(x ≥ x∗) > 0.05
29. A hypothesis test is conducted and the P-value of the test statistic is 0.02. Three of the following statements are valid. Which statement is not
valid?
a. It is not very likely that the extremeness of the test statistic is due to chance.
b. Assuming the null hypothesis is true, there is a 2% chance of getting a more extreme test statistic.
c. There is a 2% chance that the null hypothesis is false.
d. At the 0.05 significance level, you would reject the null hypothesis.
30. Which of the following is NOT required of a binomial distribution?
a. Each trial has exactly two outcomes.
b. There is a fixed number of trials.
c. The probability of success remains fixed for all trials.
d. There are more than 30 trials.
31. The mean = np and the standard deviation = √(npq) for:
a. all probability distributions.
b. normal distributions.
c. binomial distributions.
d. none of the above
32. The t distribution should be used to build confidence interval estimates of a population mean when the population standard deviation is
a. small. b. 30 or more. c. too large to fit the normal distribution. d. unknown.
33. Which of the following is NOT a characteristic of the sampling distribution of sample means (the means of all possible samples of a given
size)?
a. It is normally distributed.
b. It is centered on the population mean.
c. It has a standard deviation (σx̄) which is larger than the population standard deviation (σ).
d. It has a standard deviation (σx̄) which is smaller than the population standard deviation (σ).
34. Which of the following is a possible alternative hypothesis H1 for a two-tailed test?
a. µ < 30 b. µ ≠ 30 c. µ = 30 d. µ > 30
35. Suppose that at the 95% confidence level we calculate a confidence interval described by 43.8 < µ < 46.2. Which of the following statements
cannot be made about this result?
a. The sample mean is 45.
b. The population mean is 45.
c. The margin of error is 1.2.
d. We are 95% confident that the population mean lies between 43.8 and 46.2.
36. Inferential statistics is so named because it allows us to examine a sample and make inferences about
a. another sample. b. an element of the sample. c. the population from which the sample was taken. d. none of the above.
37. If the P-value of a given test statistic is 0.03 then,
a. It is unlikely that the extremeness of the test statistic is due to chance.
b. Assuming the null hypothesis is true, there is a 3% chance of getting a more extreme test statistic.
c. At the 0.05 significance level, you would reject the null hypothesis.
d. All of the above are viable conclusions.
38. If a researcher takes a large enough sample, he/she will almost always obtain:
a. virtually significant results b. practically significant results c. consequentially significant results d. statistically significant results
39. The rejection region for testing 𝐻𝑜: 𝜇 = 100 𝑣𝑠 𝐻𝑎: 𝜇 ≠ 100, at the 0.05 level of significance is:
a. |𝑧| < 0.95 b. |𝑧| > 1.96 c. z > 1.65 d. z < 2.333
40. The owner of a local nightclub has recently surveyed a random sample of n = 300 customers of the club. She would now like to
determine whether or not the mean age of her customers is over 35. If so, she plans to alter the entertainment to appeal to an older
crowd. If not, no entertainment changes will be made. Suppose she found that the sample mean was 35.5 years and the population
standard deviation was 5 years. What is the p-value associated with the test statistic?
a. 0.0416 b. 0.9572 c. 0.0421 d. 0.0836
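Question 40 describes a right-tailed one-sample z-test with a known population standard deviation. The sketch below assumes the hypotheses H0: µ = 35 versus H1: µ > 35, which is how the owner's question reads; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n = 300
sample_mean = 35.5
mu_0 = 35.0       # hypothesized mean age under H0 (assumed from the question wording)
sigma = 5.0       # population standard deviation

z = (sample_mean - mu_0) / (sigma / sqrt(n))
p_value = 1 - NormalDist().cdf(z)     # right tail for H1: mu > 35
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```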
41. The strength (degree) of the correlation between a set of independent variables X and a dependent variable Y is measured by
a. Coefficient of Correlation b. Coefficient of Determination c. Standard error of estimate d. All of the above
42. The percent of total variation of the dependent variable Y explained by the set of independent variables X is measured by
a. Coefficient of Correlation b. Coefficient of Skewness c. Coefficient of Determination d. Standard Error of Estimate
42. A coefficient of correlation computed to be -0.95 means that
a. The relationship between two variables is weak
b. The relationship between two variables is strong and positive
c. The relationship between two variables is strong but negative
d. Correlation coefficient cannot have this value
43. Suppose the coefficient of determination is computed to be 0.39 in a problem involving one independent variable and one dependent
variable. This result means that
a. The relationship between two variables is negative
b. The correlation coefficient is 0.39 also
c. 39% of the total variation is explained by the independent variable
d. 39% of the total variation is explained by the dependent variable
44. A soft drink dispenser can be adjusted to deliver any fixed number of ounces of soft drink. If the machine is operating with a
standard deviation in delivery equal to 0.3 ounces, what should be the mean setting so that a 12-ounce cup will overflow less than 1%
of the time? Assume a normal distribution for ounces delivered.
a. 11.23 ounces b. 11.30 ounces c. 11.70 ounces d. 12.70 ounces
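The dispenser setting in Question 44 can be found by placing the 12-ounce cup size at the 99th percentile of the delivery distribution, so that only 1% of deliveries exceed it. A sketch under that reading, with illustrative variable names:

```python
from statistics import NormalDist

cup_size = 12.0        # ounces
sigma = 0.3            # standard deviation of delivery, ounces
max_overflow = 0.01    # allowed overflow probability

# z such that P(Z > z) = 0.01, then solve cup_size = mean + z * sigma for the mean
z = NormalDist().inv_cdf(1 - max_overflow)
mean_setting = cup_size - z * sigma

# Check: overflow probability at this boundary setting
overflow = 1 - NormalDist(mean_setting, sigma).cdf(cup_size)
print(f"mean setting = {mean_setting:.2f} ounces, overflow probability = {overflow:.3f}")
```

Any setting below this boundary value keeps the overflow probability under 1%.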
45. Relationship between correlation coefficient and coefficient of determination is that
a. both are unrelated
b. The coefficient of determination is the coefficient of correlation squared
c. The coefficient of determination is the square root of the coefficient of correlation
d. both are equal
46. Multicollinearity exists when
a. Independent variables are correlated less than -0.70 or more than 0.70
b. An independent variable is strongly correlated with a dependent variable
c. There is only one independent variable
d. The relationship between dependent and independent variable is non-linear
47. If “time” is used as the independent variable in a simple linear regression analysis, then which of the following assumptions could
be violated?
a. There is a linear relationship between the independent and dependent variables
b. The residual variation is the same for all fitted values of Y
c. The residuals are normally distributed
d. Successive observations of the dependent variable are uncorrelated
48. In multiple regression, when the global test of significance is rejected, we can conclude that
a. All of the net sample regression coefficients are equal to zero
b. All of the sample regression coefficients are not equal to zero
c. At least one sample regression coefficient is not equal to zero
d. The regression equation intersects the Y-axis at zero.
49. A residual is defined as
a. Y − Ŷ b. Error sum of squares c. Regression sum of squares d. Type I Error
50. What test statistic is used for a global test of significance?
a. Z test b. t test c. Chi-square test d. F test

“Trust in the Lord with all your heart and lean NOT on your own understanding” – Proverbs 3:5

_____________________________________End of Exam______________________________

Prepared by:
MARISSA C. APLISE
Special Science Teacher I

Checked by:
WENDELL C. CATAM – ISAN, PhD.
School Head
Multiple Choice Key
1. c
2. d
3. d
4. a
5. d
6. b
7. b
8. d
9. c
10. a
11. b
12. b
13. d
14. a
15. a
16. b
17. d
18. b
19. c
20. d
21. c
22. c
23. b
24. c
25. d
26. a
27. c
28. c
29. c
30. d
31. c
32. d
33. c
34. b
35. b
36. c
37. d
38. d
39. b
40. a
41. d
42. c
43. c
44. c
45. b
46. a
47. d
48. c
49. a
50. d
