Sb-Midterm I

The document contains a series of statistics questions and answers related to various concepts such as data types, probability, sampling methods, and distributions. It includes examples of calculations for probabilities, normal distributions, and sampling techniques. The content appears to be a study guide or notes for a midterm exam in a statistics course.

Uploaded by

doanminhtien279
Copyright
© © All Rights Reserved

SB midterm

Statistics for Business (Trường Đại học Kinh tế Thành phố Hồ Chí Minh)


Student 4:
1. The closing price of a stock is an example of what type of data: nominal, ordinal,
interval, or ratio data?
Ans: Ratio data, because the values can be ranked and measured and there is a true zero
2. From the following tree, find the probability that a randomly chosen person will not
get a vaccination and will not get the flu
[Figure: probability tree (not reproduced)]
Ans: 0.6*0.3 = 0.18, multiplying along the branch: P(A∩B) = P(A) · P(B | A)

Student 5:
1. What is the level of measurement for categorical data?
Ans: nominal and ordinal
2. People's weights are normally distributed with a mean of 60 kg and a standard deviation
of 5 kg. What is the probability that a randomly picked person weighs more than 65 kg?
Ans: EXCEL => =1-NORM.DIST(65,60,5,TRUE) = 0.15866
Or standardize: z = (65-60)/5 = 1, so P(Z > 1) = 1 - 0.8413 = 0.1587
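
The same tail probability can be checked outside Excel. This is a minimal sketch, assuming Python's standard-library statistics.NormalDist as a stand-in for NORM.DIST; the variable names are illustrative and not from the notes.

```python
from statistics import NormalDist

# Weights ~ Normal(mean 60 kg, standard deviation 5 kg), as in the question
weights = NormalDist(mu=60, sigma=5)

# Direct route: upper tail = 1 - cumulative probability up to 65 kg
p_more_than_65 = 1 - weights.cdf(65)

# Standardized route: z = (x - mean) / sd, then use the standard normal
z = (65 - 60) / 5
p_via_z = 1 - NormalDist().cdf(z)

print(round(p_more_than_65, 5))  # 0.15866
```

Both routes give the same number, because standardizing only rescales the axis.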

Student 7:
1. An Internet survey posted on a website is an example of which sampling method?
Ans: Convenience sampling (respondents select themselves, so the sample is not random)
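2. On average there are 16 defects per 100 meters of electric wire. What is the
probability of finding exactly 2 defects in a randomly chosen 10-meter length?
Ans: Poisson (occurrences per unit of length), not Hypergeometric
+ lambda: 16 * (10/100) = 1.6 defects per 10 meters
+ x: 2
P(X = 2) = e^(-1.6) * 1.6^2 / 2!
EXCEL => =POISSON.DIST(2,1.6,FALSE)
=0.2584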
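
Defects counted along a length of wire fit the Poisson model (occurrences per unit of space), with the rate scaled to the 10-meter piece. A small sketch under that assumption follows; poisson_pmf is a helper written for this illustration, not a library function.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with mean lam."""
    return exp(-lam) * lam**k / factorial(k)

# 16 defects per 100 m scales to 1.6 defects per 10 m
lam = 16 * (10 / 100)

p_two = poisson_pmf(2, lam)
print(round(p_two, 4))  # 0.2584
```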

Student 8:
1. Which model best describes the number of births in a hospital until the first twins are
delivered?
Ans: Geometric distribution (it describes the number of trials until the first success,
here the first delivery of twins)

2. Scores are normally distributed with a mean of 400 and standard deviation of 50. The
top 10 percent of the applicants would have a score of at least what value?
Ans: About 464 (z = 1.28 for the 90th percentile, so 400 + 1.28*50 = 464)
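
A quick way to check a cutoff score like this one is the inverse normal CDF. The sketch below assumes Python's statistics.NormalDist.inv_cdf in place of a z table; the names are illustrative.

```python
from statistics import NormalDist

scores = NormalDist(mu=400, sigma=50)

# Top 10 percent means 90 percent of applicants score below the cutoff
cutoff = scores.inv_cdf(0.90)

# Same thing by hand: mean + z * sd, with z the 90th percentile of N(0,1)
z_90 = NormalDist().inv_cdf(0.90)
cutoff_by_hand = 400 + z_90 * 50

print(round(cutoff, 1))  # 464.1
```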


Student 9:
1. What is the weakness of the mode?
Ans: It doesn't take all the scores in the data set into consideration (the mode reflects
only the most frequently occurring value in the raw data and ignores all the other
scores)
2. Can we find the probability of events A or B occurring by summing their probabilities?
Ans: Only if A and B are mutually exclusive: then P(A or B) = P(A) + P(B). If they are
not mutually exclusive we have to subtract the intersection:
P(A or B) = P(A) + P(B) - P(A and B)

Student 10:
1. Using a sample to make generalizations about an aspect of a population is called
what kind of statistics?
Ans: Inferential statistics (inferential statistics use measurements from the sample of
subjects in the experiment to compare the treatment groups and make
generalizations about the larger population of subjects.) (2 types: Estimation,
Hypothesis Testing)
2. Scores are normally distributed with a mean of 460 and standard deviation of 80.
What fraction of the applicants would you expect to have a score of 400 or above?
Ans: P(X >= 400) = P(Z >= -0.75) = 0.7734

Student 11:
1. Your telephone area code is an example of a(n) ___ variable.
Ans: Nominal (can’t compare, no ordering)
2. The figure shows a standard normal N(0,1) distribution. Find the z value for the
shaded area

Ans: -1.75

Student 12:
1. The daily closing price of Sacombank stock over the past month is an example of
which data?
Ans: Time series data (the observations are collected at equally spaced points in time
over the past month)
2. When you send out a resume the probability of being called for an interview is .20.
What is the probability that you get your first interview within the first five resumes
that you send out?
Ans: Geometric distribution. "Within the first five" means X <= 5:
P(X <= 5) = 1 - (1-0.2)^5 = 1 - 0.32768 = 0.67232
(P(X = 5) = 0.2 * (1-0.2)^(5-1) = 0.08192 is only the probability that the first
interview comes exactly on the fifth resume)
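
For the resume question it helps to separate "exactly on the fifth resume" from "within the first five". The sketch below computes both for a geometric variable with p = 0.20; the variable names are illustrative.

```python
p = 0.20  # probability that any one resume leads to an interview

# First interview exactly on the 5th resume: four failures, then a success
p_exactly_fifth = p * (1 - p) ** 4

# First interview within the first five: complement of five failures in a row
p_within_five = 1 - (1 - p) ** 5

# Cross-check: add up "exactly on resume k" for k = 1..5
p_summed = sum(p * (1 - p) ** (k - 1) for k in range(1, 6))

print(round(p_exactly_fifth, 5), round(p_within_five, 5))  # 0.08192 0.67232
```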

Student 14:
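1. Could we eliminate sampling error by increasing the sample size?
Ans: No. As the sample size increases, the sample gets closer to the actual
population, so sampling error shrinks, but it disappears only if we measure the whole
population (a census). A sample that is too small is more easily unrepresentative.
2. Is the median halfway between Q1 and Q3 on a box plot?
Ans: Not necessarily. The median is Q2 and always lies between Q1 and Q3, but it is
exactly halfway only when the middle of the data is symmetric. In skewed data it sits
closer to one of the quartiles.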

Student 15:
1. What is the advantage of stratified sampling?
Ans: It divides the population into groups (strata) that share a common characteristic
and takes a random sample from each group, so every subgroup is sure to be
represented in the sample and we have better control over how each one is
represented.

2. The lengths of fish caught in a certain river are normally distributed with a mean of
40 cm and a standard deviation of 5 cm. What proportion of fish caught will be between
30 and 50 cm in length?
Ans: EXCEL => =NORM.DIST(50,40,5,TRUE)-NORM.DIST(30,40,5,TRUE) = 0.9545
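
The "between two values" pattern is a difference of two cumulative probabilities. A minimal sketch, again assuming statistics.NormalDist as a substitute for NORM.DIST:

```python
from statistics import NormalDist

lengths = NormalDist(mu=40, sigma=5)  # fish length in cm

# P(30 < X < 50) = P(X < 50) - P(X < 30)
p_between = lengths.cdf(50) - lengths.cdf(30)

print(round(p_between, 4))  # 0.9545
```

Since 30 and 50 are the mean minus and plus two standard deviations, this is the "about 95%" band of the Empirical Rule.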

Student 17:
1. A manager chose two people from her team of eight to give an oral presentation
because she felt they were representative of the whole team’s views. What sampling
technique is this?
Ans: Judgment sampling, because she picked people she considered typical to represent
the whole team’s views
2. Scores are normally distributed with a mean of 460 and standard deviation of 80.
What fraction of the applicants would you expect to have a score of 400 or above?
Ans: Z = (400-460)/80 = -0.75; from the table, P(Z >= -0.75) = 0.7734

Student 18:
1. What sampling technique is used when groups are defined by their geographical
location?
Ans: Cluster sampling (because the groups, or clusters, are defined by geographic
location)
2. If arrivals occur at a mean rate of 3.6 events per hour, what is the probability of
waiting more than 30 minutes for the next arrival?
Ans: Exponential distribution: P(X > x) = e^(-lambda*x) => e^(-3.6*0.5) = e^(-1.8) = 0.1653
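
The waiting-time answer can be reproduced in a couple of lines. This sketch assumes the exponential model for time between arrivals and converts 30 minutes to hours so the units match the rate; the names are illustrative.

```python
from math import exp

rate_per_hour = 3.6   # mean arrivals per hour (lambda)
wait_hours = 30 / 60  # 30 minutes expressed in hours

# Exponential survival function: P(wait > x) = e^(-lambda * x)
p_wait_longer = exp(-rate_per_hour * wait_hours)

print(round(p_wait_longer, 4))  # 0.1653
```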

Student 20:
1. Compared to dot plot, we lose some detail when we present data in a frequency
distribution. Is that correct?
Ans: Yes, because we can no longer see the exact value of each observation, only the
interval it falls in.
2. The probability is 0.80 that a standard normal variable is between -z and +z. What
is the value of z?
Ans: Each tail holds 0.10, so P(Z < z) = 0.90; with mean = 0, SD = 1 → z = 1.28
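
The step from a central area of 0.80 to a cumulative area of 0.90 is where this question usually goes wrong. The sketch below spells it out, assuming statistics.NormalDist for the inverse lookup.

```python
from statistics import NormalDist

central_area = 0.80

# The remaining 0.20 is split evenly between the two tails,
# so the area to the left of +z is 0.5 + central_area / 2 = 0.90
left_area = 0.5 + central_area / 2
z = NormalDist().inv_cdf(left_area)

print(round(z, 2))  # 1.28
```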

Student 21:
1. Log scales are common because most people are familiar with them?
Ans: No. Most people are not familiar with log scales, so they are rarely used for a
general audience. They are used mainly for financial time series or for data that grow
rapidly (e.g., revenues for a start-up company)

2. Scores are normally distributed with a mean of 460 and a standard deviation of 80.
The top 5 percent of the applicants would have a score of at least what value?
Ans: 591.6 (z = 1.645 for the 95th percentile, so 460 + 1.645*80 = 591.6)

Student 22:
1. Could we use a column chart instead of a line chart for time series data?
Ans: Yes, but we prefer a line chart, because with a large number of data points the
bars become crowded and difficult to analyze
2. Given the contingency table shown here, find P(A or M)
[Table: contingency table (not reproduced)]
Ans: = P(A) + P(M) - P(A and M) = 100/200 + 50/200 - 25/200 = 0.625

Student 29:
1. In a histogram, the height of a bar represents what?
Ans: The frequency (or relative frequency) of the observations in that interval, which
is what the vertical axis shows
2. In the standard normal distribution, the probability between z = 1.00 and z = 1.15 is
higher or lower than the probability between z = 2.00 and z = 2.15?
Ans: The probability between z = 1.00 and z = 1.15 is higher than the probability
between z = 2.00 and z = 2.15. Look at the figure, or read the values from the table and
subtract: 0.8749 - 0.8413 = 0.0336 versus 0.9842 - 0.9772 = 0.0070.
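
The comparison of the two equal-width z intervals can also be done numerically. A short sketch, with area_between as a helper defined only for this illustration:

```python
from statistics import NormalDist

std_normal = NormalDist()  # N(0, 1)

def area_between(z_low, z_high):
    """Probability that a standard normal value lies between z_low and z_high."""
    return std_normal.cdf(z_high) - std_normal.cdf(z_low)

near_center = area_between(1.00, 1.15)
far_tail = area_between(2.00, 2.15)

print(near_center > far_tail)  # True
```

Both intervals have the same width, but the curve is taller near the center, so the interval closer to zero holds more area.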

Student 30:
1. A population is of size 200 observations. When the data are represented in a relative
frequency distribution, the relative frequency of a given interval is 0.15. What is the
frequency in this interval?

Ans: 200 * 0.15 = 30


2. If two events are collectively exhaustive, what is the probability that one or the other
will occur?
Ans: If two events are collectively exhaustive, it means that the two events describe
every possible outcome. There are no other possibilities. So, the probability that one
of the two events occurs is 1. (one of the two always occurs)

Student 32:
1. One benefit of the box plot is that it clearly displays the standard deviation. Is that
correct?
Ans: Incorrect. A box plot shows five numbers: min, Q1, median, Q3, max. It does not
display the standard deviation
2. Nam scored 85 in an exam (Q1 = 40 and Q3 = 60). Based on the fences, is this an
outlier? (85 is on the high side, so check the upper fence)
Ans: Interquartile range = Q3 - Q1 = 20
Upper inner fence = Q3 + 1.5*20 = 90
85 < 90, so the score is not an outlier
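
The fence check for Nam's score can be written out step by step. The sketch below uses the usual 1.5 × IQR rule for the inner fences; the lower fence is included only for completeness and the variable names are illustrative.

```python
q1, q3 = 40, 60
score = 85

iqr = q3 - q1                      # interquartile range = 20
upper_inner = q3 + 1.5 * iqr       # 90.0
lower_inner = q1 - 1.5 * iqr       # 10.0

# A value is flagged as an outlier only if it falls outside the inner fences
is_outlier = score > upper_inner or score < lower_inner

print(upper_inner, is_outlier)  # 90.0 False
```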

Student 33:
1. What shape of distribution allows us to apply the Empirical Rule?
Ans: The Empirical Rule is used for a normal distribution (bell curve)
(Chebyshev’s theorem: any distribution, no particular shape required)

2. If P(A) = 0.50, P(B) = 0.30, and P(A and B) = 0.15, are A and B independent events?
Ans: Yes.
Independent Events: P(A | B) = P(A)
P(A|B)=P(A and B)/P(B)=0.15/0.3=0.5 = P(A)
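
The independence check can be done two equivalent ways: compare P(A | B) with P(A), or compare P(A) · P(B) with P(A and B). A minimal sketch, using math.isclose so floating-point rounding does not affect the comparison:

```python
from math import isclose

p_a, p_b, p_a_and_b = 0.50, 0.30, 0.15

# Conditional probability of A given B
p_a_given_b = p_a_and_b / p_b

# Two equivalent tests for independence
same_conditional = isclose(p_a_given_b, p_a)
product_rule_holds = isclose(p_a * p_b, p_a_and_b)

print(same_conditional, product_rule_holds)  # True True
```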

Student 34:
1. Given the data set 2, 5, 10, 6, 3, what is the median value?
Ans: Sorted: 2, 3, 5, 6, 10 => the middle value is 5
2. If A and B are mutually exclusive events, then P(A and B) = P(A) + P(B). Is that
correct?
Ans: No. P(A and B) = 0, because the two events cannot occur simultaneously. It is
P(A or B) that equals P(A) + P(B).

Student 35:
1. If samples are from a normal distribution with mean = 100, standard deviation = 10,
what percentage of the data is within 90 to 110?
Ans: 0.6827, about 68% (90 to 110 is the mean ± 1 standard deviation)

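2. If events A and B are mutually exclusive, then P(A) + P(B) = 0. Is that correct?
Ans: No, not as stated. Mutually exclusive means P(A and B) = 0, because the two events
cannot occur simultaneously. P(A) + P(B) equals P(A or B), which is not 0 unless neither
event can occur.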

Student 36:
1. The Empirical Rule can be applied to any distribution, unlike Chebyshev’s theorem.
Is that correct?
Ans: Incorrect. It is the other way around: Chebyshev’s theorem can be applied to any
distribution, while the Empirical Rule requires a bell-shaped distribution
2.
Student 37:
1. If the standard deviations of two samples are the same, so are their coefficients of
variation. Is that correct?
Ans: Incorrect. CV = (S/Mean) * 100%, so it also depends on the mean: equal standard
deviations give equal CVs only if the means are equal too
2. On average, a major earthquake (Richter scale 6.0 or above) occurs three times a
decade in a certain California county. Find the probability that at least one major
earthquake will occur within the next decade.
Ans: Poisson (Mega), with lambda = 3 per decade:
P(X >= 1) = 1 - P(X = 0) = 1 - e^(-3) = 0.9502
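
For "at least one" questions it is easier to work with the complement. A small sketch for the earthquake problem, assuming a Poisson model with a mean of 3 events per decade:

```python
from math import exp

lam = 3  # mean number of major earthquakes per decade

# P(X = 0) for a Poisson variable is e^(-lambda)
p_none = exp(-lam)

# At least one = everything except zero
p_at_least_one = 1 - p_none

print(round(p_at_least_one, 4))  # 0.9502
```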

Approximation

Binomial: n repeated trials, and we want the probability of a number of successes
(sampling with replacement)

+ Hypergeometric: can be treated as Binomial when n/N < 0.05


+ Poisson: can be used to approximate the Binomial when the probability of success
(pi) <= 0.05 and n >= 20, with lambda = n*pi

+ Hypergeometric: selecting from a finite population without replacement (the problem
gives 4 numbers)


+ Poisson: number of occurrences within a randomly chosen unit of time or space (the
problem gives 2 numbers)

Normal approximation to Binomial: mean = n*pi, SD = sqrt(n*pi*(1-pi)), when n*pi >= 10 and
n*(1-pi) >= 10
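
To see how the normal approximation to the Binomial behaves, the sketch below compares it with the exact binomial probability. The values n = 100 and pi = 0.20 are hypothetical, chosen only so that both conditions hold, and the +0.5 continuity correction is an added assumption not mentioned in the notes.

```python
from math import comb, sqrt
from statistics import NormalDist

n, pi = 100, 0.20  # hypothetical example values, not from the notes

# Rule-of-thumb conditions for using the normal approximation
conditions_ok = n * pi >= 10 and n * (1 - pi) >= 10

mean = n * pi
sd = sqrt(n * pi * (1 - pi))

# Exact binomial P(X <= 25): sum the pmf for k = 0..25
exact = sum(comb(n, k) * pi**k * (1 - pi) ** (n - k) for k in range(26))

# Normal approximation with continuity correction (assumption: use 25.5)
approx = NormalDist(mean, sd).cdf(25 + 0.5)

print(conditions_ok, round(exact, 3), round(approx, 3))
```

With these values the two probabilities should agree to about two decimal places, which is the level of accuracy the approximation is normally used for.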
