Dsme5110f HW3
Homework 3
1. In basketball, a player who is fouled in the act of shooting gets to shoot two free throws. Suppose we
hear that one player is an “85% free throw shooter.”
a. If this player is fouled 25 times in the act of shooting (maybe over a period of several games), find
the distribution of occasions where he makes both free throws. That is, if X is the number of times
he makes both free throws, find P(X = k) for each k from 0 to 25.
t = 0.85*0.85 #probability of making both free throws, assuming the two shots are independent
x = 0:25
px = dbinom(x,25,t)
dis = data.frame(x,px)
x px
0 1.206632e-14
1 7.853976e-13
2 2.453837e-11
3 4.898094e-10
4 7.013982e-09
5 7.669884e-08
6 6.656446e-07
7 4.704060e-06
8 2.755689e-05
9 1.355225e-04
10 5.645551e-04
11 2.004379e-03
12 6.088376e-03
13 1.585172e-02
14 3.537565e-02
15 6.754305e-02
16 1.099096e-01
17 1.514970e-01
18 1.753058e-01
19 1.681573e-01
20 1.313445e-01
21 8.142117e-02
22 3.854336e-02
23 1.308934e-02
24 2.839955e-03
25 2.957647e-04
b. How likely is it that he will make both free throws on at least 20 of the 25 occasions?
1 - pbinom(19,25,t) #0.2675341
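As a sanity check on parts (a) and (b), the sketch below recomputes the distribution and confirms that the 26 probabilities sum to 1 and that the upper tail from part (b) equals the sum of the table entries for k = 20 to 25. The names p_both, tail_direct, and tail_summed are illustrative and not part of the original solution; the calculation assumes the two free throws are independent.

```r
# Probability of making both free throws on one occasion (assumes independent shots)
p_both <- 0.85 * 0.85
k <- 0:25
px <- dbinom(k, size = 25, prob = p_both)

# The 26 probabilities should sum to 1
total <- sum(px)

# P(X >= 20) two ways: complement of the CDF, and direct sum of the pmf
tail_direct <- 1 - pbinom(19, size = 25, prob = p_both)
tail_summed <- sum(px[k >= 20])

print(c(total = total, tail_direct = tail_direct, tail_summed = tail_summed))
```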
2. A family is considering a move from a midwestern city to a city in California. The distribution of housing
costs where the family currently lives is normal, with mean $105,000 and standard deviation $18,200.
The distribution of housing costs in the California city is normal with mean $235,000 and standard
deviation $30,400. The family’s current house is valued at $110,000.
a. What percentage of houses in the family’s current city cost less than theirs?
p_midw <- pnorm(110000, mean = 105000, sd = 18200) #0.6082363 or 60.82%
b. If the family buys a $200,000 house in the new city, what percentage of houses there will cost less
than theirs?
pnorm(200000, mean = 235000, sd = 30400) #0.1248012 or 12.48%
c. What price house will the family need to buy to be in the same percentile (of housing costs) in the
new city as they are in the current city?
qnorm(p_midw, mean = 235000, sd = 30400) #$243351.6
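Part (c) can also be read as a z-score problem: two houses sit at the same percentile of their normal distributions when they have the same z-score. The sketch below works this out by hand and compares it with the qnorm/pnorm route; the variable names are illustrative.

```r
# Parameters from the problem statement
mu_mid <- 105000; sd_mid <- 18200   # current (midwestern) city
mu_ca  <- 235000; sd_ca  <- 30400   # California city

# z-score of the family's current house in the current city
z <- (110000 - mu_mid) / sd_mid

# Same z-score mapped onto the California distribution
price_ca <- mu_ca + z * sd_ca

# Should agree with the percentile-based calculation
price_q <- qnorm(pnorm(110000, mu_mid, sd_mid), mu_ca, sd_ca)

print(c(z = z, price_ca = price_ca, price_q = price_q))
```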
3. Suppose that if a presidential election were held today, 53% of all voters would vote for candidate Smith
over candidate Jones. (You can substitute the names of the most recent presidential candidates.) If 2500
voters are sampled randomly, what is the probability that the sample will indicate (correctly) that Smith
is preferred to Jones?
a. Calculate the probability with binomial probability distribution.
1- pbinom(1250, 2500, 0.53) #0.9985728
b. Calculate the probability with normal probability distribution (with μ = np and σ² = np(1 – p)).
mu = 2500*0.53
sdv = sqrt(2500*0.53*(1-0.53)) #standard deviation
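1- pnorm(1250, mean = mu, sd = sdv) #approximately 0.9987; the cutoff is 1250, half of the 2500 voters sampled
c. Compare the answers obtained in (a) and (b) above. Are they close?
Yes. The exact binomial probability (0.9986) and the normal approximation (about 0.9987) differ by only about 0.0001, so the normal distribution approximates the binomial very well for n = 2500.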
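The comparison can be made a little sharper with a continuity correction, which evaluates the normal curve at 1250.5 rather than 1250 because the binomial count is discrete. The sketch below assumes the event of interest is a strict sample majority for Smith (more than 1250 of 2500 votes); the names exact, approx_plain, and approx_cc are illustrative.

```r
n <- 2500; p <- 0.53
mu  <- n * p
sdv <- sqrt(n * p * (1 - p))

# Exact binomial: P(X > 1250), a strict majority of the sample for Smith
exact <- 1 - pbinom(1250, n, p)

# Normal approximation evaluated at the cutoff 1250
approx_plain <- 1 - pnorm(1250, mean = mu, sd = sdv)

# Normal approximation with continuity correction (cutoff 1250.5)
approx_cc <- 1 - pnorm(1250.5, mean = mu, sd = sdv)

print(round(c(exact = exact, plain = approx_plain, corrected = approx_cc), 4))
```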
4. It is known that the distribution of purchase amounts by customers entering a popular retail store is
approximately normal with mean $25 and standard deviation $10.
a. What is the probability that a randomly selected customer spends between $12 and $28 at this
store?
Let X be a random variable that follows a normal distribution with
μ = $25 and σ = $10.
We have to find P(12 ≤ X ≤ 28).
When X = 12, then
Z = (X - μ)/σ = (12 - 25)/10 = -1.3
When X = 28, then
Z = (28 - 25)/10 = 0.3
So P(12 ≤ X ≤ 28) = P(-1.3 ≤ Z ≤ 0.3) = P(-1.3 ≤ Z ≤ 0) + P(0 ≤ Z ≤ 0.3)
= P(0 ≤ Z ≤ 1.3) + P(0 ≤ Z ≤ 0.3)
= 0.4032 + 0.1179 = 0.5211
b. What is the probability that the average amount spent by 16 randomly selected customers is
between $21 and $28?
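Let X̄ be the average amount spent by the 16 customers; its standard error is σ/√n = 10/√16 = 2.5.
When X̄ = 21, then Z = (21 - 25)/(10/√16) = -1.6
When X̄ = 28, then Z = (28 - 25)/(10/4) = 1.2
So P(21 ≤ X̄ ≤ 28) = P(-1.6 ≤ Z ≤ 1.2) = P(-1.6 ≤ Z ≤ 0) + P(0 ≤ Z ≤ 1.2)
= P(0 ≤ Z ≤ 1.6) + P(0 ≤ Z ≤ 1.2) = 0.4452 + 0.3849 = 0.8301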
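The table lookups for Question 4 can be cross-checked with pnorm. This is a sketch using the mean and standard deviation from the problem statement; p_a and p_b are illustrative names.

```r
mu <- 25; sigma <- 10

# Part (a): a single customer, P(12 <= X <= 28)
p_a <- pnorm(28, mu, sigma) - pnorm(12, mu, sigma)

# Part (b): the mean of n = 16 customers has standard error sigma / sqrt(n)
n  <- 16
se <- sigma / sqrt(n)
p_b <- pnorm(28, mu, se) - pnorm(21, mu, se)

print(round(c(p_a = p_a, p_b = p_b), 4))
```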
5. A randomly collected sample consists of the following data: 158, 233, 121, 261, 294, 149, 224, 156, 293,
273, 182, 223, 244, 130, 265, 244, 150. Construct a 95% confidence interval for µ.
sample = c(158, 233, 121, 261, 294, 149, 224, 156, 293, 273, 182, 223, 244, 130, 265, 244, 150)
sample
t.test(sample, conf.level = 0.95) #181.6914 - 241.8380
data: sample
t = 14.928, df = 16, p-value = 8.216e-11
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
181.6914 241.8380
sample estimates:
mean of x
211.7647
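The interval reported by t.test can be reproduced by hand as mean ± t × s/√n, with t taken from the t distribution with n - 1 = 16 degrees of freedom. The sketch below uses the name x for the data to avoid masking R's built-in sample() function; that naming is a choice made here, not part of the original solution.

```r
x <- c(158, 233, 121, 261, 294, 149, 224, 156, 293,
       273, 182, 223, 244, 130, 265, 244, 150)

n      <- length(x)               # 17 observations
xbar   <- mean(x)                 # sample mean
se     <- sd(x) / sqrt(n)         # estimated standard error of the mean
t_crit <- qt(0.975, df = n - 1)   # two-sided 95% critical value

ci <- xbar + c(-1, 1) * t_crit * se
print(round(ci, 4))
```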
6. In a sample of 1200 randomly selected voters, 575 of them said that they would vote for Candidate A
while 625 of them indicated that they would vote for Candidate B. There are only two candidates in this
election. Construct a 95% confidence interval to estimate the percentage of votes that Candidate B will
get. Do you have 95% confidence to predict that Candidate B will win the election?
z<- abs(qnorm(0.025)) #z value for 95% confidence
pb = 625/1200 #0.5208, sample proportion of votes for B
pb - z*sqrt(pb*(1-pb)/1200) #0.4925683, lower bound for B's share of votes in the population
pb + z*sqrt(pb*(1-pb)/1200) #0.5490984, upper bound for B's share of votes in the population
#The interval (49.26%, 54.91%) includes values below 50%, so we cannot predict with 95% confidence that Candidate B will win.
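The same calculation can be wrapped in a small helper so the decision rule is explicit: Candidate B can be called the winner at the chosen confidence level only if the whole interval lies above 0.5. The function wald_ci is a hypothetical helper written for this sketch, not a base R function.

```r
# Normal-approximation (Wald) confidence interval for a proportion
wald_ci <- function(successes, n, conf = 0.95) {
  p_hat  <- successes / n
  z      <- qnorm(1 - (1 - conf) / 2)
  margin <- z * sqrt(p_hat * (1 - p_hat) / n)
  c(lower = p_hat - margin, upper = p_hat + margin)
}

ci_b <- wald_ci(625, 1200)

# TRUE means 50% is still a plausible value for B's share of the vote
contains_half <- unname(ci_b["lower"] <= 0.5 && 0.5 <= ci_b["upper"])

print(round(ci_b, 4))
print(contains_half)
```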
7. Suppose that you want to estimate an unknown population mean and the population standard
deviation is known to be σ = 3500.
a. If you want the margin of error to be 500 with 95% confidence, how big a sample size should you
take?
conf.l <- 0.95
alpha <- 1- conf.l
z <- qnorm(1-alpha/2)
PME <- 500
s <- 3500
n = z^2*s^2/PME^2
n #188.2315 or 189
b. If you want the margin of error to be 100 with 99% confidence, how big a sample size should you
take?
conf.l <- 0.99
alpha <- 1- conf.l
z <- qnorm(1-alpha/2)
PME <- 100
s <- 3500
n = z^2*s^2/PME^2
n #8127.748 or 8128
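Both parts of Question 7 use the same formula, n = (z·σ/E)², rounded up to the next whole observation. A minimal sketch with a hypothetical helper, sample_size_mean, that takes the known standard deviation, the desired margin of error, and the confidence level:

```r
# Required sample size for estimating a mean with known sigma
sample_size_mean <- function(sigma, moe, conf) {
  z <- qnorm(1 - (1 - conf) / 2)   # two-sided critical value
  ceiling((z * sigma / moe)^2)     # always round up
}

n_a <- sample_size_mean(sigma = 3500, moe = 500, conf = 0.95)
n_b <- sample_size_mean(sigma = 3500, moe = 100, conf = 0.99)

print(c(n_a = n_a, n_b = n_b))
```

Rounding up rather than to the nearest integer keeps the margin of error at or below the target.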
8. In the coming US presidential election, if a pollster wants to predict the percentage of votes for
President Trump to be within ±1% (margin of error) with 99% confidence, how big a sample size should
he take? How about if he wants the margin of error to be 1.5% with 90% confidence?
conf.l <- 0.99
alpha <- 1- conf.l
z <- qnorm(1-alpha/2)
PME <- 0.01
s <- sqrt(0.5*(1-0.5)) #worst-case standard deviation of a single vote, using p = 0.5
n = z^2*s^2/PME^2
n #16587.24 or 16588
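The second part of the question (margin of error 1.5% with 90% confidence) follows the same steps with a different z value and margin. The sketch below assumes the conservative choice p = 0.5, as in the first part, and uses a hypothetical helper, sample_size_prop; under that assumption the requirement works out to about 3,007 respondents.

```r
# Required sample size for estimating a proportion
sample_size_prop <- function(moe, conf, p = 0.5) {
  z <- qnorm(1 - (1 - conf) / 2)       # two-sided critical value
  ceiling(z^2 * p * (1 - p) / moe^2)   # always round up
}

n_99 <- sample_size_prop(moe = 0.01,  conf = 0.99)  # first scenario
n_90 <- sample_size_prop(moe = 0.015, conf = 0.90)  # second scenario

print(c(n_99 = n_99, n_90 = n_90))
```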
For Questions 9 and 10, please refer to the file “budget_mart.csv”, which contains information of 1000
customers randomly selected from the database of Budget Mart – a large retailing firm. The data has the
following variables:
Age: 1 if 30 or younger, 2 if 31 to 55, 3 if 56 or older
Gender: 1 if male, 2 if female
Married: 1 if married, 2 otherwise
Religion: 1 if Buddhist, 2 if Muslim, 3 if Christian, and 4 Otherwise
Education: 1 if Below High School, 2 if High School, 3 if College, and 4 if Graduate
Salary: Annual salary
Children: Number of children
AmountSpent: Total amount spent on purchases this year
9. Consider only those customers who are Christian. What proportion of them spend more than (≥) $3,000?
Construct a 95% confidence interval to estimate the proportion of Christians who spend more than (≥)
$3,000. Do the same for each of the other three religious groups of customers. Discuss the results.
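budget <- read.csv("budget_mart.csv")
View(budget)
#For Christian (Religion == 3)
mean(budget$AmountSpent >= 3000 & budget$Religion == 3) #0.01919192, share of ALL customers who are Christian and spend >= 3000
pc = mean(budget$AmountSpent[budget$Religion == 3] >= 3000) #proportion among Christians only, as the question asks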
#For Buddhist
#For Muslim
#For Otherwise
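One way to handle all four religious groups at once is to split AmountSpent by Religion and apply the same interval formula to each piece. The sketch below runs on made-up data, since budget_mart.csv is not reproduced here; the data frame toy and the helper prop_ci are hypothetical, and the printed numbers say nothing about the real customers. With the real file, budget would replace toy.

```r
set.seed(1)

# Hypothetical stand-in for budget_mart.csv (NOT the real data)
toy <- data.frame(
  Religion    = sample(1:4, 200, replace = TRUE),
  AmountSpent = round(rexp(200, rate = 1 / 1500))
)

# Wald 95% interval for the proportion spending at least `cutoff`
prop_ci <- function(spent, cutoff = 3000, conf = 0.95) {
  n      <- length(spent)
  p_hat  <- mean(spent >= cutoff)
  z      <- qnorm(1 - (1 - conf) / 2)
  margin <- z * sqrt(p_hat * (1 - p_hat) / n)
  c(n = n, p_hat = p_hat, lower = p_hat - margin, upper = p_hat + margin)
}

# One row per religion code (1 = Buddhist, 2 = Muslim, 3 = Christian, 4 = Otherwise)
by_religion <- t(sapply(split(toy$AmountSpent, toy$Religion), prop_ci))
print(round(by_religion, 4))
```

When a group's proportion is close to 0, the lower bound of this interval can fall below 0; it should then be read as 0.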
10. Let μ1 and μ2 be the mean amount spent by married and unmarried customers, respectively.
a. Construct and interpret a 99% confidence interval for μ1.
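spend_m <- budget$AmountSpent[budget$Married == 1] #married customers (definition assumed from the variable name)
t.test(spend_m, conf.level = 0.99)
data: spend_m
99 percent confidence interval:
1550.832 1791.744
sample estimates:
mean of x
1671.288
We are 99% confident that the mean amount spent by married customers is between $1,550.83 and $1,791.74.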
b. Construct and interpret a 99% confidence interval for μ2.
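spend_um <- budget$AmountSpent[budget$Married == 2] #unmarried customers (definition assumed from the variable name)
t.test(spend_um, conf.level = 0.99)
data: spend_um
99 percent confidence interval:
690.9291 828.9340
sample estimates:
mean of x
759.9316
We are 99% confident that the mean amount spent by unmarried customers is between $690.93 and $828.93.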
c. Based on the results of (a) and (b) above, can you conclude with 99% confidence whether married or
unmarried customers spend more on average?
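Yes. The 99% confidence interval for married customers (1550.832, 1791.744) lies entirely above the interval for unmarried customers (690.9291, 828.9340), and the two do not overlap, so we can conclude that married customers spend more on average.
mean(budget$AmountSpent) #overall average spending is 1216.53, which falls between the two intervals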
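A more direct way to compare the two groups is a single confidence interval for the difference between the married and unmarried means, which t.test produces when given two samples (Welch's unequal-variance version by default). The sketch below uses simulated spending figures, since the real file is not reproduced here; spend_m_toy and spend_um_toy are hypothetical, and their means and spreads are assumptions chosen only to resemble the sample means above.

```r
set.seed(42)

# Simulated spending (NOT the real data); parameters are assumptions
spend_m_toy  <- rnorm(400, mean = 1670, sd = 900)   # "married" group
spend_um_toy <- rnorm(400, mean = 760,  sd = 500)   # "unmarried" group

# Welch two-sample t-test; conf.int is the 99% CI for mean(married) - mean(unmarried)
res     <- t.test(spend_m_toy, spend_um_toy, conf.level = 0.99)
diff_ci <- res$conf.int

print(diff_ci)
```

If the whole interval lies above 0, married customers spend more on average at the stated confidence level.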