Dsme5110f HW3

This document contains the solutions to 10 questions from a business statistics homework assignment. The questions involve calculating probabilities using binomial, normal, and t-distributions, constructing confidence intervals, and analyzing spending data from a sample of customers. Key results include the probability that a basketball player makes both free throws on at least 20 of 25 occasions (26.75%), a 95% confidence interval for the percentage of votes a candidate will receive in an election, and the sample sizes required for various margins of error and confidence levels in polling.

Uploaded by

Shaarang Begani

DSME5110F Business Statistics

Homework 3

Yutaro Goto - 1155138618
XIAOQIN HUANG - 1155154182
Wei Yang - 1155152246
Zhenyu Zhang - 1155155887
Shaarang Begani - 1155155892

1. In basketball, a player who is fouled in the act of shooting gets to shoot two free throws. Suppose we
hear that one player is an “85% free throw shooter.”
a. If this player is fouled 25 times in the act of shooting (maybe over a period of several games), find
the distribution of occasions where he makes both free throws. That is, if X is the number of times
he makes both free throws, find P(X = k) for each k from 0 to 25.
t = 0.85*0.85 #probability of making both free throws
x = c(0:25)
px = dbinom(x,25,t) #P(X = k) for k = 0,...,25
dis = data.frame(x,px)
x px
0 1.206632e-14
1 7.853976e-13
2 2.453837e-11
3 4.898094e-10
4 7.013982e-09
5 7.669884e-08
6 6.656446e-07
7 4.704060e-06
8 2.755689e-05
9 1.355225e-04
10 5.645551e-04
11 2.004379e-03
12 6.088376e-03
13 1.585172e-02
14 3.537565e-02
15 6.754305e-02
16 1.099096e-01
17 1.514970e-01
18 1.753058e-01
19 1.681573e-01
20 1.313445e-01
21 8.142117e-02
22 3.854336e-02
23 1.308934e-02
24 2.839955e-03
25 2.957647e-04
b. How likely is it that he will make both free throws on at least 20 of the 25 occasions?
1 - pbinom(19,25,t) #0.2675341
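The tail probability in (b) can be cross-checked in two ways. This is a minimal sketch, assuming the two free throws are independent; `p_both`, `p_upper`, and `p_sum` are illustrative names that do not appear in the original solution.

```r
p_both <- 0.85 * 0.85                                  # P(both free throws made)
p_upper <- pbinom(19, 25, p_both, lower.tail = FALSE)  # P(X > 19) = P(X >= 20)
p_sum <- sum(dbinom(20:25, 25, p_both))                # add P(X = 20), ..., P(X = 25)
c(p_upper, p_sum)                                      # both should match 0.2675341
```

Using `lower.tail = FALSE` avoids the subtraction from 1, which matters only for very small tail probabilities but is a common habit.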
2. A family is considering a move from a midwestern city to a city in California. The distribution of housing
costs where the family currently lives is normal, with mean $105,000 and standard deviation $18,200.
The distribution of housing costs in the California city is normal with mean $235,000 and standard
deviation $30,400. The family’s current house is valued at $110,000.
a. What percentage of houses in the family’s current city cost less than theirs?
p_midw <- pnorm(110000, mean = 105000, sd = 18200) #0.6082363 or 60.82%
b. If the family buys a $200,000 house in the new city, what percentage of houses there will cost less
than theirs?
pnorm(200000, mean = 235000, sd = 30400) #0.1248012 or 12.48%
c. What price house will the family need to buy to be in the same percentile (of housing costs) in the
new city as they are in the current city?
qnorm(p_midw, mean = 235000, sd = 30400) #$243351.6
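Being in the same percentile of two normal distributions is equivalent to having the same z-score, so the answer to (c) can also be obtained without `qnorm`. A small sketch; `z_home`, `price_z`, and `price_q` are illustrative names.

```r
p_midw <- pnorm(110000, mean = 105000, sd = 18200)   # percentile in the current city
z_home <- (110000 - 105000) / 18200                  # z-score of the current house
price_z <- 235000 + z_home * 30400                   # same z-score in the California city
price_q <- qnorm(p_midw, mean = 235000, sd = 30400)  # same percentile via qnorm
c(price_z, price_q)                                  # both about 243351.6
```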
3. Suppose that if a presidential election were held today, 53% of all voters would vote for candidate Smith
over candidate Jones. (You can substitute the names of the most recent presidential candidates.) If 2500
voters are sampled randomly, what is the probability that the sample will indicate (correctly) that Smith
is preferred to Jones?
a. Calculate the probability with binomial probability distribution.
1- pbinom(1250, 2500, 0.53) #0.9985728
b. Calculate the probability with normal probability distribution (with μ = np and σ² = np(1 – p)).
mu = 2500*0.53
sdv = sqrt(2500*0.53*(1-0.53)) #standard deviation
1- pnorm(1250.5, mean = mu, sd = sdv) #approximately 0.9986 (1250.5 applies the continuity correction to X > 1250)
c. Compare the answers obtained in (a) and (b) above. Are they close?
Yes. The normal approximation agrees with the exact binomial probability in (a) to about three decimal places.
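For the comparison in (c), it can help to put the exact binomial tail next to the normal approximation at the same cutoff, with and without a continuity correction. A sketch under the setup of Question 3; the names `exact`, `approx_plain`, and `approx_cc` are illustrative.

```r
n <- 2500
p <- 0.53
mu <- n * p                     # mean of the binomial
sdv <- sqrt(n * p * (1 - p))    # standard deviation of the binomial

exact <- pbinom(1250, n, p, lower.tail = FALSE)            # P(X >= 1251), exact
approx_plain <- pnorm(1250, mu, sdv, lower.tail = FALSE)   # normal, no correction
approx_cc <- pnorm(1250.5, mu, sdv, lower.tail = FALSE)    # normal, continuity-corrected

c(exact = exact, plain = approx_plain, corrected = approx_cc)
```

The continuity correction shifts the cutoff by half a unit because the binomial count is discrete while the normal curve is continuous.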
4. It is known that the distribution of purchase amounts by customers entering a popular retail store is
approximately normal with mean $25 and standard deviation $10.
a. What is the probability that a randomly selected customer spends between $12 and $28 at this
store?
Let X be a random variable that follows a normal distribution with
μ = $25 and σ = $10.
We have to find P(12 ≤ X ≤ 28).
When X = 12, then
Z = (X - μ)/σ = (12 - 25)/10 = -1.3
When X = 28, then
Z = (28 - 25)/10 = 0.3
So P(12 ≤ X ≤ 28) = P(-1.3 ≤ Z ≤ 0.3) = P(-1.3 ≤ Z ≤ 0) + P(0 ≤ Z ≤ 0.3)
= P(0 ≤ Z ≤ 1.3) + P(0 ≤ Z ≤ 0.3)
= 0.4032 + 0.1179 = 0.5211

b. What is the probability that the average amount spent by 16 randomly selected customers is
between $21 and $28?
When X̄ = 21, then Z = (21 - 25)/(10/√16) = -1.6
When X̄ = 28, then Z = (28 - 25)/(10/4) = 1.2
So P(21 ≤ X̄ ≤ 28) = P(-1.6 ≤ Z ≤ 1.2) = P(-1.6 ≤ Z ≤ 0) + P(0 ≤ Z ≤ 1.2)
= P(0 ≤ Z ≤ 1.6) + P(0 ≤ Z ≤ 1.2) = 0.4452 + 0.3849 = 0.8301
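Both parts of Question 4 can be cross-checked with `pnorm` instead of a z-table. A minimal sketch; `p_4a`, `p_4b`, and `se` are illustrative names that are not part of the original solution.

```r
# Part a: a single customer, X ~ N(25, 10)
p_4a <- pnorm(28, mean = 25, sd = 10) - pnorm(12, mean = 25, sd = 10)

# Part b: the mean of 16 customers has standard error 10 / sqrt(16) = 2.5
se <- 10 / sqrt(16)
p_4b <- pnorm(28, mean = 25, sd = se) - pnorm(21, mean = 25, sd = se)

round(c(p_4a, p_4b), 4)   # about 0.5211 and 0.8301
```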

5. A randomly collected sample consists of the following data: 158, 233, 121, 261, 294, 149, 224, 156, 293,
273, 182, 223, 244, 130, 265, 244, 150. Construct a 95% confidence interval for µ.
sample = c(158, 233, 121, 261, 294, 149, 224, 156, 293, 273, 182, 223, 244, 130, 265, 244, 150)
sample
t.test(sample, conf.level = 0.95) #181.6914 - 241.8380

One Sample t-test

data: sample
t = 14.928, df = 16, p-value = 8.216e-11
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
181.6914 241.8380
sample estimates:
mean of x
211.7647
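The interval reported by `t.test` is the sample mean plus or minus a t quantile times the standard error. The sketch below rebuilds it by hand as a check; `x`, `margin`, and `ci` are illustrative names.

```r
x <- c(158, 233, 121, 261, 294, 149, 224, 156, 293,
       273, 182, 223, 244, 130, 265, 244, 150)
n <- length(x)                                     # 17 observations
margin <- qt(0.975, df = n - 1) * sd(x) / sqrt(n)  # t quantile times standard error
ci <- mean(x) + c(-1, 1) * margin                  # lower and upper limits
ci                                                 # about 181.6914 and 241.8380
```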
6. In a sample of 1200 randomly selected voters, 575 of them said that they would vote for Candidate A
while 625 of them indicated that they would vote for Candidate B. There are only two candidates in this
election. Construct a 95% confidence interval to estimate the percentage of votes that Candidate B will
get. Do you have 95% confidence to predict that Candidate B will win the election?
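z<- abs(qnorm(0.025)) #z value of 95%
pb = 625/1200 #0.5208 Rate of votes obtained by B
pb - z*sqrt(pb*(1-pb)/1200) #0.4925683 Lower rate of votes obtained by B in Population
pb + z*sqrt(pb*(1-pb)/1200) #0.5490984 Upper rate of votes obtained by B in Population
The 95% confidence interval is 49.26% to 54.91%. Because the lower limit is below 50%, there is a possibility that Candidate B loses the election, so we cannot predict with 95% confidence that Candidate B will win.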
7. Suppose that you want to estimate an unknown population mean μ and the population standard
deviation is known to be σ = 3500.
a. If you want the margin of error to be 500 with 95% confidence, how big a sample size should you
take?
conf.l <- 0.95
alpha <- 1- conf.l
z <- qnorm(1-alpha/2)
PME <- 500
s <- 3500
n = z^2*s^2/PME^2
n #188.2315 or 189
b. If you want the margin of error to be 100 with 99% confidence, how big a sample size should you
take?
conf.l <- 0.99
alpha <- 1- conf.l
z <- qnorm(1-alpha/2)
PME <- 100
s <- 3500
n = z^2*s^2/PME^2
n #8127.748 or 8128
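Parts (a) and (b) repeat the same calculation, n = (z·σ/E)², rounded up to the next whole observation. A small helper makes this explicit; `sample_size_mean` is a hypothetical function name introduced here for illustration.

```r
# Sample size needed to estimate a mean with known sigma
sample_size_mean <- function(sigma, margin, conf) {
  z <- qnorm(1 - (1 - conf) / 2)    # two-sided critical value
  ceiling((z * sigma / margin)^2)   # always round up
}

n_a <- sample_size_mean(sigma = 3500, margin = 500, conf = 0.95)  # 189
n_b <- sample_size_mean(sigma = 3500, margin = 100, conf = 0.99)  # 8128
c(n_a, n_b)
```

Rounding up rather than to the nearest integer guarantees that the margin of error is not exceeded.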
8. In the coming US presidential election, if a pollster wants to predict the percentage of votes for
President Trump to be within ±1% (margin of error) with 99% confidence, how big a sample size should
he take? How about if he wants the margin of error to be 1.5% with 90% confidence?
conf.l <- 0.99
alpha <- 1- conf.l
z <- qnorm(1-alpha/2)
PME <- 0.01
s <- sqrt(0.5*(1-0.5)) #Standard deviation of a single response at p = 0.5 (most conservative case)
n = z^2*s^2/PME^2
n #16587.24 or 16588

conf.l <- 0.90
alpha <- 1- conf.l
z <- qnorm(1-alpha/2)
PME <- 0.015
v <- 0.5*(1-0.5) #Variance of a single response at p = 0.5 (most conservative case)
n = z^2*v/PME^2
n #3006.159 or 3007 (rounded up)
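For a proportion, the required sample size is n = z²·p(1 − p)/E². When nothing is known about p, using p = 0.5 maximizes p(1 − p) and therefore gives the most conservative n. The sketch below wraps this in a hypothetical helper, `sample_size_prop`, which is not part of the original solution.

```r
# Sample size needed to estimate a proportion; p = 0.5 is the worst case
sample_size_prop <- function(margin, conf, p = 0.5) {
  z <- qnorm(1 - (1 - conf) / 2)          # two-sided critical value
  ceiling(z^2 * p * (1 - p) / margin^2)   # always round up
}

n_99 <- sample_size_prop(margin = 0.01, conf = 0.99)    # 16588
n_90 <- sample_size_prop(margin = 0.015, conf = 0.90)   # 3007
c(n_99, n_90)
```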

For Questions 9 and 10, please refer to the file “budget_mart.csv”, which contains information of 1000
customers randomly selected from the database of Budget Mart – a large retailing firm. The data has the
following variables:
Age: 1 if 30 or younger, 2 if 31 to 55, 3 if 56 or older
Gender: 1 if male, 2 if female
Married: 1 if married, 2 otherwise
Religion: 1 if Buddhist, 2 if Muslim, 3 if Christian, and 4 Otherwise
Education: 1 if Below High School, 2 if High School, 3 if College, and 4 if Graduate
Salary: Annual salary
Children: Number of children
AmountSpent: Total amount spent on purchases this year
9. Consider only those customers who are Christian. What proportion of them spend more than (≥) $3,000?
Construct a 95% confidence interval to estimate the proportion of Christians who spend more than (≥)
$3,000. Do the same for each of the other three religious groups of customers. Discuss the results.

library(tidyverse) #provides read_csv() and view()
budget <- read_csv("HW1/budget_mart.csv")
view(budget)
n = 990 #the number of responses
z<- abs(qnorm(0.025)) #z value of 95%
#Each pc below is the share of all n customers who belong to the group and spend >= $3,000

#For Christian
pc = mean(budget$AmountSpent >= 3000 & budget$Religion == 3) #0.01919192
pc - z*sqrt(pc*(1-pc)/n) #0.01064555 Lower proportion value
pc + z*sqrt(pc*(1-pc)/n) #0.02773829 Upper proportion value

#For Buddhist
pc = mean(budget$AmountSpent >= 3000 & budget$Religion == 1)
pc - z*sqrt(pc*(1-pc)/n) #0.007542241 Lower proportion value
pc + z*sqrt(pc*(1-pc)/n) #0.02276079 Upper proportion value

#For Muslim
pc = mean(budget$AmountSpent >= 3000 & budget$Religion == 2)
pc - z*sqrt(pc*(1-pc)/n) #0.0006348167 Lower proportion value
pc + z*sqrt(pc*(1-pc)/n) #0.009466193 Upper proportion value

#For Otherwise
pc = mean(budget$AmountSpent >= 3000 & budget$Religion == 4)
pc - z*sqrt(pc*(1-pc)/n) #0.008306823 Lower proportion value
pc + z*sqrt(pc*(1-pc)/n) #0.02401641 Upper proportion value
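Question 9 asks for the proportion within each religious group, which means restricting the data to that group first and using the group size as n. The sketch below shows one way to do this. Because budget_mart.csv is not reproduced here, it runs on a tiny invented data frame, `budget_demo`; the helper `prop_ci` and all values in the demo data are hypothetical and only illustrate the mechanics.

```r
# Invented stand-in for budget_mart.csv (illustration only, not real data)
budget_demo <- data.frame(
  Religion    = c(1, 1, 1, 2, 2, 3, 3, 3, 3, 4),
  AmountSpent = c(3200, 500, 900, 100, 3500, 4000, 3100, 800, 1200, 2999)
)

z <- abs(qnorm(0.025))   # z value of 95%

# Normal-approximation interval for the share of one group spending >= threshold
prop_ci <- function(data, religion, threshold = 3000) {
  spend <- data$AmountSpent[data$Religion == religion]  # keep only this group
  n <- length(spend)                                    # group size
  p <- mean(spend >= threshold)                         # proportion within the group
  se <- sqrt(p * (1 - p) / n)
  c(n = n, p = p, lower = p - z * se, upper = p + z * se)
}

christian_demo <- prop_ci(budget_demo, 3)
christian_demo
```

With the real file loaded as `budget`, the same helper could be called as `prop_ci(budget, 3)` for Christians and with 1, 2, and 4 for the other groups. The normal approximation is only reasonable when each group is fairly large, which the ten-row demo is not.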

10. Let μ1 and μ2 be the mean amount spent by married and unmarried customers, respectively.
a. Construct and interpret a 99% confidence interval for μ1.

spend_m <- budget$AmountSpent[budget$Married == 1]
t.test(spend_m, conf.level = 0.99) # 1550.832 - 1791.744

One Sample t-test

data: spend_m
t = 35.877, df = 495, p-value < 2.2e-16
alternative hypothesis: true mean is not equal to 0
99 percent confidence interval:
1550.832 1791.744
sample estimates:
mean of x
1671.288
We are 99% confident that the mean amount spent by married customers is between $1,550.83 and $1,791.74.
b. Construct and interpret a 99% confidence interval for μ2.

spend_um <- budget$AmountSpent[budget$Married == 2]
t.test(spend_um, conf.level = 0.99) # 690.9291 - 828.9340

One Sample t-test

data: spend_um
t = 28.478, df = 493, p-value < 2.2e-16
alternative hypothesis: true mean is not equal to 0
99 percent confidence interval:
690.9291 828.9340
sample estimates:
mean of x
759.9316
We are 99% confident that the mean amount spent by unmarried customers is between $690.93 and $828.93.
c. Based on the results of (a) and (b) above, can you conclude with 99% confidence whether married or
unmarried customers spend more on average?
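mean(budget$AmountSpent) #overall average spending is 1216.53
The 99% confidence intervals in (a) and (b) do not overlap: the lower limit for married customers (1550.832) is far above the upper limit for unmarried customers (828.9340). We can therefore conclude that married customers spend more on average than unmarried customers.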
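A more direct way to compare the two groups is a confidence interval for the difference μ1 − μ2, which `t.test` produces when given two samples (Welch's method by default). This is a swapped-in technique, not what the original solution does. The sketch runs on simulated data because the real file is not reproduced here; `spend_m_demo`, `spend_um_demo`, and the means and standard deviations used to simulate them are assumptions chosen only for illustration.

```r
set.seed(1)
# Simulated stand-ins for spend_m and spend_um (not the real data)
spend_m_demo <- rnorm(200, mean = 1700, sd = 1000)
spend_um_demo <- rnorm(200, mean = 760, sd = 550)

# Welch two-sample t-test; conf.int is the 99% interval for mu1 - mu2
diff_ci <- t.test(spend_m_demo, spend_um_demo, conf.level = 0.99)
diff_ci$conf.int
```

With the real data the call would be `t.test(spend_m, spend_um, conf.level = 0.99)`. If the resulting interval lies entirely above 0, married customers spend more on average at the 99% confidence level.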
