Batch38 CSE7315c Probability Basics Lab04 Solutions

This document contains an activity sheet with 9 problems related to probability basics and constructing confidence intervals. The problems cover topics like finding the probability of a total time being between two values, constructing confidence intervals for a population mean, calculating z-scores and p-values, and determining the number of sandwiches needed for guests with a given probability of shortage. Solutions to each problem are provided as code snippets in R.

Uploaded by

varshika

20180106_Batch38_CSE7315c_ Probability Basics_Lab04

Activity Sheet

Please go through today's lecture, spend time understanding the concepts and the example problems
explained, and then solve the problems given below.

Activity - 1

1. A guy at a counter serves customers standing in the queue one by one. Suppose that the service time
X_i for customer i has mean E(X_i) = 2 minutes and Var(X_i)=1 minutes^2. We assume that service
times for different bank customers are independent. Let Y be the total time the guy spends serving 50
customers. Find P(90<Y<110)

## Note that Y = N*mean(x), so mean(Y) = N*mean(x) and std(Y) = N*std(mean(x)) = N*std(x)/sqrt(N)
## By the central limit theorem, the total serving time for 50 customers is approximately normal
## with mean = 50*2 and standard deviation = 50 * (sqrt(1)/sqrt(50)), which equals sqrt(50)
# pnorm gives P(Y < 110) and P(Y < 90) for a normal with mean 100 and std 50/sqrt(50)
areas = pnorm(c(110, 90), 100, 50/sqrt(50))
answer = areas[1]-areas[2] # P(90 < Y < 110) = P(Y < 110) - P(Y < 90)
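
The same probability can be reached by standardizing the two bounds by hand, which makes the central-limit step explicit. This is only a sketch using the values stated in the problem; the names `n`, `mu`, `sigma` and `z` are illustrative and not part of the original solution.

```r
n = 50       # number of customers
mu = 2       # mean service time per customer (minutes)
sigma = 1    # sd of service time per customer (minutes)

mean_total = n * mu          # E(Y) = 100
sd_total = sqrt(n) * sigma   # sd(Y) = sqrt(50): variances of independent terms add

z = (c(90, 110) - mean_total) / sd_total   # z-scores of the two bounds
prob = pnorm(z[2]) - pnorm(z[1])           # P(90 < Y < 110)
prob
```

The result should match `answer` from the solution, since `50/sqrt(50)` and `sqrt(50)` are the same number.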

2. A random sample of 100 items is taken, producing a sample mean of 49. The population SD is 4.49.
Construct a 90% confidence interval to estimate the population mean.

mean = 49
standard_error = 4.49/sqrt(100)
# for 90% confidence, the left quantile should cover 5% of the area and the right quantile should
# cover 95% of the area, hence we find qnorm of both 0.05 and 0.95
CI = qnorm(c(0.05,0.95), mean, standard_error)
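
Since the same pattern recurs in the later problems, it may help to wrap it in a small function. The helper below, `z_interval`, is a hypothetical name introduced here for illustration; it assumes the population standard deviation is known, as in this problem.

```r
# Hypothetical helper (not part of the original solution): z-based confidence
# interval for a population mean when the population sd is known.
z_interval = function(sample_mean, pop_sd, n, level = 0.90) {
  alpha = 1 - level                      # total area left outside the interval
  standard_error = pop_sd / sqrt(n)
  sample_mean + qnorm(c(alpha / 2, 1 - alpha / 2)) * standard_error
}

ci_90 = z_interval(49, 4.49, 100, level = 0.90)
ci_90
```

Passing `level = 0.95` or `level = 0.99` gives the other intervals asked for in this sheet without recomputing the quantiles by hand.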

3. A random sample of 35 items is taken, producing a sample mean of 2.364 with a sample variance of
0.81. Assume x is normally distributed and construct a 90% confidence interval for the population
mean.

mean = 2.364
sample_var = 0.81
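# assumption: with n = 35 the sample variance is used in place of the population variance
# (large-sample z approximation; qt with 34 degrees of freedom would give a slightly wider interval)
pop_var = sample_var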
standard_error = sqrt(pop_var)/sqrt(35)
CI = qnorm(c(0.05,0.95), mean, standard_error)
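
Because the population variance is estimated from the sample here, a t-based interval is a common alternative to the z-based one. The comparison below is a sketch, not the sheet's intended answer; the names `ci_z` and `ci_t` are illustrative.

```r
sample_mean = 2.364
sample_sd = sqrt(0.81)
n = 35
standard_error = sample_sd / sqrt(n)

# z-based interval, treating the sample sd as the population sd
ci_z = sample_mean + qnorm(c(0.05, 0.95)) * standard_error
# t-based interval with n - 1 degrees of freedom
ci_t = sample_mean + qt(c(0.05, 0.95), df = n - 1) * standard_error

rbind(z = ci_z, t = ci_t)
```

With 34 degrees of freedom the two intervals are close, and the t-based one is slightly wider.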

4. The average zinc concentration recovered from a sample of zinc measurements in 36 different
locations is found to be 2.6 grams per milliliter. Find the 95% and 99% confidence intervals for the
mean zinc concentration in the river. Assume that the population standard deviation is 0.3.
(Give your intervals for 95% first and then for 99% e.g. - 2.40,2.60,2.45,2.55)

mean = 2.6
pop_sd = 0.3
standard_error = pop_sd/sqrt(36)

CI_95 = qnorm(c(0.025,0.975), mean, standard_error)


CI_99 = qnorm(c(0.005,0.995), mean, standard_error)

5. Suppose a car manufacturer claims a model gets 25 mpg. A consumer group asks 40 owners of this
model to calculate their mpg and the mean value was 22 with a standard deviation of 1.5.
Give the z-score for this observation. Is the claim true? (Give your answer as “z-score,Yes/No”. e.g. -
1.99,Yes)

## H0 (Null Hypothesis): Model's mpg >= 25
## H1 (Alternate Hypothesis): Model's mpg < 25
## This is a left-tailed test
h0_mpg = 25 # mpg as per H0
obs_mpg = 22 # observation
std_sample = 1.5
std_pop = std_sample # assumption
standard_error = std_pop/sqrt(40)
significance = 0.05 # This value should be assumed based on domain/problem severity at hand
# z-score of the observation, given our Null Hypothesis assumption
z_score = (obs_mpg - h0_mpg)/standard_error

# Once we have the z-score of the observation and the significance set, there are two ways to
# approach the problem

# 1. Calculate the mpg value below which only `significance` of the area lies under H0 (the
# left-tail critical value) and see if the observation falls below it. If so, null hypothesis is rejected
mpg_at_significance = qnorm(significance, h0_mpg, standard_error)
obs_mpg < mpg_at_significance # if true reject null hypothesis

# 2. Calculate the probability of an observation at least this low under H0. If it is less than
# significance, reject null hypothesis
prob_observation = pnorm(obs_mpg, h0_mpg, standard_error)
prob_observation < significance # if true reject null hypothesis
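
The steps of this test can be collected into one function that returns the z-score and the decision together. `left_tailed_z_test` is a hypothetical helper written for this sheet, assuming (as the solution does) that the sample standard deviation can stand in for the population value.

```r
# Hypothetical helper: left-tailed z-test for a population mean.
left_tailed_z_test = function(obs_mean, h0_mean, sd, n, significance = 0.05) {
  standard_error = sd / sqrt(n)
  z_score = (obs_mean - h0_mean) / standard_error
  p_value = pnorm(z_score)   # area to the left of the observation under H0
  list(z_score = z_score, p_value = p_value, reject_h0 = p_value < significance)
}

result = left_tailed_z_test(obs_mean = 22, h0_mean = 25, sd = 1.5, n = 40)
result$z_score
result$reject_h0
```

A `reject_h0` of `TRUE` corresponds to answering "No" to the question of whether the manufacturer's claim holds.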

6. Suppose the mean weight of King Penguins found in an Antarctic colony last year was 15.4 kg. In a
sample of 35 penguins same time this year in the same colony, the mean penguin weight is 14.6 kg.
Assume the population standard deviation is 2.5 kg.
What is the p-value for the given observation? At 0.05 significance level, can we reject the null
hypothesis that the mean penguin weight does not differ from last year?

obs_mean = 14.6
pop_mean = 15.4
pop_sd = 2.5
standard_error = pop_sd/sqrt(35)
z_score = (obs_mean - pop_mean)/standard_error

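# The null hypothesis is "no difference", so the test is two-tailed: double the tail area beyond the observation
p_value = 2 * pnorm(-abs(z_score))
# NOTE: The lower-tail area pnorm(obs_mean, pop_mean, standard_error) is the same as pnorm(z_score). WHY?
# Because standardizing, z = (x - mean)/sd, maps the normal N(pop_mean, standard_error) onto the standard normal
p_value < 0.05 # if true reject null hypothesis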
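
One way to see why the two `pnorm` calls agree is to compute both and compare them. The sketch below also computes the two-tailed p-value, which is the usual choice when the null hypothesis is "does not differ"; the names `lower_tail_raw`, `lower_tail_std` and `two_tailed_p` are illustrative.

```r
obs_mean = 14.6
pop_mean = 15.4
pop_sd = 2.5
standard_error = pop_sd / sqrt(35)
z_score = (obs_mean - pop_mean) / standard_error

lower_tail_raw = pnorm(obs_mean, pop_mean, standard_error) # area on the original scale
lower_tail_std = pnorm(z_score)                            # area on the standard normal scale
all.equal(lower_tail_raw, lower_tail_std)

two_tailed_p = 2 * pnorm(-abs(z_score)) # both tails, for a "does not differ" null
two_tailed_p
```

Standardizing only shifts and rescales the axis, so the area to the left of the observation is unchanged.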

7. A student, to test his luck, went to an examination unprepared.
It was an MCQ-type examination with two choices for each question. There are 50 questions, of which
at least 20 are to be answered correctly to pass the test. What is the probability that he clears the
exam?
If each question has 4 choices instead of two, what is the probability that he clears the exam?
(Give the answer in the following format: ans1,ans2)

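## Given scenario is a binomial distribution of 50 trials and p=1/2 (p=1/4 with four choices)
## Passing needs at least 20 correct answers: P(X >= 20) = 1 - P(X <= 19)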
p_at_2_choices = 1 - pbinom(19, 50, 1/2)
p_at_4_choices = 1 - pbinom(19, 50, 1/4)
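
The off-by-one in `pbinom(19, ...)` is easy to get wrong, so it can be worth checking the tail probability in more than one way. The sketch below compares three equivalent computations for the two-choice case; the variable names are illustrative.

```r
n_questions = 50
pass_mark = 20

for_two = c(
  complement = 1 - pbinom(pass_mark - 1, n_questions, 1/2),                 # 1 - P(X <= 19)
  upper_tail = pbinom(pass_mark - 1, n_questions, 1/2, lower.tail = FALSE), # P(X > 19)
  direct_sum = sum(dbinom(pass_mark:n_questions, n_questions, 1/2))         # sum of P(X = k), k = 20..50
)
for_two
```

All three entries should agree up to floating-point rounding; replacing `1/2` with `1/4` checks the four-choice case the same way.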

8. A marketing director of a large department store wants to estimate the average number of customers
who enter the store every five minutes. She randomly selects five-minute intervals and counts the
number of arrivals at the store. She obtains the figures 68, 42, 51, 57, 56, 80, 45, 39, 36 and 79. The
analyst assumes the number of arrivals is normally distributed. Using this data, the analyst computes
a 95% confidence interval to estimate the mean value for all five-minute intervals. What interval value
does she get? (Notice the small sample size)

obs = c(68, 42, 51, 57, 56, 80, 45, 39, 36, 79)
mean_sample = mean(obs)
standard_error = sd(obs)/sqrt(length(obs))
# qt is just like any other qdist function. Here the dist is the t-distribution, which needs the
# number of degrees of freedom (n - 1) as input. These are t-scores rather than z-scores because
# the sample is small and the population sd is unknown
t_score_for_95_CI_left = qt(0.025, length(obs)-1)
t_score_for_95_CI_right = qt(0.975, length(obs)-1)

CI_left = t_score_for_95_CI_left*standard_error + mean_sample
CI_right = t_score_for_95_CI_right*standard_error + mean_sample
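
R's built-in `t.test` reports the same interval, which gives a quick cross-check of the manual computation. This is a sketch; `manual_ci` and `builtin_ci` are illustrative names.

```r
obs = c(68, 42, 51, 57, 56, 80, 45, 39, 36, 79)

# Manual t-based interval: mean +/- t-quantile * standard error
manual_ci = mean(obs) + qt(c(0.025, 0.975), df = length(obs) - 1) * sd(obs) / sqrt(length(obs))

# Interval reported by the one-sample t-test
builtin_ci = as.numeric(t.test(obs, conf.level = 0.95)$conf.int)

manual_ci
builtin_ci
```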

9. You have invited 64 guests to a party. You need to make sandwiches for the guests. You believe that
a guest might need 0, 1 or 2 sandwiches with probabilities 0.25, 0.50, and 0.25 respectively. You
assume that the number of sandwiches each guest needs is independent from other guests. How
many sandwiches should you make so that you are 95% sure that there is no shortage? (Give an
integer answer - e.g. - 64)

mu = sum(c(0.25, 0.5, 0.25) * c(0, 1, 2)) # Expected number of sandwiches per person
var = sum(c(0.25, 0.5, 0.25) * (c(0, 1, 2)^2)) - mu^2 ## Var = E(X^2) - (E(X))^2
std = sqrt(var)
standard_error = std/sqrt(64)
CI = qnorm(0.95, mu, standard_error)*64 # 95th percentile of the mean per guest, scaled up to 64 guests
# CI = qnorm(0.95, mu*64, standard_error*64) ## also gives the same result
ceiling(CI) ## ceiling function rounds a number up to the next integer
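
The normal approximation can be compared against an exact calculation. A demand of 0, 1 or 2 with probabilities 0.25, 0.50 and 0.25 is the same as a Binomial(2, 0.5) count per guest, so under the independence assumption the total demand is Binomial(128, 0.5). The sketch below uses that observation; `normal_n` and `exact_n` are illustrative names.

```r
guests = 64

# Normal approximation, following the solution: mean 1 and variance 0.5 per guest
mu = 1
sd_guest = sqrt(0.5)
normal_n = ceiling(qnorm(0.95, guests * mu, sqrt(guests) * sd_guest))

# Exact: smallest count k with P(total demand <= k) >= 0.95
exact_n = qbinom(0.95, 2 * guests, 0.5)

c(normal = normal_n, exact = exact_n)
pbinom(c(normal_n, exact_n), 2 * guests, 0.5) # P(no shortage) for each count
```

The two counts may differ by a sandwich because the normal curve is continuous while demand is a whole number; the sheet's expected answer follows the normal approximation.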

Inspire…Educate…Transform.
