Name: Suresh A
Basic Statistics (Module – 4 (Part – 1))
Q1) Calculate the probability from the given dataset for the cases below
Data_set: Cars.csv
Calculate the probability of MPG of Cars for the below cases.
MPG <- Cars$MPG
a. P(MPG>38)
Sol: 0.3475939
Rcode pnorm(38, mean(MPG), sd(MPG), lower.tail = FALSE)
b. P(MPG<40)
Sol: 0.7293499
Rcode pnorm(40, mean(MPG), sd(MPG))
c. P (20<MPG<50)
Sol: 0.8988689
Rcode pnorm(50, mean(MPG), sd(MPG)) - pnorm(20, mean(MPG), sd(MPG))
Q2) Check whether the data follows normal distribution
a) Check whether the MPG of Cars follows a Normal Distribution
Dataset: Cars.csv
Sol: Yes, MPG approximately follows a Normal Distribution,
because the mean and median are very close and the skewness is close to zero
b) Check whether the Adipose Tissue (AT) and Waist Circumference (Waist) from the wc-at
data set follow a Normal Distribution
Dataset: wc-at.csv
Sol: Waist and AT do not follow a Normal Distribution (neither shows a bell curve)
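The mean, median and skewness comparison used in Q2 can be sketched in Python as below. This is only an illustration: `shape_summary` is a hypothetical helper name, and the toy list is a placeholder, not values from Cars.csv or wc-at.csv.

```python
from statistics import mean, median, pstdev

def shape_summary(values):
    """Return (mean, median, skewness) for a numeric sample."""
    m = mean(values)
    s = pstdev(values)  # population standard deviation
    # Moment-based skewness: average of the cubed standardized deviations
    skew = sum(((v - m) / s) ** 3 for v in values) / len(values)
    return m, median(values), skew

# Toy symmetric sample (placeholder data, NOT taken from the assignment datasets)
toy = [1, 2, 3, 4, 5]
toy_mean, toy_median, toy_skew = shape_summary(toy)
print(toy_mean, toy_median, toy_skew)
```

For a column such as MPG, a mean close to the median together with a skewness near zero is consistent with a roughly symmetric, bell-shaped distribution; a histogram or Q-Q plot is still worth checking.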
Q3) Calculate the Z scores of 90% confidence interval, 94% confidence interval, 60% confidence
interval
Sol:-
Z (90%) = Z(0.05) / Z(0.95) = +/-1.644854
rcode → qnorm(0.05)
Z (94%) = Z(0.03) / Z(0.97) = +/-1.880794
rcode → qnorm(0.03)
Z (60%) = Z(0.20) / Z(0.80) = +/-0.8416212
rcode → qnorm(0.20)
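As a cross-check of the qnorm values, the same critical values can be computed with Python's standard library. This is a minimal sketch; `z_critical` is a hypothetical helper name.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Positive two-sided critical value for a given confidence level."""
    alpha = 1 - confidence
    # Upper-tail quantile of the standard normal: Z(1 - alpha/2)
    return NormalDist().inv_cdf(1 - alpha / 2)

z_values = {c: z_critical(c) for c in (0.90, 0.94, 0.60)}
for c, z in z_values.items():
    print(f"{c:.0%}: +/-{z:.6f}")
```

The lower critical value is the negative of each result, which is what qnorm(0.05), qnorm(0.03) and qnorm(0.20) return.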
Q4) Calculate the t scores of 95% confidence interval, 96% confidence interval, 99%
confidence interval for sample size of 25
Sol:-
t(95%, 25) = t(0.025, 24) / t(0.975, 24) = +/-2.063899
rcode → qt(0.975, 24)
t(96%, 25) = t(0.02, 24) / t(0.98, 24) = +/-2.171545
rcode → qt(0.98, 24)
t(99%, 25) = t(0.005, 24) / t(0.995, 24) = +/-2.79694
rcode → qt(0.995, 24)
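Python's standard library has no t distribution, so the sketch below approximates qt() numerically: it integrates the t density with Simpson's rule and inverts the CDF by bisection. The helper names (`t_pdf`, `t_cdf`, `t_quantile`) and the step counts are assumptions chosen for illustration; in practice a statistics package would be used instead.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """P(T <= t), using Simpson's rule on [0, |t|] and the symmetry of the density."""
    h = abs(t) / steps
    total = t_pdf(0, df) + t_pdf(abs(t), df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

def t_quantile(p, df, lo=-50.0, hi=50.0):
    """Invert t_cdf by bisection (a numerical stand-in for R's qt)."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

df = 25 - 1  # sample size 25 gives 24 degrees of freedom
t_values = {c: t_quantile(1 - (1 - c) / 2, df) for c in (0.95, 0.96, 0.99)}
for c, t in t_values.items():
    print(f"{c:.0%}: +/-{t:.4f}")
```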
Q5) A Government company claims that an average light bulb lasts 270 days. A researcher
randomly selects 18 bulbs for testing. The sampled bulbs last an average of 260 days, with a
standard deviation of 90 days. If the CEO's claim were true, what is the probability that 18
randomly selected bulbs would have an average life of no more than 260 days
Hint:
rcode → pt(tscore,df)
#df → degrees of freedom
Sol:-
t = [ x - μ ] / [ s / sqrt( n ) ]
= [260-270]/[90/sqrt(18)]
=-10/21.2132
=-0.4714
The probability that 18 randomly selected bulbs would have an average life of no more than
260 days
rcode→pt(-0.4714,17)
=0.3217 (approx.)
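The calculation for Q5 can be re-run directly from the figures in the question (n = 18, sample mean 260, claimed mean 270, s = 90). The sketch below uses only the standard library, so the t CDF is approximated with Simpson's rule; `t_pdf` and `t_cdf` are hypothetical helpers standing in for R's pt().

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """P(T <= t), using Simpson's rule on [0, |t|] and the symmetry of the density."""
    h = abs(t) / steps
    total = t_pdf(0, df) + t_pdf(abs(t), df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

# Values taken from the question text
n, sample_mean, claimed_mean, s = 18, 260, 270, 90

standard_error = s / math.sqrt(n)
t_score = (sample_mean - claimed_mean) / standard_error
p_value = t_cdf(t_score, n - 1)  # df = n - 1 = 17

print(f"t = {t_score:.4f}, P(mean <= 260) = {p_value:.4f}")
```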
Q6) The time required for servicing transmissions is normally distributed with μ = 45 minutes and
σ = 8 minutes. The service manager plans to have work begin on the transmission of a customer’s
car 10 minutes after the car is dropped off and the customer is told that the car will be ready
within 1 hour from drop-off. What is the probability that the service manager cannot meet his
commitment?
A. 0.3875
B. 0.2676
C. 0.5
D. 0.6987
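Sol:
Work begins 10 minutes after drop-off, so only 60-10 = 50 minutes are available for the service.
Z=(X- μ)/ σ
= (50-45)/8
= 0.625
The value in the Z table at Z = 0.625 is 0.734
The probability that the service takes more than 50 mins is P(X>50)= 1-0.734 =0.266, so the closest option is B (0.2676)
rcode →1- pnorm(50,45,8)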
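The same probability can be checked in Python with the standard library. A minimal sketch, where the variable names are illustrative:

```python
from statistics import NormalDist

service_time = NormalDist(mu=45, sigma=8)  # minutes, from the question

# Work starts 10 minutes after drop-off, and the car is promised within 60 minutes
available = 60 - 10

# Probability that servicing takes longer than the time available
p_late = 1 - service_time.cdf(available)
print(round(p_late, 4))
```

The option B value of 0.2676 corresponds to rounding Z to 0.62 before looking it up in the Z table.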
Q7) The current age (in years) of 400 clerical employees at an insurance claims processing center
is normally distributed with mean μ = 38 and Standard deviation
σ =6. For each statement below, please specify True/False. If false, briefly explain why.
More employees at the processing center are older than 44 than between 38 and 44.
Sol:- False
As per calculations, the probability up to the mean, i.e. age 38, is 0.5; P(X<38)=0.5
The probability up to age 44 is 0.84; P(X<44)=0.84
The probability of between 38 and 44 is P(38<X<44)= 0.84-0.5= 0.34 (0.34*400=136 employees)
The probability of more than 44 is 1-0.84= 0.16 (0.16*400=64 employees)
Rcodes
pnorm(38,38,6)
pnorm(44,38,6)
pnorm(44,38,6)-pnorm(38,38,6)
1-pnorm(44,38,6)
A training program for employees under the age of 30 at the center would be expected to
attract about 36 employees.
Sol:-
True, The Probability P(X<30) = 0.09; 0.09*400= 36 employees
Rcode pnorm(30,38,6)
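Both Q7 statements can be verified together with a short Python sketch that converts probabilities into expected head counts. The variable names are illustrative.

```python
from statistics import NormalDist

ages = NormalDist(mu=38, sigma=6)
employees = 400

# Expected head counts = probability * number of employees
older_than_44 = employees * (1 - ages.cdf(44))
between_38_and_44 = employees * (ages.cdf(44) - ages.cdf(38))
under_30 = employees * ages.cdf(30)

print(round(older_than_44), round(between_38_and_44), round(under_30, 1))
```

The first statement is false because `older_than_44` is smaller than `between_38_and_44`; the second is true because `under_30` is about 36.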
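Q8) If X1 ~ N(μ, σ^2) and X2 ~ N(μ, σ^2) are iid normal random variables, then what is the difference
between 2 X1 and X1 + X2? Discuss both their distributions and parameters.
Sol: 2X1 ~ N(2μ, 4σ^2), since Var(2X1) = 2^2 * Var(X1)
X1+X2 ~ N(2μ, 2σ^2), since the variances of independent variables add
Therefore both are normal with the same mean 2μ, but 2X1 has twice the variance of X1 + X2.
Their difference, 2X1 - (X1 + X2) = X1 - X2, is distributed as N(0, 2σ^2)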
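A quick simulation can illustrate how the variances of 2X1 and X1 + X2 compare. This is a sketch under assumed values: μ = 10, σ = 3, the sample size and the seed are arbitrary choices, not part of the question.

```python
import random

def variance(values):
    """Population variance of a list of numbers."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

random.seed(0)
mu, sigma, n = 10.0, 3.0, 100_000  # illustrative parameters, not from the question

x1 = [random.gauss(mu, sigma) for _ in range(n)]
x2 = [random.gauss(mu, sigma) for _ in range(n)]

doubled = [2 * a for a in x1]             # 2*X1
summed = [a + b for a, b in zip(x1, x2)]  # X1 + X2

# Variance expressed as a multiple of sigma^2
ratio_doubled = variance(doubled) / sigma ** 2
ratio_summed = variance(summed) / sigma ** 2
print(round(ratio_doubled, 2), round(ratio_summed, 2))
```

The first ratio should land near 4 and the second near 2, while both sample means stay near 2μ.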
Q9) Let X ~ N(100, 20^2). Find two values, a and b, symmetric about the mean, such that the
probability of the random variable taking a value between them is 0.99.
A. 90.5, 105.9
B. 80.2, 119.8
C. 22, 78
D. 48.5, 151.5
E. 90.1, 109.9
Sol:
X ~ N(100, 20^2) implies μ= 100, σ^2=20^2, σ= 20
Given P(a ≤ X ≤ b)=0.99
P(X ≤ a)=0.005
P(X ≤ b)=0.995
Using the standardization formula as the starting point, solve backwards for the corresponding
0.5th and 99.5th percentiles of a normal distribution with mean 100 and standard deviation 20.
Z = (X-µ)/σ implies X = σ[z] + µ
Thus "a" = 0.5th percentile for X = 20[-2.575] + 100 = 48.5
and "b" = 99.5th percentile for X = 20[+2.575] + 100 = 151.5, which is option D
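The two percentiles can also be read off directly from the inverse CDF, without a Z table. A minimal Python sketch with illustrative variable names:

```python
from statistics import NormalDist

x = NormalDist(mu=100, sigma=20)

# Central 99% leaves 0.5% in each tail
a = x.inv_cdf(0.005)
b = x.inv_cdf(0.995)
print(round(a, 1), round(b, 1))
```

The results match option D after rounding to one decimal place.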
Q10) Consider a company that has two different divisions. The annual profits from the two
divisions are independent and have distributions Profit1 ~ N(5, 3^2) and Profit2 ~ N(7, 4^2)
respectively. Both the profits are in $ Million. Answer the following questions about the total
profit of the company in Rupees. Assume that $1 = Rs. 45
A. Specify a Rupee range (centered on the mean) such that it contains 95%
probability for the annual profit of the company.
Sol:-
For Profit1 ~ N(5, 3^2)
μ= 5, σ=3
P(a ≤ Profit1 ≤ b)=0.95
P(Profit1 ≤ a)= 0.025, P(Profit1 ≤ b)=0.975
We know Z = (Profit1-µ)/σ, so Profit1 = σ[z] + µ
Thus "a" = 2.5th percentile for Profit1 = 3[-1.96] + 5 = -0.88
and "b" = 97.5th percentile for Profit1 = 3[+1.96] + 5 = 10.88
Therefore range in million rupees is -39.6<Profit1<489.6
For Profit2 ~ N(7, 4^2)
μ= 7, σ=4
P(c ≤ Profit2 ≤ d)=0.95
P(Profit2 ≤ c)= 0.025, P(Profit2 ≤ d)=0.975
Z = (Profit2-µ)/σ, so Profit2 = σ[z] + µ
Thus "c" = 2.5th percentile for Profit2 = 4[-1.96] + 7 = -0.84
and "d" = 97.5th percentile for Profit2 = 4[+1.96] + 7 = 14.84
Therefore range in million rupees is -37.8<Profit2<667.8
B. Specify the 5th percentile of profit (in Rupees) for the company
Sol:-
The 5th percentile for Profit1 = 3[-1.645] + 5 = 0.065
The 5th percentile for Profit2 = 4[-1.645] + 7 = 0.42
Therefore in million rupees
The 5th percentile for Profit1 = 0.065*45= 2.925
The 5th percentile for Profit2 = 0.42*45= 18.9
C. Which of the two divisions has a larger probability of making a loss in a given year?
Sol: Profit1 has a larger probability of making a loss in a given year.
In Q10, although your approach is good and quite accurate, I have shared the answer with you; please go through it below.
Ans: Given
Let X be the total profit, i.e. the sum of two independent normally distributed random variables.
Mean in Rupees = 45*(5+7) = 540 million rupees
Std Deviation in Rupees = 45*sqrt(3^2 + 4^2)
= 45*5 = 225 million rupees.
A. For 95% probability, the 2-sigma rule gives an approximate range,
μ ± 2σ = 540±2*225
= (540-450, 540+450) = (90,990)
So, the rupee range will be approximately 90 to 990 million rupees (99 to 981 with the exact multiplier 1.96).
B. To find the 5th percentile, the formula is
=μ - 1.645σ
= 540-(1.645*225)
=169.9 million rupees (approx.)
C. We have the mean and std deviation for both divisions,
so we use the Z score to find the probability of a loss, P(Profit < 0).
Using Python,
> from scipy import stats
> stats.norm.cdf(0, 5, 3)
=0.04779035
> stats.norm.cdf(0, 7, 4)
=0.04005916
So, Division 1 has the larger probability of making a loss in a given year.
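All three parts of Q10 can be worked through for the total profit using exact normal quantiles rather than rounded multipliers. The sketch below relies on `statistics.NormalDist`, which supports adding two independent normal variables and scaling by a constant; the variable names are illustrative.

```python
from statistics import NormalDist

RATE = 45  # rupees per dollar, as given in the question

profit1 = NormalDist(mu=5, sigma=3)  # $ million
profit2 = NormalDist(mu=7, sigma=4)  # $ million

# Sum of independent normals, then converted to million rupees
total = (profit1 + profit2) * RATE

# A. Central 95% range
low, high = total.inv_cdf(0.025), total.inv_cdf(0.975)

# B. 5th percentile of total profit
p5 = total.inv_cdf(0.05)

# C. Probability of a loss (profit below zero) for each division
loss1, loss2 = profit1.cdf(0), profit2.cdf(0)

print(round(low, 1), round(high, 1), round(p5, 1))
print(round(loss1, 4), round(loss2, 4))
```

The total has mean 540 and standard deviation 225 million rupees, and the division with the larger loss probability is the one whose mean is fewer standard deviations above zero.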
Hints:
1. Business Problem
1.1. Objective
1.2. Constraints (if any)
2. For each assignment the solution should be submitted in the below format
3. Research and Perform all possible steps for obtaining solution
4. For Basic Statistics explanation of the solutions should be documented in black and white
along with the codes.
One must follow these guidelines as well:
4.1. Be thorough with the concepts of Probability, Central Limit Theorem and Perform the
calculation stepwise
4.2. For True/False Questions, an explanation is a must.
4.3. R & Python code for Univariate Analysis (histogram, box plot, bar plots etc.) for data
distribution to be attached
5. All the codes (executable programs) should execute without errors