Inferential Statistics (AutoRecovered)
Instructions:
Please fill in your answers inline in the Word document. Submit code files wherever
applicable.
Insights should be drawn from the plots about the data, such as whether the data is normally
distributed, outliers, and measures like mean, median, mode, variance, standard deviation, etc.
Problem Statements:
Q1) Three Coins are tossed, find the probability that two heads and one tail are obtained?
Ans1) When three coins are tossed, the total number of possible outcomes
is 2³ = 8.
These outcomes are HHH, HHT, HTH, THH, TTH, THT, HTT, TTT.
The outcomes which have exactly two heads and one tail are:
HHT, HTH, THH, which makes them 3 in number.
Therefore, the probability of getting two heads and one tail in the toss of three
coins simultaneously is:
P(Two Heads and One Tail) = Number of favourable outcomes / Total number of outcomes
= 3/8 = 0.375
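The count for Q1 can be checked by brute-force enumeration. The snippet below is a minimal sketch rather than part of the original answer; the variable names are illustrative.

```python
from itertools import product

# Enumerate every outcome of three coin tosses, e.g. ('H', 'H', 'T').
outcomes = list(product("HT", repeat=3))

# Keep the outcomes with exactly two heads (and therefore one tail).
favourable = [o for o in outcomes if o.count("H") == 2]

probability = len(favourable) / len(outcomes)
print(len(outcomes), len(favourable), probability)  # 8 3 0.375
```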
Q2) Two Dice are rolled, find the probability that sum is
(1,1)(1,2)(1,3)(1,4)(1,5)(1,6)
(2,1)(2,2)(2,3)(2,4)(2,5)(2,6)
(3,1)(3,2)(3,3)(3,4)(3,5)(3,6)
(4,1)(4,2)(4,3)(4,4)(4,5)(4,6)
(5,1)(5,2)(5,3)(5,4)(5,5)(5,6)
(6,1)(6,2)(6,3)(6,4)(6,5)(6,6)
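The 36-outcome sample space listed above can also be generated in code. As a rough sketch, assuming two fair six-sided dice, the snippet below tabulates the probability of every possible sum, so the value for any particular sum can be read off. The names `rolls` and `sum_probs` are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 ordered outcomes of rolling two fair dice, matching the grid above.
rolls = list(product(range(1, 7), repeat=2))

# Count how many outcomes give each sum, then convert counts to exact probabilities.
sum_counts = Counter(a + b for a, b in rolls)
sum_probs = {total: Fraction(count, len(rolls))
             for total, count in sorted(sum_counts.items())}

for total, prob in sum_probs.items():
    print(total, prob)
```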
Q4) Calculate the Expected number of candies for a randomly selected child:
Below are the probabilities of count of candies for children (ignoring the nature of the child-
Generalized view)
i. Child A – probability of having 1 candy is 0.015
ii. Child B – probability of having 4 candies is 0.2
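CHILD   Candies count   Probability
A       1               0.015
B       4               0.20
C       3               0.65
D       5               0.005
E       6               0.01
F       2               0.12
Ans) Expected number of candies = ∑ x · P(x)
= (1)(0.015) + (4)(0.20) + (3)(0.65) + (5)(0.005) + (6)(0.01) + (2)(0.12)
= 3.09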
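The expected number of candies is the probability-weighted sum of the candy counts in the table. A minimal sketch of that calculation, using the table values and illustrative variable names:

```python
# Candy counts and probabilities for children A to F, taken from the table.
candies = [1, 4, 3, 5, 6, 2]
probs = [0.015, 0.20, 0.65, 0.005, 0.01, 0.12]

# Sanity check: the probabilities should sum to 1.
assert abs(sum(probs) - 1.0) < 1e-9

# E(X) = sum of x * P(x)
expected = sum(x * p for x, p in zip(candies, probs))
print(round(expected, 2))  # 3.09
```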
Q5) Calculate Mean, Median, Mode, Variance, Standard Deviation, Range & comment about
the values / draw inferences, for the given dataset
- For Points, Score, Weigh
Find Mean, Median, Mode, Variance, Standard Deviation, and Range and comment about the
values/ Draw some inferences.
Expected value, E(X) = ∑ x · P(x)
P(x) = 1/9 for each of the nine values
E(X) = (1/9) (108 + 110 + 123 + 134 + 135 + 145 + 167 + 187 + 199)
= (1/9) (1308)
= 145.33
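Because each of the nine values has the same probability of 1/9, the expected value equals the arithmetic mean. A short sketch that reproduces the arithmetic above (variable names are illustrative):

```python
# The nine equally likely values from the calculation above.
values = [108, 110, 123, 134, 135, 145, 167, 187, 199]

# Each value has probability 1/9.
p = 1 / len(values)

# E(X) = sum of x * P(x), which here is just the arithmetic mean.
expected = sum(x * p for x in values)
print(sum(values), round(expected, 2))  # 1308 145.33
```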
Q7) Look at the data given below. Plot the data, find the outliers, and find μ, σ, σ².
Hint: [Use a plot which shows the data distribution and skewness along with the outliers; also use
R/Python code to evaluate measures of centrality and spread]
μ = Mean = 0.332713
σ = Standard deviation = 0.169454
σ² = Variance = 0.028715
Python Code:
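import pandas as pd
import matplotlib.pyplot as plt
# Load the dataset; the analysis uses its numeric "Measure" column
inf = pd.read_excel("E:/Data Science/assignment/DataSets/stats1.xlsx")
measure = inf["Measure"]
# Measures of centrality and spread (pandas uses the sample formulas, ddof=1)
print("Mean:", measure.mean())
print("Standard deviation:", measure.std())
print("Variance:", measure.var())
# Box plot to reveal outliers
plt.boxplot(measure)
plt.show()
# Histogram to show the distribution and skewness
plt.hist(measure)
plt.show()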
Outliers are present.
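To name the outliers rather than only see them on the box plot, one common option is the 1.5 × IQR rule, which is also the default whisker rule in matplotlib's box plot. The sketch below is an illustration only: the helper `iqr_outliers` is a hypothetical name, and the `sample` values are made up for demonstration and are not the assignment's data.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    # The inclusive method interpolates linearly between data points.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical sample, NOT the assignment's dataset.
sample = [0.24, 0.25, 0.26, 0.28, 0.29, 0.30, 0.32, 0.33, 0.35, 0.91]
print(iqr_outliers(sample))  # [0.91]
```

With the real data, the same function could be called on the values of the Measure column, for example `iqr_outliers(list(inf["Measure"]))`.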
Q8) What is the probability that at least one in five attempted telephone calls reaches the wrong
number? (Assume independence of attempts.)
Hint: [Using Probability formula evaluate the probability of one call being wrong out of five
attempted calls]
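Number of calls, n = 5
Probability that a single call reaches the wrong number, p = 1/200
Probability that a single call reaches the correct number, q = 1 - p = 199/200
Binomial probability of x wrong numbers in n calls: P(x) = ⁿCₓ pˣ qⁿ⁻ˣ
P(at least one wrong number in five calls)
= 1 - P(0)
= 1 - ⁵C₀ (1/200)⁰ (199/200)⁵
= 1 - (199/200)⁵
= 0.02475
Therefore, the probability that at least one in five attempted telephone calls reaches the wrong
number is 0.02475.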
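The complement rule used above, P(at least one) = 1 - P(none), can be checked numerically. A minimal sketch using only the standard library, with p = 1/200 taken from the answer above:

```python
from math import comb

n = 5        # number of attempted calls
p = 1 / 200  # probability that a single call reaches the wrong number

# Binomial probability of zero wrong numbers in n independent calls.
p_zero = comb(n, 0) * p**0 * (1 - p) ** n

# Complement: at least one wrong number.
p_at_least_one = 1 - p_zero
print(round(p_at_least_one, 5))  # 0.02475
```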
Q9) Returns on a certain business venture, to the nearest $1,000, are known to follow the
following probability distribution
X ($)     P(x)
-2,000    0.1
-1,000    0.1
0         0.2
1,000     0.2
2,000     0.3
3,000     0.1
(i) What is the most likely monetary outcome of the business venture?
Hint: [The outcome is most likely the expected returns of the venture]
Ans(i) The most likely monetary outcome of the business venture is $2,000, since it has the highest probability, P(x) = 0.3.
(iv) What is the good measure of the risk involved in a venture of this kind? Compute
this measure.
Hint: [Risk here stems from the possible variability in the expected returns,
therefore, name the risk measure for this venture]
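ANS: Since the risk stems from the variability of the returns, a good measure is the variance (or the standard deviation) of X.
E(X) = ∑ x · P(x) = 800
Var(X) = ∑ x² · P(x) - [E(X)]² = 2,800,000 - 640,000 = 2,160,000
σ = √2,160,000 ≈ 1,469.69
In addition, P(loss) = P(x = -2000) + P(x = -1000) = 0.2, so there is a 20% chance of making a loss.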
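The hint for (iv) ties risk to the variability of the returns. As a rough sketch, the snippet below computes the expected return, variance, and standard deviation of the distribution in the table, alongside the most likely outcome and the probability of a loss. Variable names are illustrative.

```python
from math import sqrt

# Returns and their probabilities, taken from the table for Q9.
returns = [-2000, -1000, 0, 1000, 2000, 3000]
probs = [0.1, 0.1, 0.2, 0.2, 0.3, 0.1]

# Expected return: E(X) = sum of x * P(x)
expected = sum(x * p for x, p in zip(returns, probs))

# Variance: Var(X) = sum of P(x) * (x - E(X))^2, and its square root.
variance = sum(p * (x - expected) ** 2 for x, p in zip(returns, probs))
std_dev = sqrt(variance)

# Most likely outcome: the return with the highest probability.
most_likely = max(zip(returns, probs), key=lambda pair: pair[1])[0]

# Probability of a loss: total probability of the negative returns.
p_loss = sum(p for x, p in zip(returns, probs) if x < 0)

print(round(expected, 2), round(variance, 2), round(std_dev, 2))
print(most_likely, round(p_loss, 2))
```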
Hints:
For each assignment, the solution should be submitted in the format below:
1. Research and perform all possible steps for obtaining the solution.
2. For statistics calculations, the explanation of the solutions should be documented in detail along
with the code. Use the same Word document to fill in your explanation.
Must follow these guidelines:
2.1. Be thorough with the concepts of Probability and the Central Limit Theorem, and perform the
calculations stepwise.
2.2. For True/False or short-answer questions, an explanation is a must.
2.3. R & Python code for Univariate Analysis (histogram, box plot, bar plots, etc.) of the data
distribution is to be attached.
3. All the code (executable programs) should execute without errors.
4. Code modularization should be followed.