One Shot Prob

The document discusses various concepts in probability and statistics, including the probability of events with coin tosses, conditional probability, random variables, and distributions such as Bernoulli, Binomial, and Normal distributions. It also covers topics like expectation, variance, covariance, and the Central Limit Theorem, along with examples and problems related to these concepts. Additionally, it addresses confidence intervals and estimators for population properties.

Uploaded by

roudradev43
Copyright
© All Rights Reserved
Probability & Statistics

in one go
Four fair coins are tossed simultaneously. The probability that at least one head and
one tail turn up is:
(A) 1/16
(B) 1/8
(C) 7/8
(D) 15/16

Answer: (C)
Answer: D
Answer: B
Example: 500 students are taking one or more courses out of chemistry, physics and
mathematics. Registration records indicate course enrolment as follows: chemistry (329),
physics (186), mathematics (295), chemistry and physics (83), chemistry and mathematics
(217), and physics and mathematics (63). How many students are taking all 3 subjects?

a. 37
b. 43
c. 47
d. 53

Answer (d)
Different Types of Events

• Mutually Exclusive Events


• Exhaustive Events
• Equally Likely Events
• Independent Events
• Dependent Events

01
GATE DA 2024
Three fair coins are tossed independently. T is the event that two or more tosses result
in heads. S is the event that two or more tosses result in tails. What is the probability of
the event 𝑇 ∩ 𝑆 ?
(a) 0
(b) 0.5
(c) 0.25
(d) 1

Answer: (a)
Conditional
Probability

01
P and Q are considering applying for a job. The probability that P applies for the job is
1/4. The probability that P applies for the job given that Q applies for the job is 1/2, and
the probability that Q applies for the job given that P applies for the job is 1/3. The
probability that P does not apply for the job given that Q does not apply for the job is:
(A) 4/5
(B) 5/6
(C) 7/8
(D) 11/12

Answer: (A)
Law of total probability & Bayes’s formula
Suppose that if I take bus 1, then I am late with probability 0.1. If I take bus 2, I am
late with probability 0.2. The probability that I take bus 1 is 0.4, and the probability
that I take bus 2 is 0.6. What is the probability that I am late?

Answer: 0.16 (by the law of total probability, 0.1 × 0.4 + 0.2 × 0.6 = 0.04 + 0.12)
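The bus problem can be checked with a few lines of Python. This is a minimal sketch, assuming the two buses are the only options; the dictionary names are illustrative and not from the source. It also applies Bayes's formula to get the probability of each bus given that I am late.

```python
# Law of total probability: P(late) = sum over buses of P(late | bus) * P(bus).
p_bus = {"bus1": 0.4, "bus2": 0.6}             # P(bus), from the problem
p_late_given_bus = {"bus1": 0.1, "bus2": 0.2}  # P(late | bus), from the problem

p_late = sum(p_late_given_bus[b] * p_bus[b] for b in p_bus)

# Bayes's formula: P(bus | late) = P(late | bus) * P(bus) / P(late).
p_bus_given_late = {b: p_late_given_bus[b] * p_bus[b] / p_late for b in p_bus}

print(round(p_late, 4))
print({b: round(p, 4) for b, p in p_bus_given_late.items()})
```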
GATE DA Sample paper 2024

A class contains 60% students who are incapable of changing their opinions about
anything, and 40% students who change their minds at random, with probability
0.3, between subsequent votes on the same issue. Then, the probability that a
randomly chosen student voted twice in the same way is _______.
GATE DA Sample paper 2024

Let {O1, O2, O3, O4} represent the outcome of a random experiment, with
P({O1})=P({O2})=P({O3})=P({O4}). Consider the following events: P={O1,O2},
Q={O2,O3}, R={O3,O4},S={O1,O2,O3}. Then, which of the following statements is
true?
(A) P and Q are independent
(B) P and Q are not independent
(C) R and S are independent
(D) Q and S are not independent
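Each statement can be tested directly against the definition: two events A and B are independent exactly when P(A ∩ B) = P(A)P(B). The sketch below assumes the four outcomes are equally likely, as the problem states; the helper names prob and independent are illustrative.

```python
from fractions import Fraction

outcomes = {"O1", "O2", "O3", "O4"}

def prob(event):
    # Equally likely outcomes: P(E) = |E| / |sample space|.
    return Fraction(len(event), len(outcomes))

def independent(a, b):
    # A and B are independent exactly when P(A ∩ B) = P(A) * P(B).
    return prob(a & b) == prob(a) * prob(b)

P = {"O1", "O2"}
Q = {"O2", "O3"}
R = {"O3", "O4"}
S = {"O1", "O2", "O3"}

for name, (a, b) in {"P,Q": (P, Q), "R,S": (R, S), "Q,S": (Q, S)}.items():
    print(name, "independent" if independent(a, b) else "not independent")
```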
GATE DA 2024
Random Variable

01
Discrete Random Variable:
Continuous Random Variable:
Example: Compute the value of P(1 < X < 2), where f(x) is the density function

f(x) = \begin{cases} k x^3, & 0 \le x \le 3 \\ 0, & \text{otherwise} \end{cases}
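For this density, the constant k follows from requiring the density to integrate to 1 over [0, 3]; P(1 < X < 2) is then a second integral of the same form. A minimal sketch using exact fractions; the helper name integral_x3 is an illustrative choice, not from the source.

```python
from fractions import Fraction

def integral_x3(a, b):
    # Exact integral of x^3 over [a, b]: (b^4 - a^4) / 4.
    return Fraction(b**4 - a**4, 4)

# Normalization: k * (integral of x^3 from 0 to 3) = 1.
k = 1 / integral_x3(0, 3)

# P(1 < X < 2) = k * (integral of x^3 from 1 to 2).
p_1_to_2 = k * integral_x3(1, 2)

print(k, p_1_to_2)
```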
Cumulative Distribution Function
Consider the density function

Find cdf.
Expectation
A fair die with faces {1,2,3,4,5,6} is thrown repeatedly till ′3′ is observed for the first
time. Let X denote the number of times the die is thrown. The expected value of X
is _______.
(GATE ECE 2015 Set 3)
(a) 5
(b) 6
(c) 10
(d) 15

Answer (b)
Properties of Expectation

i. Let g and h be functions, and let a and b be constants. For any random variable X
(discrete or continuous),
E {ag(X) + bh(X) } = aE { g(X) } + bE { h(X) } .
In particular, E(aX + b) = aE(X) + b.

ii. Let X and Y be ANY random variables (discrete, continuous, independent, or non-
independent). Then E(X + Y ) = E(X) + E(Y ).

iii. Let X and Y be independent random variables, and g, h be functions.
Then E(XY) = E(X)E(Y) and, more generally,
E[g(X)h(Y)] = E[g(X)] E[h(Y)].
Variance
Properties of Variance
(i) Let g be a function, and let a and b be constants.
For any random variable X (discrete or continuous),
Var{ag(X) + b} = a^2 Var{g(X)}.
In particular, Var(aX + b) = a^2 Var(X).

(ii) Let X and Y be independent random variables.
Then Var(X + Y) = Var(X) + Var(Y).

(iii) If X and Y are NOT independent,
then Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y).
GATE DA Sample paper 2024
X is a uniformly distributed random variable from 0 to 1

The variance of X is
(A) 1/2
(B) 1/3
(C) 1/4
(D) 1/12
Joint Probability
Distributions

01
Example 1: Roll two dice. Let X be the value on the first die and let Y be the
value on the second die
In example 1, describe the event B = ‘Y − X ≥ 2’ and find its probability.
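The event B can be described by listing the pairs (x, y) that satisfy it. A minimal sketch, assuming two fair six-sided dice so that all 36 pairs are equally likely; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Sample space: all (x, y) pairs, x on the first die and y on the second.
sample_space = list(product(range(1, 7), repeat=2))

# Event B: the second die exceeds the first by at least 2.
B = [(x, y) for x, y in sample_space if y - x >= 2]
p_B = Fraction(len(B), len(sample_space))

print(B)
print(p_B)
```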
Joint Probability Mass Function (Joint PMF)

The function f(x, y) is a joint probability function, or probability mass function, of the
discrete random variables X and Y if
1. f(x, y) ≥ 0 for all (x, y),
2. Σ_x Σ_y f(x, y) = 1,
3. P(X = x, Y = y) = f(x, y).
Let X be the outcome of a coin flip and Y the outcome of a die roll. Find the joint PMF.
In the previous example, take heads as 1 and tails as 0, and define A = {X + Y = 3} and
B = {min(X, Y) = 1}. Find P[A] and P[B].
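The coin and die example can be tabulated directly. This sketch assumes a fair coin and a fair die that are independent of each other (the source does not state this explicitly), with heads coded as 1 and tails as 0; the names joint_pmf, marginal_X and marginal_Y are illustrative.

```python
from fractions import Fraction
from itertools import product

# Joint PMF under the independence assumption: f(x, y) = P(X = x) * P(Y = y).
joint_pmf = {
    (x, y): Fraction(1, 2) * Fraction(1, 6)
    for x, y in product((0, 1), range(1, 7))
}

# Probabilities of events are sums of f(x, y) over the pairs in the event.
p_A = sum(p for (x, y), p in joint_pmf.items() if x + y == 3)
p_B = sum(p for (x, y), p in joint_pmf.items() if min(x, y) == 1)

# Marginal PMFs: sum the joint PMF over the other variable.
marginal_X = {x: sum(p for (a, _), p in joint_pmf.items() if a == x) for x in (0, 1)}
marginal_Y = {y: sum(p for (_, b), p in joint_pmf.items() if b == y) for y in range(1, 7)}

print(p_A, p_B)
```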
Marginal Probability Mass Function (Marginal PMF)
The marginal distributions of X alone and Y alone are, respectively,
fX(x) = Σ_y f(x, y) and fY(y) = Σ_x f(x, y).
Example 2: Data supplied by a company in Duluth, Minnesota, resulted in the contingency
table displayed below for the number of bedrooms and number of bathrooms of 50 homes
currently for sale. Suppose that one of these 50 homes is selected at random. Let X and Y
denote the number of bedrooms and the number of bathrooms, respectively, of the home
obtained.
(i) Obtain the joint PMF of X and Y.
(ii) Find the probability that the home obtained has the same number of
bedrooms and bathrooms, i.e., P(X = Y).
(iii) Find the marginal distribution of X alone.
(iv) Find the marginal distribution of Y alone.
Conditional Probability Mass Function (Conditional PMF)
Example: Refer to Example 2.
(a) Find the distribution of X | Y = 2.
(b) Use the result to determine f(3|2) = P(X = 3 | Y = 2).
Independence

Jointly-distributed random variables X and Y are independent if their joint pmf is the
product of the marginal pmf’s:

f(x, y) = fX(x)fY (y)


Suppose X and Y both take values in [0,1] with density f(x, y) = 4xy.
Show f(x, y) is a valid joint pdf,
visualize the event A = ‘X < 0.5 and Y > 0.5’ and find its probability.
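Both parts can be checked numerically. The sketch below approximates the double integrals with a midpoint rule on a grid; the function double_integral and the grid size n are illustrative choices, not from the source. A valid joint pdf is nonnegative and integrates to 1 over its support, and P(A) is the integral of f over the rectangle 0 ≤ x < 0.5, 0.5 < y ≤ 1.

```python
def double_integral(f, x_lo, x_hi, y_lo, y_hi, n=200):
    # Midpoint-rule approximation of the integral of f over a rectangle.
    hx = (x_hi - x_lo) / n
    hy = (y_hi - y_lo) / n
    total = 0.0
    for i in range(n):
        x = x_lo + (i + 0.5) * hx
        for j in range(n):
            y = y_lo + (j + 0.5) * hy
            total += f(x, y)
    return total * hx * hy

def f(x, y):
    # Joint density from the problem, valid on the unit square.
    return 4 * x * y

total_mass = double_integral(f, 0, 1, 0, 1)   # should be close to 1
p_A = double_integral(f, 0, 0.5, 0.5, 1)      # P(X < 0.5 and Y > 0.5)

print(round(total_mass, 6), round(p_A, 6))
```

Because f(x, y) = (2x)(2y) factors, the same probability can be found by hand as (0.5^2)(1 − 0.5^2) = 0.25 × 0.75.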
Suppose that the random variables X and Y have a joint density function given by

Find
(a) the constant c,
(b) P(X + Y > 4)
Consider the joint density function of the random variables X and Y:

a. Find marginal pdfs


b. Find P(X>1/2)
Conditional Probability Density Function (Conditional PDF)
Consider the joint probability density function of the random variables X and Y:
Expectation and Variance in Joint Random Variable
Covariance
If Var(X + 2Y ) = 40 and Var(X − 2Y ) = 20, what is Cov(X, Y )?
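A short worked derivation for this question, using the variance-of-a-sum property stated earlier together with Var(aY) = a^2 Var(Y) and Cov(X, aY) = a Cov(X, Y):

```latex
\begin{align*}
\operatorname{Var}(X + 2Y) &= \operatorname{Var}(X) + 4\operatorname{Var}(Y) + 4\operatorname{Cov}(X, Y) = 40 \\
\operatorname{Var}(X - 2Y) &= \operatorname{Var}(X) + 4\operatorname{Var}(Y) - 4\operatorname{Cov}(X, Y) = 20 \\
\text{Subtracting:}\quad 8\operatorname{Cov}(X, Y) &= 20
\quad\Longrightarrow\quad \operatorname{Cov}(X, Y) = 2.5
\end{align*}
```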
Correlation
What is the correlation between x and a−x?
Conditional Expectation and Variance
What is the conditional expectation of Y given X?

Answer: 2/3
What is the conditional variance of Y given X?
GATE DA 2024

Two fair coins are tossed independently. X is a random variable that takes a value of 1 if
both tosses are heads and 0 otherwise. Y is a random variable that takes a value of 1 if
at least one of the tosses is heads and 0 otherwise.

The value of the covariance of X and Y is ______ (rounded off to three decimal places).

Answer: 0.0625
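The covariance can be verified by enumerating the four equally likely outcomes and applying Cov(X, Y) = E(XY) − E(X)E(Y). A minimal sketch; the function names X and Y mirror the problem's random variables, and the remaining names are illustrative.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product("HT", repeat=2))  # (H,H), (H,T), (T,H), (T,T)
p = Fraction(1, len(outcomes))            # each outcome has probability 1/4

def X(toss):
    # 1 if both tosses are heads, 0 otherwise.
    return 1 if toss.count("H") == 2 else 0

def Y(toss):
    # 1 if at least one toss is heads, 0 otherwise.
    return 1 if toss.count("H") >= 1 else 0

E_X = sum(p * X(t) for t in outcomes)
E_Y = sum(p * Y(t) for t in outcomes)
E_XY = sum(p * X(t) * Y(t) for t in outcomes)

cov = E_XY - E_X * E_Y
print(cov, float(cov))
```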
GATE DA Sample Paper 2024
Probability
Distributions

01
Bernoulli Distribution
Geometric Distribution
Binomial Distribution
Poisson Distribution
Uniform Distribution
Exponential Distribution
Normal Distribution
Standard Normal Distribution
Passengers try repeatedly to get a seat reservation in any train running between two
stations until they are successful. If there is a 40% chance of getting a reservation in
any attempt by a passenger, then the average number of attempts that passengers
need to make to get a seat reserved is __________.

Answer 2.5
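The number of attempts up to and including the first success follows a geometric distribution, whose mean is 1/p. The sketch below checks this numerically by summing a truncated series for E(X) = Σ k · p · (1 − p)^(k − 1); the function name expected_trials and the truncation at 500 terms are illustrative choices. The same function applies to the earlier die problem, where p = 1/6.

```python
def expected_trials(p, terms=500):
    # Truncated series for the mean of a geometric distribution:
    # E(X) = sum over k >= 1 of k * p * (1 - p)^(k - 1).
    return sum(k * p * (1 - p) ** (k - 1) for k in range(1, terms + 1))

reservation = expected_trials(0.4)   # seat reservation, p = 0.4
first_three = expected_trials(1 / 6) # throws of a fair die until a 3 appears

print(round(reservation, 6), round(first_three, 6))
```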
Each sample of water has a 10% chance of containing a particular organic pollutant.
Assume that the samples are independent with regard to the presence of the
pollutant. Find the probability that in the next 18 samples, exactly 2 contain the
pollutant
A fair coin is tossed independently four times. The probability of the event "The number
of times heads show up is more than the number of times tails show up" is _________.
(GATE ECE 2010)
A. 1/16
B. 1/8
C. 1/4
D. 5/16

Answer (d)
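Both the water-sample problem and the coin problem above are binomial: the number of successes in n independent trials with success probability p has P(X = k) = C(n, k) p^k (1 − p)^(n − k). A minimal sketch; the function name binomial_pmf is illustrative.

```python
from math import comb

def binomial_pmf(k, n, p):
    # P(X = k) for X ~ Binomial(n, p).
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Water samples: exactly 2 of 18 contain the pollutant, p = 0.1.
p_exactly_two = binomial_pmf(2, 18, 0.1)

# Four fair coin tosses: more heads than tails means 3 or 4 heads.
p_more_heads = binomial_pmf(3, 4, 0.5) + binomial_pmf(4, 4, 0.5)

print(round(p_exactly_two, 4), p_more_heads)
```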
If a random variable X has a Poisson distribution with mean 5, then the expression
E[(X + 2)^2] equals _____.
Note: This question appeared as Numerical Answer Type.
(A) 54
(B) 55
(C) 56
(D) 57

Answer: (A)
GATE DA 2024
The sample average of 50 data points is 40. The updated sample average after including
a new data point taking the value of 142 is ______

Answer: (42)
Suppose you break a stick of unit length at a point chosen uniformly at random. Then the
expected length of the shorter stick is ________ .

Answer: 0.25
Let X denote the time between detections of a particle with a Geiger counter
and assume that X has an exponential distribution with E(X) = 1.4 minutes.
What is the probability that we detect a particle within 30 seconds of starting the
counter?
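For an exponential distribution with mean E(X), the rate is λ = 1/E(X) and the CDF is P(X ≤ x) = 1 − e^(−λx). A minimal sketch for the Geiger counter question, converting 30 seconds to 0.5 minutes so the units match the mean; the function name exponential_cdf is illustrative.

```python
from math import exp

def exponential_cdf(x, rate):
    # P(X <= x) for an exponential distribution with the given rate.
    return 1 - exp(-rate * x) if x >= 0 else 0.0

mean_minutes = 1.4
rate = 1 / mean_minutes            # lambda, in detections per minute

# 30 seconds = 0.5 minutes.
p_within_30s = exponential_cdf(0.5, rate)
print(round(p_within_30s, 4))
```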
P(Z<1) =

P(Z<0.87) =
P(Z< - 1) =

P(Z>1.5) =
P(Z> - 1) =

P(Z> -1.56) =
P(- 1< Z < 1) =
Suppose that the current measurements in a strip of wire are assumed to follow a
normal distribution with a mean of 10 milliamperes and a variance of 4
(milliamperes)^2. What is the probability that a measurement exceeds 13
milliamperes?
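The standard normal probabilities listed above, and the current-measurement question, can be computed without a z table using Python's statistics.NormalDist. This is a sketch for checking table lookups; the variable names are illustrative. Note that a variance of 4 corresponds to a standard deviation of 2.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

table = {
    "P(Z < 1)": Z.cdf(1),
    "P(Z < 0.87)": Z.cdf(0.87),
    "P(Z < -1)": Z.cdf(-1),
    "P(Z > 1.5)": 1 - Z.cdf(1.5),
    "P(Z > -1)": 1 - Z.cdf(-1),
    "P(Z > -1.56)": 1 - Z.cdf(-1.56),
    "P(-1 < Z < 1)": Z.cdf(1) - Z.cdf(-1),
}
for label, value in table.items():
    print(f"{label} = {value:.4f}")

# Current measurement: mean 10 mA, variance 4, so standard deviation 2 mA.
current = NormalDist(mu=10, sigma=2)
p_exceeds_13 = 1 - current.cdf(13)   # same as P(Z > 1.5) after standardizing
print(f"P(X > 13) = {p_exceeds_13:.4f}")
```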
GATE ME 2014

A nationalized bank has found that the daily balance available in its saving bank
accounts follows a normal distribution with a mean of Rs. 500 and a standard deviation
of Rs. 50. The percentage of savings account holders who maintain an average daily
balance more than Rs. 500 is _______________.

Answer: 50
GATE CS 2008
Let X be a random variable following a normal distribution with mean +1 and
variance 4. Let Y be another normal variable with mean -1 and unknown variance.
If P(X <= -1) = P(Y >= 2), the standard deviation of Y is
(A) 3
(B) 2
(C) sqrt(2)
(D) 1

Answer: (A)
Student's t distribution is the sampling distribution of the t-statistic.

The value of the t-statistic is given by:

t = \frac{\bar{x} - \mu}{s / \sqrt{n}}

where,
t = t score,
x̄ = sample mean,
μ = population mean,
s = standard deviation of the sample,
n = sample size

Just as there is a z table for the standard normal distribution, there is a t table for
the t distribution; it is discussed in the statistics part.


Basics of Statistics
Central Limit Theorem for mean

Suppose X is a random variable with a distribution that may be known or unknown
(it can be any distribution).
Suppose:
a. µ = the mean of X
b. σ = the standard deviation of X

If you draw random samples of size n, then as n increases, the random
variable X̄, which consists of sample means, tends to be normally distributed, with

\bar{X} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)

that is, with mean µ and standard deviation σ/√n.
An unknown distribution has a mean of 90 and a standard deviation of 15. Samples
of size n = 25 are drawn randomly from the population.
Find the probability that the sample mean is between 85 and 92.
Example: Consider a normal population with mean µ = 82 and standard deviation σ = 12.

(a) If a random sample of size 64 is selected, what is the probability that the sample
mean X̄ will lie between 80.8 and 83.2?
(b) With a random sample of size 100, what is the probability that the sample mean X̄
will lie between 80.8 and 83.2?
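Both examples above use the same recipe: treat the sample mean as normal with mean µ and standard deviation σ/√n, then take a difference of CDF values. A minimal sketch using statistics.NormalDist; the function name sample_mean_probability is illustrative. For the first example the normal model is the Central Limit Theorem approximation, since the population distribution is unknown.

```python
from math import sqrt
from statistics import NormalDist

def sample_mean_probability(mu, sigma, n, low, high):
    # Sample mean is modeled as Normal(mu, sigma / sqrt(n)).
    xbar = NormalDist(mu, sigma / sqrt(n))
    return xbar.cdf(high) - xbar.cdf(low)

# Unknown distribution, mu = 90, sigma = 15, n = 25: P(85 < mean < 92).
p_unknown = sample_mean_probability(90, 15, 25, 85, 92)

# Normal population, mu = 82, sigma = 12: P(80.8 < mean < 83.2).
p_n64 = sample_mean_probability(82, 12, 64, 80.8, 83.2)
p_n100 = sample_mean_probability(82, 12, 100, 80.8, 83.2)

print(round(p_unknown, 4), round(p_n64, 4), round(p_n100, 4))
```

The larger sample gives the higher probability because the standard deviation of the sample mean shrinks as n grows.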
Central Limit Theorem for sum

Suppose X is a random variable with a distribution that may be known or unknown
(it can be any distribution).
Suppose:
a. µ = the mean of X
b. σ = the standard deviation of X

If you draw random samples of size n, then as n increases, the random variable ΣX,
which consists of the total of n observations, tends to be normally distributed, with

\sum X \sim N\left(n\mu, \sqrt{n}\,\sigma\right)

that is, with mean nµ and standard deviation √n·σ.


EXAMPLE A large freight elevator can transport a maximum of 9800 pounds. Suppose a
load of cargo containing 49 boxes must be transported via the elevator. Experience has
shown that the weight of boxes of this type of cargo follows a distribution with mean µ =
205 pounds and standard deviation σ = 15 pounds. Based on this information, what is the
probability that all 49 boxes can be safely loaded onto the freight elevator and
transported?

We are given n = 49, µ = 205, σ = 15.


The elevator can transport up to 9800 pounds. Therefore these 49 boxes will be
safely transported if they weigh in total less than 9800 pounds.
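Under the Central Limit Theorem for sums, the total weight of the 49 boxes is approximately normal with mean nµ and standard deviation √n·σ. A minimal sketch of the calculation; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, mu, sigma = 49, 205, 15
capacity = 9800

# Total weight is modeled as Normal(n * mu, sqrt(n) * sigma).
total_weight = NormalDist(n * mu, sqrt(n) * sigma)

z = (capacity - n * mu) / (sqrt(n) * sigma)  # standardized capacity
p_safe = total_weight.cdf(capacity)          # P(total weight < 9800)

print(round(z, 3), round(p_safe, 4))
```

The expected total (49 × 205 = 10045 pounds) already exceeds the capacity, which is why the probability of a safe load is small.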
Confidence Interval

The main idea is to estimate properties of a population based on properties of a sample.
Estimators of a Population

• A point estimate is a single value that best describes the population of interest.
• An interval estimate provides a range of values that best describes the population.
Confidence Interval Estimates:

In simple terms, a confidence interval is a range of values that is likely, at a stated
confidence level, to contain the true value of a population parameter.
The probability that the sample mean falls within a given distance of the population mean
is the same as the probability that the population mean lies within that distance of the
sample mean.
Thus, we can determine how confident we are that the population mean lies within a
certain interval around a sample mean.

Confidence Level:
The confidence level describes the uncertainty associated with a sampling method.
For example, suppose you were surveying the average height of men in a particular
city. You set a 95% confidence level and find that the 95% confidence interval
is (168, 182). That means that if you repeated the sampling over and over, about 95 percent
of the intervals constructed this way would contain the true average height; for this
sample, the interval runs from 168 cm to 182 cm.
Generally, we speak of a 100(1 − α)% confidence interval.

A 95% confidence interval means α = 0.05, i.e., there is a 5% chance of error.


Notation: zα

In statistical inference, we need the z values that give certain tail areas under the
standard normal curve.
zα will denote the z value for which α of the area under the z curve lies to the right
of zα.
What is the z value for a 90, 95, and 99 percent confidence interval?
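For a two-sided 100(1 − α)% confidence interval, the z value is the point with α/2 of the area to its right, that is, the (1 − α/2) quantile of the standard normal. A minimal sketch using the inverse CDF from statistics.NormalDist; the function name z_critical is illustrative.

```python
from statistics import NormalDist

def z_critical(confidence):
    # Two-sided critical value: alpha/2 of the area lies to the right of it.
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

z_values = {level: z_critical(level) for level in (0.90, 0.95, 0.99)}
for level, z in z_values.items():
    print(f"{level:.0%} confidence: z = {z:.3f}")
```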
Constructing a confidence interval involves 4 steps.
We use z-distribution when the sample size n>30. Z-test is more useful when the
standard deviation is known.
Consider the following example. A random sample of 50 adult females was taken
and their RBC count is measured. The sample mean is 4.63 and the standard
deviation of RBC count is 0.54. Construct a 95% confidence interval estimate for
the true mean RBC count in adult females.
A sample of 25 Valencia oranges weighed an average (mean) of 10 oz per orange. The
standard deviation of the population of weights of Valencia oranges is 2 oz. Find a 95%
confidence interval (CI) for the population mean.
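Both examples can be worked with the usual z interval, mean ± z · (standard deviation)/√n. The sketch below assumes, following the rule of thumb stated above, that the sample standard deviation may stand in for σ in the RBC example because n > 30; the function name z_interval is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(mean, sd, n, confidence=0.95):
    # Confidence interval for the mean: mean +/- z * sd / sqrt(n).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sd / sqrt(n)
    return mean - margin, mean + margin

rbc_ci = z_interval(4.63, 0.54, 50)      # RBC count, n = 50
orange_ci = z_interval(10, 2, 25)        # Valencia oranges, sigma known

print(tuple(round(v, 3) for v in rbc_ci))
print(tuple(round(v, 3) for v in orange_ci))
```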
What can we learn from the formula?
1. As n increases, the width of the CI DECREASES
2. As σ increases, the width of the CI INCREASES
3. As the confidence level increases, z ∗ INCREASES , so the width INCREASES
Hypothesis Testing

Hypothesis testing is a statistical method that is used in making a statistical
decision using experimental data.

Hypothesis testing evaluates two mutually exclusive population statements to
determine which statement is most supported by sample data.
Parameters of hypothesis testing

• Null hypothesis (H0): It is a basic assumption based on knowledge of the problem.

• Alternative hypothesis (H1): The alternative hypothesis is the hypothesis used
in hypothesis testing that is contrary to the null hypothesis.

Null hypothesis: A company's production is equal to 50 units per day.
Alternative hypothesis: A company's production is not equal to 50 units per day.

H0: The amount of lead in Maggie noodles does not exceed the maximum limit, i.e., 2.5 ppm.
H1: The amount of lead in Maggie noodles exceeds the maximum limit, i.e., 2.5 ppm.
Outcome 1: We reject the null hypothesis when in reality it is false.
Outcome 2: We reject the null hypothesis when in reality it is true (Type 1 Error).
Outcome 3: We fail to reject the null hypothesis when in reality it is false (Type 2 Error).
Outcome 4: We fail to reject the null hypothesis when in reality it is true.

We say "we fail to reject the null hypothesis" instead of "we accept the null hypothesis".
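• P-value
The p-value is the probability, computed assuming the null hypothesis is true, of
obtaining a result at least as extreme as the one observed. It is not the probability
that the null hypothesis is true.

• Level of significance
The level of significance (α) is the probability of rejecting the null hypothesis when
it is true.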

If the p-value is less than α, then the null hypothesis is rejected, and the
alternative hypothesis is accepted. If the p-value is greater than α, then the null
hypothesis is not rejected.
Z - Test

When to Use Z-test:


•Samples should be drawn at random from the population.
•The sample size should be greater than 30.
•The standard deviation of the population should be known.
Steps to perform Z-test:
• First, identify the null and alternate hypotheses.
• Determine the level of significance (α).
• Calculate the z-test statistic using the formula

z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}

where,
X̄: mean of the sample.
μ: mean of the population.
σ: standard deviation of the population.
n: sample size.
• Find the p-value using the z statistic.
• Compare the p-value with the level of significance and decide whether or not to reject
the null hypothesis.
Suppose the arousal of hot cats has a population that is normally distributed with a
standard deviation of 6. Tomorrow you sample 49 hot cats from this population and
obtain a mean arousal of 46.44 and a standard deviation of 5.6968. Using an alpha
value of α = 0.01, is this observed mean significantly less than an expected arousal of
47?
Problem: A school claimed that its students are more intelligent than those of the average
school. On calculating the IQ scores of 50 students, the average turns out to be 110. The
mean of the population IQ is 100 and the standard deviation is 15. State whether the
principal's claim is right or not at a 5% significance level.
A teacher claims that the mean score of students in his class is greater than 82
with a standard deviation of 20. If a sample of 81 students was selected with a
mean score of 90 then check if there is enough evidence to support this claim
at a 0.05 significance level.
Suppose the width of makeshift personalities has a population that is normally
distributed with a standard deviation of 7. You want to sample 22 makeshift
personalities from this population and obtain a mean width of 87.19 and a standard
deviation of 7.257. Using an alpha value of α = 0.01, is this observed mean significantly
less than an expected width of 89?
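The four one-tailed problems above follow the same steps: compute z from the population standard deviation, convert it to a p-value in the direction of the alternative hypothesis, and compare with α. The sketch below is one way to organize that; the function name one_tailed_z_test and the problem labels are illustrative, and for the teacher's problem the stated standard deviation of 20 is assumed to be the population value.

```python
from math import sqrt
from statistics import NormalDist

def one_tailed_z_test(sample_mean, mu0, sigma, n, tail):
    """Return (z, p_value); tail is 'less' or 'greater' (direction of H1)."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    cdf = NormalDist().cdf(z)
    if tail == "less":
        p_value = cdf        # P(Z <= z)
    elif tail == "greater":
        p_value = 1 - cdf    # P(Z >= z)
    else:
        raise ValueError("tail must be 'less' or 'greater'")
    return z, p_value

# (label, sample mean, hypothesized mean, population sd, n, tail, alpha)
problems = [
    ("arousal", 46.44, 47, 6, 49, "less", 0.01),
    ("IQ", 110, 100, 15, 50, "greater", 0.05),
    ("class score", 90, 82, 20, 81, "greater", 0.05),
    ("width", 87.19, 89, 7, 22, "less", 0.01),
]

results = {}
for label, xbar, mu0, sigma, n, tail, alpha in problems:
    z, p = one_tailed_z_test(xbar, mu0, sigma, n, tail)
    results[label] = (z, p, p < alpha)  # last entry True means "reject H0"
    print(f"{label}: z = {z:.3f}, p = {p:.4f}, reject H0: {p < alpha}")
```

Note that the sample standard deviations quoted in some of the problems are not used: when the population standard deviation is given, the z statistic uses that value.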
Z – Test (two – tailed)
Suppose the jewelry of exams has a population that is normally distributed with a
standard deviation of 5. You are walking down the street and sample 9 exams from
this population and obtain a mean jewelry of 28.95 and a standard deviation of
6.3802. Using an alpha value of α = 0.01, is this observed mean significantly different
than an expected jewelry of 27?
Suppose the life expectancy of Seattleites has a population that is normally distributed
with a standard deviation of 1. You go out and sample 45 Seattleites from this
population and obtain a mean life expectancy of 88.51 and a standard deviation of
1.0815. Using an alpha value of α = 0.05, is this observed mean significantly different
than an expected life expectancy of 89?
Suppose the width of bus riders has a population that is normally distributed with a
standard deviation of 10. Suppose that before graduation, your first job was to sample
98 bus riders from this population and obtain a mean width of 49.98 and a standard
deviation of 10.3386. Using an alpha value of α = 0.01, is this observed mean
significantly different than an expected width of 52?
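For a two-tailed test the p-value counts both tails, p = 2 · P(Z ≥ |z|), because a deviation in either direction counts against the null hypothesis. A minimal sketch applied to the three problems above; the function name two_tailed_z_test and the labels are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def two_tailed_z_test(sample_mean, mu0, sigma, n):
    """Return (z, p_value) for H1: the population mean differs from mu0."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# (label, sample mean, hypothesized mean, population sd, n, alpha)
problems = [
    ("jewelry", 28.95, 27, 5, 9, 0.01),
    ("life expectancy", 88.51, 89, 1, 45, 0.05),
    ("width", 49.98, 52, 10, 98, 0.01),
]

results = {}
for label, xbar, mu0, sigma, n, alpha in problems:
    z, p = two_tailed_z_test(xbar, mu0, sigma, n)
    results[label] = (z, p, p < alpha)  # last entry True means "reject H0"
    print(f"{label}: z = {z:.3f}, p = {p:.4f}, reject H0: {p < alpha}")
```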
T - Test

A t-test is a statistical test that compares means: either the mean of one sample against
a hypothesized population mean, or the means of two samples. It is used in
hypothesis testing, with a null hypothesis that the difference in means is
zero and an alternate hypothesis that the difference in means is different
from zero.
Steps to perform T-test:
• First, identify the null and alternate hypotheses.
• Determine the level of significance (α).
• Calculate the degrees of freedom, df = n − 1.
• Find the critical value of t using the t-table.
• Calculate the t-test statistic using the formula

t = \frac{\bar{X} - \mu}{s / \sqrt{n}}

where,
X̄: mean of the sample.
μ: mean of the population.
s: standard deviation of the sample.
n: sample size.
• Compare the calculated t with the critical value and decide whether or not to reject
the null hypothesis.
Problem: A school claimed that its students are more intelligent than those of the average
school. On calculating the IQ scores of 30 students, the average turns out to be 140 and
the standard deviation is 20. The mean of the population IQ is 100. State whether the
principal's claim is right or not at a 5% significance level.
Suppose we are interested in determining whether the average weight of a certain
breed of dog is significantly different from a target weight of 25 pounds. We randomly
select a sample of 20 dogs from this breed, weigh them, and get a mean of 24
pounds and a standard deviation of 0.7 pounds. State whether the average weight differs
significantly from the target at a 5% significance level.
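The two t-test problems above can be sketched as follows. Python's standard library has no t distribution, so the critical values are hard-coded from a standard t-table and should be treated as assumptions of this sketch: 1.699 for a one-tailed test at α = 0.05 with df = 29, and 2.093 for a two-tailed test at α = 0.05 with df = 19. The function name t_statistic is illustrative.

```python
from math import sqrt

def t_statistic(sample_mean, mu0, s, n):
    # t = (sample mean - hypothesized mean) / (s / sqrt(n)), with df = n - 1.
    return (sample_mean - mu0) / (s / sqrt(n))

# IQ problem: H1 is "mean > 100" (one-tailed), n = 30, so df = 29.
T_CRIT_ONE_TAILED_DF29 = 1.699   # assumed t-table value, alpha = 0.05
t_iq = t_statistic(140, 100, 20, 30)
reject_iq = t_iq > T_CRIT_ONE_TAILED_DF29

# Dog weights: H1 is "mean != 25" (two-tailed), n = 20, so df = 19.
T_CRIT_TWO_TAILED_DF19 = 2.093   # assumed t-table value, alpha = 0.05
t_dog = t_statistic(24, 25, 0.7, 20)
reject_dog = abs(t_dog) > T_CRIT_TWO_TAILED_DF19

print(f"IQ: t = {t_iq:.3f}, reject H0: {reject_iq}")
print(f"Dogs: t = {t_dog:.3f}, reject H0: {reject_dog}")
```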
Chi- Square Test

It is a powerful test for testing the significance of the discrepancy between theory and
experiment.
(OR)
The Chi-square (χ^2) test represents a useful method of comparing experimentally obtained
results with those to be expected theoretically on some hypothesis.
If the value of chi-square is very large, it indicates that the divergence between expected
and observed frequencies is large.
If the value of chi-square is very small, it indicates that the divergence between
observed and expected frequencies is very small.
The following steps are followed for the above said purpose:
i. A null and an alternative hypothesis related to the enquiry are stated.
ii. Expected or theoretical frequencies are derived through probability.
iii. A level of significance is chosen for rejection of the null hypothesis.
iv. The chi-square value is calculated as

\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}

where f_o are the observed and f_e the expected frequencies.
v. The observed frequencies are compared with the expected or theoretical
frequencies.

If the calculated value of χ^2 is less than the table value, we fail to reject the null
hypothesis. On the other hand, if the calculated value of χ^2 is greater than the
table value, we reject the null hypothesis.
Problem: Ninety-six subjects are asked to express their attitude towards the
proposition "Should AIDS education be integrated in the curriculum of Higher
secondary stage" by marking F (favorable), I (indifferent) or U (unfavorable).

                 F    I    U
Observed (fo)   48   24   24
Expected (fe)   32   32   32

Test the hypothesis that "there is no difference between preferences in the group".
Two hundred bolts were selected at random from the output of each of the five machines.
The number of defective bolts found were 5, 9, 13, 7 and 6 . Is there a significant
difference among the machines? Use 5% level of significance.
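Both chi-square problems above reduce to the same statistic, the sum of (observed − expected)^2 / expected over the categories. The sketch below makes two assumptions that are not in the source: the bolt problem is treated as a goodness-of-fit test on the defective counts alone, with equal expected counts under the null hypothesis, and the critical values 5.991 (df = 2) and 9.488 (df = 4) at the 5% level are taken from a standard chi-square table. The function name chi_square_statistic is illustrative.

```python
def chi_square_statistic(observed, expected):
    # Sum of (observed - expected)^2 / expected over all categories.
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Attitude survey: 96 subjects, 3 categories, df = 3 - 1 = 2.
chi2_attitude = chi_square_statistic([48, 24, 24], [32, 32, 32])
reject_attitude = chi2_attitude > 5.991   # assumed table value, 5% level, df = 2

# Defective bolts from 5 machines, df = 5 - 1 = 4.
defects = [5, 9, 13, 7, 6]
expected_defects = [sum(defects) / len(defects)] * len(defects)  # equal under H0
chi2_bolts = chi_square_statistic(defects, expected_defects)
reject_bolts = chi2_bolts > 9.488         # assumed table value, 5% level, df = 4

print(chi2_attitude, reject_attitude)
print(chi2_bolts, reject_bolts)
```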
Thank you
