Statistical Method
Suppose that two coins are tossed, so that the sample space is S = {HH, HT, TH, TT}.
Let X be the number of heads which can come up. With each sample point we
can associate a value of X, as shown in the table below:
Sample point:  HH  HT  TH  TT
X:              2   1   1   0
Similarly, suppose that when we throw two dice the total number of points equals 9
(X = 9). The corresponding set of sample points is
{(6,3), (5,4), (4,5), (3,6)}
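The event above can be verified by brute-force enumeration of the 36 equally likely outcomes; a minimal Python sketch:

```python
from itertools import product

# Enumerate all 36 equally likely outcomes of rolling two dice
outcomes = list(product(range(1, 7), repeat=2))

# Sample points whose total is 9
event = [pt for pt in outcomes if sum(pt) == 9]
print(event)                       # [(3, 6), (4, 5), (5, 4), (6, 3)]
print(len(event) / len(outcomes))  # P(X = 9) = 4/36
```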
Types of Random Variables
Discrete Random Variables
Continuous Random Variables
(1) A discrete random variable is one which may take on only a countable number
of distinct values, such as 0, 1, 2, 3, 4, .... Discrete random variables are usually
(but not necessarily) counts.
If a random variable can take only a finite number of distinct values, then it
must be discrete.
Examples
When 3 coins are tossed, the number of heads obtained is a random variable X
which assumes the values 0, 1, 2, 3, forming a countable set.
Number of children in a family,
Number of defective light bulbs in a box of ten.
(2) A continuous random variable is one which takes any value within a
certain interval (its domain).
[Figure: probability chart of p(x) against x = 1, 2, 3, 4.]
Cumulative Distribution Function (CDF)
It is a function giving the probability that the random variable X is less than or equal
to x, for every value x.
For a discrete random variable, the cumulative distribution function is found by
summing up the probabilities.
Example:
Outcome:      1    2    3    4
Probability: 0.1  0.3  0.4  0.2
FX(x) = Σ p(xi), summed over all xi ≤ x.
FX(x) is the distribution function of the random variable. It is also
called a step function: its graph looks like a staircase, with a jump
of magnitude pi at each point xi along the abscissa.
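For the four-outcome table above, the CDF can be built by accumulating the probabilities; a minimal Python sketch:

```python
outcomes = [1, 2, 3, 4]
probs = [0.1, 0.3, 0.4, 0.2]

# F_X(x) = sum of p_i over all outcomes <= x; the running total gives the
# jump heights of the staircase graph: 0.1, 0.4, 0.8, 1.0
cdf = []
total = 0.0
for p in probs:
    total += p
    cdf.append(total)
print(cdf)
```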
Example:
Two dice are rolled. Let X denote the random variable which counts the total
number of points on the upturned faces. Construct a table giving the non-zero
values of the probability mass function and draw the probability chart.
x       2     3     4     5     6     7     8     9    10    11    12
P(x)   1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36
      0.028 0.056 0.083 0.111 0.139 0.167 0.139 0.111 0.083 0.056 0.028
[Figure: probability chart of the probability function P(x) against x = 2, 3, ..., 12.]
x        2     3     4     5     6     7     8     9    10    11    12
P(X≤x)  1/36  3/36  6/36 10/36 15/36 21/36 26/36 30/36 33/36 35/36 36/36
       0.028 0.083 0.167 0.278 0.417 0.583 0.722 0.833 0.917 0.972 1.000
Example 2:
A random variable X has the following probability function:
x      0   1   2    3    4    5     6      7
P(x)   0   k   2k   2k   3k   k^2   2k^2   7k^2+k
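The usual task with this table (not stated explicitly in the source, so it is an assumption here) is to determine k from the requirement that the probabilities sum to 1; a Python sketch:

```python
# Since all probabilities must sum to 1:
#   0 + k + 2k + 2k + 3k + k^2 + 2k^2 + (7k^2 + k) = 10k^2 + 9k = 1
# Solve 10k^2 + 9k - 1 = 0 with the quadratic formula, keeping the root
# that gives a valid (non-negative) probability.
a, b, c = 10, 9, -1
k = (-b + (b * b - 4 * a * c) ** 0.5) / (2 * a)
print(k)  # 0.1

# Sanity check: the pmf really sums to 1 for this k
pmf = [0, k, 2 * k, 2 * k, 3 * k, k**2, 2 * k**2, 7 * k**2 + k]
```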
For a continuous random variable X with probability density function f(x),
P(α ≤ X ≤ β) = ∫[α to β] f(x) dx,
which represents the area under the curve y = f(x) between x = α and x = β.
(i) f(x) ≥ 0,
(ii) ∫ f(x) dx over the whole range = 1,
(iii) The probability P(E) given by P(E) = ∫E f(x) dx is well defined for any event E.
Remarks:
(i) P(X = c) need not be zero for a discrete random variable, but
P(X = c) is zero for a continuous random variable,
where c is a fixed constant.
Example 2:
Let X be a continuous random variable with p.d.f.:
(viii) μ2 (Variance) = E(X²) − [E(X)]²
The median M is the value for which
∫[−∞ to M] f(x) dx = ∫[M to ∞] f(x) dx = 1/2
Example3:
A random variable X is distributed at random between the values 0 and 1 so that
its probability density function is f(x) = kx²(1 − x³), where k is a constant. Find the
value of k. Using this value of k, find its mean and variance.
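Since f(x) = kx²(1 − x³) is a polynomial on (0, 1), the integrals can be done exactly; the following Python sketch uses exact fractions to find k, the mean, and the variance:

```python
from fractions import Fraction as F

# f(x) = k * x^2 * (1 - x^3) = k*(x^2 - x^5) on (0, 1); ∫ x^n dx over (0,1) = 1/(n+1)
total = F(1, 3) - F(1, 6)       # ∫ (x^2 - x^5) dx = 1/6, so k = 6
k = 1 / total

mean = k * (F(1, 4) - F(1, 7))  # E(X)   = ∫ x f(x) dx   = 6*(1/4 - 1/7) = 9/14
ex2 = k * (F(1, 5) - F(1, 8))   # E(X^2) = ∫ x^2 f(x) dx = 6*(1/5 - 1/8) = 9/20
var = ex2 - mean ** 2           # Var(X) = E(X^2) - [E(X)]^2 = 9/245
print(k, mean, var)  # 6 9/14 9/245
```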
Example4:
A variable X is distributed at random between the values 0 and 4 and its
probability density function is given by : f(x) = Kx3(4-x)2
Find the value of k, the mean and standard deviation of the distribution.
Example 5:
The number of minutes that a flight from Phoenix to Tucson is early or late is a
random variable whose probability density is given by
Where negative values are indicative of the flight’s being early and positive values
are indicative of its being late.
Find the probabilities that one of these flights will be
(a) At least 2 minutes early
(b) At least 1 minute late
(c) Anywhere from 1 to 3 minutes early
(d) Exactly 5 minutes late
Two-Dimensional Random Variable
Let X and Y be two random variables defined on the same sample space S. Then the
function (X, Y) that assigns to each outcome in S a point in R² = (R × R) is called
a two-dimensional random variable.
Ex: Measuring height and weight of every person.
Let (X, Y) be a discrete two-dimensional r.v. which takes a countable number of
values (xi, yj) with joint probabilities pij = P(X = xi, Y = yj). The marginal
probability functions are
pX(xi) = P(X = xi) = Σj pij = pi.
pY(yj) = P(Y = yj) = Σi pij = p.j
From the knowledge of joint distribution function FXY (x,y), it is possible to obtain the individual
distribution functions, FX (x), and FY (y) which are termed as marginal distribution function of X
and Y respectively with respect to the joint distribution function FXY (x,y).
fX(x) = ∫ fXY(x, y) dy
Similarly,
fY(y) = ∫ fXY(x, y) dx
Important Remark:
If we know the joint p.d.f. fXY(x, y) of two random variables X and Y,
we can obtain the individual distributions of X and Y in the form of
their marginal p.d.f.'s fX(x) and fY(y). However, the converse is not
true: from the marginal distributions of two jointly distributed
random variables, we cannot determine their joint distribution.
• Conditional Probability Density Function
The conditional distribution function FY/X (y/x) denotes the distribution
function of Y when X has already assumed the particular value x . Hence
FY/X (y/x) = P(Y≤ y /X=x)
The corresponding conditional probability density functions are
fY/X(y/x) = fXY(x, y) / fX(x)
fX/Y(x/y) = fXY(x, y) / fY(y)
Independent Random Variable
Two r.v.'s X and Y with joint p.d.f. fXY(x, y) and marginal p.d.f.'s fX(x)
and fY(y) are said to be independent if and only if fXY(x, y) = fX(x) fY(y)
for all (x, y).
For a discrete pair (X, Y), the expectation of a function g is
E[g(X, Y)] = Σi Σj g(xi, yj) P(X = xi, Y = yj)
• Conditional Expectation and Conditional Variance
Discrete Case:
The conditional expectation or mean value of a function g(X, Y), given
that Y = yj, is defined by
E{g(X, Y) | Y = yj} = Σi g(xi, yj) P(X = xi | Y = yj)
The conditional expectation of a discrete random variable X given Y = yj is
E(X | Y = yj) = Σi xi P(X = xi | Y = yj)
Continuous Case:
E{g(X, Y) | Y = y} = ∫ g(x, y) fX/Y(x/y) dx
E(Y | X = x) = ∫ y fY/X(y/x) dy
Moment Generating Function (MGF)
Moment generating functions play an important role in the
characterization of various distributions. To find the moments of a
distribution, the moment generating function is a convenient device.
MX(t) = E(e^(tX)) = Σx e^(tx) p(x)
In the case of a continuous random variable,
MX(t) = E(e^(tX)) = ∫ e^(tx) f(x) dx
Properties of MGF
The cumulants Kr are related to the moments by:
K1 = μ1' (the mean)
K2 = μ2 (the variance)
K3 = μ3
K4 + 3K2² = μ4
Properties of Cumulants
1. Additive property of cumulants:
The r-th cumulant of the sum of independent random variables
Xi ; i = 1, 2, 3, ..., n
is equal to the sum of the r-th cumulants of the individual variables.
2. Effect of change of origin and scale on cumulants.
Let X be a random variable and let
U = (X − a)/h
Then KU(t) = −(at/h) + KX(t/h)
Thus we see that, except for the first cumulant, all cumulants are independent
of a change of origin. But the cumulants do depend on a change of scale,
as the r-th cumulant of U is (1/h^r) times the r-th cumulant of the distribution
of X.
• Characteristic Function
Let X be a random variable; then its characteristic function is defined as
ØX(t) = E(e^(itX)), and |ØX(t)| ≤ ØX(0) = 1.
6. ØcX(t) = ØX(ct), c being a constant
Bernoulli Distribution
Binomial Distribution
Poisson Distribution
Geometric Distribution
P(X=x) = p^x q^(1−x) ; x = 0, 1
Where
x = Bernoulli variate,
p=probability of success
q=probability of failure
0≤p≤1
•
Characteristics of Bernoulli distribution:
Parameter of distribution is p
Mean = E(X) = p
Variance = V(X) = p q
Standard Deviation = SD(X) = √(pq)
Moment generating function = (q + pet)
Characteristic generating function = (q + peit)
Binomial Distribution:
Binomial distribution is a discrete probability distribution which arises when
Bernoulli trials are performed repeatedly a fixed number of times, say 'n'.
Its probability mass function is
P(X=x) = C(n, x) p^x q^(n−x)
where q = 1 − p and x = 0, 1, 2, ..., n.
The two independent constants ‘n’ and ‘p’ in the distribution are known as
the parameters of the distribution.
Condition/Assumptions of Binomial distribution:
1) The number of trials ‘n’ is finite.
2) Each trial must result in only two possible outcomes i.e. success or
failure.
3) The probability of success ‘p’ is constant for each trial.
4) The trials are independent of each other.
3) Coefficient of Skewness = (q − p)/√(npq)
4) Coefficient of Kurtosis = 3 + (1 − 6pq)/(npq)
5) Mode of the Binomial distribution is that value of the variable x which occurs
with the largest probability. The distribution may be unimodal or bimodal.
6) Moment generating function = (q + pet)n
7) Characteristic generating function = (q + peit)n
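The mean np and variance npq can be checked by direct enumeration of the pmf; a Python sketch (n = 10 and p = 0.3 are illustrative values, not from the source):

```python
from math import comb, sqrt

# Binomial pmf P(X = x) = C(n, x) p^x q^(n - x)
n, p = 10, 0.3
q = 1 - p
pmf = [comb(n, x) * p**x * q**(n - x) for x in range(n + 1)]

mean = sum(x * pmf[x] for x in range(n + 1))               # equals n*p = 3.0
var = sum(x * x * pmf[x] for x in range(n + 1)) - mean**2  # equals n*p*q = 2.1
skew = (q - p) / sqrt(n * p * q)                           # coefficient of skewness
print(round(mean, 6), round(var, 6), round(skew, 6))
```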
Negative Binomial Distribution:
P(X=x) = P(x) = C(x−1, r−1) p^r q^(x−r) ; x = r, r+1, r+2, ...
(the probability that the r-th success occurs on the x-th trial)
• Robert is a football player. His success rate of goal hitting is 70 %.
What is the probability that Robert hits his third goal on his fifth
attempt?
• A marksman is to continue shooting at the target until he hits it 6 times.
The probability that he hits the target on any shot is 0.40. Calculate the
probability that the marksman will have to shoot 9 times.
• If probability is 0.40 that a child exposed to a certain contagious disease will
catch it, what is the probability that the tenth child exposed to the disease will
be third to catch it?
Answer= 0.064
You are surveying people exiting from a polling booth and asking them if they
voted independent. The probability (p) that a person voted independent is 20%.
What is the probability that 15 people must be asked before you can find 5
people who voted independent?
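All four exercises above use the negative binomial probability C(x−1, r−1) p^r q^(x−r); a Python sketch that evaluates each one:

```python
from math import comb

def neg_binom(r, x, p):
    # P(the r-th success occurs on trial x) = C(x-1, r-1) p^r q^(x-r)
    return comb(x - 1, r - 1) * p**r * (1 - p)**(x - r)

print(round(neg_binom(3, 5, 0.7), 4))    # Robert's 3rd goal on his 5th attempt
print(round(neg_binom(6, 9, 0.4), 4))    # 6th hit on the 9th shot
print(round(neg_binom(3, 10, 0.4), 4))   # 10th child is the 3rd to catch it (≈ 0.064)
print(round(neg_binom(5, 15, 0.2), 4))   # 5th independent voter is the 15th asked
```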
• Geometric Distribution:
Suppose we have a series of independent trials, and in each trial the probability of
success "p" remains the same. Then the probability that there are x failures
preceding the first success is given by q^x p, where q = 1 − p.
Definition:
A Random variable X is said to have a geometric distribution if it assumes only
non-negative values and its probability mass function is given by:
P(X=x) = P(x) = q^x p ; x = 0, 1, 2, ...
Hypergeometric Distribution:
h(x; n, N, M) = C(M, x) C(N−M, n−x) / C(N, n)
for x = 0, 1, 2, 3, ..., min(n, M)
•
Mean = nM/N
Variance = n(M/N)(1 − M/N)(N − n)/(N − 1)
Normal Distribution:
f(x) = (1/(σ√(2π))) e^(−(x−μ)²/(2σ²))
where −∞ < x < ∞, −∞ < μ < ∞, σ > 0,
π = 3.14159..., e = 2.71828...
Note: The mean µ and standard deviation σ are called the parameters of the Normal
distribution. The normal distribution is expressed by X ~ N(µ, σ²).
• Standard Normal distribution:
Let X be a random variable which follows normal distribution with mean µ
and variance σ2 i.e. X ~ N(µ, σ2). The standard normal variate is defined as
Z = (X − µ)/σ, which follows the standard normal distribution with mean 0 and
standard deviation 1, i.e., Z ~ N(0, 1).
The standard normal density is given by f(z) = (1/√(2π)) e^(−z²/2)
where −∞ < z < ∞.
The advantage of the above function is that it doesn't contain any parameters.
This enables us to compute areas under the normal probability curve, and
all the properties hold good for the standard normal distribution. The standard
normal distribution is also known as the unit normal distribution.
Standard Normal distribution curve
Properties of normal distribution:
1. The normal curve is bell shaped and is symmetric about x = µ.
2. Mean, median, and mode of the distribution coincide:
3. Mean = Median = Mode = µ
4. It has only one mode, at x = µ (i.e., it is unimodal)
5. Since the curve is symmetrical, the coefficient of skewness (β1) = 0 and the
coefficient of kurtosis (β2) = 3.
6. The points of inflection are at x = µ ± σ
7. The maximum ordinate occurs at x = µ and its value is 1/(σ√(2π))
8. The x axis is an asymptote to the curve (i.e. the curve continues to approach
but never touches the x axis).
9. The first quartile (Q1) and third quartile (Q3) are equidistant from the median.
MX(t) = e^(µt + σ²t²/2)
Importance/ applications of normal distribution:
The normal distribution occupies a central place in the theory of Statistics.
1) The ND has a remarkable property stated in the central limit theorem, which
states that as the sample size (n) increases, the distribution of the mean of a
random sample is approximately normally distributed.
2) As the sample size (n) becomes large, the ND serves as a good approximation of many
discrete probability distributions, viz. Binomial, Poisson, Hypergeometric, etc.
If X ~ N(µ1, σ1²) and Y ~ N(µ2, σ2²) are independent, then
X + Y ~ N(µ1 + µ2, σ1² + σ2²)
X − Y ~ N(µ1 − µ2, σ1² + σ2²)
aX ~ N(aµ1, a²σ1²)
aX + bY ~ N(aµ1 + bµ2, a²σ1² + b²σ2²)
If X1, X2, X3, ..., Xn are n independent random variables distributed
normally with means µ1, µ2, ..., µn and variances σ1², σ2², ..., σn² respectively,
then the sum (X1 + X2 + X3 + ... + Xn) is normally distributed with mean
(µ1 + µ2 + ... + µn) and variance (σ1² + σ2² + ... + σn²).
If X1, X2, X3, ..., Xn are n independent random variables distributed
normally with means µ1, µ2, ..., µn and variances σ1², σ2², ..., σn² respectively,
then the difference (X1 − X2 − X3 − ... − Xn) is normally distributed with mean
(µ1 − µ2 − ... − µn) and variance (σ1² + σ2² + ... + σn²).
If birth weights in a population are normally distributed with a mean of 109 oz
and a standard deviation of 13 oz
(A). What is the chance of obtaining a birth weight of 141 oz or heavier when
sampling birth records at random?
(B). What is the chance of obtaining a birth weight of 120 or lighter?
Solutions:
(A). What is the chance of obtaining a birth weight of 141 oz or heavier when
sampling birth records at random?
Z = (141 − 109)/13 = 2.46
2. If Random variable X ~ N (40, 52). Find the probabilities for the values of X
specified as (i) 32 < X ≤ 50 (ii) X ≥ 44 (iii) 45 ≤ X ≤ 50 (iv) 31≤ X ≤ 35.
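Probabilities like these can be computed without tables using the error function, since Φ(z) = (1 + erf(z/√2))/2; a Python sketch for the birth-weight example and part (i) of problem 2:

```python
from math import erf, sqrt

def phi(z):
    # Standard normal CDF via the error function: Φ(z) = (1 + erf(z/√2)) / 2
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# Birth-weight example: X ~ N(109, 13^2)
mu, sigma = 109, 13
p_heavy = 1 - phi((141 - mu) / sigma)  # P(X >= 141), z ≈ 2.46
p_light = phi((120 - mu) / sigma)      # P(X <= 120)

# Problem 2: X ~ N(40, 5^2)
p_i = phi((50 - 40) / 5) - phi((32 - 40) / 5)  # P(32 < X <= 50)
print(round(p_heavy, 4), round(p_light, 4), round(p_i, 4))
```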
•3. X is normally distributed and the mean is 12 and SD is 4.
Find the probability of the following.
Uniform (Rectangular) Distribution:
f(x; a, b) = 1/(b − a) , a ≤ x ≤ b
F(x) = (x − a)/(b − a) , a ≤ x ≤ b
The total area under the curve is 1.
Mean = Median = (a + b)/2
Moment generating function (MGF), MX(t) = (e^(bt) − e^(at)) / (t(b − a)) , t ≠ 0
Mode does not exists as probability at each point in an interval (a,b) remains the
same.
[Figure: uniform density P(x) = 0.2 for temperature x from 0 to 5 degrees Celsius.]
• Note that the total area under the “curve” is 1.
• Since we have a continuous random variable there are an infinite number of
possible outcomes between 0 and 5, the probability of one number out of an
infinite set of numbers is 0.
What is the probability that the temperature is between 1°C and 4°C?
We know that the total area of the rectangle is 1, and we can see that the part
of the rectangle between 1 and 4 is 3/5 of the total, so
P(1 ≤ x ≤ 4) = (4 − 1)(0.20) = 3/5 = 0.6.
Example:
The waiting time for the train that leaves every 30 minutes is uniformly
distributed from 0 to 30 minutes. Find the probability that a person arriving at a
random time will wait between 10 and 15 minutes.
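For a uniform distribution, any such probability is just the length of the interval of interest divided by the length of the support; a Python sketch covering the temperature and train examples:

```python
def uniform_prob(lo, hi, a, b):
    # P(lo <= X <= hi) for X ~ Uniform(a, b): overlap length / support length
    lo, hi = max(lo, a), min(hi, b)
    return max(0.0, hi - lo) / (b - a)

# Temperature ~ Uniform(0, 5): P(1 <= X <= 4) = 3/5
print(uniform_prob(1, 4, 0, 5))      # 0.6
# Train waiting time ~ Uniform(0, 30): P(10 <= X <= 15) = 5/30 = 1/6
print(uniform_prob(10, 15, 0, 30))
```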
Example:
Suppose a train is delayed by at most 60 minutes. What is the probability
that the train arrives within 57 to 60 minutes?
Example:
Suppose a flight is about to land and the announcement says that the expected
time to land is 30 minutes. Find the probability of getting flight land between 25
to 30 minutes?
If X has a uniform distribution on (0, 1), then Y = −2 log X
follows a chi-square distribution with 2 degrees of freedom.
For the gamma distribution, the coefficient of skewness is β1 = 4/λ and the
coefficient of kurtosis is β2 = 3 + 6/λ.
In the one-parameter gamma distribution, the moment generating function (MGF) is
MX(t) = (1 − t)^(−λ) , t < 1
X ~ Ꝩ(a, λ) denotes the two-parameter gamma distribution.
•
• As λ → ꝏ, gamma distribution tends to normal distribution. It is called the
limiting form of gamma distribution.
• X~ Ꝩ (a, λ), if, λ=1, then gamma distribution tends to Exponential distribution.
Answer:
Let X represent the daily consumption of electric power (in millions of
kilowatt-hours).
Example:
Suppose that the lifetime of a device (in years) has the gamma distribution with
shape parameter λ = 2 and scale parameter a=4.
a. Find the probability that the device will last more than 3 years.
Example:
Suppose you are fishing and you expect to get a fish once every ½ hour. Compute
the probability that you will have to wait between 2 to 4 hours before you catch 4
fish.
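For an integer shape parameter the gamma waiting-time CDF has a closed form, P(T ≤ t) = 1 − Σ[i=0 to k−1] e^(−λt)(λt)^i/i!; a Python sketch for the fishing example (rate 2 fish per hour, waiting for the 4th fish):

```python
from math import exp, factorial

def erlang_cdf(t, k, lam):
    # P(T <= t), where T is the waiting time for the k-th event of a
    # Poisson process with rate lam (gamma distribution with integer shape k)
    return 1.0 - sum(exp(-lam * t) * (lam * t) ** i / factorial(i) for i in range(k))

# One fish every half hour on average -> rate 2 per hour; wait for the 4th fish
p = erlang_cdf(4, 4, 2.0) - erlang_cdf(2, 4, 2.0)  # P(2 <= T <= 4)
print(round(p, 4))
```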
• Exponential Distribution:
A random variable X is said to have an exponential distribution with parameter θ
>0, if its p.d.f is given by:
f(x, θ) = θ e^(−θx) , x ≥ 0
0 , otherwise
Mean = 1/θ
Variance = 1/θ²
Mode = 0
Coefficient of excess kurtosis γ2 = 6
• Two Parameter Laplace Distribution:
A continuous r.v. X is said to have a double exponential (Laplace) distribution
with two parameters λ and μ if its p.d.f. is given by:
f(x; λ, μ) = (λ/2) e^(−λ|x−μ|) , −∞ < x < ∞
(a) What is the probability that a component is still working after 5000 hours?
(b) Find the mean and standard deviation of the time till failure.
Example:
If jobs arrive every 15 seconds on average (θ = 4 per minute), what is the
probability of waiting less than or equal to 30 seconds, i.e., 0.5 minute?
Example:
The time intervals between successive barges passing a certain point on a busy
waterway have an exponential distribution with mean 8 minutes.
(a) Find the probability that the time interval between two successive barges is
less than 5 minutes.
(b) Find the probability that the time interval between two successive barges is
between 6 and 8 minutes.
Example
The mileage which car owners get with a certain kind of radial tire is a random
variable having an exponential distribution with mean 40,000 km. Find the
probabilities that one of these tires will last
(i) at least 20,000 km
(ii) at most 30,000 km
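With mean 40,000 km the rate is θ = 1/40,000 and P(X ≥ x) = e^(−θx); a Python sketch:

```python
from math import exp

# Tire life ~ exponential with mean 40,000 km, i.e. rate theta = 1/40000
theta = 1 / 40000

p_at_least = exp(-theta * 20000)     # (i)  P(X >= 20000) = e^(-0.5)
p_at_most = 1 - exp(-theta * 30000)  # (ii) P(X <= 30000) = 1 - e^(-0.75)
print(round(p_at_least, 4), round(p_at_most, 4))  # 0.6065 0.5276
```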
Show that the exponential distribution "lacks memory",
i.e., if X has an exponential distribution, then for every
constant a ≥ 0, one has P(Y ≤ x | X ≥ a) = P(X ≤ x) for all
x, where Y = X − a.
• Beta Distribution of first kind
A random variable X is said to have a beta distribution of the first kind with
parameters m and n (m > 0, n > 0) if its p.d.f. is given by:
f(x) = x^(m−1) (1 − x)^(n−1) / B(m, n) , 0 < x < 1
It is referred to as β1(m, n).
• Cumulative distribution function of Beta first kind:
F(x) = 0 , x < 0
F(x) = Ix(m, n) = (1/B(m, n)) ∫[0 to x] t^(m−1) (1 − t)^(n−1) dt , 0 < x < 1, (m, n) > 0
F(x) = 1 , x > 1
Characteristics / Properties of Beta first kind
Mean = m/(m + n)
Variance = mn / [(m + n)² (m + n + 1)]
Coefficient of Skewness γ1 = 2(n − m)√(m + n + 1) / [(m + n + 2)√(mn)]
Coefficient of Kurtosis β2 = 3 + 6[(m − n)²(m + n + 1) − mn(m + n + 2)] / [mn(m + n + 2)(m + n + 3)]
Mode of the beta distribution of first kind depends on the values of m and n.
• If m < 1, x = 0 is the modal value
• If n < 1, x = 1 is the modal value
• If m < 1 and n < 1, the distribution is bimodal: one mode occurs at x = 0 and the other at x = 1.
• If m = 1, n = 1, then f(x) = 1 for 0 < x < 1. In this situation each x ∈ (0, 1) is a mode.
• If m = 1, n > 1, x = 0 is the mode
• If m > 1, n = 1, x = 1 is the mode
• If m > 1, n > 1, x = (m − 1)/(m + n − 2) is the mode
• Characteristic function of the beta distribution = the confluent hypergeometric function M(m; m + n; it)
• Harmonic mean of β1(m, n) = (m − 1)/(m + n − 1) , for m > 1
• In particular, if m = 1, n = 1, then f(x) = 1/B(1, 1) = 1, 0 < x < 1, which is the p.d.f.
of the uniform distribution on (0, 1)
• If X ~ β1(m, n) and Y ~ β1(p, q) are independent variates such that p + q = m, then the variate
XY is distributed as β1(p, n + q)
• Beta Distribution of Second Kind
A continuous random variate X with parameters m and n is said to follow the
beta distribution of the second kind if its probability density function is
f(x) = x^(m−1) / [B(m, n) (1 + x)^(m+n)] , m > 0, n > 0, 0 < x < ∞
0 , otherwise
Remark:
The beta distribution of the second kind (in x) is transformed to the beta
distribution of the first kind (in y) by the transformation:
1 + x = 1/y, i.e., y = 1/(1 + x)
• Characteristics / Properties of Beta second kind
Mean = m/(n − 1) , for n > 1
Variance = m(m + n − 1) / [(n − 1)²(n − 2)] , for n > 2
Harmonic Mean = (m − 1)/n , for m > 1
Application:
1. It is used in Bayesian Statistics/ Bayesian Inference
2. It is used in modelling of time to complete a task
3. It is used in modelling of defective items in a shipment.
4. Wavelet Analysis
• Suppose that the proportion of defective DVDs in a certain shipment follows a
beta distribution with m = 2 and n = 5. Compute the probability that the shipment
has between 20% and 30% defective DVDs.
• Tanya enters a raffle at the local fair, and is wondering what her
chances of winning. If her probability of winning can be modelled
by a beta distribution with m = 5 and n = 2, what is the probability
that she has at most a 10% chance of winning?
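Both beta exercises reduce to evaluating the beta CDF; since no special-function library is assumed here, the following Python sketch integrates the density numerically, with math.gamma supplying B(m, n):

```python
from math import gamma

def beta_pdf(x, m, n):
    # Beta-first-kind density x^(m-1) (1-x)^(n-1) / B(m, n)
    B = gamma(m) * gamma(n) / gamma(m + n)
    return x ** (m - 1) * (1 - x) ** (n - 1) / B

def beta_cdf(x, m, n, steps=10_000):
    # P(X <= x) by midpoint-rule numerical integration (sufficient for a sketch)
    h = x / steps
    return sum(beta_pdf((i + 0.5) * h, m, n) for i in range(steps)) * h

# DVD shipment, m=2, n=5: P(0.20 <= X <= 0.30)
print(round(beta_cdf(0.30, 2, 5) - beta_cdf(0.20, 2, 5), 4))
# Tanya, m=5, n=2: P(X <= 0.10)
print(round(beta_cdf(0.10, 5, 2), 6))
```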
• Cauchy Distribution:
It is also known as Lorentz distribution,
A random variable X is said to have a Cauchy distribution with parameters λ and μ if its p.d.f.
is given by:
f(x) = (λ/π) · 1 / [λ² + (x − μ)²] , −∞ < x < ∞
• Mean is undefined
• Variance is undefined
• Skewness is undefined
Definition:
If X is a random variable with mean μ and variance σ², then for any
positive number k, we have
P(|X − μ| ≥ kσ) ≤ 1/k²
P(|X − μ| < kσ) ≥ 1 − 1/k²
Since P(|X − μ| ≥ kσ) + P(|X − μ| < kσ) = 1, the two forms are equivalent.
For an arbitrary c > 0, the inequality reads
P(|X − μ| ≥ c) ≤ σ²/c²
P(|X − μ| < c) ≥ 1 − σ²/c²
Generalised form of the Bienaymé–Chebyshev inequality:
Let g(X) be a non-negative function of a random variable X. Then for every k > 0,
we have
P{g(X) ≥ k} ≤ E[g(X)]/k
• A r.v. X has mean μ = 12 and variance σ² = 9 and an unknown
probability distribution. Find P(6 < X < 18).
• A fair die is tossed 720 times. Use Chebyshev's inequality to find a lower bound
for the probability of getting 100 to 140 sixes.
• A discrete r.v. X takes the values −1, 0, 1 with probabilities 1/8, 3/4, 1/8
respectively. Evaluate P{|X − μ| ≥ 2σ} and compare it with the upper bound given
by Chebyshev's inequality.
• A r.v. X is exponentially distributed with parameter 1. Use Chebyshev's
inequality to show that P(−1 ≤ X ≤ 3) ≥ 3/4. Find the actual probability also.
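The first two exercises can be sketched in Python using the bound P(|X − μ| < kσ) ≥ 1 − 1/k²:

```python
from math import sqrt

def chebyshev_lower_bound(mu, var, lo, hi):
    # Lower bound on P(lo < X < hi) from P(|X - mu| < k*sigma) >= 1 - 1/k^2,
    # using the largest symmetric interval about mu contained in (lo, hi)
    sigma = sqrt(var)
    k = min(mu - lo, hi - mu) / sigma
    return 1 - 1 / k**2

# mu = 12, sigma^2 = 9: P(6 < X < 18) has k = 6/3 = 2, bound 1 - 1/4
print(chebyshev_lower_bound(12, 9, 6, 18))        # 0.75
# 720 tosses: number of sixes has mean 120 and variance 720*(1/6)*(5/6) = 100
print(chebyshev_lower_bound(120, 100, 100, 140))  # 0.75
```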
• Convergence in probability:
Definition:
A sequence of random variables X1, X2, X3, ... is said to converge in
probability to a constant "a" if, for any ε > 0,
lim(n→∞) P(|Xn − a| < ε) = 1
or, equivalently,
lim(n→∞) P(|Xn − a| ≥ ε) = 0
and we write
Xn →P a, as n → ∞
• Weak Law of Large Numbers (WLLN):
Statement:
Let X1, X2, X3, ..., Xn be a sequence of random variables and μ1, μ2, μ3, ..., μn be
their respective expectations, and let Bn = Var(X1 + X2 + ... + Xn) < ∞. Then
P( |(X1 + X2 + ... + Xn)/n − (μ1 + μ2 + ... + μn)/n| < ε ) ≥ 1 − η
for all n > n0, provided Bn/n² → 0 as n → ∞.
The weak law of large numbers can also be stated as follows:
(X1 + X2 + ... + Xn)/n − (μ1 + μ2 + ... + μn)/n →P 0 ,
provided Bn/n² → 0 as n → ∞.
• Strong Law of Large Numbers:
Statement:
Let Xi (i = 1, 2, 3, ..., n) be a sequence of independent and identically distributed
(iid) random variables with sample average X̄n and population mean
(expected value) μ. Then the sample average converges almost surely to the
expected value as the number of trials becomes infinitely large (n → ∞):
X̄n → μ almost surely as n → ∞
For example, a single roll of a six-sided die produces one of the
numbers 1, 2, 3, 4, 5, and 6, each with equal probability.
Therefore, the expected value of a single die roll is
(1 + 2 + 3 + 4 + 5 + 6)/6 = 3.5
According to the law of large numbers, if a large number of
six-sided dice are rolled, the average of their values (the sample
mean) is likely to be close to 3.5, with the precision increasing
as more dice are rolled.
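The die-rolling claim is easy to see in simulation; a Python sketch (the sample sizes are arbitrary choices):

```python
import random

random.seed(42)  # fixed seed so the run is reproducible

# Roll a fair die n times and watch the sample mean approach E(X) = 3.5
for n in (100, 10_000, 1_000_000):
    rolls = [random.randint(1, 6) for _ in range(n)]
    print(n, sum(rolls) / n)
```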
• For example, a fair coin toss is a Bernoulli trial. When a fair coin
is flipped once, the theoretical probability that the outcome
will be heads is equal to ½. Therefore, according to the law of
large numbers, the proportion of heads in a "large" number of
coin flips should be roughly ½. In particular, the proportion
of heads after n flips will almost surely converge to ½ as
n approaches infinity.
Remarks:
1. For the existence of the WLLN we assume the following conditions:
(i) E(Xi) exists for all i
(ii) Bn = Var(X1 + X2 + ... + Xn) exists
(iii) Bn/n² → 0 as n → ∞
Condition (i) is necessary; without it the law itself cannot be stated. Conditions
(ii) and (iii) are not necessary; (iii) is, however, a sufficient condition.
Central Limit Theorem (CLT):
This theorem was first stated by Laplace in 1812, and a rigorous proof
under fairly general conditions was given by Liapounoff in 1901.
The Central Limit Theorem (CLT) is a statistical theory which states that, given
a sufficiently large sample size from a population with a finite level of
variance, the mean of all samples from the same population will be
approximately equal to the mean of the population.
The Central Limit Theorem says that the sampling distribution of the sample means
approaches a normal distribution as the sample size gets larger, no matter what the
shape of the data distribution. An essential component of the Central Limit Theorem
is that the average of the sample means will be the population mean:
μx̄ = μ
The standard deviation of the sample means is equal to the standard deviation of the
population divided by the square root of the sample size:
σx̄ = σ/√n
The central limit theorem is applicable for sufficiently large sample sizes.
The CLT can also be stated as follows:
If X1, X2, ..., Xn are independent random variables such that E(Xi) = μi and
V(Xi) = σi², then under certain very general conditions the random variable
Sn = X1 + X2 + ... + Xn is asymptotically normal with mean μ and standard
deviation σ, where
μ = Σ μi and σ² = Σ σi²
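The stress-score study below can be simulated to see the CLT at work: stress ~ Uniform(1, 5) has μ = 3 and σ² = (5 − 1)²/12 = 4/3, so sample means with n = 75 should cluster around 3 with standard deviation √(4/3)/√75 ≈ 0.133. A Python sketch:

```python
import random

random.seed(0)

# Stress scores ~ Uniform(1, 5): mu = 3, sigma^2 = (5 - 1)^2 / 12 = 4/3
n, reps = 75, 20_000
means = [sum(random.uniform(1, 5) for _ in range(n)) / n for _ in range(reps)]

grand_mean = sum(means) / reps
sd_of_means = (sum((m - grand_mean) ** 2 for m in means) / reps) ** 0.5
print(round(grand_mean, 3))   # close to mu = 3
print(round(sd_of_means, 3))  # close to sigma/sqrt(n) ≈ 0.133
```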
A study involving stress is done on a college campus among the students. The
stress scores follow a uniform distribution with the lowest stress score equal to
1 and the highest equal to 5. Using a sample of 75 students.
THANK YOU