
Random Variables and Probability

Distributions
Speaker:
Dr. Venkateswara Rao,
NIT Warangal

Note: These slides were assembled by Dr. K. V. Rao, with grateful acknowledgement of the many others who made their course materials available online.
Concepts
• Random variables (discrete and continuous)
• Probability distributions over discrete/continuous r.v.'s
• Notions of joint, marginal, and conditional probability
distributions
• Properties of random variables (and of functions of
random variables)
• Expectation and variance/covariance of random
variables
• Examples of probability distributions
Probability and Statistics
Probability is the chance of an outcome in an experiment (also called event).

Event: Tossing a fair coin


Outcome: Head, Tail

Probability deals with predicting the likelihood of future events. Statistics involves the analysis of the frequency of past events.

Example: Consider there is a drawer containing 100 socks: 30 red, 20 blue and
50 black socks.
We can use probability to answer questions about the selection of a
random sample of these socks.
• Q1. What is the probability that we draw two blue socks or two red socks from the drawer?
• Q2. What is the probability that we pull out three socks and have a matching pair?
• Q3. What is the probability that we draw five socks and they are all black?

Statistics
Instead, if we have no knowledge about the type of socks in the drawers, then we enter into
the realm of statistics. Statistics helps us to infer properties about the population on the basis
of the random sample.

Questions that would be statistical in nature are:

• Q1: A random sample of 10 socks from the drawer produced one blue, four red, five black socks.
What is the total population of black, blue or red socks in the drawer?

• Q2: We randomly sample 10 socks, write down the number of black socks, and then return the socks to the drawer. The process is repeated five times. The mean number of black socks across these trials is 7. What is the true number of black socks in the drawer?
• etc.

Probability vs. Statistics
In other words:
• In probability, we are given a model and asked what kind of data we are likely to see.

• In statistics, we are given data and asked what kind of model is likely to have generated
it.

Examples :
• You have a fair coin (equal probability of heads or tails). You will toss it 100 times.
What is the probability of 60 or more heads? There is a single correct answer,
obtained by a standard computation.
• You have a coin of unknown provenance. To investigate whether it is fair you toss it
100 times and count the number of heads. Let’s say you count 60 heads. Your job as
a statistician is to draw a conclusion (inference) from this data.

Random Variable
• Probability is often associated with at least one event.
– Rolling a die or pulling a color ball out of a bag.
• In these examples the outcome of the event is random
• So the variable that represents the outcome of these
events is called a random variable.

• A random variable is a variable that assumes numerical values associated with the random outcomes of an experiment, where one (and only one) numerical value is assigned to each sample point.
Random Variable
• Informally, a random variable (r.v.) X denotes possible
outcomes of an event
• Can be discrete (i.e., countably many possible outcomes) or
continuous
Discrete Random Variable
• If I pick any two consecutive outcomes, I can’t get an
outcome that’s in between.
– For example, if we consider 1 and 2 as outcomes of rolling a
six-sided die, then I can’t have an outcome in between (I
can’t have an outcome of 1.5).
– In mathematics, we would say that the list of outcomes is
countable

Experiment Random Variable Possible Values

Make 100 Sales Calls # Sales 0, 1, 2, ..., 100

Inspect 70 Radios # Defective 0, 1, 2, ..., 70

Answer 33 Questions # Correct 0, 1, 2, ..., 33


Continuous Random Variable
• Random variables that can assume values corresponding
to any of the points contained in one or more intervals
(i.e., values that are infinite and uncountable) are called
continuous.

Experiment                      Random Variable     Possible Values

Weigh 100 People                Weight              45.1, 78, ...
Measure time taken              Hours               900, 875.9, ...
Amount spent on food            $ amount            54.12, 42, ...
Measure Time Between Arrivals   Inter-Arrival Time  0, 1.3, 2.78, ...
Probability
• We are often interested in knowing the probability of a
random variable taking on a certain value.
– What is the probability that when I roll a fair 6-sided die it
lands on a 3? P(X = 3).
– What is the probability that the weight is less than or equal to 70 kg?
P(X ≤ 70).
Types of probability
• Can either be marginal, joint or conditional.

• Marginal Probability: If A is an event, then the marginal probability is the
probability of that event occurring, P(A).
– Example: Assuming that we have a pack of traditional playing cards, an example of
a marginal probability would be the probability that a card drawn from a pack is
red: P(red) = 0.5.

• Joint Probability: The probability of the intersection of two or more
events. If A and B are two events then the joint probability of the two
events is written as P(A ∩ B). Or, P(X=x, Y=y), where X and Y are random
variables.
– Example: the probability that a card drawn from a pack is red and has the value 4 is
P(red and 4) = 2/52 = 1/26.

• Conditional Probability: The conditional probability is the probability that
some event(s) occur given that we know other events have already
occurred. If A and B are two events then the conditional probability of A
occurring given that B has occurred is written as P(A|B). Or, P(X=x|Y=y).
– Example: the probability that a card is a four given that we have drawn a red card is
P(4|red) = 2/26 = 1/13.
Example: Discrete case
• In this experiment, we toss a coin (first event) and roll
a die (second event).
– We look at the probability of each event. For instance, the
probability of getting a ‘head’ is 1/2. The probability of rolling
a 1 is 1/6.
• The following figure shows the probabilities of each
outcome for each event individually:

[Figure: grid of the 12 coin-and-die outcomes, showing the joint probability of each cell (1/12) and the marginal probabilities of each event]
Example: Discrete case

• P(Heads) = ?
• = P(Heads, die = 1) + P(Heads, die = 2) + … + P(Heads, die = 6).
• = 1/12 + 1/12 + … + 1/12 = 1/2
• Formally, P(X = x) = Σ_y P(X = x, Y = y)
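The marginalization above can be checked in a few lines of Python; a minimal sketch that builds the joint table for a fair coin and a fair die directly:

```python
from itertools import product

# Joint distribution of a fair coin (X) and a fair die (Y):
# every (x, y) pair has probability 1/2 * 1/6 = 1/12.
joint = {(x, y): 1 / 12 for x, y in product(["H", "T"], range(1, 7))}

# Marginal: P(X = H) = sum over y of P(X = H, Y = y)
p_heads = sum(p for (x, y), p in joint.items() if x == "H")
print(round(p_heads, 3))  # 0.5
```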
Conditional Probability
• Conditional probability refers to the probability of an
event given that another event occurred.

• First, it is important to distinguish between dependent


and independent events! The intuition is a bit different in
both cases.
– Example of independent events: dice and coin
– Example of dependent events: two cards from a deck

• notation: P(Y=y | X=x) Probability that Y=y given that


X=x.

• Mathematically, there is a convenient relationship


between conditional probabilities and joint probability.
P(Y = y | X = x) = P(Y = y, X = x) / P(X = x)
Conditional Probability
• It may be more intuitive to look at it in another direction:
P(X = x, Y = y) = P(X = x) · P(Y = y | X = x)
• To calculate the probability that both events occur, we
have to take the probability that the first event occurs,
P(X = x), and multiply it by the probability that the
second event occurs given that the first event occurred,
P(Y = y | X = x).
• P(X = x, Y = y) = P(Y = y, X = x)
• P(X = x) · P(Y = y | X = x) = P(Y = y) · P(X = x | Y = y)
Bayes’ theorem

P(x | y) = P(y | x) P(x) / P(y)
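Bayes’ theorem can be checked numerically on the card example from the earlier slides; a sketch using only the standard deck facts:

```python
# Deck facts: P(red) = 26/52, P(4 | red) = 2/26, P(4) = 4/52.
p_red = 26 / 52
p_4_given_red = 2 / 26
p_4 = 4 / 52

# Bayes' theorem: P(red | 4) = P(4 | red) * P(red) / P(4)
p_red_given_4 = p_4_given_red * p_red / p_4
print(round(p_red_given_4, 3))  # 0.5: half of the four 4s are red
```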
Independent Events
• When two events x and y are independent
P(x|y) = p(x)

• Therefore P(x,y) = P(x). P(y).


– For instance, P(x=3, y=heads) is the probability of rolling a 3 on
a die and getting ‘heads’ on a coin.
– For this example, we know that P(x=3, y=heads) = 1/12.
– P(y=heads | x=3) = P(x=3, y=heads)/P(x=3) = (1/12)/(1/6) = 0.5
– We can see that P(y=heads | x=3) = P(heads) = 0.5
Dependent Events
• Example: we draw two cards without replacement. The
first step is to use what we learned and write the
problem using mathematical notation. We’ll call X the
variable corresponding to the first draw and Y the
variable corresponding to the second draw.
• P(Y=6 | X=6) = 3/51, since after the first 6 is drawn, three 6s remain among 51 cards.
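A brute-force sketch (with a hypothetical card encoding) that enumerates all ordered two-card draws without replacement and recovers P(Y=6 | X=6) from the definition of conditional probability:

```python
from itertools import permutations

# Build a 52-card deck as (rank, suit) pairs.
ranks = list(range(1, 14))
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = [(r, s) for r in ranks for s in suits]

# All ordered (X, Y) draws without replacement.
draws = list(permutations(deck, 2))
first_is_6 = [d for d in draws if d[0][0] == 6]
both_6 = [d for d in first_is_6 if d[1][0] == 6]

# P(Y=6 | X=6) = P(X=6, Y=6) / P(X=6)
p = len(both_6) / len(first_is_6)
print(round(p, 4))  # 0.0588, i.e., 3/51
```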
Probability Distribution
• A probability distribution is a list of all of the possible
outcomes of a random variable along with their
corresponding probability values.

• This is an example of a discrete univariate probability


distribution with finite support.
• Discrete ?
• Univariate ?
• Finite support ?
Axioms
• For a discrete r.v. X, p(x) denotes the probability that
X = x, i.e., p(x) = P(X = x)
• p(x) is called the probability mass function (PMF)

1. 0 ≤ p(x) ≤ 1 for all values of x
2. Σ p(x) = 1, where the summation of p(x) is over all
possible values of x.
Discrete Probability Distribution Example

Experiment: Toss 2 coins. Count number of tails.


Probability Distribution
Values, x   Probabilities, p(x)
0           1/4 = .25
1           2/4 = .50
2           1/4 = .25
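This distribution can be derived by enumerating the four equally likely outcomes; a minimal sketch:

```python
from itertools import product
from collections import Counter

# Enumerate the 4 equally likely outcomes of tossing 2 fair coins.
outcomes = list(product(["H", "T"], repeat=2))
tail_counts = Counter(o.count("T") for o in outcomes)

# PMF: p(x) = (# outcomes with x tails) / (total # outcomes)
pmf = {x: c / len(outcomes) for x, c in sorted(tail_counts.items())}
print(pmf)  # {0: 0.25, 1: 0.5, 2: 0.25}
```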
Visualizing Discrete Probability Distributions

Listing: { (0, .25), (1, .50), (2, .25) }

Table:
x (# Tails)   Count   p(x)
0             1       .25
1             2       .50
2             1       .25

Graph: [bar chart of p(x) vs. x, with bars .25, .50, .25 at x = 0, 1, 2]

Formula:
p(x) = n! / (x!(n − x)!) · p^x (1 − p)^(n − x)
Continuous probability distribution
• For a continuous r.v. X, the probability P(X = x) of any single value is zero.
• Instead we use f(x) to denote the probability density; probabilities are
expressed as integrals between two points, P(a ≤ X ≤ b).
• For a continuous r.v. X, we can only talk about probability within
an interval X ∈ (x, x + δx).
• f(x)δx is the probability that X ∈ (x, x + δx)
as δx → 0.

• Example: Metal Cylinder Production


– Suppose that the random variable x is the diameter of a randomly chosen
cylinder manufactured by the company. Since this random variable can take
any value between 49.5 and 50.5, it is a continuous random variable.
Continuous probability distribution

• The probability density f(x) satisfies the following
– f(x) ≥ 0, and
– ∫ f(x) dx = 1
• Suppose that the diameter of a metal cylinder has the p.d.f.

f(x) = 1.5 − 6(x − 50.0)²  for 49.5 ≤ x ≤ 50.5
f(x) = 0, elsewhere
Continuous probability distribution
• This is a valid p.d.f.:

∫_{49.5}^{50.5} (1.5 − 6(x − 50.0)²) dx = [1.5x − 2(x − 50.0)³]_{49.5}^{50.5}

= [1.5 × 50.5 − 2(50.5 − 50.0)³] − [1.5 × 49.5 − 2(49.5 − 50.0)³]

= 75.5 − 74.5 = 1.0
Continuous probability distribution
• The probability that a metal cylinder has a diameter
between 49.8 and 50.1 mm can be calculated to be

∫_{49.8}^{50.1} (1.5 − 6(x − 50.0)²) dx = [1.5x − 2(x − 50.0)³]_{49.8}^{50.1}

= [1.5 × 50.1 − 2(50.1 − 50.0)³] − [1.5 × 49.8 − 2(49.8 − 50.0)³]

= 75.148 − 74.716 = 0.432
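Both integrals can be verified numerically; a sketch using a simple midpoint rule, with no external libraries assumed:

```python
# f(x) = 1.5 - 6(x - 50.0)^2 on [49.5, 50.5], the cylinder-diameter pdf.
def f(x):
    return 1.5 - 6 * (x - 50.0) ** 2

def midpoint_integral(g, a, b, n=100_000):
    # Approximate the integral of g on [a, b] by summing midpoint rectangles.
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

total = midpoint_integral(f, 49.5, 50.5)   # should be 1 (valid p.d.f.)
prob = midpoint_integral(f, 49.8, 50.1)    # P(49.8 <= X <= 50.1)
print(round(total, 6), round(prob, 6))  # 1.0 0.432
```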
Continuous probability distribution
• Cumulative Distribution Function

F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(y) dy

f(x) = dF(x)/dx

P(a < X ≤ b) = P(X ≤ b) − P(X ≤ a) = F(b) − F(a)

P(a ≤ X ≤ b) = P(a < X < b)  (endpoints carry zero probability for a continuous r.v.)
Continuous probability distribution
• Cumulative Distribution Function
• Example

F(x) = P(X ≤ x) = ∫_{49.5}^{x} (1.5 − 6(y − 50.0)²) dy

= [1.5y − 2(y − 50.0)³]_{49.5}^{x}

= [1.5x − 2(x − 50.0)³] − [1.5 × 49.5 − 2(49.5 − 50.0)³]

= 1.5x − 2(x − 50.0)³ − 74.5

P(49.7 ≤ X ≤ 50.0) = F(50.0) − F(49.7)
= (1.5 × 50.0 − 2(50.0 − 50.0)³ − 74.5)
− (1.5 × 49.7 − 2(49.7 − 50.0)³ − 74.5)
= 0.5 − 0.104 = 0.396
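The closed-form CDF just derived can be evaluated directly; a minimal sketch:

```python
# CDF derived on the slide: F(x) = 1.5x - 2(x - 50.0)^3 - 74.5 on [49.5, 50.5]
def F(x):
    return 1.5 * x - 2 * (x - 50.0) ** 3 - 74.5

# P(49.7 <= X <= 50.0) = F(50.0) - F(49.7)
prob = F(50.0) - F(49.7)
print(round(prob, 3))  # 0.396
```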
Continuous probability distribution
• Cumulative Distribution Function

[Figure: plot of F(x) on 49.5 ≤ x ≤ 50.5, showing P(X ≤ 49.7) = 0.104, P(X ≤ 50.0) = 0.5, and P(49.7 ≤ X ≤ 50.0) = 0.396]

Marginal probability distribution
• Intuitively, the probability of one r.v. regardless of the
value that other r.v. takes.
• For discrete r.v.’s: p(x) = Σ_y p(x, y)

• For continuous r.v.’s: f(x) = ∫ f(x, y) dy
Conditional probability distribution
• Probability distribution of one r.v. given the value of
other r.v.

• Discrete case: p(y | x) = p(x, y) / p(x)

• Continuous case: f(y | x) = f(x, y) / f(x)
The Classification Problem
(informal definition)

Given a collection of annotated data (in this case, 5 instances of Katydids and 5 of Grasshoppers), decide what type of insect the unlabeled example is.

Katydid or Grasshopper?
Slides credit (for this example):
Dr Eamonn Keogh University of California - Riverside
For any domain of interest, we can measure features:

Color {Green, Brown, Gray, Other}, Has Wings?, Abdomen Length, Thorax Length, Antennae Length, Mandible Size, Spiracle Diameter, Leg Length
My_Collection
We can store features in a database:

Insect ID   Abdomen Length   Antennae Length   Insect Class
1           2.7              5.5               Grasshopper
2           8.0              9.1               Katydid
3           0.9              4.7               Grasshopper
4           1.1              3.1               Grasshopper
5           5.4              8.5               Katydid
6           2.9              1.9               Grasshopper
7           6.1              6.6               Katydid
8           0.5              1.0               Grasshopper
9           8.3              6.6               Katydid
10          8.1              4.7               Katydid
11          5.1              7.0               ??????? (previously unseen instance)

The classification problem can now be expressed as:
• Given a training database (My_Collection), predict the class
label of a previously unseen instance


[Scatter plot: Antenna Length vs. Abdomen Length (each 1–10), with Grasshoppers and Katydids forming separate clusters]
With a lot of data, we can build a histogram. Let
us just build one for “Antenna Length” for now…

[Histograms of Antenna Length (1–10) for Katydids and Grasshoppers]
We can leave the
histograms as they are,
or we can summarize
them with two normal
distributions.

Let us use two normal
distributions for ease
of visualization in the
following slides…
• We want to classify an insect we have found. Its antennae are 3 units long.
How can we classify it?

• We can just ask ourselves: given the distributions of antennae lengths we have
seen, is it more probable that our insect is a Grasshopper or a Katydid?
• There is a formal way to discuss the most probable classification…
p(cj | d) = probability of class cj, given that we have observed d

Antennae length is 3

P(Grasshopper | 3) = 10 / (10 + 2) = 0.833
P(Katydid | 3) = 2 / (10 + 2) = 0.167
Antennae length is 7

P(Grasshopper | 7) = 3 / (3 + 9) = 0.250
P(Katydid | 7) = 9 / (3 + 9) = 0.750
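The posterior calculations above can be reproduced from the histogram counts read off the slides’ figures; a minimal sketch:

```python
# Class counts read off the slides' histograms (assumed values):
# at antenna length 3: 10 grasshoppers vs. 2 katydids;
# at antenna length 7: 3 grasshoppers vs. 9 katydids.
counts = {3: {"Grasshopper": 10, "Katydid": 2},
          7: {"Grasshopper": 3, "Katydid": 9}}

def posterior(length, cls):
    # P(class | length) = count of class at that length / total at that length
    total = sum(counts[length].values())
    return counts[length][cls] / total

print(round(posterior(3, "Grasshopper"), 3))  # 0.833
print(round(posterior(7, "Katydid"), 3))      # 0.75
```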
Summary of basic rules
Mean or Expectation
• Example: Tossing a single unfair die
– For fun, imagine a weighted die (cheating!) with the following
probability values (implied by the arithmetic below):
p(1) = p(2) = p(3) = p(4) = p(5) = 0.1 and p(6) = 0.5.

• Mean or Expected Value: μ or E(x)
– When we know the probability p(x) of every value x we can
calculate the Expected Value (Mean) of X:
– E(x) = μ = Σ x·p(x) = 0.1 + 0.2 + 0.3 + 0.4 + 0.5 + 3.0 = 4.5
– The expected value is 4.5.
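A quick check of the weighted-die expectation, assuming the probabilities implied by the sum above:

```python
# Weighted die implied by the slide's arithmetic (assumed values):
# p(1)..p(5) = 0.1 each, p(6) = 0.5.
pmf = {1: 0.1, 2: 0.1, 3: 0.1, 4: 0.1, 5: 0.1, 6: 0.5}

# E(X) = sum of x * p(x) over all values x
mean = sum(x * p for x, p in pmf.items())
print(round(mean, 2))  # 4.5
```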
• Expectation of a continuous random variable with p.d.f. f(x):

E(X) = ∫_{state space} x f(x) dx
Expectations of Continuous Random Variables

• Example (continuous random variable)
– The expected diameter of a metal cylinder is

E(X) = ∫_{49.5}^{50.5} x(1.5 − 6(x − 50.0)²) dx

– Change of variable: y = x − 50

E(X) = ∫_{−0.5}^{0.5} (y + 50)(1.5 − 6y²) dy

= ∫_{−0.5}^{0.5} (−6y³ − 300y² + 1.5y + 75) dy

= [−3y⁴/2 − 100y³ + 0.75y² + 75y]_{−0.5}^{0.5}

= [25.09375] − [−24.90625] = 50.0
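The change-of-variable result can be verified numerically with a midpoint-rule approximation; a sketch using only the standard library:

```python
# Midpoint-rule check that E(X) = ∫ x f(x) dx equals 50.0 for the cylinder pdf.
def f(x):
    return 1.5 - 6 * (x - 50.0) ** 2

a, b, n = 49.5, 50.5, 100_000
h = (b - a) / n
mids = (a + (i + 0.5) * h for i in range(n))  # midpoints of each subinterval
ex = sum(x * f(x) for x in mids) * h
print(round(ex, 4))  # 50.0
```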


The variance of a Random Variable

• Variance (σ²)
– A positive quantity that measures the spread of the distribution
of the random variable about its mean value
– Larger values of the variance indicate that the distribution is
more spread out
– Definition: Var(X) = E((X − E(X))²) = E(X²) − (E(X))²

• Standard Deviation
– The positive square root of the variance
– Denoted by σ
The variance of a Random Variable

Var(X) = E((X − E(X))²)
= E(X² − 2XE(X) + (E(X))²)
= E(X²) − 2E(X)E(X) + (E(X))²
= E(X²) − (E(X))²
= Σ x²p(x) − μ²  (discrete case)

[Figure: two distributions with identical mean values but different variances]
Covariance

Cov(X, Y) = E((X − E(X))(Y − E(Y)))
= E(XY − XE(Y) − E(X)Y + E(X)E(Y))
= E(XY) − E(X)E(Y) − E(X)E(Y) + E(X)E(Y)
= E(XY) − E(X)E(Y)

– May take any positive or negative value.
– Independent random variables have a covariance of zero
– What if the covariance is zero? (The converse does not hold:
zero covariance does not imply independence.)
Covariance
• Example: Air Conditioner servicing company

E(X) = 2.59, E(Y) = 1.79

E(XY) = Σ_{i=1}^{4} Σ_{j=1}^{3} i·j·p_ij

= (1 × 1 × 0.12) + (1 × 2 × 0.08)
+ ⋯ + (4 × 3 × 0.07) = 4.86

Cov(X, Y) = E(XY) − E(X)E(Y)
= 4.86 − (2.59 × 1.79) = 0.224
Correlation
• When two sets of data are strongly linked together we
say they have a High Correlation.
– Correlation is Positive when the values increase together, and
– Correlation is Negative when one value decreases as the other
increases

– The value shows how good the correlation is (not how steep
the line is), and if it is positive or negative.
Correlation
• Example: Ice Cream Sales

• We can easily see that warmer weather and higher sales go
together. The relationship is good but not perfect.
• The correlation is 0.9575; we will see how to calculate it.
• Correlation is not causation.
Correlation

• Correlation:

Corr(X, Y) = Cov(X, Y) / √(Var(X) Var(Y))

– Values between −1 and 1; independent random
variables have a correlation of zero
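The definition can be applied directly to paired data; a sketch using made-up temperature/sales pairs (the numbers here are illustrative, not the slide's actual ice cream data):

```python
# Hypothetical paired data: temperature (°C) vs. ice cream sales ($).
xs = [14.2, 16.4, 11.9, 15.2, 18.5, 22.1, 19.4]
ys = [215, 325, 185, 332, 406, 522, 412]

n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n

# Sample (population-style) covariance and variances from the definitions.
cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
vx = sum((x - mx) ** 2 for x in xs) / n
vy = sum((y - my) ** 2 for y in ys) / n

# Corr(X, Y) = Cov(X, Y) / sqrt(Var(X) Var(Y))
corr = cov / (vx * vy) ** 0.5
print(round(corr, 3))  # strongly positive, close to 1
```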
Correlation: Ice cream example
Common Probability Distributions
• Important: We will use these extensively to model data as well as
parameters
• Some discrete distributions and what they can model:
– Bernoulli: Binary numbers, e.g., outcome (head/tail, 0/1) of a coin toss
– Binomial: Bounded non-negative integers, e.g., # of heads in n coin tosses
– Multinomial: One of K (>2) possibilities, e.g., outcome of a dice roll
– Poisson: Non-negative integers, e.g., # of words in a document
– .. and many others
• Some continuous distributions and what they can model:
– Uniform: numbers defined over a fixed range
– Beta: numbers between 0 and 1, e.g., probability of head for a biased coin
– Gamma: Positive unbounded real numbers
– Dirichlet: vectors that sum to 1 (e.g., fractions of data points in different clusters)
– Gaussian: real-valued numbers or real-valued vectors
– .. and many others
Binomial Probability Distribution
• A fixed number of observations (trials), n
– e.g., 15 tosses of a coin; 20 patients; 1000 people surveyed
• A binary random variable
– e.g., head or tail in each toss of a coin; defective or not
defective light bulb
– Generally called “success” and “failure”
– Probability of success is p, probability of failure is 1 – p
• Constant probability for each observation
– e.g., Probability of getting a tail is the same each time we toss
the coin
Binomial example
Take the example of 5 coin tosses. What’s the
probability that you flip exactly 3 heads in 5 coin
tosses?
Outcome   Probability
THHHT     (1/2)³ × (1/2)²
HHHTT     (1/2)³ × (1/2)²
TTHHH     (1/2)³ × (1/2)²
HTTHH     (1/2)³ × (1/2)²
HHTTH     (1/2)³ × (1/2)²
THTHH     (1/2)³ × (1/2)²
HTHTH     (1/2)³ × (1/2)²
HHTHT     (1/2)³ × (1/2)²
THHTH     (1/2)³ × (1/2)²
HTHHT     (1/2)³ × (1/2)²

There are 5C3 = 5!/(3!2!) = 10 ways to arrange 3 heads in 5 trials,
and each unique outcome has the same probability.

10 arrangements × (1/2)³ × (1/2)²

P(3 heads and 2 tails) = C(5,3) × P(heads)³ × P(tails)² = 10 × (½)⁵ = 31.25%
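The same answer in a couple of lines with Python's math.comb:

```python
from math import comb

# C(5,3) ways to arrange 3 heads; each 5-toss sequence has probability (1/2)^5.
p = comb(5, 3) * 0.5**3 * 0.5**2
print(p)  # 0.3125, i.e., 31.25%
```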
Binomial distribution function:
X = the number of heads tossed in 5 coin tosses

[Figure: bar chart of p(x) for x = 0, 1, 2, 3, 4, 5 heads]
Definitions: Bernoulli
Bernoulli trial: If there is only 1 trial with probability of
success p and probability of failure 1 − p, this is called a
Bernoulli distribution. (Special case of the binomial with
n = 1.)

Probability of success: P(X = 1) = C(1,1) p¹(1 − p)^{1−1} = p

Probability of failure: P(X = 0) = C(1,0) p⁰(1 − p)^{1−0} = 1 − p
Multinomial distribution
The multinomial is a generalization of the binomial. It is used
when there are more than 2 possible outcomes (for ordinal or
nominal, rather than binary, random variables).
– Instead of partitioning n trials into 2 outcomes (yes with
probability p / no with probability 1-p), you are
partitioning n trials into 3 or more outcomes (with
probabilities: p1, p2, p3,..)
• General formula for 3 outcomes:

P(D = x, R = y, G = z) = n! / (x! y! z!) · p_D^x p_R^y (1 − p_D − p_R)^z
Multinomial example
Specific Example: if you are randomly choosing 8 people from an
audience that contains 50% democrats, 30% republicans, and 20%
green party, what’s the probability of choosing exactly 4 democrats, 3
republicans, and 1 green party member?

P(D = 4, R = 3, G = 1) = 8! / (4! 3! 1!) × (.5)⁴ (.3)³ (.2)¹

You can see that it gets hard to calculate very fast! The
multinomial has many uses in genetics where a person
may have 1 of many possible alleles (that occur with
certain probabilities in a given population) at a gene
locus.
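The multinomial probability above is straightforward to compute; a minimal sketch:

```python
from math import factorial

# Multinomial coefficient 8!/(4! 3! 1!) times the outcome probabilities.
coef = factorial(8) // (factorial(4) * factorial(3) * factorial(1))
p = coef * 0.5**4 * 0.3**3 * 0.2**1
print(coef)         # 280
print(round(p, 4))  # 0.0945
```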
Poisson Distribution

1. Poisson distribution is for counts—if events happen at a


constant rate over time, the Poisson distribution gives the
probability of X number of events occurring in time T.

2. Number of events that occur in an interval


• events per unit
— Time, Length, Area, Space

3. Examples
• Number of customers arriving in 20 minutes
• Number of strikes per year in India
• Number of defects per lot (group) of DVDs
Poisson Probability Distribution Function

p(x) = λ^x e^{−λ} / x!   (x = 0, 1, 2, 3, . . .)

p(x) = Probability of x given λ
λ = Mean (expected) number of events per unit
e = 2.71828 . . . (base of natural logarithm)
x = Number of events per unit
Poisson Distribution Example

Customers arrive at a rate of 72 per hour. What is the


probability of 4 customers arriving in 3 minutes?

Poisson Distribution Solution

72 per hr. = 1.2 per min. = 3.6 per 3-min. interval

p(x) = λ^x e^{−λ} / x!

p(4) = (3.6)⁴ e^{−3.6} / 4! = .1912
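The same calculation in Python (a sketch):

```python
from math import exp, factorial

lam = (72 / 60) * 3                     # 72/hour -> 3.6 per 3-minute interval
p4 = lam**4 * exp(-lam) / factorial(4)  # Poisson pmf at x = 4
print(round(p4, 4))  # 0.1912
```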
Uniform Probability Distribution


•Consider the random variable x representing the flight time of an


airplane traveling from Chicago to NY.
•Under normal conditions, flight time is between 120 and 140
minutes.
•Because flight time can be any value between 120 and 140
minutes, x is a continuous variable.
Uniform Probability Distribution
With every one-minute interval being equally likely, the
random variable x is said to have a uniform probability
distribution:

f(x) = 1/20  for 120 ≤ x ≤ 140
f(x) = 0  elsewhere
Uniform Probability Distribution

 1 for a  x  b

f ( x) =  b − a
0 elsewhere

For the flight-time


random variable,
a = 120 and b = 140
Uniform Probability Density Function for Flight Time

[Figure: f(x) = 1/20 over 120 ≤ x ≤ 140; the shaded area indicates the probability the flight will arrive in the interval between 120 and 140 minutes]
Probability as an Area
Question: What is the probability that arrival time will
be between 120 and 130 minutes—that is:

P(120  x  130) Remember when we


multiply a line
segment times a line
f (x) segment, we get an
area
P(120  x  130) = Area = 1 / 20(10) = 10 / 20 = .50

1
20

10

120 125 130 135 140 x
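The rectangle-area calculation as code (a sketch):

```python
# Uniform(120, 140) flight-time model: probability = density * interval width.
a, b = 120.0, 140.0
density = 1 / (b - a)        # f(x) = 1/20 on [a, b]
p = density * (130 - 120)    # height times width of the shaded rectangle
print(p)  # 0.5
```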


Normal Probability Distribution
• The normal distribution is by far the most important
distribution for continuous random variables. It is
widely used for making statistical inferences in both
the natural and social sciences.

• It has been used in a wide variety of applications:

– Heights of people
– Test scores
– Scientific measurements
– Amounts of rainfall
The Normal Distribution

1 − ( x −  ) 2 / 2 2
f ( x) = e
 2

Where:
μ is the mean
σ is the standard deviation
 = 3.1459
e = 2.71828
Normal Probability Distribution

Characteristics

The distribution is symmetric and bell-shaped.
Normal Probability Distribution

Characteristics

The entire family of normal probability
distributions is defined by its mean μ and its
standard deviation σ.

[Figure: bell curve with mean μ and standard deviation σ marked]
Normal Probability Distribution

Characteristics

The highest point on the normal curve is at the
mean, which is also the median and mode.
Normal Probability Distribution

Characteristics

The mean can be any numerical value: negative,
zero, or positive.

[Figure: three normal curves centered at −10, 0, and 20]
Normal Probability Distribution

Characteristics

The standard deviation determines the width of the
curve: larger values result in wider, flatter curves.

[Figure: two normal curves with σ = 15 and σ = 25]
Normal Probability Distribution

Characteristics
Probabilities for the normal random variable are
given by areas under the curve. The total area
under the curve is 1 (.5 to the left of the mean and
.5 to the right).
The Standard Normal Distribution
• The Standard Normal Distribution is a normal distribution with the special
properties that its mean is zero and its standard deviation is one: μ = 0, σ = 1.

The letter z is used to designate the standard
normal random variable.

[Figure: standard normal curve centered at 0 with σ = 1]
Cumulative Probability
Probability that z ≤ 1 is the area under the curve to
the left of 1: P(z ≤ 1).

[Figure: standard normal curve with the area to the left of z = 1 shaded]
What is P(z ≤ 1)?

To find out, use the Cumulative Probabilities Table for
the Standard Normal Distribution:

z      .00     .01     .02
.9     .8159   .8186   .8212
1.0    .8413   .8438   .8461
1.1    .8643   .8665   .8686
1.2    .8849   .8869   .8888

P(z ≤ 1) = .8413

Exercise 1

a) What is P(z ≤ 2.46)?    Answer: a) .9931
b) What is P(z ≥ 2.46)?            b) 1 − .9931 = .0069
Exercise 2

a) What is P(z ≤ −1.29)?    Answer: a) 1 − .9015 = .0985
b) What is P(z ≥ −1.29)?            b) .9015

Note that P(z ≤ −1.29) = 1 − P(z ≤ 1.29).

[Figure: the red-shaded area to the left of −1.29 equals the green-shaded area to the right of 1.29]

Note that, because of the symmetry, the area to the left of −1.29 is the
same as the area to the right of 1.29.
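Instead of the printed table, the standard normal CDF is available in Python's standard library; a sketch reproducing the table value and both exercises:

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1
print(round(z.cdf(1.00), 4))       # 0.8413  (table lookup for z = 1.00)
print(round(z.cdf(2.46), 4))       # 0.9931  (Exercise 1a)
print(round(1 - z.cdf(-1.29), 4))  # 0.9015  (Exercise 2b)
```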
References
• Michael J. Evans and Jeffrey S. Rosenthal (University of Toronto),
Probability and Statistics: The Science of Uncertainty, Second Edition.
HAPPY LEARNING
Thank you !
