Chapter 6

Uploaded by

h.hazaresen

MATH 264

Statistics for
Social Sciences

Chapter 6

Discrete Random Variables and Probability Distributions
Definition

A random variable is a variable that takes on numerical values determined by the outcome of a random experiment.
Introduction to Probability Distributions

• Random Variable
  • Represents a possible numerical value from a random experiment

Random Variables
• Discrete Random Variable (Ch. 6)
• Continuous Random Variable (Ch. 7)
Discrete Random Variables
• Can only take on a countable number of values
Examples:

• Roll a die twice.
  Let X be the number of times 4 comes up
  (then X could be 0, 1, or 2 times)

• Toss a coin 5 times.
  Let X be the number of heads
  (then X = 0, 1, 2, 3, 4, or 5)
Definition

A probability distribution function, P(x), of a discrete random variable X expresses the probability that X takes the value x, as a function of x.

That is,

P(x) = P(X = x) for all values of x


Discrete Probability Distribution
Experiment: Toss 2 Coins. Let X = # heads.
Show P(x) , i.e., P(X = x) , for all values of x:

4 possible outcomes: TT, TH, HT, HH

Probability Distribution
x Value    Probability
0          1/4 = .25
1          2/4 = .50
2          1/4 = .25

[Figure: bar chart of Probability against x = 0, 1, 2, with bar heights .25, .50, .25]
Probability Distribution
Required Properties

• $P(x) \ge 0$ for any value of x

• The individual probabilities sum to 1:

  $\sum_x P(x) = 1$

(The notation indicates summation over all possible x values)
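The two-coin distribution and its required properties can be checked by brute-force enumeration. The following is a minimal sketch, assuming a fair coin so that all four outcomes are equally likely; the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Enumerate the 4 equally likely outcomes of tossing 2 fair coins.
outcomes = list(product("HT", repeat=2))

# X = number of heads; each outcome contributes probability 1/4.
pmf = {}
for outcome in outcomes:
    x = outcome.count("H")
    pmf[x] = pmf.get(x, Fraction(0)) + Fraction(1, len(outcomes))

# Check the two required properties of a probability distribution.
all_nonnegative = all(p >= 0 for p in pmf.values())
total = sum(pmf.values())

print(dict(sorted(pmf.items())))
print(all_nonnegative, total)
```

Exact fractions are used here so that the sum-to-one check is not affected by floating-point rounding.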


Cumulative Probability Function

• The cumulative probability function, denoted $F(x_0)$, shows the probability that X is less than or equal to $x_0$:

  $F(x_0) = P(X \le x_0)$

• In other words,

  $F(x_0) = \sum_{x \le x_0} P(x)$
Properties of $F(x_0)$

• $0 \le F(x_0) \le 1$ for every number $x_0$

• If $x_0$ and $x_1$ are two numbers with $x_0 < x_1$, then

  $F(x_0) \le F(x_1)$
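As an illustration, the cumulative probability function for the two-coin example (X = number of heads) can be computed as a running total of P(x). This is a sketch only; the helper name `F` is chosen here to mirror the notation and is not part of the course material.

```python
from itertools import accumulate

# P(x) for X = number of heads in 2 coin tosses.
xs = [0, 1, 2]
probs = [0.25, 0.50, 0.25]

# F(x0) at each possible value is a running total of P(x).
cdf = dict(zip(xs, accumulate(probs)))

def F(x0):
    """Cumulative probability P(X <= x0) for any real number x0."""
    return sum(p for x, p in zip(xs, probs) if x <= x0)

print(cdf)
print(F(-1), F(1.5), F(10))
```

The values never decrease as x0 grows and always stay between 0 and 1, which is what the two properties state.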
Descriptive Measures
Functions of Random Variables

• If P(x) is the probability function of a discrete random variable X, and g(X) is some function of X, then the expected value of the function g is

  $E[g(X)] = \sum_x g(x) P(x)$

• Note that the expected value is the average of g(x), weighted by the probabilities P(x)


Expected Value
• Expected value (or mean) of a discrete distribution (weighted average):

  $\mu = E(x) = \sum_x x P(x)$

• Example: Toss 2 coins, x = # of heads, compute the expected value of x:

  x    P(x)
  0    .25
  1    .50
  2    .25

  E(x) = (0 × .25) + (1 × .50) + (2 × .25) = 1.0
Variance and Standard Deviation
• Variance of a discrete random variable X

  $\sigma^2 = E[(X - \mu)^2] = \sum_x (x - \mu)^2 P(x)$

• Standard deviation of a discrete random variable X

  $\sigma = \sqrt{\sigma^2} = \sqrt{\sum_x (x - \mu)^2 P(x)}$
Shortcut for variance formula

$\mathrm{Var}(X) = \sigma^2 = E(X^2) - \mu^2 = \sum_x x^2 P(x) - \mu^2$
Standard Deviation Example

• Example: Toss 2 coins, X = # heads, compute the standard deviation (recall E(x) = 1)

  $\sigma = \sqrt{\sum_x (x - \mu)^2 P(x)}$

  $\sigma = \sqrt{(0-1)^2(.25) + (1-1)^2(.50) + (2-1)^2(.25)} = \sqrt{.50} = .707$

Possible number of heads = 0, 1, or 2
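The mean, variance, and standard deviation of the two-coin example can be reproduced in a few lines. This sketch also evaluates the shortcut form E(X²) − μ² to show that it agrees with the definition; the variable names are illustrative.

```python
import math

# P(x) for X = number of heads in 2 coin tosses.
pmf = {0: 0.25, 1: 0.50, 2: 0.25}

# Mean: probability-weighted average of x.
mean = sum(x * p for x, p in pmf.items())

# Variance from the definition and from the shortcut formula.
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())
shortcut = sum(x ** 2 * p for x, p in pmf.items()) - mean ** 2

sd = math.sqrt(variance)
print(mean, variance, shortcut, round(sd, 3))
```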
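Rules for Expectation & Variance

Expectation                      Variance
• E(c) = c                       • V(c) = 0
• E(cX) = cE(X)                  • V(cX) = c²V(X)
• E(cX + Y) = cE(X) + E(Y)       • V(cX + Y) = c²V(X) + V(Y)
• E(X − cY) = E(X) − cE(Y)       • V(X − cY) = V(X) + c²V(Y)

where X and Y are random variables and c is a constant. The variance rules for cX + Y and X − cY hold when X and Y are independent.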
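These rules can be checked numerically on a small case. The sketch below assumes X is the number of heads in two coin tosses, Y is a hypothetical Bernoulli variable with success probability 3/5 that is independent of X, and c = 3; none of these choices come from the slides beyond the coin example.

```python
from fractions import Fraction as Fr
from itertools import product

px = {0: Fr(1, 4), 1: Fr(1, 2), 2: Fr(1, 4)}  # heads in 2 coin tosses
py = {0: Fr(2, 5), 1: Fr(3, 5)}               # hypothetical, independent of X
c = 3                                         # hypothetical constant

def mean(pmf):
    return sum(v * p for v, p in pmf.items())

def var(pmf):
    m = mean(pmf)
    return sum((v - m) ** 2 * p for v, p in pmf.items())

def dist_of(g):
    """Distribution of g(X, Y), using P(x, y) = P(x)P(y) (independence)."""
    out = {}
    for (x, p), (y, q) in product(px.items(), py.items()):
        out[g(x, y)] = out.get(g(x, y), 0) + p * q
    return out

plus = dist_of(lambda x, y: c * x + y)
minus = dist_of(lambda x, y: x - c * y)

print(mean(plus) == c * mean(px) + mean(py))
print(var(plus) == c ** 2 * var(px) + var(py))
print(mean(minus) == mean(px) - c * mean(py))
print(var(minus) == var(px) + c ** 2 * var(py))
```

Exact fractions make the equality checks meaningful; with floating-point numbers a small tolerance would be needed instead.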
Solve CE4,1 and CE4,2

CE4,1: Consider the following probability distribution function (pdf). Let X denote the number of cars sold in a day.

X      0     1     2     3     4     5     6
P(X)   0.07  0.19  0.23  0.17  0.16  0.14  0.04

a) P(X>3) = ?
b) P(2<X<5) = ?
c) P(X≥2) = ?
d) P(X<6) = ?

CE4,2: Suppose that the pdf for the number of errors, X, on pages from a business textbook is

P(0) = 0.81   P(1) = 0.17   P(2) = 0.02

Find the mean and standard deviation of the number of errors per page.


Solution: CE4,3

CE4,3: Consider the following probability distribution function (pdf). Let X denote the number of cars sold in a day.

X      0     1     2     3     4     5     6
P(X)   0.07  0.19  0.23  0.17  0.16  0.14  0.04

Find the mean and standard deviation.

Mean: $\mu = E(X) = \sum_x x P(x) = (0)(0.07) + (1)(0.19) + \dots + (6)(0.04) = 2.74 \approx 3$ cars will be sold on any random day

Variance: $\sigma^2 = \sum_x (x - \mu)^2 P(x) = (0 - 2.74)^2(0.07) + (1 - 2.74)^2(0.19) + \dots + (6 - 2.74)^2(0.04) = 2.63$

$\sigma = \sqrt{2.63} = 1.62 \approx 2$ cars is the expected deviation from the center
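The arithmetic in this solution can be reproduced directly from the table. A minimal sketch, with variable names chosen for illustration:

```python
import math

# Number of cars sold in a day and the corresponding probabilities.
xs = range(7)
probs = [0.07, 0.19, 0.23, 0.17, 0.16, 0.14, 0.04]

mean = sum(x * p for x, p in zip(xs, probs))
variance = sum((x - mean) ** 2 * p for x, p in zip(xs, probs))
sd = math.sqrt(variance)

print(round(mean, 2), round(variance, 2), round(sd, 2))
```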
Probability Distributions

• Discrete Probability Distributions (Ch. 6)
  • Bernoulli
  • Binomial
  • Hypergeometric
• Continuous Probability Distributions (Ch. 7)
  • Normal
Bernoulli Distribution

• Consider only two outcomes: “success” or “failure”
• Let P denote the probability of success
• Let 1 – P be the probability of failure
• Define random variable X:
  x = 1 if success, x = 0 if failure
• Then the Bernoulli probability function is

  $P(0) = 1 - P$ and $P(1) = P$


Bernoulli Distribution
Mean and Variance

• The mean is µ = P

  $\mu = E(X) = \sum_x x P(x) = (0)(1 - P) + (1)P = P$

• The variance is σ² = P(1 – P)

  $\sigma^2 = E[(X - \mu)^2] = \sum_x (x - \mu)^2 P(x) = (0 - P)^2(1 - P) + (1 - P)^2 P = P(1 - P)$
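The Bernoulli mean and variance can be checked from the definitions for any particular success probability. The sketch below uses a hypothetical value P = 0.3; the function name is illustrative.

```python
def bernoulli_moments(p):
    """Mean and variance of a Bernoulli variable, computed from its pmf."""
    pmf = {0: 1 - p, 1: p}
    mean = sum(x * q for x, q in pmf.items())
    variance = sum((x - mean) ** 2 * q for x, q in pmf.items())
    return mean, variance

p = 0.3  # hypothetical probability of success
mean, variance = bernoulli_moments(p)
print(mean, variance, p * (1 - p))
```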
Binomial Distribution
• A fixed number of observations, n
  • e.g., 15 tosses of a coin; ten light bulbs taken from a warehouse
• Two mutually exclusive and collectively exhaustive categories
  • e.g., head or tail in each toss of a coin; defective or not defective light bulb
  • Generally called “success” and “failure”
  • Probability of success is P, probability of failure is 1 – P
• Constant probability for each observation
  • e.g., Probability of getting a tail is the same each time we toss the coin
• Observations are independent
  • The outcome of one observation does not affect the outcome of the other
Possible Binomial Distribution Settings

• A manufacturing plant labels items as either defective or acceptable
• A firm bidding for contracts will either get a contract or not
• A marketing research firm receives survey responses of “yes I will buy” or “no I will not”
• New job applicants either accept the offer or reject it
Developing Binomial Distribution
Let's repeat a success/failure experiment n independent times.
Assume that we obtain the sequence below, with x successes followed by n − x failures.

S S S……S S F F F……F F
The probability of observing this sequence is
p*p*…*p*(1-p)*(1-p)*…*(1-p)

Since X is defined as the number of successes, we can write this as

$p^x (1-p)^{n-x}$

Is this the only possible sequence that we can observe?

How many different sequences can be observed?


Sequences of x Successes in n Trials

• The number of sequences with x successes in n independent trials is:

  $C^n_x = \frac{n!}{x!\,(n - x)!}$

  where n! = n·(n – 1)·(n – 2)· . . . ·1 and 0! = 1

• These sequences are mutually exclusive, since no two can occur at the same time
We can observe $\binom{n}{x}$ different sequences, each with probability $p^x (1-p)^{n-x}$, so

$P(X = x) = \binom{n}{x} p^x (1-p)^{n-x}$
Example
The random variable X represents the number of students who prefer news from the Internet among a random sample of students from a large university. Suppose the population proportion of students who prefer Internet news is 0.6, and relabel the outcome “Internet” as a success (S) and “not Internet” as a failure (F). List the elementary outcomes of sampling 4 students, together with the value of X and the associated probabilities.
Example: Answer

Value of X   Sample points (basic outcomes)           Probability of each outcome   Number of outcomes     P(X)
0            FFFF                                     $0.40^4$                      $\binom{4}{0} = 1$     $0.40^4$
1            SFFF, FSFF, FFSF, FFFS                   $0.60^1 \cdot 0.40^3$         $\binom{4}{1} = 4$     $4 \cdot 0.60^1 \cdot 0.40^3$
2            SSFF, SFSF, SFFS, FSSF, FSFS, FFSS       $0.60^2 \cdot 0.40^2$         $\binom{4}{2} = 6$     $6 \cdot 0.60^2 \cdot 0.40^2$
3            SSSF, SSFS, SFSS, FSSS                   $0.60^3 \cdot 0.40^1$         $\binom{4}{3} = 4$     $4 \cdot 0.60^3 \cdot 0.40^1$
4            SSSS                                     $0.60^4$                      $\binom{4}{4} = 1$     $0.60^4$

$P(X = x) = \binom{n}{x} p^x (1-p)^{n-x}$
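The table for the 4-student sample can be rebuilt by listing every S/F sequence and grouping by the number of successes. This is a sketch under the example's assumptions (n = 4, p = 0.6, independent students); the variable names are illustrative.

```python
from itertools import product
from math import comb

n, p = 4, 0.6

counts = {x: 0 for x in range(n + 1)}   # number of sequences with x successes
probs = {x: 0.0 for x in range(n + 1)}  # total probability of those sequences

for seq in product("SF", repeat=n):
    x = seq.count("S")
    counts[x] += 1
    # Every sequence with x successes has probability p^x (1-p)^(n-x).
    probs[x] += p ** x * (1 - p) ** (n - x)

for x in range(n + 1):
    print(x, counts[x], comb(n, x), round(probs[x], 4))
```

The sequence counts match the binomial coefficients, and summing the per-sequence probabilities gives the same P(X) as multiplying one sequence's probability by the count.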
Binomial Distribution Formula

$P(x) = \frac{n!}{x!\,(n - x)!} P^x (1 - P)^{n - x}$

P(x) = probability of x successes in n trials, with probability of success P on each trial
x = number of ‘successes’ in sample (x = 0, 1, 2, ..., n)
n = sample size (number of trials or observations)
P = probability of “success”

Example: Flip a coin four times, let x = # heads:
n = 4
P = 0.5
1 - P = (1 - 0.5) = 0.5
x = 0, 1, 2, 3, 4
Example:
Calculating a Binomial Probability
What is the probability of one success in five observations if the probability of success is 0.1?
x = 1, n = 5, and P = 0.1

$P(x = 1) = \frac{n!}{x!\,(n - x)!} P^x (1 - P)^{n - x}$
$= \frac{5!}{1!\,(5 - 1)!} (0.1)^1 (1 - 0.1)^{5 - 1}$
$= (5)(0.1)(0.9)^4$
$= .32805$
Binomial Distribution
• The shape of the binomial distribution depends on the values of P and n

[Figure: bar charts of P(x) against x = 0, 1, ..., 5, with the mean marked; panels: n = 5, P = 0.1 and n = 5, P = 0.5]
Binomial Distribution
Mean and Variance

• Mean

  $\mu = E(x) = nP$

• Variance and Standard Deviation

  $\sigma^2 = nP(1 - P)$

  $\sigma = \sqrt{nP(1 - P)}$

Where n = sample size
P = probability of success
(1 – P) = probability of failure
Binomial Characteristics
Examples

n = 5, P = 0.1:
$\mu = nP = (5)(0.1) = 0.5$
$\sigma = \sqrt{nP(1 - P)} = \sqrt{(5)(0.1)(1 - 0.1)} = 0.6708$

n = 5, P = 0.5:
$\mu = nP = (5)(0.5) = 2.5$
$\sigma = \sqrt{nP(1 - P)} = \sqrt{(5)(0.5)(1 - 0.5)} = 1.118$

[Figure: bar charts of P(x) against x = 0, 1, ..., 5, with the mean marked; panels: n = 5, P = 0.1 and n = 5, P = 0.5]
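The formulas μ = nP and σ = √(nP(1 − P)) can be compared with the mean and standard deviation computed directly from the binomial probabilities. A minimal sketch for the two cases n = 5, P = 0.1 and n = 5, P = 0.5, with illustrative function names:

```python
from math import comb, sqrt

def binomial_pmf(x, n, p):
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def moments(n, p):
    """Mean and standard deviation computed from the pmf, not from the formulas."""
    pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]
    mean = sum(x * q for x, q in enumerate(pmf))
    variance = sum((x - mean) ** 2 * q for x, q in enumerate(pmf))
    return mean, sqrt(variance)

for p in (0.1, 0.5):
    mean, sd = moments(5, p)
    # Columns: P, mean from pmf, nP, sd from pmf, sqrt(nP(1-P))
    print(p, round(mean, 4), round(5 * p, 4),
          round(sd, 4), round(sqrt(5 * p * (1 - p)), 4))
```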
CE4,4
The Cubs are to play a series of 5 games in St. Louis against the Cardinals. For any one game, it is estimated that the probability of a Cubs win is 0.4. The outcomes of the five games are independent of one another.
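Solution: CE4,4
Let X = number of Cubs wins in the 5 games (n = 5, p = 0.4).

a) What is the probability that the Cubs will win all five games?

$P(X = 5) = \binom{5}{5}(0.4)^5(1 - 0.4)^0 = 0.0102 \approx 1\%$

b) What is the probability that the Cubs will win a majority of the five games?

$P(X \ge 3) = P(X = 3) + P(X = 4) + P(X = 5)$
$= \binom{5}{3}(0.4)^3(1 - 0.4)^2 + \binom{5}{4}(0.4)^4(1 - 0.4)^1 + \binom{5}{5}(0.4)^5(1 - 0.4)^0 = 0.3174 \approx 32\%$

c) If the Cubs win the first game, what is the probability that they will win a majority of the five games?

The Cubs have won the first game, so 4 games remain. Let X now be the number of wins in those 4 games (n = 4, p = 0.4); the Cubs need at least 2 more wins.

$P(X \ge 2) = 0.5248 \approx 52\%$

d) Before the series begins, what is the expected number of Cubs wins in these five games?

Mean = np = 5(0.4) = 2 games

e) If the Cubs win the first game, what is the expected number of Cubs wins in these five games?

The Cubs have won the first game, so 4 games remain. With X the number of wins in the remaining 4 games, the total number of wins is

Y = 1 + X
Mean = E(Y) = E(1 + X) = 1 + E(X) = 1 + (4)(0.4) = 2.6 ≈ 3 games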
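Each part of this solution can be checked with a short script. The sketch below assumes the binomial model stated in the problem (independent games, win probability 0.4); the variable names are illustrative.

```python
from math import comb

def binomial_pmf(x, n, p):
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

p = 0.4

# a) Win all five games.
win_all = binomial_pmf(5, 5, p)

# b) Win a majority (at least 3) of five games.
majority = sum(binomial_pmf(x, 5, p) for x in range(3, 6))

# c) Given a win in game 1: at least 2 wins in the remaining 4 games.
majority_given_first = sum(binomial_pmf(x, 4, p) for x in range(2, 5))

# d) and e) Expected number of wins.
expected = 5 * p
expected_given_first = 1 + 4 * p

print(round(win_all, 4), round(majority, 4), round(majority_given_first, 4))
print(expected, expected_given_first)
```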
The Hypergeometric Distribution

• “n” trials in a sample taken from a finite population of size N
• Sample taken without replacement
• Outcomes of trials are dependent
• Concerned with finding the probability of “X” successes in the sample where there are “S” successes in the population
Hypergeometric Distribution
Formula

$P(x) = \frac{C^S_x \, C^{N-S}_{n-x}}{C^N_n} = \frac{\dfrac{S!}{x!\,(S - x)!} \times \dfrac{(N - S)!}{(n - x)!\,(N - S - n + x)!}}{\dfrac{N!}{n!\,(N - n)!}}$

Where
N = population size
S = number of successes in the population
N – S = number of failures in the population
n = sample size
x = number of successes in the sample
n – x = number of failures in the sample
Hypergeometric Distribution
Mean and Variance
Mean = E(X) = $\mu = n(S/N)$
Variance = Var(X) = $\sigma^2 = n(S/N)(1 - S/N)\,\frac{N - n}{N - 1}$

where (N-n)/(N-1) is the finite population correction.

We assume that we sample from a finite population without replacement and that n/N > 0.05, so the probability of a success changes from trial to trial. Otherwise, you can use the binomial distribution as an approximation.
Using the Hypergeometric Distribution
■ Example: 3 different computers are checked from 10 in the department. 4 of the 10 computers have illegal software loaded. What is the probability that 2 of the 3 selected computers have illegal software loaded?
N = 10   n = 3
S = 4    x = 2

$P(x = 2) = \frac{C^S_x \, C^{N-S}_{n-x}}{C^N_n} = \frac{C^4_2 \, C^6_1}{C^{10}_3} = \frac{(6)(6)}{120} = 0.3$

The probability that 2 of the 3 selected computers have illegal software loaded is 0.30, or 30%.
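The computer example can be reproduced with a small hypergeometric function built on `math.comb`. The sketch below also checks the mean n(S/N) and the variance with the finite population correction against values computed from the probabilities; the function name `hypergeom_pmf` is illustrative.

```python
from math import comb

def hypergeom_pmf(x, N, S, n):
    """Probability of x successes in a sample of n drawn without replacement
    from a population of N items containing S successes."""
    return comb(S, x) * comb(N - S, n - x) / comb(N, n)

N, S, n = 10, 4, 3

p2 = hypergeom_pmf(2, N, S, n)
print(p2)

# Mean and variance from the pmf, compared with the formulas.
pmf = [hypergeom_pmf(x, N, S, n) for x in range(n + 1)]
mean = sum(x * q for x, q in enumerate(pmf))
variance = sum((x - mean) ** 2 * q for x, q in enumerate(pmf))

print(mean, n * S / N)
print(variance, n * (S / N) * (1 - S / N) * (N - n) / (N - 1))
```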
CE4,5
A company receives a shipment of 16 items. A random
sample of 4 items is selected, and the shipment is
rejected if any of these items proves to be defective.

N=16 n=4
X=number of defectives in the sample
Decision rule: Reject shipment if X≥1 and accept
shipment if X=0
Solution: CE4,5
N = 16, n = 4, X = number of defectives in the sample
Decision rule: Reject shipment if X≥1 and accept shipment if X=0

a) What is the probability of accepting a shipment containing 4 defective items?

S = 4
$P(\text{accept}) = P(X = 0) = \frac{\binom{4}{0}\binom{12}{4}}{\binom{16}{4}} = \frac{495}{1820} = 0.27$

b) What is the probability of accepting a shipment containing 1 defective item?

S = 1
$P(\text{accept}) = P(X = 0) = \frac{\binom{1}{0}\binom{15}{4}}{\binom{16}{4}} = \frac{1365}{1820} = 0.75$

c) What is the probability of rejecting a shipment containing 1 defective item?

S = 1
P(reject) = 1 - P(accept) = 1 - 0.75 = 0.25
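The acceptance probability depends only on how many defective items the shipment contains, so it is convenient to wrap it in a function. This is a sketch under the problem's assumptions (N = 16, n = 4, accept only when X = 0); the function name `prob_accept` is hypothetical.

```python
from math import comb

def prob_accept(defectives, N=16, n=4):
    """P(X = 0): all n sampled items come from the N - defectives good items."""
    return comb(N - defectives, n) / comb(N, n)

for s in (1, 4):
    accept = prob_accept(s)
    print(s, round(accept, 4), round(1 - accept, 4))
```

Under these assumptions the function gives 495/1820 ≈ 0.272 for a shipment with 4 defective items and 1365/1820 = 0.75 for a shipment with 1 defective item.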
Discuss: CE4,6 & CE4,7
CE4,6: An analyst predicted that 3.5% of all small corporations would file for bankruptcy in the coming year. For a random sample of 100 small corporations, estimate the probability that at least 3 will file for bankruptcy in the next year.

CE4,7: An auditor reviewing the invoices of a small company finds that there are errors in 1.5% of them. If the auditor looks at 500 invoices, what is the probability that he finds more than 1 invoice with errors?
Chapter Summary

• Defined discrete random variables and probability distributions
• Discussed descriptive measures
• Discussed the Bernoulli distribution
• Discussed the Binomial distribution
• Discussed the Hypergeometric distribution
