Probability Theory
Data Representation
Graphical representation of Data
Stem-and-Leaf Plot. Histogram (Fig. 508)
Stem-and-Leaf Plot
We divide these numbers into 5 groups, 75–79, 80–84, 85–89, 90–94, 95–99.
The integers in the tens position of the groups are 7, 8, 8, 9, 9. These form the
stem.
The first leaf is 789, representing 77, 78, 79. The second leaf is 1123344,
representing 81, 81, 82, 83, 83, 84, 84.
The number of times a value occurs is called its absolute frequency. Thus 78
has absolute frequency 1, and the value 89 has absolute frequency 5.
Histogram
Histograms are better at displaying the distribution of data than stem-and-leaf
plots.
The bases of the rectangles in Fig. 508 are the x-intervals (known as class
intervals) 74.5–79.5, 79.5–84.5, 84.5–89.5, 89.5–94.5, 94.5–99.5, whose
midpoints (known as class marks) are x = 77, 82, 87, 92, 97, respectively. The
height of a rectangle with class mark x is the relative class frequency f_rel(x),
defined as the number of data values in that class interval, divided by n (= 30 in
our case). Hence the areas of the rectangles are proportional to these relative
frequencies, 0.10, 0.23, 0.43, 0.17, 0.07, so that histograms give a good
impression of the distribution.
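The relative class frequencies can be recomputed with a short sketch. The class counts below are partly an assumption: 3 and 7 are read off the first two leaves, while 13, 5, 2 are inferred from the rounded relative frequencies with n = 30, since the raw data are not listed here.

```python
# Class marks from the text. Counts 3 and 7 come from the first two leaves;
# 13, 5, 2 are ASSUMED (inferred from 0.43, 0.17, 0.07 with n = 30).
class_marks = [77, 82, 87, 92, 97]
class_counts = [3, 7, 13, 5, 2]

n = sum(class_counts)  # total number of data values

# Relative class frequency: count in the class divided by n.
relative_frequencies = [round(count / n, 2) for count in class_counts]

for mark, f_rel in zip(class_marks, relative_frequencies):
    print(f"class mark {mark}: f_rel = {f_rel:.2f}")
```

Because all class intervals have the same width, heights and areas of the rectangles are proportional to the same numbers.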
Mean. Standard Deviation. Variance. Empirical Rule
Medians and quartiles are easily obtained by ordering and counting, practically
without calculation. But they do not give full information on data: you can change
data values to some extent without changing the median. Similarly for the
quartiles.
The average size of the data values can be measured in a more refined way by
the mean.
Similarly, the spread (variability) of the data values can be measured in a more
refined way by the standard deviation s or by its square, the variance.
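As a rough illustration, the three quantities can be computed with Python's standard `statistics` module. The sample below is hypothetical, and the sketch assumes the sample variance with divisor n - 1, which is what `statistics.variance` computes.

```python
import statistics

# Hypothetical sample, chosen only for illustration.
sample = [89, 84, 87, 81, 89]

mean = statistics.mean(sample)          # average size of the data values
variance = statistics.variance(sample)  # sample variance (divisor n - 1)
std_dev = statistics.stdev(sample)      # standard deviation s = sqrt(variance)

print(mean, variance, round(std_dev, 3))
```

Changing one value noticeably changes the mean and the standard deviation, whereas the median may stay the same, which is the contrast drawn above.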
Empirical Rule.
Empirical Rule and Outliers. z-Score
Experiments, Outcomes, Events
Examples
1. Tossing a coin – outcomes S = {Head, Tail}
We are unable to predict on each toss whether the outcome is Head or
Tail.
In the long run we can predict that heads will occur 50% of the time
and tails will occur 50% of the time.
2. Rolling a die – outcomes
S = {1, 2, 3, 4, 5, 6}
An Event, E
The event, E, is any subset of the sample space,
S, i.e. any set of outcomes (not necessarily all
outcomes) of the random phenomenon.
[Venn diagram: the event E drawn as a region inside the sample space S]
The event, E, is said to have occurred if, after
the outcome has been observed, the outcome lies
in E.
Examples
Union: A ∪ B
The event A ∪ B occurs if the event A occurs or
the event B occurs.
[Venn diagram: A ∪ B shaded]
Intersection
Let A and B be two events, then the intersection of
A and B is the event (denoted by A ∩ B) defined by:
A ∩ B = {e | e belongs to A and e belongs to B}
The event A ∩ B occurs if the event A occurs and
the event B occurs.
[Venn diagram: the overlap A ∩ B of A and B shaded]
Complement
The event Ā (the complement of A) occurs if the event A does not
occur.
[Venn diagram: Ā is the region of S outside A]
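Events are subsets of the sample space, so Python's built-in `set` type gives a convenient way to experiment with these operations. The sketch below uses the die sample space from the earlier example; the events A and B are hypothetical choices made for illustration.

```python
# Sample space for rolling a die.
S = {1, 2, 3, 4, 5, 6}

# Hypothetical events, chosen only for illustration.
A = {2, 4, 6}  # "the roll is even"
B = {1, 2, 3}  # "the roll is at most 3"

union = A | B         # occurs if A occurs or B occurs
intersection = A & B  # occurs if A occurs and B occurs
complement_A = S - A  # occurs if A does not occur

print(union, intersection, complement_A)
```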
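In problems you will recognize that you are
working with:
1. Union, if you see the word "or",
2. Intersection, if you see the word "and",
3. Complement, if you see the word "not".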
If two events A and B are mutually
exclusive then:
A ∩ B = ∅ (they have no outcomes in common and cannot occur at the same time)
Probability
Definition: probability of an event E.
Suppose that the sample space S = {o1, o2, o3, …
oN} has a finite number, N, of outcomes.
Also each of the outcomes is equally likely
(because of symmetry).
Then for any event E
$$P[E] = \frac{n(E)}{n(S)} = \frac{n(E)}{N} = \frac{\text{no. of outcomes in } E}{\text{total no. of outcomes}}$$
Note: the symbol n(A) = no. of elements of A.
Thus this definition of P[E] applies only when the sample space has a
finite number of outcomes and each outcome is equally likely.
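A minimal sketch of this counting definition, assuming a finite sample space with equally likely outcomes. The function name `classical_probability` and the event E are illustrative, not part of the slides.

```python
from fractions import Fraction

def classical_probability(event, sample_space):
    """P[E] = n(E) / n(S); valid only for finitely many equally likely outcomes."""
    if not event <= sample_space:
        raise ValueError("event must be a subset of the sample space")
    return Fraction(len(event), len(sample_space))

S = {1, 2, 3, 4, 5, 6}  # rolling a die
E = {2, 4, 6}           # hypothetical event: "the roll is even"

p_even = classical_probability(E, S)
print(p_even)  # 1/2
```

Using `Fraction` keeps the result exact, which makes it easy to compare against hand calculations.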
If two events A and B are mutually
exclusive, i.e. if A ∩ B = ∅, then:
P[A ∪ B] = P[A] + P[B]
i.e.
P[A or B] = P[A] + P[B]
Rule: The additive rule
(In general)
P[A ∪ B] = P[A] + P[B] – P[A ∩ B]
or
P[A or B] = P[A] + P[B] – P[A and B]
Logic: when P[A] is added to P[B], the outcomes in A ∩ B are counted twice,
hence
P[A ∪ B] = P[A] + P[B] – P[A ∩ B]
Example:
Saskatoon and Moncton are two of the cities competing
for the World University Games. (There are also many
others.) The organizers are narrowing the competition to
the final 5 cities.
There is a 20% chance that Saskatoon will be amongst
the final 5. There is a 35% chance that Moncton will be
amongst the final 5 and an 8% chance that both
Saskatoon and Moncton will be amongst the final 5.
What is the probability that Saskatoon or Moncton will
be amongst the final 5?
Solution:
Let A = the event that Saskatoon is amongst the final 5.
Let B = the event that Moncton is amongst the final 5.
Given P[A] = 0.20, P[B] = 0.35, and P[A ∩ B] = 0.08
What is P[A ∪ B]?
Note: “and” ≡ ∩, “or” ≡ ∪.
P[A ∪ B] = P[A] + P[B] – P[A ∩ B]
= 0.20 + 0.35 – 0.08 = 0.47
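The same arithmetic as a small sketch. The helper name `prob_union` is an illustrative choice; the numbers are those given in the example.

```python
def prob_union(p_a, p_b, p_a_and_b):
    """Additive rule: P[A or B] = P[A] + P[B] - P[A and B]."""
    return p_a + p_b - p_a_and_b

# A = Saskatoon is in the final 5, B = Moncton is in the final 5.
p_either = prob_union(0.20, 0.35, 0.08)
print(round(p_either, 2))  # 0.47
```

For mutually exclusive events the third argument is 0 and the rule reduces to P[A] + P[B].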
Rule for complements
2. P[Ā] = 1 – P[A]
or
P[not A] = 1 – P[A]
Logic:
Ā and A are mutually exclusive
and S = A ∪ Ā,
thus 1 = P[S] = P[A] + P[Ā]
and P[Ā] = 1 – P[A]
Independent event
Sampling With and Without Replacement
• A box contains 10 screws, three of which are
defective. Two screws are drawn at random.
Find the probability that neither of the two
screws is defective.
• Solution?
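One way to approach the screw problem is sketched below for both sampling schemes. It assumes each screw is equally likely to be drawn; the variable names are illustrative.

```python
from fractions import Fraction

total, defective = 10, 3
good = total - defective  # 7 nondefective screws

# With replacement: the two draws are independent, so the probabilities multiply.
p_with = Fraction(good, total) ** 2

# Without replacement: after one good screw is removed, 6 good screws remain out of 9.
p_without = Fraction(good, total) * Fraction(good - 1, total - 1)

print(p_with, float(p_with))                  # 49/100 0.49
print(p_without, round(float(p_without), 3))  # 7/15 0.467
```

The two answers differ because, without replacement, the outcome of the first draw changes the composition of the box for the second draw.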
Exercise
• A batch of 200 iron rods consists of 50
oversized rods, 50 undersized rods, and 100
rods of the desired length. If two rods are
drawn at random without replacement,
what is the probability of obtaining (a) two
rods of the desired length, (b) exactly one of
the desired length, (c) none of the desired
length?
Conditional Probability
• Frequently, before observing the outcome of a random
experiment, you are given information regarding the
outcome.
• How should this information be used in predicting
the outcome?
• Namely, how should probabilities be adjusted to take
this information into account?
• Usually the information is given in the following
form: you are told that the outcome belongs to a
given event (i.e. you are told that a certain event has
occurred).
Definition
Suppose that we are interested in computing the
probability of event A and we have been told
event B has occurred.
Then the conditional probability of A given B is
defined to be:
$$P[A \mid B] = \frac{P[A \cap B]}{P[B]} \quad \text{if } P[B] \neq 0$$
Rationale:
If we’re told that event B has occurred then the sample
space is restricted to B.
The probability within B has to be normalized. This is
achieved by dividing by P[B].
The event A can now only occur if the outcome is in
A ∩ B. Hence the new probability of A is:
$$P[A \mid B] = \frac{P[A \cap B]}{P[B]}$$
[Venn diagram: events A and B with the overlap A ∩ B marked]
Examples
An Example
The Academy Awards are soon to be shown.
For a specific married couple the probability that
the husband watches the show is 80%, the
probability that his wife watches the show is
65%, while the probability that they both watch
the show is 60%.
If the husband is watching the show, what is the
probability that his wife is also watching the
show?
Solution:
Let B = the event that the husband watches the show
P[B] = 0.80
Let A = the event that his wife watches the show
P[A] = 0.65 and P[A ∩ B] = 0.60
$$P[A \mid B] = \frac{P[A \cap B]}{P[B]} = \frac{0.60}{0.80} = 0.75$$
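The calculation can be reproduced with a short sketch. The function name `conditional_probability` is illustrative; the second call, conditioning on the wife watching, is an extra computation that the example does not ask for.

```python
def conditional_probability(p_a_and_b, p_b):
    """Return P[A | B] = P[A and B] / P[B]; undefined when P[B] = 0."""
    if p_b == 0:
        raise ValueError("P[B] must be nonzero")
    return p_a_and_b / p_b

# A = wife watches (P = 0.65), B = husband watches (P = 0.80), both watch (P = 0.60).
p_wife_given_husband = conditional_probability(0.60, 0.80)
p_husband_given_wife = conditional_probability(0.60, 0.65)

print(round(p_wife_given_husband, 2))  # 0.75
print(round(p_husband_given_wife, 3))  # 0.923
```

The two conditional probabilities differ because the normalizing event differs, even though the joint probability is the same.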
Independence
Definition
Two events A and B are called independent if
$$P[A \cap B] = P[A]\,P[B]$$
Note: if P[B] ≠ 0 and P[A] ≠ 0 then
$$P[A \mid B] = \frac{P[A \cap B]}{P[B]} = \frac{P[A]\,P[B]}{P[B]} = P[A]$$
and
$$P[B \mid A] = \frac{P[A \cap B]}{P[A]} = \frac{P[A]\,P[B]}{P[A]} = P[B]$$
Thus in the case of independence the conditional probability of
an event is not affected by the knowledge of the other event.
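The product condition is easy to check by counting on a small sample space. The sketch below uses one roll of a fair die; the events A, B, C and the helper names are hypothetical choices for illustration.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}  # one roll of a fair die

def prob(event):
    """Counting probability for equally likely outcomes."""
    return Fraction(len(event & S), len(S))

def independent(x, y):
    """True when P[X and Y] equals P[X] * P[Y]."""
    return prob(x & y) == prob(x) * prob(y)

A = {2, 4, 6}     # "even"
B = {1, 2, 3, 4}  # "at most 4"
C = {1, 2, 3}     # "at most 3"

print(independent(A, B))  # True:  1/3 == 1/2 * 2/3
print(independent(A, C))  # False: 1/6 != 1/2 * 1/2
```

Knowing that the roll is at most 4 leaves the chance of an even roll at 1/2, while knowing that it is at most 3 lowers it to 1/3.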
Difference between independence
and mutually exclusive
Mutually exclusive
Two mutually exclusive events are independent only in
the special case where
P[A] = 0 or P[B] = 0 (since P[A ∩ B] = 0 must equal P[A] P[B]).
Mutually exclusive events are
highly dependent otherwise. A
and B cannot occur
simultaneously. If one event
occurs the other event does not
occur.
[Venn diagram: A and B drawn as disjoint regions]
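Independent events
$$P[A \cap B] = P[A]\,P[B]$$
or
$$\frac{P[A \cap B]}{P[B]} = P[A] = \frac{P[A]}{P[S]}$$
[Venn diagram: A and B overlapping inside S, with A ∩ B marked]
The ratio of the probability of the
set A within B is the same as the
ratio of the probability of the set
A within the entire sample space S.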
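The multiplicative rule of probability
$$P[A \cap B] = \begin{cases} P[A]\,P[B \mid A] & \text{if } P[A] \neq 0 \\ P[B]\,P[A \mid B] & \text{if } P[B] \neq 0 \end{cases}$$
and
$$P[A \cap B] = P[A]\,P[B]$$
if A and B are independent.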
R: It is a rainy day
W: The grass is wet
Pr(R) = 0.8

Pr(W|R)    R      ¬R
W          0.7    0.4
¬W         0.3    0.6

Pr(R|W) = ?
Bayes’ Rule
R: It rains
W: The grass is wet
Information: Pr(W|R) (R → W)
Inference: Pr(R|W)
In general, with hypothesis H and evidence E:
Information: Pr(E|H)
Inference: Pr(H|E)
$$\underbrace{\Pr(H \mid E)}_{\text{Posterior}} = \frac{\overbrace{\Pr(E \mid H)}^{\text{Likelihood}}\;\overbrace{\Pr(H)}^{\text{Prior}}}{\Pr(E)}$$
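Applied to the rain example, with H = R and E = W, the rule can be evaluated as in the sketch below. The evidence probability Pr(W) is not given on the slide; it is obtained here from the law of total probability, and the variable names are illustrative.

```python
# Numbers from the rain example: Pr(R) and the table of Pr(W | R), Pr(W | not R).
p_rain = 0.8
p_wet_given_rain = 0.7
p_wet_given_no_rain = 0.4

# Evidence: Pr(W) = Pr(W | R) Pr(R) + Pr(W | not R) Pr(not R).
p_wet = p_wet_given_rain * p_rain + p_wet_given_no_rain * (1 - p_rain)

# Bayes' rule: posterior = likelihood * prior / evidence.
p_rain_given_wet = p_wet_given_rain * p_rain / p_wet

print(round(p_wet, 4), round(p_rain_given_wet, 4))  # 0.64 0.875
```

Observing wet grass raises the probability of rain from the prior 0.8 to a posterior of 0.875.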
Bayes’ Rule: More Complicated
• Suppose that B1, B2, … Bk form a partition of S:
$$B_i \cap B_j = \emptyset \ (i \neq j); \qquad \bigcup_i B_i = S$$
Suppose that Pr(Bi) > 0 and Pr(A) > 0. Then
$$\Pr(B_i \mid A) = \frac{\Pr(A \mid B_i)\Pr(B_i)}{\Pr(A)} = \frac{\Pr(A \mid B_i)\Pr(B_i)}{\sum_{j=1}^{k}\Pr(A B_j)} = \frac{\Pr(A \mid B_i)\Pr(B_i)}{\sum_{j=1}^{k}\Pr(B_j)\Pr(A \mid B_j)}$$
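A minimal sketch of the partition form, assuming the priors Pr(Bj) and likelihoods Pr(A|Bj) are supplied as parallel lists. The function name `posterior` is illustrative. As a check it reuses the rain numbers with the partition {R, ¬R} and the evidence A = ¬W (the grass is not wet).

```python
def posterior(priors, likelihoods):
    """Return [Pr(B_i | A)] for a partition B_1..B_k.

    priors[j] = Pr(B_j), likelihoods[j] = Pr(A | B_j).
    """
    joint = [p * l for p, l in zip(priors, likelihoods)]  # Pr(A B_j)
    p_a = sum(joint)                                      # Pr(A), the denominator
    if p_a == 0:
        raise ValueError("Pr(A) must be positive")
    return [j / p_a for j in joint]

# Partition {R, not R} with Pr(R) = 0.8; evidence A = "grass is not wet".
priors = [0.8, 0.2]
likelihoods = [0.3, 0.6]  # Pr(not W | R), Pr(not W | not R)

post = posterior(priors, likelihoods)
print([round(p, 4) for p in post])  # [0.6667, 0.3333]
```

The posteriors always sum to 1 because the denominator is the sum of the numerators over the whole partition.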
A More Complicated Example
R: It rains
W: The grass is wet
U: People bring umbrellas
[Diagram: R points to both W and U]
Pr(UW|R) = Pr(U|R) Pr(W|R)
Pr(UW|¬R) = Pr(U|¬R) Pr(W|¬R)
Pr(R) = 0.8

Pr(W|R)    R      ¬R        Pr(U|R)    R      ¬R
W          0.7    0.4       U          0.9    0.2
¬W         0.3    0.6       ¬U         0.1    0.8

Pr(U|W) = ?
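One way to answer the question is to sum over the two cases R and ¬R, using the stated conditional independence of U and W given the rain status. The sketch below is a worked calculation under that assumption; the dictionary layout and names are illustrative.

```python
# Keys: True = it rains (R), False = it does not rain (not R).
p_rain = {True: 0.8, False: 1 - 0.8}
p_w_given = {True: 0.7, False: 0.4}  # Pr(W | R), Pr(W | not R)
p_u_given = {True: 0.9, False: 0.2}  # Pr(U | R), Pr(U | not R)

# Pr(UW) = sum over r of Pr(U | r) Pr(W | r) Pr(r), by conditional independence.
p_uw = sum(p_u_given[r] * p_w_given[r] * p_rain[r] for r in (True, False))

# Pr(W) = sum over r of Pr(W | r) Pr(r).
p_w = sum(p_w_given[r] * p_rain[r] for r in (True, False))

p_u_given_w = p_uw / p_w
print(round(p_uw, 4), round(p_w, 4), round(p_u_given_w, 4))  # 0.52 0.64 0.8125
```

U and W are independent only once the rain status is known; without it, seeing wet grass makes rain more likely and therefore umbrellas more likely.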
Conditioning
• If A and B are events with Pr(A) > 0, the conditional
probability of B given A is
$$\Pr(B \mid A) = \frac{\Pr(AB)}{\Pr(A)}$$
• Example: Drug test

           Women    Men
Success    200      1800
Failure    1800     200

A = {Patient is a woman}
B = {Drug fails}
Pr(B|A) = ?
Pr(A|B) = ?
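The two conditional probabilities can be read off the table by counting. The sketch below assumes the table entries are numbers of patients and that a patient is picked uniformly at random; the dictionary keys are illustrative labels.

```python
# Table entries, read as patient counts (an assumption).
counts = {
    ("women", "success"): 200,
    ("men", "success"): 1800,
    ("women", "failure"): 1800,
    ("men", "failure"): 200,
}
total = sum(counts.values())

# A = patient is a woman, B = drug fails.
p_a = sum(c for (sex, _), c in counts.items() if sex == "women") / total
p_b = sum(c for (_, result), c in counts.items() if result == "failure") / total
p_ab = counts[("women", "failure")] / total

p_b_given_a = p_ab / p_a  # Pr(B | A)
p_a_given_b = p_ab / p_b  # Pr(A | B)

print(round(p_b_given_a, 2), round(p_a_given_b, 2))  # 0.9 0.9
```

The two values coincide here only because Pr(A) and Pr(B) both equal 0.5 in this table; in general Pr(B|A) and Pr(A|B) differ.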
Which Drug is Better?
Simpson’s Paradox: View I