
Engineering Data Analysis

The document discusses sample spaces, events, and their probabilities. It defines a sample space as the set of all possible outcomes of a random experiment. An event is a subset of the sample space. It provides rules for calculating probabilities, including: 1) The probability of an event is between 0 and 1. 2) The sum of probabilities of all possible outcomes is 1. 3) The probability of an event not occurring is 1 minus the probability of it occurring.

Uploaded by

Robert Manubag

Sample Spaces, Events, and Their Probabilities

Sample Spaces and Events

Rolling an ordinary six-sided die is a familiar example of a random experiment, an action for
which all possible outcomes can be listed, but for which the actual outcome on any given trial
of the experiment cannot be predicted with certainty. In such a situation we wish to assign to
each outcome, such as rolling a two, a number, called the probability of the outcome that
indicates how likely it is that the outcome will occur. Similarly, we would like to assign a
probability to any event, or collection of outcomes, such as rolling an even number, which
indicates how likely it is that the event will occur if the experiment is performed. This section
provides a framework for discussing probability problems, using the terms just mentioned.

Definition
A random experiment is a mechanism that produces a definite outcome that cannot be
predicted with certainty. The sample space associated with a random experiment is the set of
all possible outcomes. An event is a subset of the sample space.
An event E is said to occur on a particular trial of the experiment if the outcome observed is an
element of the set E.
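As a minimal sketch of these definitions, the die example from the introduction can be written with Python sets. The names `sample_space`, `even`, and `occurs` are illustrative choices, not part of the original text.

```python
# Sample space for one roll of an ordinary six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}

# An event is a subset of the sample space; here, "rolling an even number".
even = {2, 4, 6}
is_subset = even <= sample_space  # True: every outcome in the event is in S


def occurs(event, outcome):
    """Return True if the event occurs, i.e. the observed outcome is an element of it."""
    return outcome in event


print(occurs(even, 4))  # True
print(occurs(even, 3))  # False
```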

Example 1
Construct a sample space for the experiment that consists of tossing a single coin.
Solution:
The outcomes could be labeled h for heads and t for tails. Then the sample space is the set
S = {h, t}.

When selecting elements of a set, the number of possible outcomes depends on the conditions
under which the selection takes place. There are at least four rules for counting the number of
possible outcomes:

Multiplicative rule
Suppose you have j sets of elements, with n1 elements in the first set, n2 in the second set, ...,
and nj in the jth set. Suppose you wish to form a sample of j elements by taking one element
from each of the j sets. The number of possible samples is then
n1 × n2 × ... × nj.
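The multiplicative rule can be checked by brute force on a small case. The sketch below uses three made-up sets (the contents of `sets` are purely hypothetical) and compares the number of enumerated samples with the product of the set sizes.

```python
from itertools import product
from math import prod

# Hypothetical example: j = 3 sets with n1 = 2, n2 = 3, n3 = 2 elements.
sets = [["h", "t"], [1, 2, 3], ["red", "blue"]]

# Enumerate every sample formed by taking one element from each set.
samples = list(product(*sets))

# Multiplicative rule: n1 * n2 * ... * nj.
expected = prod(len(s) for s in sets)

print(len(samples), expected)  # 12 12
```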

Permutation rule
An arrangement of elements in a distinct order is called a permutation. Given a single set of n
distinct elements, you wish to select k elements from the n and arrange them
within k positions. The number of different permutations of the n elements taken k at a time is
denoted P_k^n and is equal to

P_k^n = n! / (n − k)!

Partitions rule
Suppose a single set of n distinct elements exists. You wish to partition them into
k sets, with the first set containing n1 elements, the second containing n2 elements, ..., and the
kth set containing nk elements. The number of different partitions is

n! / (n1! n2! ... nk!)

where n1 + n2 + ... + nk = n.

The numerator gives the number of permutations of the n elements. The terms in the denominator
remove the duplicates that arise from reordering elements within each of the k sets (multinomial coefficients).

Combinations rule
A sample of k elements is to be chosen from a set of n elements. The number of different
samples of k elements that can be selected from n is equal to

n! / (k! (n − k)!)

The combinations rule is a special case of the partitions rule with two sets, where n1 = k. From
n = n1 + n2 it follows that n2 can be replaced by (n − n1). Usually the two sets are the
selected and the non-selected elements. The order in which the n1 elements are
drawn is not important, so there are fewer combinations than permutations (binomial
coefficients).

Note: The factorial n! is defined by n! = 1 × 2 × 3 × ... × (n − 2) × (n − 1) × n, and 0! is defined as 1.
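The permutation, partitions, and combinations rules can be tried out with Python's standard `math` module. This is a sketch only: `partitions` is a helper name introduced here for illustration, and the numbers n = 5, k = 2 are arbitrary.

```python
from math import comb, factorial, perm


def partitions(n, sizes):
    """Number of ways to split n distinct elements into sets of the given sizes."""
    if sum(sizes) != n:
        raise ValueError("set sizes must sum to n")
    count = factorial(n)
    for size in sizes:
        count //= factorial(size)  # each division is exact
    return count


n, k = 5, 2
print(perm(n, k))                  # 20, i.e. 5! / (5 - 2)!
print(comb(n, k))                  # 10, i.e. 5! / (2! 3!)
print(partitions(n, [k, n - k]))   # 10, same as comb(n, k)
print(partitions(6, [3, 2, 1]))    # 60, i.e. 6! / (3! 2! 1!)
```

The third line illustrates the remark above that the combinations rule is the partitions rule with two sets.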

Rules of Probability

Probability Rule One

Our first rule simply reminds us of the basic property of probability that we’ve already learned.
The probability of an event, which informs us of the likelihood of it occurring, can range
anywhere from 0 (indicating that the event will never occur) to 1 (indicating that the event is
certain).

Probability Rule One:


For any event A, 0 ≤ P(A) ≤ 1.

EXAMPLE: Blood Types

As previously discussed, all human blood can be typed as O, A, B or AB.


In addition, the frequency of the occurrence of these blood types varies by ethnic and racial
groups.
According to Stanford University’s Blood Center (bloodcenter.stanford.edu), these are the
probabilities of human blood types in the United States (the probability for type A has been
omitted on purpose):

Blood type     O      A      B      AB
Probability    0.44   ?      0.10   0.04

Motivating question for rule 2: A person in the United States is chosen at random. What is the
probability of the person having blood type A?
Answer: Our intuition tells us that since the four blood types O, A, B, and AB exhaust all the
possibilities, their probabilities together must sum to 1, which is the probability of a “certain”
event (a person has one of these 4 blood types for certain).
Since the probabilities of O, B, and AB together sum to 0.44 + 0.10 + 0.04 = 0.58, the probability
of type A must be the remaining 1 – 0.58 = 0.42.

Probability Rule Two


This example illustrates our second rule, which tells us that the probability of all possible
outcomes together must be 1.

Probability Rule Two:


The sum of the probabilities of all possible outcomes is 1.
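The blood type calculation can be reproduced in a few lines. The sketch below uses `fractions.Fraction` (a choice made here to avoid floating-point rounding, not something the text prescribes) and also checks Rule One for each value.

```python
from fractions import Fraction

# Probabilities quoted in the text; type A is the unknown.
known = {"O": Fraction("0.44"), "B": Fraction("0.10"), "AB": Fraction("0.04")}

# Rule Two: all outcomes together have probability 1, so A gets the remainder.
p_a = 1 - sum(known.values())

# Rule One: every probability lies between 0 and 1.
all_valid = all(0 <= p <= 1 for p in [*known.values(), p_a])

print(float(p_a))  # 0.42
print(all_valid)   # True
```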

This is a good place to compare and contrast what we’re doing here with what we learned in
the Exploratory Data Analysis (EDA) section.
• Notice that in this problem we are essentially focusing on a single categorical variable:
blood type.
• We summarized this variable above, as we summarized single categorical variables in the
EDA section, by listing what values the variable takes and how often it takes them.
• In EDA we used percentages, and here we’re using probabilities, but the two convey the
same information.
• In the EDA section, we learned that a pie chart provides an appropriate display when a
single categorical variable is involved, and similarly we can use it here (using percentages
instead of probabilities).

Even though what we’re doing here is indeed similar to what we’ve done in the EDA section,
there is a subtle but important difference between the underlying situations:

• In EDA, we summarized data that were obtained from a sample of individuals for whom
values of the variable of interest were recorded.
• Here, when we present the probability of each blood type, we have in mind the
entire population of people in the United States, for which we are presuming to know the
overall frequency of values taken by the variable of interest.

Probability Rule Three


In probability and in its applications, we are frequently interested in finding out the probability
that a certain event will not occur.

An important point to understand here is that “event A does not occur” is a separate event that
consists of all the possible outcomes that are not in A and is called “the complement event of
A.”

Notation: we will write “not A” to denote the event that A does not occur. Here is a visual
representation of how event A and its complement event “not A” together represent all
possible outcomes.

Comment:
Such a visual display is called a “Venn diagram.” A Venn diagram is a simple way to visualize
events and the relationships between them using rectangles and circles.
Rule 3 deals with the relationship between the probability of an event and the probability of its
complement event.
Given that event A and event “not A” together make up all possible outcomes, and since rule 2
tells us that the sum of the probabilities of all possible outcomes is 1, the following rule should
be quite intuitive:

Probability Rule Three (The Complement Rule):

• P(not A) = 1 – P(A)
• That is, the probability that an event does not occur is 1 minus the probability that it does occur.

EXAMPLE: Blood Types


Back to the blood type example:

Blood type     O      A      B      AB
Probability    0.44   0.42   0.10   0.04

Here is some additional information:

• A person with type A can donate blood to a person with type A or AB.
• A person with type B can donate blood to a person with type B or AB.
• A person with type AB can donate blood to a person with type AB only.
• A person with type O blood can donate to anyone.

What is the probability that a randomly chosen person cannot donate blood to everyone? In
other words, what is the probability that a randomly chosen person does not have blood type
O? We need to find P(not O). Using the Complement Rule, P(not O) = 1 – P(O) = 1 – 0.44 = 0.56.
In other words, 56% of the U.S. population does not have blood type O.

Clearly, we could also find P(not O) directly by adding the probabilities of B, AB, and A.
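Both routes to P(not O) can be compared directly. This is a small sketch using the four blood type probabilities from the example; the helper name `complement` is introduced here for illustration.

```python
from fractions import Fraction

# Blood type probabilities from the example (A = 0.42 was derived earlier).
p = {
    "O": Fraction("0.44"),
    "A": Fraction("0.42"),
    "B": Fraction("0.10"),
    "AB": Fraction("0.04"),
}


def complement(p_event):
    """Complement Rule: P(not A) = 1 - P(A)."""
    return 1 - p_event


p_not_o = complement(p["O"])
direct = p["A"] + p["B"] + p["AB"]  # adding the other three types

print(float(p_not_o))  # 0.56
print(float(direct))   # 0.56
```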
Comment:
Note that the Complement Rule, P(not A) = 1 – P(A), can be re-formulated as P(A) = 1 – P(not A).
o This seemingly trivial algebraic manipulation has an important application, and actually
captures the strength of the complement rule.
o In some cases, when finding P(A) directly is very complicated, it might be much easier to
find P(not A) and then just subtract it from 1 to get the desired P(A).
o We will come back to this comment soon and provide additional examples.

Comments:
• The complement rule can be useful whenever it is easier to calculate the probability of the
complement of the event rather than the event itself.
• Notice, we again used the phrase “at least one.”
• Now we have seen that the complement of “at least one …” is “none …” or “no …” (as we
mentioned previously in terms of the events being “opposites”).
• In the above activity we see that
o P(NONE of these two side effects) = 1 – P(at least one of these two side effects)
• This is a common application of the complement rule which you can often recognize by the
phrase “at least one” in the problem.
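To make the “at least one” pattern concrete, here is a small sketch on a different, hypothetical experiment that is not taken from the text: three tosses of a coin, assuming all eight outcomes are equally likely. It finds P(at least one head) through the complement and confirms the result by direct counting.

```python
from fractions import Fraction
from itertools import product

# Hypothetical experiment: three coin tosses, outcomes assumed equally likely.
outcomes = list(product("ht", repeat=3))

# The complement of "at least one head" is "no heads".
no_heads = [o for o in outcomes if "h" not in o]
p_none = Fraction(len(no_heads), len(outcomes))

# Complement rule: P(at least one) = 1 - P(none).
p_at_least_one = 1 - p_none

# Direct count, for comparison.
direct = Fraction(sum(1 for o in outcomes if "h" in o), len(outcomes))

print(p_at_least_one)  # 7/8
print(direct)          # 7/8
```

Here the complement consists of a single outcome, so counting it is easier than counting the seven outcomes that contain a head.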

Probabilities Involving Multiple Events


We will often be interested in finding probabilities involving multiple events such as

• P(A or B) = P(event A occurs or event B occurs or both occur)
• P(A and B) = P(both event A occurs and event B occurs)

A common issue with terminology relates to how we usually think of “or” in our daily life. For
example, when a parent says to his or her child in a toy store “Do you want toy A or toy B?”,
this means that the child is going to get only one toy and he or she has to choose between
them. Getting both toys is usually not an option.
In contrast:

In probability, “OR” means either one or the other or both.


and so P(A or B) = P(event A occurs or event B occurs or BOTH occur)

Having said that, it should be noted that there are some cases where it is simply impossible for
the two events to both occur at the same time.

Probability Rule Four


The distinction between events that can happen together and those that cannot is an important
one.
Disjoint: Two events that cannot occur at the same time are called disjoint or mutually
exclusive. (We will use disjoint.)

It should be clear from the picture that

• in the first case, where the events are NOT disjoint, P(A and B) ≠ 0
• in the second case, where the events ARE disjoint, P(A and B) = 0.

EXAMPLE:
Consider the following two events:
A — a randomly chosen person has blood type A, and
B — a randomly chosen person has blood type B.
In rare cases, it is possible for a person to have more than one type of blood flowing through his
or her veins, but for our purposes, we are going to assume that each person can have only one
blood type. Therefore, it is impossible for the events A and B to occur together.
→ Events A and B are DISJOINT.

EXAMPLE:
Consider the following two events:
A — a randomly chosen person has blood type A
B — a randomly chosen person is a woman.
In this case, it is possible for events A and B to occur together.
→ Events A and B are NOT DISJOINT.

The Venn diagrams suggest that another way to think about disjoint versus not disjoint events
is that disjoint events do not overlap. They do not share any of the possible outcomes, and
therefore cannot happen together.
On the other hand, events that are not disjoint are overlapping in the sense that they share
some of the possible outcomes and therefore can occur at the same time.
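The overlap idea maps directly onto set operations. As a sketch, the die events below (chosen here for illustration, not taken from the blood type examples) show one disjoint pair and one overlapping pair.

```python
# Events for one roll of a six-sided die.
even = {2, 4, 6}
odd = {1, 3, 5}
greater_than_3 = {4, 5, 6}

# Disjoint events share no outcomes, so they cannot happen together.
print(even.isdisjoint(odd))             # True

# Events that are not disjoint share at least one outcome.
print(even.isdisjoint(greater_than_3))  # False
print(sorted(even & greater_than_3))    # [4, 6]
```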

We now begin with a simple rule for finding P(A or B) for disjoint events.

Probability Rule Four (The Addition Rule for Disjoint Events):

• If A and B are disjoint events, then P(A or B) = P(A) + P(B).

Comment:
• When dealing with probabilities, the word “or” will always be associated with the operation
of addition; hence the name of this rule, “The Addition Rule.”

EXAMPLE: Blood Types


Recall the blood type example. Here is some additional information:
• A person with type A can donate blood to a person with type A or AB.
• A person with type B can donate blood to a person with type B or AB.
• A person with type AB can donate blood to a person with type AB only.
• A person with type O blood can donate to anyone.

What is the probability that a randomly chosen person is a potential donor for a person with
blood type A?
From the information given, we know that being a potential donor for a person with blood type
A means having blood type A or O.

We therefore need to find P(A or O). Since the events A and O are disjoint, we can use the
addition rule for disjoint events to get:
• P(A or O) = P(A) + P(O) = 0.42 + 0.44 = 0.86.
It is easy to see why adding the probabilities makes sense.

If 42% of the population has blood type A and 44% of the population has blood type O,
then 42% + 44% = 86% of the population has either blood type A or O, and is thus a
potential donor to a person with blood type A.
This reasoning can be visualized with a pie chart of the four blood types, in which the A and O
slices together cover 86% of the circle.

Comment:
• The Addition Rule for Disjoint Events can naturally be extended to more than two disjoint
events. Let’s take three, for example. If A, B and C are three disjoint events,
then P(A or B or C) = P(A) + P(B) + P(C). The rule is the same for any number of disjoint events.
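The addition rule for disjoint events, including its extension to three events, can be sketched with the blood type probabilities. The helper `p_or_disjoint` is a name introduced here for illustration, and it is only valid when the events passed in really are disjoint.

```python
from fractions import Fraction

# Blood type probabilities from the example; the four types are disjoint events.
p = {
    "O": Fraction("0.44"),
    "A": Fraction("0.42"),
    "B": Fraction("0.10"),
    "AB": Fraction("0.04"),
}


def p_or_disjoint(*probs):
    """Addition Rule for disjoint events: add the individual probabilities."""
    return sum(probs)


# Potential donors for a person with type A: type A or type O.
p_a_or_o = p_or_disjoint(p["A"], p["O"])

# Three disjoint events: type A or B or AB.
p_a_or_b_or_ab = p_or_disjoint(p["A"], p["B"], p["AB"])

print(float(p_a_or_o))        # 0.86
print(float(p_a_or_b_or_ab))  # 0.56
```

The second result agrees with P(not O) = 0.56 obtained from the Complement Rule.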

We are now finished with the first version of the Addition Rule (Rule four) which is the version
restricted to disjoint events. Before covering the second version, we must first discuss P(A and
B).

Finding P(A and B) using Logic

We now turn to calculating

• P(A and B) = P(both event A occurs and event B occurs)

Later, we will discuss the rules for calculating P(A and B).
First, we want to illustrate that a rule is not needed whenever you can determine the answer
through logic and counting.
Special Case:
There is one special case for which we know what P(A and B) equals without applying any rule:
if events A and B are disjoint, then (by definition) P(A and B) = 0. But what if the events are
not disjoint?
Recall that rule 4, the Addition Rule, has two versions. One is restricted to disjoint events, which
we’ve already covered, and we’ll deal with the more general version later in this module. The
same will be true of probabilities involving AND.
However, except in special cases, we will rely on LOGIC to find P(A and B) in this course.
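As a small illustration of finding P(A and B) by logic and counting, consider one roll of a die, assuming all six outcomes are equally likely (a fair-die assumption made here for the sketch). The events and variable names are illustrative only.

```python
from fractions import Fraction

# One roll of a six-sided die; outcomes assumed equally likely.
sample_space = {1, 2, 3, 4, 5, 6}
a = {2, 4, 6}   # event A: the roll is even
b = {4, 5, 6}   # event B: the roll is greater than 3

# "A and B" consists of the outcomes that belong to both events.
a_and_b = a & b

# With equally likely outcomes, probability = favorable count / total count.
p_a_and_b = Fraction(len(a_and_b), len(sample_space))

print(sorted(a_and_b))  # [4, 6]
print(p_a_and_b)        # 1/3
```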
Before covering any formal rules, let’s look at an example where the events are not disjoint.
EXAMPLE: Periodontal Status and Gender
Consider the following table regarding the periodontal status of individuals and their gender.
Periodontal status refers to gum disease; individuals are classified as healthy, as having
gingivitis, or as having periodontal disease.

We have seen this type of table before when we discussed analyzing data in case C → C. For the
purpose of this question, we will use this data as our “population” and consider randomly
selecting one person.
