Module 4 Probability Basics - Filled

The document provides an overview of probability basics and statistical methods, emphasizing the importance of random sampling and the concept of sampling variability. It introduces key terminology, basic probability rules, and examples of probability calculations, including conditional probability and independence. The document also discusses different types of random variables and their probability distributions.

Uploaded by

maddyrunway
Copyright
© © All Rights Reserved
Probability Basics

Introduction to Statistical Methods


STAT 2331
Looking Back & Moving Forward
Suppose we want to answer some question about a population
Example: what proportion of all adults believe in Darwinian evolution?

We can use the results of a sample to infer about the population


Example: a poll of 1005 randomly sampled adults found 12% believed in Darwinian evolution

However, it is possible for our estimate to be wrong


Example: a second poll of 1005 randomly sampled adults found 32% believed in Darwinian evolution

Random sampling helps eliminate bias, but estimates based on samples can still be wrong because of sampling variability: two samples from the same population will generally yield different results

If the sampling variability is too large, then we cannot trust the results of any one sample
We need probability to help us understand and express how samples behave

Probability Terminology
• Random Phenomenon: situation involving chance, leading to results called outcomes
• Examples: getting heads or tails on a coin toss; whether school is cancelled on a particular day
• Unpredictable in the short run but regular and predictable over the long run
• Probability: proportion of times an outcome occurs in infinitely repeated trials
• Example: snow chance of 80% implies it will snow on 80% of all days with similar weather conditions
• In practice, we have only a finite number of repeated trials
• Law of Large Numbers: the larger the sample size (number of trials), the closer the observed sample probability will be to the unknown theoretical probability
• Sample Space, S
• Set of possible outcomes for a random phenomenon
• Example: What is the sample space for flipping a coin 3 times?
S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
• Notation: P(A) represents the probability that event A occurs
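The sample space for three coin flips and the Law of Large Numbers can both be checked with a short simulation. This is a minimal sketch, not part of the original slides: the function name `proportion_heads`, the seed, and the trial counts are arbitrary choices, and the coin is assumed to be fair.

```python
import itertools
import random

# Sample space for 3 coin flips: every ordered sequence of H and T.
sample_space = ["".join(flips) for flips in itertools.product("HT", repeat=3)]
print(len(sample_space), sample_space)

def proportion_heads(n_flips, rng):
    """Observed proportion of heads in n_flips simulated fair-coin tosses."""
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips

rng = random.Random(2331)  # arbitrary fixed seed so the run is repeatable
for n in (10, 100, 10_000, 100_000):
    # The observed proportion tends to settle near 0.5 as n grows.
    print(n, proportion_heads(n, rng))
```

With few flips the observed proportion can be far from 0.5; with many flips it is typically very close, which is the behavior the Law of Large Numbers describes.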
Example 1
An observational study was conducted to evaluate the long-term complications in diabetic patients treated under two different treatment regimens. The researcher took a random sample of 200 patients and recorded which treatment each patient used as well as whether they had experienced complications of foot, eye, or cardiovascular disease.

Treatment      Complications   No Complications   Total
Treatment 1    11              77                 88
Treatment 2    9               103                112
Total          20              180                200

• Outcomes in the sample space:
• Estimate the probabilities of the following events:
• A = patient used Treatment 1
• B = patient has experienced complications
• C = patient used Treatment 2 and has not experienced complications
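The three estimates in Example 1 are relative frequencies read from the table. The sketch below shows one way to compute them; the event labels A, B, and C and the helper `prob` are assumptions introduced here, since the extracted slide lost its own symbols.

```python
# Cell counts from the diabetes study table: (treatment, outcome) -> patients.
counts = {
    ("Treatment 1", "Complications"): 11,
    ("Treatment 1", "No complications"): 77,
    ("Treatment 2", "Complications"): 9,
    ("Treatment 2", "No complications"): 103,
}
n = sum(counts.values())  # 200 patients in total

def prob(event):
    """Estimated probability: share of patients whose cell satisfies `event`."""
    return sum(c for cell, c in counts.items() if event(cell)) / n

p_A = prob(lambda cell: cell[0] == "Treatment 1")    # 88 / 200
p_B = prob(lambda cell: cell[1] == "Complications")  # 20 / 200
p_C = prob(lambda cell: cell == ("Treatment 2", "No complications"))  # 103 / 200
print(p_A, p_B, p_C)
```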
Basic Probability Definitions
• Union of two events A and B
• Denoted A ∪ B
• The event that either A or B (or both) occurs
• Intersection of two events A and B
• Denoted A ∩ B
• The event that both A and B happen at the same time
• Mutually exclusive / disjoint events
• Two events that cannot happen at the same time
• They have no intersection
• Complement of event A
• Denoted A^c
• The event that A does not occur
Basic Probability Rules
• For any event A, 0 ≤ P(A) ≤ 1
• P(A) = 0 for a 0% chance of event A
• P(A) = 1 for a 100% chance of event A
• If S is the sample space, then P(S) = 1
• Complement Rule:
• The complement of event A is its opposite, denoted A^c
• For any event A, P(A^c) = 1 − P(A)
• Addition Rule:
• In general, P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
• If A and B are mutually exclusive, then P(A ∪ B) = P(A) + P(B)
• Why is this true?
• Because in this case P(A ∩ B) = 0
Example 2
Recall the results of the diabetes study:

Treatment      Complications   No Complications   Total
Treatment 1    11              77                 88
Treatment 2    9               103                112
Total          20              180                200

• A = patient used Treatment 1
• B = patient has experienced complications
• C = patient used Treatment 2 and has not experienced complications
Estimate the following probabilities:

• What is another way to compute ?


• Read it straight from the table:

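The addition and complement rules can be applied to the diabetes table with exact fractions. This is a hedged sketch: the extracted slide does not preserve which probabilities it asked for, so the quantities below are illustrative, and the event labels A (Treatment 1), B (complications), and C (Treatment 2 with no complications) are assumed.

```python
from fractions import Fraction

n = 200                      # total patients in the table
p_A = Fraction(88, n)        # used Treatment 1
p_B = Fraction(20, n)        # experienced complications
p_A_and_B = Fraction(11, n)  # Treatment 1 and complications

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B)
p_A_or_B = p_A + p_B - p_A_and_B

# C (Treatment 2, no complications) is "neither A nor B",
# so the complement rule gives P(C) = 1 - P(A or B).
p_C = 1 - p_A_or_B
print(p_A_or_B, p_C)
```

The complement-rule value agrees with the count read straight from the table, 103 out of 200.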
Conditional Probability

• In conditional probability calculations, the value of one variable or the outcome of one trial is known
• This restricts the sample space and reduces the total number of possible outcomes
• In other words, it changes the denominator of the fraction
• When P(B) > 0, the conditional probability of A given B is: P(A | B) = P(A ∩ B) / P(B)
• Multiplication Rule:
• In general, P(A ∩ B) = P(A | B) × P(B)
Example 3

The table below displays information regarding the 80.2 million long-form federal returns
received by the IRS one year. It cross-tabulates the taxpayer’s income level and whether
they were audited. For simplicity, frequencies are reported in thousands and rounded.

• Find the following conditional probabilities by reading them directly from the table:

Income Level          Audited   Not Audited   Total
Under $25,000         90        14010         14100
$25,000 to $49,999    71        30629         30700
$50,000 to $99,999    69        24631         24700
$100,000 or more      80        10620         10700
Total                 310       79890         80200

Source: Statistics: The Art and Science of Learning from Data
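Conditioning changes the denominator: a row total when conditioning on income level, the audited column total when conditioning on being audited. The sketch below illustrates both directions; the specific probabilities the slide asked for are not preserved in the source, so the two chosen here and the helper names are assumptions.

```python
# Counts in thousands: income level -> (audited, not audited).
returns = {
    "Under $25,000": (90, 14010),
    "$25,000 to $49,999": (71, 30629),
    "$50,000 to $99,999": (69, 24631),
    "$100,000 or more": (80, 10620),
}
total_audited = sum(audited for audited, _ in returns.values())  # 310

def p_audited_given_level(level):
    """P(audited | income level): restrict the sample space to one row."""
    audited, not_audited = returns[level]
    return audited / (audited + not_audited)

def p_level_given_audited(level):
    """P(income level | audited): restrict the sample space to one column."""
    return returns[level][0] / total_audited

print(p_audited_given_level("$100,000 or more"))  # 80 / 10700
print(p_level_given_audited("$100,000 or more"))  # 80 / 310
```

The two results differ widely because the denominators differ, which is why the order of conditioning matters.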


Example 4
The chance of having a baby with Down syndrome increases after a woman is 35 years old. A study used a sample of 5282 pregnant women aged 35 or over to test the accuracy of the Triple Blood Test (TBT), which screens for the condition.

Down Syndrome Status   Pos    Neg    Total
Down                   48     6      54
Unaffected             1307   3921   5228
Total                  1355   3927   5282

• Two possible errors with diagnostic tests:
• False positive: positive test, but condition is absent
• False negative: negative test, but condition is present
• Given a positive test, what is the probability the baby will have Down syndrome?
• This is a true positive
• Given a negative test, what is the probability the baby will have Down syndrome?
• This is a false negative
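Both questions in Example 4 condition on the test result, so the denominator is the column total for that result. A minimal sketch follows; the dictionary layout and the function name `p_condition_given_test` are illustrative choices, not part of the slides.

```python
# Triple Blood Test counts: (condition, test result) -> number of women.
tbt = {
    ("Down", "Pos"): 48,
    ("Down", "Neg"): 6,
    ("Unaffected", "Pos"): 1307,
    ("Unaffected", "Neg"): 3921,
}

def p_condition_given_test(condition, result):
    """P(condition | test result): divide the cell by its test-result column total."""
    column_total = sum(c for (_, r), c in tbt.items() if r == result)
    return tbt[(condition, result)] / column_total

p_down_given_pos = p_condition_given_test("Down", "Pos")  # 48 / 1355
p_down_given_neg = p_condition_given_test("Down", "Neg")  # 6 / 3927
print(round(p_down_given_pos, 4), round(p_down_given_neg, 4))
```

Even after a positive test the probability of Down syndrome stays small, because unaffected pregnancies vastly outnumber affected ones in the sample.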
Independence
• Suppose we have a population of 100 subjects, of which 5 are unusual. If we randomly sample 3 subjects without replacement, what is the probability all 3 will be unusual?
• These trials are dependent on each other
• Trials are independent if the outcome of one does not influence the outcome of another
• Mathematically, two events A and B are independent if P(A | B) = P(A)
• Conditioning on B has no impact on the probability of A
• Multiplication Rule:
• If A and B are independent, then P(A ∩ B) = P(A) × P(B)
• Why is this true?
• General multiplication rule is P(A ∩ B) = P(A | B) × P(B), and in this case P(A | B) = P(A)
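The sampling question above can be worked with the general multiplication rule: each draw changes both the number of unusual subjects left and the population size. The sketch below uses exact fractions; the function name `p_all_unusual` is an illustrative choice, and the with-replacement figure is included only as a contrast with the independent case.

```python
from fractions import Fraction

def p_all_unusual(population, unusual, draws):
    """P(every draw is unusual) when sampling without replacement."""
    p = Fraction(1)
    for k in range(draws):
        # After k unusual subjects are removed, (unusual - k) remain
        # among (population - k) subjects.
        p *= Fraction(unusual - k, population - k)
    return p

without_replacement = p_all_unusual(100, 5, 3)  # 5/100 * 4/99 * 3/98
with_replacement = Fraction(5, 100) ** 3        # independent draws
print(without_replacement, with_replacement)
```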
Example 5
Gender   Blue Eyes   Other Color Eyes   Total
Female   3           12                 15
Male     4           16                 20
Total    7           28                 35

Let A = blue eyes and B = male. Are the events A and B independent?
• Yes, the events are independent because P(A | B) = 4/20 = 0.20 = 7/35 = P(A)

Gender   Color Blind   Not Color Blind   Total
Female   1             14                15
Male     3             17                20
Total    4             31                35

Are the variables Gender and Color Blind independent?
• We need P(row ∩ column) = P(row) × P(column) in each of the four cells for independence!
• No, gender and color blindness are dependent because P(color blind | male) = 3/20 = 0.15 ≠ 4/35 ≈ 0.114 = P(color blind)
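The cell-by-cell independence check in Example 5 can be automated. In this sketch the function name `is_independent` is hypothetical, and the comparison is done on integer counts (cell × grand total against row total × column total) to avoid rounding issues; that is algebraically the same as comparing the joint probability with the product of the marginals.

```python
def is_independent(table):
    """True if every cell satisfies P(row and col) = P(row) * P(col).

    `table` is a list of rows of counts, without the totals.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    return all(
        table[i][j] * n == row_totals[i] * col_totals[j]
        for i in range(len(table))
        for j in range(len(table[0]))
    )

eye_color = [[3, 12], [4, 16]]    # rows: Female, Male; cols: Blue, Other
color_blind = [[1, 14], [3, 17]]  # rows: Female, Male; cols: Blind, Not blind

print(is_independent(eye_color), is_independent(color_blind))
```

With real sample data an exact match is rare, so an exact check like this suits textbook tables rather than noisy observations.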
Tree Diagrams
• Probability often requires combining several basic rules into elaborate calculations. Tree diagrams help us visualize and simplify the math.

Start
  P(A) → A
    P(B | A) → B:        P(A ∩ B) = P(A) × P(B | A)
    P(B^c | A) → B^c:    P(A ∩ B^c) = P(A) × P(B^c | A)
  P(A^c) → A^c
    P(B | A^c) → B:      P(A^c ∩ B) = P(A^c) × P(B | A^c)
    P(B^c | A^c) → B^c:  P(A^c ∩ B^c) = P(A^c) × P(B^c | A^c)

• The sum of the probabilities emanating from any branch point is 1
• The final outcomes are disjoint
• To find the probability for a final outcome, multiply across that branch
Example 6
The following tree diagram shows the probabilities of skin cancer among men and women by body locations:

Individuals with skin cancer
  0.15 → Head
    0.44 → Man
    0.56 → Woman
  0.41 → Trunk
    0.63 → Man
    0.37 → Woman
  0.44 → Limbs
    0.20 → Man
    0.80 → Woman

• Calculate
• Calculate
• Calculate
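Typical calculations on the skin cancer tree multiply along a branch, add across disjoint final outcomes, and then divide to reverse the conditioning. The slide's three "Calculate" targets are not preserved in the source, so the quantities below are assumed examples rather than the original questions.

```python
# First-level branches: P(location) among individuals with skin cancer.
p_location = {"Head": 0.15, "Trunk": 0.41, "Limbs": 0.44}
# Second-level branches: P(man | location).
p_man_given_location = {"Head": 0.44, "Trunk": 0.63, "Limbs": 0.20}

# Multiply across each branch: P(location and man).
p_location_and_man = {
    loc: p_location[loc] * p_man_given_location[loc] for loc in p_location
}

# Final outcomes are disjoint, so add them: P(man).
p_man = sum(p_location_and_man.values())

# Reverse the conditioning: P(head | man) = P(head and man) / P(man).
p_head_given_man = p_location_and_man["Head"] / p_man

print(round(p_location_and_man["Head"], 4), round(p_man, 4), round(p_head_given_man, 4))
```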
Probability Rules for Random Variables
• A random variable has values that represent numerical outcomes of a random phenomenon
• Capital letters such as X refer to the variable itself
• Lowercase letters such as x refer to possible values of the variable
• Two types of random variables:
• Discrete
• Continuous
• The probability distribution of a random variable tells us what values are possible and their associated probabilities

Let a and b be numbers with a < b
• by the addition rule for disjoint events
•
• by the addition rule for disjoint events
• by the complement rule
• by the complement & addition rules
Probability Distributions for Discrete Random Variables
• Discrete random variables have a countable (often finite) list of possible outcomes
• A probability model lists all possible outcomes along with their associated probabilities

Value of X     x_1   x_2   x_3   …   x_k
Probability    p_1   p_2   p_3   …   p_k

• Where each p_i is between 0 and 1 and p_1 + p_2 + … + p_k = 1
• Find the probability for any event by adding probabilities of the individual outcomes that make up the event
• Mean (aka expected value) of X
• Multiply each possible value by its probability, then add all the products: μ_X = x_1 p_1 + x_2 p_2 + … + x_k p_k = Σ x_i p_i
• Variance of X
• Subtract the mean from each possible value, square the result, multiply by the corresponding probability, then add all the products: σ²_X = (x_1 − μ_X)² p_1 + … + (x_k − μ_X)² p_k = Σ (x_i − μ_X)² p_i
Example 7
Suppose we have a population of 15 people with the following ages. What is the probability distribution for X, the age of a randomly chosen person?

13, 14, 15, 16, 16, 17, 17, 17, 18, 18, 18, 18, 18, 18, 18

Age x          13     14     15     16     17     18
Probability    1/15   1/15   1/15   2/15   3/15   7/15
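Because each of the 15 people is equally likely to be chosen, the probability of an age is its count divided by 15. A short sketch of that tabulation, using exact fractions (the variable names are illustrative):

```python
from collections import Counter
from fractions import Fraction

ages = [13, 14, 15, 16, 16, 17, 17, 17, 18, 18, 18, 18, 18, 18, 18]

# P(X = age) = (number of people with that age) / (population size).
# Fraction reduces automatically, so 3/15 is stored as 1/5.
distribution = {
    age: Fraction(count, len(ages))
    for age, count in sorted(Counter(ages).items())
}

for age, p in distribution.items():
    print(age, p)
```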
Example 8
A study examined hearing impairment in 5333 Dalmatians, since pure dog breeds are often inbred with high numbers of congenital defects. Let X = the number of ears impaired in a randomly chosen Dalmatian.

x              0      1      2
Probability    0.70   0.22   0.08

• What is the mean of X?
• What is the variance of X?
• What is the standard deviation of X?
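The mean and variance recipes for a discrete random variable translate directly into code. A minimal sketch applied to the Dalmatian distribution (variable names are illustrative):

```python
import math

# Probability distribution of X = number of impaired ears.
distribution = {0: 0.70, 1: 0.22, 2: 0.08}

# Mean: multiply each value by its probability, then add the products.
mean = sum(x * p for x, p in distribution.items())

# Variance: squared deviation from the mean, weighted by probability.
variance = sum((x - mean) ** 2 * p for x, p in distribution.items())

# Standard deviation: square root of the variance.
sd = math.sqrt(variance)

print(round(mean, 4), round(variance, 4), round(sd, 4))
```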
Probability Distributions for Continuous Random Variables
• Continuous random variables can take on any value in an interval
• Since possible outcomes are infinite, we calculate probabilities by finding areas under a density curve
• A density curve:
• Is fitted to the irregular bars of a histogram
• Describes the overall pattern of a distribution
• Is always on or above the horizontal axis
• Has total area exactly 1 underneath it
• Probabilities are assigned to intervals of values rather than individual values
• In fact, if X is continuous and a is any constant, P(X = a) = 0
Example 9
Suppose we have a uniform distribution over the interval from 0 to 5

• What is the height of the distribution?


• For , we must have

• Or use the complement rule:


[Figure: uniform density curve over the interval from 0 to 5, with 1 and 3 marked on the horizontal axis]
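For a uniform density, every probability is a rectangle area: width times height, with the height fixed so the total area is 1. The sketch below assumes the interval of interest runs from 1 to 3 (suggested by the axis marks that survive in the source, but not confirmed by it), and the function name `uniform_prob` is hypothetical.

```python
def uniform_prob(lo, hi, a=0.0, b=5.0):
    """P(lo <= X <= hi) for X uniform on [a, b], as a rectangle area."""
    height = 1 / (b - a)                       # total area must equal 1
    width = max(0.0, min(hi, b) - max(lo, a))  # clip the interval to [a, b]
    return width * height

height = 1 / (5 - 0)                  # density height on [0, 5]
p_between = uniform_prob(1, 3)        # area of the rectangle over [1, 3]
p_outside = 1 - p_between             # complement rule
p_outside_direct = uniform_prob(0, 1) + uniform_prob(3, 5)  # addition rule

print(height, p_between, p_outside, p_outside_direct)
```

The complement rule and the direct sum of the two outer rectangles give the same value, as they should for disjoint intervals.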
Rules for Means and Variances of Random Variables
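• If X is a random variable and a and b are fixed numbers, then μ_{a+bX} = a + b μ_X and σ²_{a+bX} = b² σ²_X
• If X and Y are random variables with correlation ρ, then
μ_{X+Y} = μ_X + μ_Y and σ²_{X+Y} = σ²_X + σ²_Y + 2ρ σ_X σ_Y
μ_{X−Y} = μ_X − μ_Y and σ²_{X−Y} = σ²_X + σ²_Y − 2ρ σ_X σ_Y
• If X and Y are independent, then ρ = 0 and σ²_{X±Y} = σ²_X + σ²_Y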
Example 10
Suppose a family has natural gas heat and electric air conditioning. In the summer, their electric bill is high and gas bill is low; in the winter, the opposite is true. The correlation between the bills is -0.67. Suppose the mean electric bill is $121 with a standard deviation of $22, and the mean gas bill is $56 with a standard deviation of $19.

• Find the mean and standard deviation for the total of these two bills
• Mean

• Variance

• Standard deviation
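Example 10 applies the rules for the sum of two correlated random variables: means add, and the variance of the sum picks up a 2ρσ_Xσ_Y term. A minimal sketch with the figures given in the example (variable names are illustrative):

```python
import math

mean_electric, sd_electric = 121.0, 22.0
mean_gas, sd_gas = 56.0, 19.0
rho = -0.67  # correlation between the two bills

# Means add regardless of correlation.
mean_total = mean_electric + mean_gas

# Variance of a sum: var_X + var_Y + 2 * rho * sd_X * sd_Y.
var_total = sd_electric**2 + sd_gas**2 + 2 * rho * sd_electric * sd_gas

# Standard deviations do not add; take the square root of the variance.
sd_total = math.sqrt(var_total)

print(mean_total, round(var_total, 2), round(sd_total, 2))
```

Because the correlation is negative, the total bill varies less than either a naive sum of standard deviations or the independent-case variance would suggest.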
Important Points
• Law of Large Numbers: the larger the sample size, the closer the sample probability will be to the theoretical probability
• Basic probability rules for events A and B, and sample space S
• 0 ≤ P(A) ≤ 1 and P(S) = 1
• Complement rule: P(A^c) = 1 − P(A)
• Addition rule: P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
• If A and B are mutually exclusive, P(A ∪ B) = P(A) + P(B)
• Multiplication rule: P(A ∩ B) = P(A | B) × P(B)
• The conditional probability of A given B is P(A | B) = P(A ∩ B) / P(B)
• If A and B are independent, P(A ∩ B) = P(A) × P(B)
• Tree diagrams simplify complex probability calculations
• Two types of random variables:
• Discrete: find probabilities by adding
• Continuous: find probabilities by computing areas under density curves
• When combining two independent random variables, add their variances; if they are correlated, also include the ±2ρ σ_X σ_Y term