Seminar Slides Week 2 - With Solutions - Fullpage
Week 2
General Probability & Normal Distribution
Charanjit Kaur
Learning Outcomes for Week 2
• Understand the concept of probability
• Calculate and interpret probabilities:
• Marginal probability
• Joint probability
• Conditional probability
• Evaluate the relationship between variables
• Understand the concept of a continuous random variable
• Identify the characteristics of a normal distribution
• Find probabilities using Excel
What is PROBABILITY?
• Probability = quantification of chance / likelihood of an occurrence
Probability statements:
✓ Event A has zero probability P(A) = 0 → There is no chance that event A will occur
✓ Event B has probability of one P(B)=1 → Event B will occur with 100% certainty
Union
• P(A or B), also written P(A ∪ B), is the probability that either A or B (or both) occurs
• P(A or B) = P(A) + P(B) − P(A and B)
• Note: P(A) and P(B) are referred to as marginal probabilities
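As a minimal sketch of the addition rule, the snippet below plugs in the percentages from the Heart Disease solutions table in these slides, taking A = "no heart disease" and B = "moderate exercise". The variable names are illustrative only.

```python
# Probabilities read from the Heart Disease solutions table in these slides
p_a = 0.9474        # P(A): no heart disease (marginal probability)
p_b = 0.3568        # P(B): moderate exercise (marginal probability)
p_a_and_b = 0.3482  # P(A and B): no heart disease and moderate exercise (joint)

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B)
# The joint term is subtracted because it is counted in both marginals.
p_a_or_b = p_a + p_b - p_a_and_b

print(f"P(A or B) = {p_a_or_b:.4f}")  # prints P(A or B) = 0.9560
```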
Conditional Probability
The probability of an event A given that (conditional on) another event B has occurred.
P(A given B), written P(A|B), is known as the conditional probability
P(A|B) = P(A ∩ B) / P(B)
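A short sketch of the conditional probability calculation, using figures from the Heart Disease solutions table in these slides (A = "no heart disease", B = "moderate exercise"). The helper name `conditional_probability` is hypothetical and only there for illustration.

```python
def conditional_probability(p_joint, p_condition):
    """Return P(A|B) = P(A and B) / P(B); undefined when P(B) is zero."""
    if p_condition == 0:
        raise ValueError("P(B) must be greater than zero")
    return p_joint / p_condition

p_a_and_b = 0.3482  # P(A and B): no heart disease and moderate exercise
p_b = 0.3568        # P(B): moderate exercise

p_a_given_b = conditional_probability(p_a_and_b, p_b)
print(f"P(A|B) = {p_a_given_b:.4f}")  # prints P(A|B) = 0.9759
```

Read in words: among people who do a moderate amount of exercise, about 97.6% have no heart disease.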
Mutually Exclusive and Independent Events
Mutually exclusive:
We know that P(A or B) = P(A) + P(B) – P(A and B). For mutually exclusive events, P(A and B) = 0, so P(A or B) = P(A) + P(B)
Independent:
• Definition: The probability of one event occurring is unaffected by the outcome of the other
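Both ideas can be checked numerically. The sketch below uses the figures from the Heart Disease solutions table in these slides (A = "no heart disease", B = "moderate exercise") and relies on the standard result that independent events satisfy P(A and B) = P(A) × P(B). With sample data the two sides rarely match exactly, so treat the comparison as descriptive rather than as a formal test.

```python
p_a = 0.9474        # P(A): no heart disease
p_b = 0.3568        # P(B): moderate exercise
p_a_and_b = 0.3482  # P(A and B): joint probability from the table

# Mutually exclusive events can never occur together, so their joint
# probability would be zero.
mutually_exclusive = (p_a_and_b == 0)

# Under independence the joint probability equals the product of marginals.
product = p_a * p_b

print(f"Mutually exclusive? {mutually_exclusive}")  # False
print(f"P(A) * P(B) = {product:.4f}")
print(f"P(A and B)  = {p_a_and_b:.4f}")
print(f"Difference  = {p_a_and_b - product:.4f}")
```

A non-zero joint probability rules out mutual exclusivity. The gap between the joint probability and the product of the marginals gives a rough sense of how far the data are from independence.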
Open the “Heart Disease” sheet in Week 2 Seminar Data.xlsx. It contains data on the diagnosis
• Exercise level
• Medical condition
Medical Condition
1. Place the cursor in any cell within the data
2. Drag & drop Heart Disease into both (1) Rows and (2) Values.
3. Right-click on the Heart Disease entries > Value Field Settings
4. Summarize Values By > Count
5. Show Values As > % of Column Total
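For readers who want to see the same calculation outside Excel, here is a rough Python equivalent of the steps above: count each category, then divide by the total. The list `heart_disease` is made-up illustrative data, not the contents of the seminar workbook.

```python
from collections import Counter

# Hypothetical diagnoses (NOT the seminar data): one "Yes"/"No" per person
heart_disease = ["No", "No", "Yes", "No", "No", "No", "Yes", "No", "No", "No"]

# Step 4 equivalent: count the number of people in each category
counts = Counter(heart_disease)
total = sum(counts.values())

# Step 5 equivalent: express each count as a share of the column total
marginal = {category: n / total for category, n in counts.items()}

for category in sorted(marginal):
    print(f"{category}: {marginal[category]:.2%}")
```

The resulting shares are the marginal probabilities of each diagnosis category.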
Calculating Joint Probabilities
Construct a two-way frequency table (contingency table/Pivot table) by placing
Convert to probabilities
Probabilities Exercise Level
Heart Disease Minimal Moderate Grand Total
No
Yes
Grand Total
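The count-then-convert procedure can also be sketched in Python. The example below builds a two-way frequency table from made-up (heart disease, exercise level) pairs, then divides every count by the grand total to obtain joint probabilities. The records are hypothetical and will not reproduce the seminar figures.

```python
from collections import Counter

# Hypothetical (heart disease, exercise level) pairs, NOT the seminar data
records = [
    ("No", "Minimal"), ("No", "Minimal"), ("No", "Minimal"), ("No", "Minimal"),
    ("No", "Moderate"), ("No", "Moderate"), ("No", "Moderate"),
    ("Yes", "Minimal"), ("Yes", "Minimal"), ("Yes", "Moderate"),
]

counts = Counter(records)  # two-way frequency table: {(row, column): count}
n = len(records)           # grand total

# Convert counts to joint probabilities
joint = {cell: c / n for cell, c in counts.items()}

# Marginal probabilities are the row and column sums of the joint table
rows = sorted({r for r, _ in joint})
cols = sorted({c for _, c in joint})
row_tot = {r: sum(joint.get((r, c), 0.0) for c in cols) for r in rows}
col_tot = {c: sum(joint.get((r, c), 0.0) for r in rows) for c in cols}

# Print in the same layout as the pivot table
print(f"{'':12}" + "".join(f"{c:>12}" for c in cols) + f"{'Grand Total':>12}")
for r in rows:
    cells = "".join(f"{joint.get((r, c), 0.0):>12.2%}" for c in cols)
    print(f"{r:12}{cells}{row_tot[r]:>12.2%}")
footer = "".join(f"{col_tot[c]:>12.2%}" for c in cols)
print(f"{'Grand Total':12}{footer}{sum(row_tot.values()):>12.2%}")
```

The inner cells are joint probabilities, while the Grand Total row and column hold the marginal probabilities.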
Calculating Joint Probabilities: Manual Calculations
Construct a two-way frequency table (contingency table/Pivot table) by placing
1. If you choose someone at random from the population, what is the probability that they
• have no heart disease
• do a moderate amount of exercise
• do a moderate amount of exercise and have no heart disease
• do a moderate amount of exercise or have no heart disease
2. Are Heart Disease and Exercise levels mutually exclusive events?
Solutions for Question 1
Count of Heart Disease   Exercise Level
Heart Disease            Minimal    Moderate   Grand Total
No                       59.92%     34.82%     94.74%
Yes                      4.40%      0.86%      5.26%
Grand Total              64.32%     35.68%     100.00%
1. If you choose someone at random from the population, what is the probability that they
• have no heart disease
• P(No Heart Disease) = 94.74%
Probability Distribution
• The histogram is a picture of the probability density in crude form
• Probability density describes how “dense” the distribution is over a data range
• Allows us to calculate the probability related to the variable of interest
• In statistics, we use a smooth mathematical function to model the probability density function (pdf)
• The area under the curve represents the probability
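To make the "area equals probability" idea concrete, the sketch below approximates the area under a density curve with the trapezoid rule and compares it with the probability from the cumulative distribution function. It uses the standard normal density from Python's standard library as an illustrative choice. The helper `area_under_pdf` is a hypothetical name.

```python
from statistics import NormalDist

def area_under_pdf(dist, a, b, steps=10_000):
    """Approximate the area under dist.pdf between a and b (trapezoid rule)."""
    width = (b - a) / steps
    total = 0.5 * (dist.pdf(a) + dist.pdf(b))
    for i in range(1, steps):
        total += dist.pdf(a + i * width)
    return total * width

z = NormalDist(mu=0, sigma=1)  # illustrative density: mean 0, stdev 1

approx = area_under_pdf(z, -1, 1)  # area under the curve from -1 to 1
exact = z.cdf(1) - z.cdf(-1)       # P(-1 < Z < 1) from the cdf

print(f"Area by trapezoid rule: {approx:.4f}")  # 0.6827
print(f"P(-1 < Z < 1) from cdf: {exact:.4f}")   # 0.6827
```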
Probability Distribution: Normal Distribution
• The most common distribution in statistics is the normal distribution
• It is a symmetric (bell-shaped) distribution
• The normal distribution is fully described by two parameters: the mean and the standard deviation (Stdev)
Normal Distribution
• Notation:
X ~ N(Mean, Stdev)
X ~ N(μ, σ)
• Skewness = 0
• Mean = Median = Mode
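These properties can be illustrated with Python's standard library. The values μ = 50 and σ = 10 below are arbitrary choices for the example and do not come from the seminar data.

```python
from statistics import NormalDist

x = NormalDist(mu=50, sigma=10)  # hypothetical X ~ N(50, 10)

# For a normal distribution the mean, median and mode coincide
print(x.mean, x.median, x.mode)

# Symmetry (skewness = 0): equal tail areas at equal distances from the mean
left_tail = x.cdf(40)        # P(X < mean - 10)
right_tail = 1 - x.cdf(60)   # P(X > mean + 10)
print(f"Left tail:  {left_tail:.4f}")
print(f"Right tail: {right_tail:.4f}")
```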
STANDARD Normal Distribution
• A special case: STANDARD normal distribution
• Mean = 0 and Stdev = 1
• Used in statistics to assess statistical uncertainty
Question 1 P(Z<0)
Question 4 P(Z>-1)
Question 5 P(Z<2)
Question 6 P(Z>2)
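The four probabilities above can be checked with a short Python sketch based on the standard library's `NormalDist`. The variable names are illustrative. In Excel, the cumulative probability P(Z < z) is given by NORM.S.DIST(z, TRUE), and an upper-tail probability is one minus that value.

```python
from statistics import NormalDist

z = NormalDist(mu=0, sigma=1)  # standard normal: mean 0, stdev 1

p_q1 = z.cdf(0)       # Question 1: P(Z < 0)
p_q4 = 1 - z.cdf(-1)  # Question 4: P(Z > -1) = 1 - P(Z < -1)
p_q5 = z.cdf(2)       # Question 5: P(Z < 2)
p_q6 = 1 - z.cdf(2)   # Question 6: P(Z > 2) = 1 - P(Z < 2)

print(f"P(Z < 0)  = {p_q1:.4f}")  # 0.5000
print(f"P(Z > -1) = {p_q4:.4f}")  # 0.8413
print(f"P(Z < 2)  = {p_q5:.4f}")  # 0.9772
print(f"P(Z > 2)  = {p_q6:.4f}")  # 0.0228
```

Questions 5 and 6 are complements, so their probabilities sum to one.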
Standard normal distribution: Excel