Probability and Distribution Deck
DISTRIBUTION
D. SATYAM
(LEAD DATA SCIENTIST)
PROBABILITY
• Probability implies 'likelihood' or 'chance'. When an event is certain to happen then the probability of
occurrence of that event is 1 and when it is certain that the event cannot happen then the probability
of that event is 0.
• Hence the value of probability ranges from 0 to 1.
• Sample Space
The set of all possible outcomes of an experiment is called the sample space. It is denoted by 'S', and the
number of its elements is n(S).
Example
In throwing a die, the number that appears on top is any one of 1,2,3,4,5,6. So here:
S={1,2,3,4,5,6} and n(S)=6
Similarly, in the case of a coin, S={Head,Tail} or {H,T} and n(S)=2.
PROBABILITY - BASIC CONCEPTS
Event
Every subset of a sample space is an event. It is denoted by 'E'.
Example
In throwing a die, S={1,2,3,4,5,6}, and the appearance of an even number is the event E={2,4,6}.
Clearly E is a subset of S.
• Exhaustive events
A set of events is exhaustive when it covers every possible outcome of the experiment.
Example
When a die is thrown, the cases 1,2,3,4,5,6 form an exhaustive set of events.
• Mutually exclusive or Disjoint event
Two or more events are mutually exclusive if they cannot occur simultaneously,
that is, no two of them can occur together.
Example
When a coin is tossed, the event of occurrence of a head and the event of occurrence of a tail are mutually exclusive
events.
Statement: If A and B are two mutually exclusive events, then the probability of occurrence of either A or B is the
sum of the individual probabilities of A and B. Symbolically: P(A ∪ B) = P(A) + P(B)
ADDITIVE THEOREM OF PROBABILITY - EXAMPLES
• For Non Mutually Exclusive Events
1. A shooter is known to hit a target 3 out of 7 shots; whereas another shooter is known to hit the target 2 out
of 5 shots. Find the probability of the target being hit at all when both of them try.
2. In a math class of 30 students, 17 are boys and 13 are girls. On a unit test, 4 boys and 5 girls made an A grade.
If a student is chosen at random from the class, what is the probability of choosing a girl or an A student?
• For Mutually Exclusive Events
1. A card is drawn from a pack of 52. What is the probability that it is a king or a queen?
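The three additive-theorem examples above can be worked with exact fractions. This is a sketch, not the deck's own solution: it uses the general addition rule P(A ∪ B) = P(A) + P(B) − P(A ∩ B), and for the shooters it assumes the two shots are independent, which the problem does not state. The helper name `p_union` is made up for illustration.

```python
from fractions import Fraction


def p_union(p_a, p_b, p_both=Fraction(0)):
    """P(A or B) = P(A) + P(B) - P(A and B); p_both is 0 for mutually exclusive events."""
    return p_a + p_b - p_both


# Shooters: ASSUMES independent shots, so P(A and B) = P(A) * P(B).
p_a, p_b = Fraction(3, 7), Fraction(2, 5)
target_hit = p_union(p_a, p_b, p_a * p_b)

# Class of 30: 13 girls, 4 + 5 = 9 A students, 5 girls with an A.
girl_or_a = p_union(Fraction(13, 30), Fraction(9, 30), Fraction(5, 30))

# Cards: a card cannot be both a king and a queen, so the overlap is 0.
king_or_queen = p_union(Fraction(4, 52), Fraction(4, 52))

print(target_hit, girl_or_a, king_or_queen)  # 23/35 17/30 2/13
```

Using `Fraction` keeps the answers exact, which makes them easy to compare with a hand calculation.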
Recall that for dependent events, the multiplicative theorem stated earlier is not applicable.
For dependent events, we use another result, called conditional probability, which is given as:
The probability of event B given event A equals the probability of event A and event B divided by the probability
of event A: P(B | A) = P(A ∩ B) / P(A)
MULTIPLICATIVE THEOREM/CONDITIONAL PROBABILITY - EXAMPLES
Independent Event: You have a cowboy hat, a top hat, and an Indonesian hat called a songkok. You also have four
shirts: white, black, green, and pink. If you choose one hat and one shirt at random, what is the probability that you
choose the songkok and the black shirt?
The two events are independent; the choice of hat has no effect on the choice of shirt.
There are three different hats, so the probability of choosing the songkok is 1/3.
There are four different shirts, so the probability of choosing the black shirt is 1/4.
Since the events are independent, the probabilities multiply: P(songkok and black shirt) = 1/3 * 1/4 = 1/12.
Dependent Event: An urn contains 20 red and 10 blue balls. Two balls are drawn from the urn one after the other
without replacement. What is the probability that both the balls drawn are red?
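Both multiplication examples above can be checked with exact fractions. The sketch below uses P(A ∩ B) = P(A) * P(B) for the independent case and P(A ∩ B) = P(A) * P(B | A) for the dependent case; the variable names are illustrative only.

```python
from fractions import Fraction

# Independent events: P(A and B) = P(A) * P(B).
p_songkok = Fraction(1, 3)      # one of three hats
p_black_shirt = Fraction(1, 4)  # one of four shirts
p_outfit = p_songkok * p_black_shirt

# Dependent events: P(A and B) = P(A) * P(B | A).
red, blue = 20, 10
p_first_red = Fraction(red, red + blue)
# After one red ball is removed, 19 red balls remain among 29.
p_second_red_given_first = Fraction(red - 1, red + blue - 1)
p_both_red = p_first_red * p_second_red_given_first

print(p_outfit, p_both_red)  # 1/12 38/87
```

The second draw depends on the first only because the ball is not replaced; with replacement the answer would be (20/30) * (20/30) instead.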
BAYES THEOREM
Example:
Suppose the day is cloudy and you want to know whether it will rain today. You therefore need to calculate
the probability of rainfall given the evidence of cloudiness, that is, P(Rain | Cloudy).
BAYES THEOREM – WHERE DOES IT COME FROM?
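We know from Conditional Probability:
P(A | B) = P(A ∩ B) / P(B)    (Equation 1)
Rearranging Equation 1:
P(A ∩ B) = P(A | B) * P(B)
Similarly: P(B ∩ A) = P(B | A) * P(A)
Since P(A ∩ B) = P(B ∩ A)
Hence, P(A | B) * P(B) = P(B | A) * P(A)
Finally: P(A | B) = P(B | A) * P(A) / P(B)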
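BAYES THEOREM – GENERALIZED FORM
For mutually exclusive and exhaustive events A1, A2, ..., An and any event B with P(B) > 0:
P(Ai | B) = P(B | Ai) * P(Ai) / [P(B | A1) * P(A1) + P(B | A2) * P(A2) + ... + P(B | An) * P(An)]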
BAYES THEOREM - EXAMPLES
• Epidemiologists claim that the probability of breast cancer among Caucasian women in their mid-50s is 0.005. An established
test identified people who had breast cancer and those that were healthy. A new mammography test in clinical trials has a
probability of 0.85 for detecting cancer correctly. In women without breast cancer, it has a chance of 0.925 for a negative
result. If a 55-year-old Caucasian woman tests positive for breast cancer, what is the probability that she, in fact, has breast
cancer?
Solution:
• P(Cancer) = 0.005
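The remaining quantities come from the problem statement: P(Positive | Cancer) = 0.85 and P(Negative | No cancer) = 0.925, so P(Positive | No cancer) = 1 − 0.925 = 0.075. A minimal sketch of the Bayes calculation follows; the function name `posterior` and its parameter names are illustrative and do not come from the deck.

```python
def posterior(prior, sensitivity, specificity):
    """P(condition | positive test) by Bayes' theorem."""
    false_positive_rate = 1 - specificity            # P(Positive | No cancer)
    true_pos = sensitivity * prior                   # P(Positive and Cancer)
    false_pos = false_positive_rate * (1 - prior)    # P(Positive and No cancer)
    return true_pos / (true_pos + false_pos)


p_cancer_given_positive = posterior(prior=0.005, sensitivity=0.85, specificity=0.925)
print(round(p_cancer_given_positive, 4))  # 0.0539
```

Even with a positive result, the probability of cancer is only about 5.4%, because the disease is rare and false positives from the large healthy group outnumber true positives.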
BAYES THEOREM - EXAMPLES
Symantec works by having users train the system. It looks for patterns in the words in emails marked as spam by the user. For
example, it may have learned that the word “free” appears in 20% of the emails marked as spam. Assuming 0.1% of non-spam
mail includes the word “free” and 50% of all emails received by the user are spam, find the probability that an email is spam if the
word “free” appears in it.
Solution:
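One way to see the answer is to count emails instead of multiplying probabilities. The sketch below assumes a hypothetical mailbox of 200,000 emails (a size chosen only so that every count is a whole number) and applies the percentages from the problem.

```python
from fractions import Fraction

total = 200_000                    # hypothetical mailbox size, not from the problem
spam = total * 50 // 100           # 50% of all mail is spam
ham = total - spam                 # the rest is non-spam
spam_with_free = spam * 20 // 100  # "free" appears in 20% of spam
ham_with_free = ham * 1 // 1000    # "free" appears in 0.1% of non-spam

# Among all emails containing "free", what fraction is spam?
p_spam_given_free = Fraction(spam_with_free, spam_with_free + ham_with_free)
print(p_spam_given_free)  # 200/201, about 0.995
```

This is the same calculation as Bayes' theorem: 0.2 * 0.5 / (0.2 * 0.5 + 0.001 * 0.5).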
RANDOM VARIABLES – DISCRETE/CONTINUOUS
RANDOM VARIABLE TYPES
PROBABILITY MASS FUNCTION/PROBABILITY DENSITY FUNCTION
DISCRETE DISTRIBUTIONS - BINOMIAL DISTRIBUTION
• A distribution where only two outcomes are possible on each trial, such as success or failure, gain or loss, win or lose,
and where the probability of success is the same for all the trials, is called a Binomial
Distribution. It describes the number of successes in n trials.
• The outcomes need not be equally likely.
• Each trial is independent.
• A total number of n identical trials are conducted.
• The probability of success and failure is the same for all trials. (Trials are identical.)
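• Mathematical Representation: for n trials with success probability p, the probability of exactly k successes is
P(X = k) = C(n, k) * p^k * (1 − p)^(n − k), for k = 0, 1, ..., n, where C(n, k) = n! / (k! (n − k)!).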
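As a quick illustration of the binomial probability mass function, the sketch below evaluates P(X = k) = C(n, k) * p^k * (1 − p)^(n − k). The values n = 10 and p = 0.3 are arbitrary choices for the example, not taken from the deck.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 10, 0.3  # illustrative values only
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

print(round(sum(pmf), 10))                # 1.0 (probabilities sum to one)
print(round(binomial_pmf(3, n, p), 4))    # 0.2668
mean = sum(k * pk for k, pk in enumerate(pmf))
print(round(mean, 10))                    # 3.0, which equals n * p
```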
CONTINUOUS DISTRIBUTION – NORMAL DISTRIBUTION
• The normal distribution describes many real-world quantities well, such as heights and measurement errors, which is one
reason it is so widely used.
• The sum of a large number of small, independent random variables is often approximately normally distributed, which
contributes to its widespread application.
Characteristics:
• The mean, median and mode of the distribution coincide.
• The curve of the distribution is bell-shaped and symmetrical about the line x=μ.
• The total area under the curve is 1.
• Exactly half of the values are to the left of the center and the other half to the right.
• A normal distribution is continuous, whereas a Binomial Distribution is discrete. However, as the number of trials n grows
large, the shape of the binomial approaches a normal curve with mean np and standard deviation √(np(1 − p)).
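The last point can be checked numerically. The sketch below compares an exact binomial probability with its normal approximation, using mean np and standard deviation √(np(1 − p)). The values n = 100, p = 0.5, the cutoff 55, and the continuity correction of 0.5 are illustrative assumptions.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.5  # illustrative values only

# Exact binomial probability P(X <= 55).
exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(56))

# Normal approximation with the same mean and standard deviation.
# The 0.5 continuity correction accounts for approximating a discrete
# distribution with a continuous one.
approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p))).cdf(55.5)

print(exact, approx)  # the two values agree to about three decimal places
```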
NORMAL DISTRIBUTION – CENTRAL LIMIT THEOREM
• The central limit theorem in statistics states that, given a sufficiently large sample size, the sampling distribution of
the mean for a variable will approximate a normal distribution regardless of that variable’s distribution in the population.
• (In layman’s terms: even if the data are not normally distributed, the sample means are approximately normally distributed,
provided the sample size is large.)
• Why it is useful: because sample means are approximately normal, we can
build confidence intervals and perform hypothesis tests.
Central Limit Theorem (Key Takeaways)
• The central limit theorem (CLT) states that the distribution of sample means approximates a normal distribution as the
sample size gets larger.
• Sample sizes of 30 or more are often considered sufficient for the CLT to hold (a rule of thumb; heavily skewed populations may need more).
• A key aspect of the CLT is that the average of the sample means equals the population mean, and the standard deviation of
the sample means equals the population standard deviation divided by √n.
• A sufficiently large sample can estimate the characteristics of a population accurately.
• Expectation of the sample mean as a random variable = population mean. Symbolically, E(X̄) = µ.
• Standard deviation of X̄ = σ / √n (where σ is the population standard deviation and n is the sample size).
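The two identities above, E(X̄) = µ and SD(X̄) = σ / √n, can be checked with a small simulation. The sketch below assumes an exponential population with rate 1 (so µ = 1 and σ = 1), chosen because it is clearly not normal. The sample size, number of samples, and seed are arbitrary.

```python
import random
import statistics
from math import sqrt

random.seed(0)  # fixed seed so the run is repeatable

mu, sigma = 1.0, 1.0  # mean and SD of an exponential population with rate 1
n = 50                # size of each sample
num_samples = 5000    # number of samples drawn

# Draw many samples and record the mean of each one.
sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)

print(mean_of_means)                 # close to mu = 1
print(sd_of_means, sigma / sqrt(n))  # both close to 0.141
```

A histogram of `sample_means` would look roughly bell-shaped even though the underlying exponential data are strongly skewed.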
NORMAL DISTRIBUTION – EMPIRICAL RULE
The empirical rule states that for a normal distribution, nearly all of the data will fall within three standard deviations of
the mean. The empirical rule can be broken down into three parts:
• About 68% of the data fall within one standard deviation of the mean.
• About 95% fall within two standard deviations.
• About 99.7% fall within three standard deviations.
The rule is also called the 68-95-99.7 rule or the three-sigma rule.
PDF OF A NORMAL DISTRIBUTION AND ORIGIN OF EMPIRICAL RULE
PDF of a normal distribution with mean µ and standard deviation σ: f(x) = (1 / (σ√(2π))) * exp(−(x − µ)² / (2σ²))
Origin of the empirical rule: the percentages in the 68-95-99.7 rule are the areas under this curve. Almost 68% of the area lies
within one standard deviation of the mean (µ ± σ), almost 95% within two standard deviations (µ ± 2σ), and almost 99.7%
within three standard deviations (µ ± 3σ).
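The three percentages can be reproduced from the normal cumulative distribution function. A minimal sketch using Python's standard library follows. It works with the standard normal (µ = 0, σ = 1), since the fractions are the same for any normal distribution.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Area under the curve between -k and +k standard deviations.
coverage = {
    k: standard_normal.cdf(k) - standard_normal.cdf(-k)
    for k in (1, 2, 3)
}

for k, area in coverage.items():
    print(k, round(area, 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```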
NORMAL DISTRIBUTION/ORIGIN OF Z-SCORE
The Z-score gives how many standard deviations a value is away from the mean: z = (x − µ) / σ. However, to find the probability
associated with it, we need to refer to the Z-table.
Let's solve some questions:
Type 1: Comparison of 2 different normally distributed values (the Z-score is enough)
Type 2: Finding the probability or percentage of values (needs the Z-table)
Type 1: Happy and Ekta are two students. Happy scored 65 marks in the Math exam while Ekta scored 80 in the English exam. Given
that both Math and English marks follow an approximately normal distribution, who performed better?
Math ~ N(60,4)
English ~ N(79,2)
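The Type 1 question only needs z = (x − µ) / σ for each student. The notation N(60,4) does not say whether the second parameter is the variance or the standard deviation, so the sketch below computes both readings. This is an assumption on our part, and the conclusion turns out to be the same either way.

```python
from math import sqrt


def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


# Reading 1 (assumption): second parameter is the variance, so SD = sqrt(variance).
happy_if_variance = z_score(65, 60, sqrt(4))
ekta_if_variance = z_score(80, 79, sqrt(2))

# Reading 2 (assumption): second parameter is already the standard deviation.
happy_if_sd = z_score(65, 60, 4)
ekta_if_sd = z_score(80, 79, 2)

print(happy_if_variance, round(ekta_if_variance, 3))  # 2.5 0.707
print(happy_if_sd, ekta_if_sd)                        # 1.25 0.5
```

Under both readings Happy's z-score is higher, so Happy performed better relative to the class even though Ekta's raw score is higher.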
Type 2: According to the Centers for Disease Control, heights for U.S. adult females and males are approximately normal.
Females: mean of 64 inches and SD of 2 inches
Males: mean of 69 inches and SD of 3 inches
Find the probability of a randomly selected U.S. adult female being taller than 65 inches.
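For the Type 2 question, the Z-table lookup can be replaced by the normal cumulative distribution function. The sketch below uses the stated female mean of 64 inches and SD of 2 inches, and computes P(X > 65) = 1 − P(X ≤ 65).

```python
from statistics import NormalDist

female_heights = NormalDist(mu=64, sigma=2)  # inches, from the problem statement

z = (65 - 64) / 2                        # z-score of 65 inches
p_taller = 1 - female_heights.cdf(65)    # area to the right of 65 inches

print(z, round(p_taller, 4))  # 0.5 0.3085
```

So roughly 31% of U.S. adult females are taller than 65 inches under this model, which matches the Z-table value for z = 0.5.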