
HL AI Probability Distributions Notes

1. Discrete Probability Distributions

We will look at three types of discrete probability distributions:

1. A general discrete distribution, where P(x) is known or determined from a function
2. Binomial distribution
3. Poisson distribution

1.1 General

A discrete random variable, 𝑋, can only take specific, countable values.

The probability of getting a specific value, 𝑥, for the random variable is written as 𝑃(𝑋 = 𝑥).

The values of 𝑃(𝑋 = 𝑥) come from a discrete probability distribution function (pdf), 𝑓(𝑥),
where 𝑷(𝑿 = 𝒙) = 𝒇(𝒙).

Note that
𝟎 ≤ 𝑷(𝑿 = 𝒙) ≤ 𝟏 (the probability of any outcomes must be between 0 and 1)
∑𝒙 𝑷( 𝑿 = 𝒙) = 𝟏 (the sum of probabilities must add up to 1)

PDF- Gives the probability for that specific value, 𝑷(𝑿 = 𝒙)

CDF- Gives the probability that the variable is less than or equal to the value, 𝑷(𝑿 ≤ 𝒙)

Eg. 𝑃(𝑋 ≤ 3) = 𝑃(𝑋 = 3) + 𝑃 (𝑋 = 2) + 𝑃(𝑋 = 1) + 𝑃(𝑋 = 0)

Eg. 𝑃(𝑋 ≥ 3) = 1 − 𝑃(𝑋 ≤ 2)

Eg. 𝑃(𝑋 > 3) = 1 − 𝑃(𝑋 ≤ 3)
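These complement rules are exactly how a GDC (or any software) answers 'at least' and 'greater than' questions using only a cumulative function. As a minimal illustration (not part of the original notes), the short Python sketch below checks the three identities for a small, hypothetical distribution whose probabilities are chosen arbitrarily:

# hypothetical discrete distribution on x = 0..5 (probabilities are arbitrary but sum to 1)
pmf = {0: 0.05, 1: 0.15, 2: 0.20, 3: 0.25, 4: 0.20, 5: 0.15}

def P_le(k):
    # cumulative probability P(X <= k)
    return sum(p for x, p in pmf.items() if x <= k)

print(P_le(3))        # P(X <= 3) = P(3) + P(2) + P(1) + P(0) = 0.65
print(1 - P_le(2))    # P(X >= 3) = 1 - P(X <= 2)             = 0.60
print(1 - P_le(3))    # P(X > 3)  = 1 - P(X <= 3)             = 0.35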



Expected Value

The expected value of a discrete random variable, 𝑋, is the mean score that would be
expected if the experiment was carried out many times. It is calculated:

E(X) = μ = Σ x·P(X = x)   (summing over all values of x)

For Example: A survey of families in a city finds that 1/2 of the families have 1 car, 1/3 have 2 cars and 1/6 have 3 cars. Find the expected number of cars per family.

x              1      2      3
P(X = x)       1/2    1/3    1/6

E(X) = μ = Σ x·P(X = x) = 1 × 1/2 + 2 × 1/3 + 3 × 1/6 = 5/3

Note that while the values of 𝑥 must be whole numbers (can’t have part of a car), the value of
𝐸 (𝑥 ) does not need to be a whole number.
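When the distribution is given as a table, the expected value is just the probability-weighted sum of the x-values. A minimal Python sketch of the car example (not part of the original notes; exact fractions are used so the answer 5/3 comes out exactly):

from fractions import Fraction

# x-values and probabilities from the table above
cars = {1: Fraction(1, 2), 2: Fraction(1, 3), 3: Fraction(1, 6)}

assert sum(cars.values()) == 1                    # probabilities must add up to 1
expected = sum(x * p for x, p in cars.items())    # E(X) = sum of x * P(X = x)
print(expected)                                   # 5/3 cars per family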

Fair Game

A game is considered to be fair if 𝑬(𝒙) = 𝟎.

For Example: Jin Ah pays $5 to play a game. A quarter of the time he gets $1 back, a third of the time he gets $b back, and the rest of the time (with probability a) he gets $10. If it is a fair game, find the values of a and b.

x              1 − 5 = −4      b − 5      10 − 5 = 5
P(X = x)       1/4             1/3        a

Σ P(X = x) = 1 = 1/4 + 1/3 + a   →   a = 5/12

E(X) = 0 = −4 × 1/4 + (b − 5) × 1/3 + 5 × 5/12   →   b = $1.75
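The two conditions (probabilities sum to 1, E(X) = 0) can be checked numerically. A minimal Python sketch of this example (not part of the original notes), again with exact fractions:

from fractions import Fraction

a = 1 - Fraction(1, 4) - Fraction(1, 3)    # probabilities must sum to 1
# E(X) = 0:  -4*(1/4) + (b - 5)*(1/3) + 5*a = 0, rearranged for b
b = 5 + 3 * (1 - 5 * a)
print(a)    # 5/12
print(b)    # 7/4, i.e. b = $1.75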



1.2 Binomial Distribution

For a discrete random variable, X, to fit a binomial distribution, it must have the following
attributes:

• There is a fixed number of trials, 𝒏


• X is the number of ‘successes’ in those trials
• There are only two possible outcomes in each trial (“success” or “not success”)
• The trials are independent, that is, the probability of success, 𝒑, is fixed.
Note that this means that there must be ‘replacement’ when items are selected
EXCEPT when it is a very large population in which case the effective probability
is not really changed.

• The parameters of the binomial distribution are 𝒏 and 𝒑.


• We write 𝑿~𝑩(𝒏, 𝒑)

The probability that 𝑋 = 𝑥 is given by the distribution function

P(X = x) = f(x) = nCx · p^x · (1 − p)^(n−x)

If you know 𝑛, 𝑝 and 𝑥, your calculator can find this for you.

CALCULATOR
2ND Function vars; DISTR A:binompdf (𝑃(𝑋 = 𝑥 ))
2ND Function vars; DISTR B:binomcdf (𝑃(𝑋 ≤ 𝑥 ))

The MEAN of the binomial distribution is calculated: 𝑬(𝒙) = 𝒏𝒑


The VARIANCE of the binomial distribution is calculated: 𝑽𝒂𝒓(𝒙) = 𝝈𝟐 = 𝒏𝒑(𝟏 − 𝒑)

(Remember Standard Deviation, 𝜎, is the square root of variance.)

For Example: The probability of getting a '6' on an unbiased dice is 1/6. I throw the dice 10 times.
OR In a city of 600 000 citizens, 1/6 of them have green eyes. A sample of 10 citizens is taken.

a) What is the probability I get exactly 3 sixes? OR Exactly 3 citizens in the sample have green eyes?

Want 𝑃(𝑋 = 3). Use binompdf


p = 1/6, n = 10, x = 3 → P(X = 3) = 0.155

b) What is the probability I get 3 sixes or less?

Want 𝑃(𝑋 ≤ 3). Use binomcdf


p = 1/6, n = 10, x = 3 → P(X ≤ 3) = 0.930
c) What is the probability I get less than 3 sixes?

Want 𝑃(𝑋 < 3). Use binomcdf


p = 1/6, n = 10, x = 2 (note: 2, not 3, because 'less than 3' excludes 3) → P(X ≤ 2) = 0.775

d) What is the probability I get more than 3 sixes?

Want P(X > 3). Need to convert this to a 'less than': P(X > 3) = 1 − P(X ≤ 3)

Use binomcdf
p = 1/6, n = 10, x = 3 → P(X > 3) = 1 − P(X ≤ 3) = 1 − 0.930 = 0.0697
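For readers working outside a GDC, the same four answers can be reproduced with any statistics library. A minimal sketch (not part of the original notes) using Python's SciPy, where binom.pmf and binom.cdf play the roles of binompdf and binomcdf:

from scipy.stats import binom

n, p = 10, 1/6
print(binom.pmf(3, n, p))                  # P(X = 3)  ≈ 0.155
print(binom.cdf(3, n, p))                  # P(X <= 3) ≈ 0.930
print(binom.cdf(2, n, p))                  # P(X <= 2) ≈ 0.775
print(1 - binom.cdf(3, n, p))              # P(X > 3)  ≈ 0.0697
print(binom.mean(n, p), binom.var(n, p))   # np ≈ 1.67, np(1 - p) ≈ 1.39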

For Example: The probability of getting a '6' on an unbiased dice is 1/6. What is the smallest number of throws that I need to ensure that the probability of getting at least one 6 is greater than 0.5?

GDC Approach
(Remember P(X ≥ 1) = 1 − P(X ≤ 0).) Use binomcdf with x = 0 and increasing values of n until 1 − P(X ≤ 0) first exceeds 0.5.

Algebraic Approach
P(X ≥ 1) > 0.5
1 − P(X = 0) > 0.5
→ P(X = 0) < 0.5
→ (1 − 1/6)^n < 0.5
(5/6)^n < 0.5
n ln(5/6) < ln 0.5
n > ln 0.5 / ln(5/6)     (Change direction of inequality as dividing by a negative number)
n > 3.80
→ n = 4
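The GDC approach amounts to a trial-and-error search over n. A minimal Python sketch of that search (not part of the original notes), using SciPy's binom.cdf in place of binomcdf:

from scipy.stats import binom

n = 1
while 1 - binom.cdf(0, n, 1/6) <= 0.5:    # P(X >= 1) = 1 - P(X <= 0)
    n += 1
print(n)    # 4, the smallest n with P(at least one six) > 0.5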



1.3 Poisson Distribution

For a discrete random variable, X, to fit a Poisson distribution, it must have the following
attributes:
• There is an average number of occurrences, 𝝀 (𝒐𝒓 𝒎 𝒐𝒓 𝛼), of an event in a
given interval, whether that is in time or space.
• This average is uniform across all intervals being considered.
• The average number is proportional to the size of the interval (eg. if the interval is
doubled, 𝝀 is doubled)
• Occurrences are independent and cannot occur at the same time, or in the same
position.
• X is the number of occurrences in the interval being considered and must be a
whole number.
• We write 𝑿~𝑷𝒐(𝝀)

The probability function is given as:


P(X = x) = f(x) = (e^(−λ) · λ^x) / x!

But you will not be expected to use this formula.


Given 𝝀, the probabilities can be found using your calculator.

CALCULATOR
2ND Function vars; DISTR D:poissonpdf (𝑃(𝑋 = 𝑥 ))
2ND Function vars; DISTR E:poissoncdf (𝑃(𝑋 ≤ 𝑥 ))

The MEAN of the Poisson distribution is calculated: 𝑬(𝒙) = 𝝀

The VARIANCE of the Poisson distribution is calculated: 𝑽𝒂𝒓(𝒙) = 𝝈𝟐 = 𝝀

(Remember Standard Deviation, 𝜎, is the square root of variance.)

For Example:
A 1200 m length of telephone cable has 10 faults in it. If a 100 m length of similar cable is chosen
at random:

(a) Find the mean and standard deviation of the number of faults per 100 m.

1200 m has 10 faults, ∴ 100 m has 10/12 = 5/6 faults
∴ λ = 5/6
E(X) = λ = 5/6
Var(X) = σ² = λ = 5/6   ∴ Standard deviation = σ = √(5/6) = 0.913

(b) Find the probability that 100m of cable has at least two faults.
𝑃(𝑋 ≥ 2) = 1 − 𝑃(𝑋 ≤ 1)

λ = 5/6, x = 1

Use 2ND Function vars, DISTR E: poissoncdf

𝑃(𝑋 ≥ 2) = 1 − 𝑃(𝑋 ≤ 1) = 1 − 0.7967 = 0.203
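The same numbers can be reproduced outside the GDC. A minimal sketch (not part of the original notes) using SciPy, where poisson.pmf and poisson.cdf correspond to poissonpdf and poissoncdf:

from scipy.stats import poisson

lam = 10 / 12                        # mean faults per 100 m, i.e. 5/6
print(lam, lam ** 0.5)               # mean ≈ 0.833, standard deviation ≈ 0.913
print(1 - poisson.cdf(1, lam))       # P(X >= 2) = 1 - P(X <= 1) ≈ 0.203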



2. Continuous Probability Distributions

Unlike discrete distributions, in continuous distributions the variable, 𝑋, is measured and so is a continuous random variable. This has the following implications:

• The probability that 𝑋 equals a specific value is 0.


• Instead, we find the probability that 𝑋 is between two values.
• Rather than have a probability distribution function (pdf), the variable has a probability
density function.

We will look at just one, very common, continuous probability distribution- the Normal
distribution.

2.1 Normal Distribution

For the normal distribution, the probability density function is given by


f(x) = (1 / (σ√(2π))) e^(−½((x − μ)/σ)²)
You will not be expected to use this.

The probability that the variable lies between two values is calculated by finding the area under the probability density function between those bounds, i.e. the integral ∫ f(x) dx taken between the two values. Again, you will not be expected to do this.

The normal distribution has the following attributes:


• The distribution is symmetric about the mean
• The mean, the mode and the median are all the same
• The parameters are the mean, 𝝁, and the variance, 𝝈𝟐 ,(and the standard deviation, 𝝈).
• The probability that the variable is within one standard deviation of the mean is ~68%
The probability that the variable is within two standard deviations of the mean is ~95%
The probability that the variable is within 3 standard deviations of the mean is ~99.7%

P(μ − σ ≤ X ≤ μ + σ) = 0.6826    P(μ − 2σ ≤ X ≤ μ + 2σ) = 0.9544    P(μ − 3σ ≤ X ≤ μ + 3σ) = 0.9974

• Note that for continuous distributions, 𝑃(𝑋 ≤ 𝑥) is the same as 𝑃(𝑋 < 𝑥) because
𝑃(𝑋 = 𝑥 ) = 0.
• We write 𝑿~𝑵(𝝁, 𝝈𝟐 ). Note well that it is variance, not standard deviation, that is given.



CALCULATOR
2ND Function vars; DISTR 2:normalcdf (if you know limits)

lower: the lower bound if there is one, OR if it is a less than case, number at least 4
standard deviations below the mean
upper: the upper bound OR if it is a greater than case, a number at least 4 standard
deviations above the mean

2ND Function vars; DISTR 3:invNorm (if you know the probability (area) )
Tail: LEFT if less than, CENTRE if about the mean, RIGHT if greater than
(if your calculator does not have the 'Tail' option, you must first convert the area
to a 'less than' area, eg. if P(X > x) = 0.3, you need to use P(X < x) = 1 − 0.3 = 0.7)

(Note that we NEVER use DISTR 1:normalpdf)

For Example:
The lengths of the cats in a large population are normally distributed with a mean of 35 cm and a standard deviation of 8 cm.

(a) If a cat is classified as small if it has a length less than 29cm, find the proportion of
small cats.

In this case, we know the limits, so use normalcdf

Want 𝑃(𝑋 ≤ 29)

normalcdf (𝜇 = 35, 𝜎 = 8, 𝑙𝑜𝑤𝑒𝑟 = 0, 𝑢𝑝𝑝𝑒𝑟 = 29)


𝑃(𝑋 ≤ 29) = 0.226 (GDC)
22.6% of cats are small.

(b) What length are 40% of cats longer than?

In this case, we know the probability (or area), so use invNorm

Want 𝑥 such that 𝑃(𝑋 > 𝑥 ) = 0.4


As it is greater than, we can use RIGHT for the tail

invNorm( 𝑎𝑟𝑒𝑎 = 0.4, 𝜇 = 35, 𝜎 = 8, 𝑅𝐼𝐺𝐻𝑇)


𝑃(𝑋 > 𝑥 ) = 0.4
→ x = 37.0 cm (GDC)

(Note: if your GDC doesn’t have the ‘tail’ option, use 0.6 as you need less than)
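Outside the GDC, normalcdf corresponds to the normal CDF and invNorm to its inverse (the quantile function). A minimal sketch (not part of the original notes) reproducing both parts of the cat example with SciPy:

from scipy.stats import norm

mu, sigma = 35, 8
print(norm.cdf(29, mu, sigma))     # (a) P(X <= 29) ≈ 0.226
print(norm.ppf(0.6, mu, sigma))    # (b) x with P(X < x) = 0.6, i.e. P(X > x) = 0.4; ≈ 37.0 cm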



3. Transformed or Combined Random Variables

3.1 Linear Transformation of Variable

If a random variable, 𝑋, is transformed such that


𝒀 = 𝒂𝑿 + 𝒃,
then the parameters for the new variable, 𝑌, are

𝑬(𝒀) = 𝒂𝑬(𝑿) + 𝒃

𝑽𝒂𝒓(𝒀) = 𝒂𝟐 𝑽𝒂𝒓(𝑿)

For Example:

A firm manufactures chocolate bars with a mean of 50g and a standard deviation of 2g. After a survey of consumers, they decide they are going to triple the mass of the chocolate bars, as well as add an additional 10 grams. Find the new mean, variance and standard deviation.

𝑋 = 𝑜𝑟𝑖𝑔𝑖𝑛𝑎𝑙 𝑚𝑎𝑠𝑠 𝑌 = 𝑛𝑒𝑤 𝑚𝑎𝑠𝑠


Y = 3X + 10
E(Y) = 3E(X) + 10 = 3 × 50 + 10 = 160 g
Var(Y) = 3² Var(X) = 3² × 2² = 36 g²
σ(Y) = √Var(Y) = √36 = 6 g

3.2 Linear Combination of Variables

If a random variable, Y, is a linear combination of two or more random variables, X₁, X₂, X₃, …
such that
𝒀 = 𝒂𝟏 𝑿𝟏 ± 𝒂𝟐 𝑿𝟐 ± 𝒂𝟑 𝑿𝟑 …
then the parameters for the new variable, 𝑌, are

𝑬(𝒀) = 𝒂𝟏 𝑬(𝑿𝟏 ) ± 𝒂𝟐 𝑬(𝑿𝟐 ) ± 𝒂𝟑 𝑬(𝑿𝟑 ) ….

𝑽𝒂𝒓(𝒀) = 𝒂𝟏 𝟐 𝑽𝒂𝒓(𝑿𝟏 ) + 𝒂𝟐 𝟐 𝑽𝒂𝒓(𝑿𝟐 ) + 𝒂𝟑 𝟐 𝑽𝒂𝒓(𝑿𝟑 ) …


(Note: The rule for variance is only valid if the random variables are independent)

For Example:

A firm manufactures chocolate bars with a mean of 50g and a standard deviation of 2g. After a
survey of consumers, they decide they are going to sell boxes with three chocolate bars. The
packaging has a mass of 10g and a standard deviation of 0.5g. Find the mean, variance and
standard deviation of the boxes of chocolate bars, assuming the masses are independent.
Box: B = C₁ + C₂ + C₃ + P
E(B) = E(C₁) + E(C₂) + E(C₃) + E(P) = 3E(C) + E(P) = 3 × 50 + 10 = 160 g
Var(B) = Var(C₁) + Var(C₂) + Var(C₃) + Var(P) = 3 × 2² + 0.5² = 12.25 g²
σ(B) = √Var(B) = √12.25 = 3.5 g
(Note well the differences between this and the previous example.)
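The contrast between 'one bar tripled' (section 3.1) and 'three separate bars in a box' (section 3.2) is easy to see numerically. A minimal Python sketch (not part of the original notes):

E_C, Var_C = 50, 2 ** 2       # one chocolate bar: mean 50 g, variance 4 g^2
E_P, Var_P = 10, 0.5 ** 2     # packaging: mean 10 g, variance 0.25 g^2

# 3.1  One bar scaled up: Y = 3X + 10
print(3 * E_C + 10, 3 ** 2 * Var_C)       # 160 g, 36 g^2  -> sd 6 g

# 3.2  Three independent bars plus packaging: B = C1 + C2 + C3 + P
print(3 * E_C + E_P, 3 * Var_C + Var_P)   # 160 g, 12.25 g^2 -> sd 3.5 g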
3.3 Poisson Distribution with Linear Combination of Variables

If two independently distributed Poisson random variables (𝑿𝟏 ~𝑷𝒐(𝝀), 𝑿𝟐 ~𝑷𝒐(𝜶)) are
combined, (𝒀 = 𝑿𝟏 + 𝑿𝟐 ), then the combined variable follows a Poisson distribution such that
𝒀~𝑷𝒐(𝝀 + 𝜶)

• Random variables must be independent


• The mean must be for equal intervals. If not, adjust accordingly.

For Example:
The manufacturer of wooden tables finds that the number of indents in the wood surface follows a
Poisson distribution with a mean of 4.2 per square metre. The number of cracks in the wood
follows a Poisson distribution with a mean of 3 per table, where a table has an area of 2.5m2. Both
indents and cracks are classed as defects. Given that indents and cracks occur independently of
each other, find the probability that a table will have less than 6 defects.

Interval is a table, with area of 2.5m2.


X₁ ~ Po(4.2 × 2.5 = 10.5)   (Indents)
X₂ ~ Po(3)   (Cracks)
Y = X₁ + X₂ ~ Po(10.5 + 3 = 13.5)

P(Y < 6) = P(Y ≤ 5) = poissoncdf(13.5, 5) = 0.00773

Probability that a table has less than 6 defects is 0.773%
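A minimal check of this example (not part of the original notes), using SciPy:

from scipy.stats import poisson

lam = 4.2 * 2.5 + 3            # indents per table + cracks per table = 13.5
print(poisson.cdf(5, lam))     # P(Y <= 5) ≈ 0.00773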



3.4 Normal Distribution with Linear Combination of Variables

If a linear combination of two independently distributed Normal random variables
(X₁ ~ N(μ₁, σ₁²), X₂ ~ N(μ₂, σ₂²)) is taken, Y = a₁X₁ + a₂X₂, then the combined
variable follows a Normal distribution such that
Y ~ N(a₁μ₁ + a₂μ₂, a₁²σ₁² + a₂²σ₂²)

• Random variables must be independent

For Example:
A wooden table is made of 4 legs and a table top. The manufacturer of the legs finds that the mass
(in kg) of the legs follows a normal distribution 𝐿~𝑁(0.6, 0.025> ). The manufacturer of the table
tops finds that the mass (in kg) of the table tops follows a normal distribution 𝑇~𝑁(2.5, 0.05> ).
Find the probability that a table has a mass of less than 4.7kg.

Table = L₁ + L₂ + L₃ + L₄ + T
Assuming that the mass of the legs and table top are independent, then
Table ~ N(4 × 0.6 + 2.5, 4 × 0.025² + 0.05²)   (NB The leg variance is multiplied by 4, once for each leg)
Table ~ N(4.9, 0.005)

Therefore, the distribution for the table has 𝜇 = 4.9𝑘𝑔, 𝜎 = √0.005 = 0.0707

Want 𝑃 (𝑋 < 4.7)


normalcdf (𝜇 = 4.9, 𝜎 = 0.0707, 𝑙𝑜𝑤𝑒𝑟 = 0, 𝑢𝑝𝑝𝑒𝑟 = 4.7)

P(X < 4.7) = 0.00234 (GDC)
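A minimal check of this combined-variable example (not part of the original notes), using SciPy:

from scipy.stats import norm

mu = 4 * 0.6 + 2.5                          # 4.9 kg
var = 4 * 0.025 ** 2 + 0.05 ** 2            # 0.005 kg^2
print(norm.cdf(4.7, mu, var ** 0.5))        # P(Table < 4.7) ≈ 0.00234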
