Statistics 3 - Random Variables and Probability Distributions

The document explains random variables, which can be discrete or continuous, and their probability distributions. It covers Bernoulli experiments, binomial distribution, and normal distribution, including examples of calculating probabilities. The normal distribution is highlighted for its significance in modeling real-world phenomena and its properties related to the Central Limit Theorem.
Copyright © All Rights Reserved

Random Variables

A random variable assigns a numerical value to each outcome in the sample space.

The notation of the random variable is a capital letter (X, Y, Z …) and the values it
can assume are represented by lowercase letters (x, y, z …).

The random variable can be:

Discrete
assumes a countable number of distinct integer values
these values are usually obtained by counting

Continuous
assumes an uncountably infinite number of values (any value within an interval)
these values are usually obtained by measuring
Random Variables

Discrete

Experiment                         Random Variable                        Values
Roll a die 100 times               Number of sixes                        0, 1, 2, 3, …, 100
Monitor Taylor Swift’s listeners   Number of people listening to Taylor   0, 1, 2, 3, …, 100000, …
Test 500 people for covid          Number of sick people                  0, 1, 2, 3, …, 500

Continuous

Experiment                         Random Variable                        Values
Inspect Instagram latency          Latency in ms                          82.34; 99.4; …
Monitor Farfetch stock prices      Price at a given moment in euros       2.95; 5.78; …
Weigh 100 people                   Weight in kg                           63.05; 88.50; …
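
The difference between counting and measuring can be sketched in a few lines of Python. This is only an illustration: the seed, and the mean (70 kg) and standard deviation (12 kg) used to simulate a weight, are assumptions that do not come from the slides.

```python
import random

rng = random.Random(1)  # fixed seed so the run is reproducible

# Discrete: roll a die 100 times and COUNT the sixes (an integer from 0 to 100).
rolls = [rng.randint(1, 6) for _ in range(100)]
number_of_sixes = rolls.count(6)

# Continuous: MEASURE a weight in kg (any real value in an interval).
# The mean of 70 and standard deviation of 12 are made-up illustration values.
weight_kg = rng.gauss(70, 12)

print(number_of_sixes, round(weight_kg, 2))
```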
Worksheet 3

Ex 1
Random Variables
A Random variable is defined by:
● Probability distribution - can be a graph, table or formula
● Parameters - mean, variance and standard deviation

Sample (statistics)    Population (parameters: mean, variance, standard deviation)

Some probability distributions are considered standard as they model many real-world phenomena.

Discrete: Uniform, Bernoulli, Binomial, Poisson..

Continuous: Uniform, Normal, Exponential, Student's t, Chi-squared…


Bernoulli Experiments
An experiment is said to be a Bernoulli experiment when it has only two possible
outcomes: a given event A either occurs (success) or does not occur (failure).

In this case we say that the occurrence of A is a success and P(A) = p. On the other
hand, the non-occurrence of A (written Ā) is a failure and P(Ā) = 1 - p = q.

Example:

If we toss a coin, let event A be "heads":

success: heads    P(A) = 0.5
failure: tails    P(Ā) = 1 - 0.5 = 0.5
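
A Bernoulli experiment such as the coin toss can be simulated with a short Python sketch. The function name `bernoulli_trial`, the seed and the number of tosses are assumptions chosen for illustration.

```python
import random

def bernoulli_trial(p: float, rng: random.Random) -> int:
    """Return 1 (success) with probability p, otherwise 0 (failure)."""
    return 1 if rng.random() < p else 0

rng = random.Random(42)  # fixed seed so the run is reproducible
tosses = [bernoulli_trial(0.5, rng) for _ in range(10_000)]

# The share of successes should be close to p = 0.5.
share_of_heads = sum(tosses) / len(tosses)
print(share_of_heads)
```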
Binomial Distribution
If we carry out n independent Bernoulli experiments, each with the same probability
of success p, the random variable X that represents the number of successes in n
tries follows a Binomial distribution with parameters n and p:

X ~ B(n, p)

● n and p are the parameters of the Binomial distribution

● the parameters for the random variable X are the mean and variance: μ = n * p and σ² = n * p * q, with q = 1 - p

The probability function for the binomial distribution is:

P(X = x) = C(n, x) * p^x * (1 - p)^(n - x), where C(n, x) = n! / (x! * (n - x)!)

n - number of tries    x - number of successes in n tries    p - probability of success in each try
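
As a minimal sketch, the binomial probability function P(X = x) = C(n, x) * p^x * (1 - p)^(n - x) can be written in Python with `math.comb`. The function name `binomial_pmf` and the values n = 10, p = 0.3 are assumptions used only to check the formula.

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent tries."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 0.3  # illustrative values, not taken from the slides

# The probabilities of all possible values of X add up to 1.
total = sum(binomial_pmf(x, n, p) for x in range(n + 1))

# The mean computed from the distribution matches n * p.
mean = sum(x * binomial_pmf(x, n, p) for x in range(n + 1))

print(round(total, 6), round(mean, 6))
```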


Example
When someone is shopping on Farfetch.com the probability of a bug occurring
and them not being able to complete their purchase is 1%.

If we take a sample of 10 random customers, what is the probability of:

● None of the customers having a problem completing their purchase

● At least one customer having an issue


● None of the customers having a problem completing their purchase


n = 10    p = 0.01

P(X = 0) = C(10, 0) * 0.01^0 * (1 - 0.01)^(10 - 0) = (10! / (0! * (10 - 0)!)) * 1 * 0.99^10

= (10! / (1 * 10!)) * 0.99^10 = (10! / 10!) * 0.904382 = 1 * 0.904382 ≈ 0.90 = 90%


● At least one customer having an issue

P(X ≥ 1) = 1 - P(X < 1) = 1 - P(X = 0) = 1 - 0.90 = 0.10
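
The two answers for the Farfetch example can be checked with a short Python sketch that applies the binomial formula with n = 10 and p = 0.01. The variable names are illustrative.

```python
from math import comb

n, p = 10, 0.01

# P(X = 0): none of the 10 customers hits the bug.
p_none = comb(n, 0) * p**0 * (1 - p) ** (n - 0)

# P(X >= 1): complement of "no customer hits the bug".
p_at_least_one = 1 - p_none

print(f"P(X = 0)  = {p_none:.4f}")          # 0.9044
print(f"P(X >= 1) = {p_at_least_one:.4f}")  # 0.0956
```

Without the intermediate rounding to 0.90, the second probability is about 0.0956, which rounds to the 0.10 obtained by hand.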


Worksheet 3

Ex 3 to 7
Normal Distribution
The Normal distribution is one of the most used probabilistic models.

This is not only because it describes many real-world phenomena but also because of the
Central Limit Theorem (CLT), which states that the distribution of the sample mean
approaches a normal distribution as the sample size gets larger, regardless of the
population's distribution. As a rule of thumb, sample means can be treated as
normally distributed when the sample size is larger than 30.

The normal distribution is defined by the mean (μ) and standard deviation (σ)
which means that the parameters of the distribution and the random variable X are
the same.
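
The Central Limit Theorem can be illustrated with a small simulation. The sketch below draws samples of size 30 from an exponential population (clearly not normal) and looks at the distribution of the sample means. The choice of population, the seed and the number of samples are assumptions made for illustration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is reproducible
sample_size = 30        # rule-of-thumb size mentioned in the text
n_samples = 5_000       # how many sample means we collect

# Population: exponential with rate 1, so its mean is 1 and its sd is 1.
sample_means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(sample_size)])
    for _ in range(n_samples)
]

# The sample means centre on the population mean (1), with a spread
# close to sd / sqrt(sample_size) = 1 / sqrt(30), about 0.18.
mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
print(round(mean_of_means, 3), round(sd_of_means, 3))
```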
Normal Distribution
The Normal distribution:
● is symmetric about the mean, so the mean, mode and median coincide (μ = Mo = Md)
● always has an area equal to 1 between its curve and the x axis, distributed in
the following way:
○ 68.3% in ] μ - σ; μ + σ [
○ 95.4% in ] μ - 2σ; μ + 2σ [
○ 99.7% in ] μ - 3σ; μ + 3σ [
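
The area within one, two and three standard deviations of the mean can be computed with `statistics.NormalDist` from the Python standard library. This is a small sketch; the dictionary name `areas` is illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Area under the curve between mu - k*sigma and mu + k*sigma, for k = 1, 2, 3.
areas = {
    k: standard_normal.cdf(k) - standard_normal.cdf(-k)
    for k in (1, 2, 3)
}

for k, area in areas.items():
    print(f"within {k} standard deviation(s): {area:.1%}")
```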
Normal Distribution
The mean and standard deviation can take infinitely many values, so there are
infinitely many normal distributions. To work with all of them through a single
reference, we use the standard normal distribution, which has a mean equal to
zero (μ = 0) and a standard deviation equal to one (σ = 1).

The transformation for a given random variable X with mean μ and standard
deviation σ can be done using the following formula:

Z = (X - μ) / σ

The advantage of using the standard normal distribution is that P(Z < z) can be found in a table.
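
A short Python sketch of the standardisation Z = (X - μ) / σ shows that a probability for X equals the probability for the corresponding z on the standard normal. The function name `z_score` and the values μ = 50, σ = 10, x = 65 are assumptions for illustration.

```python
from statistics import NormalDist

def z_score(x: float, mu: float, sigma: float) -> float:
    """Standardise x: how many standard deviations it lies from the mean."""
    return (x - mu) / sigma

mu, sigma, x = 50.0, 10.0, 65.0  # made-up illustration values
z = z_score(x, mu, sigma)

# P(X < x) on the original distribution equals P(Z < z) on the standard normal.
p_direct = NormalDist(mu, sigma).cdf(x)
p_standard = NormalDist().cdf(z)

print(z, round(p_direct, 4), round(p_standard, 4))
```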
How to use the table

1. Look for the integer part and first decimal place of z in the first column

2. Look for the second decimal place of z in the first row

3. The cell where that row and column intersect gives the desired probability value.

Example: for z = 1.27, row 1.2 and column 0.07 give P(Z < 1.27) = 0.8980
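
The value read from the table can be reproduced with `statistics.NormalDist`, whose `cdf` method returns P(Z < z) for the standard normal. A minimal sketch:

```python
from statistics import NormalDist

# P(Z < 1.27) for the standard normal (mean 0, standard deviation 1).
p = NormalDist().cdf(1.27)

print(f"P(Z < 1.27) = {p:.4f}")  # 0.8980
```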


Example
The length of a song on Spotify follows a normal distribution with a mean of 3 and
a half minutes and a standard deviation of 36 seconds.

If I am listening to Spotify on shuffle, what is the probability that the next song
to play is:

● longer than 3 minutes

● shorter than 3 minutes and 30 seconds

● between 2 and 4 minutes long


● longer than 3 minutes


μ = 210 s, σ = 36 s, or equivalently μ = 3.5 min, σ = 0.6 min    x = 3 min

z = (3 - 3.5) / 0.6 = -0.5 / 0.6 = -0.83

P(X < 3) = P(Z < -0.83) = 0.2033

P(X > 3) = 1 - P(X < 3) = 1 - 0.2033 = 0.7967


● shorter than 3 minutes and 30 seconds


μ = 3.5 min    σ = 0.6 min    x = 3.5 min

z = (3.5 - 3.5) / 0.6 = 0 / 0.6 = 0

P(X < 3.5) = P(Z < 0) = 0.5


● between 2 and 4 minutes long


μ = 3.5 min    σ = 0.6 min    x1 = 4 min    x2 = 2 min

z1 = (4 - 3.5) / 0.6 = 0.5 / 0.6 = 0.83
z2 = (2 - 3.5) / 0.6 = -1.5 / 0.6 = -2.50

P(X < 4) = P(Z < 0.83) = 0.7967
P(X < 2) = P(Z < -2.50) = 0.0062

P(2 < X < 4) = P(-2.50 < Z < 0.83) = P(Z < 0.83) - P(Z < -2.50)
= 0.7967 - 0.0062
= 0.7905
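
The three Spotify probabilities can be cross-checked without a table using `statistics.NormalDist`, working in minutes with μ = 3.5 and σ = 0.6. This is only a sketch and the variable names are illustrative.

```python
from statistics import NormalDist

song_length = NormalDist(mu=3.5, sigma=0.6)  # song length in minutes

# Longer than 3 minutes: complement of P(X < 3).
p_longer_than_3 = 1 - song_length.cdf(3)

# Shorter than 3 minutes and 30 seconds: exactly half, since 3.5 is the mean.
p_shorter_than_3_5 = song_length.cdf(3.5)

# Between 2 and 4 minutes: difference of two cumulative probabilities.
p_between_2_and_4 = song_length.cdf(4) - song_length.cdf(2)

print(round(p_longer_than_3, 4))
print(round(p_shorter_than_3_5, 4))
print(round(p_between_2_and_4, 4))
```

The results differ from the table-based answers by about 0.001, because the table lookup rounds z to two decimal places (0.83 instead of 0.8333…).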
Worksheet 3

Ex 8 to 14
You can now do all of Worksheet 3