Week3 Module AIL1020 L201
Module 02
Measures of Central Tendency
Exploring data variability
Descriptive Statistics Contd.
Recap
In this video,
Chebyshev’s Inequality
Chebyshev’s Inequality is a fundamental theorem in probability that provides a
bound on the proportion of the data that can lie far from the mean.
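Chebyshev’s Inequality states that for any dataset (not necessarily normal),
at least 1 − 1/k² of the values lie within k standard deviations of the mean, for any k > 1.
Equivalently, at most 1/k² of the values lie more than k standard deviations from the mean.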
It gives a worst-case bound, meaning it guarantees that extreme values do not occur
too often (helps in setting safety bounds)
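As a minimal sketch of the worst-case bound, the Python snippet below compares the Chebyshev limit 1/k² with the fraction of values actually observed beyond k standard deviations in a skewed sample. The function names chebyshev_bound and fraction_outside, the exponential sample, and the chosen k values are illustrative assumptions, not part of the lecture.

```python
import random
import statistics


def chebyshev_bound(k: float) -> float:
    """Upper bound on the fraction of values more than k SDs from the mean."""
    if k <= 0:
        raise ValueError("k must be positive")
    # For k <= 1 the bound 1/k^2 is at least 1, so it carries no information.
    return min(1.0, 1.0 / k**2)


def fraction_outside(data, k: float) -> float:
    """Observed fraction of values more than k population SDs from the mean."""
    mu = statistics.fmean(data)
    sigma = statistics.pstdev(data)
    return sum(abs(x - mu) > k * sigma for x in data) / len(data)


# Assumed example data: a skewed (exponential) sample, clearly not normal.
random.seed(0)
sample = [random.expovariate(1.0) for _ in range(10_000)]

for k in (1.5, 2, 3):
    observed = fraction_outside(sample, k)
    print(f"k={k}: observed {observed:.4f}, Chebyshev bound {chebyshev_bound(k):.4f}")
```

The observed fraction should never exceed the bound, whatever the shape of the data; for well-behaved data it is usually far below it, which is why the bound is described as worst-case.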
Fraudulent transactions usually have extremely high or low values.
The mean transaction value is μ = 200 dollars and the standard deviation is σ = 50 dollars.
The range from 50 to 350 dollars is μ ± 3σ, so with k = 3 Chebyshev’s bound is 1/k² = 1/9:
At most 11.11% of transactions will be below 50 dollars or above 350 dollars.
AI fraud detection models can use this to set risk thresholds.
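The transaction figures can be reproduced with a few lines of Python. This is a sketch only: the value k = 3 is inferred from the 50 and 350 dollar limits, and the variable names are illustrative.

```python
# Values taken from the transaction example.
mu = 200.0     # mean transaction value in dollars
sigma = 50.0   # standard deviation in dollars
k = 3          # assumed: 50 and 350 dollars sit 3 standard deviations from the mean

lower = mu - k * sigma
upper = mu + k * sigma
max_outside = 1 / k**2  # Chebyshev: at most 1/k^2 of values lie outside mu ± k*sigma

print(f"Range: {lower:.0f} to {upper:.0f} dollars")
print(f"At most {max_outside:.2%} of transactions fall outside it")
```

A fraud model could flag transactions outside this range for review, knowing that no more than about one in nine legitimate transactions can land there regardless of the distribution.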
Use Chebyshev’s inequality to determine the maximum proportion of deliveries that might fall
outside this range.
The company considers a delivery "very late" if its delivery time is more than 3 standard
deviations above the mean.
Use Chebyshev’s inequality to find out: at most, what percentage of deliveries
will take more than 45 minutes?
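One way to approach this kind of question is to convert the threshold into a number of standard deviations and then apply 1/k². The sketch below assumes a hypothetical helper max_fraction_beyond, and the mean and standard deviation passed to it are placeholder values chosen only so that 45 minutes sits 3 standard deviations above the mean; they are not taken from the exercise.

```python
def max_fraction_beyond(threshold: float, mean: float, sd: float) -> float:
    """Chebyshev upper bound on the fraction of values at least as far
    from the mean as the threshold is (both tails combined)."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    k = abs(threshold - mean) / sd  # distance from the mean in standard deviations
    if k <= 1:
        return 1.0                  # the bound is uninformative for k <= 1
    return 1 / k**2


# Placeholder inputs (assumed, not from the exercise): mean 30 min, SD 5 min.
bound = max_fraction_beyond(threshold=45, mean=30, sd=5)
print(f"At most {bound:.2%} of deliveries are 'very late'")
```

Because Chebyshev’s inequality bounds both tails together, the same figure is a valid, if conservative, limit on the upper tail alone.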
Summary
Chebyshev’s Inequality is a powerful tool when data distributions are unknown.
Coming up next…
Normal distributions and paired datasets