Stats Unit3
Module: 3
Learning Objectives:
3.5 Summary
3.6 Keywords
3.9 Reference
3.1 Understanding the Nature of Probability
Probability can be defined as the measure of the likelihood that an event will occur. It quantifies the uncertainty inherent in predictions.
Chance and uncertainty are often used interchangeably with probability. However, while probability provides a numerical measure, chance and
uncertainty more generally describe situations where outcomes are not deterministic.
The concept of probability has its roots in gambling and games of chance, where early thinkers wanted to predict outcomes and optimise
their bets.
Today, probability theory extends far beyond gambling. It's essential in various disciplines like finance, science, economics, and more.
Understanding the probability of different outcomes allows decision-makers to assess risks, make informed choices, and optimise results.
Complementary Events: If A is an event, then its complementary event, denoted by A′ or A^c, consists of all outcomes in the sample space that are not in A.
For example, if A is the event of rolling a die and getting a 3, then A′ is the event of not rolling a 3.
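The die example can be checked with a short Python sketch. It assumes a fair six-sided die, so every outcome is equally likely, and uses the standard fact that P(A′) = 1 − P(A). The variable names are illustrative only.

```python
from fractions import Fraction

# Sample space for one roll of a fair six-sided die (assumed fair).
sample_space = {1, 2, 3, 4, 5, 6}

# Event A: rolling a 3. Its complement A' holds every other outcome.
event_a = {3}
complement_a = sample_space - event_a

# With equally likely outcomes, probability = favourable outcomes / total outcomes.
p_a = Fraction(len(event_a), len(sample_space))
p_not_a = Fraction(len(complement_a), len(sample_space))

print(p_a, p_not_a)        # 1/6 5/6
print(p_a + p_not_a == 1)  # True
```

Using `Fraction` keeps the arithmetic exact, which avoids floating-point rounding when checking that the two probabilities sum to 1.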
Classical Approach: Assumes all outcomes in the sample space are equally likely. The probability of an event E is calculated as: P(E) = Number of outcomes favourable to E / Total number of outcomes in the sample space
Relative Frequency Approach: Defines probability based on historical or experimental data. It is the ratio of the number of times an event
occurs to the total number of trials. P(E)= Number of times E occurs / Total number of trials
Subjective Approach: Based on personal judgement, intuition, or experience rather than objective empirical evidence. Useful in situations
where it is impossible or impractical to collect relevant frequency data. For instance, estimating the likelihood of a new business venture
succeeding.
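To contrast the first two approaches, the sketch below computes the probability of rolling an even number on a die in two ways: by counting equally likely outcomes (classical) and by simulating many rolls (relative frequency). The fair die, the seed, and the number of trials are assumptions chosen for illustration.

```python
import random
from fractions import Fraction

sample_space = [1, 2, 3, 4, 5, 6]
event_even = [x for x in sample_space if x % 2 == 0]

# Classical approach: favourable outcomes / total outcomes.
p_classical = Fraction(len(event_even), len(sample_space))

# Relative frequency approach: run the experiment many times and count.
rng = random.Random(42)  # fixed seed so the simulation is repeatable
n_trials = 10_000
rolls = [rng.choice(sample_space) for _ in range(n_trials)]
hits = sum(1 for roll in rolls if roll % 2 == 0)
p_relative = hits / n_trials

print("Classical:", p_classical)
print("Relative frequency:", round(p_relative, 3))
```

With a large number of trials the relative frequency should settle close to the classical value of 1/2, although any single simulation will differ slightly.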
The probability of any event A is denoted by P(A) and satisfies the following three properties:
P(A) ≥ 0: The probability of any event A is never negative.
P(S)=1: The probability of the sample space S, which represents all possible outcomes, is 1.
For any finite sequence of disjoint events A1, A2, …, An (events that have no outcomes in common), P(A1 ∪ A2 ∪ … ∪ An) = P(A1) + P(A2) + … + P(An)
P(∅) = 0: The probability of the empty event ∅, which contains no outcomes, is 0.
For instance, the probability of rolling a 7 on a fair six-sided die is 0, since it's an impossible event.
For example, the probability that a fair six-sided die will land
The additive rule helps us find the probability of the union of two events: P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
Subtracting P(A ∩ B) is essential to account for the overlap (or intersection) of the two events so that shared outcomes are not double-counted.
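As a small check of the additive rule, P(A ∪ B) = P(A) + P(B) − P(A ∩ B), the sketch below compares the rule with a direct count on a fair six-sided die. The two events (an even number, a number greater than 3) are chosen only for illustration.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # fair six-sided die (assumed)
even = {2, 4, 6}
greater_than_3 = {4, 5, 6}


def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))


# Direct count of the union {2, 4, 5, 6}.
p_union_direct = prob(even | greater_than_3)

# Additive rule: subtract the overlap {4, 6} so it is not counted twice.
p_union_rule = prob(even) + prob(greater_than_3) - prob(even & greater_than_3)

print(p_union_direct, p_union_rule)  # 2/3 2/3
```

Without the subtraction the sum would be 1/2 + 1/2 = 1, which overstates the probability because the outcomes 4 and 6 belong to both events.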
The multiplicative rule helps us find the probability of the intersection of two events.
P(A ∩ B) = P(A) × P(B | A)
Conditional Probability: The probability of event B happening given that event A has already occurred is denoted as P(B | A). It's calculated as:
P(B | A) = P(A ∩ B) / P(A)
P(A ∩ B) = P(A) × P(B)
If this equation holds, A and B are independent. If not, they are dependent.
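The conditional probability formula and the independence test can be sketched in Python as follows. The fair die and the three events are assumptions made for illustration, and the helper names `conditional` and `independent` are hypothetical.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # fair six-sided die (assumed)


def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))


def conditional(b, a):
    """P(B | A) = P(A ∩ B) / P(A); only defined when P(A) > 0."""
    if prob(a) == 0:
        raise ValueError("P(A) must be greater than 0")
    return prob(a & b) / prob(a)


def independent(a, b):
    """A and B are independent when P(A ∩ B) = P(A) × P(B)."""
    return prob(a & b) == prob(a) * prob(b)


even = {2, 4, 6}
at_most_4 = {1, 2, 3, 4}
greater_than_3 = {4, 5, 6}

print(conditional(at_most_4, even))       # 2/3, same as P(at_most_4)
print(independent(even, at_most_4))       # True
print(conditional(greater_than_3, even))  # 2/3, but P(greater_than_3) = 1/2
print(independent(even, greater_than_3))  # False
```

In the first pair, knowing the roll is even does not change the probability of the other event, so the events are independent. In the second pair it does, so they are dependent.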
Binomial Distribution
Trials are independent: The outcome of one trial doesn’t affect the outcome of another trial.
Two possible outcomes for each trial: These are often referred to as "success" and "failure."
Probability of success remains the same: For every trial, the probability of success, denoted as p, is the same.
Fixed number of trials: We perform the experiment a set number of times, denoted as n.
Computing Probabilities: The probability of achieving exactly k successes in n trials is given by the formula: $P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}$ Where: $\binom{n}{k} = \frac{n!}{k!\,(n - k)!}$
is the binomial coefficient, p is the probability of success, and 1 − p is the probability of failure.
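The formula translates directly into Python using `math.comb` for the binomial coefficient. This is a minimal sketch, and the function name `binomial_pmf` is illustrative. It is applied here to the probability of getting exactly 5 heads in 10 flips of a coin that is assumed to be fair.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for k successes in n independent trials with success probability p."""
    if not 0 <= p <= 1:
        raise ValueError("p must lie between 0 and 1")
    if not 0 <= k <= n:
        return 0.0  # impossible number of successes
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Exactly 5 heads in 10 flips of a fair coin: 252 / 1024.
print(binomial_pmf(5, 10, 0.5))  # 0.24609375

# The probabilities over all possible k should sum to 1.
total = sum(binomial_pmf(k, 10, 0.5) for k in range(11))
print(round(total, 10))  # 1.0
```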
Binomial Distribution:
Estimating the probability of a certain number of successes in a fixed number of trials. E.g., the probability of getting 5 heads when flipping a
coin 10 times.
Election polling to estimate the probability of a candidate receiving a certain percentage of votes.
3.5 Normal Distribution
Normal Distribution
Mean, median, and mode are equal: They all lie at the centre of the distribution.
Spread determined by standard deviation: The spread of the distribution is determined by its standard deviation, denoted as σ.
The standard normal distribution is a special case of the normal distribution with a mean of 0 and a standard deviation of 1.
A Z-score represents the number of standard deviations a data point is from the mean. It is calculated as: Z = (X − μ) / σ Where: X is the data
point, μ is the mean, σ is the standard deviation.
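A minimal Python sketch of the Z-score calculation follows. The exam score, mean, and standard deviation are made-up values used only to illustrate the formula.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / std_dev


# Hypothetical example: a score of 85 where the mean is 70 and sigma is 10.
print(z_score(85, 70, 10))  # 1.5
print(z_score(70, 70, 10))  # 0.0, the value equals the mean
```

A positive Z-score means the value lies above the mean, and a negative Z-score means it lies below.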
The Empirical Rule and Percentiles:
About 68% of the data lies within 1 standard deviation of the mean.
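The 68% figure can be verified with `statistics.NormalDist` from the Python standard library. The same sketch gives the commonly quoted companion values for 2 and 3 standard deviations (about 95% and 99.7%), and shows how a Z-score maps to a percentile.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Share of a normal distribution lying within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973

# Percentile of a Z-score of 1.0: the share of values at or below it.
print(round(standard_normal.cdf(1.0), 4))  # 0.8413
```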
Stock returns in finance are often analysed using the normal distribution.
3.6 Summary
Probability: A mathematical measure of the likelihood of an event occurring, ranging from 0 (impossible event) to 1 (certain event). It helps quantify
uncertainty and make predictions about outcomes based on known data.
Basic Terminology: Fundamental terms used in probability, such as:
Simple, Compound, and Complementary Events: Types of events based on their complexity and relationship to other events.
Rules of Probability: Fundamental rules governing the behaviour and calculation of probabilities, ensuring they remain within their defined limits of 0 and 1.
Binomial Distribution: A probability distribution that describes the number of successes in a fixed number of independent Bernoulli trials. It has two parameters: the
number of trials and the probability of success in an individual trial.
Normal Distribution: A continuous probability distribution characterised by its bell-shaped curve. It is defined by two parameters: the mean (average) and the
standard deviation (dispersion). Many natural phenomena and statistics are approximately normally distributed.
Special Probability Distributions: Various probability distributions tailored to specific types of events or processes. Examples include:
Poisson Distribution: Used for counting the number of events in a fixed interval of time or space.
Exponential Distribution: Describes the time between events in a process that occurs continuously and independently.
Uniform Distribution: All outcomes in the sample space are equally likely.
3.7 Keywords
Sample Space: The set of all possible outcomes or results of an experiment. In any probability experiment, the sample space represents all
the outcomes that cannot be broken down any further. For example, when rolling a die, the sample space is {1, 2, 3, 4, 5, 6}.
Compound Event: An event that consists of two or more simple events. While a simple event consists of a single outcome, a compound event
encompasses multiple outcomes. For instance, when rolling a die, getting an odd number can be described as a compound event since it
includes the outcomes {1, 3, 5}.
Binomial Distribution: A probability distribution that describes the number of successes in a fixed number of Bernoulli trials. This distribution
is characterised by two parameters: the number of trials and the probability of success in a single trial. It answers questions like, "What is the
probability of getting exactly 3 heads in 5 coin tosses?"
Z-Score: A statistical measurement that describes a value's relationship to the mean of a group of values. Z-score is expressed in terms of
standard deviations from the mean. If a Z-score is 0, it indicates that the data point's score is identical to the mean score. A Z-score of 1.0
indicates a value that is one standard deviation from the mean.
Poisson Distribution: A probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time
or space. It is often used to model rare events in a large population or time frame. For instance, it can represent the number of emails received
in an hour or the number of phone calls at a call centre in a day.
Conditional Probability: The probability of one event occurring given that another event has already occurred. Represented as P(A | B), it
gives the probability of event A occurring given that event B has already taken place. An example could be the probability of
someone having a cold given that they are sneezing.
1. How would you differentiate between a simple event and a compound event in probability?
2. What is the significance of the Z-score in the context of the Normal Distribution?
3. Which approach to probability (Classical, Relative Frequency, or Subjective) would be most suitable for predicting the outcome of a coin toss, and why?
4. How does the memorylessness property manifest itself in the Exponential Distribution?
5. What are the key assumptions to be met for a probability distribution to be considered a Binomial Distribution?
Disha Electronics, a popular electronics store in Mumbai, had been facing unpredictable sales patterns for the past few years. With a diversified
product range from mobile phones to refrigerators, forecasting monthly sales had become increasingly challenging.
Background:
In 2021, the store witnessed an unusual peak in sales in the month of July, which was initially thought to be an anomaly. Upon closer inspection,
the management realised that a significant portion of this peak was due to the sale of air conditioners. Historically, Mumbai experienced the
monsoon in July, which led to reduced demand for air conditioners. However, due to changing climate patterns, 2021
saw a delay in the monsoon, which increased the demand for air conditioners.
Disha's management team decided to employ probability and statistics to better understand these patterns and forecast sales. They collected
data over the past ten years and mapped it against Mumbai's weather patterns. Using statistical analysis, they found a strong correlation
between the onset of monsoon and the sales of specific electronic items.
For air conditioners, there was a negative correlation with monsoon onset. In contrast, sales of items like washing machines and water purifiers
showed a positive correlation. The delayed monsoon in 2021 was an outlier, but the data suggested that the onset of the monsoon was getting
gradually delayed over the years.
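The kind of correlation analysis described here can be sketched with a Pearson correlation coefficient. The case study does not give Disha's actual data or method, so every number below is made up, and "monsoon days in the month" is an assumed stand-in for how the monsoon was measured.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical monthly figures (NOT from the case study).
monsoon_days = [5, 10, 15, 20, 25]
ac_units_sold = [200, 170, 150, 110, 90]
washer_units_sold = [40, 55, 60, 80, 95]

r_ac = pearson_r(monsoon_days, ac_units_sold)
r_washer = pearson_r(monsoon_days, washer_units_sold)

print("Air conditioners:", round(r_ac, 2))       # negative value
print("Washing machines:", round(r_washer, 2))   # positive value
```

A coefficient near −1 indicates that sales fall as the monsoon sets in, and a coefficient near +1 indicates that they rise. Correlation alone does not show that the weather causes the change in sales.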
Equipped with these insights, Disha Electronics started making informed decisions about inventory management. They increased the stock of
air conditioners in anticipation of a delayed monsoon and ran promotional offers on washing machines and water purifiers right before the
expected onset of the monsoon.
By leveraging probability and statistics, Disha Electronics not only improved their inventory management but also enhanced customer
satisfaction by ensuring product availability as per the demand.
Questions:
2. How did Disha Electronics utilise probability and statistics to address the sales forecasting challenge?
3. Based on the case study, how can understanding correlations between external factors (like weather) and product sales be beneficial for
businesses?
3.10 References