GROUP ASSIGNMENT
TITLE OF THE ASSIGNMENT: PROBABILITY DISTRIBUTION
Group 1
Section: B (2nd Year - Extension)
S/N STUDENT NAME FATHER NAME ID. NO.
1 Tayto Mindahun DBU1502109
2 Alemnesh Mekibib DBU1502012
3 Tadelech Dereje DBU1502104
4 Haymanot Ashenafi DBU1502061
5 Ermiyas Yifru DBU1502152
6 Yetmwork Mesfin DBU1502133
7 Biruk Bezawork DBU1502030
1. Introduction
Probability and probability distributions are fundamental concepts in statistics that provide the
foundation for understanding and analyzing random phenomena. Probability measures the
likelihood of specific outcomes occurring within a defined set of possibilities, ranging from
impossible events to those that are certain. A probability distribution, on the other hand, describes
how probabilities are distributed over the values of a random variable, providing a complete
description of the uncertainty associated with that variable. These concepts are crucial for
modeling and interpreting various types of data and making informed decisions based on
probabilistic reasoning (Meester et al., 2005).
Understanding the basic concepts of probability involves distinguishing between discrete and
continuous random variables. Discrete random variables take on a countable number of values,
such as the number of successes in a series of trials, while continuous random variables can assume
an infinite number of values within a given range, like measurements of height or weight. The
expected value, or mean, of a random variable provides a measure of its central tendency,
representing the long-term average value. Variance, on the other hand, quantifies the spread or
dispersion of the values around the expected value, indicating the degree of variability within the
data.
Discrete probability distributions are used to model scenarios involving discrete random variables
and include distributions such as the binomial, hypergeometric, and Poisson distributions. The
binomial distribution models the number of successes in a fixed number of independent trials,
while the hypergeometric distribution deals with scenarios involving sampling without
replacement from a finite population. The Poisson distribution, often used in modeling rare events,
describes the number of occurrences of an event within a fixed interval of time or space
(Mendenhall, 2006).
Continuous probability distributions, such as the normal distribution, are used to model continuous
random variables. The normal distribution, characterized by its bell-shaped curve, is one of the
most important and widely used distributions in statistics. It is defined by its mean and standard
deviation and is central to many statistical methods and theorems, such as the central limit theorem.
Understanding these distributions and their applications enables more accurate data analysis and
predictions, making them essential tools in both theoretical and applied statistics.
2. Definition of probability and probability distribution
2.2. Probability
The term probability refers to the study of randomness and uncertainty. In any situation in which
one of a number of possible outcomes may occur, the discipline of probability provides methods
for quantifying the chances, or likelihoods, associated with the various outcomes. Probability is a
measure of the likelihood of a particular event occurring. It ranges from 0 to 1, where 0 indicates
an impossible event and 1 indicates a certain event (Devore, 2010). For an experiment whose outcomes are equally likely, the probability of an event A can be calculated using the classical formula:

P(A) = \frac{n(A)}{n(S)} = \frac{\text{number of outcomes favorable to } A}{\text{total number of possible outcomes}}

The following basic terms are used throughout; a short sketch after this list illustrates the classical formula.
1. Experiment: Any process of observation or measurement or any process which generates well
defined outcome.
2. Probability Experiment: It is an experiment that can be repeated any number of times under
similar conditions and it is possible to enumerate the total number of outcomes without
predicting an individual outcome. It is also called random experiment.
Example: If a fair die is rolled once, it is possible to list all the possible outcomes, i.e., 1, 2, 3, 4, 5, 6, but it is not possible to predict which outcome will occur.
3. Outcome: The result of a single trial of a random experiment
4. Sample Space: Set of all possible outcomes of a probability experiment
5. Event: It is a subset of the sample space, i.e., a statement about one or more outcomes of a random experiment. Events are denoted by capital letters.
6. Equally Likely Events: Events which have the same chance of occurring.
7. Complement of an Event: the complement of an event A means the nonoccurrence of A and is denoted by A′, Aᶜ, or Ā; it contains those points of the sample space which do not belong to A.
8. Elementary Event: an event having only a single element or sample point.
9. Mutually Exclusive Events: Two events which cannot happen at the same time.
10. Independent Events: Two events are independent if the occurrence of one does not affect the
probability of the other occurring.
11. Dependent Events: Two events are dependent if the first event affects the outcome or occurrence of the second event in such a way that its probability is changed.
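As a minimal sketch of these terms in Python, assuming a fair die with equally likely outcomes:

```python
from fractions import Fraction

# Sample space for rolling a fair die once
sample_space = {1, 2, 3, 4, 5, 6}

# Event A: an even number shows
A = {x for x in sample_space if x % 2 == 0}

# Classical formula: P(A) = n(A) / n(S)
p_A = Fraction(len(A), len(sample_space))
print(p_A)       # 1/2

# Complement of A: P(A') = 1 - P(A)
print(1 - p_A)   # 1/2
```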
2.3. Random Variables
Definition: A random variable is a function from the sample space S to the set ℝ of all real numbers.
Random variables are customarily denoted by uppercase letters, such as X and Y, near the end of
our alphabet. In contrast to our previous use of a lowercase letter, such as x, to denote a variable,
we will now use lowercase letters to represent some particular value of the corresponding random
variable. The notation X(s) = x means that x is the value associated with the outcome s by the rv X.
Example: If X is a random variable, then it is a function from the elements of the sample space to the set of real numbers, i.e., X: S → ℝ. For instance, toss a coin three times and let X be the number of heads; then
X(TTT) = 0
and the set of possible values of X is {0, 1, 2, 3}.
Discrete random variables are variables which can assume only a specific number of values. They have values that can be counted.
A random variable is called a discrete random variable if it is defined over a sample space having a finite or a countably infinite number of sample points. In this case, the random variable takes on discrete values, and it is possible to enumerate all the values it may assume.
\sum_{x \in \mathbb{R}} P(X = x) = 1
A random variable X is discrete if there is a finite or countable sequence x₁, x₂, … of distinct real numbers, and a corresponding sequence p₁, p₂, … of nonnegative real numbers, such that P(X = xᵢ) = pᵢ for all i, and ∑ᵢ pᵢ = 1.
Examples: the number of heads in three tosses of a coin; the number of successes in a series of trials; the number of customers arriving at a service counter in an hour.
Continuous random variables are variables that can assume all values between any two given values. In the case of a sample space having an uncountably infinite number of sample points, the associated random variable is called a continuous random variable, with its values distributed over one or more continuous intervals on the real line.
Examples: measurements of height or weight; the time until an event occurs; temperature.
2.4. Expected value
The expected value of a random variable X is a measure of its central tendency. It is calculated as a weighted average of all possible values that X can take, with weights corresponding to the probabilities of those values occurring.
E(X) = \mu = \sum_{x} x \cdot f(x)
In the formula above, each value is weighted by the probability that it occurs. The expectation is essentially a weighted average.
This measure provides a central location around which the values of the random variable are
distributed, offering a snapshot of where the outcomes are expected to center. In essence, the
expected value serves as a key indicator of the 'center' of the distribution of a random variable,
helping to make informed predictions and decisions based on the average outcome.
Thus, it is possible to calculate the expected value for both discrete and continuous random variables.
For a discrete random variable, the expected value is the sum of each possible value multiplied by its probability. Let a discrete random variable X assume the values X₁, X₂, …, Xₙ with the probabilities P(X₁), P(X₂), …, P(Xₙ), respectively. Then the expected value of X, denoted E(X), is defined as:

E(X) = \sum_{i=1}^{n} X_i \, P(X_i)
For a continuous random variable, it is the integral of the variable multiplied by its probability
density function. Let X be a continuous random variable assuming the values in the interval (a, b)
such that:

\int_a^b f(x)\,dx = 1, \quad \text{then} \quad E(X) = \int_a^b x\, f(x)\,dx
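Both definitions can be evaluated numerically. The sketch below uses illustrative values; the uniform density on (0, 4) is a hypothetical choice:

```python
from scipy import integrate

# Discrete case: E(X) = sum of x * P(x)
values = [0, 1, 2, 3]
probs  = [1/8, 3/8, 3/8, 1/8]     # number of heads in 3 coin tosses
e_discrete = sum(x * p for x, p in zip(values, probs))
print(e_discrete)                 # 1.5

# Continuous case: hypothetical uniform density f(x) = 1/(b - a) on (a, b)
a, b = 0.0, 4.0
f = lambda x: 1.0 / (b - a)       # integrates to 1 over (a, b)
e_continuous, _ = integrate.quad(lambda x: x * f(x), a, b)
print(e_continuous)               # 2.0, i.e. (a + b) / 2
```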
2.4.3. Some properties of expected values
For a discrete random variable, the variance is calculated by summing, over all values of the random variable, the product of the squared difference between the value and the expected value and the probability associated with that value:

Var(X) = \sigma^2 = \sum_{x} (x - \mu)^2 \, P(X = x), \quad \text{where } \mu = E(X)
Example: Consider the experiment of tossing a coin three times. Let X be the number of heads.
Construct the probability distribution of X.
Solution: The sample space consists of 2³ = 8 equally likely outcomes, and X can take the values 0, 1, 2, 3. Counting the number of outcomes for each value gives:
X=x 0 1 2 3
P (X = x) 1/8 3/8 3/8 1/8
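A short Python sketch (assuming a fair coin) reproduces this table by enumerating all outcomes, and also evaluates the expected value and variance defined above:

```python
from itertools import product
from fractions import Fraction
from collections import Counter

# Enumerate all 2^3 = 8 equally likely outcomes of three tosses
outcomes = list(product("HT", repeat=3))
counts = Counter(seq.count("H") for seq in outcomes)

# P(X = x) for x = 0, 1, 2, 3
dist = {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}
print(dist)      # {0: 1/8, 1: 3/8, 2: 3/8, 3: 1/8}

# E(X) and Var(X) = sum over x of (x - mu)^2 * P(x)
mu = sum(x * p for x, p in dist.items())
var = sum((x - mu) ** 2 * p for x, p in dist.items())
print(mu, var)   # 3/2 3/4
```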
A probability distribution is denoted by P for a discrete and by f for a continuous random variable, and it must satisfy the following two conditions:
1. P(x) ≥ 0 if X is discrete; f(x) ≥ 0 if X is continuous.
2. ∑ₓ P(X = x) = 1 if X is discrete; ∫ f(x) dx = 1 if X is continuous.
2.5. Discrete probability distributions
Discrete probability distributions describe the probabilities of outcomes for discrete random variables. Here is a brief overview of three important discrete distributions: the binomial, the hypergeometric, and the Poisson.
To start, the “binomial” in binomial distribution means two terms: the number of successes and the number of attempts. Each is useless without the other. The binomial distribution is a common discrete distribution used in statistics, as opposed to a continuous distribution such as the normal distribution. This is because the binomial distribution only counts two states, typically represented as 1 (for a success) or 0 (for a failure), given a number of trials in the data.
The binomial distribution thus represents the probability of x successes in n trials, given a success probability p for each trial. It summarizes the number of successes across a fixed number of independent trials, each with the same probability of success, and so determines the probability of observing a specific number of successful outcomes in a specified number of trials.
Example: Flip a fair coin 4 times. What is the probability that I get two tails?
The first thing that we need to do is think about how many ways I can get 2 tails in 4 flips:
TTHH
THTH
THHT
HTTH
HTHT
HHTT
There are 6 ways. Each way to get 2 tails has probability p²(1 − p)² of occurring. Thus the probability of getting exactly 2 tails in 4 flips is

6 × (1/2)² × (1/2)² = 6/16 = 3/8 = 0.375
The Binomial PF:
f(x) = \frac{n!}{x!\,(n-x)!}\, p^x (1-p)^{n-x}
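The coin-flip answer above can be checked against this formula; here is a short sketch using Python's math.comb and SciPy's binomial distribution:

```python
from math import comb
from scipy.stats import binom

n, p = 4, 0.5

# Direct use of the binomial PF: f(2) = C(4, 2) * p^2 * (1 - p)^2
f2 = comb(n, 2) * p**2 * (1 - p)**2
print(f2)                   # 0.375

# Same value from scipy.stats
print(binom.pmf(2, n, p))   # 0.375
```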
Once you know that your distribution is binomial, you can apply the binomial distribution formula to calculate the probability.
The hypergeometric distribution, by contrast, describes the number of successes when a sample of n items is drawn without replacement from a finite population of N items containing r successes. Its probability function is:

f(x) = \frac{\binom{r}{x} \binom{N-r}{n-x}}{\binom{N}{n}}, \quad \text{for } 0 \le x \le r

Where: N is the population size, r is the number of successes in the population, n is the sample size, and x is the number of successes in the sample.
Insights into this formula: the numerator counts the ways of choosing x successes from the r successes and n − x failures from the N − r failures, while the denominator counts all possible samples of size n from the population.
Example: A two-person committee is to be formed from the following six people, of whom two are women (as assumed in the calculation below, where r = 2):
Nichols
Haveman
Holden
Reschovsky
Engel
Wolf
What is the probability of forming a two-person committee with at least one woman if committee members are chosen randomly? At least one means 1 or 2 female committee members; thus we must calculate the probability of one and then of 2 female committee members.
f(1) = \frac{\binom{r}{x}\binom{N-r}{n-x}}{\binom{N}{n}} = \frac{\binom{2}{1}\binom{4}{1}}{\binom{6}{2}} = \frac{8}{15}

f(2) = \frac{\binom{r}{x}\binom{N-r}{n-x}}{\binom{N}{n}} = \frac{\binom{2}{2}\binom{4}{0}}{\binom{6}{2}} = \frac{1}{15}
Probability of at least one female committee member = f(1) + f(2) = 8/15 + 1/15 = 9/15 = 3/5.
Note: we could also compute this probability as 1 − f(0).
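These values can be reproduced with SciPy's hypergeometric distribution (a sketch; hypergeom.pmf takes the count of interest, the population size, the number of successes in the population, and the sample size, in that order):

```python
from scipy.stats import hypergeom

N, r, n = 6, 2, 2   # population size, women in the population, committee size

f1 = hypergeom.pmf(1, N, r, n)   # exactly one woman
f2 = hypergeom.pmf(2, N, r, n)   # exactly two women
print(f1, f2)                    # 0.5333... (= 8/15) and 0.0666... (= 1/15)

# P(at least one woman), two equivalent ways
print(f1 + f2)                        # 0.6 = 3/5
print(1 - hypergeom.pmf(0, N, r, n))  # 0.6
```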
For a hypergeometric distribution with parameters N (the population size), K (the number of
successes in the population), and n (the number of draws), the mean and variance are as follows:
\text{Mean} = \frac{nK}{N}

This represents the average number of successes you would expect if you were to draw n items from the population.

\text{Variance} = \frac{nK(N-K)(N-n)}{N^2\,(N-1)}

This measures the dispersion or variability in the number of successes you would expect across different samples of size n drawn from the population.
In these formulas, N, K, and n are as defined above. These expressions provide a way to quantify the expected number of successes and the variability around that expectation when sampling without replacement.
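A quick numerical check of both formulas against SciPy, using the committee example's values (illustrative only):

```python
from scipy.stats import hypergeom

N, K, n = 6, 2, 2

mean_formula = n * K / N
var_formula = n * K * (N - K) * (N - n) / (N**2 * (N - 1))
print(mean_formula, var_formula)                 # 0.6667 0.3556

# SciPy's built-in moments agree
print(hypergeom.mean(N, K, n), hypergeom.var(N, K, n))
```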
The Poisson distribution models the number of events occurring in a fixed interval of time or space, given that these events occur with a known constant mean rate and independently of the time since the last event. This contrasts with Bernoulli trials, where we know the number of trials, the number of events occurring, and therefore the number of events not occurring. The Poisson distribution is therefore used to model the probability of a given number of events occurring within a fixed interval, under the assumption that these events occur independently of each other and at a constant average rate.
Suppose that events occur at random throughout an interval. Suppose further that the interval can
be divided into subintervals which are so small that:
(1) The probability of more than one event occurring in the subinterval is zero.
(2) The probability of one event occurring in a subinterval is proportional to the length of the
subinterval.
(3) An event occurring in any given subinterval is independent of events in any other subinterval.
If these conditions hold, the random experiment is known as a Poisson process.
The word ‘process’ is used to suggest that the experiment takes place over time, which is the usual
case.
Probability Mass Function (PMF): The probability of observing exactly x events in a given
interval is given by:
f(x) = \frac{\mu^x e^{-\mu}}{x!}, \quad x = 0, 1, 2, 3, \ldots
where μ = E(X) = the expected number of events occurring in an interval, and e ≈ 2.71828.
Example: An unemployed worker receives an average of two job offers per week. What is the chance that he receives no offers in a two-week period? Over two weeks the expected number of offers is μ = 2 × 2 = 4, so

f(0) = \frac{4^0 e^{-4}}{0!} = e^{-4} \approx 0.018
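Checking this with SciPy's Poisson distribution (a sketch; poisson.pmf(k, mu) gives P(X = k)):

```python
from math import exp
from scipy.stats import poisson

mu = 4   # expected offers over two weeks (2 per week * 2 weeks)

# Direct use of the PMF: f(0) = mu^0 * e^(-mu) / 0! = e^(-mu)
print(exp(-mu))             # 0.0183...

# Same value from scipy.stats
print(poisson.pmf(0, mu))   # 0.0183...
```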
Mean and Variance:
Both the mean and variance of a Poisson-distributed random variable are equal to 𝜇. This reflects
the fact that the average rate of occurrences also determines the variability around this mean.
Example applications:
Space: Number of particles hitting a detector in a given area over a period of time.
The Poisson distribution is particularly useful in scenarios where events are rare relative to the size of the interval and where the occurrence of one event does not affect the occurrence of another. The Poisson distribution depends only on the average number of occurrences per unit time or space. It is used as a distribution of rare events such as arrivals, accidents, numbers of misprints, hereditary defects, and natural disasters like earthquakes. The process that gives rise to such events is called a Poisson process.
2.6. Continuous probability distributions
The probability that X is in the interval [a, b] can be calculated by integrating the pdf of the random variable X. The probability that X takes on a value in the interval [a, b] is the area above this interval and under the graph of the density function:

P(a \le X \le b) = \int_a^b f(x)\,dx
For f(x) to describe a continuous random variable legitimately, the following conditions must hold:
(1) The probability that a continuous RV equals any specific value is always zero:

P(X = x) = 0 \quad \forall x

(2) The distribution is instead described by a probability density function f(x):

f(x) = \lim_{\epsilon \to 0} \frac{P(X \in [x, x + \epsilon])}{\epsilon}

(3) Probabilities can only be calculated for ranges of values, and are given by the area under the PDF curve:

P(a < X < b) = \int_a^b f(x)\,dx
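As an illustration of computing probabilities as areas (a sketch with a hypothetical density f(x) = 2x on (0, 1), which integrates to 1):

```python
from scipy import integrate

# Hypothetical pdf on (0, 1): f(x) = 2x
f = lambda x: 2 * x

total, _ = integrate.quad(f, 0, 1)
print(total)                     # 1.0 -> a legitimate pdf

# P(0.25 < X < 0.5) = area under f between 0.25 and 0.5
p, _ = integrate.quad(f, 0.25, 0.5)
print(p)                         # 0.1875 = 0.5^2 - 0.25^2
```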
2.6.1. The normal distribution
The normal distribution is a continuous distribution that is unimodal and symmetric, with a distinctive bell-shaped density function. Many variables are nearly normal, but very few are exactly normal. Its density function is:
f(x) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right]
The statement that X is normally distributed with parameters µ and σ² is often abbreviated X ~ N(µ, σ²). Clearly f(x; µ, σ) ≥ 0, but a somewhat complicated calculus argument must be used to verify that
\int_{-\infty}^{\infty} f(x; \mu, \sigma)\,dx = 1
Similarly, it can be shown that E(X) = µ and V(X) = σ², so the parameters µ and σ are the mean and the standard deviation of X.
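These facts can be verified numerically; the sketch below uses illustrative values µ = 10 and σ = 2:

```python
import numpy as np
from scipy import integrate

mu, sigma = 10.0, 2.0   # illustrative parameter values

def f(x):
    # Normal pdf with parameters mu and sigma
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

total, _ = integrate.quad(f, -np.inf, np.inf)
mean, _  = integrate.quad(lambda x: x * f(x), -np.inf, np.inf)
var, _   = integrate.quad(lambda x: (x - mu) ** 2 * f(x), -np.inf, np.inf)
print(total, mean, var)   # 1.0  10.0  4.0
```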
X ∼ N(µ, σ²),  E(X) = µ,  Var(X) = σ²,  SD(X) = σ
[Figure: graphs of f(x; µ, σ) for several different (µ, σ) pairs]
It is possible for observations to fall 4, 5, or 1000 standard deviations away from the mean, but
these occurrences are very rare.
2.6.1.1. Properties of normal distribution
1. It is bell shaped, is symmetrical about its mean, and is mesokurtic. The maximum ordinate is at x = µ and is given by:

f(\mu) = \frac{1}{\sigma \sqrt{2\pi}}
2. It is asymptotic to the horizontal axis, i.e., it extends indefinitely in either direction from the mean.
3. It is a continuous distribution.
4. It is a family of curves, i.e., every unique pair of mean and standard deviation defines a
different normal distribution. Thus, the normal distribution is completely described by two
parameters: mean and standard deviation.
5. Total area under the curve sums to 1, i.e., the area of the distribution on each side of the
mean is 0.5.
\int_{-\infty}^{\infty} f(x)\,dx = 1
Note: To facilitate the use of the normal distribution, the following distribution, known as the standard normal distribution, is derived by using the transformation:

Z = \frac{X - \mu}{\sigma}, \qquad f(z) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2} z^2}
The normal distribution with parameter values µ = 0 and σ = 1 is called the standard normal
distribution. A random variable with this distribution is called a standard normal random variable
and is denoted by Z. Its PDF is:
f(z; 0, 1) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}, \quad -\infty < z < \infty
The graph of f (z; 0, 1) is called the standard normal curve. Its inflection points are at 1 and –1.
The cdf of the standard normal is
\Phi(z) = \int_{-\infty}^{z} f(y; 0, 1)\,dy
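A short sketch showing the z-transformation and the standard normal cdf Φ(z) via SciPy (the parameter values are illustrative):

```python
from scipy.stats import norm

mu, sigma = 10.0, 2.0   # illustrative parameters
x = 13.0

# Standardize: Z = (X - mu) / sigma
z = (x - mu) / sigma
print(z)                        # 1.5

# P(X <= 13) equals Phi(z) for the standard normal
print(norm.cdf(z))              # 0.9331...
print(norm.cdf(x, mu, sigma))   # same value, without standardizing
```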
Summary
Probability and probability distributions are central to statistics, providing a framework for
analyzing random events and making predictions based on data. Probability measures the
likelihood of an event occurring, ranging from 0 (impossible) to 1 (certain), and can be calculated
as the ratio of favorable outcomes to total possible outcomes. Probability distributions, on the other
hand, describe how probabilities are allocated across different values of a random variable. These
distributions are categorized into discrete and continuous types, depending on whether the random
variable can take on countable or uncountable values. Discrete probability distributions, such as
the binomial, hypergeometric, and Poisson distributions, model scenarios with countable
outcomes. The binomial distribution calculates the probability of a given number of successes in
a fixed number of independent trials, while the hypergeometric distribution deals with sampling
without replacement. The Poisson distribution is used for modeling the number of occurrences of
rare events over a fixed interval. Continuous probability distributions, like the normal distribution,
handle uncountable outcomes and are defined by their probability density functions. The normal
distribution, with its bell-shaped curve, is particularly significant in statistics due to its wide
applicability and the central limit theorem, which states that the sum of many independent random
variables approaches a normal distribution as the number of variables increases.
References
Annis, D. H. (2005). Probability and Statistics: The Science of Uncertainty. The American Statistician, 59(3). https://doi.org/10.1198/tas.2005.s248
Ayyub, B. M., & McCuen, R. H. (2020). Fundamentals of Probability. In Probability, Statistics, and Reliability for Engineers and Scientists. https://doi.org/10.1201/b12161-7
Devore, J. L. (2010). Probability and Statistics for Engineering and the Sciences.
Meester, L. E., Dekking, F. M., & Lopuhaä, H. P. (2005). A Modern Introduction to Probability and Statistics: Understanding Why and How.
Mendenhall, W. (2006). Introduction to Probability and Statistics.