Notes 3
Heads Tails
.5 .5
Common probability distributions include the binomial distribution, Poisson distribution, and uniform
distribution. Certain types of probability distributions are used in hypothesis testing, including
the standard normal distribution, the F distribution, and Student’s t distribution.
The farmer can get a rough idea of the probability of different egg sizes directly from this frequency
distribution. For example, she can see that there’s a high probability of an egg being around 1.9 oz., and
there’s a low probability of an egg being bigger than 2.1 oz.
Suppose the farmer wants more precise probability estimates. One option is to improve her estimates by
weighing many more eggs.
A better option is to recognize that egg size appears to follow a common probability distribution called
a normal distribution. The farmer can make an idealized version of the egg weight distribution by
assuming the weights are normally distributed:
Since normal distributions are well understood by statisticians, the farmer can calculate precise
probability estimates, even with a relatively small sample size.
Variables that follow a probability distribution are called random variables. There’s special notation
you can use to say that a random variable follows a specific distribution:
For example, the following notation means “the random variable X follows a normal distribution with a
mean of µ and a variance of σ²”:
X ~ N(µ, σ²)
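A practical detail when moving from notation of this kind to software: the notation gives the variance σ², whereas many libraries ask for the standard deviation σ. The sketch below uses Python's standard-library statistics.NormalDist, which takes the mean and the standard deviation. The numeric values are hypothetical and are not taken from these notes.

```python
import math
from statistics import NormalDist

# Hypothetical parameters for X ~ N(mu, sigma^2); not taken from the notes.
mu = 1.9          # mean
variance = 0.01   # sigma squared

# NormalDist expects the standard deviation, so take the square root first.
X = NormalDist(mu=mu, sigma=math.sqrt(variance))

print(X.mean)                # the mean parameter, mu
print(round(X.variance, 4))  # recovers the variance, sigma squared

# Draw five values of the random variable (seed fixed for repeatability).
draws = X.samples(5, seed=1)
print(draws)
```

Passing the variance where the standard deviation is expected is an easy mistake to make, so it helps to name the two quantities separately as above.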
Probability tables
A probability table represents the discrete probability distribution of a categorical variable. Probability
tables can also represent a discrete variable with only a few possible values or a continuous variable
that’s been grouped into class intervals.
A probability table is composed of two columns: the values (or class intervals) of the variable, and
their probabilities.
Example: Probability table
A robot greets people using a random greeting. The probability distribution of
the greetings is described by the following probability table:
Greeting Probability
“Greetings, human!” .6
“Hi!” .1
“Howdy!” .1
Notice that all the probabilities are greater than zero and that they sum to one.
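The two properties just mentioned (every probability greater than zero, and a total of one) can be checked mechanically. The following is a minimal Python sketch that uses the coin-flip table from the start of these notes. The function name is hypothetical, and the sampling step is an added illustration of how such a table can drive a random choice.

```python
import math
import random

def is_valid_probability_table(table):
    """Return True if every probability is > 0 and the probabilities sum to 1."""
    probabilities = list(table.values())
    all_positive = all(p > 0 for p in probabilities)
    sums_to_one = math.isclose(sum(probabilities), 1.0)
    return all_positive and sums_to_one

# Coin-flip table: value -> probability.
coin = {"Heads": 0.5, "Tails": 0.5}
print(is_valid_probability_table(coin))  # True

# Draw ten outcomes according to the table (seed fixed for repeatability).
rng = random.Random(0)
flips = rng.choices(list(coin), weights=list(coin.values()), k=10)
print(flips)
```

The sum is compared with math.isclose rather than == because floating-point probabilities rarely add up to exactly 1.0.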
Notice that the variable can only have certain values, which are represented by closed circles. You can
have two sweaters or 10 sweaters, but you can’t have 3.8 sweaters.
The probability that a person owns zero sweaters is .05, the probability that they own one sweater is .15,
and so on. If you add together the probabilities for every
possible number of sweaters a person can own, the total will equal exactly 1.
The probability of an egg being exactly 2 oz. is zero. Although an egg can weigh very close to 2 oz., it is
extremely improbable that it will weigh exactly 2 oz. Even if a regular scale measured an egg’s weight
as being 2 oz., an infinitely precise scale would find a tiny difference between the egg’s weight and 2 oz.
The probability that an egg is within a certain weight interval, such as between 1.98 and 2.04 oz., is greater than
zero and can be represented in the graph of the probability density function as a shaded region:
The shaded region has an area of .09, meaning that there’s a probability of .09 that an egg will weigh
between 1.98 and 2.04 oz. The area was calculated using statistical software.
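As a rough illustration of how software computes such an area, the sketch below subtracts two values of the cumulative distribution function. It assumes a normal distribution with a hypothetical mean of 1.9 oz. and a hypothetical standard deviation of 0.1 oz. The notes do not state the parameters behind the .09 figure, so the result here is not expected to match it.

```python
from statistics import NormalDist

# Hypothetical egg-weight distribution in ounces (assumed values).
egg_weight = NormalDist(mu=1.9, sigma=0.1)

def interval_probability(dist, low, high):
    """P(low <= X <= high): the area under the density between low and high."""
    return dist.cdf(high) - dist.cdf(low)

# Probability of a weight between 1.98 and 2.04 oz. under the assumed parameters.
p_interval = interval_probability(egg_weight, 1.98, 2.04)
print(round(p_interval, 3))

# An interval of zero width has zero area, so P(X = 2 oz.) is 0.
p_exact = interval_probability(egg_weight, 2.0, 2.0)
print(p_exact)  # 0.0
```

The second call mirrors the point made above: for a continuous variable, only intervals have nonzero probability.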
Common continuous probability distributions
Distribution: Normal distribution
Description: Describes data with values that become less probable the farther they are from the mean, with a bell-shaped probability density function.
Example: SAT scores

Distribution: Continuous uniform
Description: Describes data for which equal-sized intervals have equal probability.
Example: The amount of time cars wait at a red light

Distribution: Log-normal
Description: Describes right-skewed data. It’s the probability distribution of a random variable whose logarithm is normally distributed.
Example: The average body weight of different mammal species

Distribution: Exponential
Description: Describes data that has higher probabilities for small values than large values. It’s the probability distribution of time between independent events.
Example: Time between earthquakes
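The descriptions above can be checked informally by simulation. The sketch below draws samples from each of the four distributions with Python's standard-library random module. All parameter values are hypothetical and chosen only for illustration.

```python
import math
import random
from statistics import mean, median

rng = random.Random(7)  # fixed seed so the run is repeatable
n = 50_000

# All parameter values below are hypothetical.
normal = [rng.gauss(100, 15) for _ in range(n)]           # mean 100, sd 15
uniform = [rng.uniform(0, 60) for _ in range(n)]          # anywhere from 0 to 60
lognormal = [rng.lognormvariate(0, 1) for _ in range(n)]  # log(X) ~ N(0, 1)
exponential = [rng.expovariate(0.5) for _ in range(n)]    # rate 0.5, mean 1/0.5

# Normal: symmetric, so the mean and median nearly coincide.
print(mean(normal), median(normal))

# Continuous uniform: equal-sized intervals hold about the same share of values.
share_0_20 = sum(0 <= x < 20 for x in uniform) / n
share_40_60 = sum(40 <= x < 60 for x in uniform) / n
print(share_0_20, share_40_60)

# Log-normal: right-skewed (mean above median); the logs look normal around 0.
print(mean(lognormal), median(lognormal))
mean_of_logs = mean(math.log(x) for x in lognormal)
print(mean_of_logs)

# Exponential: small values are more common than large ones.
share_below_mean = sum(x < 2 for x in exponential) / n
print(share_below_mean)
```

Exact numbers depend on the seed, so only the qualitative pattern matters: similar shares for the uniform intervals, a mean above the median for the log-normal, and more than half of the exponential values below the mean.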
If you have a formula describing the distribution, such as a probability density function, the
expected value is usually given by the µ parameter. If there’s no µ parameter, the expected value
can be calculated from the other parameters using equations that are specific to each distribution.
If you have a sample, then the mean of the sample is an estimate of the expected value of the
population’s probability distribution. The larger the sample size, the better the estimate will be.
If you have a probability table, you can calculate the expected value by multiplying each
possible outcome by its probability, and then summing these values.
Example: Expected value
American robins lay between two and four eggs in their nests. Imagine that this
probability table describes the probability distribution of the number of robin eggs per nest:
Eggs Probability
2 0.2
3 0.5
4 0.3
What is the expected value of robin eggs per nest?
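Multiply each possible outcome by its probability:
Eggs Probability Eggs * Probability
2 0.2 2 * 0.2 = 0.4
3 0.5 3 * 0.5 = 1.5
4 0.3 4 * 0.3 = 1.2
Summing the products gives the expected value: 0.4 + 1.5 + 1.2 = 3.1 eggs per nest.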
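The same calculation can be written as a short Python sketch. The function name is hypothetical. The table values are the ones given in the robin example.

```python
# Probability table from the robin example: eggs per nest -> probability.
robin_eggs = {2: 0.2, 3: 0.5, 4: 0.3}

def expected_value(table):
    """Sum of outcome * probability over every row of a probability table."""
    return sum(outcome * probability for outcome, probability in table.items())

ev = expected_value(robin_eggs)
print(round(ev, 2))  # 3.1
```

The result is rounded for display because floating-point arithmetic can leave a tiny error in the last digits.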
If you have a formula describing the distribution, such as a probability density function, the
standard deviation is sometimes given by the σ parameter. If there’s no σ parameter, the standard
deviation can often be calculated from other parameters using formulas that are specific to each
distribution.
If you have a sample, the standard deviation of the sample is an estimate of the standard
deviation of the population’s probability distribution. The larger the sample size, the better the
estimate will be.
If you have a probability table, you can calculate the standard deviation by calculating the
deviation between each value and the expected value, squaring it, multiplying it by its
probability, and then summing the values and taking the square root.
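The probability-table procedure can be sketched in Python using the robin-egg table from the expected value example. The function names are hypothetical. The simulated sample at the end is an added illustration of a sample mean and sample standard deviation approximating the values computed from the table.

```python
import math
import random
from statistics import mean, stdev

# Probability table from the robin example: eggs per nest -> probability.
robin_eggs = {2: 0.2, 3: 0.5, 4: 0.3}

def expected_value(table):
    """Sum of value * probability over the table."""
    return sum(x * p for x, p in table.items())

def standard_deviation(table):
    """Square root of the probability-weighted squared deviations."""
    mu = expected_value(table)
    variance = sum(p * (x - mu) ** 2 for x, p in table.items())
    return math.sqrt(variance)

sd = standard_deviation(robin_eggs)
print(round(sd, 2))  # 0.7

# Simulate many nests and compare the sample statistics with the table values.
rng = random.Random(3)  # fixed seed so the run is repeatable
nests = rng.choices(list(robin_eggs), weights=list(robin_eggs.values()), k=100_000)
print(mean(nests), stdev(nests))
```

For this table the squared deviations weighted by probability are 0.242, 0.005, and 0.243, which sum to a variance of 0.49 and give a standard deviation of 0.7.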
TULSIRAMJI GAIKWAD-PATIL College of Engineering and Technology
Wardha Road, Nagpur - 441108
Accredited with NAAC A+ Grade
Approved by AICTE, New Delhi, Govt. of Maharashtra
(An Autonomous Institution Affiliated to RTM Nagpur University, Nagpur)
--------------------------------------------------------------------------------------------------------------------------------------
Date: 16/07/2022
It is indeed our pleasure to cordially invite you to the Guest Lecture at the M.B.A. Department,
scheduled in OFFLINE mode on 16.07.2022 (Saturday) at 03.00 pm.
HOD, M.B.A.