Lecture 4: Probability and Distribution
CHAPTER 3: DESCRIPTIVE STATISTICS (RECALL)
[email protected] 2
CHAPTER 4: PROBABILITY AND DISTRIBUTION
[email protected] 3
4.1. INTRODUCTION TO PROBABILITY
• Random experiments:
A random experiment is a process that generates well-defined experimental outcomes. On any single repetition or trial, the outcome that occurs is determined completely by chance.
By specifying all the possible experimental outcomes, we identify the sample space for a random experiment: the sample space is the set of all experimental outcomes.
An experimental outcome is also called a sample point, to identify it as an element of the sample space.
[email protected] 4
4.1. INTRODUCTION TO PROBABILITY
•Random experiments:
[email protected] 5
• Assigning probabilities:
The three approaches most frequently used are the classical, relative frequency, and subjective methods. Regardless of the method used, two basic requirements for assigning probabilities must be met:
1. The probability assigned to each experimental outcome Ei must be between 0 and 1: 0 ≤ P(Ei) ≤ 1.
2. The sum of the probabilities for all the experimental outcomes must equal 1: P(E1) + P(E2) + … + P(En) = 1.
The classical method of assigning probabilities is appropriate when all the experimental outcomes are equally likely: if n experimental outcomes are possible, a probability of 1/n is assigned to each experimental outcome. When using this approach, the two basic requirements are automatically satisfied.
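A minimal sketch of the classical method in Python; the single-die experiment is our own illustration, not from the slides:

```python
from fractions import Fraction

# Classical method: n equally likely outcomes, each assigned probability 1/n.
# Random experiment: roll a single fair die.
sample_space = [1, 2, 3, 4, 5, 6]
n = len(sample_space)
p_outcome = Fraction(1, n)

# The probability of an event is the sum over its sample points,
# e.g. the event "roll an even number".
event = [x for x in sample_space if x % 2 == 0]
p_event = p_outcome * len(event)

print(p_outcome)  # 1/6
print(p_event)    # 1/2
```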
[email protected] 7
The relative frequency method of assigning probabilities is appropriate when data are available to estimate the proportion of the time the experimental outcome will occur if the experiment is repeated a large number of times. When using this approach, the two basic requirements are likewise automatically satisfied.
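A minimal sketch of the relative frequency method; the waiting-customer counts below are hypothetical:

```python
# Relative frequency method: estimate probabilities from observed data.
# Hypothetical data: number of days on which x customers were waiting (20 days).
counts = {0: 2, 1: 5, 2: 6, 3: 4, 4: 3}

total = sum(counts.values())
probabilities = {x: freq / total for x, freq in counts.items()}

# The two basic requirements hold by construction:
assert all(0 <= p <= 1 for p in probabilities.values())
assert abs(sum(probabilities.values()) - 1) < 1e-9

print(probabilities)  # e.g. {0: 0.1, 1: 0.25, 2: 0.3, 3: 0.2, 4: 0.15}
```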
The subjective method of assigning probabilities is most appropriate when one cannot realistically assume that the experimental outcomes are equally likely and when little relevant data are available.
[email protected] 9
4.1. INTRODUCTION TO PROBABILITY
[email protected] 10
• Events and their probabilities:
An event is a collection of sample points (e.g., "less than 10", "equal to 10", "more than 10"). The probability of an event is the sum of the probabilities of the sample points in the event.
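A short sketch of computing an event's probability from its sample points; the one-die experiment and the event "less than 3" are our own illustration:

```python
from fractions import Fraction

# An event is a collection of sample points; its probability is the sum of
# the probabilities of those points. Experiment: roll one fair die,
# event: "less than 3".
sample_space = {x: Fraction(1, 6) for x in range(1, 7)}
p_event = sum(p for x, p in sample_space.items() if x < 3)
print(p_event)  # 1/3
```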
• Multiplication Law: is used to compute the probability of the intersection of two events. The multiplication law is based on the definition of conditional probability:
P(A ∩ B) = P(B) * P(A | B) = P(A) * P(B | A)
• Independent Events: two events A and B are independent if P(A | B) = P(A), or equivalently P(B | A) = P(B). For independent events the multiplication law reduces to P(A ∩ B) = P(A) * P(B).
• Bayes' theorem:
Two-event case (A1 and A2 mutually exclusive and exhaustive):
P(A1 | B) = P(A1) * P(B | A1) / [P(A1) * P(B | A1) + P(A2) * P(B | A2)]
4.2. DISCRETE PROBABILITY DISTRIBUTIONS
• Random variables
• Developing discrete probability distributions
• Bivariate distributions, covariance, and financial portfolios
• Binomial probability distribution
• Poisson probability distribution
• Hypergeometric probability distribution
• Random variables:
A random variable is a numerical description of the outcome of an experiment. A discrete random variable may assume either a finite number of values or an infinite sequence of values; a continuous random variable may assume any numerical value in an interval or collection of intervals.
• Developing discrete probability distributions:
- The probability distribution for a random variable describes how probabilities are distributed over the values of the random variable.
- For a discrete random variable x, a probability function, denoted by f(x), provides the probability for each value of the random variable.
- The relative frequency method of assigning probabilities to values of a random variable is applicable when reasonably large amounts of data are available. We then treat the data as if they were the population and use the relative frequency method to assign probabilities to the experimental outcomes.
- The use of the relative frequency method to develop discrete probability distributions leads to what is called an empirical discrete distribution.
A primary advantage of defining a random variable and its probability distribution is that once the probability distribution is known, it is relatively easy to determine the probability of a variety of events that may be of interest to a decision maker.
In the development of a probability function for any discrete random variable, the following two conditions must be satisfied:
f(x) ≥ 0 and ∑ f(x) = 1
• Expected value: the expected value, or mean, of a random variable is a measure of the central location for the random variable. For a discrete random variable x:
E(x) = µ = ∑ x * f(x)
Recall (Lecture 3): sample variance s² = ∑(xᵢ − x̄)² / (n − 1), in which x̄ is the mean (or weighted mean):
x̄ = ∑(xᵢ * wᵢ) / ∑ wᵢ
• Variance: summarizes the variability in the values of a random variable. For a discrete random variable x:
Var(x) = σ² = ∑(x − µ)² * f(x)
Recall (Lecture 3): Variance = 256/4 = 64
• Bivariate distributions and linear combinations: for a linear combination ax + by of two random variables x and y,
E(ax + by) = a * E(x) + b * E(y)
in which:
a: coefficient of x
b: coefficient of y
Variance:
Var(ax + by) = a² * Var(x) + b² * Var(y) + 2ab * Cov(x, y)
where Cov(x, y) is the covariance of x and y.
• Binomial probability distribution: describes the number of successes x in n identical, independent trials, each with only two possible outcomes, where the probability of success p does not change from trial to trial:
f(x) = C(n, x) * p^x * (1 − p)^(n − x), with E(x) = n*p and Var(x) = n*p*(1 − p)
• Poisson probability distribution: useful in estimating the number of occurrences over a specified interval of time or space (e.g., the number of arrivals at a car wash in one hour, the number of repairs needed in 10 miles of highway, or the number of leaks in 100 miles of pipeline):
f(x) = (µ^x * e^(−µ)) / x!, where µ is the expected number of occurrences in an interval
4.3. CONTINUOUS PROBABILITY DISTRIBUTIONS
• A fundamental difference separates discrete and continuous random variables in terms of how probabilities are computed:
- For a discrete random variable, the probability function f(x) provides the probability that the random variable assumes a particular value.
- With continuous random variables, the counterpart of the probability function is the probability density function, also denoted by f(x).
→ The difference is that the probability density function does not directly provide probabilities.
→ The area under the graph of f(x) corresponding to a given interval does provide the probability that the continuous random variable x assumes a value in that interval.
• Normal probability distribution: characteristics
- The entire family of normal distributions is differentiated by two parameters: the mean µ and the standard deviation σ.
- The highest point on the normal curve is at the mean, which is also the median and mode of the distribution.
- The mean of the distribution can be any numerical value: negative, zero, or positive. (The original slide illustrated three normal curves with the same standard deviation but different means: −10, 0, and 20.)
- The normal distribution is symmetric: the shape of the normal curve to the left of the mean is a mirror image of the shape to the right of the mean. The tails of the normal curve extend to infinity in both directions and theoretically never touch the horizontal axis. Because it is symmetric, the normal distribution is not skewed; its skewness measure is zero.
- The standard deviation determines how flat and wide the normal curve is. Larger values of the standard deviation result in wider, flatter curves, showing more variability in the data. (The original slide illustrated two normal curves with the same mean but different standard deviations.)
- Probabilities for the normal random variable are given by areas under the normal curve. The total area under the curve for the normal distribution is 1. Because the distribution is symmetric, the area under the curve to the left of the mean is 0.5 and the area to the right of the mean is 0.5.
- The percentages of values in some commonly used intervals are: about 68.3% of the values lie within one standard deviation of the mean, about 95.4% within two standard deviations, and about 99.7% within three standard deviations.
• Standard normal distribution: any normal random variable x can be converted to the standard normal random variable z = (x − µ)/σ
→ we can interpret z as the number of standard deviations that the normal random variable x is from its mean µ.
→ use the standard normal table to look up the probability.
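A minimal sketch that computes normal probabilities without a printed table, via Python's built-in error function; µ = 10 and σ = 2 are assumed example values:

```python
from math import erf, sqrt

def normal_cdf(x, mu=0.0, sigma=1.0):
    # P(X <= x) for X ~ N(mu, sigma), via the error function.
    z = (x - mu) / sigma  # standard deviations from the mean
    return 0.5 * (1 + erf(z / sqrt(2)))

# Illustrative: X ~ N(10, 2); P(X <= 13), i.e. z = 1.5.
print(round(normal_cdf(13, mu=10, sigma=2), 4))  # 0.9332
```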
• Standard normal probability tables: cumulative probabilities for z < 0 and for z > 0 (tables omitted here; use any standard normal table or the normal_cdf sketch above).
• Normal approximation of binomial probabilities:
In cases where n*p ≥ 5 and n*(1 − p) ≥ 5, set µ = n*p and σ = √(n*p*(1 − p))
→ treat the binomial random variable as approximately normal with these parameters, adding a continuity correction of ±0.5 because a continuous distribution is used to approximate a discrete one.
4.4. SAMPLING AND SAMPLING DISTRIBUTIONS (SELF-READING)
• The sampled population is the population from which the sample is drawn.
• The target population is the population we want to make inferences about.
• A frame is a list of the elements that the sample will be selected from.
Topics covered:
- The sampling problems
- Selecting a sample
- Point estimation
- Introduction to sampling distributions
- Sampling distribution of x̄
- Sampling distribution of p̄
- Properties of point estimators
- Other sampling methods
• Sampling problems:
- Sampling errors
- Lack of sample representativeness
- Difficulty in estimating the sample size
- Lack of knowledge about the sampling process
- Lack of resources
- Lack of cooperation
- Lack of existing appropriate sampling frames for larger populations
• Selecting a sample:
Sampling from a finite population: a simple random sample of size n from a finite population of size N is a sample selected such that each possible sample of size n has the same probability of being selected.
One procedure for selecting a simple random sample from a finite population is to use a table of random numbers to choose the elements for the sample one at a time in such a way that, at each step, each of the elements remaining in the population has the same probability of being selected. Sampling n elements in this way satisfies the definition of a simple random sample from a finite population.
Two variants: sampling without replacement (each element can be selected at most once) and sampling with replacement (an element can be selected more than once).
• Selecting a sample (infinite population): when we cannot develop a list of all the elements that could be produced, the population is considered infinite.
• Point estimation:
To estimate the value of a population parameter, we compute a corresponding characteristic of the sample, referred to as a sample statistic. By making the preceding computations, we perform the statistical procedure called point estimation.
We refer to:
- the sample mean x̄ as the point estimator of the population mean µ,
- the sample standard deviation s as the point estimator of the population standard deviation σ, and
- the sample proportion p̄ as the point estimator of the population proportion p.
The numerical value obtained for x̄, s, or p̄ is called the point estimate.
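A short sketch computing the three point estimates from one hypothetical sample (the salary values and the "> 50" proportion are our own illustration):

```python
from statistics import mean, stdev

# Hypothetical sample (e.g. annual salaries, in $1000s).
sample = [49.0, 52.5, 51.0, 48.5, 53.0, 50.5, 49.5, 51.5]

x_bar = mean(sample)   # point estimate of the population mean mu
s = stdev(sample)      # point estimate of sigma (uses the n - 1 divisor)
p_bar = sum(x > 50 for x in sample) / len(sample)  # estimate of a proportion p

print(x_bar, round(s, 3), p_bar)
```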
• Standard deviation of x̄ (also called the standard error of the mean):
σx̄ = σ/√n for an infinite population; for a finite population, multiply by the finite population correction factor √((N − n)/(N − 1)).
• Standard deviation of p̄ (the standard error of the proportion):
σp̄ = √(p*(1 − p)/n) for an infinite population; for a finite population, multiply by the same finite population correction factor √((N − n)/(N − 1)).
• Properties of point estimators:
Efficiency: the point estimator with the smaller standard error is said to have greater relative efficiency than the other.
Consistency: a point estimator is consistent if its values tend to become closer to the population parameter as the sample size becomes larger. In other words, a large sample size tends to provide a better point estimate than a small sample size.