Prob and Stat Notes

The document discusses data and statistics concepts including data types, data sources, and ways to summarize qualitative data. It provides examples and definitions of key terms such as data, variables, observations, experimental and observational studies. Methods for summarizing qualitative data like frequency distributions, relative frequencies, pie charts and bar graphs are also explained.

Uploaded by

anor27299

Probability and Statistics

401237
Chapter 1: Data and Statistics

Dr. Ayat Al-Meanazel

Al Albayt University

Dr. Ayat Al-Meanazel (A.A.U) 1 / 190


1.2 Data - Definitions

Statistics is a branch of science which deals with collecting, organizing,
analyzing, interpreting, summarizing and presenting data.

Data are the facts and figures collected, analyzed, and summarized for
presentation and interpretation.

Data set is all the data collected in a particular study.

Elements are the entities on which data are collected.

Variable is a characteristic of interest for the elements.

Observation is the set of measurements obtained for a particular element.



1.2 Data - Example

Name     Age  Height (cm)  Weight (kg)  Uni. Level  City    Major  Smoke
Ahamd    21   165          85           Third       Irbid   Math.  Yes
Noor     23   145          73           Fourth      Mafraq  Phys.  No
Kareem   24   172          94           Second      Mafraq  Math.  No
Saleh    20   169          66           Second      Zarqa   Chem.  Yes
Salma    22   159          49           Third       Ajloun  I.T.   No
Asma     19   141          78           First       Amman   Pio.   No
Ali      21   174          101          Second      Irbid   I.T.   Yes
Samer    27   177          89           Fourth      Irbid   Math.  Yes
Islam    20   147          53           Third       Mafraq  Math.  No
Nawal    24   161          71           Fourth      Zarqa   Phys.  No
Ibrahim  25   169          79           Fifth       Mafraq  I.T.   Yes
Amal     23   174          70           Second      Jarash  Chem.  No
Table: Data set for 12 students from Al Albayt University
1.2 Data - Quantitative Data and Qualitative Data

Data can be classified as

1 Quantitative data, which take numerical values for which arithmetic
operations such as adding and averaging make sense. For example: height,
age, blood cholesterol level, temperature, selling price in JD and IQ.

2 Qualitative data, which are recorded as labels or categories.
For example: blood type (A, B, AB, O), hair color, gender and email.
Based on the scale of measurement, qualitative data can be one of the
following:
Nominal: simply a label or name; the categories cannot be ordered in
any sense. For example: eye color, gender, major and nationality.

Ordinal: the categories follow a natural order.
For example: education level, rating scale and medals (gold, silver,
bronze).



1.3 Data Sources

Data can be obtained from:


Existing Sources
In some cases, data needed for a particular application already exist.
For example, companies maintain a variety of databases about their
employees, customers and business operations.

Statistical Studies
Sometimes the data needed for a particular application are not
available through existing sources. In such cases, the data can often be
obtained by conducting a statistical study either experimental or
observational.



1.3 Data Sources - Experimental Study

Experimental study: in an experimental study, the variable of interest is
first identified. Then one or more other variables are identified and
controlled so that data can be obtained about how they influence the
variable of interest.

For example, a researcher might be interested in conducting an
experiment to learn about how a new drug affects blood pressure. Blood
pressure is the variable of interest in the study. The dosage level of the new
drug is another variable that is hoped to have a causal effect on blood
pressure. To obtain data about the effect of the new drug, researchers
select a sample of individuals. The dosage level of the new drug is
controlled, as different groups of individuals are given different dosage levels.
Before and after data on blood pressure are collected for each group.
Statistical analysis of the experimental data can help determine how the
new drug affects blood pressure.



1.3 Data Sources - Observational Study

Observational study: an observational study makes no attempt to control the
variables of interest. A survey is perhaps the most common type of
observational study.

For example, in a personal interview survey, research questions are first
identified. Then a questionnaire is designed and administered to a sample of
individuals. Some restaurants use observational studies to obtain data
about their customers’ opinions of the quality of food, service, atmosphere,
and so on.



2.1 Summarizing Qualitative Data
There are different ways to summarize qualitative data such as:
Frequency Distribution.
A frequency distribution is a tabular summary of data showing the
number (frequency) of items in each of several non-overlapping classes.

Relative Frequency and Percent Frequency Distributions.
For a data set with n observations, the relative frequency of each class
can be determined as follows:
$$\text{Relative frequency of a class} = \frac{\text{frequency of the class}}{n}$$
A relative frequency distribution gives a tabular summary of data
showing the relative frequency for each class. A percent frequency
distribution summarizes the percent frequency of the data for each
class, where the percent frequency is the relative frequency multiplied
by 100.
2.1 Summarizing Qualitative Data

Bar Graphs
A bar graph, or bar chart, is a graphical device for depicting qualitative
data summarized in a frequency, relative frequency, or percent
frequency distribution. On one axis of the graph (usually the horizontal
axis), we specify the labels that are used for the classes (categories). A
frequency, relative frequency, or percent frequency scale can be used
for the other axis of the graph (usually the vertical axis). Then, using a
bar of fixed width drawn above each class label, we extend the length
of the bar until we reach the frequency, relative frequency, or percent
frequency of the class. For qualitative data, the bars should be
separated to emphasize the fact that each class is separate.



2.1 Summarizing Qualitative Data

Pie Chart
The pie chart provides another graphical device for presenting relative
frequency and percent frequency distributions for qualitative data. To
construct a pie chart, we first draw a circle to represent all of the data.
Then we use the relative frequencies to subdivide the circle into sectors,
or parts, that correspond to the relative frequency for each class. Such
that

$$\text{Angle of the sector for a class} = (\text{relative frequency})(360^{\circ})$$



1.2 Data - Example

Example: Assume that the data set below shows the soft drink selected in a
sample of 50 soft drink purchases. {Coke Classic, Sprite, Pepsi, Diet Coke,
Coke Classic, Coke Classic, Pepsi, Diet Coke, Coke Classic, Diet Coke, Coke
Classic, Coke Classic, Coke Classic, Diet Coke, Pepsi, Coke Classic, Coke
Classic, Dr. Pepper, Dr. Pepper, Sprite, Coke Classic, Diet Coke, Pepsi,
Diet Coke, Pepsi, Coke Classic, Pepsi, Pepsi, Coke Classic, Pepsi, Coke
Classic, Coke Classic, Pepsi, Dr. Pepper, Pepsi, Pepsi, Sprite, Coke Classic,
Coke Classic, Coke Classic, Sprite, Dr. Pepper, Diet Coke, Dr. Pepper,
Pepsi, Coke Classic, Pepsi, Sprite, Coke Classic, Diet Coke}



1.2 Data - Example Continued

Frequency Distribution of Soft Drink

Soft Drink     Frequency
Coke Classic   19
Diet Coke      8
Dr. Pepper     5
Pepsi          13
Sprite         5
Total          50



1.2 Data - Example Continued

Relative and Percent Frequency Distribution of Soft Drink

Soft Drink     Relative Frequency   Percent Frequency
Coke Classic   19/50 = 0.38         (0.38)(100) = 38 %
Diet Coke      8/50 = 0.16          (0.16)(100) = 16 %
Dr. Pepper     5/50 = 0.10          (0.10)(100) = 10 %
Pepsi          13/50 = 0.26         (0.26)(100) = 26 %
Sprite         5/50 = 0.10          (0.10)(100) = 10 %
Total          1                    100 %
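
As a cross-check of the table, the relative frequency, percent frequency and pie chart sector angle of each class can be computed from the frequencies. The following is a minimal sketch; the function name `summarize` and the dictionary layout are illustrative choices, not part of the notes.

```python
# Frequencies copied from the soft drink frequency distribution in the notes.
frequencies = {
    "Coke Classic": 19,
    "Diet Coke": 8,
    "Dr. Pepper": 5,
    "Pepsi": 13,
    "Sprite": 5,
}

n = sum(frequencies.values())  # total number of observations


def summarize(freqs):
    """Return relative frequency, percent frequency and pie angle per class.

    `summarize` is a hypothetical helper name used only for this sketch.
    """
    total = sum(freqs.values())
    table = {}
    for label, f in freqs.items():
        relative = f / total
        table[label] = {
            "relative": relative,         # frequency of the class / n
            "percent": 100 * relative,    # relative frequency times 100
            "pie_angle": 360 * relative,  # sector angle in degrees
        }
    return table


summary = summarize(frequencies)
for label, row in summary.items():
    print(f"{label:<13} {row['relative']:.2f}  {row['percent']:.0f} %  "
          f"{row['pie_angle']:.1f} degrees")
```

The relative frequencies should add up to 1 and the sector angles to 360 degrees, which is a quick way to catch arithmetic slips in a hand-built table.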



1.2 Data - Example Continued

Bar Graph

Figure: Bar Graph of Soft Drink



1.2 Data - Example Continued

Pie Chart

Figure: Pie Chart of Soft Drink



Probability and Statistics
401237
Chapter 2: Descriptive Statistics: Tabular and
Graphical Displays



2.2 Summarizing Quantitative Data

There are different ways to summarize quantitative data such as:


Frequency Distribution.
Steps to define the classes for a frequency distribution are:
1 Determine the number of nonoverlapping classes.
We recommend using between 5 and 20 classes.

2 Determine the width of each class.
We can use the following expression to determine the approximate class
width, which is usually rounded up to a convenient value:
$$\text{Class width} = \frac{\text{Max value} - \text{Min value}}{\text{number of classes}}$$
3 Determine the class limits.
Class limits must be chosen so that each data item belongs to one and
only one class.
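
The class width step can be sketched in a few lines. The data below are the 20 completion times used in the example later in this chapter; choosing 5 classes and rounding the width up to the next whole number are assumptions made for this illustration.

```python
import math

# Completion times in days (the 20 clients from the example in these notes).
times = [12, 14, 19, 18, 15, 15, 18, 17, 20, 27,
         22, 23, 22, 21, 33, 28, 14, 18, 16, 13]

number_of_classes = 5  # assumed choice for this sketch

# Class width = (Max value - Min value) / number of classes
raw_width = (max(times) - min(times)) / number_of_classes

# Round up so that the classes cover the whole range of the data.
class_width = math.ceil(raw_width)

print(raw_width, class_width)
```

With a width of 5 days and a first lower limit of 10, the classes 10-14, 15-19, 20-24, 25-29 and 30-34 cover every observation exactly once.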



2.2 Summarizing Quantitative Data

Relative Frequency and Percent Frequency Distributions.


The relative frequency of each class with n observations can be
determined as follows:
$$\text{Relative frequency of a class} = \frac{\text{frequency of the class}}{n}$$

Dot Plot.
A horizontal axis shows the range for the data. Each data value is
represented by a dot placed above the axis.



2.2 Summarizing Quantitative Data

Histogram.
A histogram is constructed by placing the variable of interest on the
horizontal axis and the frequency, relative frequency, or percent
frequency on the vertical axis. The frequency, relative frequency, or
percent frequency of each class is shown by drawing a rectangle whose
base is determined by the class limits on the horizontal axis and whose
height is the corresponding frequency, relative frequency, or percent
frequency.



2.2 Summarizing Quantitative Data

Cumulative Distributions.
The cumulative frequency distribution uses the number of classes, class
widths, and class limits developed for the frequency distribution. But,
rather than showing the frequency of each class, the cumulative
frequency distribution shows the number of data items with values less
than or equal to the upper class limit of each class.

Ogive.
The ogive is a graph of a cumulative distribution. It shows data values
on the horizontal axis and either the cumulative frequencies, the
cumulative relative frequencies, or the cumulative percent frequencies
on the vertical axis. It is constructed by plotting, for each class, a
point corresponding to the cumulative frequency of that class.



2.2 Summarizing Quantitative Data - Example
Example: These data show the time in days required to complete a workshop
in finance for a sample of 20 clients of a small public accounting firm.
{12 , 14 , 19 , 18 , 15 , 15 , 18 , 17 , 20 , 27 , 22 , 23 , 22 , 21 , 33 , 28 ,
14 , 18 , 16 , 13}

Frequency Distribution of Time

Time (days)   Frequency
10 - 14       4
15 - 19       8
20 - 24       5
25 - 29       2
30 - 34       1
Total         20
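
The frequency distribution above can be reproduced by assigning each observation to its class. This is a minimal sketch that assumes whole-day data, a first lower class limit of 10 and a class width of 5 (read off the table); the helper name `class_label` is hypothetical.

```python
from collections import Counter

times = [12, 14, 19, 18, 15, 15, 18, 17, 20, 27,
         22, 23, 22, 21, 33, 28, 14, 18, 16, 13]

lower_limit = 10  # lower limit of the first class (assumed from the table)
width = 5         # class width (assumed from the table)


def class_label(x):
    """Return the class 'lower-upper' that the integer value x belongs to."""
    start = lower_limit + width * ((x - lower_limit) // width)
    return f"{start}-{start + width - 1}"


frequency = Counter(class_label(x) for x in times)

for label in sorted(frequency):
    print(label, frequency[label])
```

Because each value maps to exactly one label, the classes are nonoverlapping and the frequencies sum to the number of observations.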



2.2 Summarizing Quantitative Data - Example Continued

Relative and Percent Frequency Distribution of Time

Time (days)   Relative Frequency   Percent Frequency
10 - 14       0.20                 20 %
15 - 19       0.40                 40 %
20 - 24       0.25                 25 %
25 - 29       0.10                 10 %
30 - 34       0.05                 5 %
Total         1                    100 %



2.2 Summarizing Quantitative Data - Example Continued

Dot Plot

Figure: Dot Plot of Time



2.2 Summarizing Quantitative Data - Example Continued

Histogram

Figure: Histogram of Time



2.2 Summarizing Quantitative Data - Example Continued

Cumulative Relative and Cumulative Percent Frequency Distribution of Time

Time (days)                 Cumulative Freq.   Cumulative Relative Freq.   Cumulative Percent Freq.
Less than or equal to 14    4                  4/20 = 0.20                 20 %
Less than or equal to 19    12                 12/20 = 0.60                60 %
Less than or equal to 24    17                 17/20 = 0.85                85 %
Less than or equal to 29    19                 19/20 = 0.95                95 %
Less than or equal to 34    20                 20/20 = 1.00                100 %
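
The cumulative columns are running totals of the class frequencies. A short sketch, using the class frequencies from the frequency distribution of time (the variable names are illustrative):

```python
from itertools import accumulate

upper_limits = [14, 19, 24, 29, 34]   # upper class limits from the table
frequencies = [4, 8, 5, 2, 1]         # class frequencies from the notes
n = sum(frequencies)

# Running total of the frequencies gives the cumulative frequencies.
cumulative = list(accumulate(frequencies))
cumulative_relative = [c / n for c in cumulative]
cumulative_percent = [100 * r for r in cumulative_relative]

for limit, c, r, p in zip(upper_limits, cumulative,
                          cumulative_relative, cumulative_percent):
    print(f"<= {limit}: {c:2d}  {r:.2f}  {p:.0f} %")
```

Plotting the pairs (upper class limit, cumulative frequency) and joining them gives the points of the ogive.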



2.2 Summarizing Quantitative Data - Example Continued

Ogive

Figure: Ogive of Time



Probability and Statistics
401237
Chapter 3: Descriptive Statistics: Numerical Measures



3.1 Measures of Location

Mean
The mean provides a measure of central location for the data. If the
data are for a sample, the mean is denoted by x̄; if the data are for a
population, the mean is denoted by the Greek letter µ. For a sample
with n observations, the formula for the sample mean is as follows.

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
The formula for the population mean is as follows.

$$\mu = \frac{\sum_{i=1}^{N} x_i}{N}$$



3.1 Measures of Location
Example: Let us consider the following class size data for a sample of five
college classes.
{46 54 42 46 32}
$$\bar{x} = \frac{\sum_{i=1}^{5} x_i}{n} = \frac{x_1 + x_2 + x_3 + x_4 + x_5}{5} = \frac{46 + 54 + 42 + 46 + 32}{5} = 44$$
Example: Let us consider the following starting prices for a sample of 12
used cars (in JD)
{3450 3490 3550 3730 3650 3540 3480 3925 3355 3520 3310 3480}
$$\bar{x} = \frac{\sum_{i=1}^{12} x_i}{n} = \frac{x_1 + x_2 + x_3 + \ldots + x_{12}}{12} = \frac{3450 + 3490 + 3550 + \ldots + 3480}{12} = \frac{42{,}480}{12} = 3540$$
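
Both sample means can be checked with a direct translation of the formula. This is a small sketch; the function name `sample_mean` is chosen here for illustration.

```python
def sample_mean(data):
    """Sum of the observations divided by the number of observations."""
    return sum(data) / len(data)


class_sizes = [46, 54, 42, 46, 32]
car_prices = [3450, 3490, 3550, 3730, 3650, 3540,
              3480, 3925, 3355, 3520, 3310, 3480]

print(sample_mean(class_sizes))  # class size example
print(sum(car_prices), sample_mean(car_prices))  # used car example
```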
3.1 Measures of Location

Median
The median is another measure of central location, which is the value
in the middle when the data are arranged in ascending order (smallest
value to largest value). With an odd number of observations, the
median is the middle value. An even number of observations has no
single middle value. In this case, we follow convention and define the
median as the average of the values for the middle two observations.
Steps to find the median M of n observations
1 Arrange the observations from the smallest value to the largest value.
2 If n is an odd number, then M is the observation at position (n + 1)/2.
3 If n is an even number, then M is the average of the two values on
either side of position (n + 1)/2.



3.1 Measures of Location
Example: Consider the previous data

{46 54 42 46 32}

1 Ordered data: 32 42 46 46 54
2 n = 5, which is an odd number.
3 M is the observation at position $\frac{5+1}{2} = 3$. Then M = 46.

Example: Consider the previous data

{3450 3490 3550 3730 3650 3540 3480 3925 3355 3520 3310 3480}

1 Ordered data: 3310 3355 3450 3480 3480 3490 3520 3540 3550 3650 3730 3925
2 n = 12, which is an even number.
3 M is the average of the two values on either side of position
$\frac{12+1}{2} = 6.5$, that is, the values in positions 6 and 7.
Then $M = \frac{3490 + 3520}{2} = 3505$.


3.1 Measures of Location

Mode
The mode is the value that occurs with greatest frequency. Situations
can arise for which the greatest frequency occurs at two or more
different values. In these instances more than one mode exists.
1 If the data contain exactly two modes, we say that the data are
Bimodal.
2 If data contain more than two modes, we say that the data are
Multimodal.
Example: Consider the previous data {46 54 42 46 32}. Then the
Mode is 46.

Example: Consider the previous data


{3450 3490 3550 3730 3650 3540 3480 3925 3355 3520 3310 3480}.
Then the Mode is 3480.
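
The median and mode examples can be verified with Python's standard `statistics` module. This is only a sketch of one way to check the hand calculations; `multimode` (available from Python 3.8) returns every value that attains the greatest frequency, so it also covers the bimodal and multimodal cases. The list `[1, 1, 2, 2, 3]` is made-up data used only to show a bimodal result.

```python
import statistics

class_sizes = [46, 54, 42, 46, 32]
car_prices = [3450, 3490, 3550, 3730, 3650, 3540,
              3480, 3925, 3355, 3520, 3310, 3480]

# Median: middle value for odd n, average of the two middle values for even n.
median_sizes = statistics.median(class_sizes)
median_prices = statistics.median(car_prices)

# Mode(s): all values that occur with the greatest frequency.
modes_sizes = statistics.multimode(class_sizes)
modes_prices = statistics.multimode(car_prices)

# Illustrative (made-up) data with two modes, i.e. bimodal data.
bimodal_example = statistics.multimode([1, 1, 2, 2, 3])

print(median_sizes, median_prices, modes_sizes, modes_prices, bimodal_example)
```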



3.1 Measures of Location

Percentiles
The p-th percentile is a value such that at least p percent of the
observations are less than or equal to this value and at least (100 − p)
percent of the observations are greater than or equal to this value.
Steps to calculate the p-th percentile
1 Arrange the observations from the smallest value to the largest value.
2 Compute an index i
$$i = \left(\frac{p}{100}\right) n$$
where p is the percentile of interest and n is the number of observations.
3 If i is not an integer, round up. The next integer greater than i denotes
the position of the p-th percentile.
4 If i is an integer, the p-th percentile is the average of the values in
positions i and i + 1.
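
The four steps above can be sketched as a small function. This follows the index rule stated in these notes, which is not necessarily the rule used by statistical software; the function name `percentile` and the restriction to whole-number p with 0 < p < 100 are assumptions of this sketch.

```python
def percentile(data, p):
    """p-th percentile using the index rule i = (p/100) n.

    Assumes p is a whole number with 0 < p < 100.
    """
    ordered = sorted(data)                  # step 1: ascending order
    n = len(ordered)
    whole, remainder = divmod(p * n, 100)   # step 2: i = whole + remainder/100
    if remainder:
        # Step 3: i is not an integer, so round up to position whole + 1,
        # which is list index `whole` because Python lists start at 0.
        return ordered[whole]
    # Step 4: i is an integer, so average the values in positions i and i + 1.
    return (ordered[whole - 1] + ordered[whole]) / 2


class_sizes = [46, 54, 42, 46, 32]
car_prices = [3450, 3490, 3550, 3730, 3650, 3540,
              3480, 3925, 3355, 3520, 3310, 3480]

print(percentile(class_sizes, 35))
print(percentile(car_prices, 85))
print([percentile(car_prices, p) for p in (25, 50, 75)])
```

Integer arithmetic is used for the index so that the test "is i an integer?" is not affected by floating-point rounding.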



3.1 Measures of Location

Example: The 35th percentile for

{46 54 42 46 32}

1 Ordered data: 32 42 46 46 54
2 $i = \left(\frac{35}{100}\right) 5 = 1.75$, which is not an integer. Rounding up
gives 2, so the 35th percentile is the value in position 2, which is 42.

Example: The 85th percentile for

{3450 3490 3550 3730 3650 3540 3480 3925 3355 3520 3310 3480}

1 Ordered data: 3310 3355 3450 3480 3480 3490 3520 3540 3550 3650 3730 3925
2 $i = \left(\frac{85}{100}\right) 12 = 10.2$, which is not an integer. Rounding up
gives 11, so the 85th percentile is the value in position 11, which is 3730.



3.1 Measures of Location
Quartiles
It is often desirable to divide data into four parts, with each part
containing approximately one-fourth, or 25% of the observations. The
division points are referred to as the quartiles and are defined as
Q1 = first quartile, or 25th percentile
Q2 = second quartile, or 50th percentile
Q3 = third quartile, or 75th percentile

Figure: Location of the quartiles



3.1 Measures of Location
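Example: The quartiles for
{3310 3355 3450 3480 3480 3490 3520 3540 3550 3650 3730 3925}

1 For Q1 we find the 25th percentile: $i = \left(\frac{25}{100}\right) 12 = 3$, which is an
integer, so the 25th percentile is the average of the values in positions 3 and 4:
$$Q_1 = \frac{3450 + 3480}{2} = 3465.$$
2 For Q2 we find the 50th percentile: $i = \left(\frac{50}{100}\right) 12 = 6$, which is an
integer, so the 50th percentile is the average of the values in positions 6 and 7:
$$Q_2 = \frac{3490 + 3520}{2} = 3505.$$
3 For Q3 we find the 75th percentile: $i = \left(\frac{75}{100}\right) 12 = 9$, which is an
integer, so the 75th percentile is the average of the values in positions 9 and 10:
$$Q_3 = \frac{3550 + 3650}{2} = 3600.$$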
3.2 Measures of Variability

Range.
The simplest measure of variability is the range.

Range = Max − Min

Example: Let us consider the following class size data for a sample of
five college classes. {46 54 42 46 32}

Range = 54 − 32 = 22

Example: Let us consider the following starting prices for a sample of
12 used cars (in JD)

{3450 3490 3550 3730 3650 3540 3480 3925 3355 3520 3310 3480}

Range = 3925 − 3310 = 615.



3.2 Measures of Variability

Interquartile Range
A measure of variability that overcomes the dependency on extreme
values is the interquartile range (IQR). This measure of variability is
the difference between the third quartile, Q3 , and the first quartile, Q1 .
In other words, the interquartile range is the range for the middle 50%
of the data.
IQR = Q3 − Q1
Example: The quartiles for
{3310 3355 3450 3480 3480 3490 3520 3540 3550 3650 3730 3925}
are Q1 = 3465, Q2 = 3505 and Q3 = 3600. Then,
IQR = Q3 − Q1
= 3600 − 3465
= 135
3.2 Measures of Variability

Variance
The variance is a measure of variability that utilizes all the data. The
variance is based on the difference between the value of each
observation (x_i) and the mean. The difference between each x_i and the
mean (x̄ for a sample, µ for a population) is called a deviation about
the mean. The variance is given in the following way

$$\text{Sample variance} = s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$
$$\text{Population variance} = \sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$$



3.2 Measures of Variability

Standard Deviation
The standard deviation is defined to be the positive square root of the
variance. Following the notation we adopted for a sample variance and
a population variance, we use s to denote the sample standard
deviation and σ to denote the population standard deviation. The
standard deviation is derived from the variance in the following way

$$\text{Sample standard deviation} = s = \sqrt{s^2}$$
$$\text{Population standard deviation} = \sigma = \sqrt{\sigma^2}$$



3.2 Measures of Variability

Example: Find the sample variance for {46 54 42 46 32}.

x_i     x̄     (x_i − x̄)   (x_i − x̄)²
46      44     2           4
54      44     10          100
42      44     −2          4
46      44     2           4
32      44     −12         144
Total
220            0           256

$$s^2 = \frac{\sum_{i=1}^{5} (x_i - \bar{x})^2}{5 - 1} = \frac{256}{4} = 64$$
$$s = \sqrt{s^2} = \sqrt{64} = 8$$



3.2 Measures of Variability
Example: Find the sample variance for
{3450 3490 3550 3730 3650 3540 3480 3925 3355 3520 3310 3480}

With x̄ = 3540,
$$s^2 = \frac{\sum_{i=1}^{12} (x_i - \bar{x})^2}{12 - 1} = \frac{301{,}850}{11} = 27{,}440.91$$
$$s = \sqrt{s^2} = \sqrt{27{,}440.91} = 165.65$$
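
The variance and standard deviation calculations can be checked with a direct translation of the formulas. In this sketch the function names are illustrative; the only difference between the two functions is the divisor, n − 1 for a sample and N for a population.

```python
import math


def sum_squared_deviations(data):
    """Sum of squared deviations about the mean."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** 2 for x in data)


def sample_variance(data):
    return sum_squared_deviations(data) / (len(data) - 1)


def population_variance(data):
    return sum_squared_deviations(data) / len(data)


class_sizes = [46, 54, 42, 46, 32]
car_prices = [3450, 3490, 3550, 3730, 3650, 3540,
              3480, 3925, 3355, 3520, 3310, 3480]

s2_sizes = sample_variance(class_sizes)
s2_prices = sample_variance(car_prices)

print(s2_sizes, math.sqrt(s2_sizes))
print(round(s2_prices, 2), round(math.sqrt(s2_prices), 2))
```

The standard library functions `statistics.variance` and `statistics.pvariance` implement the same two formulas and can serve as an independent check.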



3.3 Measures of Distribution Shape, Relative Location, and
Detecting Outliers



3.4 Exploratory Data Analysis
Five Number Summary
In a five-number summary, the following five numbers are used to
summarize the data.
1 Smallest value.
2 First quartile Q1 .
3 Second quartile Q2 .
4 Third quartile Q3 .
5 Largest value.
The easiest way to develop a five-number summary is to first place the data
in ascending order. Then it is easy to identify the smallest value, the three
quartiles, and the largest value.
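Example: The prices for a sample of 12 used cars are repeated here in
ascending order.
{3310 3355 3450 3480 3480 3490 3520 3540 3550 3650 3730 3925}
Using the quartiles found earlier, the five-number summary is: smallest
value = 3310, Q1 = 3465, Q2 = 3505, Q3 = 3600, largest value = 3925.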



3.4 Exploratory Data Analysis

Box Plot
A box plot is a graphical summary of data that is based on a
five-number summary. A key to the development of a box plot is the
computation of the median and the quartiles, Q1 and Q3 . The
interquartile range, IQR = Q3 − Q1 , is also used. The steps used to
construct the box plot follow.
1 A box is drawn with the ends of the box located at the Q1 and Q3 .
2 A vertical line is drawn in the box at the location of the median (i.e.
Q2 ).
3 By using the interquartile range IQR, limits are located. The limits for
the box plot are the lower fence [LF = Q1 − 1.5(IQR )] and the
upper fence [UF = Q3 + 1.5(IQR )]. Data outside these limits are
considered outliers.
4 Whiskers are dashed lines drawn from the ends of the box to the
smallest and largest values inside the upper and lower fence in Step 3.
5 Finally, the location of each outlier is shown with the symbol ∗.



3.4 Exploratory Data Analysis
Box Plot
Example: Recall that Q1 = 3465, Q2 = 3505, Q3 = 3600 and
IQR = 135 for the prices of a sample of 12 used cars. Then the lower
fence and upper fence are
LF = Q1 − 1.5(IQR) = 3465 − 1.5(135) = 3262.5
and
UF = Q3 + 1.5(IQR) = 3600 + 1.5(135) = 3802.5
The largest price, 3925, is greater than UF = 3802.5, so it is an outlier.
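
The fence calculation and the outlier check can be sketched as follows. The quartile values are taken from the earlier example in these notes rather than recomputed, and the variable names are illustrative.

```python
car_prices = [3310, 3355, 3450, 3480, 3480, 3490,
              3520, 3540, 3550, 3650, 3730, 3925]

q1, q3 = 3465, 3600        # quartiles from the example in the notes
iqr = q3 - q1              # interquartile range

lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Observations outside the fences are flagged as outliers.
outliers = [x for x in car_prices if x < lower_fence or x > upper_fence]

# Whiskers extend to the smallest and largest values inside the fences.
inside = [x for x in car_prices if lower_fence <= x <= upper_fence]
whisker_low, whisker_high = min(inside), max(inside)

print(iqr, lower_fence, upper_fence, outliers, whisker_low, whisker_high)
```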



Probability and Statistics
401237
Chapter 4: Introduction to Probability



4.1 Experiments, Counting Rules, and Assigning
Probabilities

Experiment is a process that generates well-defined outcomes.

Sample Space is the set of all experimental outcomes.

Experiment                      Sample Space S
Toss a coin                     {Head, Tail}
Select a part for inspection    {Defective, Non-defective}
Conduct a sales call            {Purchase, No purchase}
Roll a die                      {1, 2, 3, 4, 5, 6}
Play a football game            {Win, Lose, Tie}



4.1 Experiments, Counting Rules, and Assigning
Probabilities

Being able to identify and count the experimental outcomes is a necessary
step in assigning probabilities. We now discuss three useful counting rules.
Multiple-step experiments
If an experiment can be described as a sequence of k steps with n1
possible outcomes on the first step, n2 possible outcomes on the
second step, and so on, then the total number of experimental
outcomes is given by

|S | = (n1 )(n2 ) . . . (nk )



4.1 Experiments, Counting Rules, and Assigning
Probabilities

Example: Find the total number of possible outcomes in the following
experiments
1 Tossing a coin three times.

|S| = (2)(2)(2) = 8

2 Rolling a die two times.

|S| = (6)(6) = 36

3 Selecting one letter from {A, B, C, D, E}, then one number from 0 to 9.

|S| = (5)(10) = 50
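
The multiple-step counting rule can be confirmed by listing the outcomes explicitly. This sketch uses `itertools.product` from the Python standard library to build each sample space; the variable names are illustrative.

```python
from itertools import product

coin = ["Head", "Tail"]
die = [1, 2, 3, 4, 5, 6]
letters = ["A", "B", "C", "D", "E"]
digits = range(10)  # the numbers 0 to 9

# Each sample space is the Cartesian product of the outcomes of every step.
three_tosses = list(product(coin, repeat=3))
two_rolls = list(product(die, repeat=2))
letter_then_digit = list(product(letters, digits))

print(len(three_tosses), len(two_rolls), len(letter_then_digit))
```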



4.1 Experiments, Counting Rules, and Assigning
Probabilities

Combinations
A second useful counting rule allows one to count the number of
experimental outcomes when the experiment involves selecting n
objects from a set of N objects. The number of combinations of N
objects taken n at a time is
$$C_n^N = \binom{N}{n} = \frac{N!}{n!(N - n)!}$$

where
N! = (N)(N − 1)(N − 2)(N − 3) . . . (2)(1)
n! = (n)(n − 1)(n − 2)(n − 3) . . . (2)(1)
0! = 1



4.1 Experiments, Counting Rules, and Assigning
Probabilities

Example
1 How many different groups of three letters can be made from these
letters {A, B, C, D, E}?
$$|S| = C_3^5 = \binom{5}{3} = \frac{5!}{3!(5 - 3)!} = 10$$
2 In how many ways can we select six integers from a group of 53?
$$|S| = C_6^{53} = \binom{53}{6} = \frac{53!}{6!(53 - 6)!} = 22{,}957{,}480$$
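
Both answers can be checked with `math.comb` (Python 3.8 or later), and the first can also be checked by listing the groups. This is a minimal sketch; the variable names are illustrative.

```python
import math
from itertools import combinations

# Number of combinations of N objects taken n at a time: N! / (n! (N - n)!)
groups_of_three = math.comb(5, 3)
six_from_53 = math.comb(53, 6)

# Listing the unordered groups of three letters gives the same count.
listed_groups = list(combinations("ABCDE", 3))

print(groups_of_three, six_from_53, len(listed_groups))
```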



4.1 Experiments, Counting Rules, and Assigning
Probabilities

Permutations
A third counting rule that is sometimes useful is the counting rule for
permutations. It allows one to compute the number of experimental
outcomes when n objects are to be selected from a set of N objects
where the order of selection is important. The same n objects selected in
a different order are considered a different experimental outcome. The
number of permutations of N objects taken n at a time is given by
$$P_n^N = \frac{N!}{(N - n)!}$$



4.1 Experiments, Counting Rules, and Assigning
Probabilities

Examples
1 How many different ordered arrangements of three letters can be made
from these letters {A, B, C, D, E}, if no letter is used more than once?
$$|S| = P_3^5 = \frac{5!}{(5 - 3)!} = 60$$
2 In how many ways can a president, a treasurer and a secretary be
chosen from among 7 candidates?
$$|S| = P_3^7 = \frac{7!}{(7 - 3)!} = 210$$
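
The permutation counts can be checked with `math.perm` (Python 3.8 or later) and by listing the ordered arrangements. A minimal sketch with illustrative variable names:

```python
import math
from itertools import permutations

# Number of permutations of N objects taken n at a time: N! / (N - n)!
letter_arrangements = math.perm(5, 3)
officer_choices = math.perm(7, 3)

# Listing the ordered arrangements of three distinct letters.
listed = list(permutations("ABCDE", 3))

print(letter_arrangements, officer_choices, len(listed))
```

Each unordered group of three letters appears 3! = 6 times in the listing, once per ordering, which is why 60 = 6 × 10.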



4.1 Experiments, Counting Rules, and Assigning
Probabilities
Basic Requirements for Assigning Probabilities
Probabilities can be assigned to experimental outcomes using the classical
method or the relative frequency method or the subjective method.
Regardless of the method used, two basic requirements must be met.
1 The probability assigned to each experimental outcome must be
between 0 and 1, inclusive. If we let E_i denote the i-th
experimental outcome and P(E_i) its probability, then this requirement
can be written as
$$0 \le P(E_i) \le 1 \text{ for all } i$$
2 The sum of the probabilities for all the experimental outcomes must
equal 1. For n experimental outcomes, this requirement can be written
as
$$P(E_1) + P(E_2) + \ldots + P(E_n) = 1$$



4.1 Experiments, Counting Rules, and Assigning
Probabilities
Methods for Assigning Probabilities

Classical Method.
This method is appropriate when all the experimental outcomes are
equally likely. If n experimental outcomes are possible, a probability of
1/n is assigned to each experimental outcome.
Relative Frequency Method.
This method is appropriate when data are available to estimate the
proportion of the time the experimental outcome will occur if the
experiment is repeated a large number of times.
Subjective Method.
This method is most appropriate when one cannot realistically assume
that the experimental outcomes are equally likely and when little
relevant data are available.



4.1 Experiments, Counting Rules, and Assigning
Probabilities

Examples (Classical Method)


1 Consider the experiment of tossing a fair coin.
Outcomes Head Tail Total
Probability 1/2 1/2 1

2 Consider the experiment of rolling a die.


Outcomes 1 2 3 4 5 6 Total
Probability 1/6 1/6 1/6 1/6 1/6 1/6 1



4.1 Experiments, Counting Rules, and Assigning
Probabilities

Examples (Relative Frequency Method)


1 Consider the experiment of selecting one ball from a box with 3 red, 5
blue, and 7 green balls.
Outcomes Red Blue Green Total
Probability 3/15 5/15 7/15 1

2 Consider the experiment of tossing three coins, and recording the
number of heads.
Outcomes 0 1 2 3 Total
Probability 1/8 3/8 3/8 1/8 1
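
The probabilities for the number of heads can be obtained by listing the 8 equally likely sequences of three tosses and counting heads in each. This is a sketch assuming fair coins; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All equally likely sequences of three tosses, e.g. ("H", "T", "H").
sequences = list(product("HT", repeat=3))

# Count how many sequences give 0, 1, 2 or 3 heads.
heads_count = Counter(seq.count("H") for seq in sequences)

# Probability of each number of heads as an exact fraction.
distribution = {k: Fraction(v, len(sequences)) for k, v in heads_count.items()}

for k in sorted(distribution):
    print(k, distribution[k])
```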



4.2 Events and Their Probabilities

An event is a collection of sample points.

The probability of an event is equal to the sum of the probabilities of the
sample points in the event.



4.2 Events and Their Probabilities

Example: A box contains 7 Red, 4 Blue, 6 Green and 3 White balls. If we
select 5 balls at random, what is the probability that we get:
1 all Green balls. [Answer: 0.0004]
2 exactly 3 Red balls. [Answer: 0.1761]
3 exactly 2 Red and 1 Blue balls. [Answer: 0.1950]
4 at least 4 Green balls. [Answer: 0.0139]
5 at most 3 Blue balls. [Answer: 0.9990]
6 less than 2 White balls. [Answer: 0.8596]
7 more than 3 Green balls. [Answer: 0.0139]
8 same ball color for all 5 balls. [Answer: 0.0017]
9 different ball color for all 5 balls. [Answer: 0]
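
The answers above can be reproduced with the combinations rule, assuming the 5 balls are drawn without replacement so that all groups of 5 balls are equally likely. This is a sketch of one way to set up the counts; the helper name `exactly` is hypothetical.

```python
from math import comb

balls = {"Red": 7, "Blue": 4, "Green": 6, "White": 3}
N = sum(balls.values())   # 20 balls in the box
k = 5                     # number of balls drawn
total = comb(N, k)        # number of equally likely groups of 5 balls


def exactly(color_count, wanted):
    """Number of groups of k balls with exactly `wanted` balls of one color."""
    return comb(color_count, wanted) * comb(N - color_count, k - wanted)


p_all_green = exactly(6, 5) / total
p_three_red = exactly(7, 3) / total
# 2 Red, 1 Blue, and the remaining 2 from the 9 Green or White balls.
p_two_red_one_blue = comb(7, 2) * comb(4, 1) * comb(9, 2) / total
p_at_least_four_green = (exactly(6, 4) + exactly(6, 5)) / total
p_at_most_three_blue = 1 - exactly(4, 4) / total
p_fewer_than_two_white = (exactly(3, 0) + exactly(3, 1)) / total
# comb(c, 5) is 0 for colors with fewer than 5 balls (Blue and White).
p_same_color = sum(comb(c, k) for c in balls.values()) / total

print(round(p_all_green, 4), round(p_three_red, 4),
      round(p_two_red_one_blue, 4), round(p_at_least_four_green, 4),
      round(p_at_most_three_blue, 4), round(p_fewer_than_two_white, 4),
      round(p_same_color, 4))
```

Part 9 needs no computation: there are only four colors, so five balls cannot all have different colors and the probability is 0.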



4.2 Events and Their Probabilities

Example: Four men visit a city with 5 hotels. What is the probability that
1 each one selects any hotel to stay in. [Answer: 1]
2 each one stays in a different hotel. [Answer: 0.1920]
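
The second answer follows from the counting rules, assuming each man chooses a hotel independently and every hotel is equally likely. A short sketch that also confirms the count by listing every assignment (variable names are illustrative):

```python
import math
from itertools import product

hotels = 5
men = 4

# Multiple-step rule: each of the 4 men has 5 choices.
total_assignments = hotels ** men

# Permutations: ordered choice of 4 different hotels out of 5.
all_different = math.perm(hotels, men)

p_all_different = all_different / total_assignments

# Check by listing every assignment and counting those with no repeats.
listed = [choice for choice in product(range(hotels), repeat=men)
          if len(set(choice)) == men]

print(total_assignments, all_different, p_all_different, len(listed))
```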



4.2 Events and Their Probabilities

Example: Consider the experiment of rolling a die twice. Answer the
following questions.
What is the probability of obtaining an even number in the first roll
but not 5 or 6 in the second roll? [Answer: 0.3333]
Suppose that we are interested in the sum of the face values showing
on the dice. What is the probability of obtaining a value of
1 7. [Answer: 0.1667]
2 9 or greater. [Answer: 0.2778]
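
These answers can be verified by listing the 36 equally likely outcomes of two rolls and counting the sample points in each event. This is a sketch assuming a fair die; the helper name `probability` is hypothetical.

```python
from fractions import Fraction
from itertools import product

# Sample space: all ordered pairs (first roll, second roll).
sample_space = list(product(range(1, 7), repeat=2))


def probability(event):
    """Classical probability: points satisfying `event` over all points."""
    favorable = [point for point in sample_space if event(point)]
    return Fraction(len(favorable), len(sample_space))


p_even_then_not_5_or_6 = probability(
    lambda r: r[0] % 2 == 0 and r[1] not in (5, 6))
p_sum_is_7 = probability(lambda r: r[0] + r[1] == 7)
p_sum_at_least_9 = probability(lambda r: r[0] + r[1] >= 9)

print(p_even_then_not_5_or_6, p_sum_is_7, p_sum_at_least_9)
print(round(float(p_even_then_not_5_or_6), 4),
      round(float(p_sum_is_7), 4),
      round(float(p_sum_at_least_9), 4))
```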



4.3 Some Basic Relationships of Probability

Complement of an Event
Given an event A, the complement of A is defined to be the event
consisting of all sample points that are not in A. The complement of A
is denoted by A^c.
In any probability application, either event A or its complement A^c
must occur. Therefore, we have

P(A) + P(A^c) = 1



4.3 Some Basic Relationships of Probability

Union of Events
Given two events A and B, the union of A and B is the event
containing all sample points belonging to A or B or both. The union is
denoted by A ∪ B.



4.3 Some Basic Relationships of Probability

Intersection of Events
Given two events A and B, the intersection of A and B is the event
containing the sample points belonging to both A and B. The
intersection is denoted by A ∩ B.



4.3 Some Basic Relationships of Probability

Addition Law
Given two events A and B, the addition law is helpful when we are
interested in knowing the probability that event A or event B or both
occur. The addition law is written as follows.

P (A ∪ B ) = P (A) + P (B ) − P (A ∩ B )

Note: two events are said to be mutually exclusive or disjoint if the
events have no sample points in common. In this case P(A ∩ B) = 0
and the addition law can be written as P(A ∪ B) = P(A) + P(B).



4.3 Some Basic Relationships of Probability

Example: Suppose that we have a sample space
S = {E1, E2, E3, E4, E5, E6, E7}, where E1, E2, . . . , E7 denote the sample
points. The following probability assignments apply:

P (E1 ) = 0.05, P (E2 ) = 0.20, P (E3 ) = 0.20, P (E4 ) = 0.25,

P (E5 ) = 0.15, P (E6 ) = 0.10, and P (E7 ) = 0.05

Let A = {E1 , E4 , E6 }, B = {E2 , E4 , E7 }, and C = {E2 , E3 , E5 , E7 }.


Find P (A), P (B ), and P (C ).

P ( A ) = P ( E1 ) + P ( E4 ) + P ( E6 )
= 0.05 + 0.25 + 0.10
= 0.40

P ( B ) = P ( E2 ) + P ( E4 ) + P ( E7 )
= 0.20 + 0.25 + 0.05
= 0.50

P ( C ) = P ( E2 ) + P ( E3 ) + P ( E5 ) + P ( E7 )
= 0.20 + 0.20 + 0.15 + 0.05
= 0.60

Find P (A ∩ B ).
A ∩ B = { E4 }
P ( A ∩ B ) = P ( E4 )
= 0.25
Find P (A ∪ B ).

P (A ∪ B ) = P (A) + P (B ) − P (A ∩ B )
= 0.40 + 0.50 − 0.25
= 0.65

A ∪ B = { E1 , E2 , E 4 , E 6 , E 7 }
P ( A ∪ B ) = P ( E1 ) + P ( E2 ) + P ( E4 ) + P ( E6 ) + P ( E7 )
= 0.05 + 0.20 + 0.25 + 0.10 + 0.05
= 0.65

Are events A and C mutually exclusive? Yes, since A ∩ C = ∅.

Find P(B^c).
P(B) + P(B^c) = 1
P(B^c) = 1 − P(B)
= 1 − 0.50
= 0.50
B^c = {E1, E3, E5, E6}
P(B^c) = P(E1) + P(E3) + P(E5) + P(E6)
= 0.05 + 0.20 + 0.15 + 0.10
= 0.50
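
The calculations in this example can be checked with a short Python sketch. It assumes events are represented as sets of sample-point labels; the names `prob`, `P`, and the result variables are illustrative and not part of the notes.

```python
# Sample-point probabilities from the example above.
prob = {"E1": 0.05, "E2": 0.20, "E3": 0.20, "E4": 0.25,
        "E5": 0.15, "E6": 0.10, "E7": 0.05}
S = set(prob)  # the whole sample space

A = {"E1", "E4", "E6"}
B = {"E2", "E4", "E7"}
C = {"E2", "E3", "E5", "E7"}

def P(event):
    """Probability of an event: sum of the probabilities of its sample points."""
    return sum(prob[e] for e in event)

p_union_listed = P(A | B)              # list the sample points of A ∪ B
p_union_law = P(A) + P(B) - P(A & B)   # addition law
p_b_complement = P(S - B)              # complement of B
a_c_exclusive = A.isdisjoint(C)        # True when A ∩ C is empty

print(round(p_union_listed, 2), round(p_union_law, 2))
print(round(p_b_complement, 2), a_c_exclusive)
```

Both routes to P(A ∪ B) should agree up to floating-point rounding, which is the point of the addition law.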


Example: Suppose that we have a sample space with five equally likely
experimental outcomes E1, E2, E3, E4, E5, such that

P(E1) = P(E2) = P(E3) = P(E4) = P(E5) = 1/5 = 0.20

Let A = {E1, E2}, B = {E3, E4}, and C = {E2, E3, E5}.


Find P (A), P (B ), and P (C ).

P ( A ) = P ( E1 ) + P ( E2 )
= 0.20 + 0.20
= 0.40

P ( B ) = P ( E3 ) + P ( E4 )
= 0.20 + 0.20
= 0.40

P ( C ) = P ( E2 ) + P ( E3 ) + P ( E 5 )
= 0.20 + 0.20 + 0.20
= 0.60


Find P(A ∩ B).
Events A and B are mutually exclusive since A ∩ B = ∅, so

P(A ∩ B) = 0

Find P (A ∪ B ).
P (A ∪ B ) = P (A) + P (B )
= 0.40 + 0.40
= 0.80
A ∪ B = { E1 , E2 , E3 , E4 }
P ( A ∪ B ) = P ( E1 ) + P ( E2 ) + P ( E 3 ) + P ( E4 )
= 0.20 + 0.20 + 0.20 + 0.20
= 0.80


Find P(C^c).
P(C) + P(C^c) = 1
P(C^c) = 1 − P(C)
= 1 − 0.60
= 0.40
C^c = {E1, E4}
P(C^c) = P(E1) + P(E4)
= 0.20 + 0.20
= 0.40



4.4 Conditional Probability

Conditional Probability
Suppose we have an event A with probability P(A). If we obtain new
information and learn that a related event, denoted by B, has already
occurred, then the conditional probability of A given B (assuming
P(B) > 0) can be computed as

P(A/B) = P(A ∩ B) / P(B)

However, if the probability of event A is not changed by the occurrence of
event B, that is,
P(A/B) = P(A)
we say that events A and B are independent events (equivalently,
P(A ∩ B) = P(A)P(B)).

Example: Suppose that we have two events, A and B, with
P(A) = 0.50, P(B) = 0.60, and P(A ∩ B) = 0.40.
Find P(A/B).
P(A/B) = P(A ∩ B) / P(B)
= 0.40 / 0.60 = 0.67
Find P(B/A).
P(B/A) = P(A ∩ B) / P(A)
= 0.40 / 0.50 = 0.80
Are A and B independent? Why or why not? No, since
P(A ∩ B) ≠ P(A)P(B)
0.40 ≠ (0.50)(0.60) = 0.30
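
As a quick check of this example, the following Python sketch computes both conditional probabilities and tests independence. The helper name `conditional` is hypothetical, and `math.isclose` is used because floating-point products are rarely exactly equal.

```python
import math

# Values given in the example above.
p_a, p_b, p_a_and_b = 0.50, 0.60, 0.40

def conditional(p_joint, p_given):
    """P(X/Y) = P(X ∩ Y) / P(Y); only defined when P(Y) > 0."""
    if p_given <= 0:
        raise ValueError("the conditioning event must have positive probability")
    return p_joint / p_given

p_a_given_b = conditional(p_a_and_b, p_b)   # P(A/B)
p_b_given_a = conditional(p_a_and_b, p_a)   # P(B/A)

# A and B are independent exactly when P(A ∩ B) = P(A)P(B).
independent = math.isclose(p_a_and_b, p_a * p_b)

print(round(p_a_given_b, 2), round(p_b_given_a, 2), independent)
```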

Example: Assume that we have two events, A and B, that are independent.
Assume further that we know P (A) = 0.30 and P (B ) = 0.40.
Find P (A ∩ B ).

P (A ∩ B ) = P (A)P (B ) = (0.30)(0.40) = 0.12

Find P (B/A).
P (B/A) = P (B ) = 0.40
Find P (A ∪ B ).

P (A ∪ B ) = P (A) + P (B ) − P (A ∩ B )
= 0.30 + 0.40 − 0.12 = 0.58


Example Continued
Find P(A ∩ B^c).

P(A ∩ B^c) = P(A) − P(A ∩ B)
= 0.30 − 0.12 = 0.18

Find P(B/A^c).

P(B/A^c) = P(B ∩ A^c) / P(A^c)
= [P(B) − P(B ∩ A)] / [1 − P(A)]
= (0.40 − 0.12) / (1 − 0.30) = 0.28 / 0.70 = 0.40


Example: Consider the promotion status of male and female officers. The
breakdown of promotions for male and female officers is shown in the
table below.

                 Men    Women    Total
Promoted         288       36      324
Not promoted     672      204      876
Total            960      240     1200

Find the probability that a randomly selected officer is

a man.
P(M) = 960 / 1200 = 0.80
a woman and is promoted.
P(W ∩ P) = 36 / 1200 = 0.03
promoted given that the officer is a man.
P(P/M) = P(P ∩ M) / P(M)
= (288/1200) / (960/1200)
= 288 / 960 = 0.30
a woman or is not promoted.
P(W ∪ P^c) = P(W) + P(P^c) − P(W ∩ P^c)
= 240/1200 + 876/1200 − 204/1200
= 912 / 1200 = 0.76
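
The same answers can be obtained directly from the counts. The sketch below is only an illustration: the dictionary layout and helper `P` are assumptions, and the count of 672 men not promoted is not quoted in the solution but follows from 960 − 288.

```python
# Counts used in the worked solution; 672 = 960 - 288 is derived, not quoted.
counts = {
    ("man", "promoted"): 288, ("man", "not promoted"): 672,
    ("woman", "promoted"): 36, ("woman", "not promoted"): 204,
}
total = sum(counts.values())  # 1200 officers

def P(pred):
    """Relative frequency of the officers satisfying pred(sex, status)."""
    return sum(c for (sex, status), c in counts.items() if pred(sex, status)) / total

p_man = P(lambda sex, status: sex == "man")
p_woman_and_promoted = P(lambda sex, status: sex == "woman" and status == "promoted")
p_promoted_given_man = P(lambda sex, status: sex == "man" and status == "promoted") / p_man
p_woman_or_not_promoted = P(lambda sex, status: sex == "woman" or status == "not promoted")

print(p_man, p_woman_and_promoted)
print(round(p_promoted_given_man, 2), p_woman_or_not_promoted)
```

Counting the union directly avoids the subtraction step of the addition law, because each officer is counted once.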

Example: Assume that we have two events, A and B. Assume further that
we know P(A/B) = 0.25, P(A ∪ B) = 0.60 and P(B) = 0.15. Find
P(A^c).
P(A/B) = P(A ∩ B) / P(B)
0.25 = P(A ∩ B) / 0.15
(0.25)(0.15) = P(A ∩ B) → P(A ∩ B) = 0.0375
and
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
0.60 = P(A) + 0.15 − 0.0375
0.60 = P(A) + 0.1125 → P(A) = 0.4875
then
P(A^c) = 1 − P(A)
= 1 − 0.4875 = 0.5125
Chapter 5: Discrete Probability Distributions



5.1 Random Variables

A random variable is a numerical description of the outcome of an
experiment. A random variable can be classified as being either discrete or
continuous depending on the numerical values it assumes.
Discrete Random Variables
A random variable that may assume either a finite number of values or
an infinite sequence of values such as 1, 2, 3, ....

As an example of a discrete random variable, consider the experiment
of cars arriving at a tollbooth. The random variable of interest, x,
denotes the number of cars arriving during a one-day period. The
possible values for x come from the sequence of integers 0, 1, 2, ....
Hence, x is a discrete random variable assuming one of the
values in this infinite sequence.


Continuous Random Variables

A random variable that may assume any numerical value in an interval
or collection of intervals.

As an example, suppose the random variable of interest, x, denotes the
time between consecutive incoming calls in minutes. This random
variable may assume any value in the interval x ≥ 0. In fact, an
infinite number of values are possible for x, including values such as
1.26 minutes, 2.751 minutes, 4.3333 minutes, and so on.


Discrete Random Variables

The probability distribution for a random variable describes how
probabilities are distributed over the values of the random variable. For a
discrete random variable x, the probability distribution is defined by a
probability function, denoted by f(x). The probability function provides the
probability for each value of the random variable [i.e., f(x) = P(X = x)].
In the development of a probability function f(x) for any discrete random
variable, the following two conditions must be satisfied:
1 f(x) ≥ 0 for every value x.
2 ∑ f(x) = 1, where the sum is taken over all values of x.


Example: Consider the sales of used cars. Over the past 300 days of
operation, sales data show 54 days with no cars sold, 117 days with 1 car
sold, 72 days with 2 cars sold, 42 days with 3 cars sold, 12 days with 4 cars
sold, and 3 days with 5 cars sold. Suppose we consider the experiment of
selecting a day and define the random variable of interest, x, as the
number of cars sold during a day.

x         0      1      2      3      4      5      Total
# Days    54     117    72     42     12     3      300
f(x)      0.18   0.39   0.24   0.14   0.04   0.01   1.00

Discrete Uniform Probability Distribution

The simplest example of a discrete probability distribution given by a
formula is the discrete uniform probability distribution. Its probability
function is defined by:

f(x) = 1/n

where n is the number of values the random variable may assume.

Example: Suppose that for the experiment of rolling a die we define the
random variable x to be the number of dots on the upward face. For this
experiment, x = 1, 2, 3, 4, 5, 6. Thus, the probability function for this
discrete uniform random variable is

f(x) = 1/6 for all x = 1, 2, 3, 4, 5, 6


Example: Consider the random variable x with the following discrete
probability distribution, which is given by a formula but is not uniform:

f(x) = x/10 for all x = 1, 2, 3, 4

The probabilities 1/10 + 2/10 + 3/10 + 4/10 sum to 1, as required.
To evaluate the probability that the random variable assumes a value of 2:

f(2) = 2/10 = 0.20



5.3 Expected Value and Variance
Expected Value
The expected value, or mean, of a random variable is a measure of the
central location for the random variable. The formula for the expected
value of a discrete random variable x follows.

E(x) = µ = ∑ x f(x)
Variance
The variance summarizes the variability in the values of a random
variable. The formula for the variance of a discrete random variable
follows.
Var(x) = σ² = ∑ (x − µ)² f(x)
The standard deviation, σ, is defined as the positive square root of the
variance:
σ = √σ²
Example: Consider again the sales of used cars, where the random
variable x is the number of cars sold during a day.

x        f(x)    x f(x)
0        0.18    0(0.18) = 0.00
1        0.39    1(0.39) = 0.39
2        0.24    2(0.24) = 0.48
3        0.14    3(0.14) = 0.42
4        0.04    4(0.04) = 0.16
5        0.01    5(0.01) = 0.05
Total            1.50

µ = ∑ x f(x) = 1.50

x    x − µ               (x − µ)²    f(x)    (x − µ)² f(x)
0    0 − 1.50 = −1.50     2.25       0.18    2.25(0.18) = 0.4050
1    1 − 1.50 = −0.50     0.25       0.39    0.25(0.39) = 0.0975
2    2 − 1.50 = 0.50      0.25       0.24    0.25(0.24) = 0.0600
3    3 − 1.50 = 1.50      2.25       0.14    2.25(0.14) = 0.3150
4    4 − 1.50 = 2.50      6.25       0.04    6.25(0.04) = 0.2500
5    5 − 1.50 = 3.50     12.25       0.01    12.25(0.01) = 0.1225
Total                                         1.2500

σ² = ∑ (x − µ)² f(x) = 1.25

σ = √σ² = √1.25 = 1.118
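
The two tables for the used-car example can be reproduced with a few lines of Python. This is a minimal sketch: the variable names are illustrative, and f(x) is built from the day counts quoted in the example.

```python
import math

# Number of days (out of 300) on which x cars were sold.
days = {0: 54, 1: 117, 2: 72, 3: 42, 4: 12, 5: 3}
total_days = sum(days.values())

# Probability function f(x) as relative frequencies.
f = {x: n / total_days for x, n in days.items()}

mean = sum(x * p for x, p in f.items())                    # E(x) = sum of x f(x)
variance = sum((x - mean) ** 2 * p for x, p in f.items())  # sum of (x - mu)^2 f(x)
std_dev = math.sqrt(variance)

print(round(mean, 2), round(variance, 2), round(std_dev, 3))
```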


Example: A volunteer ambulance service handles 0 to 5 service calls on any
given day. The probability distribution for the number of service calls is as
follows.

Number of Service Calls    0      1      2      3      4      5
Probability                0.10   0.15   0.30   0.20   0.15   0.10

Find the expected number of service calls, the variance and the standard
deviation in the number of service calls.


x f(x) xf(x)
0 0.10 0(0.10)=0.00
1 0.15 1(0.15)=0.15
2 0.30 2(0.30)=0.60
3 0.20 3(0.20)=0.60
4 0.15 4(0.15)=0.60
5 0.10 5(0.10)=0.50
Total 2.45

µ= ∑ xf (x ) = 2.45


x    x − µ               (x − µ)²    f(x)    (x − µ)² f(x)
0    0 − 2.45 = −2.45     6.0025     0.10    6.0025(0.10) = 0.600250
1    1 − 2.45 = −1.45     2.1025     0.15    2.1025(0.15) = 0.315375
2    2 − 2.45 = −0.45     0.2025     0.30    0.2025(0.30) = 0.060750
3    3 − 2.45 = 0.55      0.3025     0.20    0.3025(0.20) = 0.060500
4    4 − 2.45 = 1.55      2.4025     0.15    2.4025(0.15) = 0.360375
5    5 − 2.45 = 2.55      6.5025     0.10    6.5025(0.10) = 0.650250
Total                                         2.047500

σ² = ∑ (x − µ)² f(x) = 2.0475

σ = √σ² = √2.0475 = 1.4309



5.4 Binomial Probability Distribution

Properties of a Binomial Experiment

An experiment is a binomial experiment if it satisfies the following
conditions:
1 The number of trials, n, is fixed in advance of the experiment.
2 There are only two possible outcomes on each trial (e.g., Success or
Failure, True or False).
3 The trials are independent (i.e., the outcome of the current trial does
not affect the outcome of any other trial).
4 The probability of success, p, is the same from trial to trial.


Example
Check if the following experiment satisfies the conditions of a binomial
experiment,
You have a box with 4 red balls and 5 green balls. You select a ball, note its
colour and return it to the box. This is done five times. We note the
number of red balls selected.
Let’s check the four conditions:
1 We have a fixed number of trials, n = 5.
2 On each trial we either select a red ball (success) or a green ball
(failure).
3 As we return the ball after each trial, the trials are independent.
4 Each time the probability of success is the same, since
p = P(success) = P(select red) = 4/9.
Then the experiment is a binomial experiment.

Example
Check if the following experiment satisfies the conditions of a binomial
experiment,
You have 7 different classrooms. You will select one student from each
classroom. We note the number of female students selected.
Let’s check the four conditions:
1 We have a fixed number of trials, n = 7.
2 On each trial we either select a female student (success) or a male
student (failure).
3 As we select from a different classroom on each trial, the trials are
independent.
4 The probability of success is not the same on each trial, since the
probability of selecting a female student generally differs from
classroom to classroom.
Then the experiment is not a binomial experiment.

Binomial Distribution and Probability Function

In a binomial experiment, if X is the total number of successes we say that X
is a binomial random variable that follows the binomial distribution
with parameters n and p. Let X ∼ Bin(n, p); then the probability function
for X is given by

P(X = r) = f(r) = C(n, r) p^r (1 − p)^(n − r), for r = 0, 1, 2, ..., n

where
f(r) = the probability of r successes in n trials
n = the number of trials
C(n, r) = nCr = n! / (r! (n − r)!)
p = the probability of a success on any one trial
1 − p = the probability of a failure on any one trial


Binomial Expected Value and Variance


Let X ∼ Bin (n, p ) then the expected value and variance are given by

E (x ) = µ = np
Var (x ) = σ2 = np (1 − p )


Example
In San Francisco, 30% of workers take public transportation daily. In a
sample of 10 workers, what is the probability that
(a) exactly three workers take public transportation daily?
(b) at least three workers take public transportation daily?

Let X be the total number of workers taking public transportation daily;
then X ∼ Bin(10, 0.30).
(a) P(X = 3) = f(3) = C(10, 3) (0.30)^3 (1 − 0.30)^(10 − 3) = 0.2668.

(b) P(X ≥ 3) = 1 − P(X < 3) = 1 − [f(2) + f(1) + f(0)] = 0.6172.
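
The binomial probabilities in this example can be verified with the standard library alone. The function name `binom_pmf` below is an illustrative choice; `math.comb` supplies the binomial coefficient.

```python
from math import comb

def binom_pmf(r, n, p):
    """P(X = r) for X ~ Bin(n, p): C(n, r) p^r (1 - p)^(n - r)."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)

n, p = 10, 0.30

p_exactly_3 = binom_pmf(3, n, p)
# "At least 3" is the complement of "0, 1, or 2".
p_at_least_3 = 1 - sum(binom_pmf(r, n, p) for r in range(3))

print(round(p_exactly_3, 4), round(p_at_least_3, 4))
```

Using the complement keeps the sum to three terms instead of eight.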


Example
Twenty-three percent of automobiles are not covered by insurance. On a
particular weekend, 35 automobiles are involved in traffic accidents.
(a) What is the expected number of these automobiles that are not
covered by insurance?
(b) What is the variance and standard deviation?
Let X be the total number of automobiles that are not covered by
insurance; then X ∼ Bin(35, 0.23).
(a) E(x) = µ = (35)(0.23) = 8.05.
(b) Var(x) = σ² = (35)(0.23)(0.77) = 6.1985, and
σ = √σ² = √6.1985 = 2.490.


Geometric Probability Distribution

An experiment is a geometric experiment if it satisfies the following
conditions:
1 The number of trials is not fixed in advance; trials are repeated until
the first success occurs.
2 There are only two possible outcomes on each trial (e.g., Success or
Failure, True or False).
3 The trials are independent (i.e., the outcome of the current trial does
not affect the outcome of any other trial).
4 The probability of success, p, is the same from trial to trial.


Geometric Distribution and Probability Function

In a geometric experiment, if X is the number of trials until the first
success we say that X is a geometric random variable that follows the
geometric distribution with parameter p. Let X ∼ Geo(p); then the
probability function for X is given by

P(X = r) = f(r) = p (1 − p)^(r − 1), for r = 1, 2, ...

where
f(r) = the probability that the first success occurs on the r-th trial
p = the probability of a success on any one trial
1 − p = the probability of a failure on any one trial


Geometric Expected Value and Variance

Let X ∼ Geo(p); then the expected value and variance are given by

E(x) = µ = 1/p
Var(x) = σ² = (1 − p)/p²


Example
Products produced by a machine have a 3% defective rate. What is the
probability that the first defective item occurs
(a) at the fifth item inspected?
(b) within the first five inspections?

Let X be the number of inspections until the first defective item; then
X ∼ Geo(0.03).
(a) P(X = 5) = f(5) = (0.03)(0.97)^4 = 0.0266.
(b) P(X ≤ 5) = 1 − P(first 5 items non-defective) = 1 − (0.97)^5 = 0.1413.
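
A small Python sketch of the geometric calculations in this example follows. The function names `geom_pmf` and `geom_cdf` are hypothetical helpers written for this illustration.

```python
def geom_pmf(r, p):
    """P(X = r): r - 1 failures followed by the first success."""
    return p * (1 - p) ** (r - 1)

def geom_cdf(r, p):
    """P(X <= r) = 1 - P(the first r trials are all failures)."""
    return 1 - (1 - p) ** r

p = 0.03  # defective rate from the example

p_fifth_item = geom_pmf(5, p)
p_within_five = geom_cdf(5, p)

# The cumulative probability is also the sum of the first five pmf values.
p_within_five_summed = sum(geom_pmf(r, p) for r in range(1, 6))

print(round(p_fifth_item, 4), round(p_within_five, 4))
```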



Binomial vs Geometric Probability Distribution

                      Binomial Dis.             Geometric Dis.
# of trials n         Known                     Unknown
# of outcomes         2                         2
P(Success) = p        Fixed                     Fixed
Random variable X     Total number of           Number of trials
                      successes in n trials     until the 1st success
X values              x = 0, 1, 2, ..., n       x = 1, 2, 3, ...


Example: A box contains 4 red, 6 blue, and 5 green balls.

Case 1: If you select balls one at a time with replacement, what is the
probability that the first blue ball is the seventh selection?

Let X be the number of trials until the first blue ball.

p = P(B) = 6/15 = 0.4
X ∼ Geo(0.4)
P(X = 7) = (0.4)(1 − 0.4)^6 = 0.0187


Case 2: If you select 8 balls at random with replacement, what is the
probability that you will select exactly 3 green balls?

Let X be the total number of green balls selected.

p = P(G) = 5/15 = 1/3 ≈ 0.3333
X ∼ Bin(8, 1/3)
P(X = 3) = C(8, 3) (1/3)^3 (1 − 1/3)^5 = 0.2731
3



5.5 Poisson Probability Distribution

Properties of a Poisson Experiment


An experiment is a Poisson experiment if it satisfies the following
conditions:
1 The expected value or mean number of occurrences µ is the same for
any two intervals of equal length.
2 The occurrence or nonoccurrence in any interval is independent of the
occurrence or nonoccurrence in any other interval.


Poisson Distribution and Probability Function

In a Poisson experiment, if X is the number of occurrences over a specified
interval of time or space we say that X is a Poisson random variable that
follows the Poisson distribution with parameter µ. Let X ∼ Poi(µ); then
the probability function for X is given by

P(X = r) = f(r) = µ^r e^(−µ) / r!, for r = 0, 1, 2, ...

where
f(r) = the probability of r occurrences in an interval
µ = expected value or mean number of occurrences in an interval


Poisson Expected Value and Variance


Let X ∼ Poi (µ) then the expected value and variance are given by

E (x ) = µ
Var (x ) = µ


Example
During the period of time that a local university takes phone-in
registrations, calls come in at the rate of one every two minutes.
(a) What is the expected number of calls in one hour?
(b) What is the probability of four calls in six minutes?


Let X be the number of calls.

# of calls → Period of time
(a) 1 → 2 minutes
    ? → 1 hour = 60 minutes

Then the expected number of calls in 1 hour = 60/2 = 30.

# of calls → Period of time
(b) 1 → 2 minutes
    ? → 6 minutes

Then the expected number of calls in 6 minutes = 6/2 = 3, so µ = 3 and

P(X = 4) = f(4) = 3^4 e^(−3) / 4! = 0.1680.


Example
An average of 15 aircraft accidents occur each year.
(a) Compute the mean number of aircraft accidents per month.
(b) Compute the probability of no accidents during a month.
(c) Compute the probability of exactly one accident during a month.
(d) Compute the probability of more than one accident during a month.


# of accidents → Period of time
(a) 15 → 1 year = 12 months
    ? → 1 month

Then the mean number of accidents in 1 month = 15/12 = 1.25, so µ = 1.25.

(b) P(X = 0) = f(0) = 1.25^0 e^(−1.25) / 0! = 0.2865.

(c) P(X = 1) = f(1) = 1.25^1 e^(−1.25) / 1! = 0.3581.

(d) P(X > 1) = 1 − P(X ≤ 1) = 1 − [f(0) + f(1)] = 0.3554.
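
Both Poisson examples can be checked with a short Python sketch. The helper `poisson_pmf` is an illustrative name; the key step in each case is rescaling the mean to the interval of interest before applying the formula.

```python
from math import exp, factorial

def poisson_pmf(r, mu):
    """P(X = r) for X ~ Poi(mu): mu^r e^(-mu) / r!."""
    return mu ** r * exp(-mu) / factorial(r)

# Phone registrations: 1 call per 2 minutes, so 3 calls expected in 6 minutes.
mu_calls = 6 / 2
p_four_calls = poisson_pmf(4, mu_calls)

# Aircraft accidents: 15 per year, so 15/12 expected per month.
mu_month = 15 / 12
p_none = poisson_pmf(0, mu_month)
p_one = poisson_pmf(1, mu_month)
p_more_than_one = 1 - (p_none + p_one)

print(round(p_four_calls, 4))
print(round(p_none, 4), round(p_one, 4), round(p_more_than_one, 4))
```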

Chapter 6: Continuous Probability Distributions



6.2 Normal Probability Distribution

Normal distributions arise naturally for many different random variables,
such as heights, test scores, and lengths.
Normal Distribution
A normal distribution has a symmetrical, bell-shaped density curve defined by a
mean µ and a standard deviation σ. We say that X is normally distributed
and write X ∼ N(µ, σ). In these notes the second parameter is the
standard deviation, not the variance.


The 68%, 95%, and 99.7% Rule

Despite the different means and standard deviations, all normal distributions
have the following rule in common. For every normal distribution,
approximately
68% of all values fall within one standard deviation of the mean
(µ ± σ).
95% of all values fall within two standard deviations of the mean
(µ ± 2σ).
99.7% of all values fall within three standard deviations of the mean
(µ ± 3σ).


To compute probabilities under the normal distribution N(µ, σ), we
transform the normal variable X into a standard normal variable Z by finding
the z-score

Z = (X − µ) / σ.

The standard normal distribution is a normal distribution with mean µ = 0
and standard deviation σ = 1 (i.e., Z ∼ N(0, 1)).

Example: If the heights X of adult males in Jordan follow a normal
distribution with mean 179 cm and standard deviation 6 cm, i.e.,
X ∼ N(179, 6), and we measure one man's height to be 188 cm, then the
z-score for this man's height is

Z = (X − µ) / σ = (188 − 179) / 6 = 1.5

In other words, this man's height is 1.5 standard deviations above the
population mean.

P(z ≤ 1.00) = 0.8413

P(z ≥ 1.58) = 1 − P(z ≤ 1.58) = 1 − 0.9429 = 0.0571

P(−0.50 ≤ z ≤ 1.25) = P(z ≤ 1.25) − P(z ≤ −0.50) = 0.8944 − 0.3085 = 0.5859

P(−1.00 ≤ z ≤ 1.00) = P(z ≤ 1.00) − P(z ≤ −1.00) = 0.8413 − 0.1587 = 0.6826

P(−1.00 ≤ z ≤ 1.00) = 1 − 2P(z ≤ −1.00) = 1 − 2(0.1587) = 0.6826


Summary
P (z ≤ b ) → z-table

P (z ≥ a ) → 1 − P (z ≤ a ) → z-table

P (a ≤ z ≤ b ) → P (z ≤ b ) − P (z ≤ a ) → z-table

P (−a ≤ z ≤ a) → 1 − 2P (z ≤ −a) → z-table

P (z = b ) = 0
P (z ≤ −b ) = P (z ≥ b )
P (z ≥ −b ) = P (z ≤ b )
P (|z | ≤ b ) = P (−b ≤ z ≤ b )
P (|z | ≥ b ) = 2P (z ≤ −b )


Example: Find:
1 P(z < 0.36) = 0.6406.

2 P(−2.05 ≤ z ≤ 0.6) = P(z ≤ 0.6) − P(z ≤ −2.05) = 0.7257 − 0.0202 = 0.7055.

3 P(z > 0.01) = 1 − P(z < 0.01) = 1 − 0.5040 = 0.4960.

4 P(−0.4 ≤ z ≤ 0.4) = 1 − 2P(z ≤ −0.4) = 1 − 2(0.3446) = 0.3108.
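
Instead of a printed z-table, the four probabilities above can be computed with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). This is a sketch for checking table lookups; the variable names are illustrative, and results may differ from the table in the fourth decimal place.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

p1 = Z.cdf(0.36)                  # P(z < 0.36)
p2 = Z.cdf(0.6) - Z.cdf(-2.05)    # P(-2.05 <= z <= 0.6)
p3 = 1 - Z.cdf(0.01)              # P(z > 0.01)
p4 = 1 - 2 * Z.cdf(-0.4)          # P(-0.4 <= z <= 0.4), using symmetry

for p in (p1, p2, p3, p4):
    print(round(p, 4))
```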

Example Find the value of a for each situation.
1 P (z < a ) = 0.3300. from the table a = −0.44.

2 P (z > a) = 0.9750.
P (z > a) =0.9750
1 − P (z < a) =0.9750
P (z < a) =1 − 0.9750
P (z < a) =0.0250 from the table a = −1.96
3 P (0 < z < a) = 0.4700.
P (0 < z < a) =0.4700
P (z < a) − P (z < 0) =0.4700
P (z < a) − 0.5 =0.4700
P (z < a) =0.4700 + 0.5
P (z < a) =0.9700 from the table a = 1.88

Example: Find the value of a such that P(−a < z < a) = 0.2050.

P(−a < z < a) = 0.2050
1 − 2P(z < −a) = 0.2050
2P(z < −a) = 1 − 0.2050
2P(z < −a) = 0.7950
P(z < −a) = 0.3975

From the table we have −a = −0.26 → a = 0.26.
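
Reading the z-table "backwards" corresponds to the inverse cumulative distribution function. The sketch below uses `NormalDist.inv_cdf` to recover the four values of a found in these examples; the names `a1` to `a4` are illustrative, and the results match the table values only after rounding to two decimals.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

a1 = Z.inv_cdf(0.3300)                # P(z < a) = 0.3300
a2 = Z.inv_cdf(1 - 0.9750)            # P(z > a) = 0.9750, so P(z < a) = 0.0250
a3 = Z.inv_cdf(0.4700 + 0.5)          # P(0 < z < a) = 0.4700, so P(z < a) = 0.9700
a4 = -Z.inv_cdf((1 - 0.2050) / 2)     # P(-a < z < a) = 0.2050, so P(z < -a) = 0.3975

print(round(a1, 2), round(a2, 2), round(a3, 2), round(a4, 2))
```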


Example: The time needed to complete a final examination in a particular
college course is normally distributed with a mean of 80 minutes and
variance of 100 minutes². Answer the following questions.

Let X be the time needed to complete the final examination; then
σ = √100 = 10 minutes and X ∼ N(80, 10).

1 What is the probability of completing the exam in one hour or less?
P(x < 60) = P((x − µ)/σ < (60 − 80)/10)
= P(z < −2.00)
= 0.0228
2 What is the probability that a student will complete the exam in more
than 60 minutes but less than 75 minutes?
P(60 < x < 75) = P((60 − 80)/10 < (x − µ)/σ < (75 − 80)/10)
= P(−2.00 < z < −0.50)
= P(z < −0.50) − P(z < −2.00)
= 0.3085 − 0.0228
= 0.2857

Example: According to the Sleep Foundation, the average night's sleep is
6.8 hours. Assume the standard deviation is 0.6 hours and that the
probability distribution is normal.

Let X be sleeping hours; then X ∼ N(6.8, 0.6).

1 What is the probability that a randomly selected person sleeps at least
7 hours?
P(x ≥ 7) = 1 − P(x < 7)
= 1 − P((x − µ)/σ < (7 − 6.8)/0.6)
= 1 − P(z < 0.33)
= 1 − 0.6293 = 0.3707.
2 Doctors suggest getting between 7 and 8 hours of sleep each night.
What percentage of the population gets this much sleep?
P(7 < x < 8) = P((7 − 6.8)/0.6 < (x − µ)/σ < (8 − 6.8)/0.6)
= P(0.33 < z < 2.00)
= P(z < 2.00) − P(z < 0.33)
= 0.9772 − 0.6293 = 0.3479.
Example: The mean hourly pay rate for workers is 12 JD, and the standard
deviation is 2 JD. Assume that pay rates are normally distributed. How high
must the hourly rate be to put a worker in the top 10% with respect to
pay? Let X be the pay rate; then X ∼ N(12, 2).

P(x ≥ a) = 0.1000
1 − P(x < a) = 0.1000
P(x < a) = 1 − 0.1000
P(x < a) = 0.9000
P((x − µ)/σ < (a − 12)/2) = 0.9000
P(z < (a − 12)/2) = 0.9000
(a − 12)/2 = 1.28 → a = 14.56 JD.
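
The cutoff in this example is the 90th percentile of the pay-rate distribution, so it can be checked in one call. The sketch assumes `statistics.NormalDist` with the mean and standard deviation from the example; the variable names are illustrative.

```python
from statistics import NormalDist

pay = NormalDist(mu=12, sigma=2)  # hourly pay rate in JD

# Top 10% means 90% of workers earn less than the cutoff.
cutoff = pay.inv_cdf(0.90)

# Equivalent route through the standard normal: a = mu + z * sigma.
z_90 = NormalDist().inv_cdf(0.90)
cutoff_via_z = 12 + z_90 * 2

print(round(cutoff, 2), round(z_90, 2))
```

The small difference from a table-based answer comes from rounding z to 1.28 in the table.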



6.4 Exponential Probability Distribution
Properties of an Exponential Experiment
An experiment is an exponential experiment if the expected value, or
mean time, µ required to complete one success (or between the occurrences
of two successive successes) is constant.

Exponential Probability Distribution

In an exponential experiment, if X is the time required to complete one
success, then X is an exponential random variable that follows the
exponential distribution with parameter µ. Let X ∼ Exp(µ); then the
probability density function for X is given by

f(x) = (1/µ) e^(−x/µ), for x ≥ 0

where µ is the expected value or mean time.

E(x) = µ and Var(x) = µ²

Computing Probabilities for the Exponential Distribution

1 P(x = c) = 0.

2 P(x ≤ b) = P(x < b) = ∫_0^b (1/µ) e^(−x/µ) dx = 1 − e^(−b/µ).

3 P(x ≥ a) = P(x > a) = ∫_a^∞ (1/µ) e^(−x/µ) dx = e^(−a/µ).

4 P(a ≤ x ≤ b) = ∫_a^b (1/µ) e^(−x/µ) dx = e^(−a/µ) − e^(−b/µ).

Example: In a loading dock the mean time required to load one truck is 15
minutes. What is the probability that

1 loading a truck will take 6 minutes or less?
P(x ≤ 6) = 1 − e^(−6/15) = 0.3297.
2 the loading time will be between 6 minutes and 18 minutes?
P(6 ≤ x ≤ 18) = e^(−6/15) − e^(−18/15) = 0.3691.
3 a truck will take at least 11 minutes to load?
P(x ≥ 11) = e^(−11/15) = 0.4803.
4 a truck will take at most 20 minutes to load given that it will take
more than 14 minutes to load?
P(x ≤ 20/x > 14) = P(x ≤ 20 and x > 14) / P(x > 14) = P(14 < x ≤ 20) / P(x > 14)
= [e^(−14/15) − e^(−20/15)] / e^(−14/15) = 0.3297.
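
The four loading-dock answers can be reproduced from the cumulative distribution function alone. In this sketch `expon_cdf` is a hypothetical helper parameterized by the mean µ, as in these notes, rather than by a rate.

```python
from math import exp

def expon_cdf(x, mu):
    """P(X <= x) for an exponential random variable with mean mu."""
    return 1 - exp(-x / mu) if x >= 0 else 0.0

mu = 15  # mean loading time in minutes

p_at_most_6 = expon_cdf(6, mu)
p_between_6_and_18 = expon_cdf(18, mu) - expon_cdf(6, mu)
p_at_least_11 = 1 - expon_cdf(11, mu)

# Conditional probability: at most 20 minutes given more than 14 minutes.
p_cond = (expon_cdf(20, mu) - expon_cdf(14, mu)) / (1 - expon_cdf(14, mu))

print(round(p_at_most_6, 4), round(p_between_6_and_18, 4))
print(round(p_at_least_11, 4), round(p_cond, 4))
```

The conditional answer equals P(x ≤ 6) because only the remaining 6 minutes matter, which is the memoryless property of the exponential distribution.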

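Example: The average number of years a laptop will function is 5 years. If a
customer purchased a used laptop that has already been in use for two years,
what is the probability that it will function for at least 3 more years?

Functioning for at least 3 more years means a total lifetime of at least
5 years, so

P(x ≥ 5/x > 2) = P(x ≥ 5 and x > 2) / P(x > 2) = P(x ≥ 5) / P(x > 2)
= e^(−5/5) / e^(−2/5) = 0.5488.
Note,
e^(−5/5) / e^(−2/5) = e^(−5/5 + 2/5) = e^(−3/5) = 0.5488,
which equals P(x ≥ 3): the exponential distribution is memoryless.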
e − 52



Poisson vs Exponential Probability Distribution

            Poisson Dis.                     Exponential Dis.
E(X)        Mean number of events            Mean time required
            in a given period of time        to finish one event
Example     3 cakes are done in one day      One cake needs 45 minutes
                                             to be ready
            A doctor treats 15 patients      A doctor visit takes
            in one week                      35 minutes
            A cashier helps 9 customers      One customer needs 20
            every 12 hours                   minutes at the cashier

Chapter 7: Sampling and Sampling Distributions



7.5 Sampling Distribution of x̄

Sampling Distribution of Sample Mean

Let X be normally distributed with mean µ and standard deviation σ. If we
select a simple random sample of size n from the population of X, then
the sample mean is also normally distributed with mean µ and
standard deviation σ/√n (i.e., x̄ ∼ N(µ, σ/√n)).

Random Variable           Mean                  Variance                Standard Deviation
xi                        µ                     σ²                      σ
∑_{i=1}^{n} xi            ∑_{i=1}^{n} µ = nµ    ∑_{i=1}^{n} σ² = nσ²    √n σ
x̄ = (∑_{i=1}^{n} xi)/n    nµ/n = µ              nσ²/n² = σ²/n           σ/√n


Example
The heights of adult males in Jordan are normally distributed with mean
height 172 cm and variance 144 cm².
1 What is the probability that the height of a selected male is less than
175 cm?
2 If we select a simple random sample of 25 males, what is the
probability that the sum of their heights is more than 4350 cm?
3 If we select a simple random sample of 25 males, what is the
probability that the sample mean is between 165 cm and 179 cm?

Dr. Ayat Al-Meanazel (A.A.U) 147 / 190


Let x be the height of an adult male in Jordan, then $x \sim N(172, 12)$, which means $\mu = 172$ and $\sigma = 12$.

(1)
\[
P(x < 175) = P\left(\frac{x - \mu}{\sigma} < \frac{175 - 172}{12}\right) = P(z < 0.25) = 0.5987.
\]

(2) Here we have a sample of size n = 25, then we know that

\[
\sum_{i=1}^{25} x_i \sim N(25\mu, \sqrt{25}\,\sigma) = N(4300, 60).
\]

\[
\begin{aligned}
P\left(\sum_{i=1}^{25} x_i > 4350\right) &= P\left(\frac{\sum_{i=1}^{25} x_i - n\mu}{\sqrt{n}\,\sigma} > \frac{4350 - 4300}{60}\right) \\
&= P(z > 0.83) \\
&= 1 - P(z < 0.83) = 1 - 0.7967 = 0.2033.
\end{aligned}
\]

(3) Here we have a sample of size n = 25, then we know that

\[
\bar{x} \sim N\left(\mu, \frac{\sigma}{\sqrt{25}}\right) = N(172, 2.4).
\]

\[
\begin{aligned}
P(165 < \bar{x} < 179) &= P\left(\frac{165 - 172}{2.4} < \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} < \frac{179 - 172}{2.4}\right) \\
&= P(-2.92 < z < 2.92) \\
&= 1 - 2P(z < -2.92) = 1 - 2(0.0018) = 0.9964.
\end{aligned}
\]
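The three probabilities in this example can be cross-checked without a z table. This is a sketch using `statistics.NormalDist` from the Python standard library; the variable names are illustrative, and the results differ slightly from the table answers because the table rounds z to two decimals.

```python
import math
from statistics import NormalDist

mu, sigma, n = 172, 12, 25

single = NormalDist(mu, sigma)                      # one height
total = NormalDist(n * mu, math.sqrt(n) * sigma)    # sum of 25 heights
mean = NormalDist(mu, sigma / math.sqrt(n))         # sample mean of 25 heights

p1 = single.cdf(175)                 # P(x < 175)
p2 = 1 - total.cdf(4350)             # P(sum > 4350)
p3 = mean.cdf(179) - mean.cdf(165)   # P(165 < x-bar < 179)

print(round(p1, 4), round(p2, 4), round(p3, 4))
```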


Example
The average price of a gallon of gasoline was reported to be $2.34. Use this price as the population mean, and assume the population standard deviation is $0.20.
1 What is the probability that the price of a gallon of gasoline at a randomly selected service station is within $0.02 of the population mean?
2 What is the probability that the mean price for a sample of 49 service stations is within $0.04 of the population mean?
3 A sample mean has a 0.95 probability of being within $0.03 of the population mean. What is the sample size?


Let x be the price of a gallon of gasoline, then $x \sim N(2.34, 0.20)$, which means $\mu = 2.34$ and $\sigma = 0.20$.

(1)
\[
\begin{aligned}
P(2.32 < x < 2.36) &= P\left(\frac{2.32 - 2.34}{0.20} < \frac{x - \mu}{\sigma} < \frac{2.36 - 2.34}{0.20}\right) \\
&= P(-0.10 < z < 0.10) \\
&= 1 - 2P(z < -0.10) \\
&= 1 - 2(0.4602) = 1 - 0.9204 = 0.0796.
\end{aligned}
\]

(2) Here we have a sample of size n = 49, then we know that $\bar{x} \sim N(\mu, \sigma/\sqrt{49}) = N(2.34, 0.0286)$, since $0.20/7 \approx 0.0286$.

\[
\begin{aligned}
P(2.30 < \bar{x} < 2.38) &= P\left(\frac{2.30 - 2.34}{0.0286} < \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} < \frac{2.38 - 2.34}{0.0286}\right) \\
&= P(-1.40 < z < 1.40) \\
&= 1 - 2P(z < -1.40) \\
&= 1 - 2(0.0808) = 1 - 0.1616 = 0.8384.
\end{aligned}
\]
(3) Here we have a sample of size n, then we know that $\bar{x} \sim N(\mu, \sigma/\sqrt{n}) = N(2.34, 0.20/\sqrt{n})$.

\[
\begin{aligned}
P(2.31 < \bar{x} < 2.37) &= P\left(\frac{2.31 - 2.34}{0.20/\sqrt{n}} < \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} < \frac{2.37 - 2.34}{0.20/\sqrt{n}}\right) \\
0.95 &= P\left(\frac{-0.03}{0.20/\sqrt{n}} < z < \frac{0.03}{0.20/\sqrt{n}}\right) \\
0.95 &= 1 - 2P\left(z < \frac{-0.03}{0.20/\sqrt{n}}\right) \\
2P\left(z < \frac{-0.03}{0.20/\sqrt{n}}\right) &= 1 - 0.95 \\
P\left(z < \frac{-0.03}{0.20/\sqrt{n}}\right) &= \frac{0.05}{2} = 0.025
\end{aligned}
\]

Then we have

\[
\begin{aligned}
\frac{-0.03}{0.20/\sqrt{n}} &= -1.96 \\
\frac{-0.03\sqrt{n}}{0.20} &= -1.96 \\
-0.03\sqrt{n} &= -0.392 \\
\sqrt{n} &= \frac{-0.392}{-0.03} = 13.07 \\
n &= 170.7
\end{aligned}
\]

We round up to n = 171 so that the probability is at least 0.95.
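The sample size calculation can be sketched in Python as follows. The function names are hypothetical, and the z value comes from `statistics.NormalDist` rather than a table. The sketch rounds n up, because rounding $\sqrt{n}$ down to 13 first gives n = 169, which falls slightly short of the requested probability.

```python
import math
from statistics import NormalDist

def sample_size_for_margin(sigma, margin, prob):
    """Smallest n with P(|x-bar - mu| < margin) >= prob, for known sigma."""
    z = NormalDist().inv_cdf(1 - (1 - prob) / 2)  # about 1.96 for prob = 0.95
    return math.ceil((z * sigma / margin) ** 2)

def coverage(sigma, margin, n):
    """P(|x-bar - mu| < margin) for a sample of size n."""
    z = margin / (sigma / math.sqrt(n))
    return 2 * NormalDist().cdf(z) - 1

n = sample_size_for_margin(sigma=0.20, margin=0.03, prob=0.95)
print(n)                            # 171
print(coverage(0.20, 0.03, 171))    # just above 0.95
print(coverage(0.20, 0.03, 169))    # just below 0.95
```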


Chapter 8: Interval Estimation



Population Mean: Standard Deviation Known

Margin of Error and the Interval Estimation


If a sample of size n with mean $\bar{x}$ is selected from a population with unknown mean $\mu$ and known standard deviation $\sigma$, then the $100(1-\alpha)\%$ confidence interval for estimating the true population mean $\mu$ is given by

\[
\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \quad\text{such that}\quad \bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}
\]

Where,
The margin of error $= z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$.
$z_{\alpha/2}$ is the z value providing an area of $\alpha/2$ in the upper tail of the standard normal distribution.




To construct a $100(1-\alpha)\%$ confidence interval for the population mean you will need the following values:
Population s.d.     $\sigma$
Sample size         $n$
Sample mean         $\bar{x}$
Confidence level    $100(1-\alpha)\%$
z value             Using the t table

How to find $z_{\alpha/2}$?

Use the last row of the t table (infinite degrees of freedom) and the column for upper tail probability $\alpha/2$.


Example In an effort to estimate the mean amount spent per customer for dinner at a major restaurant, data were collected for a sample of 49 customers. Assume a population standard deviation of 5 JD. If the sample mean is 24.80 JD, what is the 98% confidence interval for the population mean?
Population s.d.     $\sigma = 5$
Sample size         $n = 49$
Sample mean         $\bar{x} = 24.80$
Confidence level    0.98
z value             $z_{\alpha/2} = 2.326$

\[
\text{Margin of error} = z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 2.326\left(\frac{5}{\sqrt{49}}\right) \approx 1.7
\]

\[
\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 24.8 \pm 1.7 \quad\text{then}\quad \mu \in (23.1, 26.5).
\]
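The interval in this example can be reproduced with a short Python sketch. The function name `z_interval` is an assumption, and the z value is computed with `statistics.NormalDist` instead of being read from a table, so the result keeps more decimals than the slide.

```python
import math
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence):
    """Confidence interval for the mean when sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # upper tail area alpha/2
    margin = z * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin, margin

low, high, margin = z_interval(xbar=24.80, sigma=5, n=49, confidence=0.98)
print(round(margin, 2), round(low, 2), round(high, 2))  # 1.66 23.14 26.46
```

Rounded to one decimal, this matches the margin of 1.7 and the interval (23.1, 26.5).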


Example A 95% confidence interval for a population mean was reported to be 152 to 160. If σ = 15, what sample size was used in this study?
Population s.d.      $\sigma = 15$
Confidence level     0.95
z value              $z_{\alpha/2} = 1.96$
Upper value of µ     $\bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 160$
Lower value of µ     $\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 152$

Note,

\[
\begin{aligned}
\left(\bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) - \left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) &= 2z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \\
160 - 152 &= 2(1.96)\left(\frac{15}{\sqrt{n}}\right) \\
n &= \left(\frac{2(1.96)(15)}{8}\right)^2 = 54.02 \approx 54 \quad\text{(round to nearest integer)}
\end{aligned}
\]
Population Mean: Standard Deviation Unknown

Margin of Error and the Interval Estimation


If a sample of size n with mean $\bar{x}$ and standard deviation $s$ is selected from a population with unknown mean $\mu$ and unknown standard deviation $\sigma$, then the $100(1-\alpha)\%$ confidence interval for estimating the true population mean $\mu$ is given by

\[
\bar{x} \pm t_{\alpha/2}\frac{s}{\sqrt{n}} \quad\text{such that}\quad \bar{x} - t_{\alpha/2}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2}\frac{s}{\sqrt{n}}
\]

Where,
The margin of error $= t_{\alpha/2}\,\frac{s}{\sqrt{n}}$.
$t_{\alpha/2}$ is the t value providing an area of $\alpha/2$ in the upper tail of the t distribution with $n - 1$ degrees of freedom.




To construct a $100(1-\alpha)\%$ confidence interval for the population mean you will need the following values:
Sample size         $n$
Sample mean         $\bar{x}$
Sample s.d.         $s$
Confidence level    $100(1-\alpha)\%$
t value             Using the t table

How to find $t_{\alpha/2}$?

Step 1   Degrees of freedom        $df = n - 1$
Step 2   Upper tail probability    $\alpha/2$
Step 3   t value                   Intersection between row $n - 1$ and column $\alpha/2$


Example A simple random sample with n = 25 provided a sample mean of 22.5 and a sample standard deviation of 4.4. Develop an 80% confidence interval for the population mean.
Sample size               $n = 25$
Sample mean               $\bar{x} = 22.5$
Sample s.d.               $s = 4.4$
then
Confidence level          $1 - \alpha = 0.80 \rightarrow \alpha = 0.20$
Upper tail probability    $\alpha/2 = 0.20/2 = 0.10$
df                        $n - 1 = 25 - 1 = 24$
t value                   $t_{\alpha/2} = 1.318$

\[
\text{Margin of error} = t_{\alpha/2}\frac{s}{\sqrt{n}} = 1.318\left(\frac{4.4}{\sqrt{25}}\right) = 1.16
\]

\[
\bar{x} \pm t_{\alpha/2}\frac{s}{\sqrt{n}} = 22.5 \pm 1.16 \quad\text{then}\quad \mu \in (21.34, 23.66).
\]


Example A sample of 81 weekly reports showed a sample mean of 19.5 customer contacts per week. The sample standard deviation was 5.2. Provide a 90% confidence interval for the population mean number of weekly customer contacts.
Sample size               $n = 81$
Sample mean               $\bar{x} = 19.5$
Sample s.d.               $s = 5.2$
then
Confidence level          $1 - \alpha = 0.90 \rightarrow \alpha = 0.10$
Upper tail probability    $\alpha/2 = 0.10/2 = 0.05$
df                        $n - 1 = 81 - 1 = 80$
t value                   $t_{\alpha/2} = 1.664$

\[
\text{Margin of error} = t_{\alpha/2}\frac{s}{\sqrt{n}} = 1.664\left(\frac{5.2}{\sqrt{81}}\right) = 0.96
\]

\[
\bar{x} \pm t_{\alpha/2}\frac{s}{\sqrt{n}} = 19.5 \pm 0.96 \quad\text{then}\quad \mu \in (18.54, 20.46).
\]
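Both t intervals can be checked with a few lines of Python. This is a sketch under one assumption: the t value is typed in from the t table (1.318 for df = 24, 1.664 for df = 80, as used in the solutions), because the Python standard library has no t distribution. The function name `t_interval` is illustrative.

```python
import math

def t_interval(xbar, s, n, t_crit):
    """Confidence interval for the mean when sigma is unknown.

    t_crit is the table value t_{alpha/2} with n - 1 degrees of freedom.
    """
    margin = t_crit * s / math.sqrt(n)
    return xbar - margin, xbar + margin, margin

# 80% interval: n = 25, df = 24, t = 1.318
low1, high1, m1 = t_interval(22.5, 4.4, 25, 1.318)
# 90% interval: n = 81, df = 80, t = 1.664
low2, high2, m2 = t_interval(19.5, 5.2, 81, 1.664)

print(round(m1, 2), round(low1, 2), round(high1, 2))  # 1.16 21.34 23.66
print(round(m2, 2), round(low2, 2), round(high2, 2))  # 0.96 18.54 20.46
```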
Chapter 9: Hypotheses Tests



Hypotheses Tests about a Population Mean

Hypothesis testing can be used to determine whether a statement about the value of a population mean µ should or should not be rejected.

In hypothesis testing we begin by making an unconfirmed assumption about a population mean µ. This claim is called the null hypothesis and is denoted by H0.

We then define another hypothesis, called the alternative hypothesis, which is the opposite of what is stated in the null hypothesis. The alternative hypothesis is denoted by Ha.


Steps of Hypothesis Testing about a Population Mean

Step 1. Develop the Null and Alternative Hypotheses

Type of Test    Hypotheses       Keywords

Lower tail      H0 : µ ≥ µ0      Less than or at most µ0
                Ha : µ < µ0

Upper tail      H0 : µ ≤ µ0      More than or at least µ0
                Ha : µ > µ0

Two tailed      H0 : µ = µ0      Different or not equal to µ0
                Ha : µ ≠ µ0


Step 2. Calculate the Test Statistic

Let $\bar{x}$ be the mean of a sample of size n, and $\mu_0$ the claimed population mean. Then if the
1 Population Standard Deviation σ Known, the test statistic is given by

\[
z_{test} = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}
\]

where σ is the population standard deviation.

2 Population Standard Deviation σ Unknown, the test statistic is given by

\[
t_{test} = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}
\]

where s is the sample standard deviation.


Step 3. Determine the Critical Value and Rejection Rule

Type of Test    Rejection Region      Critical Value

Lower tail      Lower (left) tail     $z_{\alpha}$ or $t_{(1-\alpha,\,n-1)}$

Upper tail      Upper (right) tail    $z_{1-\alpha}$ or $t_{(\alpha,\,n-1)}$


Step 4. Conclusion

Type of Test    Rule                                                                      Conclusion

Lower tail      $z_{test} < z_{\alpha}$ or $t_{test} < t_{(1-\alpha,\,n-1)}$              We reject H0
                Otherwise, we fail to reject H0

Upper tail      $z_{test} > z_{1-\alpha}$ or $t_{test} > t_{(\alpha,\,n-1)}$              We reject H0
                Otherwise, we fail to reject H0

Two tailed      $|z_{test}| > z_{\alpha/2}$ or $|t_{test}| > t_{(\alpha/2,\,n-1)}$        We reject H0
                Otherwise, we fail to reject H0
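For the case where σ is known, the four steps can be sketched as one small Python function. The function name and the `tail` argument are assumptions made for illustration. Critical values come from `statistics.NormalDist`, so they carry more decimals than a z table, and the two tailed case compares $|z_{test}|$ with the positive critical value.

```python
import math
from statistics import NormalDist

def z_test(xbar, mu0, sigma, n, alpha, tail):
    """Return (test statistic, critical value, reject H0?) for a z test.

    tail is "lower", "upper" or "two".
    """
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    std = NormalDist()
    if tail == "lower":
        critical = std.inv_cdf(alpha)          # negative value, area alpha to its left
        reject = z < critical
    elif tail == "upper":
        critical = std.inv_cdf(1 - alpha)      # area alpha to its right
        reject = z > critical
    elif tail == "two":
        critical = std.inv_cdf(1 - alpha / 2)  # area alpha/2 in each tail
        reject = abs(z) > critical
    else:
        raise ValueError("tail must be 'lower', 'upper' or 'two'")
    return z, critical, reject

# Lower tail test: H0: mu >= 80, sample of 100, sigma = 12, x-bar = 77, alpha = 0.01
z, critical, reject = z_test(77, 80, 12, 100, 0.01, "lower")
print(round(z, 2), round(critical, 2), reject)  # -2.5 -2.33 True
```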


Hypothesis Testing about a Population Mean

Example
Consider the following hypothesis test:

H0 : µ ≥ 80
Ha : µ < 80

A sample of 100 is used and the population standard deviation is 12. Compute the test statistic and state your conclusion for a sample mean of 77; use α = 0.01.

µ0    80
σ     12
n     100
x̄     77
α     0.01

Step 2. Calculate your test statistic.

\[
z_{test} = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{77 - 80}{12/\sqrt{100}} = -2.50
\]

Step 3. Determine your rejection region (i.e. critical value).

Lower tail test → critical value $= z_{\alpha} = z_{0.01} = -2.33$

Step 4. Conclusion.

We reject H0 since $z_{test} = -2.50 < -2.33 = z_{\alpha}$.

There is sufficient evidence that the true population mean µ is less than 80.


Example
Consider the following hypothesis test:

H0 : µ ≤ 50
Ha : µ > 50

A sample of 25 is used. Compute the test statistic and state your conclusion for a sample mean of 62 and a sample variance of 64; use α = 0.05.

µ0    50
s     8
n     25
x̄     62
α     0.05

Step 2. Calculate your test statistic.

\[
t_{test} = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{62 - 50}{8/\sqrt{25}} = 7.50
\]

Step 3. Determine your rejection region (i.e. critical value).

Upper tail test → critical value $= t_{(\alpha,\,n-1)} = t_{(0.05,\,24)} = 1.711$

Step 4. Conclusion.

We reject H0 since $t_{test} = 7.50 > 1.711 = t_{(\alpha,\,n-1)}$.

There is sufficient evidence that the true population mean µ is more than 50.


Example
Consider the following hypothesis test:

H0 : µ = 15
Ha : µ ≠ 15

A sample of 16 is used. Compute the test statistic and state your conclusion for a population with standard deviation 11 and a sample mean of 12. Use α = 0.25.

µ0    15
σ     11
n     16
x̄     12
α     0.25

Step 2. Calculate your test statistic.

\[
z_{test} = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{12 - 15}{11/\sqrt{16}} = -1.09
\]

Step 3. Determine your rejection region (i.e. critical value).

Two tailed test → critical value $= z_{\alpha/2} = z_{0.125} = -1.15$

Step 4. Conclusion.

We fail to reject H0 since $|z_{test}| = 1.09 < 1.15 = |z_{\alpha/2}|$.

There is not sufficient evidence that the true population mean µ is different from 15.


Example
Individuals filing income tax returns prior to March 31 received an average refund of 1056 JD. A researcher suggests that late filers, who file their tax return during the last five days of the income tax period, receive smaller refunds than early filers do. Assume the population of late filers has a standard deviation of 1600 JD. For a sample of 400 individuals who filed a tax return during the last five days, the sample mean refund was 910 JD. Develop appropriate hypotheses and use α = 0.05 to test the researcher's suggestion.

µ0    1056
σ     1600
Ha    µ < 1056
n     400
x̄     910
α     0.05

Step 1. State your hypotheses.

H0 : µ ≥ 1056 vs Ha : µ < 1056

Step 2. Calculate your test statistic.

\[
z_{test} = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{910 - 1056}{1600/\sqrt{400}} = -1.825
\]

Step 3. Determine your rejection region (i.e. critical value).

Lower tail test → critical value $= z_{\alpha} = z_{0.05} = -1.645$

Step 4. Conclusion.
We reject H0 since $z_{test} = -1.825 < -1.645 = z_{\alpha}$.
There is sufficient evidence that the average refund of late filers is less than 1056 JD.


Example
For Jordan, the mean monthly Internet bill is 32.79 JD per household. A sample of 30 households in southern Jordan showed a sample mean of 35.63 JD with a standard deviation of 5.60 JD. Formulate hypotheses for a test to determine whether the sample data support the claim that the mean monthly Internet bill in southern Jordan is more than the national mean of 32.79 JD at α = 0.15.

µ0    32.79
s     5.60
Ha    µ > 32.79
n     30
x̄     35.63
α     0.15

Step 1. State your hypotheses.

H0 : µ ≤ 32.79 vs Ha : µ > 32.79

Step 2. Calculate your test statistic.

\[
t_{test} = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{35.63 - 32.79}{5.60/\sqrt{30}} = 2.778
\]

Step 3. Determine your rejection region (i.e. critical value).

Upper tail test → critical value $= t_{(\alpha,\,n-1)} = t_{(0.15,\,29)} = 1.055$

Step 4. Conclusion.
We reject H0 since $t_{test} = 2.778 > 1.055 = t_{(\alpha,\,n-1)}$.
There is sufficient evidence that the mean monthly Internet bill in southern Jordan is more than the national mean.
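When σ is unknown, the same steps apply with the t statistic. The sketch below assumes the critical value 1.055 is read from the t table (df = 29, upper tail area 0.15), since the Python standard library does not provide the t distribution. The function name is illustrative.

```python
import math

def t_statistic(xbar, mu0, s, n):
    """Test statistic for a mean when the population s.d. is unknown."""
    return (xbar - mu0) / (s / math.sqrt(n))

t = t_statistic(xbar=35.63, mu0=32.79, s=5.60, n=30)
t_crit = 1.055          # table value t_(0.15, 29), taken from the example
reject = t > t_crit     # upper tail rule

print(round(t, 3), reject)  # 2.778 True
```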
Example
The average hourly earnings for production workers are assumed to be 14.32 JD per hour. Assume the population standard deviation is σ = 9.45 JD. If we have a sample of 75 production workers with a sample mean of 14.68 JD per hour, can we conclude that the mean hourly earnings are different from 14.32 JD?
Use α = 0.20.

µ0    14.32
σ     9.45
Ha    µ ≠ 14.32
n     75
x̄     14.68
α     0.20

Step 1. State your hypotheses.

H0 : µ = 14.32 vs Ha : µ ≠ 14.32

Step 2. Calculate your test statistic.

\[
z_{test} = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{14.68 - 14.32}{9.45/\sqrt{75}} = 0.330
\]

Step 3. Determine your rejection region (i.e. critical value).

Two tailed test → critical value $= z_{\alpha/2} = z_{0.10} = -1.28$

Step 4. Conclusion.
We fail to reject H0 since $|z_{test}| = 0.330 < 1.28 = |z_{\alpha/2}|$.
There is not sufficient evidence that the mean hourly earnings are different from 14.32 JD.
Chapter 14: Simple Linear Regression



Simple Linear Regression Equation

The equation that describes how the expected value of y, denoted E(y), is related to x is called the regression equation. The regression equation for simple linear regression follows

\[
E(y) = \beta_0 + \beta_1 x
\]


Estimated Simple Linear Regression Equation

The estimated regression equation for simple linear regression follows

\[
\hat{y} = b_0 + b_1 x
\]

Where $b_0$ is the y intercept and $b_1$ is the slope. Also we can say that as the value of x increases by one unit, we expect the value of y to increase (if $b_1 > 0$) or decrease (if $b_1 < 0$) by $|b_1|$.


Steps to Find the Linear Regression Equation

Let $(x_i, y_i)$ be your data set for $i = 1, 2, \ldots, n$
Step 1: Calculate the means

\[
\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \quad\text{and}\quad \bar{y} = \frac{\sum_{i=1}^{n} y_i}{n}
\]

Step 2: Calculate the slope

\[
b_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}
\]

Step 3: Calculate the y intercept

\[
b_0 = \bar{y} - b_1 \bar{x}
\]

then we have the linear regression equation $\hat{y} = b_0 + b_1 x$.
Example
A sample of pizza restaurants from 10 different cities was selected, where xi is the size of the population (in thousands) and yi is the sales (in thousands of JD). The values of xi and yi for all i = 1, 2, . . . , 10 are shown in the table below.

i     1    2    3    4    5    6    7    8    9    10
xi    2    6    8    8    12   16   20   20   22   26
yi    58   105  88   118  117  137  157  169  149  202

Find the linear regression equation ŷ = b0 + b1 x.
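The three steps (means, slope, intercept) translate directly into code. The following Python sketch applies them to the pizza data from the table; the function name `fit_line` and the variable names are illustrative.

```python
def fit_line(xs, ys):
    """Least squares line y-hat = b0 + b1 * x, following Steps 1 to 3."""
    n = len(xs)
    # Step 1: means
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Step 2: slope
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    # Step 3: y intercept
    b0 = y_bar - b1 * x_bar
    return b0, b1

population = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]          # x, in thousands
sales = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]   # y, in thousands of JD

b0, b1 = fit_line(population, sales)
print(b0, b1)  # 60.0 5.0
```

For this data the means are 14 and 130, the slope is 2840/568 = 5 and the intercept is 130 − 5(14) = 60, so ŷ = 60 + 5x.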


For any data set (xi, yi) with i = 1, 2, . . . , n, we can find the following:
the linear regression equation ŷ = b0 + b1 x, where A = b0 and B = b1.
the estimated value of y for any given x value.
the error in the estimation, Error = y − ŷ.
the correlation coefficient r, where −1 ≤ r ≤ 1: r close to −1 indicates a strong negative linear relationship, and r close to 1 indicates a strong positive linear relationship.
the coefficient of determination r², which can be interpreted as the percentage of the variation in y that can be explained by x.


Example

The following data were collected on the height (inches) and weight (pounds) of adult swimmers.

Height    68    64    62    65    66
Weight    132   108   102   115   128

1 If a swimmer's height is 63 inches, what would you estimate his/her weight to be?
2 What is the error in the estimation for a swimmer's height of 64 inches?
3 What is the sample correlation coefficient between height and weight? Does it reflect a strong or weak relationship between swimmer's height and weight?
4 Compute the coefficient of determination. What percentage of the variation in weight can be explained by the height?


The linear regression equation is given by ŷ = −240.5 + 5.5x
1 The estimated weight for a swimmer with height 63 inches is

ŷ = −240.5 + 5.5(63) = 106 pounds

2 The estimated weight for a swimmer with height 64 inches is

ŷ = −240.5 + 5.5(64) = 111.5 pounds

then the error in the estimation is

Error = y − ŷ = 108 − 111.5 = −3.5

3 The sample correlation coefficient between height and weight is r = 0.96, then there is a strong positive relationship between swimmer's height and weight.
4 The coefficient of determination is r² = 0.9223, then the percentage of the variation in weight that can be explained by the height is 92.23%.
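The regression equation, r and r² for the swimmer data can be computed from the same sums of squares. This is a sketch with illustrative names; it uses the standard formula $r = S_{xy}/\sqrt{S_{xx}S_{yy}}$, which is not written out on the slides and is an addition here.

```python
import math

def regression_summary(xs, ys):
    """Return (b0, b1, r, r squared) for simple linear regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    r = sxy / math.sqrt(sxx * syy)   # sample correlation coefficient
    return b0, b1, r, r ** 2

heights = [68, 64, 62, 65, 66]
weights = [132, 108, 102, 115, 128]

b0, b1, r, r2 = regression_summary(heights, weights)
estimate_63 = b0 + b1 * 63           # estimated weight at 63 inches
error_64 = 108 - (b0 + b1 * 64)      # observed minus estimated at 64 inches

print(b0, b1, estimate_63, error_64)  # -240.5 5.5 106.0 -3.5
print(round(r, 2), round(r2, 4))      # 0.96 0.9223
```

Squaring the unrounded r gives 0.9223; squaring the rounded value 0.96 would give 0.9216.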
Example

The following data are the monthly salaries and the grade point averages for students who obtained a bachelor's degree in business administration.

GPA               2.6    3.4    3.6    3.2    3.5    2.9
Monthly Salary    3300   3600   4000   3500   3900   3600

1 If a student's GPA is 3.3, what would you estimate his/her monthly salary to be?
2 What is the error in the estimation for a student's GPA of 2.6?
3 What is the sample correlation coefficient between student's GPA and monthly salary? Does it reflect a strong or weak relationship between student's GPA and the monthly salary?
4 Compute the coefficient of determination. What percentage of the variation in monthly salary can be explained by the student's GPA?


The linear regression equation is given by ŷ = 1790.5 + 581.1x
1 The estimated monthly salary for a student with GPA 3.3 is

ŷ = 1790.5 + 581.1(3.3) = 3708.1

2 The estimated monthly salary for a student with GPA 2.6 is
ŷ = 1790.5 + 581.1(2.6) = 3301.4
then the error in the estimation is
Error = y − ŷ = 3300 − 3301.4 = −1.4
3 The sample correlation coefficient between student's GPA and monthly salary is r = 0.86, then there is a strong positive relationship between student's GPA and the monthly salary.
4 The coefficient of determination is r² = 0.7459, then the percentage of the variation in monthly salary that can be explained by the student's GPA is 74.59%.
