Introduction To Statistics (Stat 2181)
Melkie C. (MSc.)
[email protected] /[email protected]
October, 2018
1. Introduction
The art of learning from data
Definition of Statistics
Plural form
Singular form
Classification of Statistics
Descriptive Statistics
From sample we have 40% fourth year computer Science students suggest
Drawing graphs that show the difference in the ‘scores’ of fourth year
Inferential Statistics
Uses sample data to make decisions about the entire population from which the sample was drawn
From sample study number of new computer accounts registered in Addis Ababa
university is 50--number of concurrent users.
Definition of Some Basic Statistical Terms
Data
drawn
Population/target population
collected
Sample
Subset of a population
Population
Sample
Census
Parameter
Variable
A certain characteristic whose value changes from object to object and time
Sampling frame
Stages in Statistical Investigation
1. Data Collection
The processes of measuring, assembling and gathering data
2. Data Organization
It is a stage where we edit our data
The collected data may involve irrelevant figures, incorrect facts, omissions and
mistakes
3. Data Presentation
The organized data can now be presented in the form of tables, diagrams and
graphs.
4. Data Analysis
Study the data to draw conclusions about the population parameter
5. Data Interpretation
Draw valid conclusions from the results obtained through data analysis
Uses and Limitations of Statistics
Uses of Statistics
Limitations of Statistics
Statistics does not deal with single (individual) values; rather, it deals with
aggregate values
the subject
Scales of Measurement
A variable in statistics is any characteristic, which can take on different
values for different elements when data are collected
Measurement “is assigning numbers to objects, events, or abstract
concepts according to a known set of rules”
Ordinal Scale
Interval Scale
Interval Scales of Measurement
2. Methods of Data Collection and Presentation
Sources of Data
Primary data
Data measured or collected by the investigator or the user directly from the source
the data you collect is unique to you and your research and, until you publish, no one
The primary sources of data are objects or persons from which we collect the
Secondary data
second-hand data that was gathered by someone else
A few sources of secondary data are:
Government reports
Official statistics
Journals
Reference books
Computerized database
Universities
Research institutes
Hospitals
Methods of Data Collection
Two activities involved in primary data collection
1. Observation
Recording the behavioral patterns of people, objects and events in a
systematic manner.
Ranges from single visual observation to those requiring special skills like
direct observation/examination.
2. Questionnaire
It is a popular means of collecting data,
questionnaire is produced.
Advantages of Questionnaire
Can be used as a method in its own right or as a basis for interviewing or a
telephone survey.
coverage.
Disadvantages of Questionnaire
Historically low response rate (although inducements may help).
required.
Respondent can read all questions beforehand and then decide whether to
complete or not.
3. Interview
A technique that is primarily used to gain an understanding of the underlying
(indirect method)
Personal interview
Telephone interview
Advantages of Interviewing
Serious approach by respondent resulting in accurate information and good response
rate.
Disadvantages of Interviewing
Need to set up interviews.
Time consuming.
Geographic limitations.
Can be expensive.
Respondent bias – tendency to please or impress, create false personal image, or end
4. Extract from Records/Documentary Sources
It is a method of collecting information (secondary data) from published or
unpublished sources
Time saving
Disadvantages of secondary data
Lack of availability
Lack of relevance
Inaccurate data
Insufficient data
Methods of Data Presentation
The major objectives of data presentation are
Diagrams, and
Graphs
Tabular presentation of data
understandable way.
Tables can be
Simple (one-way) table: a table that presents one characteristic, for example an age
distribution.
Two-way table: a table that presents two characteristics in columns and rows, for example
age versus sex.
Higher-order table: a table that presents three or more characteristics in one
table.
Frequency Distribution
frequencies.
Categorical Frequency Distribution
The categorical frequency distribution is used for data which can be placed
A B C D
Class Tally Frequency Percent
Example 2.1: A computer lab contains twenty-five computers, and the brand of each
computer was recorded. The data set is given below, where the brands are Dell (D), HP (H)
and Apple (A).
A H H D D
D D H A H
H H D H D
A D H D H
D D D H D
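The class, frequency and percent columns for Example 2.1 can be checked with a short script. This is a minimal sketch in Python using only the standard library; the variable names (brands, counts, table) are illustrative and not part of the course material.

```python
from collections import Counter

# Example 2.1 data: D = Dell, H = HP, A = Apple
brands = (
    "A H H D D "
    "D D H A H "
    "H H D H D "
    "A D H D H "
    "D D D H D"
).split()

counts = Counter(brands)   # frequency of each class
n = len(brands)            # total number of computers

# Class -> (frequency, percent) of the categorical frequency distribution
table = {brand: (freq, 100 * freq / n) for brand, freq in counts.items()}

for brand in sorted(table):
    freq, pct = table[brand]
    print(f"{brand}  {freq:>2}  {pct:.0f}%")
```

The percent column is simply each frequency divided by the total number of observations, times 100.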
Ungrouped Frequency Distribution
It is the distribution that uses individual data values along with their
frequencies.
often constructed for small set of data on discrete variable (when data are
The major components of this type of frequency distributions are class, tally,
Example 2.2: A person is interested in the number of laptops a family may have.
He/she took a sample of 30 families and obtained the following observations.
Number of laptops in a sample of 30 families:
2, 0, 2, 1, 0, 4, 1, 2, 2, 0, 0, 3, 2, 1, 2, 2, 2, 2, 2, 1, 2, 0, 3,
1, 1, 3, 3, 3, 4, 2.
Number of laptops   Frequency
0                   5
1                   6
2                   12
3                   5
4                   2
the data must be grouped in which each class has more than one unit in
width.
We use when the range of the data is large, and for data from continuous
variable.
Class limit (CL)
have gaps between the upper limits of one class and the lower limit of the next class.
Class boundary(CB)
The boundary has one more decimal place than the raw data.
There is no gap between the upper boundaries of one class and the lower boundaries
This is the difference between possible two successive values. E.g. 1, 0.1, 0.01 …
The difference between the upper and lower class boundaries of any consecutive
class.
The class width is also the difference between the lower limit or upper limits of two
consecutive classes.
It is found by adding the lower and upper class limits (or boundaries) and dividing the
sum by two.
Guidelines for classes
$K = 1 + 3.32 \log_{10} n$, where n is the number of observations
Classes should be continuous.
$W = \dfrac{\text{Range}}{\text{Number of classes}} = \dfrac{R}{K}$
Steps to construct grouped frequency distribution
Take the smallest value as the first class lower class limit, and add class width to get consecutive
lower class limits.
To get upper class limit subtract unit of measurement from second class lower class limit, and add
class width to get remaining upper class limits.
Subtract half of unit of measurement from lower class limit to get class boundary, and add half of
unit of measurement to upper class limit to get upper class boundary.
Tally data
effectiveness of a processor for a certain type of tasks, we recorded the CPU time for
n = 20 randomly chosen jobs (in seconds).
15 31 8 36 29 23 33 18 24 43 20 12 25
21 24 29 28 40 35 24
Grouped frequency distribution
Class  Class limit  Class boundary  Tally    Frequency  LCF  MCF  Class mark
1      8 - 13       7.5 - 13.5      //       2          2    20   10.5
2      14 - 19      13.5 - 19.5     //       2          4    18   16.5
3      20 - 25      19.5 - 25.5     //// //  7          11   16   22.5
4      26 - 31      25.5 - 31.5     ////     4          15   9    28.5
5      32 - 37      31.5 - 37.5     ///      3          18   5    34.5
6      38 - 43      37.5 - 43.5     //       2          20   2    40.5
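The grouped frequency distribution can be rebuilt from the 20 raw CPU times by following the construction steps listed earlier. The sketch below is one possible way to do it in Python; rounding both the number of classes and the class width upward is an assumption made here, and all names are illustrative.

```python
import math

# CPU times (seconds) from the example, n = 20
times = [15, 31, 8, 36, 29, 23, 33, 18, 24, 43,
         20, 12, 25, 21, 24, 29, 28, 40, 35, 24]

n = len(times)
unit = 1  # unit of measurement: data are whole seconds

k = math.ceil(1 + 3.32 * math.log10(n))            # number of classes (rounded up)
width = math.ceil((max(times) - min(times)) / k)   # class width (rounded up)

rows = []
cumulative = 0
lower = min(times)              # first lower class limit is the smallest value
for _ in range(k):
    upper = lower + width - unit
    freq = sum(lower <= t <= upper for t in times)
    cumulative += freq
    rows.append({
        "limits": (lower, upper),
        "boundaries": (lower - unit / 2, upper + unit / 2),
        "frequency": freq,
        "lcf": cumulative,               # less-than cumulative frequency
        "mcf": n - cumulative + freq,    # more-than cumulative frequency
        "mark": (lower + upper) / 2,     # class mark (midpoint)
    })
    lower += width

for r in rows:
    print(r)
```

With these data the sketch gives six classes of width 6, starting at 8, and each upper boundary coincides with the next lower boundary.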
Diagrammatic and Graphic presentation of the data
There are several ways in which statistical data may be displayed pictorially
Bar chart
Histogram
Pie Chart
Pie chart is a circular diagram and the area of the sector of a circle is used in
pie chart.
$\text{Angle of sector} = \dfrac{\text{Component part}}{\text{Total}} \times 360^\circ$
These angles are marked in the circle by means of a protractor to show the different
components.
Pie Chart (Example)
Example 2.4: The following table gives the details of quarterly sale of a Microsoft
Quarter   Profit ($,000,000)   Percent (%)   Angle of sector (in degrees)
1st quarter
7%
2nd quarter
4th quarter
33%
Bar Chart
While we draw bar chart, we have to consider the following two points.
Make the units on the axis that are used for the frequency equal in size
Simple Bar Chart
Multiple Bar Chart
When two or more interrelated series of data are depicted by a bar diagram
Example 2.6: Suppose we have export and import (in million) figures for a
[Figure: multiple bar chart of export and import figures for 2010, 2011 and 2012; legend: Export, Import]
Stratified/Stacked Bar Chart
First make simple bars for each class taking total magnitude in that class
and then divide these simple bars into parts in the ratio of various
components
Example 2.7: The table below shows the profit of a company ($ millions)
from different item sales in the 1st quarter of the year. Draw a stratified/stacked bar
chart.
Company   Desktop  Laptop  Accessory  Total
HP        30       50      40         120
DELL      33       16      27         76
TOSHIBA   37       13      37         87
[Figure: stacked bar chart; vertical axis: Sales in $,000,000; horizontal axis: Company]
Deviation Bar Chart
Used when the data contains both positive and negative values such as data
Example 2.8: Suppose we have the following data relating to net profit
(percent) of commodity.
Commodity   Net profit
Soap        80
Sugar       -95
Coffee      125
[Figure: deviation bar chart of net profit for Soap, Sugar and Coffee; vertical axis from -150 to 150]
Histogram
represents classes of data values and the vertical scale represents frequencies.
The height of the bars correspond to the frequency values, and the drawn
represent frequencies.
A histogram shows the shape of a pmf or a pdf of data, checks for homogeneity,
To construct a histogram, we split the range of data into equal intervals, “bins,”
Frequency polygon
It is a graphic form of a frequency distribution.
Remark: we should add two classes with zero frequencies at the two ends of
Ogive
A graph showing the cumulative frequency (less than or more than type)
class boundaries are plotted along the horizontal axis and the corresponding
3. Measures of Central Tendency
Introduction
A measure of central tendency is a descriptive statistic that describes the
average, or typical value of a set of scores.
Typical value
(Center of data)
The three major objectives of measures of central tendency are
To summarize a set of data by single value
Good properties of typical average
Computation should be based on all the observed values.
it should be defined rigidly which means that it should have a definite value
Mean
Median
Mode
The Summation Notation
Also called Sigma notation
Let X be a variable. The sum of its n values is written
$\sum_{i=1}^{n} X_i$
where
$\sum$ is the summation notation,
n is the ending point (upper limit of the summation),
i = 1 is the starting point (lower limit of the summation),
i is the index of the summation, and
$X_i$ is each term of the sum.
Properties of summation notation
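$\sum_{i=1}^{n} X_i = X_1 + X_2 + \cdots + X_n$
$\sum_{i=1}^{n} X_i Y_i = X_1 Y_1 + X_2 Y_2 + \cdots + X_n Y_n$
$\sum_{i=1}^{n} X_i^2 = X_1^2 + X_2^2 + \cdots + X_n^2$
$\sum_{i=1}^{n} C X_i = C \sum_{i=1}^{n} X_i = C X_1 + C X_2 + \cdots + C X_n$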
The Mean
Mean is the most commonly used measure of central tendency. There are
different types of mean
Arithmetic mean,
Weighted mean,
The Arithmetic Mean
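It is computed by adding all the values in the data set and dividing by the
number of observations in it.
$\bar{X} = \dfrac{\sum_{i=1}^{n} X_i}{n}$
If we have an ungrouped frequency distribution, the mean is given by the formula
$\bar{X} = \dfrac{\sum_{i=1}^{k} f_i X_i}{n}$, where k is the number of distinct values and $n = \sum_{i=1}^{k} f_i$
If we have a grouped frequency distribution, the mean is given by the formula
$\bar{X} = \dfrac{\sum_{i=1}^{k} f_i m_i}{n}$, where $m_i = \dfrac{LCB_i + UCB_i}{2}$ and LCB/UCB is the lower/upper class boundary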
Example 3.1: To evaluate the effectiveness of a processor for a certain type of task,
the CPU time was recorded for n = 10 randomly chosen jobs (in seconds):
70 36 43 69 82 48 34 62 35 15. (Ans: 494/10 = 49.4 seconds)
Example 3.2: Twenty first-year computer science students were asked to write a
program in C++. The number of trials each student needed before the program
compiled without an error message was recorded. (Ans: 103/20 = 5.15 trials)
Number of trials Number of students
4 8
5 4
6 6
7 1
8 1
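The mean of an ungrouped frequency distribution weights each value by its frequency. A minimal Python sketch for the table of Example 3.2 as printed above follows; the dictionary name freq is illustrative.

```python
# Number of trials -> number of students (table of Example 3.2 as printed)
freq = {4: 8, 5: 4, 6: 6, 7: 1, 8: 1}

n = sum(freq.values())                        # total number of students
total = sum(x * f for x, f in freq.items())   # sum of f_i * X_i
mean = total / n

print(n, total, mean)
```

For the five rows shown, the frequencies add up to 20 students and the weighted sum is 103, so the mean is 103/20 = 5.15 trials.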
Example 3.3: Twenty nine tasks were performed by a computer and the time elapsed was
measured to evaluate effectiveness of a processor. The time is recorded in seconds, and
the result is summarized in the table. What is the mean CPU time of this processor. (Ans:
46.1 secs)
Properties of Arithmetic Mean …
It can be computed for any set of numerical data, it always exists, and unique.
The sum of deviations of the observations about the mean is zero, i.e. $\sum_{i=1}^{n} (X_i - \bar{X}) = 0$
The sum of squares of deviations of all observations about the mean is the minimum
If a constant is added to all observations, the new mean is old mean plus constant
If all observations are multiplied by a constant, the new mean is the multiple of the constant and old
mean
If a wrong value is recorded and later discovered, the corrected mean is
$\bar{X}_{corr} = \bar{X}_{wrong} + \dfrac{X_{corr} - X_{wrong}}{n}$
Weighted Mean
Weighted mean is calculated when certain values in a data set are more
important than the others.
$\bar{X}_w = \dfrac{\sum_{i=1}^{k} w_i x_i}{\sum_{i=1}^{k} w_i}$
Example: CGPA of a student (each result is weighted by the credit of a course) [Ans: 2.88]
Geometric Mean
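It is the nth root of the product of n positive values; equivalently, it is the antilog of
the arithmetic mean of the values taken on a log scale.
$GM = \sqrt[n]{X_1 \times X_2 \times \cdots \times X_n} = \sqrt[n]{\prod_{i=1}^{n} X_i}$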
Example 3.4: the price of a computer increased by the following percent for
three consecutive years; 10%, 15%, 20%. What is the average percentage
increase in the cost of computers the past three years. (Ans : 14.4%)
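The stated answer of Example 3.4 corresponds to the geometric mean of the three percentages themselves. The Python sketch below computes it through logarithms. As a point of comparison only, and as an assumption that is not the slide's method, it also averages the growth factors 1.10, 1.15 and 1.20, which is the convention often used for compounded rates.

```python
import math

increases = [10, 15, 20]  # yearly percentage increases from Example 3.4

# Geometric mean as the antilog of the arithmetic mean of the logs
gm = math.exp(sum(math.log(x) for x in increases) / len(increases))

# Alternative convention (an assumption, not the slide's method):
# geometric mean of the growth factors, converted back to a percent
factors = [1 + x / 100 for x in increases]
gm_growth = (math.prod(factors) ** (1 / len(factors)) - 1) * 100

print(round(gm, 2), round(gm_growth, 2))
```

The first value is about 14.4, matching the answer given; the growth-factor version comes out slightly higher, at about 14.9.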
Harmonic Mean
It is the reciprocal of the arithmetic mean of the reciprocals of the observations.
The harmonic mean is an average which is useful for sets of numbers which are
defined in relation to some unit, for example speed (distance per unit of time).
$HM = \dfrac{1}{\dfrac{1}{n}\sum_{i=1}^{n} \dfrac{1}{x_i}} = \dfrac{n}{\sum_{i=1}^{n} \dfrac{1}{x_i}}$
Example 3.5: A student downloaded a 2.5 MB file at 25 kb/s; on another day he
downloaded a file of the same size at 35 kb/s. What is the average download rate of
the computer? (Ans: 29.2 kb/s)
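Because the two files have the same size, the average rate is the harmonic mean of the two rates rather than their arithmetic mean. A minimal Python check, with illustrative variable names:

```python
rates = [25, 35]  # download rates in kb/s for two files of equal size

hm = len(rates) / sum(1 / r for r in rates)   # harmonic mean
am = sum(rates) / len(rates)                  # arithmetic mean, for comparison

print(round(hm, 1), am)
```

The harmonic mean is about 29.2 kb/s, below the arithmetic mean of 30 kb/s, because the slower download takes up more of the total time.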
Relation between AM, GM, and HM
If all the values in a data set are the same, then all the three means (arithmetic
mean, GM and HM) will be identical.
As the variability in the data increases, the difference among these means also
increases.
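For positive values that are not all equal, the arithmetic mean is always greater than
the GM, which in turn is always greater than the HM.
AM > GM > HM (in general AM ≥ GM ≥ HM, with equality only when all values are the same)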
Median
If the sample data are arranged in increasing order, the median is
If n is an odd number, the median is the middle value:
$\tilde{X} = \left(\dfrac{n+1}{2}\right)^{\text{th}}$ positioned observation
Example 3.6: the number of new computer accounts registered during nine consecutive
days are 43, 37, 50, 52, 58, 105, 52, 45, and 45. what is the median number of new
accounts registered? (Ans: 50)
Example 3.8: Forty-five computers were used to execute a program, and their
performance was evaluated using CPU time. The time is recorded in seconds, and the result is
summarized in the table. What is the median performance (time) of these computers?
(Ans: 19 secs)
Time (in seconds)   Number of computers   Less than cumulative frequency
15                  4                     4
16                  9                     13
18                  8                     21
19                  14                    35
20                  10                    45
If the data is in grouped frequency distribution, median is
$\tilde{X} = LCB_{med} + \dfrac{w}{f_{med}}\left(\dfrac{n}{2} - cf\right)$
Example 3.9: Fifty computers were used to execute a program, and their
performance was evaluated using CPU time. The time is recorded in seconds, and the result is
summarized in the table. What is the median performance (time) of these computers?
(Ans: 20.81 secs)
Time (in seconds)   Number of computers
14-16               6
17-19               12
20-22               16
23-25               9
26-28               7
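The grouped-median formula can be applied to the table of Example 3.9 step by step. The following Python sketch assumes a unit of measurement of 1 second, so that class boundaries lie half a unit outside the class limits; the names are illustrative.

```python
# (lower limit, upper limit, frequency) for each class in Example 3.9
classes = [(14, 16, 6), (17, 19, 12), (20, 22, 16), (23, 25, 9), (26, 28, 7)]

n = sum(f for _, _, f in classes)       # total frequency
unit = 1                                # data recorded to the nearest second
width = classes[1][0] - classes[0][0]   # class width w

cf = 0  # cumulative frequency of the classes before the current one
for lower, upper, f in classes:
    if cf + f >= n / 2:                 # first class whose cumulative frequency reaches n/2
        lcb = lower - unit / 2          # lower class boundary of the median class
        median = lcb + (width / f) * (n / 2 - cf)
        break
    cf += f

print(n, lcb, cf, round(median, 2))
```

The frequencies in the table add up to 50, the median class is 20-22 with lower boundary 19.5 and preceding cumulative frequency 18, and the median is 19.5 + (3/16)(25 - 18) ≈ 20.81 seconds.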
Mode
The most frequent observation (value) in a data
There can be only one mode-unimodal Eg: 25, 27, 22, 25,18
There can be two mode-bimodal Eg: 25, 27, 22, 27, 25, 18, 20
There can be more than two mode-multimodal Eg: 25, 27, 22, 27, 25, 18, 20, 19, 22, 17
$\hat{X} = LCB_{mod} + w\left(\dfrac{\Delta_1}{\Delta_1 + \Delta_2}\right)$, where $\Delta_1 = f_1 - f_0$ and $\Delta_2 = f_1 - f_2$
Quantiles
Quartiles are three points which divide an array into four parts in
such a way that each portion contains an equal number of
elements.
First quartile (Q1) 25% of the observations lies below or equal to it
a)
b)
13 23.5 39
The ith quartile for grouped frequency distribution is
$Q_i = LCB_{Q_i} + \dfrac{w}{f_{Q_i}}\left(\dfrac{i\,n}{4} - cf\right)$, i = 1, 2, 3
Deciles are nine points which divide an array into 10 parts in such
a way that each part contains equal number of elements.
The nine deciles are denoted by D1, D2, …, D9
Second decile (D2) 20% of the observations lies below or equal to it etc
$D_i = LCB_{D_i} + \dfrac{w}{f_{D_i}}\left(\dfrac{i\,n}{10} - cf\right)$, i = 1, 2, …, 9
$P_i = LCB_{P_i} + \dfrac{w}{f_{P_i}}\left(\dfrac{i\,n}{100} - cf\right)$, i = 1, 2, …, 99
Example 3.12: The following frequency distribution is the score of 25
students.
Score   Number of students
25-29   1
30-34   1
35-39   1
40-44   3
45-49   3
50-54   6
55-59   4
60-64   3
65-69   2
70-74   1
Compute the following quantities
● First quartile (Ans: 44.92)
● Ninth decile (Ans: 65.75)
● Forty-fifth percentile (Ans: 51.38)
Remark: $Q_1 = P_{25}$; $Q_2 = D_5 = P_{50}$ = Median; $Q_3 = P_{75}$; $D_1 = P_{10}$; $D_2 = P_{20}$; …; $D_9 = P_{90}$
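Quartiles, deciles and percentiles of a grouped distribution all use the same interpolation, differing only in the fraction of n that is targeted. The helper below is a sketch in Python; the function name grouped_quantile and its parameters are hypothetical, and a unit of measurement of 1 is assumed for the class boundaries.

```python
def grouped_quantile(classes, fraction, unit=1):
    """Quantile of a grouped frequency distribution by linear interpolation.

    classes  -- list of (lower limit, upper limit, frequency), in increasing order
    fraction -- i/4 for quartiles, i/10 for deciles, i/100 for percentiles
    """
    n = sum(f for _, _, f in classes)
    width = classes[1][0] - classes[0][0]   # common class width w
    target = fraction * n                   # position i*n/4, i*n/10 or i*n/100
    cf = 0                                  # cumulative frequency before the class
    for lower, _, f in classes:
        if f > 0 and cf + f >= target:
            return (lower - unit / 2) + (width / f) * (target - cf)
        cf += f
    raise ValueError("fraction must be between 0 and 1")


# Score distribution of Example 3.12
scores = [(25, 29, 1), (30, 34, 1), (35, 39, 1), (40, 44, 3), (45, 49, 3),
          (50, 54, 6), (55, 59, 4), (60, 64, 3), (65, 69, 2), (70, 74, 1)]

q1 = grouped_quantile(scores, 1 / 4)       # first quartile
d9 = grouped_quantile(scores, 9 / 10)      # ninth decile
p45 = grouped_quantile(scores, 45 / 100)   # forty-fifth percentile

print(round(q1, 4), d9, p45)
```

Calling the same function with a fraction of 1/2 gives the median, in line with the remark that Q2, D5 and P50 coincide.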
4. Measures of Dispersion
Number of pages written by two type writers for five consecutive days
Introduction
Central tendency measures do not reveal the variability present in the data.
all the data are the same and increases as the data become more diverse.
Properties of a good measures of dispersion
There are many types of dispersion measures
Range /Relative Range (Coefficient of range)
Range (R)
Range is the difference between two extreme values in a data
Denoted by R
Relative range is the ratio of the difference and sum of the two
extreme values in a data
Denoted by RR/CR
$RR = \dfrac{\max - \min}{\max + \min}$
Example 4.1: Consider 2.4, 2.5, 3.0, 1.5, 4.7, 4.3 and 3.5 as the times
(in minutes) of installing seven software packages on your personal computer.
Inter Quartile Range
Measures the range of the middle 50% of the values only
IQR= Q3 - Q1
The semi-interquartile range (or SIR) is defined as the difference of the first
and third quartiles divided by two
The SIR is often used with skewed data as it is insensitive to the extreme scores
It gives the average amount by which the two quartiles differ from the median
Coefficient of Quartile Deviation
The ratio of the difference to the sum of the two extreme quartiles (Q1 and Q3) of a data set
Denoted by CQD
$CQD = \dfrac{Q_3 - Q_1}{Q_3 + Q_1}$
Example: A basketball coach has a team of 20 players. He recorded the number of free throw
success out of 10 trials. The following are recorded: 9, 7, 3, 7, 1, 2, 5, 4, 5, 10, 10, 2, 2, 2, 6, 7, 9, 8,
5, 6. What are the SIR and CQD for the free throw success?
Mean Absolute Deviation (MAD)
Measures the ‘average’ distance of each observation away from the mean of the
data
Generally more sensitive than the range or interquartile range, since a change in
any value will affect it
$MAD = \dfrac{\sum_{i=1}^{n} |x_i - \bar{x}|}{n}$
All values are used in the calculation.
Variance
Variance is the mean of squared deviation of observations from their
arithmetic mean
$s^2 = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
The units of variance are awkward: the square of the original units.
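As a small numerical illustration, the sample variance can be computed directly from its definition and compared with Python's standard library. The installation times of Example 4.1 are reused here purely for illustration; the variable names are not from the slides.

```python
import math
import statistics

# Installation times in minutes (data of Example 4.1, reused for illustration)
x = [2.4, 2.5, 3.0, 1.5, 4.7, 4.3, 3.5]

n = len(x)
mean = sum(x) / n
s2 = sum((xi - mean) ** 2 for xi in x) / (n - 1)   # sample variance, in minutes squared
s = math.sqrt(s2)                                   # square root, back in minutes

print(round(s2, 4), round(s, 4))
print(statistics.variance(x), statistics.stdev(x))  # library versions, same n - 1 divisor
```

Taking the square root returns the measure to the original units, which is why the square root of the variance is usually the quantity reported.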
Standard Deviation
One of the most useful measures of dispersion is the standard deviation.
It is the positive square root of the variance.
$s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$
To calculate the standard deviation, follow these steps
1. Calculate the mean of the numbers
$CV = \dfrac{s}{\bar{x}} \times 100\%$
All values are used in the calculation.
The actual value of the CV is independent of the unit in which the measurement has been
For comparison between data sets with different units or widely different means, one
Standard Score
If X is a measurement from a distribution with mean $\bar{X}$ and standard
deviation S, then its value in standard units is
$Z = \dfrac{X - \bar{X}}{S}$
Z gives the deviations from the mean in units of standard deviation
Example: Two groups of people were trained to perform a certain task
and tested to find out which group is faster to learn the task. For the two
groups the following information was given:
Value Group one Group two
Mean 10.4 min 11.9 min
Stan.dev. 1.2 min 1.3 min
Relatively speaking:
Skewness
Skewness is the degree of asymmetry or departure from symmetry of a
distribution.
tail to the right of the central maximum than to the left, the distribution is said to be
skewed to the right or said to have positive skewness.
If it has a longer tail to the left of the central maximum than to the right, it is said to
be skewed to the left or said to have negative skewness.
For moderately skewed distribution, the following relation holds among the three
mean-mode=3(mean-median)
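$\text{skewness} = \dfrac{\text{mean} - \text{mode}}{\text{standard deviation}}$ or
$\text{skewness} = \dfrac{m_3}{m_2^{3/2}}$, where $m_2$ and $m_3$ are the second and third moments about the mean, based on $\sum (x - \bar{x})^2$ and $\sum (x - \bar{x})^3$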
Kurtosis
Kurtosis is the degree of peakedness of a distribution, usually taken relative to a
normal distribution.
A distribution having relatively high peak is called leptokurtic.
The normal distribution which is not very high peaked or flat topped is called
mesokurtic.
Measures of kurtosis
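$\text{kurtosis} = \dfrac{m_4}{m_2^{2}}$, where $m_2$ and $m_4$ are the second and fourth moments about the mean, based on $\sum (x - \bar{x})^2$ and $\sum (x - \bar{x})^4$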
Skewness and Kurtosis (Example)
Example: the following observations are score of 100 students.
Score 61 64 67 70 73
Number of students 5 18 42 27 8
Can we say the distribution is skewed? What is the shape of the distribution?
$\text{skewness} = \dfrac{m_3}{m_2^{3/2}} = \dfrac{-2.72}{8.61^{3/2}} = -0.11$
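The moment-based skewness for the score table can be recomputed from the frequencies. The Python sketch below divides the sums of powered deviations by n - 1; that divisor is an assumption, chosen because it reproduces the second moment of 8.61 quoted in the example.

```python
scores = [61, 64, 67, 70, 73]
freqs = [5, 18, 42, 27, 8]

n = sum(freqs)
mean = sum(f * x for x, f in zip(scores, freqs)) / n


def central_moment(k, divisor):
    """k-th moment about the mean for the frequency table above."""
    return sum(f * (x - mean) ** k for x, f in zip(scores, freqs)) / divisor


# Assumption: divisor n - 1, which reproduces the quoted m2 = 8.61
m2 = central_moment(2, n - 1)
m3 = central_moment(3, n - 1)
skewness = m3 / m2 ** 1.5

print(mean, round(m2, 2), round(m3, 2), round(skewness, 2))
```

The third moment comes out negative, so the distribution is slightly skewed to the left, with a skewness of about -0.11 in magnitude close to zero.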
Introduction
be uncertain.
When a job is sent to a printer, it takes uncertain time to print, and there is
Electronic components fail at uncertain times, and the order of their failures
Experiment
defined outcome
Example 5.1: Race competition, tossing a coin, identifying sex of new born baby, A ball is
manufactured, it is tested whether defective or non-defective.
Probability Experiment
Outcome
Sample Space
Event
Example 5.5: Roll a fair six sided die, A be observing even number A={2, 4, 6}
Equally Likely Events
Roll a die, let A be observing a number less than 4 and B be observing a number
greater than 3.
A and B are equally likely events
Complement of an Event
A={1, 3, 5}
AC ={2, 4, 6}
Elementary Event
Independent Events
Two events are independent if the occurrence of one does not affect the probability of
Dependent Events
Two events are dependent if the first event affects the outcome or occurrence of the
Counting Techniques
In order to calculate probabilities, we have to know
In order to determine the number of outcomes, one can use several rules of
counting.
Addition rule
Multiplication rule
Permutation rule
Combination rule
Multiplication Rule
the 2nd step can be performed in n2 ways (regardless of how the 1st step was performed) ,
….
The kth step can be performed in nk ways (regardless of how the preceding steps were
performed) ,
different pairs of shoes, how many different outfits could we wear? (Ans: 360)
Exercise: How many 7-character license plates are possible if the first three characters must be
letters, the last four must be digits 0-9, and repeated characters are allowed?
Addition Rule
the 2nd step can be performed in n2 ways (regardless of how the 1st step was performed) ,
….
The kth step can be performed in nk ways (regardless of how the preceding steps were
performed) ,
routes, 3 different train routes, and 5 different bus routes, how many different alternatives does a
person have to travel from place A to place B? (Ans: 10)
Permutation Rule
objects.
$_nP_r = \dfrac{n!}{(n - r)!}$
Example 5.9: How many ways can we pick a Gold, Silver, and Bronze medal for
8 competitors in a game? (Ans: 336)
Example 5.10: In how many ways can you permute the word “STATISTICS”?
(Ans: 50400 ways)
Example 5.11: in a computer lab there are three identical HP desktop computers
and four identical DELL desktop computers. In how many ways can you arrange
these computers. (Ans: 35)
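The three permutation answers (Examples 5.9 to 5.11) can be verified with Python's math module. The helper arrangements is a hypothetical name introduced here; it counts the distinct orderings of a collection that contains repeated, indistinguishable items.

```python
import math
from collections import Counter

# Example 5.9: ordered choice of 3 medalists from 8 competitors
medals = math.perm(8, 3)


def arrangements(items):
    """Distinct orderings of a multiset: n! / (n1! * n2! * ...)."""
    total = math.factorial(len(items))
    for count in Counter(items).values():
        total //= math.factorial(count)
    return total


# Example 5.10: letters of STATISTICS (S x3, T x3, I x2, A, C)
word = arrangements("STATISTICS")

# Example 5.11: three identical HP and four identical DELL computers
computers = arrangements(["HP"] * 3 + ["DELL"] * 4)

print(medals, word, computers)
```

Dividing by the factorial of each repeat count removes the orderings that differ only by swapping identical items.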
Combination Rule
Example 5.13: A committee of two people must be chosen from a group of five
Approaches in Probability Definition
Classical/Mathematical Approach
If an event A occurs in n times out of a total of N exhaustive, mutually exclusive and
refurbished. Six computers are purchased for a student lab. From the first look, they are
indistinguishable, so the six computers are selected at random. Compute the probability that
among the chosen computers, two are refurbished. (Ans: 0.3522)
Frequentist Approach
This is based on the relative frequencies of outcomes belonging to an event.
$P(A) = \lim_{N \to \infty} \dfrac{n}{N}$
Example 5.15: If records show that 60 out of 100,000 bulbs produced are defective. What is the
Axiomatic Approach
Let E be a random experiment and S be a sample space associated with E. With each
event A a real number called the probability of A (p(A)) satisfies the following
properties called axioms of probability or postulates of probability
P(A)≥0
P(S)=1
$P(A^c) = 1 - P(A)$
$P(\emptyset) = 0$
$P(A \cap B^c) = P(A) - P(A \cap B)$
Example 5.16: Sixty percent of the families in a certain community own
their own car, thirty percent own their own home, and twenty percent own
both their own car and their own home. If a family is randomly chosen,
a) what is the probability that this family does not have a car?
c) what is the probability that this family owns a car or a house but not both?
Solution: Let A represent that the family owns a car and B represent that the family owns a
home.
b) $P(A \cup B) = 0.7$
c) $P((A \cap B^c) \cup (A^c \cap B)) = 0.5$
Subjective Approach
Subjective probabilities, like the name suggests, are probabilities that come
Subjective probability differ from person to person, and because they are
Conditional probability and Independence
Conditional probability
Conditional probability provides us with a way to reason about the outcome of an
If the occurrence of one event has an effect on the next occurrence of the other event
The conditional probability of an event A given that B has already occurred, denoted
P(A|B), is
$P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}$, provided $P(B) > 0$
Remark: $P(A^c \mid B) = 1 - P(A \mid B)$
Example 5.17: A family has two children. What is the conditional probability that
both are boys given that at least one of them is a boy? Assume that the sample space
S is given by S = {(b, b), (b, g), (g, b), (g, g)}, and all outcomes are equally likely. (b,
g) means, for instance, that the older child is a boy and the younger child is a girl.
Solution:
$P(A \mid B) = \dfrac{P(A \cap B)}{P(B)} = \dfrac{1/4}{3/4} = \dfrac{1}{3}$
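The result of Example 5.17 can be confirmed by listing the four equally likely outcomes and counting. This is a minimal Python sketch; the event names A (both children are boys) and B (at least one child is a boy) are labels chosen here for the two events in the question.

```python
from fractions import Fraction
from itertools import product

# Sample space for two children, (older, younger); all outcomes equally likely
sample_space = list(product("bg", repeat=2))

A = {s for s in sample_space if s == ("b", "b")}   # both are boys
B = {s for s in sample_space if "b" in s}          # at least one boy


def prob(event):
    """Classical probability: favourable outcomes over total outcomes."""
    return Fraction(len(event), len(sample_space))


p_a_given_b = prob(A & B) / prob(B)
print(p_a_given_b)
```

Conditioning on B shrinks the sample space to the three outcomes that contain a boy, only one of which has two boys.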
Exercise: Out of six computer chips, two are defective. If two chips are randomly chosen for
testing (without replacement), compute the probability that both of them are defective.
Two events A and B are independent if and only if $P(A \cap B) = P(A)P(B)$
Example 5.18: Toss a fair coin and a die together. What is the probability of getting
a head on the coin if the die shows an even number? (Ans: 0.5)
Exercise: A box contains four black and six white balls. What is the probability of
getting two black balls in drawing one after the other under the following conditions?
The first ball drawn is not replaced?
Exercises on probability
There is a 1% probability for a hard drive to crash. Therefore, it has two backups, each having a
2% probability to crash, and all three components are independent of each other. The stored
information is lost only in an unfortunate situation when all three devices crash. What is the
probability that the information is saved?
A new computer virus can enter the system through e-mail or through the internet. There is a
30% chance of receiving this virus through e-mail. There is a 40% chance of receiving it
through the internet. Also, the virus enters the system simultaneously through e-mail and the
internet with probability 0.15. What is the probability that the virus does not enter the system at
all?
Suppose that after 10 years of service, 40% of computers have problems with motherboards
(MB), 30% have problems with hard drives (HD), and 15% have problems with both MB and
HD. What is the probability that a 10-year old computer still has fully functioning MB and HD?
6. Random Variables and Probability
Distributions
Introduction
Random variable
a numerical description of the outcomes of the experiment or
capital letters.
Example 6.1: Flip a coin twice, let X be the number of heads in two tosses
X={0, 1, 2}
Random variables can be
Discrete random variables: are variables which can assume only a specific number
Continuous random variable: are variables that can assume all values between any
Probability Distribution
A probability distribution consists of a value a random variable can assume and
the corresponding probabilities of the values.
Example 6.4: Consider the experiment of tossing a coin twice. Let X be the
number of heads. Construct the probability distribution of X
First identify the possible value that X can assume.
Calculate the probability of each possible distinct value of X and express X in the
X 0 1 2
P(X=x) 1/4 2/4 1/4
Mean and variance of Random Variable
$E(X) = \sum x\,P(X = x)$
$Var(X) = E(X^2) - [E(X)]^2$,
where $E(X^2) = \sum x^2\,P(X = x)$
Example 6.5: Consider the experiment of tossing a coin twice. Let X be the
number of heads.
X 0 1 2
P(X=x) 1/4 2/4 1/4
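The mean and variance of X in Example 6.5 follow directly from the probability distribution. The Python sketch below uses exact fractions; the dictionary name pmf is illustrative.

```python
from fractions import Fraction

# Probability distribution of X = number of heads in two tosses
pmf = {0: Fraction(1, 4), 1: Fraction(2, 4), 2: Fraction(1, 4)}

mean = sum(x * p for x, p in pmf.items())           # E(X)
mean_sq = sum(x ** 2 * p for x, p in pmf.items())   # E(X^2)
variance = mean_sq - mean ** 2                       # Var(X) = E(X^2) - [E(X)]^2

print(mean, mean_sq, variance)
```

This gives E(X) = 1, E(X²) = 3/2 and Var(X) = 1/2.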
Binomial Distribution
A binomial experiment is a probability experiment that satisfies the
following four requirements called assumptions of a binomial
distribution.
1. The experiment consists of n identical trials.
2. Each trial has only one of the two possible mutually exclusive outcomes,
success or a failure.
3. The probability of each outcome does not change from trial to trial, and
Example 6.6
Scan computer for a certain virus(present, absent)
$X \sim Bin(n, p)$
$P(X = x) = \binom{n}{x} p^x (1 - p)^{n - x}, \quad x = 0, 1, 2, \ldots, n$
E(X)=np
Var(X)=np(1-p)
Example 5.7: A quality control engineer tests the quality of produced computers.
Suppose that 5% of computers have defects, and defects occur independently of each
other. Find the probability of exactly 3 defective computers in a shipment of twenty.
(Ans: 0.06)
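The binomial probability in this example can be evaluated directly from the probability function with n = 20, p = 0.05 and x = 3. A minimal Python sketch follows; the function name binomial_pmf is chosen here for illustration.

```python
import math


def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)


p3 = binomial_pmf(3, 20, 0.05)   # exactly 3 defective computers out of 20
total = sum(binomial_pmf(x, 20, 0.05) for x in range(21))  # sanity check: sums to 1

print(round(p3, 4), round(total, 10))
```

The probability is about 0.0596, which rounds to the stated 0.06.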
Poisson Distribution
A random variable X is said to have a Poisson distribution if its probability
distribution is given by:
$P(X = x) = \dfrac{e^{-\lambda} \lambda^x}{x!}, \quad x = 0, 1, 2, \ldots$
The Poisson distribution depends only on the average number of occurrences per
unit of time or space.
Number of misprints.
Accidents.
Example 5.8: Customers of an internet service provider initiate new accounts at the
average rate of 10 accounts per day.
(a) What is the probability that 8 new accounts will be initiated today? (Ans: 0.1126)
(b) What is the probability that more than 1 new accounts will be initiated today? (Ans:
0.9995)
(c) What is the probability that more than 10 new accounts will be initiated within 2 days?
(Ans: 0.9892)
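The three Poisson answers can be reproduced from the probability function. In the Python sketch below, part (c) uses a rate of 20 because the average of 10 accounts per day is applied over 2 days; the function names are illustrative.

```python
import math


def poisson_pmf(x, lam):
    """P(X = x) for a Poisson random variable with mean lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


def poisson_cdf(x, lam):
    """P(X <= x), summing the probability function from 0 to x."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))


p_a = poisson_pmf(8, 10)         # (a) exactly 8 new accounts today
p_b = 1 - poisson_cdf(1, 10)     # (b) more than 1 new account today
p_c = 1 - poisson_cdf(10, 20)    # (c) more than 10 new accounts in 2 days

print(round(p_a, 4), round(p_b, 4), round(p_c, 4))
```

Probabilities of the form "more than k" are obtained as one minus the cumulative probability up to k.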
Normal Probability Distribution
A random variable X is said to have a normal distribution if its probability
density function is given by
$f(x) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^2}$
It is asymptotic to the horizontal axis, i.e., it extends indefinitely in either direction from the mean.
It is a continuous distribution.
It is a family of curves, i.e., every unique pair of mean and standard deviation defines a different
normal distribution.
Total area under the curve sums to 1, i.e., the area of the distribution on each side of the mean is
0.5
For the standard normal distribution:
Mean is zero
Variance is one
Areas under the standard normal distribution curve have been tabulated in
various ways. The most common ones are the areas between 0 and Z.
Example 6.9 (read from the table): Determine the following probability.
P(0<Z<1.43)=? (Ans: 0.4236 )
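When a table is not at hand, the same area can be computed from the error function in Python's standard library. This is only a sketch of an alternative to the table lookup; `phi` is an illustrative name for the standard normal cumulative distribution function.

```python
from math import erf, sqrt

def phi(z):
    """Cumulative distribution function of the standard normal distribution, P(Z <= z)."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# Area between 0 and 1.43 under the standard normal curve.
area = phi(1.43) - phi(0)
print(round(area, 4))
```

Because the tabulated values are areas between 0 and Z, the value `phi(0)` = 0.5 is subtracted to match the table.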
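Remark:
$P(a < X < b) = P\left( \frac{a - \mu}{\sigma} < \frac{X - \mu}{\sigma} < \frac{b - \mu}{\sigma} \right) = P\left( \frac{a - \mu}{\sigma} < Z < \frac{b - \mu}{\sigma} \right)$ for a population
$P(a < X < b) = P\left( \frac{a - \bar{x}}{s} < \frac{X - \bar{x}}{s} < \frac{b - \bar{x}}{s} \right) = P\left( \frac{a - \bar{x}}{s} < Z < \frac{b - \bar{x}}{s} \right)$ for a sample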
On average, it takes 15 sec to download one file, with a variance of 16 sec².
seconds?
B. What is the probability that the software is installed in less than 20 seconds?
C. What is the probability that the software is installed in more than 20 seconds?
Exercise: An average scanned image occupies 0.6 megabytes of memory with a standard deviation
of 0.4 megabytes. If you plan to publish 80 images on your web site, what is the probability that
their total size is between 47 megabytes and 50 megabytes?
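One possible approach to this exercise is sketched below. It assumes the 80 image sizes are independent and that their total is approximately normal (a central limit theorem argument), with mean 80 × 0.6 and standard deviation √80 × 0.4; these assumptions are not stated in the exercise itself, and `normal_cdf` is an illustrative helper name.

```python
from math import erf, sqrt

def normal_cdf(x, mu, sigma):
    """P(X <= x) for a normal random variable with mean mu and standard deviation sigma."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

n = 80                        # number of images
mu_total = n * 0.6            # mean of the total size, in megabytes
sigma_total = sqrt(n) * 0.4   # standard deviation of the total size (independence assumed)

# P(47 < total < 50) under the normal approximation.
prob = normal_cdf(50, mu_total, sigma_total) - normal_cdf(47, mu_total, sigma_total)
print(round(prob, 3))
```

Under these assumptions the probability comes out at roughly 0.32.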
7. Introduction to Sampling
Introduction
When secondary data are not available for the problem under study, a
decision may be taken to collect primary data by using any of the
methods discussed in the previous chapter.
Why Sample?
Speed
Less costly
Definition of Terms
An element is an object on which a measurement is taken.
The deviation between an estimate from an ideal sample and the true population
value is the sampling error.
Almost always, the sampling frame does not match up perfectly with the target
population, leading to errors of coverage.
Essentials of Sampling
Representativeness
A sample should be selected so that it truly represents the population; otherwise the
results obtained may be misleading.
Adequacy
The size of sample should be adequate; otherwise it may not represent the
characteristics of the population.
Independence
All items of the sample should be selected independently of one another and all
items of the population should have the same chance of being selected in the
sample.
By independence of selection we mean that the selection of a particular item in one draw
has no influence on the probabilities of selection in any other draw.
Homogeneity
If two samples from the same population are taken, they should give
Types of Sampling
Methods of sampling
Simple Random Sampling
Simple random sampling also reduces the chance of bias occurring in the
sample.
When using simple random sampling every unit in the population has an
equal chance of being included in the sample.
Lottery Method
number.
The numbers are then thoroughly mixed, for example by placing them in a bowl or jar
The population members or items that are assigned that number are then
Table of random numbers
A random number table typically contains random digits between 0 and 9 that
In the table, all digits are equally probable and the probability of any given
Using a table of random numbers, select and record a random number between 1 and N.
Select a second random number between 1 and N. If the second number is the same as the first selected
number, discard it and go to the next step. If the second number is not the same as the first number, record it.
Select a third random number between 1 and N. If this number is the same as either one of the previous
numbers, discard it and go to the next step. If the number is not the same as the previous numbers, record it.
Continue in this manner until n different numbers between 1 and N have been chosen.
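The selection steps above can be sketched in code. Here a pseudo-random number generator stands in for the printed table of random numbers, which is an assumption of this sketch; the function name `draw_distinct_numbers` is illustrative only.

```python
import random

def draw_distinct_numbers(N, n, rng=None):
    """Select n different numbers between 1 and N, discarding any repeats."""
    if n > N:
        raise ValueError("sample size n cannot exceed population size N")
    if rng is None:
        rng = random.Random()
    chosen = []
    while len(chosen) < n:
        number = rng.randint(1, N)   # a random number between 1 and N, inclusive
        if number not in chosen:     # a repeated number is discarded
            chosen.append(number)
    return chosen

# Example: choose 5 of 50 numbered population members (seeded for repeatability).
selected = draw_distinct_numbers(N=50, n=5, rng=random.Random(0))
print(selected)
```

The population members carrying the selected numbers form the sample. In practice `random.sample(range(1, N + 1), n)` gives the same kind of result in one call; the loop is written out here only to mirror the steps.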
Systematic Sampling
Systematic sampling is the selection of every kth element from a sampling
Using this procedure each element in the population has a known and equal
probability of selection.
A sampling interval (denoted by the symbol, k) is chosen.
If a sample of about n out of N elements is desired, k is usually the ratio, N/n, rounded to the
nearest integer.
The elements selected for the sample are element number j and every kth element thereafter for the
remainder of the list, i.e., j, j+k, j+2k, etc.
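A minimal sketch of the procedure follows. It assumes the starting point j is drawn at random between 1 and k, a common convention that is not spelled out in the text above; `systematic_sample` is an illustrative name.

```python
import random

def systematic_sample(N, n, rng=None):
    """Return the positions j, j+k, j+2k, ... selected from a list of N elements."""
    if rng is None:
        rng = random.Random()
    k = max(1, round(N / n))   # sampling interval: N/n rounded to the nearest integer
    j = rng.randint(1, k)      # assumed: random start between 1 and k
    return list(range(j, N + 1, k))

# Example: about 10 elements out of a list of 100 (seeded for repeatability).
positions = systematic_sample(100, 10, random.Random(1))
print(positions)
```

When N/n is not a whole number, rounding k means the realised sample size can differ slightly from n, which is why the text speaks of a sample of "about n" elements.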
Stratified Sampling
The process of dividing a population of elements into distinct subpopulations
called strata.
Strata are formed so that each population element is assigned to only one
stratum.
The strata should be mutually exclusive: every element in the population must be
assigned to only one stratum.
Then random or systematic sampling is applied within each stratum. This often
improves the representativeness of the sample by reducing sampling error.
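The idea can be sketched as follows, using simple random sampling within each stratum. The student list, the stratum labels and the per-stratum sample sizes are made up for illustration, and `stratified_sample` is a hypothetical helper name.

```python
import random
from collections import defaultdict

def stratified_sample(elements, stratum_of, sizes, rng=None):
    """Group elements into strata, then draw a simple random sample within each stratum.

    stratum_of maps an element to its stratum label; sizes maps each label
    to the number of elements to draw from that stratum.
    """
    if rng is None:
        rng = random.Random()
    strata = defaultdict(list)
    for element in elements:
        strata[stratum_of(element)].append(element)   # each element joins exactly one stratum
    return {label: rng.sample(members, sizes[label]) for label, members in strata.items()}

# Hypothetical population: 30 first-year and 20 second-year students.
students = [("S%02d" % i, "year1" if i <= 30 else "year2") for i in range(1, 51)]
picked = stratified_sample(students, lambda s: s[1], {"year1": 6, "year2": 4}, random.Random(2))
print(picked)
```

The sizes 6 and 4 are proportional to the stratum sizes in this example; other allocations are possible and the text does not prescribe one.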
Cluster Sampling
Probability sampling in which sampling units at some point in the selection
process are collections, or clusters, of population elements.
The total population is divided into these groups (or clusters), and a sample of
the groups is selected.
Then the required information is collected from the elements within each
selected group.
This may be done for every element in these groups, or a subsample of elements
may be selected within each of these groups.
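A rough sketch of this selection is given below. The clusters (dormitories) and their members are invented for illustration, and `cluster_sample` is a hypothetical helper; passing `subsample_size` draws a subsample within each selected cluster, while leaving it out takes every element.

```python
import random

def cluster_sample(clusters, n_clusters, subsample_size=None, rng=None):
    """Randomly select whole clusters, then take all or some elements from each."""
    if rng is None:
        rng = random.Random()
    chosen = rng.sample(sorted(clusters), n_clusters)   # sample of the groups themselves
    result = {}
    for label in chosen:
        members = clusters[label]
        if subsample_size is None or subsample_size >= len(members):
            result[label] = list(members)                       # every element in the cluster
        else:
            result[label] = rng.sample(members, subsample_size)  # subsample within the cluster
    return result

# Hypothetical population divided into four clusters of five students each.
clusters = {
    "dorm_A": ["A1", "A2", "A3", "A4", "A5"],
    "dorm_B": ["B1", "B2", "B3", "B4", "B5"],
    "dorm_C": ["C1", "C2", "C3", "C4", "C5"],
    "dorm_D": ["D1", "D2", "D3", "D4", "D5"],
}
all_members = cluster_sample(clusters, 2, rng=random.Random(3))
subsampled = cluster_sample(clusters, 2, subsample_size=2, rng=random.Random(3))
print(all_members)
print(subsampled)
```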
Cluster sampling can be single-stage or multistage.
Judgment Sampling
Judgmental sampling is a non-probability sampling technique where the researcher
selects units to be sampled based on their knowledge and professional judgment.
Judgment sampling is used in cases where the expertise of an authority can yield a
more representative sample, and hence more accurate results, than probability
sampling techniques.
The process involves nothing but purposely handpicking individuals from the
population based on the authority's or the researcher's knowledge and judgment.
Quota Sampling
Quota sampling is a non-probability sampling technique wherein the assembled
sample has the same proportions of individuals as the entire population with
respect to known characteristics, traits or focused phenomenon.
The main reason why researchers choose quota samples is that it allows the
researchers to sample a subgroup that is of great interest to the study.
exclusive subgroups.
Then, the researcher must identify the proportions of these subgroups in the
Finally, the researcher selects subjects from the various subgroups while taking into
The final step ensures that the sample is representative of the entire population.
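The steps above might be sketched as follows. The subgroup proportions, the list of subjects and the order in which they are encountered are all invented for illustration, and `quota_sample` is a hypothetical helper; subjects are accepted in the order met, with no random selection, until each subgroup's quota is filled.

```python
def quota_sample(subjects, group_of, proportions, sample_size):
    """Accept subjects in the order encountered until each subgroup's quota is filled."""
    # Quota for each subgroup = its population proportion times the sample size.
    quotas = {group: round(share * sample_size) for group, share in proportions.items()}
    counts = {group: 0 for group in quotas}
    total_needed = sum(quotas.values())
    sample = []
    for subject in subjects:
        group = group_of(subject)
        if group in counts and counts[group] < quotas[group]:
            sample.append(subject)
            counts[group] += 1
        if len(sample) == total_needed:
            break
    return sample

# Hypothetical population proportions and a stream of available subjects.
proportions = {"female": 0.4, "male": 0.6}
subjects = [("P%02d" % i, "female" if i % 2 == 0 else "male") for i in range(1, 41)]
sample = quota_sample(subjects, lambda s: s[1], proportions, sample_size=10)
print(sample)
```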
Convenience Sampling
Convenience sampling is a non-probability sampling technique where subjects are
selected because of their convenient accessibility and proximity to the researcher.
The subjects are selected just because they are easiest to recruit for the study, and the
researcher does not consider selecting subjects that are representative of the entire
population.
In all forms of research, it would be ideal to test the entire population, but in most cases
the population is so large that it is impossible to include every individual.
This is the reason why most researchers rely on sampling techniques like convenience
sampling, the most common of all sampling techniques.
Many researchers prefer this sampling technique because it is fast, inexpensive, easy and
the subjects are readily available.
Snowball/Chain Sampling
Snowball sampling is a non-probability sampling technique that is used by
researchers to identify potential subjects in studies where subjects are hard to
locate.
Researchers use this sampling method if the sample for the study is very rare or
is limited to a very small subgroup of the population.
This type of sampling technique works like chain referral. After observing the
initial subject, the researcher asks for assistance from the subject to help identify
people with a similar trait of interest.
The process of snowball sampling is much like asking your subjects to nominate another
person with the same trait as your next subject.
The researcher then observes the nominated subjects and continues in the same way until
a sufficient number of subjects has been obtained.
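The chain-referral idea can be illustrated with a small sketch. The referral lists below are invented, and `snowball_sample` is a hypothetical helper: starting from the initial subject, each observed subject nominates further people, who are observed in turn until the target number is reached.

```python
from collections import deque

def snowball_sample(initial_subjects, referrals, target_size):
    """Follow nominations from subject to subject until enough subjects are obtained."""
    sample = []
    seen = set()
    queue = deque(initial_subjects)
    while queue and len(sample) < target_size:
        subject = queue.popleft()
        if subject in seen:            # a person nominated twice is observed only once
            continue
        seen.add(subject)
        sample.append(subject)
        queue.extend(referrals.get(subject, []))   # people this subject nominates
    return sample

# Hypothetical nominations: each key lists the people that subject refers.
referrals = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "D": [], "E": ["F"]}
recruited = snowball_sample(["A"], referrals, target_size=5)
print(recruited)
```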