Statistics
Examples: colors in the Ethiopian flag, blood type, religion
Scale of Measurement: Ordinal
Arranged in order, but differences between data entries are not meaningful.
Scale of Measurement: Interval
Arranged in order; the differences between data entries can be calculated.
Ratio
Scale     Categorize  Arrange in order  Subtract values  Form ratios
Nominal   Yes         No                No               No
Ordinal   Yes         Yes               No               No
Interval  Yes         Yes               Yes              No
Ratio     Yes         Yes               Yes              Yes
Discussion
Give the correct variable type and scale of measurement for
each variable:
1. Blood group ...............................
2. Status of a PC infected by a virus ................................
3. Job satisfaction index (1-5).........
4. Number of heart attacks .............
5. Calendar year ...........................
6. RAM of a computer..........
7. Number of traffic accidents in a 3 - day ....
8. Number of cases of each reportable disease
reported by a health worker.......
11. Ethnic group..........................
11/13/2023 Introduction to Statistics 32
Method of data collection and presentation
Objectives
Primary Secondary
Tabular presentation
Diagrammatic and Graphic presentation.
Tabular presentation
• Qualitative data: categorical frequency distribution
• Quantitative data: ungrouped frequency distribution, grouped frequency distribution
Diagrammatic and graphic presentation
• Qualitative data: pie chart, bar chart
• Quantitative data: histogram, frequency polygon, ogive
2. SUMMARIZING OF DATA
$\sum_{i=1}^{6} x_i = 21 + 13 + 59 + 46 + 32 + 37 = 208.$
Introduction………
Some Properties of the Summation Notation
The arithmetic mean, usually abbreviated to ‘mean’, is the sum of the observations divided by the number of observations:
$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$
$\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{n}$
• where,
k = the number of class intervals
xi = the mid-point of the ith class interval
fi = the frequency of the ith class interval
Weighted mean:
$\bar{x}_w = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$
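The mean formulas above can be checked numerically. The sketch below is illustrative only: the function names and the grouped and weighted sample values are assumptions made for this example; only the six raw observations come from the summation example earlier in the text.

```python
# Illustrative helpers; names and sample values are assumptions, not from the slides.

def arithmetic_mean(xs):
    """Sum of the observations divided by the number of observations."""
    return sum(xs) / len(xs)

def grouped_mean(midpoints, freqs):
    """Mean of a grouped frequency distribution: sum(f_i * x_i) / n, with n = sum(f_i)."""
    n = sum(freqs)
    return sum(f * x for f, x in zip(freqs, midpoints)) / n

def weighted_mean(xs, ws):
    """Weighted mean: sum(w_i * x_i) / sum(w_i)."""
    return sum(w * x for w, x in zip(ws, xs)) / sum(ws)

raw = [21, 13, 59, 46, 32, 37]            # observations from the summation example
print(arithmetic_mean(raw))               # 208 / 6

# Hypothetical class mid-points and frequencies
print(grouped_mean([5, 15, 25], [2, 5, 3]))

# Hypothetical marks weighted by credit hours
print(weighted_mean([70, 80, 90], [3, 4, 3]))
```

When every weight is equal, the weighted mean reduces to the ordinary arithmetic mean.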
Example 1: A man gets three annual raises in his salary. At the
end of the first year he gets an increase of 3%, at the end of the
second year an increase of 6%, and at the end of the
third year an increase of 9%. What is the
average percentage increase over the three periods?
• Solution: G.M. = (1.03 × 1.06 × 1.09)^(1/3) ≈ 1.0597, so 1.0597 − 1 = 0.0597.
Therefore, the average percentage increase is about 5.97%.
$GM = \operatorname{antilog}\left(\frac{\sum_{i=1}^{n} \log x_i}{n}\right)$
$GM = \operatorname{antilog}\left(\frac{\sum_{i=1}^{k} f_i \log x_i}{n}\right)$
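As a rough check of the two forms of the geometric mean, the sketch below computes it for the growth factors of Example 1 (1.03, 1.06 and 1.09), once as the n-th root of the product and once as the antilog of the mean logarithm. The function names are assumptions made for this illustration.

```python
import math

def geometric_mean_root(xs):
    """n-th root of the product of the observations."""
    return math.prod(xs) ** (1 / len(xs))

def geometric_mean_log(xs):
    """Antilog (base 10) of the mean of the base-10 logarithms."""
    return 10 ** (sum(math.log10(x) for x in xs) / len(xs))

factors = [1.03, 1.06, 1.09]    # growth factors for raises of 3%, 6% and 9%
gm = geometric_mean_root(factors)
print(round(gm, 4), round(geometric_mean_log(factors), 4))
print(f"average increase: {100 * (gm - 1):.2f}%")
```

Both routes should agree up to floating-point rounding; the logarithmic form is mainly useful for hand calculation or for very long products.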
Another important mean is the harmonic mean, which is a suitable measure of central tendency when the data pertain to speeds, rates and time.
Let x1, x2, x3, …, xn be n values in a set of observations; then the harmonic mean is given by
$HM = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}$
If the data are arranged in the form of a frequency distribution in which an observation xi has frequency fi (i = 1, 2, 3, …, k), the harmonic mean is given by
$HM = \frac{n}{\sum_{i=1}^{k} \frac{f_i}{x_i}}$, where $n = \sum_{i=1}^{k} f_i$
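A minimal sketch of the harmonic mean, using the standard definitions HM = n / Σ(1/x_i) for raw data and HM = n / Σ(f_i/x_i) for a frequency distribution. The speeds are hypothetical values chosen for illustration, and `harmonic_mean_freq` is a helper name assumed here; `statistics.harmonic_mean` is part of the Python standard library.

```python
from statistics import harmonic_mean

def harmonic_mean_freq(xs, fs):
    """Harmonic mean of a frequency distribution: n / sum(f_i / x_i), n = sum(f_i)."""
    n = sum(fs)
    return n / sum(f / x for x, f in zip(xs, fs))

# Hypothetical trip: equal distances driven at 40 km/h and 60 km/h
print(harmonic_mean([40, 60]))             # average speed over the whole trip

# Same values expressed as a frequency distribution (each speed observed once)
print(harmonic_mean_freq([40, 60], [1, 1]))
```

The result is below the arithmetic mean of 50 km/h because more time is spent travelling at the lower speed.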
Mode………
Properties of mode
• The mode can be used as a summary measure for
nominal, ordinal, discrete and continuous data; in
general, however, it is more appropriate for nominal
and ordinal data.
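Median of ungrouped data: The median is found by arranging the data in order of magnitude. The median is then the value of the middle term.
Let x1, x2, x3, …, xn be n ordered observations. Then the median is given by:
$\tilde{x} = x_{(n+1)/2}$ if n is odd, and $\tilde{x} = \frac{1}{2}\left(x_{n/2} + x_{n/2+1}\right)$ if n is even.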
Deciles: the nine points that divide the given ordered data into 10 equal parts.
Percentiles: the 99 values that divide the ordered data into one hundred equal parts.
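The sketch below illustrates the median, deciles and percentiles with Python's standard `statistics` module, reusing the six observations from the summation example. Note that `statistics.quantiles` applies one particular interpolation rule (its default "exclusive" method), so hand calculations based on a textbook positional formula may differ slightly in the exact cut points.

```python
from statistics import median, quantiles

data = [21, 13, 59, 46, 32, 37]       # observations from the summation example

print(median(data))                   # n is even: mean of the two middle ordered values

deciles = quantiles(data, n=10)       # 9 cut points
percentiles = quantiles(data, n=100)  # 99 cut points
print(len(deciles), len(percentiles))
```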
Elementary probability
Objectives
Introduction
Definition of some probability terms
Counting rule
In order to calculate probabilities, we have to
know
• The number of elements of an event
• The number of elements of the sample space.
That is, in order to judge what is probable, we
have to know what is possible.
To determine the number of outcomes, one can use several rules of counting:
Counting: Addition Principle
Addition Principle: if a task can be
accomplished by k distinct procedures, where
the i-th procedure has ni alternatives, the total
number of ways of accomplishing the task
equals n1 + n2 + … + nk.
• Example 1: There are two means of transportation
from city A to city B: bus or train. There
are 3 buses and 2 trains. How many ways of
transportation are there from city A to city B?
Counting: Multiplication Principle
Rule: in a sequence of n events in which the
first event has k1 possibilities, the second event
has k2, the third event has k3, and so on, the
total number of possibilities is k1 · k2 · … · kn.
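The two counting principles can be illustrated by listing outcomes explicitly. The sketch below uses a coin toss followed by a die roll for the multiplication principle and the bus and train counts from the addition principle example; the variable names are assumptions made for this illustration.

```python
from itertools import product

coin = ["H", "T"]
die = [1, 2, 3, 4, 5, 6]

# Multiplication principle: every coin outcome pairs with every die outcome.
outcomes = list(product(coin, die))
print(len(outcomes), len(coin) * len(die))

# Addition principle: one choice is made from either of two disjoint sets of options.
buses, trains = 3, 2
print(buses + trains)
```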
Counting: Permutations
Solution
6! = 6 × 5 × 4 × 3 × 2 × 1 = 720
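The factorial in the solution above, and the related count of ordered arrangements of r objects taken from n distinct objects, n!/(n − r)!, can be checked with the standard `math` module. The values n = 5 and r = 2 are hypothetical and chosen only for illustration.

```python
import math

print(math.factorial(6))    # 6 x 5 x 4 x 3 x 2 x 1

# Ordered arrangements of r objects taken from n distinct objects: n! / (n - r)!
n, r = 5, 2                 # hypothetical values
print(math.perm(n, r), math.factorial(n) // math.factorial(n - r))
```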
Permutation rule 3
Counting: Combination
A combination is a selection of distinct objects
without regard to order.
Combinations are used when the order of
arrangement is not important, as in a
selection process.
The number of combinations of r objects
selected from n objects is denoted by nCr and is given by
$_nC_r = \frac{n!}{r!\,(n-r)!}$
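A short sketch of counting combinations, comparing the standard formula n!/(r!(n − r)!) with the library function `math.comb` and with an explicit listing of all unordered selections. The values n = 5 and r = 3 are hypothetical.

```python
import math
from itertools import combinations

n, r = 5, 3    # hypothetical values
by_formula = math.factorial(n) // (math.factorial(r) * math.factorial(n - r))
by_listing = len(list(combinations(range(n), r)))
print(by_formula, math.comb(n, r), by_listing)
```

Because order does not matter, each selection is counted once, which is why the count is smaller than the corresponding number of permutations.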
Approaches to probability
• There are objective and subjective approaches to probability.
• Subjective probability:
– This probability is based on personal feeling or judgment and may
not even be scientific.
For example: the probability that a cure for cancer
will be discovered within the next 10 years.
Objective probability
• Classical,
• Relative frequency, and
• Axiomatic
Classical probability
This approach assumes that the outcomes of an
experiment are equally likely and that the total number of
outcomes is finite. It uses the sample space to determine the
numerical probability that an event will happen.
Definition: if an experiment has n equally likely outcomes, and
event A occurs in k of them, then
the probability of A, denoted P(A), is defined as
$P(A) = \frac{k}{n}$
For example:
1. In rolling a fair die, each of the six sides is equally likely to
be observed.
2. When a fair coin is tossed, it may land either as a head (H) or as a
tail (T).
Relative frequency probability:
• This approach can be used when some random
phenomenon is observed repeatedly under identical
conditions.
• If an experiment possessing certain outcomes is
repeated a large number of times, N, and
• if some resulting event E occurs N(E) times, then the relative
frequency of occurrence of E, N(E)/N, will be
approximately equal to the probability of E:
Pr(E) ≈ N(E)/N.
Frequency Probabilities
Coin flipping:
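A simulated coin-flipping experiment gives a feel for the relative frequency approach. The sketch below is an illustration only: it assumes a fair coin, uses a pseudo-random generator with a fixed seed, and the function name is an assumption made here.

```python
import random

def relative_frequency_of_heads(n_flips, seed=0):
    """Simulate n_flips fair-coin tosses and return the proportion of heads."""
    rng = random.Random(seed)           # fixed seed so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips

for n in (10, 100, 10_000, 100_000):
    print(n, relative_frequency_of_heads(n))
```

For small N the proportion can be far from 0.5; as N grows it tends to settle near 0.5, which is the sense in which relative frequency approximates probability.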
Ex 1: 11 cards containing the letters of the word PROBABILITY are put in a
box. A card is taken out at random. Find the probability that the card
chosen is
(a) the letter B (b) a vowel (c) a consonant
n(S) = 11
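One way to work Ex 1 is to count the favourable cards for each event and divide by n(S). The sketch below does this with exact fractions; treating Y as a consonant is an assumption of this illustration.

```python
from fractions import Fraction

word = "PROBABILITY"
n_s = len(word)                                # n(S) = 11 cards
vowels = set("AEIOU")                          # assumption: Y counts as a consonant

p_b = Fraction(word.count("B"), n_s)
p_vowel = Fraction(sum(ch in vowels for ch in word), n_s)
p_consonant = 1 - p_vowel                      # every card is either a vowel or a consonant

print(p_b, p_vowel, p_consonant)
```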
Ex 2: There are x red balls and 8 yellow balls in a bag. A ball is taken at
random from the bag. The probability of getting a red ball is 3/7.
(a) Find the value of x.
(b) If y red balls are then added to the bag, the probability of getting a yellow ball becomes 1/2. Find the value of y.
Total number of balls = x + 8
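A possible way to check Ex 2 is a small search over whole-number values. Part (a) requires x/(x + 8) = 3/7 and part (b) requires 8/(x + 8 + y) = 1/2, assuming the y red balls are added to the same bag. The search bounds below are arbitrary choices for this sketch.

```python
from fractions import Fraction

YELLOW = 8

# (a) number of red balls x with P(red) = x / (x + 8) = 3/7
x = next(x for x in range(1, 1000) if Fraction(x, x + YELLOW) == Fraction(3, 7))

# (b) red balls y to add so that P(yellow) = 8 / (x + 8 + y) = 1/2
y = next(y for y in range(0, 1000) if Fraction(YELLOW, x + YELLOW + y) == Fraction(1, 2))

print(x, y)
```

The same values follow algebraically: 7x = 3(x + 8) for part (a), and x + 8 + y = 16 for part (b).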
Axiomatic approach:
Defines probability in terms of axioms:
1. 0 ≤ Pr(E) ≤ 1
2. Pr(S) = 1, where S is the sample space
3. If A and B are mutually exclusive events, the probability that one or the
other occurs equals the sum of the two probabilities, i.e. P(A ∪ B) = P(A) + P(B)
4. P(A’) = 1 − P(A)
Conditional probability
Probability of independent events
Two events are independent if the occurrence of one of the events does not affect the
probability of the other event.
Example:
Let event A stand for “the sex of the first child from a mother is female”, and event B
stand for “the sex of the second child from the same mother is female”.
Are A and B independent?
Solution
P(B | A) = P(B) = 0.5
The occurrence of A does not affect the probability of B,
so the events are independent.
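The independence check in the example can be reproduced by enumeration, using the usual definition P(B | A) = P(A and B) / P(A). The sketch below assumes that each birth is female or male with equal probability, so that the four ordered pairs are equally likely; that assumption and the helper names are not from the slides.

```python
from fractions import Fraction
from itertools import product

# Assumption for this sketch: each birth is F or M with equal probability,
# so the four ordered pairs (first child, second child) are equally likely.
sample_space = list(product("FM", repeat=2))

def prob(event):
    """Classical probability: favourable outcomes / total outcomes."""
    return Fraction(sum(1 for s in sample_space if event(s)), len(sample_space))

def a(s):
    return s[0] == "F"               # first child is female

def b(s):
    return s[1] == "F"               # second child is female

p_b_given_a = prob(lambda s: a(s) and b(s)) / prob(a)
print(p_b_given_a, prob(b), p_b_given_a == prob(b))
```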
Class exercise
1. Fifteen Ethiopian athletes were entered in a
race. In how many different ways could prizes for
first, second and third place be
awarded?
2. In a club containing 7 members, a committee of 3
people is to be formed. In how many ways can
the committee be formed?
3. In the experiment of tossing a coin and a die
together, find the probability of an event E
consisting of a head and an even number.
Debremarkos University
College of natural and computational
Sciences
Department of Statistics
Probability Distributions
$P(X) = \binom{n}{X} p^{X} (1-p)^{n-X}$
where
n = number of trials
X = number of successes out of n trials
p = probability of success
1 − p = probability of failure
$\Pr(X = x) = \frac{n!}{x!\,(n-x)!}\, p^{x} (1-p)^{n-x}$
Therefore, $\Pr(X = 5) = \frac{10!}{5!\,(10-5)!} \times 0.52^{5} \times (1-0.52)^{10-5} = 0.24$
3 or more will be females?
Pr(X≥3) = 1- Pr (X<3) = 1-[Pr(X=0)+Pr(X=1)+Pr(X=2)]
=1-[0.001+0.013+0.055]= 1-0.069=0.931
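The binomial calculations above can be reproduced with a few lines of code. In the sketch below, `binomial_pmf` is a helper name assumed for this illustration. The value p = 0.48 for a female is also an assumption: it is the value that reproduces the three terms 0.001, 0.013 and 0.055 quoted above, with n = 10. Because those terms are rounded before being summed, the unrounded tail probability can differ from 0.931 in the third decimal place.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Pr(X = x) for n independent trials with success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Pr(X = 5) with n = 10 and p = 0.52, as in the worked line above
print(round(binomial_pmf(5, 10, 0.52), 2))

# Pr(X >= 3) via the complement rule, with the assumed p = 0.48
p_female = 0.48
terms = [binomial_pmf(k, 10, p_female) for k in range(3)]
tail = 1 - sum(terms)
print([round(t, 3) for t in terms], round(tail, 3))
```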
Example 2: An exam has five questions, and each question has
four choices, exactly one of which is the correct
answer. Suppose a student answers all the questions by guessing.
b) Getting 1 case
c) Getting 2 cases
d) Getting 3 cases
e) Getting 4 cases
Solution
a) P(X = 0) = 0.111
b) P(X = 1) = 0.244
c) P(X = 2) = 0.268
d) P(X = 3) = 0.197
e) P(X = 4) = 0.108
[Figure: density curves f(x) plotted against x for normal, uniform and skewed distributions]
There are infinitely many continuous random variables.
Normal curve
$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$
This is a bell-shaped curve with different centers and spreads depending on μ and σ.
Where, σ = population standard deviation
µ = population mean
e = 2.718…, π = 3.14…
[Figure: normal curve with total area = 1; inflection points at μ − σ and μ + σ; x-axis marked at μ − 3σ, μ − 2σ, μ − σ, μ, μ + σ, μ + 2σ, μ + 3σ]
The line of symmetry of the curve indicates the mean of the
distribution, and the spread shows the magnitude of the standard
deviation.
The area under the curve
The area under a curve can be obtained:
a. By taking the integral over an interval (a, b):
Area under the curve between a and b = $\int_{a}^{b} f(x)\,dx$
where $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^{2}/(2\sigma^{2})}$
b. By preparing tables containing areas for each curve
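To illustrate the integral approach, the sketch below approximates the area under a normal curve numerically and compares it with the closed form based on the error function. The function names, the choice of the trapezoid rule and the interval (−1, 1) on the standard normal curve are all assumptions made for this example.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of the normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def area_trapezoid(a, b, mu, sigma, steps=10_000):
    """Approximate the integral of the density over (a, b) with the trapezoid rule."""
    h = (b - a) / steps
    total = 0.5 * (normal_pdf(a, mu, sigma) + normal_pdf(b, mu, sigma))
    total += sum(normal_pdf(a + i * h, mu, sigma) for i in range(1, steps))
    return total * h

def normal_cdf(x, mu, sigma):
    """Cumulative area to the left of x, written with the error function."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

mu, sigma = 0.0, 1.0
print(area_trapezoid(-1, 1, mu, sigma))
print(normal_cdf(1, mu, sigma) - normal_cdf(-1, mu, sigma))
```

The two printed values should agree to several decimal places, which is why tables of pre-computed areas can replace the integration in practice.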
$z = \frac{\text{Value} - \text{Mean}}{\text{Standard deviation}} = \frac{x - \mu}{\sigma}$
z-score = no. of σ-units above (positive z) or below (negative z) a distribution mean μ
The resulting distribution will be the standard normal with a mean of 0 and a standard
deviation of 1.
z   .00   .01   .02   .03   .04   .05   .06   .07   .08   .09
0.0 .5000 .5040 .5080 .5120 .5160 .5199 .5239 .5279 .5319 .5359
0.1 .5398 .5438 .5478 .5517 .5557 .5596 .5636 .5675 .5714 .5753
0.2 .5793 .5832 .5871 .5910 .5948 .5987 .6026 .6064 .6103 .6141
...
2.6 .9953 .9955 .9956 .9957 .9959 .9960 .9961 .9962 .9963 .9964
2.7 .9965 .9966 .9967 .9968 .9969 .9970 .9971 .9972 .9973 .9974
2.8 .9974 .9975 .9976 .9977 .9977 .9978 .9979 .9979 .9980 .9981
Find the area by finding 2.7 in the left hand column, and then
moving across the row to the column under 0.01.
The area to the left of z = 2.71 is 0.9966.
The Standard Normal Table
Example:
Find the cumulative area that corresponds to a z-score of −0.25.
z    .09   .08   .07   .06   .05   .04   .03   .02   .01   .00
−3.4 .0002 .0003 .0003 .0003 .0003 .0003 .0003 .0003 .0003 .0003
−3.3 .0003 .0004 .0004 .0004 .0004 .0004 .0004 .0005 .0005 .0005
...
−0.3 .3483 .3520 .3557 .3594 .3632 .3669 .3707 .3745 .3783 .3821
−0.2 .3859 .3897 .3936 .3974 .4013 .4052 .4090 .4129 .4168 .4207
−0.1 .4247 .4286 .4325 .4364 .4404 .4443 .4483 .4522 .4562 .4602
−0.0 .4641 .4681 .4721 .4761 .4801 .4840 .4880 .4920 .4960 .5000
Find the area by finding −0.2 in the left hand column, and then
moving across the row to the column under 0.05.
The area to the left of z = −0.25 is 0.4013.
Guidelines for Finding Areas
Finding Areas Under the Standard Normal Curve
1. Sketch the standard normal curve and shade the appropriate
area under the curve.
2. Find the area by following the directions for each case shown.
a. To find the area to the left of z, find the area that
corresponds to z in the Standard Normal Table.
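[Figure: standard normal curve, z-axis marked at 0 and 1.23]
1. Use the table to find the
area for the z-score.
[Figure: standard normal curve, z-axis marked at −0.75, 0 and 1.23]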
i.e. if a random variable, x, is normally distributed, the probability that x will fall in a given
interval is the area under the normal curve for that interval.
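[Figure: normal curve with μ = 10, σ = 5 (x-axis marked at 10 and 15) beside the standard normal curve with μ = 0, σ = 1 (z-axis marked at 0 and 1); the shaded areas are the same]
P(x < 15) = P(z < 1) = shaded area under the curve = 0.8413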
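The equality of the two areas can be checked with `statistics.NormalDist` from the Python standard library. This is only a sketch of the standardization step, using the values μ = 10, σ = 5 and x = 15 from the figure.

```python
from statistics import NormalDist

mu, sigma, x = 10, 5, 15
z = (x - mu) / sigma                        # z-score of x

p_x = NormalDist(mu, sigma).cdf(x)          # area to the left of x = 15
p_z = NormalDist(0, 1).cdf(z)               # area to the left of z on the standard curve
print(z, round(p_x, 4), round(p_z, 4))
```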
Probability and Normal Distributions
Example: The average weight of pregnant women attending prenatal care in a clinic was 78 kg,
with a standard deviation of 8 kg. If the weights are normally distributed:
a) Find the probability that a randomly selected pregnant woman weighs less than 90 kg.
μ = 78, σ = 8
$z = \frac{x - \mu}{\sigma} = \frac{90 - 78}{8} = 1.5$
P(x < 90) = P(z < 1.5) = 0.9332
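b) Find the probability that a randomly selected pregnant woman weighs more than 85 kg.
μ = 78, σ = 8
$z = \frac{x - \mu}{\sigma} = \frac{85 - 78}{8} = 0.875 \approx 0.88$
[Figure: normal curve with x = 85 marked to the right of μ = 78, corresponding to z = 0.88 on the standard normal curve]
P(x > 85) = P(z > 0.88) = 1 − P(z < 0.88) = 1 − 0.8106 = 0.1894
The probability that a randomly selected pregnant woman weighs more than 85 kg is 0.1894.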
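c) Find the probability that a randomly selected pregnant woman weighs between 60 kg and 80 kg.
μ = 78, σ = 8
$z_1 = \frac{x - \mu}{\sigma} = \frac{60 - 78}{8} = -2.25$
$z_2 = \frac{x - \mu}{\sigma} = \frac{80 - 78}{8} = 0.25$
[Figure: normal curve with x = 60 and x = 80 marked on either side of μ = 78, corresponding to z = −2.25 and z = 0.25 on the standard normal curve]
P(60 < x < 80) = P(−2.25 < z < 0.25) = P(z < 0.25) − P(z < −2.25)
= 0.5987 − 0.0122 = 0.5865
The probability that a randomly selected pregnant woman weighs between 60 kg and 80 kg is 0.5865.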
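The three probabilities in this example can be cross-checked without a table. The sketch below uses `statistics.NormalDist` with μ = 78 and σ = 8. Because the table method rounds z to two decimals (0.875 becomes 0.88), the software value for P(x > 85) can differ from the table answer in the third decimal place.

```python
from statistics import NormalDist

weights = NormalDist(mu=78, sigma=8)

p_less_90 = weights.cdf(90)                        # P(x < 90)
p_more_85 = 1 - weights.cdf(85)                    # P(x > 85)
p_60_to_80 = weights.cdf(80) - weights.cdf(60)     # P(60 < x < 80)

print(round(p_less_90, 4), round(p_more_85, 4), round(p_60_to_80, 4))
```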
Finding z-Scores
Example:
In a certain population, the proportion of individuals with uric acid level less than a certain
limit is 36.7%.
• Find the z-score that corresponds to this cutoff point.
z    .09   .08   .07   .06   .05   .04   .03   .02   .01   .00
−0.3 .3483 .3520 .3557 .3594 .3632 .3669 .3707 .3745 .3783 .3821
−0.2 .3859 .3897 .3936 .3974 .4013 .4052 .4090 .4129 .4168 .4207
−0.1 .4247 .4286 .4325 .4364 .4404 .4443 .4483 .4522 .4562 .4602
−0.0 .4641 .4681 .4721 .4761 .4801 .4840 .4880 .4920 .4960 .5000
Find the z-score by locating 0.367 in the body of the Standard Normal
Table. Use the value closest to 0.367.
The z-score is −0.34.
Finding a z-Score Given a Percentile
Example:
Find the z-score that corresponds to P75
[Figure: standard normal curve with area = 0.75 to the left of the unknown z-score]
The z-score is 0.67.
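Looking up a z-score from a given area is the inverse of the cumulative-area lookup. As a sketch, `NormalDist.inv_cdf` from the standard library performs that inverse directly; the two areas used are 0.75 (P75) and 0.367 (the uric acid example). An area below 0.5 yields a negative z-score, because the cutoff lies to the left of the mean.

```python
from statistics import NormalDist

std_normal = NormalDist()               # mean 0, standard deviation 1

z_p75 = std_normal.inv_cdf(0.75)        # z-score with 75% of the area to its left
z_367 = std_normal.inv_cdf(0.367)       # z-score with 36.7% of the area to its left
print(round(z_p75, 2), round(z_367, 2))
```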
x = μ + zσ.
Example:
The monthly expenses on cigarettes by smokers in a city are normally
distributed with a mean of 120 Birr and a standard deviation of 16 Birr.
Find the monthly expense corresponding to a z-score of 1.60.
x = μ + zσ
= 120 + 1.60(16)
= 145.6
We can conclude that an expense of 145.60 Birr on cigarettes is 1.6
standard deviations above the mean.
Exercise
A population of sandwiches has a mean weight of 250 grams
with a standard deviation of 20 grams. Based on this
information, give short answers to the following questions.