
Statistics and Probability

Statistics is the science of collecting, organizing, and interpreting data. It involves both descriptive and inferential statistics. Descriptive statistics describes data through methods like charts and tables, while inferential statistics draws conclusions from samples. There are many key concepts in statistics including populations, samples, variables, scales of measurement, and methods of data collection and presentation. Common ways to present data include tables, graphs, charts, and diagrams to effectively summarize and compare the information.
STATISTICS AND PROBABILITY
STATISTICS
• is an art and science that deals with the collection, organization, creative presentation, analysis, and interpretation of quantitative data.
FIELDS OF STATISTICS
DESCRIPTIVE STATISTICS
 is concerned with the methods of collecting, organizing, and presenting
data appropriately and creatively to describe or assess group
characteristics.
INFERENTIAL STATISTICS
• is concerned with inferring or drawing conclusions about the population based on preselected elements (a sample) of that population.
JOHN PAUL D. GUNNAWA

CONSTANT AND VARIABLE


CONSTANT
• refers to a fundamental quantity that does not change in value.
VARIABLE
• is a quantity that may take any one of a specified set of values. Variables are classified as qualitative (categorical) or quantitative (numerical).
QUALITATIVE VARIABLES
• are non-measurable characteristics that cannot assume a numerical value but can be classified into two or more categories.
QUANTITATIVE VARIABLES
• are quantities that can be counted, measured with a measuring device, or calculated with a mathematical formula.
Quantitative variables are classified as discrete or continuous.
DISCRETE VARIABLES
 consist of variates usually obtained by counting.
CONTINUOUS VARIABLES
• are obtained by measurement, usually with units, such as height in meters, weight in kilograms, and time in minutes.
DATA AND INFORMATION
DATA
• usually refers to facts concerning things, such as the status in life of people, the defectiveness of objects, or the effect of an event on society.
INFORMATION
• is a set of data that has been processed and presented in a form suitable for human interpretation, usually with the purpose of revealing trends or patterns about the population.
SOURCES OF DATA
There are two sources of data:
1. PRIMARY SOURCE
• a source from which firsthand information is obtained, usually by means of personal interview and actual observation.
2. SECONDARY SOURCE
• information taken from other works, news reports, readings, and records kept by agencies such as the Philippine Statistics Authority, the exchange commission, and the S.S.S.
SCALES OF MEASURING DATA
The classifications called scales of measurement are the following:
1. NOMINAL SCALE
• classifies objects or people's responses so that all of those in a single category are equal with respect to some attribute, and then each category is coded numerically.
2. ORDINAL SCALE
• classifies objects or individual responses according to degree or level, and then each level is coded numerically.
3. INTERVAL SCALE
• refers to quantitative measurement in which lower and upper limits are adopted to classify the relative order of, and the differences between, item numbers or actual scores.
4. RATIO SCALE
• takes into account the interval size and the ratio of two related quantities, which are usually based on a standard measurement.
METHODS OF COLLECTING DATA
1. DIRECT OR INTERVIEW METHOD
• is a person-to-person interaction between an interviewer and an interviewee.
2. INDIRECT OR QUESTIONNAIRE METHOD
• is an alternative to the interview method in which respondents answer a written set of questions.
3. REGISTRATION METHOD
• is enforced by private organizations or government agencies for recording purposes.
4. OBSERVATION
• is a scientific method of investigation that makes possible use of all the senses to measure or obtain outcomes or responses from the object of study.
5. EXPERIMENTATION
• is used when the objective is to determine the cause and effect of a certain phenomenon under controlled conditions.
POPULATION AND SAMPLE
POPULATION
• is a finite or infinite collection of objects, events, or individuals with specified class or characteristics under consideration.
SAMPLE
• is a finite or limited collection of objects, events, or individuals selected from a population.
SYMBOLS FOR PARAMETERS
• POPULATION SIZE ......................... N
• POPULATION MEAN ......................... μ
• POPULATION STANDARD DEVIATION ........... σ
• POPULATION VARIANCE ..................... σ²
• POPULATION COEFFICIENT OF CORRELATION ... ρ
SYMBOLS FOR STATISTICS
• SAMPLE SIZE ............................. n
• SAMPLE MEAN ............................. x̄
• SAMPLE STANDARD DEVIATION ............... s
• SAMPLE VARIANCE ......................... s²
• SAMPLE COEFFICIENT OF CORRELATION ....... r
CENSUS AND SAMPLING TECHNIQUES
CENSUS
• is a vital tool if the information gathered will be used for administrative purposes and if it is of local or national concern.
SAMPLE
• is a portion or sub-aggregate of the population that should represent the common qualities or characteristics of the population.
RANDOM AND NON-RANDOM SAMPLING
RANDOM SAMPLING
• is the most commonly used sampling technique, in which each member of the population is given an equal chance of being selected in the sample.
NON-RANDOM SAMPLING
• is a method of collecting a small portion of the population in which not all members are given a chance of being selected; some members are deliberately left out of the selection for varied reasons.
PROPERTIES OF RANDOM SAMPLING
1. EQUIPROBABILITY
• means that each member of the population has an equal chance of being selected and included in the sample.
2. INDEPENDENCE
• means that the chance of one member being drawn does not affect the chance of any other member.
TWO KINDS OF RANDOM SAMPLING
1. RESTRICTED RANDOM SAMPLING
• involves certain restrictions intended to improve the validity of the sampling.
2. UNRESTRICTED RANDOM SAMPLING
• is considered the best random sampling design because no restrictions are imposed and every member of the population has an equal chance of being included in the sample.
RANDOM SAMPLING TECHNIQUES
1. LOTTERY OR FISHBOWL SAMPLING
• this is done by writing the names or numbers of all members of the population on small rolled pieces of paper, which are then placed in a container and drawn at random.
2. SAMPLING WITH THE USE OF A TABLE OF RANDOM NUMBERS
• if the population is large, a more practical procedure is to use a table of random numbers, which contains rows and columns of digits randomly ordered by a computer.
3. SYSTEMATIC SAMPLING
• this method of sampling is done by taking every kth element in the population.
4. STRATIFIED RANDOM SAMPLING
• when the population can be partitioned into several strata or subgroups, it may be wiser to employ the stratified technique to ensure that each group is represented in the sample.
5. MULTI-STAGE OR MULTIPLE SAMPLING
• this technique uses several stages or phases in getting the sample from the population.
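The techniques above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 100 numbered members, the two strata "A" and "B", the 10% sampling fraction, and the function names are all hypothetical and do not come from the slides.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Lottery-style draw: every member has an equal chance of selection."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, n, seed=None):
    """Take every kth element after a random start (assumes n <= len(population))."""
    k = len(population) // n          # sampling interval
    rng = random.Random(seed)
    start = rng.randrange(k)          # random start within the first interval
    return [population[start + i * k] for i in range(n)]

def stratified_sample(strata, fraction, seed=None):
    """Draw the same fraction at random from each stratum."""
    rng = random.Random(seed)
    sample = []
    for members in strata.values():
        size = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, size))
    return sample

# Hypothetical population of 100 numbered members.
population = list(range(1, 101))
strata = {"A": list(range(1, 61)), "B": list(range(61, 101))}

print(simple_random_sample(population, 10, seed=1))
print(systematic_sample(population, 10, seed=1))
print(stratified_sample(strata, 0.1, seed=1))
```

Passing a seed makes a draw repeatable, which is convenient when a sample has to be documented or audited.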
NON-RANDOM SAMPLING TECHNIQUES
1. JUDGMENT OR PURPOSIVE SAMPLING
• this method is also referred to as non-random or non-probability sampling.
2. QUOTA SAMPLING
• this is a relatively quick and inexpensive method to operate, since the choice of the number of persons or elements to be included in the sample is made at the researcher's own convenience.
3. CLUSTER SAMPLING
• this is sometimes referred to as area sampling because it is usually applied on a geographical basis.
4. INCIDENTAL SAMPLING
• this design is applied to samples that are taken because they are the most available.
5. CONVENIENCE SAMPLING
• this method has been widely used in television and radio programs to find out the opinions of TV viewers and listeners regarding a controversial issue.
ORGANIZATION AND PRESENTATION OF DATA

 DATA/INFORMATION
 TEXTUAL
 TABULAR
 MAP GRAPH/CARTOGRAPH
 SCATTER POINT DIAGRAM
 PIE/PICTURE GRAPH
FORMS OF PRESENTATION OF DATA
A. TEXTUAL
• this form of presentation combines text and numerical facts in a statistical report.
B. TABULAR
• this form of presentation is better than the textual form because it provides numerical facts in a more concise and systematic manner.
Advantages of Tabular Presentation
1. It is brief; it reduces the matter to the minimum.
2. It gives the reader a good grasp of the meaning of the quantitative relationships indicated in the report.
3. The columns and rows make comparison easier.
C. GRAPHICAL PRESENTATION
• this form is the most effective means of organizing and presenting statistical data because the important relationships are brought out more clearly and creatively in visually solid and colorful figures.
DIFFERENT KINDS OF GRAPHS/CHARTS
1. LINE GRAPH
• shows the relationship between two sets of quantities. This is done by plotting the X set of quantities along the horizontal axis against the Y set of quantities along the vertical axis in a Cartesian coordinate plane.
2. BAR GRAPH
• consists of bars or rectangles of equal width, drawn either vertically or horizontally, segmented or non-segmented.
3. CIRCLE GRAPH or PIE CHART
• represents the relationship of the different components of a single total, shown as sectors of a circle.
4. PICTURE GRAPH or PICTOGRAM
• is a visual presentation of statistical quantities by means of pictures or symbols related to the subject under study.
5. MAP GRAPH or CARTOGRAM
• is one of the best ways to present geographical data.
6. SCATTER POINT DIAGRAM
• is a graphical device that shows the relationship between two quantitative variables.


FREQUENCY DISTRIBUTION
• is a tabulation or grouping of data into appropriate categories showing the number of observations in each group or category.

Consider the data below, which show the scores of 60 students in a statistics test.
 5 13  8  6 13 10  5 13 15 16
 8 12 15 10 12 16 12  9  3  7
11 15 11  7 15  2 13  5  9 12
13  9 12  9  9 14 12 11 19 13
16 18  3 13 18 10 15 14 18 11
10 12  6  9  5 17  9  6  9 18

The numbers shown above are called raw data.
PARTS OF A FREQUENCY TABLE
1. CLASS LIMITS
• groupings or categories defined by lower and upper limits.
Example:
16 – 20
21 – 25
26 – 30
Lower class limits are the smallest numbers that belong to the different classes.
Upper class limits are the highest numbers that belong to the different classes.
2. CLASS SIZE
• the width of each class interval.
L.L   U.L
16  -  20     class size is 5
21  -  25
3. CLASS BOUNDARIES
• are the numbers used to separate classes without the gaps created by the class limits.
Class Interval (L.L - U.L)    Class Boundaries (L.C.B - U.C.B)
16 - 20                       15.5 - 20.5
21 - 25                       20.5 - 25.5
26 - 30                       25.5 - 30.5
4. CLASS MARK
• is the midpoint of the lower and upper class limits.
Class Interval    Class Mark (x)
16 - 20           18
21 - 25           23
26 - 30           28
The construction of a frequency distribution requires the following steps.
1. Get the value of the range. The range, denoted by R, is the difference between the highest and the lowest value in the distribution.
R = H – L
2. Approximate the number of classes by using the relationship
k = 1 + 3.3 log n
Where: k is the number of classes
n is the sample size
3. Determine the size of the class interval. The value of c can be obtained by dividing the range by the desired number of classes.
$$c = \frac{R}{k}$$
4. Construct the classes.
Example:
Test Scores Obtained by the Sixty Students in a Statistics Class
48 73 57 57 69 88 11 80 82 47
46 70 49 45 75 81 33 65 38 59
94 59 62 36 58 69 45 55 58 65
30 49 73 29 41 53 37 35 61 48
22 51 56 55 60 37 56 59 57 36
12 36 50 63 68 30 56 70 53 28

Step 1. Get the range.
R = H – L = 94 – 11 = 83
Step 2. Determine the number of class intervals.
k = 1 + 3.3 log n
= 1 + 3.3 log 60
= 6.87 or 7
Step 3. Determine the size of the class interval.
$$c = \frac{R}{k} = \frac{83}{7} = 11.86 \text{ or } 12$$
Classes    f     x      Class Boundaries
11 - 22    3    16.5    10.5 – 22.5
23 - 34    5    28.5    22.5 – 34.5
35 - 46   11    40.5    34.5 – 46.5
47 - 58   19    52.5    46.5 – 58.5
59 - 70   14    64.5    58.5 – 70.5
71 - 82    6    76.5    70.5 – 82.5
83 - 94    2    88.5    82.5 – 94.5
n = 60
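The construction steps can be sketched in Python using the sixty test scores of this example. The function name `frequency_table` is illustrative, and two choices are assumptions rather than rules stated in the slides: integer-valued data, and rounding the class size up (here 11.86 becomes 12).

```python
import math

scores = [
    48, 73, 57, 57, 69, 88, 11, 80, 82, 47,
    46, 70, 49, 45, 75, 81, 33, 65, 38, 59,
    94, 59, 62, 36, 58, 69, 45, 55, 58, 65,
    30, 49, 73, 29, 41, 53, 37, 35, 61, 48,
    22, 51, 56, 55, 60, 37, 56, 59, 57, 36,
    12, 36, 50, 63, 68, 30, 56, 70, 53, 28,
]

def frequency_table(data, k=None):
    """Build (lower, upper, f, class mark, lower boundary, upper boundary) rows."""
    n = len(data)
    low, high = min(data), max(data)
    data_range = high - low                       # Step 1: R = H - L
    if k is None:
        k = round(1 + 3.3 * math.log10(n))        # Step 2: k = 1 + 3.3 log n
    c = math.ceil(data_range / k)                 # Step 3: class size, rounded up
    rows = []
    lower = low
    while lower <= high:                          # Step 4: construct the classes
        upper = lower + c - 1
        f = sum(lower <= x <= upper for x in data)
        rows.append((lower, upper, f, (lower + upper) / 2, lower - 0.5, upper + 0.5))
        lower += c
    return rows

for lower, upper, f, mark, lcb, ucb in frequency_table(scores):
    print(f"{lower:>3} - {upper:<3} {f:>3} {mark:>6} {lcb:>6} - {ucb}")
```

The loop keeps adding classes until the highest score is covered, so a different rounding of c changes the number of classes but never drops an observation.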


Example 2.
The intelligence quotients of 100 freshman students admitted to the Senior High School Department of a certain university were taken and are shown below.
 95 115 110 119  98  93 112  91  94 111
 99 111 110 115 107  96 107 105 108 108
 83  85 109  89 107 100 103 100  94 116
106 101 108 105 101 120  90 100 112 107
107 102  90 105  87 118  94 117 108 100
 91  88 120 106 107 106 107 106 100  97
 98 103 106 106 106 106 110 107  94  97
114  99  96 100 106 103 110 109 101 107
107  95  99  97  92 100 113 101 106 106
119 114  96 107 108 112  97 106 105 112
Solution: Using the same procedure:
Step 1. H = 120 ; L = 83
Step 2. R = H – L = 120 – 83 = 37
Step 3. The number of classes is given; we can say that k = 10.
Step 4. The size of the class interval:
$$c = \frac{R}{k} = \frac{37}{10} = 3.7 \text{ or } 4$$

Steps 5 and 6. Determine the classes and the frequency of each class.
Classes      f      x      Class Boundaries
 83 - 86     2     84.5     82.5 – 86.5
 87 - 90     5     88.5     86.5 – 90.5
 91 - 94     8     92.5     90.5 – 94.5
 95 - 98    11     96.5     94.5 – 98.5
 99 - 102   15    100.5     98.5 – 102.5
103 - 106   26    104.5    102.5 – 106.5
107 - 110   15    108.5    106.5 – 110.5
111 - 114    9    112.5    110.5 – 114.5
115 - 118    5    116.5    114.5 – 118.5
119 - 122    4    120.5    118.5 – 122.5
n = 100
Derived Frequency Distribution
• from a frequency distribution we can construct other distributions, such as the relative frequency distribution and the cumulative frequency distribution.
Relative Frequency Distribution
• the relative frequency distribution of a given set of data shows, in percent, the proportion of the frequency of each class to the total frequency. It is denoted by %f.
$$\%f = \frac{f}{n} \times 100$$
where %f – the relative frequency of each class interval
f – the frequency of each class
n – the sample size
The relative frequency of the first class interval, 11 - 22, is obtained as follows:
$$\%f = \frac{3}{60} \times 100 = 5\%$$


If we continue converting each class frequency to percent, we obtain the relative frequency distribution below (using Example 1).
Classes    f     x      Class Boundaries    %f
11 - 22    3    16.5    10.5 – 22.5          5
23 - 34    5    28.5    22.5 – 34.5          8.33
35 - 46   11    40.5    34.5 – 46.5         18.33
47 - 58   19    52.5    46.5 – 58.5         31.67
59 - 70   14    64.5    58.5 – 70.5         23.33
71 - 82    6    76.5    70.5 – 82.5         10
83 - 94    2    88.5    82.5 – 94.5          3.33
n = 60                                  Total = 99.99%
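As a quick check, the %f column can be reproduced with a short Python sketch. The variable names are illustrative; the frequencies are those of Example 1.

```python
# Class frequencies of Example 1 (classes 11-22 through 83-94).
freqs = [3, 5, 11, 19, 14, 6, 2]
n = sum(freqs)

# %f = f / n x 100, rounded to two decimal places as in the table.
relative = [round(f / n * 100, 2) for f in freqs]

print(relative)
print(round(sum(relative), 2))
```

The rounded percentages add up to 99.99 rather than 100 only because each entry is rounded to two decimal places.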
Cumulative Frequency Distribution
• this distribution is obtained by successively adding the class frequencies.
Two Types of Cumulative Frequency Distribution
1. Less than cumulative frequency distribution (<cumf)
• refers to the distribution whose frequencies are less than or below the upper class boundary they correspond to.
2. Greater than cumulative frequency distribution (>cumf)
• refers to the distribution whose frequencies are greater than or above the lower class boundary they correspond to.
Less than and greater than cumulative frequency distribution
Classes    f     x      Class Boundaries    %f      <cumf   >cumf
11 - 22    3    16.5    10.5 – 22.5          5        3      60
23 - 34    5    28.5    22.5 – 34.5          8.33     8      57
35 - 46   11    40.5    34.5 – 46.5         18.33    19      52
47 - 58   19    52.5    46.5 – 58.5         31.67    38      41
59 - 70   14    64.5    58.5 – 70.5         23.33    52      22
71 - 82    6    76.5    70.5 – 82.5         10       58       8
83 - 94    2    88.5    82.5 – 94.5          3.33    60       2
n = 60                                  Total = 99.99%
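Both cumulative columns are running totals, which a minimal Python sketch makes explicit. The names `less_than` and `greater_than` are illustrative: <cumf accumulates from the first class downward, and >cumf accumulates from the last class upward.

```python
from itertools import accumulate

# Class frequencies of Example 1 (classes 11-22 through 83-94).
freqs = [3, 5, 11, 19, 14, 6, 2]

# <cumf: running total from the lowest class to the highest.
less_than = list(accumulate(freqs))

# >cumf: running total from the highest class back to the lowest.
greater_than = list(accumulate(reversed(freqs)))[::-1]

print(less_than)
print(greater_than)
```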


HISTOGRAM
• refers to a data presentation that uses bars to present the frequency of each class.


MEASURES OF CENTRAL TENDENCY
MEAN
• one of the simplest and most efficient measures of central tendency.
• it is the value obtained by adding the values in the distribution and dividing the sum by the total number of values.
MEAN FOR UNGROUPED DATA
To compute the mean for ungrouped data:
$$\bar{x} = \frac{\text{sum of all the values in the distribution}}{\text{number of values in the distribution}} = \frac{\sum x}{n}$$
Example 1. Consider the following values.
21, 10, 36, 42, 39, 52, 30, 25, 26
Compute the value of the mean.
Solution: To compute the value of the mean,
$$\bar{x} = \frac{\sum x}{n} = \frac{21 + 10 + 36 + 42 + 39 + 52 + 30 + 25 + 26}{9} = \frac{281}{9} = 31.22$$
Example 2. The ages of 15 students in a certain class were taken.
15, 18, 17, 16, 19, 21, 18, 23, 24, 18, 16, 17, 20, 21, 19
Solution: To compute the value of the mean,
$$\bar{x} = \frac{15 + 18 + 17 + 16 + 19 + 21 + 18 + 23 + 24 + 18 + 16 + 17 + 20 + 21 + 19}{15} = \frac{282}{15} = 18.80$$
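Both results can be checked with Python's standard `statistics` module. This is a minimal sketch; the list names are illustrative.

```python
from statistics import mean

values = [21, 10, 36, 42, 39, 52, 30, 25, 26]
ages = [15, 18, 17, 16, 19, 21, 18, 23, 24, 18, 16, 17, 20, 21, 19]

# The mean is the sum of the values divided by the number of values.
print(sum(values), len(values), round(mean(values), 2))
print(sum(ages), len(ages), round(mean(ages), 2))
```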
WEIGHTED MEAN
Formula:
$$\bar{x} = \frac{\sum wx}{\sum w}$$
Where x represents the item value
w represents the weight associated with x

Example: Suppose we are interested in computing the weighted mean grade of a student whose grades are shown below.
Subject   No. of units (w)   Grade (x)
1         3                  2.0
2         3                  3.0
3         5                  1.25
4         1                  3.0
5         2                  2.5
6         3                  2.5


To compute the value of the weighted mean,
$$\bar{x} = \frac{\sum wx}{\sum w} = \frac{3(2.0) + 3(3.0) + 5(1.25) + 1(3.0) + 2(2.5) + 3(2.5)}{3 + 3 + 5 + 1 + 2 + 3} = \frac{36.75}{17} = 2.16$$

The weighted mean can also be computed by constructing another column representing the products of the item values and their corresponding weights.
Example 2: Suppose we want to compute the weighted mean grade of the student in our example using vertical addition. Let x be the grade of the student and w be the number of units per subject.
Subject   No. of units (w)   Grade (x)   wx
1         3                  2.0         6.0
2         3                  3.0         9.0
3         5                  1.25        6.25
4         1                  3.0         3.0
5         2                  2.5         5.0
6         3                  2.5         7.5
          ∑w = 17                        ∑wx = 36.75
$$\bar{x} = \frac{\sum wx}{\sum w} = \frac{36.75}{17} = 2.16$$
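The vertical-addition table corresponds to the short Python sketch below. The function name `weighted_mean` is illustrative; the units and grades are those of the example.

```python
units = [3, 3, 5, 1, 2, 3]                  # weights w
grades = [2.0, 3.0, 1.25, 3.0, 2.5, 2.5]    # item values x

def weighted_mean(values, weights):
    """Sum of w*x divided by the sum of w."""
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)

products = [w * x for w, x in zip(units, grades)]   # the wx column
print(products, sum(products), sum(units))
print(round(weighted_mean(grades, units), 2))
```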
MEAN FOR GROUPED DATA
To compute the mean of data presented in a frequency distribution, we shall consider two methods:
1. Midpoint method
2. Unit deviation method
The formula for the midpoint method is:
$$\bar{x} = \frac{\sum fx}{n}$$
Where: f – the frequency of each class
x – the midpoint of each class
n – the total number of frequencies or sample size
Steps:
1. Get the midpoint of each class.
2. Multiply each midpoint by its corresponding frequency.
3. Get the sum of the products in step 2.
4. Divide the sum obtained in step 3 by the total number of frequencies. The result shall be rounded off to two decimal places.
Example: Consider the frequency distribution of the examination scores of the sixty students in a statistics class. (MIDPOINT METHOD)
Solution: To compute the value of the mean:
Step 1. Get the midpoint of each class. The midpoints are shown in the third column (x).
Step 2. Multiply each midpoint by its corresponding frequency. The products are shown in the fourth column (fx).
Step 3. Get the sum of the products in step 2.
Classes    f     x      fx
11 - 22    3    16.5     49.5
23 - 34    5    28.5    142.5
35 - 46   11    40.5    445.5
47 - 58   19    52.5    997.5
59 - 70   14    64.5    903
71 - 82    6    76.5    459
83 - 94    2    88.5    177
n = 60                  ∑fx = 3,174
Step 4. Divide the result in step 3 by the sample size. The result is the mean of the distribution.
$$\bar{x} = \frac{\sum fx}{n} = \frac{3{,}174}{60} = 52.90$$
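The four steps of the midpoint method can be sketched as one small Python function. The name `grouped_mean` and the (lower limit, upper limit, frequency) tuple layout are illustrative choices, not part of the source.

```python
# (lower limit, upper limit, frequency) for the sixty examination scores.
score_classes = [
    (11, 22, 3), (23, 34, 5), (35, 46, 11), (47, 58, 19),
    (59, 70, 14), (71, 82, 6), (83, 94, 2),
]

def grouped_mean(classes):
    """Midpoint method: sum of f*x divided by n, with x the class midpoint."""
    n = sum(f for _, _, f in classes)
    total = sum(f * (lower + upper) / 2 for lower, upper, f in classes)
    return total / n

print(round(grouped_mean(score_classes), 2))
```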


Example 2: Consider the frequency distribution of the ages of 75 mayors. Compute the mean age of the mayors.
Solution: Using the same procedure.
Classes    f     x      fx
25 - 30    3    27.5       82.5
31 - 36    6    33.5      201
37 - 42   11    39.5      434.5
43 - 48   27    45.5    1,228.5
49 - 54   16    51.5      824
55 - 60    7    57.5      402.5
61 - 66    4    63.5      254
67 - 72    1    69.5       69.5
n = 75                  ∑fx = 3,496.5

$$\bar{x} = \frac{\sum fx}{n} = \frac{3{,}496.5}{75} = 46.62$$
UNIT DEVIATION METHOD
The formula is:
$$\bar{x} = x_a + \left(\frac{\sum fd}{n}\right) c$$
Where: x_a – the assumed mean
f – the frequency of each class
d – the unit deviation
c – the size of the class interval
n – the sample size
Follow these steps:
1. Choose an assumed mean by getting the midpoint of any class interval.
2. Construct the unit deviation column.
3. Multiply the frequencies by their corresponding unit deviations. Add the products.
4. Divide the sum in step 3 by the sample size.
5. Multiply the result in step 4 by the size of the class interval.
6. Add the value obtained in step 5 to the assumed mean. The result, which is the mean, should be rounded off to two decimal places.
Example 1. Compute the value of the mean of the data using the unit deviation method.
Solution:
Step 1. Choose an assumed mean. Here the midpoint of the class 47 - 58 is used, so x_a = 52.5.
Step 2. Construct the unit deviation column (d).
Step 3. Multiply the frequencies by their corresponding unit deviations (fd). Add the products.
Classes    f     d     fd
11 - 22    3    -3     -9
23 - 34    5    -2    -10
35 - 46   11    -1    -11
47 - 58   19     0      0
59 - 70   14     1     14
71 - 82    6     2     12
83 - 94    2     3      6
n = 60                ∑fd = 2
Steps 4, 5 and 6.
$$\bar{x} = x_a + \left(\frac{\sum fd}{n}\right) c = 52.5 + \left(\frac{2}{60}\right) 12 = 52.5 + \frac{24}{60} = 52.5 + 0.4 = 52.9$$
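A minimal Python sketch of the unit deviation method follows. The function name `unit_deviation_mean` and the `assumed_index` parameter (the position of the class whose midpoint serves as the assumed mean) are illustrative, and the sketch assumes equal class sizes with integer class limits.

```python
# (lower limit, upper limit, frequency) for the sixty examination scores.
score_classes = [
    (11, 22, 3), (23, 34, 5), (35, 46, 11), (47, 58, 19),
    (59, 70, 14), (71, 82, 6), (83, 94, 2),
]

def unit_deviation_mean(classes, assumed_index):
    """Assumed mean plus (sum of f*d / n) times the class size c."""
    lower, upper, _ = classes[assumed_index]
    assumed_mean = (lower + upper) / 2           # midpoint of the chosen class
    c = upper - lower + 1                        # class size for integer limits
    n = sum(f for _, _, f in classes)
    # d counts how many classes each class lies above or below the chosen one.
    sum_fd = sum(f * (i - assumed_index) for i, (_, _, f) in enumerate(classes))
    return assumed_mean + sum_fd / n * c

print(round(unit_deviation_mean(score_classes, 3), 2))   # assumed mean 52.5
print(round(unit_deviation_mean(score_classes, 0), 2))   # assumed mean 16.5
```

Any class may supply the assumed mean; the result is the same, and it agrees with the midpoint method.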
MEDIAN
• is a positional measure defined as the middlemost value in the distribution.
MEDIAN FOR UNGROUPED DATA
• the values must always be arranged in order of magnitude, either from lowest to highest or vice versa.
Let x̃ be the median.
$$\tilde{x} = x_{\frac{n+1}{2}} \quad \text{if } n \text{ is odd}$$
$$\tilde{x} = \frac{x_{\frac{n}{2}} + x_{\frac{n}{2}+1}}{2} \quad \text{if } n \text{ is even}$$
Example 1. Find the median of the following values.
21, 10, 36, 42, 39, 52, 30, 25, 26
Solution: Before identifying the value of the median, the values must be arranged in order of magnitude.
10, 21, 25, 26, 30, 36, 39, 42, 52
Since n = 9 is odd,
$$\tilde{x} = x_{\frac{n+1}{2}} = x_{\frac{9+1}{2}} = x_5 \text{ (the fifth value)} = 30$$


Example 2. The following values are the numbers of students in the first 8 classes of a certain college taken for inspection:
21, 25, 26, 30, 36, 39, 42, 55
Determine the median.
Solution: The values are already arranged in order of magnitude. Since n = 8 is even,
$$\tilde{x} = \frac{x_{\frac{n}{2}} + x_{\frac{n}{2}+1}}{2} = \frac{x_{\frac{8}{2}} + x_{\frac{8}{2}+1}}{2} = \frac{x_4 + x_5}{2} = \frac{30 + 36}{2} = 33$$
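The odd and even cases can be written as one small Python function. This is a sketch: the name `median_ungrouped` is illustrative, and Python's zero-based indexing shifts the positions by one (the fifth value is `ordered[4]`).

```python
def median_ungrouped(values):
    """Middle value for odd n; mean of the two middle values for even n."""
    ordered = sorted(values)          # arrange in order of magnitude first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]           # the ((n + 1) / 2)th value
    return (ordered[mid - 1] + ordered[mid]) / 2   # (n/2)th and (n/2 + 1)th values

print(median_ungrouped([21, 10, 36, 42, 39, 52, 30, 25, 26]))
print(median_ungrouped([21, 25, 26, 30, 36, 39, 42, 55]))
```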
MEDIAN FOR GROUPED DATA
The computing formula for grouped data is given below.
$$\tilde{x} = x_{lb} + \left(\frac{\frac{n}{2} - \text{cumf}_b}{f}\right) c$$
Where:
x_lb – the lower boundary of the median class
cumf_b – the cumulative frequency before the median class
f – the frequency of the median class
c – the size of the class interval
n – the total number of values
To apply the formula, follow the steps below.
1. Get ½ of the total number of values.
2. Determine the value of cumf_b.
3. Determine the median class.
4. Determine the lower boundary and the frequency of the median class, and the size of the class interval.
5. Substitute the values obtained in steps 1 to 4. Round off the final result to two decimal places.


Example: Compute the value of the median of the examination scores of the students in Statistics.
Solution: We shall first construct the less than cumulative frequency column.
Classes    f    <cumf
11 - 22    3     3
23 - 34    5     8
35 - 46   11    19    (cumf before the median class)
47 - 58   19    38    (median class, f = 19)
59 - 70   14    52
71 - 82    6    58
83 - 94    2    60

Steps:
1. n/2 = 60/2 = 30
2. cumf_b = 19
3. Median class: 47 – 58
4. x_lb = 46.5 ; f = 19 ; c = 12
5. Substituting these values:
$$\tilde{x} = 46.5 + \left(\frac{30 - 19}{19}\right) 12 = 46.5 + 6.95 = 53.45$$
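The five steps can be sketched in Python as follows. The function name `grouped_median` and the tuple layout are illustrative, and the sketch assumes equal class sizes with integer class limits, so that each lower boundary is the lower limit minus 0.5.

```python
# (lower limit, upper limit, frequency) for the sixty examination scores.
score_classes = [
    (11, 22, 3), (23, 34, 5), (35, 46, 11), (47, 58, 19),
    (59, 70, 14), (71, 82, 6), (83, 94, 2),
]

def grouped_median(classes):
    """Lower boundary + ((n/2 - cumf before) / f) * c for the median class."""
    n = sum(f for _, _, f in classes)
    c = classes[0][1] - classes[0][0] + 1        # class size
    half = n / 2                                 # step 1
    cum_before = 0                               # step 2, updated while scanning
    for lower, _, f in classes:
        if cum_before + f >= half:               # step 3: median class found
            return (lower - 0.5) + (half - cum_before) / f * c   # steps 4 and 5
        cum_before += f
    raise ValueError("empty frequency distribution")

print(round(grouped_median(score_classes), 2))
```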


Example 2. A researcher is conducting an investigation regarding the income of the alumni of a certain university 5 years after graduation. The monthly incomes of the 200 respondents were taken and are presented below.
Classes            f
 3,500 –  4,999     6
 5,000 –  6,499    23
 6,500 –  7,999    36
 8,000 –  9,499    40
 9,500 – 10,999    59
11,000 – 12,499    20
12,500 – 13,999     8
14,000 – 15,499     6
15,500 – 16,999     2
n = 200
Determine the median of the monthly income of the 200 respondents.
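Solution: Using the same procedure.
Classes            f    <cumf
 3,500 –  4,999     6      6
 5,000 –  6,499    23     29
 6,500 –  7,999    36     65
 8,000 –  9,499    40    105    median class
 9,500 – 10,999    59    164
11,000 – 12,499    20    184
12,500 – 13,999     8    192
14,000 – 15,499     6    198
15,500 – 16,999     2    200

Steps:
1. n/2 = 200/2 = 100
2. cumf_b = 65
3. Median class: 8,000 – 9,499 (the first class whose <cumf reaches 100)
4. x_lb = 7,999.5 ; f = 40 ; c = 1,500
5. Substituting these values:
$$\tilde{x} = 7{,}999.5 + \left(\frac{100 - 65}{40}\right) 1{,}500 = 7{,}999.5 + 1{,}312.5 = 9{,}312.00$$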


MODE
• this type of average is the simplest both in concept and in application. It is the value that occurs most frequently in the distribution.
MODE FOR UNGROUPED DATA
• the value of the mode can be obtained through inspection; thus, no computation is needed.
Example 1. Consider the following sets of measurements.
a: 21, 23, 16, 15, 26, 27, 19, 24
b: 31, 21, 16, 15, 21, 27, 19, 18
c: 17, 25, 24, 25, 27, 19, 19, 24
Solution:
a: no value occurs more than once, so there is no mode.
b: x̂ = 21
c: x̂ = 25, 19, 24
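Inspection amounts to counting occurrences, which can be sketched in Python with `collections.Counter`. The function name `modes` is illustrative; returning an empty list when no value repeats follows the convention used in set a above.

```python
from collections import Counter

def modes(values):
    """Return every value with the highest count, or [] if nothing repeats."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []                     # no value occurs more than once
    return [value for value, count in counts.items() if count == top]

a = [21, 23, 16, 15, 26, 27, 19, 24]
b = [31, 21, 16, 15, 21, 27, 19, 18]
c = [17, 25, 24, 25, 27, 19, 19, 24]

print(modes(a), modes(b), sorted(modes(c)))
```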
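MODE FOR GROUPED DATA
• it is necessary to identify the class interval that contains the mode.
MODAL CLASS
• the class that contains the highest frequency in the distribution.

The mode is computed as
$$\hat{x} = x_{lb} + \left(\frac{d_1}{d_1 + d_2}\right) c$$
where x_lb is the lower boundary of the modal class, d₁ is the difference between the frequency of the modal class and the frequency of the class before it, d₂ is the difference between the frequency of the modal class and the frequency of the class after it, and c is the size of the class interval.
To apply the formula, follow these steps:
1. Determine the modal class.
2. Get the value of d₁.
3. Get the value of d₂.
4. Get the lower boundary of the modal class.
5. Apply the formula by substituting the values obtained in the preceding steps.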
Example 1: Consider the frequency distribution of the examination scores of sixty students. Compute the mode of that distribution.
Solution: The frequency distribution of the data is reproduced below.
Classes    f
11 - 22    3
23 - 34    5
35 - 46   11
47 - 58   19    modal class
59 - 70   14
71 - 82    6
83 - 94    2

To get the values of d₁ and d₂, we have
d₁ = 19 – 11 = 8
d₂ = 19 – 14 = 5
Substituting these values, with lower boundary 46.5 and class size 12:
$$\hat{x} = 46.5 + \left(\frac{8}{8 + 5}\right) 12 = 46.5 + 7.38 = 53.88$$
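The grouped mode can be sketched in Python as below. The slides do not print the formula itself, so the standard expression used here, lower boundary + d₁/(d₁ + d₂) × c, is an assumption consistent with the d₁ and d₂ values of the example. The function name is illustrative, and a single modal class with equal class sizes is assumed.

```python
# (lower limit, upper limit, frequency) for the sixty examination scores.
score_classes = [
    (11, 22, 3), (23, 34, 5), (35, 46, 11), (47, 58, 19),
    (59, 70, 14), (71, 82, 6), (83, 94, 2),
]

def grouped_mode(classes):
    """Lower boundary of the modal class + d1 / (d1 + d2) * c."""
    freqs = [f for _, _, f in classes]
    i = freqs.index(max(freqs))                            # modal class position
    before = freqs[i - 1] if i > 0 else 0                  # frequency of class before
    after = freqs[i + 1] if i < len(freqs) - 1 else 0      # frequency of class after
    d1 = freqs[i] - before
    d2 = freqs[i] - after
    c = classes[i][1] - classes[i][0] + 1                  # class size
    return (classes[i][0] - 0.5) + d1 / (d1 + d2) * c

print(round(grouped_mode(score_classes), 2))
```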


QUARTILES
• refer to the values that divide the distribution into four equal parts. There are 3 quartiles, represented by Q₁, Q₂, and Q₃.
Q₁ is the value below which the first one fourth of the distribution, arranged in order of magnitude, falls.
Q₂ corresponds to the median.
Q₃ corresponds to three fourths of the distribution.

| <…………………..3/4……………….>|
|<……………1/2…………..>|
|<.…1/4…..>|

The Procedure for the First (Q₁), Second (Q₂), and Third (Q₃) Quartiles in a Given Set of Data


For grouped data, the procedure for computing the first and the third quartiles is similar to that for computing the median. The kth quartile, where k = 1, 2, 3, is given by
$$Q_k = x_{lb} + \left(\frac{\frac{kn}{4} - \text{cumf}_b}{f_Q}\right) c$$
Where x_lb – the lower boundary of the kth quartile class
cumf_b – the cumulative frequency before the kth quartile class
f_Q – the frequency of the kth quartile class
c – the size of the class interval

Example 1. For purposes of illustration, let us again reproduce the less than cumulative frequency distribution of the examination results of the 60 students and compute the values of the first quartile and the third quartile.
Solution: The frequency distribution is reproduced below.
Classes    f    <cumf
11 - 22    3     3
23 - 34    5     8
35 - 46   11    19
47 - 58   19    38
59 - 70   14    52
71 - 82    6    58
83 - 94    2    60

To compute the value of Q₁, we follow the procedure used in computing the value of the median.


Steps:
1. Get ¼ of the total number of frequencies.
n/4 = 60/4 = 15
2. Get the value of the cumulative frequency before the first quartile class.
cumf_b = 8
3. Determine the first quartile class.
1st quartile class: 35 – 46
4. Determine the lower boundary of the first quartile class.
x_lb = 34.5
5. Get the frequency of the first quartile class.
f_Q = 11
6. Substitute all values and compute, with c = 12.
$$Q_1 = 34.5 + \left(\frac{15 - 8}{11}\right) 12 = 34.5 + 7.64 = 42.14$$


To compute the third quartile, we follow the procedure used in computing the value of the first quartile:
Steps:
1. 3n/4 = 3(60)/4 = 180/4 = 45
2. cumf_b = 38
3. Third quartile class: 59 – 70
4. x_lb = 58.5
5. f_Q = 14
6. Substitute all values and compute, with c = 12.
$$Q_3 = 58.5 + \left(\frac{45 - 38}{14}\right) 12 = 58.5 + 6 = 64.50$$


DECILES
• if a set of data is divided into ten equal parts, then we have nine points of division, called deciles. The method of computing these values is the same as for the median or the quartiles.


Example: Using the same frequency distribution as in the preceding example, determine the value of the following:
a. D₃
b. D₅
Solution: The frequency distribution is reproduced below.
Classes    f    <cumf
11 - 22    3     3
23 - 34    5     8
35 - 46   11    19
47 - 58   19    38
59 - 70   14    52
71 - 82    6    58
83 - 94    2    60

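a. To compute the value of D₃, we have
1. 3n/10 = 3(60)/10 = 180/10 = 18
2. cumf_b = 8
3. 3rd decile class: 35 – 46
4. x_lb = 34.5
5. f = 11
6. Substituting these values, with c = 12:
$$D_3 = 34.5 + \left(\frac{18 - 8}{11}\right) 12 = 34.5 + 10.91 = 45.41$$

b. To compute the value of D₅, we have
1. 5n/10 = 5(60)/10 = 300/10 = 30
2. cumf_b = 19
3. Fifth decile class: 47 – 58
4. x_lb = 46.5 ; f = 19
5. Substituting these values, with c = 12:
$$D_5 = 46.5 + \left(\frac{30 - 19}{19}\right) 12 = 46.5 + 6.95 = 53.45$$
The fifth decile equals the median.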


PERCENTILES
• refer to the values that divide a distribution into one hundred equal parts.

Example: Determine the value of the 43rd percentile using the same frequency distribution as for the median, quartiles, and deciles.
Solution: The frequency distribution is reproduced below.
Classes    f    <cumf
11 - 22    3     3
23 - 34    5     8
35 - 46   11    19
47 - 58   19    38
59 - 70   14    52
71 - 82    6    58
83 - 94    2    60


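To compute the value of P₄₃, we have
Steps:
1. 43n/100 = 43(60)/100 = 2,580/100 = 25.8
2. cumf_b = 19
3. 43rd percentile class: 47 – 58
4. x_lb = 46.5
5. f = 19
6. Substituting these values, with c = 12:
$$P_{43} = 46.5 + \left(\frac{25.8 - 19}{19}\right) 12 = 46.5 + 4.29 = 50.79$$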
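Quartiles, deciles, and percentiles all follow the same interpolation and differ only in how many parts the distribution is divided into. The sketch below captures that with one function; the name `grouped_fractile` and its parameters `k` and `parts` are illustrative, and equal class sizes with integer class limits are assumed.

```python
# (lower limit, upper limit, frequency) for the sixty examination scores.
score_classes = [
    (11, 22, 3), (23, 34, 5), (35, 46, 11), (47, 58, 19),
    (59, 70, 14), (71, 82, 6), (83, 94, 2),
]

def grouped_fractile(classes, k, parts):
    """kth point of division when the distribution is split into `parts` parts.

    parts=4 gives quartiles, parts=10 deciles, parts=100 percentiles.
    """
    n = sum(f for _, _, f in classes)
    c = classes[0][1] - classes[0][0] + 1      # class size
    target = k * n / parts                     # e.g. 43n/100 for P43
    cum_before = 0
    for lower, _, f in classes:
        if cum_before + f >= target:           # class containing the fractile
            return (lower - 0.5) + (target - cum_before) / f * c
        cum_before += f
    raise ValueError("k must not exceed parts")

print(round(grouped_fractile(score_classes, 1, 4), 2))     # Q1
print(round(grouped_fractile(score_classes, 3, 4), 2))     # Q3
print(round(grouped_fractile(score_classes, 3, 10), 2))    # D3
print(round(grouped_fractile(score_classes, 43, 100), 2))  # P43
```

Because Q₂, D₅, and P₅₀ all use the target n/2, they return the same value as the median.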


SEMI-INTERQUARTILE RANGE OR QUARTILE DEVIATION
• this value is obtained by getting one half of the difference between the third and the first quartiles.
$$Q = \frac{Q_3 - Q_1}{2}$$
Example 1: The examination scores of 50 students in a statistics class resulted in the values Q₃ = 75.43 and Q₁ = 54.24. Determine the value of the semi-interquartile range.
Solution: To compute the value of Q,
$$Q = \frac{Q_3 - Q_1}{2} = \frac{75.43 - 54.24}{2} = \frac{21.19}{2} = 10.60$$
MEASURES OF VARIATION
• these are the values used to determine the scatter of values in a distribution.
RANGE
R = H – L
Example 1. Determine the value of the range of the data.
Solution: For grouped data, H is taken as the upper boundary of the highest class and L as the lower boundary of the lowest class.
Classes    f
11 - 22    3
23 - 34    5
35 - 46   11
47 - 58   19
59 - 70   14
71 - 82    6
83 - 94    2
n = 60

R = H – L
= 94.5 – 10.5
R = 84


Example 2: Suppose the performance ratings of 100 faculty members of a certain college were taken and are presented in a frequency distribution as follows:
Classes    f
71 - 74    3
75 - 78   10
79 - 82   13
83 - 86   18
87 - 90   25
91 - 94   19
95 - 98   12
n = 100
Compute the value of the semi-interquartile range.


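Solution: We first compute the values of Q₃ and Q₁, since only the frequency distribution is given.
Classes    f    <cumf
71 - 74    3      3
75 - 78   10     13
79 - 82   13     26    1st quartile class
83 - 86   18     44
87 - 90   25     69
91 - 94   19     88    3rd quartile class
95 - 98   12    100

For Q₁: n/4 = 25, cumulative frequency before the class = 13, lower boundary = 78.5, frequency = 13, c = 4.
$$Q_1 = 78.5 + \left(\frac{25 - 13}{13}\right) 4 = 78.5 + 3.69 = 82.19$$
For Q₃: 3n/4 = 75, cumulative frequency before the class = 69, lower boundary = 90.5, frequency = 19, c = 4.
$$Q_3 = 90.5 + \left(\frac{75 - 69}{19}\right) 4 = 90.5 + 1.26 = 91.76$$
$$Q = \frac{Q_3 - Q_1}{2} = \frac{91.76 - 82.19}{2} \approx 4.79$$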

AVERAGE DEVIATION
• Refers to the arithmetic mean of the absolute deviations of the values from the
mean of the distribution.
AVERAGE DEVIATION FOR UNGROUPED DATA
AD = ∑│x - x̄│ / n
We shall follow the steps below:
1. Arrange the values in a column according to magnitude.
2. Compute the value of the mean (x̄).
3. Determine the deviation (x - x̄) of each value.
4. Convert each deviation in step 3 into a positive deviation by taking its
absolute value, │x - x̄│.
5. Get the sum of the absolute deviations in step 4.
6. Divide the sum in step 5 by n.

Example. Consider the following values.
x: 13, 16, 9, 6, 15, 7, 11
Determine the value of the average deviation.
Solution: First, we arrange the values in a vertical column and then compute the
value of the mean.
x
6
7
9
11
13
15
16
∑x = 77
x̄ = ∑x / n = 77 / 7 = 11
We get the deviations of the individual values from the mean.
x     x - x̄
6     6 – 11 = -5
7     7 – 11 = -4
9     9 – 11 = -2
11    11 – 11 = 0
13    13 – 11 = 2
15    15 – 11 = 4
16    16 – 11 = 5

Notice that some of the deviations from the mean are negative. Hence, we take the
absolute value of each deviation so that all deviations are positive, and then add
these absolute deviations.

x     x - x̄    │x - x̄│
6     -5       5
7     -4       4
9     -2       2
11    0        0
13    2        2
15    4        4
16    5        5
∑│x - x̄│ = 22

If we divide the sum of the absolute deviations by n, we obtain the value of the
average deviation.

AD = ∑│x - x̄│ / n = 22 / 7 = 3.14
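
The six steps for ungrouped data can be sketched in a few lines of code. This is a minimal illustration, and the function name `average_deviation` is hypothetical. Sorting (step 1) is omitted because it does not affect the result.

```python
def average_deviation(values):
    """Mean of the absolute deviations from the arithmetic mean."""
    n = len(values)
    mean = sum(values) / n                        # step 2: the mean
    abs_devs = [abs(x - mean) for x in values]    # steps 3 and 4: |x - mean|
    return sum(abs_devs) / n                      # steps 5 and 6: sum, divide by n


scores = [13, 16, 9, 6, 15, 7, 11]
ad = average_deviation(scores)
print(round(ad, 2))  # 3.14
```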

AVERAGE DEVIATION FOR GROUPED DATA
For grouped data, the computing formula for the mean absolute deviation or
average deviation is given by:
AD = ∑f│x - x̄│ / n
where x is the class midpoint.
We shall follow the steps below.
1. Compute the value of the mean.
2. Get the deviation of each class midpoint by using the expression x - x̄.
3. Convert each deviation into a positive deviation, │x - x̄│.
4. Multiply each absolute deviation by its corresponding frequency.
5. Add the results in step 4.
6. Divide the sum in step 5 by n.

Example: Compute the value of the average deviation of the frequency distribution.
Solution: First, we compute the value of the mean.
Classes    f     x       fx
11 – 22    3     16.5    49.5
23 – 34    5     28.5    142.5
35 – 46    11    40.5    445.5
47 – 58    19    52.5    997.5
59 – 70    14    64.5    903
71 – 82    6     76.5    459
83 – 94    2     88.5    177
n = 60           ∑fx = 3,174

x̄ = ∑fx / n = 3,174 / 60 = 52.9

Second, we construct the deviation column x - x̄.

Classes    f     x       fx       x - x̄
11 – 22    3     16.5    49.5     -36.4
23 – 34    5     28.5    142.5    -24.4
35 – 46    11    40.5    445.5    -12.4
47 – 58    19    52.5    997.5    -0.4
59 – 70    14    64.5    903      11.6
71 – 82    6     76.5    459      23.6
83 – 94    2     88.5    177      35.6

Third, we convert each deviation into a positive deviation.

Classes    f     x       fx       x - x̄    │x - x̄│
11 – 22    3     16.5    49.5     -36.4    36.4
23 – 34    5     28.5    142.5    -24.4    24.4
35 – 46    11    40.5    445.5    -12.4    12.4
47 – 58    19    52.5    997.5    -0.4     0.4
59 – 70    14    64.5    903      11.6     11.6
71 – 82    6     76.5    459      23.6     23.6
83 – 94    2     88.5    177      35.6     35.6

Fourth, we multiply the positive deviations by their corresponding frequencies.
Classes    f     x       fx       x - x̄    │x - x̄│    f│x - x̄│
11 – 22    3     16.5    49.5     -36.4    36.4       109.2
23 – 34    5     28.5    142.5    -24.4    24.4       122
35 – 46    11    40.5    445.5    -12.4    12.4       136.4
47 – 58    19    52.5    997.5    -0.4     0.4        7.6
59 – 70    14    64.5    903      11.6     11.6       162.4
71 – 82    6     76.5    459      23.6     23.6       141.6
83 – 94    2     88.5    177      35.6     35.6       71.2
∑f│x - x̄│ = 750.4

AD = ∑f│x - x̄│ / n
   = 750.4 / 60
AD = 12.51
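
The grouped computation can be checked with a short sketch. It assumes the class midpoints 16.5, 28.5, ..., 88.5 and the frequencies from the table; the function name `grouped_average_deviation` is hypothetical. With these inputs the absolute deviation for the class 23 – 34 is │28.5 - 52.9│ = 24.4, the weighted sum is 750.4, and the result is about 12.51.

```python
def grouped_average_deviation(midpoints, freqs):
    """Average deviation of grouped data, using class midpoints as x."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(midpoints, freqs)) / n
    # Weight each absolute deviation by its class frequency, then average.
    weighted = sum(f * abs(x - mean) for x, f in zip(midpoints, freqs))
    return weighted / n


midpoints = [16.5, 28.5, 40.5, 52.5, 64.5, 76.5, 88.5]
freqs = [3, 5, 11, 19, 14, 6, 2]
ad_grouped = grouped_average_deviation(midpoints, freqs)
print(round(ad_grouped, 2))  # 12.51
```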

VARIANCE FOR UNGROUPED DATA
If we let s² be the variance, then we have
s² = ∑(x - x̄)² / n

We shall consider the following steps.

1. Compute the value of the mean.
2. Get the deviation of each value from the mean.
3. Square the deviations.
4. Calculate the sum of the squared deviations.
5. Divide the sum by the total number of values.
Example: Compute the value of the variance of the following measurements.
13, 5, 7, 9, 10, 17, 15, 12
Solution: For simplicity, we first arrange these values by magnitude in a vertical
column and then follow the steps indicated above.
x     x - x̄    (x - x̄)²
5     -6       36
7     -4       16
9     -2       4
10    -1       1
12    1        1
13    2        4
15    4        16
17    6        36
∑x = 88        ∑(x - x̄)² = 114

x̄ = ∑x / n = 88 / 8 = 11
s² = ∑(x - x̄)² / n = 114 / 8 = 14.25
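
The five steps translate directly into code. The sketch below divides by n, as the formula in this section does, rather than by n - 1; the function name `population_variance` is hypothetical.

```python
def population_variance(values):
    """Variance with divisor n: the mean of the squared deviations."""
    n = len(values)
    mean = sum(values) / n                              # step 1
    squared_devs = [(x - mean) ** 2 for x in values]    # steps 2 and 3
    return sum(squared_devs) / n                        # steps 4 and 5


measurements = [13, 5, 7, 9, 10, 17, 15, 12]
s2 = population_variance(measurements)
print(s2)  # 14.25
```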

The variance can also be computed directly from the raw values:
s² = ∑x² / n - (∑x / n)²
Example: Find the value of the variance of the distribution used in example 1 of this
section.
Solution: By substituting into the formula above, we have
x     x²
5     25
7     49
9     81
10    100
12    144
13    169
15    225
17    289
∑x = 88    ∑x² = 1,082

s² = ∑x² / n - (∑x / n)²
   = 1,082 / 8 - (88 / 8)²
   = 135.25 - 7,744 / 64
   = 135.25 - 121
s² = 14.25
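
Both formulas give 14.25 because the second is an algebraic rearrangement of the first. The short derivation below sketches why, using only the definition of the mean, x̄ = ∑x / n.

```latex
\begin{aligned}
s^2 &= \frac{1}{n}\sum (x - \bar{x})^2
     = \frac{1}{n}\sum \left(x^2 - 2\bar{x}\,x + \bar{x}^2\right) \\
    &= \frac{\sum x^2}{n} - 2\bar{x}\,\frac{\sum x}{n} + \bar{x}^2
     = \frac{\sum x^2}{n} - \bar{x}^2 \\
    &= \frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2
\end{aligned}
```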

VARIANCE FOR GROUPED DATA
s² = ∑f(x - x̄)² / n
We shall consider the steps below.
1. Compute the value of the mean.
2. Determine the deviation x - x̄ by subtracting the mean from
the midpoint of each class interval.
3. Square the deviations obtained in step 2.
4. Multiply each squared deviation by its corresponding
frequency.
5. Add the results in step 4.
6. Divide the result in step 5 by the sample size.

Example. Calculate the value of the variance of the distribution.
Solution: First, we shall reproduce the frequency distribution.
Classes    f     x       fx       x - x̄    (x - x̄)²    f(x - x̄)²
11 – 22    3     16.5    49.5     -36.4    1,324.96    3,974.88
23 – 34    5     28.5    142.5    -24.4    595.36      2,976.8
35 – 46    11    40.5    445.5    -12.4    153.76      1,691.36
47 – 58    19    52.5    997.5    -0.4     0.16        3.04
59 – 70    14    64.5    903      11.6     134.56      1,883.84
71 – 82    6     76.5    459      23.6     556.96      3,341.76
83 – 94    2     88.5    177      35.6     1,267.36    2,534.72
n = 60           ∑fx = 3,174               ∑f(x - x̄)² = 16,406.4

x̄ = ∑fx / n = 3,174 / 60 = 52.9
s² = ∑f(x - x̄)² / n = 16,406.4 / 60 = 273.44
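
The table above can be reproduced with a short sketch that uses the class midpoints as x and divides by n. The function name `grouped_variance` is hypothetical.

```python
def grouped_variance(midpoints, freqs):
    """Variance of grouped data: frequency-weighted mean squared deviation."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(midpoints, freqs)) / n
    # Sum of f * (x - mean)^2 over all classes, divided by n.
    return sum(f * (x - mean) ** 2 for x, f in zip(midpoints, freqs)) / n


midpoints = [16.5, 28.5, 40.5, 52.5, 64.5, 76.5, 88.5]
freqs = [3, 5, 11, 19, 14, 6, 2]
s2_grouped = grouped_variance(midpoints, freqs)
print(round(s2_grouped, 2))  # 273.44
```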
The variance of grouped data can also be computed using the unit deviation method:
s² = [∑fd² / n - (∑fd / n)²] c²
where d is the unit deviation and c is the class size.
The procedure for the computation of the variance using the unit
deviation method is as follows:
1. Determine the unit deviation column.
2. Multiply each frequency by its corresponding unit deviation.
3. Square the unit deviations.
4. Multiply each squared unit deviation by its corresponding
frequency.
5. Add the results in step 2.
6. Add the results in step 4.
7. Apply the formula through substitution.

Example: Using the data in example 3, compute the value of the variance using the
unit deviation method.
Solution:
Classes    f     d     fd     d²    fd²
11 – 22    3     -3    -9     9     27
23 – 34    5     -2    -10    4     20
35 – 46    11    -1    -11    1     11
47 – 58    19    0     0      0     0
59 – 70    14    1     14     1     14
71 – 82    6     2     12     4     24
83 – 94    2     3     6      9     18
n = 60           ∑fd = 2            ∑fd² = 114

s² = [∑fd² / n - (∑fd / n)²] c²
s² = [114 / 60 - (2 / 60)²] (12)²
s² = 273.44
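
The unit deviation method can be sketched as follows. The block assumes equal class sizes and takes the class 47 – 58 (index 3, counting from 0) as the origin where d = 0, as in the table. The function name `variance_unit_deviation` and its parameters are hypothetical.

```python
def variance_unit_deviation(freqs, origin_index, class_size):
    """Grouped variance from unit deviations d = class index - origin index."""
    n = sum(freqs)
    d = [i - origin_index for i in range(len(freqs))]        # step 1
    sum_fd = sum(f * di for f, di in zip(freqs, d))          # steps 2 and 5
    sum_fd2 = sum(f * di ** 2 for f, di in zip(freqs, d))    # steps 3, 4 and 6
    return (sum_fd2 / n - (sum_fd / n) ** 2) * class_size ** 2   # step 7


freqs = [3, 5, 11, 19, 14, 6, 2]
s2_coded = variance_unit_deviation(freqs, origin_index=3, class_size=12)
print(round(s2_coded, 2))  # 273.44
```

The result does not depend on which class is chosen as the origin, because shifting d by a constant changes ∑fd²/n and (∑fd/n)² by the same amount.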


STANDARD DEVIATION
• The standard deviation is obtained by extracting the square root of the value
of the variance.

Example. Suppose the value of the variance of a set of
measurements was computed to be equal to 128.93.
Determine the value of the standard deviation.
Solution: The standard deviation is simply the square root
of the variance.
s = √s² = √128.93 = 11.35
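
As a quick check, the square root can be taken with Python's standard library. The second half of the sketch is an added illustration: it applies `statistics.pstdev`, which divides by n like the variance formulas in this section, to the eight measurements from the ungrouped variance example (s² = 14.25).

```python
import math
import statistics

# Standard deviation from a known variance.
s = math.sqrt(128.93)
print(round(s, 2))  # 11.35

# Standard deviation computed directly from raw data (divisor n).
measurements = [13, 5, 7, 9, 10, 17, 15, 12]
s_ungrouped = statistics.pstdev(measurements)
print(round(s_ungrouped, 2))  # 3.77
```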
