Module 0. Review On Statistics
Part 1
Data (Datum)
1. Descriptive Statistics
- deals largely with summary calculations, graphical
displays, and descriptions of the important features of a
set of data. It does not attempt to draw conclusions or
insights about anything beyond the data themselves.
2. Inductive / Inferential Statistics
- is concerned with making generalizations from
information gathered from a small group of
observations (sample) to a bigger group of
observations (population).
- It is equipped with an enormous number of
analytical tools that allow the investigator to gain a
better understanding of the population from
which the sample data were gathered, based only on the
information contained in the sample.
Measurement
1. Identity
– enables a person to distinguish one number
from another.
– Numbers are identified by their shapes or the way they
are written.
– This is the simplest property of numbers.
2. Order
– it refers to the way numbers are arranged in a
sequence.
– It is an established convention that 1 comes
before 2, 2 comes before 3, and so on. We also say
that “7 is greater than 6” or “3 is less than 5”.
3. Additivity / equality of scale
1. What type of data are we getting from the following? Write
‘qt’ if quantitative or ‘ql’ if qualitative in the space provided.
___ width ___ scores ___ mass ___ temperature in °C
___ ratings in % ___ gender ___ income ___ height in ft
___ dean’s list ___ preferences ___ perceptions ___ civil status
2. What type of measurement are we getting from the following? Write N if
nominal, O if ordinal, I if interval, or R if ratio:
___ width ___ scores ___ mass ___ temperature in °C
___ ratings in % ___ gender ___ income ___ height in ft
___ dean’s list ___ preferences ___ perceptions ___ civil status
3.
Four Methods of Collecting Data
• By Interview
• By Questionnaires
• By Direct Observation
• By Utilizing Existing Records
– published or unpublished
– primary (first-hand records that have not been subjected to
any transcription or condensation) or secondary
(transcribed or compiled from original sources)
Sampling Techniques
• Doing a census, that is, studying the entire
population, is not always feasible because of
limited resources such as money and time.
• Researchers therefore often resort to sample
surveys.
• To make reliable inferences about the
population from which the sample was taken, one
should select a sample that represents the
population well, that is, an unbiased sample.
Two Major Sampling Techniques
• Probability/Random Sampling
– A kind of sample selection where each member of
the target population has a known, non-zero
chance of being selected
• Non-probability/Non-random Sampling
Probability/Random Sampling
– Simple random
– Systematic
– Stratified
– Cluster
– Multi-stage
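As a minimal sketch of the first two techniques in this list, the Python below draws a simple random sample and a systematic sample. The function names, the sampling frame of 100 student IDs, and the interval rule k = N // n are assumptions made for illustration; the sketch also assumes n is no larger than the population size.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct members; every member has the same chance of selection."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, n, seed=None):
    """Pick a random start, then take every k-th member, with k = N // n."""
    rng = random.Random(seed)
    k = len(population) // n          # sampling interval (assumed rule)
    start = rng.randrange(k)          # random start within the first interval
    return [population[start + i * k] for i in range(n)]

students = list(range(1, 101))        # hypothetical sampling frame: IDs 1..100
srs = simple_random_sample(students, 10, seed=1)
sys_sample = systematic_sample(students, 10, seed=1)
print(srs)
print(sys_sample)
```

The seed is fixed only so that the draw can be repeated; in practice it would be left unset.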
Non-probability/Non-random Sampling
– Convenience/availability
– Quota
– Purposive/Judgmental
– Snowball/Referral
– Panel
Score Data
• After checking your students’ test papers, you
have a set of data that needs to be
organized. Statistical organization of scores is the
systematic arrangement or grouping of scores.
Its purpose is to bring out what the scores
mean.
Presentation of data
• Tabulating
• Ordering
• Ranking
• Grouping
• Graphing
Tabulating: The talligram
• Scores:
86 74 66 70 75 56 69
70 73 66 74 81 60 76
80 81 61 67 63 68 73
63 75 71 58 72 83 69
79 67 68 64 69 73 69
78 88 62 76 72 65 66
70 73 61 78 84 77
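(rows: tens digit; columns: units digit; T: total; - marks an empty cell)
tens\units 0     1     2     3     4     5     6     7     8     9     T
8          1     11    -     1     1     -     1     -     1     -     7
7          111   1     11    1111  11    11    11    1     11    1     20
6          1     11    1     11    1     1     111   11    11    1111  19
5          -     -     -     -     -     -     1     -     1     -     2
T          5     5     3     7     4     3     7     3     6     5     48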
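The talligram can also be built programmatically. The sketch below is one possible way to do it, not the method used in the slides: it splits each of the 48 scores into a tens digit and a units digit and counts the scores in each cell. The variable names are illustrative.

```python
from collections import Counter

scores = [
    86, 74, 66, 70, 75, 56, 69,
    70, 73, 66, 74, 81, 60, 76,
    80, 81, 61, 67, 63, 68, 73,
    63, 75, 71, 58, 72, 83, 69,
    79, 67, 68, 64, 69, 73, 69,
    78, 88, 62, 76, 72, 65, 66,
    70, 73, 61, 78, 84, 77,
]

# Count how many scores fall in each (tens digit, units digit) cell.
cells = Counter(divmod(s, 10) for s in scores)

tens_digits = sorted({t for t, _ in cells}, reverse=True)
row_totals = {t: sum(cells[(t, u)] for u in range(10)) for t in tens_digits}
col_totals = [sum(cells[(t, u)] for t in tens_digits) for u in range(10)]

# Print the grid with one tally mark ("1") per score.
print("   " + " ".join(f"{u:>4}" for u in range(10)) + "    T")
for t in tens_digits:
    marks = " ".join(f"{'1' * cells[(t, u)]:>4}" for u in range(10))
    print(f"{t:>3}" + marks + f"{row_totals[t]:>5}")
print("  T" + " ".join(f"{c:>4}" for c in col_totals) + f"{len(scores):>5}")
```

The row and column totals give a quick check that no score was skipped or counted twice.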
Ordering
– The data are arranged in descending (highest to lowest)
or ascending (lowest to highest) order, writing each
score as many times as it occurs. An ordered
arrangement of scores is a prerequisite to ranking
them.
Ranking
• Arrange the scores in descending order (from highest to lowest).
Write each score as many times as it occurs in one column.
This is the first column.
• Number the scores consecutively from 1 to n, where n is
the number of scores. This is the second column.
• In the third column, write the rank of each score.
– The rank of a score occurring once is the same as its consecutive
number.
– To find the rank of a score occurring two or more times, add the first
and the last consecutive numbers of the score and divide the sum by
two. The result is the rank.
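The three-column ranking procedure above can be sketched in Python as follows. The function name `rank_scores` and the seven sample scores are hypothetical; the scores were chosen to show one two-way tie and one three-way tie.

```python
def rank_scores(scores):
    """Return (score, consecutive number, rank) rows, highest score first.

    Tied scores share the average of their first and last consecutive numbers.
    """
    ordered = sorted(scores, reverse=True)            # first column
    rows = []
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and ordered[j + 1] == ordered[i]:
            j += 1                                    # extend over the tie
        first, last = i + 1, j + 1                    # consecutive numbers
        rank = (first + last) / 2                     # third column
        for number in range(first, last + 1):
            rows.append((ordered[i], number, rank))
        i = j + 1
    return rows

# Hypothetical scores with ties at 85 and 80.
for score, number, rank in rank_scores([80, 90, 85, 80, 70, 85, 80]):
    print(score, number, rank)
```

Here the two 85s occupy positions 2 and 3 and share rank 2.5, while the three 80s occupy positions 4 to 6 and share rank 5.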
Activity 2:
• Make a talligram out of the following scores in
CPE105 final exam and find their ranks.
94 76 56 79 80 87 68 75 70
95 93 90 76 87 67 76 87 87
96 68 93 56 76 87 94 51 60
97 51 56 76 67 87 90 56 88
98 85 59 57
Grouping: Class frequency distribution
• Find the range: R = Hs − Ls, where Hs is the highest
score and Ls is the lowest score.
• Estimate the number of
intervals (classes): k = √n
• Find the class width (the width of the
interval): c = R/k
• Find the lowest limit and the other limits of the
classes.
• Tally the scores.
• Write the frequencies, class boundaries, and
cumulative frequencies.
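The steps above can be sketched in Python using the 48 scores from the talligram section. Two choices here are assumptions rather than rules stated in the slides: k and c are rounded to whole numbers (k to the nearest integer, c upward), and the lowest class starts at the lowest score. A different starting limit gives different class limits and frequencies.

```python
import math

scores = [
    86, 74, 66, 70, 75, 56, 69,
    70, 73, 66, 74, 81, 60, 76,
    80, 81, 61, 67, 63, 68, 73,
    63, 75, 71, 58, 72, 83, 69,
    79, 67, 68, 64, 69, 73, 69,
    78, 88, 62, 76, 72, 65, 66,
    70, 73, 61, 78, 84, 77,
]

highest, lowest = max(scores), min(scores)
R = highest - lowest                    # range: 88 - 56 = 32
k = round(math.sqrt(len(scores)))       # sqrt(48) is about 6.93, rounded to 7 (assumed rule)
c = math.ceil(R / k)                    # 32 / 7 is about 4.57, rounded up to 5 (assumed rule)

# Assumption: the lowest class starts at the lowest score.
classes = [(lowest + i * c, lowest + (i + 1) * c - 1) for i in range(k)]
frequencies = [sum(lo <= s <= hi for s in scores) for lo, hi in classes]

# Print the classes from highest to lowest, with their class boundaries.
for (lo, hi), f in zip(reversed(classes), reversed(frequencies)):
    print(f"{lo} - {hi}: f = {f}, boundaries {lo - 0.5} - {hi + 0.5}")
```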
Ex: Class Frequency Distribution Table
Class      Tally          f    Class boundaries   x    cf<   cf>
87 – 91    1              1    86.5 – 91.5        89   48    1
82 – 86    111            3    81.5 – 86.5        84   47    4
77 – 81    1111111        7    76.5 – 81.5        79   44    11
72 – 76    111111111111   12   71.5 – 76.5        74   37    23
67 – 71    1111111111     10   66.5 – 71.5        69   25    33
62 – 66    11111111       8    61.5 – 66.5        64   15    41
57 – 61    1111111        7    56.5 – 61.5        59   7     48
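The derived columns of the example table (class boundaries, class marks, and the two cumulative frequency columns) follow mechanically from the class limits and frequencies. A minimal sketch, with illustrative variable names:

```python
from itertools import accumulate

# Class limits and frequencies from the example table, highest class first.
limits = [(87, 91), (82, 86), (77, 81), (72, 76), (67, 71), (62, 66), (57, 61)]
freqs = [1, 3, 7, 12, 10, 8, 7]

boundaries = [(lo - 0.5, hi + 0.5) for lo, hi in limits]
midpoints = [(lo + hi) / 2 for lo, hi in limits]          # class marks
cf_greater = list(accumulate(freqs))                      # accumulated from the top class down
cf_less = list(accumulate(reversed(freqs)))[::-1]         # accumulated from the bottom class up

for row in zip(limits, freqs, boundaries, midpoints, cf_less, cf_greater):
    print(*row)
```

Both cumulative columns should end at the total number of scores, 48, which is a useful check on the tally.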
Graphing: Graphical Presentation of Class Frequency
Distributions
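Weighted mean:
\[ \bar{X} = \frac{\sum x_i w_i}{\sum w_i} \]
Median (ungrouped data):
\[ \tilde{X} = \begin{cases} x_{(n+1)/2} & \text{if } n \text{ is odd} \\ \frac{1}{2}\left( x_{n/2} + x_{n/2+1} \right) & \text{if } n \text{ is even} \end{cases} \]
where the observations are arranged from lowest to highest.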
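A short sketch of the weighted mean and the ungrouped median in Python. The function names and the sample numbers (three grades weighted by course units, and two small lists) are hypothetical.

```python
def weighted_mean(values, weights):
    """Sum of x_i * w_i divided by the sum of the weights."""
    return sum(x * w for x, w in zip(values, weights)) / sum(weights)

def median(values):
    """Middle value of the ordered data; mean of the two middle values if n is even."""
    x = sorted(values)                  # arrange from lowest to highest
    n = len(x)
    if n % 2 == 1:
        return x[(n + 1) // 2 - 1]      # x_((n+1)/2), shifted to 0-based indexing
    return (x[n // 2 - 1] + x[n // 2]) / 2

# Hypothetical data: three grades weighted by course units.
print(weighted_mean([85, 90, 78], [3, 2, 5]))   # 825 / 10
print(median([7, 3, 9]))                        # odd n: the 2nd ordered value
print(median([7, 3, 9, 5]))                     # even n: mean of the 2nd and 3rd ordered values
```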
The Median (grouped data)
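\[ \tilde{X} = L_m + \left( \frac{n/2 - cf_<}{f} \right) c \]
where Lm is the lower class boundary of the median class, cf< is the
cumulative frequency below the median class, f is the frequency of the
median class, and c is the class width.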
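As an illustration, the sketch below applies the grouped-median formula to the example class frequency distribution table from the grouping section (n = 48, class width 5). The function name and the data layout are assumptions made for this example.

```python
# Example table rows, lowest class first: (lower class boundary, frequency).
classes = [(56.5, 7), (61.5, 8), (66.5, 10), (71.5, 12), (76.5, 7), (81.5, 3), (86.5, 1)]
c = 5                                   # class width

def grouped_median(classes, c):
    n = sum(f for _, f in classes)
    below = 0                           # cumulative frequency below the current class
    for lower_boundary, f in classes:
        if below + f >= n / 2:          # first class whose cumulative frequency reaches n/2
            return lower_boundary + (n / 2 - below) / f * c
        below += f

print(grouped_median(classes, c))
```

Here n/2 = 24, so the median class is 67 – 71 (cumulative frequency 25), with Lm = 66.5, cf< = 15, and f = 10, giving 66.5 + (9/10)(5) = 71.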
Crude mode:
\[ \hat{X} = L_m + \frac{c}{2} \]
where c is the class interval and
Lm is the lower class boundary of the modal class
Refined mode:
\[ \hat{X} = 3\tilde{X} - 2\bar{X} \]
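Both mode formulas can be tried on the example class frequency distribution table from the grouping section. In this sketch the grouped mean is taken as the frequency-weighted mean of the class marks, and the median value 71.0 is the result of applying the grouped-median formula to the same table; the variable names are illustrative.

```python
# Example table rows: (lower class boundary, class mark, frequency).
rows = [(86.5, 89, 1), (81.5, 84, 3), (76.5, 79, 7), (71.5, 74, 12),
        (66.5, 69, 10), (61.5, 64, 8), (56.5, 59, 7)]
c = 5                                            # class width

# Crude mode: midpoint of the modal class (the class with the highest frequency).
modal_boundary = max(rows, key=lambda r: r[2])[0]
crude_mode = modal_boundary + c / 2              # 71.5 + 2.5

# Refined mode from the relation: mode = 3 * median - 2 * mean.
n = sum(f for _, _, f in rows)
mean = sum(x * f for _, x, f in rows) / n        # grouped mean, 3397 / 48
median = 71.0                                    # grouped median of the same table
refined_mode = 3 * median - 2 * mean

print(crude_mode, round(mean, 2), round(refined_mode, 2))
```

The two estimates differ because the crude mode looks only at the modal class, while the refined mode uses the mean and median of the whole distribution.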
Importance of the central measure:
\[ \bar{X} = \frac{71 + 68 + 68 + 58 + 55 + 52 + 52 + 45 + 38 + 38 + 38 + 30 + 25 + 25}{14} = \frac{663}{14} \approx 47.36 \]
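The arithmetic above can be checked with a few lines of Python. As a side illustration, the sketch also uses the standard-library `statistics` module to get the median and mode of the same 14 scores; those two values are not part of the worked example.

```python
import statistics

scores = [71, 68, 68, 58, 55, 52, 52, 45, 38, 38, 38, 30, 25, 25]

total = sum(scores)                      # 663
mean = total / len(scores)               # 663 / 14
print(total, len(scores), round(mean, 2))
print(statistics.median(scores))         # mean of the 7th and 8th ordered scores
print(statistics.mode(scores))           # most frequent score
```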
ACTIVITY 3: BY TWOS. Find the mean, median, and
mode.
1. Ungrouped Scores
70 42 30 27
70 42 30 26
68 34 30 26
52 34 30 19
52 34 30 19
2. Grouped Scores
Highest Score: 87
Lowest Score: 34
Class      fi   xi
85 – 89    2    87
80 – 84    1    82
75 – 79    3    77
70 – 74    4    72
65 – 69    5    67
60 – 64    7    62
55 – 59    5    57
50 – 54    2    52
45 – 49    4    47
40 – 44    3    42
35 – 39    2    37
30 – 34    2    32
Other Measures of location
• Other measures of location that describe or
locate the non-central position of a set of data
are referred to as quantiles or fractiles. The most
common fractiles are percentiles,
deciles, and quartiles.
Percentiles
• are values that divide an ordered set of
observations into 100 equal parts denoted by
P1, P2, …, P99 such that 1% of the data falls
below P1, 2% of the data falls below P2, …,
and 99% of the data falls below P99.
Deciles
• are values that divide an ordered set of
observations into 10 equal parts denoted by D1,
D2, …, D9 such that 10% of the data falls
below D1, 20% of the data falls below D2, …,
and 90% of the data falls below D9.
Quartiles
• are values that divide an ordered set of
observations into 4 equal parts denoted by Q1,
Q2, and Q3 such that 25% of the data falls
below Q1, 50% of the data falls below Q2, and
75% of the data falls below Q3.
To solve for percentiles, deciles, or quartiles
from ungrouped data
• 1. Arrange the data in increasing order of magnitude
(ascending order).
• 2. Solve for the value of L, where
L = mn / 100 for percentiles
L = mn / 10 for deciles
L = mn / 4 for quartiles
where m is the number of the desired percentile,
decile, or quartile (for example, m = 63 for P63) and
n is the number of observations.
• 3. If L is an integer, the desired fractile is
the average of the Lth and the (L+1)th observations.
If L is fractional, round it up to the
next higher integer to find the required
location. The fractile is the value in
that location.
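The three steps can be sketched as a single Python function. The name `fractile`, its `parts` parameter, and the ten sample scores are hypothetical; the function assumes m is at least 1 and less than `parts`. Integer arithmetic is used to decide whether L is a whole number, which avoids floating-point rounding surprises.

```python
def fractile(data, m, parts):
    """Return the m-th fractile; parts is 100 (percentiles), 10 (deciles), or 4 (quartiles)."""
    x = sorted(data)                         # step 1: ascending order
    n = len(x)
    L, remainder = divmod(m * n, parts)      # step 2: L = mn / parts, kept in integers
    if remainder == 0:                       # step 3, L is an integer
        return (x[L - 1] + x[L]) / 2         # average of the L-th and (L+1)-th values
    return x[L]                              # step 3, L is fractional: take the (L+1)-th value

# Hypothetical data: ten scores, 10 through 100.
scores = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
print(fractile(scores, 63, 100))   # L = 6.3, so the 7th value
print(fractile(scores, 5, 10))     # L = 5, so the average of the 5th and 6th values
print(fractile(scores, 1, 4))      # L = 2.5, so the 3rd value
```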
• Example: Find P63, D8, and Q1 in the following set of Bio 1
scores: 95, 34, 45, 67, 56, 58, 76, 87, 91, 39, 56, 78
Solution:
The data arranged in ascending order:
34, 39, 45, 56, 56, 58, 67, 76, 78, 87, 91, 95. n = 12