Statistics
Qualitative – places an individual or item into one of several groups or categories.
Quantitative – takes numerical values; arithmetic operations such as adding and averaging can be performed.
- can be discrete (its values are countable) or continuous (it can take any value in an interval or collection of intervals)

LEVELS OF MEASUREMENT

Nominal – takes values that give names or labels to various categories with no particular ordering. Information that can be obtained from processing data on these variables is limited to frequency counts and percentages.
Examples: gender, place of origin, religion, etc.

Ordinal – basically nominal, with categories having an inherent ordering. The difference between categories cannot be measured and has no meaning. Information that can be obtained from processing data on these variables is limited to frequency counts, along with insight on the rank or order of the categories specified.
Examples: social class (lower, middle and upper class), satisfaction rating (very dissatisfied, dissatisfied, satisfied, very satisfied)

Interval – basically quantitative variables in which the difference between two consecutive values is constant. Intervals between values can be quantified and have meaning; however, the scale has no true starting or zero point.
Examples: room temperature (in Celsius) and IQ

Ratio – has all the characteristics of an interval scale variable, in addition to having an absolute zero point.
Examples: weekly allowance and class standing

Methods of Data Collection

• Objective Method – data are collected by measuring or observing the characteristics of interest directly on the units.
• Subjective Method – information is collected through interviews, not necessarily requiring the presence of the units under study.
• Use of Existing Records – if the data, or part of the data, needed by the researcher have already been collected by another researcher or institution, perhaps for some other purpose, then this method can be convenient. The researcher should remember to properly acknowledge the source of the data.

Classification of Data Collected

• Primary Data – data collected directly from the source and obtained through objective or subjective methods.
• Secondary Data – data acquired with the use of existing records.

METHODS OF ORGANIZING AND PRESENTING DATA

Textual Presentation – provides a concise narrative description highlighting a few of the most important results of the study.

Example
The data collected cover 10 respondents: their gender, quiz score and address (town). It was observed that 6 out of 10 respondents are female. Four respondents were residing in Sta Cruz, 2 respondents each were from Pila and Bay, and 1 respondent each was from Los Banos and Pagsanjan.

Tabular Presentation – used when it is necessary to present more details or numerical information.
Respondent   Gender   Quiz Score   Address (Town)
1            F        2            Sta Cruz
2            M        6            Sta Cruz
3            F        0            Sta Cruz
4            M        9            Pila
5            F        3            Bay
6            M        3            Los Banos
7            M        3            Bay
8            F        7            Pagsanjan
9            F        5            Pila
10           F        5            Sta Cruz

Frequency distribution tables (FDT)

Quiz Score    Count
0             1
2             1
3             3
5             2
6             1
7             1
9             1
Grand Total   10

Gender        Count
F             6
M             4
Grand Total   10

[Figure: Pie Chart of Respondents' Sex (categories: F, M)]

NUMERICAL DESCRIPTIVE MEASURES

MEASURE OF LOCATION
- is a value within the range of the data that describes its specific location or position relative to the entire data set.

Minimum (MIN) – lowest value in the data set
Maximum (MAX) – highest value in the data set
Mean (or the Arithmetic Mean) – defined as the sum of the data values divided by the total number of data values
Median – a single value at the middle of an array of data observations, denoted by Md.
Mode – refers to the most frequent value in the data set

Generalized Formula for Quantiles
(quartiles Q_k, deciles D_k, percentiles P_k)

$$Q_k = \frac{k(N+1)}{4}\text{th item}$$

$$D_k = \frac{k(N+1)}{10}\text{th item}$$

$$P_k = \frac{k(N+1)}{100}\text{th item}$$

MEASURES OF DISPERSION
- are quantities that describe the spread or variability of the values in a data set.

Variance – defined as the average of the squared differences of the observations from the mean of the data set. It is the square of the standard deviation (SD).

$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}, \qquad \sigma = \sqrt{\sigma^2}$$

Coefficient of variation – relative measure of variability that indicates the magnitude of variation relative to the magnitude of the mean. It is denoted as CV and expressed as a percentage, as shown in the formula

$$CV = \frac{\sigma}{\mu} \times 100\%$$
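The measures of location and dispersion described in this section can be sketched in Python using the quiz scores of the 10 respondents. This is a minimal illustration, not a prescribed procedure: the helper names `quantile_position` and `value_at_position` are made up for this example, and the linear interpolation used to read a value at a fractional position is an assumption, since the notes give only the position formula.

```python
import math
import statistics

# Quiz scores of the 10 respondents in the example data set.
quiz_scores = [2, 6, 0, 9, 3, 3, 3, 7, 5, 5]


def quantile_position(k, n, parts):
    """Position of the k-th quantile: k(N + 1)/parts, with parts = 4, 10 or 100."""
    return k * (n + 1) / parts


def value_at_position(data, position):
    """Value at a 1-based, possibly fractional, position in the sorted data.

    Assumption: linear interpolation between the two neighbouring items.
    """
    ordered = sorted(data)
    position = min(max(position, 1), len(ordered))  # keep within 1..N
    lower = math.floor(position)
    fraction = position - lower
    if fraction == 0:
        return ordered[lower - 1]
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])


n = len(quiz_scores)

# Measures of location
minimum = min(quiz_scores)
maximum = max(quiz_scores)
mean = statistics.mean(quiz_scores)
median = statistics.median(quiz_scores)
mode = statistics.mode(quiz_scores)
q1_position = quantile_position(1, n, 4)   # first quartile
q1 = value_at_position(quiz_scores, q1_position)

# Measures of dispersion (population formulas, dividing by N)
variance = statistics.pvariance(quiz_scores)
sd = math.sqrt(variance)
cv = sd / mean * 100                       # in percent

print(f"MIN={minimum} MAX={maximum} mean={mean} median={median} mode={mode}")
print(f"Q1 is the {q1_position}th item, value {q1}")
print(f"variance={variance:.2f} SD={sd:.2f} CV={cv:.1f}%")
```

For the quiz scores this gives a mean of 4.3, a median of 4, a mode of 3, and a first quartile at the 2.75th item of the sorted data.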
TESTS ON TWO POPULATION MEANS

TYPES OF SAMPLES

Related samples are obtained by matching similar units with respect to some important characteristics, or by self-pairing, in which two measurements are taken from the same unit.

Independent samples are obtained when two unrelated sets of units are measured for a specific variable.

Pooled variance for two independent samples:

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

For related samples, the test statistic is given by

$$t_c = \frac{\bar{d} - D_0}{s_{\bar{d}}}$$

which follows the Student's t distribution with (n-1) df. Here $\bar{d}$ is the mean of the paired differences, $D_0$ is the hypothesized mean difference, and $s_{\bar{d}} = s_d / \sqrt{n}$ is the standard error of the mean difference.

Example
A study involved n=25 right-handed students who each turned two different knobs (right-hand thread and left-hand thread). The time it took to move the knob indicator a fixed distance was measured for all individuals. It is of interest to assess whether right-hand threads are easier to turn, on average. Use a 5% significance level.
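The pooled variance and the t statistic for related samples can be sketched as follows. The function names and the five pairs of turning times are hypothetical, chosen only to exercise the formulas; they are not data from the knob study. The sketch assumes the denominator of the test statistic is the standard error of the mean difference, s_d divided by the square root of n.

```python
import math
import statistics


def pooled_variance(n1, s1_sq, n2, s2_sq):
    """Pooled variance of two independent samples with variances s1_sq and s2_sq."""
    return ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)


def paired_t_statistic(first, second, d0=0.0):
    """Return (t_c, df) for related samples, testing a mean difference of d0."""
    if len(first) != len(second):
        raise ValueError("related samples must have the same number of units")
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)
    d_bar = statistics.mean(differences)    # mean of the paired differences
    s_d = statistics.stdev(differences)     # sample SD of the differences
    standard_error = s_d / math.sqrt(n)     # assumed denominator of t_c
    return (d_bar - d0) / standard_error, n - 1


# Hypothetical turning times in seconds for 5 students (illustration only).
right_thread = [10, 12, 9, 11, 13]
left_thread = [12, 13, 11, 12, 15]

t_c, df = paired_t_statistic(right_thread, left_thread)
print(f"t_c = {t_c:.3f} with {df} df")

# Pooled variance from hypothetical summary statistics of two independent samples.
print(f"pooled variance = {pooled_variance(5, 2.5, 5, 2.3):.2f}")
```

The computed t_c would then be compared with the critical value of the Student's t distribution at the chosen significance level and n-1 df, which this sketch leaves to a t table.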
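CORRELATION COEFFICIENT, r

$$r = \frac{s_{xy}}{s_x s_y}$$

$$s_{xy} = \frac{\sum (x - \bar{x})(y - \bar{y})}{n - 1} = \frac{\sum xy - \frac{\sum x \sum y}{n}}{n - 1}$$

OR

$$r = \frac{n \sum xy - \sum x \sum y}{\sqrt{\left[ n \sum x^2 - \left( \sum x \right)^2 \right] \left[ n \sum y^2 - \left( \sum y \right)^2 \right]}}$$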
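As a quick check that the two expressions for r agree, both can be computed on a small data set. The function names and the five (x, y) pairs below are hypothetical and serve only as an illustration.

```python
import math


def r_from_covariance(x, y):
    """r = s_xy / (s_x * s_y), using sample (n - 1) formulas."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / (n - 1)
    s_x = math.sqrt(sum((a - x_bar) ** 2 for a in x) / (n - 1))
    s_y = math.sqrt(sum((b - y_bar) ** 2 for b in y) / (n - 1))
    return s_xy / (s_x * s_y)


def r_computational(x, y):
    """r from the raw sums: n, sum x, sum y, sum xy, sum x^2, sum y^2."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Hypothetical paired observations (illustration only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r1 = r_from_covariance(x, y)
r2 = r_computational(x, y)
print(f"r (covariance form) = {r1:.4f}")
print(f"r (computational form) = {r2:.4f}")
```

The second form needs only running totals, which is why it is convenient for hand computation.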