Stat Lesson 1 Concepts and Definitions
STATISTICAL CONCEPTS
Science is based on the empirical method: systematically making observations in order to obtain
information. Observations are the basic empirical "stuff" of science. Statistics is a set of concepts, rules,
and procedures for organizing, summarizing, and interpreting that information. It helps us to:
o organize numerical information in the form of tables, graphs, and charts;
o understand the statistical techniques underlying decisions that affect our lives and well-being; and
o make informed decisions.
A population is the set of all individuals of interest in a particular study. We will also refer to
populations of scores. A sample is a set of individuals selected from a population, usually intended to
represent the population in a study. We will also refer to samples of scores.
A parameter is a value, usually numerical, that describes a population. A parameter may be
obtained from a single measurement, or it may be derived from a set of measurements from the population.
A statistic is a value, usually numerical, that describes a sample. A statistic may be obtained from a
single measurement, or it may be derived from a set of measurements from the sample.
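The distinction between a parameter and a statistic can be illustrated with a short sketch. The population of test scores below is made up for illustration, and the function name, sizes, and seed are assumptions rather than anything from this lesson.

```python
import random
from statistics import mean

def population_and_sample(pop_size=500, sample_size=30, seed=1):
    """Build a made-up population of test scores and draw a random sample from it."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    population = [rng.randint(40, 100) for _ in range(pop_size)]  # hypothetical scores
    sample = rng.sample(population, sample_size)  # random selection without replacement
    return population, sample

population, sample = population_and_sample()
parameter = mean(population)  # describes the whole population
statistic = mean(sample)      # describes only the sample; estimates the parameter

print(f"parameter (population mean): {parameter:.2f}")
print(f"statistic (sample mean):     {statistic:.2f}")
```

The two printed values will usually be close but not identical, which is why a statistic is treated as an estimate of the parameter.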
A variable is any characteristic or piece of information that differs from one member to another in a
population or sample. A random variable (designated as X) is one whose numerical value is determined by
chance. The key elements here are that the variable takes a numerical value (sales volume, rate of return,
test score, etc.) and that the sample selection process generates the numbers randomly, i.e., by a "random"
selection.
A constant is a piece of information about the population or sample that is the same for all members.
A quantitative variable is one that exists in different AMOUNTS. When it is measured, the
scores tell something about the amount or degree of the variable. At the very least, a larger score indicates
more of the variable than a smaller score does.
A qualitative variable is one that exists in different KINDS. Numbers may be assigned to its
categories, but they serve only as names or labels (dummy codes) and have no
quantitative meaning.
A discrete variable is obtained by counting indivisible units. It can take only specific values,
typically whole numbers, and never a fraction of a unit.
A continuous variable is one that comes in units divisible into an infinite number of
fractional parts. It can take any value within its range on the number line.
An independent variable is one that is manipulated by the researcher in a study; it is the treatment
variable in an experiment. It is the presumed cause of the differences in the dependent variable.
A dependent variable is one that is measured and analyzed in an experiment. Its values are tested
to determine whether they depend upon the values of the independent variable. It is the presumed effect
of the independent variable.
Research Title: Mathematics Achievement of Grade VI Pupils Taught Under Three Methods of Teaching
Dependent Variable: Mathematics Achievement (the variable measured after employing the treatment)
Independent Variable: Methods of Teaching (the variable that is manipulated)
Classification of Scales
1. Nominal scale - the lowest and most primitive level of measurement. It permits classification
of individuals into two or more categories, and thus statements of equality or
difference. The basic requirements are that each item or individual be assigned to one and only one
category and that the criteria for placing individuals into classes be specified.
2. Ordinal scale – specifies the relative position of items/individuals with respect to a given
characteristic, with no indication as to the distance between the positions. It has the same properties
as a nominal scale, plus the property of greater than or less than. One must be able to
determine whether an item or individual has more, the same, or less of the attribute than another item
or individual has.
3. Interval scale – permits the making of statements of sameness or difference, greater than or less
than, and the added property that the intervals between items are equal. However, it does not have a
true zero point: a value of zero does not mean the absence of the attribute.
Example: test score (the difference between scores of 4 and 2 is the same as the difference between
6 and 4, but one who scored 4 does not necessarily know twice as much as one who scored 2, and one
who scored 0 does not necessarily know nothing about the lesson discussed)
4. Ratio scale – permits the making of statements of sameness or difference, greater than or less than,
equal ratios between items, and the presence of a TRUE zero point, which means absence of the
attribute being measured.
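Because each scale keeps the properties of the scales below it and adds one more, the classification can be summarized as a cumulative lookup. The following is only an illustrative sketch; the names Scale, ADDED_PROPERTY, and permitted_statements are invented here and the property labels paraphrase the four definitions above.

```python
from enum import IntEnum

class Scale(IntEnum):
    """Measurement scales, ordered from lowest (1) to highest (4) level."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4

# The property each scale adds on top of the scales below it.
ADDED_PROPERTY = {
    Scale.NOMINAL: "same / different",
    Scale.ORDINAL: "greater than / less than",
    Scale.INTERVAL: "equal intervals",
    Scale.RATIO: "equal ratios and a true zero",
}

def permitted_statements(scale):
    """List every kind of statement a scale permits, lowest level first."""
    return [ADDED_PROPERTY[s] for s in Scale if s <= scale]

for scale in Scale:
    print(f"{scale.name:<8} -> {', '.join(permitted_statements(scale))}")
```

Reading the output from top to bottom shows why a ratio-scale variable can always be treated as interval, ordinal, or nominal, but not the other way around.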
A more important notion to get across at this early stage is how the subject of statistical
methods is organized. It has two main branches, descriptive statistics and inferential statistics.
Descriptive Statistics
Example: "The average income of the 104 families in our company is Php 18,673." In descriptive
statistics, our objective is to describe the properties of a group of scores or data that we have "in
hand," i.e., data that are accessible to us in that we can write them down on paper or type them into
a spreadsheet. In descriptive statistics we are not interested in other data that were not gathered but
might have been; that is the subject of inferential statistics.
What properties of the set of scores are we interested in? At least three: their center, their spread,
and their shape. Consider the following set of scores, which might be ages of persons in your
professional club:
28, 38, 45, 47, 51, 56, 58, 60, 63, 63, 65, 66, 66, 67, 68, 70
We could say of these ages that they range from 28 to 70 (spread), and the middle of them is
somewhere around 60 (center). Their shape is a property of a graph that can be drawn to depict
the scores, for example by marking the scores along a number line.
[Figure not reproduced: the ages marked along a number line]
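As a rough check on the center and spread quoted for these ages, and as a crude stand-in for a picture of their shape, the summaries can be computed directly. This is a minimal sketch; the helper names describe and decade_counts are invented for illustration, and grouping by decade is just one assumed way of showing shape.

```python
from statistics import mean, median

ages = [28, 38, 45, 47, 51, 56, 58, 60, 63, 63, 65, 66, 66, 67, 68, 70]

def describe(scores):
    """Return simple summaries of spread (min, max, range) and center (mean, median)."""
    return {
        "min": min(scores),
        "max": max(scores),
        "range": max(scores) - min(scores),
        "mean": mean(scores),
        "median": median(scores),
    }

def decade_counts(scores):
    """Count the scores falling in each decade (20s, 30s, ...) to hint at the shape."""
    counts = {}
    for s in scores:
        decade = (s // 10) * 10
        counts[decade] = counts.get(decade, 0) + 1
    return dict(sorted(counts.items()))

summary = describe(ages)
print(summary)

# One star per person: a text version of marking scores along a number line.
for decade, count in decade_counts(ages).items():
    print(f"{decade}s | {'*' * count}")
```

The median comes out at 61.5 and the mean at about 56.9, both consistent with a center "somewhere around 60", and the star chart shows most of the ages bunched in the 60s with a thin tail toward the younger ages.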
Inferential Statistics
Example: "This sample of 512 families from Barangay Macopa indicates that, with 95% confidence, the
average family income in the barangay is between Php 5,187 and Php 9,328."
In inferential statistics, our interest is in collections of data so large that we cannot
have all of them "in hand." We can, however, inspect samples of these larger collections and use
what we see there to make inferences about the larger collection. How samples relate to the larger
collections of data (called populations) from which they have been drawn is the subject of inferential
statistical methods. Inferential statistics are frequently used by pollsters who ask 1000 persons
whom they prefer in an election and draw conclusions about how the entire municipality or province
will vote on election day. Scientists and researchers also employ inferential statistics to draw
conclusions that are more general than those they could otherwise draw on the basis of
the limited number of data points they have recorded.
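To make the idea of inferring from a sample more concrete, here is one common way a 95% confidence interval for a mean might be computed. It is a sketch under stated assumptions: it uses the large-sample normal approximation, the function name mean_confidence_interval is invented, and the incomes are randomly generated stand-ins, not the data behind the example above.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for a population mean."""
    n = len(sample)
    centre = mean(sample)
    std_error = stdev(sample) / sqrt(n)             # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return centre - z * std_error, centre + z * std_error

# Made-up incomes (in Php) for 512 families; purely illustrative.
rng = random.Random(42)
incomes = [rng.gauss(7000, 2500) for _ in range(512)]

low, high = mean_confidence_interval(incomes)
print(f"sample mean: Php {mean(incomes):,.0f}")
print(f"95% confidence interval: Php {low:,.0f} to Php {high:,.0f}")
```

The interval is a statement about the unknown population mean, built only from the sample in hand; a larger sample shrinks the standard error and so narrows the interval.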
Activity 2 - Classify the above variables into scale type (nominal, ordinal, interval, ratio). If you believe a
variable can be classified in more than one scale type, justify.
Activity 3 – Identify the dependent and independent variables in the following research titles:
ADDITIONAL
Scales of Measurement
One of the most influential distinctions made in the field of measurement was Stevens' (1946, 1957)
classification of scales of measurement. He described nominal, ordinal, interval, and ratio scales of
measurement, which are briefly defined below. A more detailed discussion of these scales can be
found in Chapter 4 of the text.
Nominal: Nominal scales are naming scales. They represent categories where there is no
basis for ordering the categories.
Ordinal: Ordinal scales involve categories that can be ordered along a dimension. However,
we have no way of knowing how different the categories are from one another. We state the
latter property by saying that we do not have equal intervals between the items. Rankings
also represent ordinal scales, because we know the order but do not know how different each
person is from the next person.
The most important reason for making the distinction between these scales of measurement is that it
affects the statistical procedures that you will use in describing and analyzing your data.
In this unit, we will be presenting dozens of examples of measures at each of these levels of
measurement, along with some exercises to help you to refine your understanding of these
distinctions. We recommend that you complete the exercises since the best way to learn anything is
to actively process the information by using it to solve real-life problems.
Examples of nominal scales:
o diagnostic categories
o sex of the participant
o classification based on discrete characteristics (e.g., hair color)
o group affiliation (e.g., Republican, Democrat, Boy Scout, etc.)
o the town people live in
o a person's name
o an arbitrary identification, including identification numbers that are arbitrary
o menu items selected
o any yes/no distinctions
o most forms of classification (species of animals or type of tree)
o location of damage in the brain
Examples of interval scales:
o scores on scales that are standardized (i.e., with an arbitrary mean and standard deviation,
usually designed to always give a positive score)
o scores on scales that are known to not have a true zero (e.g., most temperature scales except
for the Kelvin Scale)
o scores on measures in which it is not clear that zero means none of the trait (e.g., a math
test)
o scores on most personality scales based on counting the number of endorsed items
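The temperature item above can be checked numerically. The sketch below compares two Celsius readings with the same readings in kelvins; the two temperatures and the function name celsius_to_kelvin are chosen only for illustration.

```python
def celsius_to_kelvin(celsius):
    """Convert a Celsius reading (arbitrary zero) to kelvins (true zero)."""
    return celsius + 273.15

cool_c, warm_c = 10.0, 20.0

# Differences survive the change of scale, so intervals are meaningful.
diff_c = warm_c - cool_c
diff_k = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cool_c)

# Ratios do not: "twice as hot" in Celsius is not twice as hot in kelvins.
ratio_c = warm_c / cool_c
ratio_k = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)

print(f"difference: {diff_c:.2f} C vs {diff_k:.2f} K")
print(f"ratio:      {ratio_c:.3f} (Celsius) vs {ratio_k:.3f} (Kelvin)")
```

The difference is 10 degrees on both scales, but the ratio of 2 on the Celsius scale drops to roughly 1.035 in kelvins. Only the scale with a true zero supports statements about ratios.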
Exercises
Listed below are a number of exercises designed to familiarize students with the classification of
measures using Stevens' classification system. For each of the measures listed, determine what scale
of measurement most closely approximates the measure as described. Some of the examples are
deliberately ambiguous.