I.Q.
Lesson 1
Terminology
Data
Data (singular: datum) are whatever we
collect in a study. They can be numerical or non-numerical
A datum may be referred to as a data point
in a table or graph. It may also be referred
to as a score
Data set refers to all the data collected in an
experiment
Types of Data
Data are first categorized as numerical or
non-numerical
In research, numerical is called QUANTitative, and
non-numerical is called QUALitative
Quantitative data can be interval or ratio
The difference is that ratio data have an absolute 0
as well as equal intervals, so, for example, 4 is
twice as much as 2
Qualitative data can be nominal (unordered
categories) or ordinal (ordered categories)
For our purposes this distinction will not matter,
as we analyse qualitative data the same way
Variable
A variable is what we collect data on. It
determines what type of data we have
We call it a variable because it can have
different values that can be measured
e.g. IQ is a quantitative variable: we collect
numerical data from it
e.g. Hair colour is a qualitative variable: we
collect non-numerical data from it
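To make the two examples concrete, here is a minimal Python sketch. The data set, the names `data_set` and `variable_type`, and the rule "every value is a number" are all assumptions made for illustration, not part of the lesson.

```python
# Hypothetical data set: each dictionary is one individual (made-up values).
data_set = [
    {"iq": 98, "hair_colour": "brown"},
    {"iq": 112, "hair_colour": "black"},
    {"iq": 105, "hair_colour": "brown"},
]


def variable_type(data_set, variable):
    """Return 'quantitative' if every value is a number, else 'qualitative'."""
    values = [individual[variable] for individual in data_set]
    if all(isinstance(value, (int, float)) for value in values):
        return "quantitative"
    return "qualitative"


print(variable_type(data_set, "iq"))           # quantitative
print(variable_type(data_set, "hair_colour"))  # qualitative
```

This simple rule is only a rough guide: numbers used purely as labels (such as ID numbers) would be stored as numbers but are still qualitative.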
Population
The population is the complete group from
which we collect data, and what we make
conclusions about in a study.
A population can be anything, from the total
people of a country, to a class of
psychology students, to the total number of
Oak trees in the world.
Sample
However, populations are usually too large to
collect data from, so we select a group from the
population at random, called a sample
If the sample is randomly selected from the
population, we can assume that it represents the
population, and we can make conclusions about
the population from the data we collect from the
sample.
Population → random sample → collect data →
conclusion about the population.
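The steps above can be sketched in Python. The population here is made up (1,000 simulated IQ scores), and the sample size of 50 and the fixed seed are arbitrary choices for the illustration.

```python
import random

rng = random.Random(42)  # fixed seed so the example is repeatable

# Hypothetical population: 1,000 simulated IQ scores (not real data).
population = [round(rng.gauss(100, 15)) for _ in range(1000)]

# Randomly select 50 individuals without replacement: this is the sample.
sample = rng.sample(population, 50)

# Collect data from the sample and use it to estimate the population value.
sample_mean = sum(sample) / len(sample)
population_mean = sum(population) / len(population)

print(f"sample mean:     {sample_mean:.1f}")
print(f"population mean: {population_mean:.1f}")
```

In a real study we would not know the population mean. It is printed here only to show that a random sample tends to land close to it.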
Individual
The individual is a single member of the
population/sample.
Samples consist of a group of individuals
selected from the population
Populations consist of ALL individuals we
want to study
Descriptive Statistics
Before we can do inferential statistics to
make conclusions, we need to know things
about the population first: We need to
describe it.
Descriptive statistics are calculated from the
data set, or raw data (the data before ANY
calculations or manipulations are done to it)
Central Tendency
The first type of descriptive statistic we calculate
is a measure of central tendency.
Central tendency essentially means the center
point of a data set, or the number that the data
tend to be close to.
Measures of central tendency include the mean,
the median, and the mode, and each serves a different
purpose.
The Mean
The mean is the standard measure of central
tendency, and the most useful for doing inferential
statistics
The mean is simply the average: all the scores added
together then divided by the total number of scores
Sometimes the mean does not adequately measure
the central tendency, such as when one or more
scores are extreme compared to the rest.
e.g. 3, 5, 6, 30: 30 is extreme (a single score
significantly changes the mean)
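As a quick check of the example above, here is a small Python sketch that computes the mean with and without the extreme score. The variable names are only illustrative.

```python
scores = [3, 5, 6, 30]

# Mean: add all the scores together, then divide by the number of scores.
mean_with_extreme = sum(scores) / len(scores)  # 44 / 4 = 11.0

# Drop the extreme score (30) to see how much it pulled the mean upward.
typical_scores = [3, 5, 6]
mean_without_extreme = sum(typical_scores) / len(typical_scores)  # 14 / 3

print(mean_with_extreme)
print(round(mean_without_extreme, 2))
```

With the 30 included, the mean (11) is larger than three of the four scores, which is why it no longer describes the center of the data well.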
The Median
In cases of extreme scores (called
outliers), we need a different measure of
central tendency that is not affected by
outliers.
That's the median: the middle score of a data set
when the scores are listed from smallest to
largest, or vice versa
Smallest to largest is the standard and
expected way to find the median
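A minimal sketch of finding the median in Python. One assumption goes beyond the slide: when there is an even number of scores, there is no single middle score, so this sketch follows the usual convention of averaging the two middle scores.

```python
def median(scores):
    """Return the middle score after sorting from smallest to largest."""
    ordered = sorted(scores)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd number of scores: there is exactly one middle score.
        return ordered[middle]
    # Even number of scores (assumed convention): average the two middle scores.
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median([3, 5, 6, 30]))  # 5.5
print(median([30, 3, 6]))     # 6
```

For the scores 3, 5, 6, 30 the median is 5.5, much closer to most of the scores than the mean of 11.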
The Mode
Mean and median only work for numerical data, like
age or IQ scores.
Sometimes we collect data that is not numerical, like
hair colour or gender.
We cannot find the average hair
colour.
But we can find the most common hair colour
The mode is simply whichever category appears
most often
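For example, with a made-up list of hair colours, Python's standard `statistics.mode` function returns the most common category. The data below are hypothetical.

```python
from statistics import mode

# Hypothetical qualitative data: hair colour of seven individuals.
hair_colours = ["brown", "black", "brown", "blonde", "red", "brown", "black"]

# The mode is the category that appears most often.
most_common_colour = mode(hair_colours)
print(most_common_colour)  # brown
```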
Frequency
Frequency means something specific in
stats
It is the number of times a score appears in
a data set.
Recall, the mode is the score with the
highest frequency
Later we'll learn how to construct a
frequency table
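As a small preview, frequencies can be counted in Python with `collections.Counter`. The scores below are made up for the illustration.

```python
from collections import Counter

# Hypothetical data set of seven scores.
scores = [3, 5, 5, 6, 6, 6, 30]

# Frequency: the number of times each score appears in the data set.
frequencies = Counter(scores)
print(frequencies[6])   # 3
print(frequencies[30])  # 1

# The mode is the score with the highest frequency.
mode_score, mode_frequency = frequencies.most_common(1)[0]
print(mode_score, mode_frequency)  # 6 3
```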
Variance
Recall that measures of central tendency are
based on variables: things that have different
values (they vary between individuals)
This means there are differences between
individuals, resulting in different scores for each
individual
We call these differences variance.
Data comes from a variable, which varies,
resulting in variance
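This lesson does not give a formula for variance. As a hedged preview, one common way to put a number on these differences is the sample variance: the sum of squared differences from the mean, divided by the number of scores minus one. The sketch below assumes that formula; a later lesson may define it differently.

```python
scores = [3, 5, 6, 30]

mean = sum(scores) / len(scores)  # 11.0

# How far each individual score is from the mean.
differences = [score - mean for score in scores]  # [-8.0, -6.0, -5.0, 19.0]

# Assumed formula (sample variance): sum of squared differences / (n - 1).
sample_variance = sum(d ** 2 for d in differences) / (len(scores) - 1)

print(differences)
print(sample_variance)  # 486 / 3 = 162.0
```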
Distribution
A distribution is a graphical representation
of a data set
It gives us a picture of how the data are
distributed in the sample or population
It is created using, and gives us a picture
of, the central tendency, frequency, and
variance
It is used in the final steps of an analysis
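A very simple distribution can be drawn as text: one row per score, with one star for each time that score appears. This is only a rough sketch with made-up scores. Real distributions are usually drawn as proper graphs.

```python
from collections import Counter

# Hypothetical data set of nine scores.
scores = [3, 5, 5, 6, 6, 6, 7, 7, 9]
frequencies = Counter(scores)

# One row per possible score between the smallest and the largest.
rows = []
for value in range(min(scores), max(scores) + 1):
    # Counter returns 0 for scores that never appear, giving an empty row.
    rows.append(f"{value:>2} | " + "*" * frequencies[value])

print("\n".join(rows))
```

The tallest row marks the mode, and the spread of the rows gives a picture of how much the scores vary.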
Inferential Statistics
Using descriptive statistics (mean, variance, etc.),
we can conduct an analysis
An analysis (plural analyses) is a mathematical
procedure that takes numerical information
(descriptive statistics) about a population, and
determines a probability
Based on the results of analyses, we make
conclusions about the population, called
inferences.
Probability
Probability is the most basic form of inferential
statistics, and often discussed within the context
of descriptive statistics
Probability is simply the likelihood of a specific
outcome.
e.g. If you flip a coin, there are 2 possible
outcomes: heads or tails. Therefore, the
probability of getting heads is 50%, and 50% for
tails.
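The coin example can be written out in Python. This sketch assumes a fair coin, meaning every outcome is equally likely, so the probability is the number of outcomes we are interested in divided by the number of possible outcomes. The function name `probability` is only illustrative.

```python
from fractions import Fraction

# All possible outcomes of one coin flip (assumed equally likely).
outcomes = ["heads", "tails"]


def probability(event, outcomes):
    """Probability of an event when all outcomes are equally likely."""
    favourable = sum(1 for outcome in outcomes if outcome == event)
    return Fraction(favourable, len(outcomes))


p_heads = probability("heads", outcomes)
p_tails = probability("tails", outcomes)

print(p_heads, float(p_heads))  # 1/2 0.5
print(p_heads + p_tails)        # 1
```

The probabilities of all possible outcomes add up to 1, or 100%.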
Conclusion
Hopefully by now you have a rudimentary
understanding of the terms we use in statistics.
We will use the terms from this lesson quite a bit in
future lessons, so make sure you are clear on the
terms. If you need to, come back to this slide
show, or save it as a reference.
Referring back to these as you go will help you to
better memorize the terms by using them, rather
than just reading them over and over again.