Chapter 3: A Statistics Refresher

Scales of Measurement

Measurement - the act of assigning numbers or symbols to characteristics of things (people, events, whatever) according to rules
Scale - a set of numbers (or other symbols) whose properties model empirical properties of the objects to which the numbers are assigned
• Nominal Scale
  o simplest form of measurement
  o involves classification or categorization based on one or more distinguishing characteristics, where all things measured must be placed into mutually exclusive and exhaustive categories
  o e.g. yes/no
• Ordinal Scale
  o permits classification and ranking
  o implies nothing about how much greater one ranking is than another
  o no absolute zero point
  o e.g. 1st, 2nd, 3rd
• Interval Scale
  o contains equal intervals between numbers; each unit on the scale is exactly equal to any other unit on the scale
  o no absolute zero point
  o e.g. IQ score, temperature (in degrees Celsius or Fahrenheit)
• Ratio Scale
  o has a true zero point
  o e.g. test of hand grip

Measurement Scales in Psychology
• Ordinal level of measurement is most frequently used in psychology.
• Be constantly alert to the possibility of gross inequality of intervals.

Describing Data
Distribution - a set of test scores arrayed for recording or study
Raw score - a straightforward, unmodified accounting of performance that is usually numerical
• Frequency Distributions
  o all scores are listed alongside the number of times each score occurred
  o scores might be listed in tabular or graphic form
  o simple frequency distribution - indicates that individual scores have been used and the data have not been grouped
  o grouped frequency distribution - test-score intervals, also called class intervals, replace the actual test scores
  o can also be illustrated graphically
  o Graph - diagram or chart composed of lines, points, bars, or other symbols that describe and illustrate data
    ▪ Histogram - a graph with vertical lines drawn at the true limits of each test score (or class interval), forming a series of contiguous rectangles
    ▪ Bar graph - numbers indicative of frequency also appear on the Y-axis, and reference to some categorization appears on the X-axis
    ▪ Frequency polygon - expressed by a continuous line connecting the points where test scores or class intervals (X-axis) meet frequencies (Y-axis)
• Measures of Central Tendency
  o statistic that indicates the average or midmost score between the extreme scores in a distribution
• Arithmetic mean
  o the average
  o denoted by the symbol x̅; equal to the sum of the observations divided by the number of observations
  o most appropriate measure of central tendency for interval or ratio data when the distributions are believed to be approximately normal
• Median
  o middle score in a distribution
• Mode
  o most frequently occurring score in a distribution of scores
  o Bimodal distribution - two scores that occur with the highest frequency
• Measures of Variability
  o statistics that describe the amount of variation in a distribution
  o Variability - indication of how scores in a distribution are scattered or dispersed
• Range
  o equal to the difference between the highest and the lowest scores
• Interquartile and semi-interquartile ranges
  o Quartiles - dividing points between the four quarters
  o Interquartile range - measure of variability equal to the difference between Q3 and Q1; an ordinal statistic
  o Semi-interquartile range - equal to the interquartile range divided by 2
• Average deviation
  o describes the amount of variability in a distribution
  o rarely used because the deletion of algebraic signs renders it a useless measure for purposes of any further operations
• Standard deviation
  o a measure of variability equal to the square root of the average squared deviations about the mean
  o equal to the square root of the variance
  o Variance (s²) - equal to the arithmetic mean of the squares of the differences between the scores in a distribution and their mean
  o very useful measure of variation because each individual score’s distance from the mean of the distribution is factored into its computation
• Skewness
  o the nature and extent to which symmetry is absent
  o an indication of how the measurements in a distribution are distributed
  Positive skew
  - when relatively few of the scores fall at the high end of the distribution
  - may indicate that the test was too difficult
  Negative skew
  - when relatively few of the scores fall at the low end of the distribution
  - may indicate that the test was too easy
• Kurtosis
  o the steepness of a distribution in its center
  o platykurtic (relatively flat), leptokurtic (relatively peaked), mesokurtic (somewhere in the middle)
  o High kurtosis = a high peak and “fatter” tails compared to a normal distribution
  o Lower kurtosis = a rounded peak and thinner tails
• Normal Curve
  o a bell-shaped, smooth, mathematically defined curve that is highest at its center
  o from the center it tapers on both sides, approaching the X-axis asymptotically (meaning that it approaches, but never touches, the axis)
  o distribution ranges from negative infinity to positive infinity
  o perfectly symmetrical, no skewness
  o mean, median, and mode have the same exact value
  o has two tails
• Standard Scores
  o a raw score that has been converted from one scale to another scale, where the latter scale has some arbitrarily set mean and standard deviation
  o raw scores are converted to standard scores because standard scores are more easily interpretable than raw scores
  z Score
  - results from the conversion of a raw score into a number indicating how many standard deviation units the raw score is below or above the mean of the distribution
  - zero plus or minus one scale (mean set at 0, standard deviation set at 1)
  - equal to the difference between a particular raw score and the mean, divided by the standard deviation: z = (X - x̅) / s
  T Scores
  - fifty plus or minus ten scale; a scale with a mean set at 50 and a standard deviation set at 10
  - composed of a scale that ranges from 5 standard deviations below the mean to 5 standard deviations above the mean
  - Advantage: none of the scores is negative
  Stanine
  - a contraction of the words standard and nine
  - a nine-unit scale; the 5th stanine indicates average performance
  Scholastic Aptitude Test (SAT) and Graduate Record Examination (GRE)
  - raw scores on those tests are converted to standard scores such that the resulting distribution has a mean of 500 and a standard deviation of 100
  Linear transformation
  - one that retains a direct numerical relationship to the original raw score
  - the magnitude of differences between such standard scores exactly parallels the differences between corresponding raw scores
  Nonlinear transformation
  - required when the data under consideration are not normally distributed yet comparisons with normal distributions need to be made
  - the resulting standard score does not necessarily have a direct numerical relationship to the original raw score
  - the original distribution is normalized
  Normalized standard scores
  • Normalizing a distribution
    o “stretching” the skewed curve into the shape of a normal curve and creating a corresponding scale of standard scores
    o the resulting scale is technically referred to as a normalized standard score scale
    o desirable for purposes of comparability

Correlation and Inference
The Concept of Correlation
• Coefficient of correlation (or correlation coefficient)
  o number that provides us with an index of the strength of the relationship between two things
  o Correlation: expression of the degree and direction of correspondence between two things
  o coefficient of correlation (r)
    ▪ expresses a linear relationship between two (and only two) variables, usually continuous in nature
    ▪ reflects the degree of concomitant variation between variable X and variable Y
    ▪ numerical index that expresses this relationship: it tells us the extent to which X and Y are “co-related”
  o ranges from +1.00 (indicating a perfect positive relationship) through 0 (indicating no relationship) to −1.00 (indicating a perfect negative relationship)
  o Positive correlation (direct) - two variables simultaneously increase or simultaneously decrease
  o Negative correlation (inverse) - one variable increases while the other variable decreases
  o Zero correlation - no relationship exists between the two variables; that is, absolutely no correlation

TYPES OF CORRELATION
• Pearson r
  o also called the Pearson correlation coefficient or Pearson product-moment coefficient of correlation
  o devised by Karl Pearson
  o can be the statistical tool of choice when the relationship between the variables is linear and when the two variables being correlated are continuous (that is, they can theoretically take any value); used between two interval variables
  o Coefficient of determination (r²): indication of how much variance is shared by the X- and the Y-variables
• Spearman Rho
  o also called the rank-order correlation coefficient or rank-difference correlation coefficient
  o developed by Charles Spearman
  o frequently used when the sample size is small (fewer than 30 pairs of measurements) and especially when both sets of measurements are in ordinal (or rank-order) form
• Multiple regression
  o uses a number of independent variables to predict a single dependent variable
• Phi coefficient
  o shows the sign and magnitude of the correlation between two nominal variables
• Eta correlation
  o used when the relationship between two variables is curvilinear

Graphic Representations of Correlation

• Scatterplot
  o also called a bivariate distribution, scatter diagram, or scattergram
  o simple graphing of the coordinate points for values of the X-variable (placed along the graph’s horizontal axis) and the Y-variable (placed along the graph’s vertical axis)
  o useful because it provides a quick indication of the direction and magnitude of the relationship, if any, between the two variables
  o useful in revealing the presence of curvilinearity in a relationship
  o Curvilinearity: “eyeball gauge” of how curved a graph is
  o Outlier: extremely atypical point located at a relatively long distance (an outlying distance) from the rest of the coordinate points in a scatterplot

Meta-Analysis

- a family of techniques used to statistically combine information across studies to produce single estimates of the data under study
- Effect size
  o the estimates derived from the combined studies
  o typically expressed as a correlation coefficient
- Advantages:
  o more weight can be given to studies that have larger numbers of subjects
  o can be replicated
  o conclusions of meta-analyses tend to be more reliable and precise than the conclusions from single studies
  o there is more focus on effect size rather than statistical significance alone
  o meta-analysis promotes evidence-based practice, which may be defined as professional practice that is based on clinical and research findings
- combines statistically the information across the various studies
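
As a rough illustration of giving more weight to studies that have larger numbers of subjects, the Python sketch below combines correlation-type effect sizes from four hypothetical studies. The study values and the function names are invented for this example. Sample-size weighting and the Fisher z variant are two simple, commonly taught schemes; they are shown as assumptions, not as the only way a meta-analysis is carried out.

```python
import math

# Hypothetical studies: (effect size expressed as r, number of subjects n).
# These values are invented purely for illustration.
studies = [(0.30, 40), (0.45, 120), (0.10, 25), (0.38, 200)]


def weighted_mean_r(studies):
    """Sample-size-weighted mean of the correlation coefficients."""
    total_n = sum(n for _, n in studies)
    return sum(r * n for r, n in studies) / total_n


def fisher_weighted_mean_r(studies):
    """Average r on Fisher's z scale (weights n - 3), then convert back to r."""
    numerator = sum((n - 3) * math.atanh(r) for r, n in studies)
    denominator = sum(n - 3 for _, n in studies)
    return math.tanh(numerator / denominator)


# Simple unweighted average, for comparison with the weighted estimates.
unweighted = sum(r for r, _ in studies) / len(studies)

print("unweighted mean r:     ", round(unweighted, 3))
print("n-weighted mean r:     ", round(weighted_mean_r(studies), 3))
print("Fisher-z weighted mean:", round(fisher_weighted_mean_r(studies), 3))
```

In this made-up data set the two largest studies report the larger correlations, so both weighted estimates are pulled above the unweighted average.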

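To make the correlation coefficients in these notes more concrete, here is a minimal Python sketch of Pearson r, the coefficient of determination (r²), and Spearman rho computed as Pearson r on ranks. The data and all function names are hypothetical; tied scores are given the average of their ranks, which is one common convention.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)


def average_ranks(values):
    """Rank values from 1 (lowest); tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank-order correlation: Pearson r applied to the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical paired scores for five test takers.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
cubed = [value ** 3 for value in x]  # rises with x, but not in a straight line

r = pearson_r(x, y)
print("Pearson r:", round(r, 3), " r squared:", round(r ** 2, 3))
print("Spearman rho:", round(spearman_rho(x, y), 3))
print("x vs x cubed -> Pearson:", round(pearson_r(x, cubed), 3),
      " Spearman:", round(spearman_rho(x, cubed), 3))
```

The last line shows why the choice of coefficient matters: a relationship that always rises but is not a straight line gives a perfect rank-order correlation while Pearson r falls short of 1.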

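As a worked example for the descriptive statistics and standard scores defined earlier in this chapter, the sketch below uses Python's standard `statistics` module on a small set of hypothetical raw scores. The variance follows the definition in these notes (the arithmetic mean of the squared deviations), so the population forms `pvariance` and `pstdev` are used. Quartile conventions differ between textbooks and software, so the Q1 and Q3 values depend on the method `statistics.quantiles` uses by default. The helper name `rescale` is made up for this example.

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical raw scores

# Measures of central tendency
mean = statistics.mean(scores)
median = statistics.median(scores)
modes = statistics.multimode(scores)  # a list, so a bimodal set gives two values

# Measures of variability
score_range = max(scores) - min(scores)
q1, q2, q3 = statistics.quantiles(scores, n=4)  # quartile dividing points
iqr = q3 - q1
semi_iqr = iqr / 2
average_deviation = sum(abs(x - mean) for x in scores) / len(scores)
variance = statistics.pvariance(scores)  # mean of the squared deviations
sd = statistics.pstdev(scores)           # square root of the variance


def z_score(x):
    """Standard deviation units that x lies above or below the mean."""
    return (x - mean) / sd


def rescale(x, new_mean, new_sd):
    """Linear transformation of a raw score onto a scale with a set mean and SD."""
    return new_mean + new_sd * z_score(x)


print("mean", mean, "median", median, "mode(s)", modes)
print("range", score_range, "IQR", iqr, "semi-IQR", semi_iqr)
print("average deviation", average_deviation, "variance", variance, "SD", sd)
for raw in (2, 5, 9):
    print(raw, "-> z", z_score(raw),
          "T", rescale(raw, 50, 10),
          "mean-500/SD-100 scale", rescale(raw, 500, 100))
```

Because every conversion here is linear, differences between the converted scores stay proportional to differences between the raw scores; normalizing a skewed distribution would need a nonlinear step that this sketch does not attempt.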