Statistics Intro: Univariate Analysis Central Tendency Dispersion
Statistics Intro: Univariate Analysis Central Tendency Dispersion
Univariate Analysis
Central Tendency
Dispersion
Review of Descriptive
Stats.
Descriptive Statistics are used to
present quantitative descriptions
in a manageable form.
This method works by reducing
lots of data into a simpler
summary.
Example:
Batting average in baseball
Cornell’s grade-point system
Univariate Analysis
This is the examination across cases of
one variable at a time.
Frequency distributions are used to
group data.
One may set up margins that allow us to
group cases into categories.
Examples include
Age categories
Price categories
Temperature categories.
Distributions
Two ways to describe a univariate
distribution
A table
A graph (histogram, bar chart)
Distributions (con’t)
Distributions may also be
displayed using percentages.
For example, one could use
percentages to describe the
following:
Percentage of people under the
poverty level
Over a certain age
Over a certain score on a
Distributions (cont.)
A Frequency Distribution Table
Category Percent
Under 35 9%
36-45 21
46-55 45
56-65 19
66+ 6
Distributions (cont.)
A Histogram
45
40
35
30
25
20
Percent
15
10
5
0
Under
36-45
46-55
56-65
66+
35
Central Tendency
An estimate of the “center” of a
distribution
Three different types of estimates:
Mean
Median
Mode
Mean
The most commonly used method
of describing central tendency.
One basically totals all the results
and then divides by the number of
units or “n” of the sample.
Example: The HSS 292 Quiz 1
mean was determined by the sum
of all the scores divided by the
number of students taking the
Working Example (Mean)
Lets take the set of scores:
15,20,21,20,36,15, 25,15
The Mean would be 167/8=20.875
Median
The median is the score found at
the exact middle of the set.
One must list all scores in
numerical order and then locate
the score in the center of the
sample.
Example: If there are 500 scores in
the list, score #250 would be the
median.
This is useful in weeding out
Working Example (Median)
Lets take the set of scores:
15,20,21,20,36,15, 25,15
First line up the scores.
15,15,15,20,20,21,25,36
The middle score falls at 20. There
are 8 scores, and score #4 and #5
represent the halfway point.
Mode
The mode is the most repeated
score in the set of results.
Lets take the set of scores:
15,20,21,20,36,15, 25,15
Again we first line up the scores
15,15,15,20,20,21,25,36
15 is the most repeated score and
is therefore labeled the mode.
Central Tendency
If the distribution is normal (i.e.,
bell-shaped), the mean, median
and mode are all equal.
In our analyses, we’ll use the
mean.
Dispersion
Two estimates types:
Range
Standard deviation
Standard deviation is more
accurate/detailed because an
outlier can greatly extend the
range.
Range
The range is used to identify the
highest and lowest scores.
Lets take the set of
scores:15,20,21,20,36,15, 25,15.
The range would be 15-36. This
identifies the fact that 21 points
separates the highest to the lowest
score.
Standard Deviation
The standard deviation is a value
that shows the relation that
individual scores have to the mean
of the sample.
If scores are said to be
standardized to a normal curve,
there are several statistical
manipulations that can be
performed to analyze the data set.
Standard Dev. (con’t)
Assumptions may be made about the
percentage of scores as they deviate
from the mean.
If scores are normally distributed, one
can assume that approximately 69% of
the scores in the sample fall within one
standard deviation of the mean.
Approximately 95% of the scores would
then fall within two standard deviations
of the mean.
Standard Dev. (con’t)
The standard deviation calculates
the square root of the sum of the
squared deviations from the mean
of all the scores, divided by the
number of scores.
This process accounts for both
positive and negative deviations
from the mean.
Working Example (stand.
dev.)
Lets take the set of scores
15,20,21,20,36,15, 25,15.
The mean of this sample was found to
be 20.875. Round up to 21.
Again we first line up the scores.
15,15,15,20,20,21,25,36.
21-15=6, 21-15=6, 21-15=6,20-21=-
1,20-21=-1, 21-21=0, 21-25=-4, 36-
21=15.
Working Ex. (Stan. dev.
con’t)
Square these values.
36,36,36,1,1,0,16,225.
Total these values. 351.
Divide 351 by 8. 43.8
Take the square root of 43.8. 6.62
6.62 is your standard deviation.