Descriptive Stats
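[Figure: normal curve f(X) plotted against X, with the mean, median and mode marked at the centre]
“Bell Shaped”
Symmetrical
Mean, Median and Mode are Equal
Interquartile Range Equals about 1.33 σ (more precisely, about 1.35 σ)
Random Variable Has Infinite Range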
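The properties listed for the bell-shaped curve can be checked numerically. The sketch below uses Python's standard `statistics.NormalDist` with an assumed mean of 0 and standard deviation of 1 (any other values would give the same ratio). It suggests the interquartile range is close to 1.35 standard deviations, which the slide quotes as roughly 1.33.

```python
from statistics import NormalDist

# Assumed parameters for illustration: a standard normal distribution.
dist = NormalDist(mu=0.0, sigma=1.0)

# Quartiles are the points below which 25% and 75% of the values fall.
q1 = dist.inv_cdf(0.25)
q3 = dist.inv_cdf(0.75)

# Interquartile range expressed in units of the standard deviation.
iqr_in_sigmas = (q3 - q1) / dist.stdev

print(f"Q1 = {q1:.4f}, Q3 = {q3:.4f}")
print(f"IQR = {iqr_in_sigmas:.3f} standard deviations")

# For a normal distribution the mean, median and mode coincide.
print(dist.mean, dist.median, dist.mode)
```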
[Figure: pie chart of Number of People, with segments Green, Blue and Brown]
Good for categorical data
Not good for too many segments
Pull out a segment for emphasis
Easy to construct from Excel
Conclusions
https://www.youtube.com/watch?time_continue=79&v=EqeVXI4WNHM
Measures of location, variability and shape
Measures of location (measures of central tendency)
Mean (average)
\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i
Where
▪ X_i = the i-th observed value of variable X
▪ n = number of observations
Mode
The value that occurs most frequently
Median
Middle value when the data are arranged in ascending or descending order
Calculating the mean = average
\bar{x} = \frac{\sum x}{n} = \frac{6500 + 6500 + 6500 + 6500 + 10500}{5} = \frac{36500}{5} = £7300
The Median and the Mode
This list is already in order:
£6500 £6500 £6500 £6500 £10500
The middle one is the third value
median = £6500
The most frequently occurring value is the salary of
£6500
Mode = £6500
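The three measures of location for the salary list can be reproduced in a few lines. This is a minimal sketch using Python's standard `statistics` module; the variable names are chosen here for illustration only.

```python
from statistics import mean, median, mode

# Salaries from the worked example, in pounds.
salaries = [6500, 6500, 6500, 6500, 10500]

salary_mean = mean(salaries)      # sum of the values divided by n
salary_median = median(salaries)  # middle value of the sorted list
salary_mode = mode(salaries)      # most frequently occurring value

print(f"mean = £{salary_mean}")
print(f"median = £{salary_median}")
print(f"mode = £{salary_mode}")
```

The single high salary pulls the mean above the median and the mode, which is why the three measures disagree here.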
Exercise
Determine the mean and the median from the following data.
The weekly pay (x-variable) of a sample of 6 workers is as follows:
€220, €220, €180, €215, €208, €207
Determine the mode from the following data.
The attendance at five mathematics tutorials is as follows:
15, 18, 17, 17, 20
Second Moment
The spread of the data
Measures of variability
Range (max - min)
Variance: measures how widely the data are spread around the mean
▪ The difference between the mean and an observed value is
called the deviation from the mean
▪ When the datapoints are clustered around the mean the
variance is small
Standard deviation
▪ Square root of the variance
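The measures of variability can be illustrated with the salary data from the earlier mean example. The slides do not say whether the variance divides by n (population) or by n - 1 (sample), so this sketch shows both as an assumption; `pvariance`/`pstdev` and `variance`/`stdev` are the standard-library names for the two conventions.

```python
from statistics import mean, pvariance, pstdev, variance, stdev

# Salaries from the earlier worked example, in pounds.
salaries = [6500, 6500, 6500, 6500, 10500]

# Range: largest value minus smallest value.
data_range = max(salaries) - min(salaries)

# Deviation from the mean for each observation.
centre = mean(salaries)
deviations = [x - centre for x in salaries]

# Population convention: divide the sum of squared deviations by n.
pop_var = pvariance(salaries)
pop_sd = pstdev(salaries)

# Sample convention: divide by n - 1 instead.
samp_var = variance(salaries)
samp_sd = stdev(salaries)

print("range:", data_range)
print("deviations:", deviations)
print("population variance and standard deviation:", pop_var, pop_sd)
print("sample variance and standard deviation:", samp_var, round(samp_sd, 2))
```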
Normal (Gaussian) Distribution
Consider the distributions in the figure. The tapering sides are called tails, and they provide a visual means of determining which of the two kinds of skewness a distribution has:
1. Positive skew: The right tail is longer; the mass of the distribution is concentrated on the left of the figure. The distribution is said to be right-skewed. An example is an income distribution, in which a few very high incomes stretch the right tail.
2. Negative skew: The left tail is longer; the mass of the distribution is concentrated on the right of the figure. The distribution is said to be left-skewed.
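The direction of skew can also be read from a number rather than a picture. The sketch below computes one common coefficient, the third central moment divided by the cube of the population standard deviation. The choice of this particular coefficient and the helper name `skewness` are assumptions made for illustration. A positive result indicates a longer right tail and a negative result a longer left tail.

```python
def skewness(values):
    """Moment-based skewness: third central moment over (second central moment) ** 1.5."""
    n = len(values)
    centre = sum(values) / n
    m2 = sum((x - centre) ** 2 for x in values) / n  # population variance
    m3 = sum((x - centre) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Salary data from the earlier example: one high value stretches the right tail.
salaries = [6500, 6500, 6500, 6500, 10500]
right_skew = skewness(salaries)

# Mirroring the data flips the long tail to the left, so the sign flips too.
left_skew = skewness([-x for x in salaries])

print(f"salaries: {right_skew:.2f}")
print(f"mirrored: {left_skew:.2f}")
```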
Parametric vs. Non-parametric tests
Parametric
Ratio or Interval scales
Large samples
More powerful
Stringent assumptions
Non-parametric tests
Nominal or ordinal scales
Small samples
Fewer assumptions
Non-parametric counterparts exist for many parametric techniques
Not as powerful/less sensitive
Scatter plot
                 Gender
Internet Usage   Male   Female   Row Total
Light               5       10          15
Heavy              10        5          15
Column Total       15       15          30
Internet usage by gender
                 Gender
Internet Usage   Male     Female
Light            33.3%    66.7%
Heavy            66.7%    33.3%
Column Total     100%     100%
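The percentage table follows from the count table by dividing each cell by its column total. The sketch below shows that step in plain Python. The label of the second usage row is missing from the extracted table, so "Heavy" is an assumed label, and the dictionary layout is chosen here only for illustration.

```python
# Counts from the cross-tabulation of internet usage by gender.
# "Heavy" is an assumed label for the second row.
counts = {
    "Light": {"Male": 5, "Female": 10},
    "Heavy": {"Male": 10, "Female": 5},
}

genders = ["Male", "Female"]

# Column totals: number of respondents of each gender.
column_totals = {g: sum(row[g] for row in counts.values()) for g in genders}

# Column percentages: each cell as a share of its gender's total.
column_pct = {
    usage: {g: round(100 * row[g] / column_totals[g], 1) for g in genders}
    for usage, row in counts.items()
}

for usage, row in column_pct.items():
    print(usage, {g: f"{p}%" for g, p in row.items()})
```

Percentages are taken down the columns here because the comparison of interest is between genders; dividing by the row totals instead would answer a different question, namely the gender mix within each usage level.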
Any questions?