Lecture3B Slides

The Mean

• The mean is the average value of the data
(i.e. of a variable).
• The sample mean for the variable X is
denoted by $\bar{X}$ (read "X-bar").
• The population mean is denoted by the
Greek letter μ (mu).
• The sample mean for raw (ungrouped) data is
calculated as the sum of all the values of a
particular variable in the sample (i.e. all the
observed values added together) divided by
the total number of observations in the
sample (i.e. n). This formula can be
expressed mathematically as:
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$
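• The sample mean for frequency (grouped)
data is calculated as the sum of each value
multiplied by its frequency, divided by the
total number of observations in the sample.
This formula can be expressed
mathematically as:
$$\bar{X} = \frac{\sum_{j=1}^{k} f_j X_j}{n}, \qquad n = \sum_{j=1}^{k} f_j$$
where $f_j$ is the frequency of the value $X_j$
and $k$ is the number of distinct values.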
• Advantages: the sample mean is very easy to
calculate; it has desirable properties in
relation to its use as an estimate of the
population mean.
• Disadvantages: the mean can only be
calculated for numerical data measured on an
interval or ratio scale; it is influenced by
outliers (extreme values) and should not be
used for skewed data; the actual value of the
mean may not exist in the data as a data point.
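The two calculations described above, and the sensitivity of the mean to outliers, can be sketched in Python. This is a minimal illustration, not part of the original slides: the function names `sample_mean` and `grouped_mean` and the scores are hypothetical, chosen only to show that the raw and frequency forms give the same answer.

```python
from statistics import median

def sample_mean(values):
    """Raw (ungrouped) data: sum of the observations divided by n."""
    return sum(values) / len(values)

def grouped_mean(freq_table):
    """Frequency (grouped) data: sum of value * frequency divided by n."""
    n = sum(freq_table.values())  # n is the sum of the frequencies
    return sum(x * f for x, f in freq_table.items()) / n

# Hypothetical scores, invented for illustration only.
scores = [2, 3, 3, 4, 4, 4, 5]
freq = {2: 1, 3: 2, 4: 3, 5: 1}  # the same scores as a frequency table

raw = sample_mean(scores)      # 25 / 7
grouped = grouped_mean(freq)   # (2*1 + 3*2 + 4*3 + 5*1) / 7 = 25 / 7

# One extreme value shifts the mean but leaves the median unchanged here.
with_outlier = scores + [40]
mean_shift = sample_mean(with_outlier) - raw
median_shift = median(with_outlier) - median(scores)
```

In this made-up sample the single outlier moves the mean by more than four points while the median stays at 4, which is the behaviour the disadvantage above refers to.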
EXAMPLE

                 Age     Impulsivity Scores
N     Valid      200     200
      Missing    0       0
Mean             27.23   23.68
Median           25.50   23.00
Mode             23      22
Application
• The mean is easy to compute and interpret for
numeric data and is the preferred measure of
central tendency for statistical inference.
• However, the mean has the disadvantage of
being unduly influenced by a few very small or
very large measurements (outliers or extreme
values). It should therefore be used with caution
when the distribution of the observations is not
symmetric.
• When the frequency distribution is skewed
(in any direction) it is preferable to use the
median as a measure of central tendency
rather than the mean, as the median is not
easily affected by extreme values in the data.
• The appropriate measure of central tendency
used to describe a distribution depends on
the scale of measurement.
Scale of Measurement   Mode   Median   Mean
Nominal                Yes    No       No
Ordinal                Yes    Yes      No
Interval               Yes    Yes      Yes
Ratio                  Yes    Yes      Yes
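
The table can also be written as a small lookup, sketched below in Python. The names `PERMITTED`, `suitable_measures` and `preferred_measure` are hypothetical. The preference rule encodes the guidance in the bullets above (mean for symmetric interval or ratio data, median when the distribution is skewed). Choosing the median over the mode for ordinal data is an added assumption, not something the slides state.

```python
# Measures of central tendency permitted for each scale of measurement,
# copied from the table in the slides.
PERMITTED = {
    "nominal": ("mode",),
    "ordinal": ("mode", "median"),
    "interval": ("mode", "median", "mean"),
    "ratio": ("mode", "median", "mean"),
}

def suitable_measures(scale):
    """Return every measure that may be calculated for the given scale."""
    return PERMITTED[scale.lower()]

def preferred_measure(scale, skewed=False):
    """Pick one measure from those permitted for the scale.

    Assumption: when several measures are permitted, the mean is chosen
    unless the data are skewed, then the median, then the mode.
    """
    measures = suitable_measures(scale)
    if "mean" in measures and not skewed:
        return "mean"
    if "median" in measures:
        return "median"
    return "mode"
```

For example, `preferred_measure("ratio", skewed=True)` returns `"median"`, while `suitable_measures("nominal")` returns only the mode.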


Exercise 1
• A researcher was interested in measuring
levels of depression in a sample of university
students. She collected data and obtained
the following output:
              Depression
N   Valid     193
    Missing   0
Mean          36.29
Median        37.00
Mode          42
Figure 1: Histogram for depression
a) How many students participated in the study,
i.e. what was the sample size?
b) What was the average level of depression in the
sample?
c) What was the most common depression score in
the sample?
d) What was the median score for depression in
the sample?
e) Based on the data in the table and the
histogram, how would you describe the shape of
the data for depression?
f) Based on your answer for e), which measure of
central tendency would be the most appropriate
to calculate for the data?
Exercise 2
Identify the scale of measurement for each of the
variables listed below. Based on this identification,
indicate which measure(s) of central tendency
would be suitable to calculate for the variable.
• The amount of time taken to respond to a
question (measured in seconds)
• T-shirt size (small, medium, large, extra-large)
• Self-esteem scores (measured using a standard
psychometric test)
• Paint colour (white, beige, grey, brown)
• The number of people attending a concert
(based on headcount)
