Draswa Chapter2descriptivestatistics 28week3!29!281 29


Chapter 2

Descriptive Statistics
Measures of Central Tendency

Measure of central tendency


• A value that represents a typical, or central, entry of a
data set.
• Most common measures of central tendency:
   - Mean
   - Median
   - Mode
Mean
The mean of a data set is the sum of the data entries
divided by the number of entries.
Population mean (“mu”): $\mu = \frac{\sum x}{N}$
Sample mean (“x-bar”): $\bar{x} = \frac{\sum x}{n}$

Example:


The following are the ages of all seven employees of a small
company:
53 32 61 57 39 44 57
Calculate the population mean.

Add the ages and divide by 7:
$\mu = \frac{\sum x}{N} = \frac{343}{7} = 49$ years
The mean age of the employees is 49 years.
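The same calculation can be sketched in a few lines of Python. The function name `population_mean` is an illustrative choice, not something from the slides.

```python
# Ages of all seven employees (the whole population).
ages = [53, 32, 61, 57, 39, 44, 57]

def population_mean(values):
    """Sum of the data entries divided by the number of entries."""
    return sum(values) / len(values)

mean_age = population_mean(ages)
print(sum(ages))   # 343
print(mean_age)    # 49.0
```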
Median
The median of a data set is the value that lies in the
middle of the data when the data set is ordered.

Example:
Calculate the median age of the seven employees.
53 32 61 57 39 44 57
To find the median, sort the data.
32 39 44 53 57 57 61
The median age of the employees is 53 years.
Computing the Median

If the data set has an:

• odd number of entries: the median is the middle data entry:
2 5 6 11 13

The median is the exact middle value: 6.

• even number of entries: the median is the mean of the two middle data entries:
2 5 6 7 11 13

The median is the mean of the two middle numbers: $\frac{6 + 7}{2} = 6.5$
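A minimal sketch of the odd/even rule, assuming the data fit in a list. The helper name `median` is illustrative. Python's standard `statistics.median` behaves the same way.

```python
def median(values):
    """Middle entry of the sorted data, or the mean of the two middle entries."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of entries: take the single middle entry.
        return ordered[mid]
    # Even number of entries: average the two middle entries.
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([2, 5, 6, 11, 13]))              # 6
print(median([2, 5, 6, 7, 11, 13]))           # 6.5
print(median([53, 32, 61, 57, 39, 44, 57]))   # 53
```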
Mode

The mode of a data set is the data entry that occurs with
the greatest frequency. If no entry is repeated, the data
set has no mode.

If two entries occur with the same greatest frequency, each entry is a mode and the data
set is called bimodal.

Example:
Find the mode of the ages of the seven employees.
53 32 61 57 39 44 57
The mode is 57 because it occurs the most times.
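One possible way to find the mode in Python is to count how often each entry occurs. The helper `modes` below is a hypothetical name. It returns an empty list when no entry is repeated and two values for a bimodal data set.

```python
from collections import Counter

def modes(values):
    """Return every entry that occurs with the greatest frequency."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        # No entry is repeated, so the data set has no mode.
        return []
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([53, 32, 61, 57, 39, 44, 57]))  # [57]
print(modes([1, 1, 2, 2, 3]))               # [1, 2]  (bimodal)
print(modes([1, 2, 3]))                     # []      (no mode)
```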
Outlier
An outlier is a data entry that is far removed from the
other entries in the data set.
Example:
A 29-year-old employee joins the company and the ages of the
employees are now:
53 32 61 57 39 44 57 29
Recalculate the mean, the median, and the mode. Which measure
of central tendency was affected when this new age was added?

Mean = 46.5
Median = 48.5
Mode = 57

The mean takes every value into account, but is affected by the outlier.
The median and mode are less sensitive to extreme values.
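To compare the three measures before and after the new employee joins, a short sketch using Python's standard `statistics` module may help. The `summarize` helper is an illustrative name, not part of the slides.

```python
import statistics

before = [53, 32, 61, 57, 39, 44, 57]
after = before + [29]  # the 29-year-old joins the company

def summarize(values):
    """Collect the three measures of central tendency in one dictionary."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
    }

for label, data in (("before", before), ("after", after)):
    print(label, summarize(data))
```

Printing both summaries side by side shows how much each measure moves when the new age is added.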
Ungrouped data

Grouped data
with frequency
distribution

Mean of a Frequency Distribution for grouped data

The mean of a frequency distribution for a sample is approximated by
$\bar{x} = \frac{\sum (x \cdot f)}{n}$, noting that $n = \sum f$,
where x and f are the midpoints and frequencies of the classes.
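A small sketch of the grouped-mean formula follows. The midpoints and frequencies are made-up illustrative numbers, not data from the slides, and `grouped_mean` is a hypothetical helper name.

```python
def grouped_mean(midpoints, frequencies):
    """Approximate mean: sum of (midpoint * frequency) divided by n = sum of f."""
    n = sum(frequencies)
    return sum(x * f for x, f in zip(midpoints, frequencies)) / n

# Hypothetical classes with midpoints 5, 15, 25 and frequencies 2, 5, 3.
midpoints = [5, 15, 25]
frequencies = [2, 5, 3]

print(sum(frequencies))                      # 10
print(grouped_mean(midpoints, frequencies))  # 16.0
```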
Median – grouped data
Mode – grouped data
The Shape of Distributions
The shape of your data and the existence of any outliers:
Measures of Dispersion/Variation
Measures of Variation (“Spread”)

Another important characteristic of quantitative data is how much the data varies, or is spread out.

The two most common methods of measuring spread are:

1. Range
2. Standard deviation and variance
Range
The range of a data set is the difference between the maximum and
minimum data entries in the set. The data must be quantitative.

Range = (Maximum data entry) – (Minimum data entry)

Example:
The following data are the closing prices for a certain stock
on ten successive Fridays. Find the range.

Stock 56 56 57 58 61 63 63 67 67 67

The range is 67 – 56 = 11.
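In Python the range calculation reduces to one subtraction. The name `data_range` is an illustrative choice that avoids shadowing the built-in `range`.

```python
# Closing prices on ten successive Fridays.
prices = [56, 56, 57, 58, 61, 63, 63, 67, 67, 67]

def data_range(values):
    """Maximum data entry minus minimum data entry."""
    return max(values) - min(values)

print(data_range(prices))  # 11
```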


Deviation
The deviation of an entry x in a population data set is the difference
between the entry and the mean µ of the data set.
Deviation of x = x – µ
∑ x – µ /n
Example:
The following data are the closing prices for a certain stock on five
successive Fridays. Find the deviation of each price.

The mean stock price is µ = 305/5 = 61.

Stock x     Deviation x – µ
56          56 – 61 = –5
58          58 – 61 = –3
61          61 – 61 = 0
63          63 – 61 = 2
67          67 – 61 = 6
Σx = 305    Σ(x – µ) = 0
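The deviations for the five stock prices can be reproduced with a short sketch. The variable names are illustrative.

```python
# Closing prices on five successive Fridays (treated as a population).
prices = [56, 58, 61, 63, 67]

mu = sum(prices) / len(prices)          # population mean
deviations = [x - mu for x in prices]   # deviation of each entry: x - mu

print(mu)               # 61.0
print(deviations)       # [-5.0, -3.0, 0.0, 2.0, 6.0]
print(sum(deviations))  # 0.0
```

The deviations sum to zero here, which is why they cannot be averaged directly to measure spread.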

*deviation for ungrouped data


Population Variance and Standard Deviation

• For Ungrouped Data


The population variance of a population data set of N entries is
Population variance = $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$ (“sigma squared”).

The population standard deviation of a population data set of N
entries is the square root of the population variance.
Population standard deviation = $\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum (x - \mu)^2}{N}}$ (“sigma”).
Population Variance and Standard Deviation

• For Grouped Data


The population variance of a population data set of N entries is

Population variance = $\sigma^2 = \frac{\sum x^2 f - \frac{(\sum x f)^2}{N}}{N}$ (“sigma squared”)

The population standard deviation of a population data set of N entries
is the square root of the population variance.

Population standard deviation = $\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum x^2 f - \frac{(\sum x f)^2}{N}}{N}}$ (“sigma”)
Finding the Population Standard
Deviation
Example:
The following data are the closing prices for a certain stock on five
successive Fridays. The population mean is 61. Find the population
standard deviation.
Stock x     Deviation x – µ     Squared (x – µ)²
56          –5                  25
58          –3                  9
61          0                   0
63          2                   4
67          6                   36
Σx = 305    Σ(x – µ) = 0        Σ(x – µ)² = 74

The squared deviations are never negative.

$\sigma^2 = \frac{\sum (x - \mu)^2}{N} = \frac{74}{5} = 14.8$
$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}} = \sqrt{14.8} \approx 3.85$
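The stock-price example can be checked with a brief sketch. The function names are illustrative. Python's standard `statistics.pvariance` and `statistics.pstdev` compute the same population quantities.

```python
import math

prices = [56, 58, 61, 63, 67]

def population_variance(values):
    """Mean of the squared deviations from the population mean."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)

def population_std_dev(values):
    """Square root of the population variance."""
    return math.sqrt(population_variance(values))

print(population_variance(prices))           # 14.8
print(round(population_std_dev(prices), 2))  # 3.85
```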
Finding the Population Standard
Deviation

Class Interval    Frequency
10-19.9           3
20-29.9           5
30-39.9           7
40-49.9           2

Range = ?

*grouping with interval, standard deviation for grouping data
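As a rough sketch of how the grouped-data standard deviation might be worked out for this class table, the code below makes two assumptions that the slides do not state: each class midpoint is taken as (lower limit + upper limit) / 2, and the four classes are treated as the whole population, so N is the sum of the frequencies. It uses the computational form of the variance, (Σx²f − (Σxf)²/N) / N. The function name is hypothetical.

```python
import math

# Class limits and frequencies from the table.
classes = [(10, 19.9), (20, 29.9), (30, 39.9), (40, 49.9)]
frequencies = [3, 5, 7, 2]

# Assumption: midpoint = (lower limit + upper limit) / 2.
midpoints = [(low + high) / 2 for low, high in classes]

def grouped_population_variance(xs, fs):
    """Population variance from class midpoints xs and frequencies fs."""
    n = sum(fs)                                        # N = sum of f
    sum_xf = sum(x * f for x, f in zip(xs, fs))        # sum of x * f
    sum_x2f = sum(x * x * f for x, f in zip(xs, fs))   # sum of x^2 * f
    return (sum_x2f - sum_xf ** 2 / n) / n

variance = grouped_population_variance(midpoints, frequencies)
std_dev = math.sqrt(variance)
print(round(variance, 2), round(std_dev, 2))  # 83.74 9.15
```

A different midpoint convention (for example, class boundaries of 10 to 20 with midpoint 15) shifts the mean slightly but leaves the variance unchanged, because every midpoint moves by the same amount.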


Sample Variance and Standard Deviation
Solution 1

*Single value grouping


Solution 2
Solution
