Statistics Notes

This document discusses different measures of central tendency (mean, median, mode) and dispersion (range, mean absolute deviation, variance, standard deviation) that can be used to describe data distributions. It provides examples of calculating these measures for various data sets and an assignment involving calculating several of these measures for additional data sets.

Uploaded by

Ahmed hassan
Copyright
© All Rights Reserved

Mean Median and Mode

The big question:

Which average is better? Which type of average suits which type of data?
Which type of average represents the data correctly?

To answer this, please refer to the notes provided earlier and also self-study the concepts
given below.
If anyone doesn’t understand the concepts after studying them, please discuss them with me!

Self-Study:
Symmetric Distribution
Skewed Distribution
Positively Skewed Data (Distribution)
Negatively Skewed Data (Distribution)

We have now studied how to calculate the middle of the data, that is, one value that can represent
the whole set (or distribution). These averages are called measures of location. Next we need to
see how variable the data is, that is, how spread out it is. To study that, we focus on measures
of spread (dispersion of the data).

Measure of spread
Let’s take an example:
You have enrolled in an off-road race in which your jeep must wade through several rivers. The
average depth of the water is quoted as 3 feet. Your jeep has a quoted wading height of
3.5 feet. Is it safe to enter the water?
You probably should not cross on this information alone.
You need to know the maximum and minimum depth of the water to make sure that the
maximum does not exceed the 3.5 feet your jeep can wade through.
There are two possible cases:
1. The depth ranges from 1 foot to 5 feet. The average depth can still be 3 feet, but your
jeep cannot wade through the deepest sections, and the engine will probably be
hydro-locked.
2. The depth ranges from 2.5 feet to 3.5 feet. In this case your jeep can wade through
the water and take part in the competition.
This is why we need other information in addition to the measure of location.
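The point of the jeep example can be sketched in a few lines of Python. The depth readings below are invented purely for illustration (the notes give only the ranges), and `is_safe` is a hypothetical helper, not part of the course material. Both crossings average 3 feet, yet only one stays within the wading height.

```python
from statistics import mean

WADING_HEIGHT = 3.5  # jeep's quoted wading height, in feet

# Hypothetical depth readings in feet, chosen so that each set averages 3 feet.
crossing_1 = [1.0, 2.0, 3.0, 4.0, 5.0]      # depth ranges from 1 to 5 feet
crossing_2 = [2.5, 2.75, 3.0, 3.25, 3.5]    # depth ranges from 2.5 to 3.5 feet


def is_safe(depths, wading_height):
    """A crossing is safe only if its deepest point is within the wading height."""
    return max(depths) <= wading_height


for name, depths in [("Crossing 1", crossing_1), ("Crossing 2", crossing_2)]:
    print(
        f"{name}: mean = {mean(depths):.2f} ft, "
        f"min = {min(depths)} ft, max = {max(depths)} ft, "
        f"safe = {is_safe(depths, WADING_HEIGHT)}"
    )
```

The mean alone cannot tell the two crossings apart; the minimum and maximum can.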

The dispersion, or measure of spread, combined with the measure of location helps us
understand the data fully. In our course we will study the following measures of spread:
1. The range
2. The mean absolute deviation (MAD)
3. Variance
4. Standard deviation
Some properties of dispersion:
Reasons to study measures of dispersion:
1. A small value of a measure of dispersion indicates that the data are clustered closely
around the arithmetic mean. The mean is therefore considered representative of the data.
Conversely, a large measure of dispersion indicates that the mean is not a reliable
representative.
2. A measure of spread helps us compare two or more distributions. Suppose that an LCD
manufacturer has two plants with similar mean hourly outputs. Judged by the means
alone, the plants look alike, but this may be misleading: in one plant every shift may
produce close to the average, while in the second plant the first shift’s output is poor
and the second shift works well ahead of the mean. We need to know the range to
understand which mean is representative and which factory is working better.

The Range:

The measure of spread most closely associated with the mode is the range, since both are
relatively easy and quick to calculate. They are well suited to an initial exploration of the data.

Range = highest – lowest


Example:
A recently retired couple are considering investing their pension lump sums in the purchase of a
small shop. Two suitably sited premises, A and B, are discovered.
The average weekly takings of the two shops are quoted as £1,050 and £1,080 for A and B,
respectively.
Upon further investigation, the investors discover that the averages quoted come from the
following recent weekly takings figures:

Shop A: £1,120 £990 £1,040 £1,030 £1,105 £1,015
Shop B: £1,090 £505 £915 £1,005 £2,115 £850

Advise the couple.

~ The solution will be provided later, once you have tried it.
~ You simply need to calculate the range.
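Once you have tried the exercise by hand, a short Python sketch like the one below can be used to check your figures. It applies Range = highest – lowest to the weekly takings quoted above; the function name `value_range` is just an illustrative choice.

```python
def value_range(values):
    """Range = highest value - lowest value."""
    return max(values) - min(values)


# Weekly takings in pounds, copied from the example above.
shop_a = [1120, 990, 1040, 1030, 1105, 1015]
shop_b = [1090, 505, 915, 1005, 2115, 850]

for name, takings in [("Shop A", shop_a), ("Shop B", shop_b)]:
    average = sum(takings) / len(takings)
    print(f"{name}: mean = £{average:,.0f}, range = £{value_range(takings):,}")
```

Comparing the two ranges side by side with the two means is what lets you advise the couple.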

The Mean Absolute Deviation (MAD)


The defect of the range is that it is based on only two values, the highest and the lowest; it
does not take all the values into consideration.

If the mean is the average being used, then one very good way of measuring the amount of
variability in the data is to calculate how far, on average, the values differ from the mean.

$$\mathrm{MAD} = \frac{\sum |X - \bar{X}|}{n}$$

$X$ = value of each observation
$\bar{X}$ = arithmetic mean of the values
$n$ = number of observations in the sample
$|\;|$ indicates the absolute value.

Examples:
1. Calculate the mean absolute deviation for “Shop A” in the example above.
The arithmetic mean of the sample is £1,050.
2. The table below shows the number of cappuccinos sold at the Starbucks in the Orange
County airport and the Ontario, California, airport between 4 and 5 pm for a sample of
five days last month.
Determine the mean, median, range and MAD for each location.

California Airports
Orange County    Ontario
20               20
40               49
50               50
60               51
80               80
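As a way of checking the second example, the MAD formula can be written as a small Python function. This is a minimal sketch that uses only the standard library; `mean_absolute_deviation` is an illustrative name, and the data are the two airport samples above.

```python
from statistics import mean, median


def mean_absolute_deviation(values):
    """MAD = sum of |X - mean| over all observations, divided by n."""
    centre = mean(values)
    return sum(abs(x - centre) for x in values) / len(values)


# Cappuccinos sold between 4 and 5 pm on five sampled days.
orange_county = [20, 40, 50, 60, 80]
ontario = [20, 49, 50, 51, 80]

for name, sales in [("Orange County", orange_county), ("Ontario", ontario)]:
    print(
        f"{name}: mean = {mean(sales)}, median = {median(sales)}, "
        f"range = {max(sales) - min(sales)}, "
        f"MAD = {mean_absolute_deviation(sales):.1f}"
    )
```

Both locations share the same mean (50), median (50) and range (60), but the MAD is 16 for Orange County and 12.4 for Ontario. The MAD picks up a difference in spread that the range misses, because it uses every observation.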

Variance and Standard Deviation


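Variance:
The arithmetic mean of the squared deviations from the mean.
Denoted by “$S^2$”.

$$S^2 = \frac{\sum (X - \bar{X})^2}{n} = \frac{\sum f x^2}{\sum f} - \left( \frac{\sum f x}{\sum f} \right)^2$$

The second form is used when the data are given as a frequency distribution, where $f$ is the
frequency of each value (or class midpoint) $x$.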
Steps to calculate Variance:
1. Calculate arithmetic mean
2. Calculate difference of every observation and mean
3. Square the differences
4. Sum the squares
5. Divide the sum of squares by the total number of observations.

Standard deviation
The square root of the variance.
Denoted by “$S$”.

$$S = \sqrt{S^2} = \sqrt{\frac{\sum (X - \bar{X})^2}{n}}$$

Steps to calculate the standard deviation:
1. Carry out all the steps for the variance.
2. Take the square root of the variance.

Example:
The number of traffic citations issued last year, by month, in Beaufort County, South Carolina, is
reported below:

Month      Jan  Feb  Mar  Apr  May  June  July  Aug  Sep  Oct  Nov  Dec
Citations  19   18   22   18   28   34    45    39   38   44   34   10

Determine the population variance.
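The steps for the variance and standard deviation listed above translate almost line by line into Python. The sketch below is one possible way to check a hand calculation for the citations example. It divides by n, as in the formulas in these notes, and the function names are illustrative.

```python
from math import sqrt


def population_variance(values):
    """Follow the five steps for the variance, dividing by n."""
    n = len(values)
    mean = sum(values) / n                       # step 1: arithmetic mean
    deviations = [x - mean for x in values]      # step 2: differences from the mean
    squared = [d ** 2 for d in deviations]       # step 3: square the differences
    return sum(squared) / n                      # steps 4 and 5: sum, then divide by n


def population_std_dev(values):
    """Standard deviation = square root of the variance."""
    return sqrt(population_variance(values))


# Monthly traffic citations, January to December, from the example above.
citations = [19, 18, 22, 18, 28, 34, 45, 39, 38, 44, 34, 10]

print(f"Population variance: {population_variance(citations):.2f}")
print(f"Population standard deviation: {population_std_dev(citations):.2f}")
```

The standard library functions `statistics.pvariance` and `statistics.pstdev` compute the same quantities and can serve as a second check.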
ASSIGNMENT

1. Calculate the variance and standard deviation of Shop A and Shop B from the previous
example.
2. An analyst is considering two categories of company, X and Y, for possible investment.
One of her assistants has compiled the following information on the price-earnings ratios
of the shares of companies in the two categories over the past year.

Price-Earnings Ratio    Number of Category X companies    Number of Category Y companies
4.95 – 8.95             3                                 4
8.95 – 12.95            5                                 8
12.95 – 16.95           7                                 8
16.95 – 20.95           6                                 3
20.95 – 24.95           3                                 3
24.95 – 28.95           1                                 4

Compute the standard deviations of these two distributions and comment.
The means of the two distributions are 15.59 and 15.62 respectively.

3. Find the arithmetic mean for the following distribution, which shows the number of
employees absent per day.

No. of employees absent    No. of days (frequency)
2                          2
3                          4
4                          3
5                          4
6                          3
7                          3
8                          3

4. Compute the mean profit per vehicle from the following data for Applewood Auto
Group.

Profit                     Frequency
$200 up to $600            8
$600 up to $1,000          11
$1,000 up to $1,400        23
$1,400 up to $1,800        38
$1,800 up to $2,200        45
$2,200 up to $2,600        32
$2,600 up to $3,000        19
$3,000 up to $3,400        4

Also calculate the median, mode, MAD, variance and standard deviation.


5. A recent study of the laundry habits of Americans included the time in minutes of the wash
cycle. A sample of 40 observations follows. Determine the mean, median and mode of a
typical wash cycle. Also calculate the standard deviation.
35, 37, 28, 37, 33, 38, 37, 32, 28, 29, 39, 33, 32, 37, 33, 35, 36, 44, 36, 34, 40, 38, 46, 39,
37, 39, 34, 39, 31, 33, 37, 35, 39, 38, 37, 32, 43, 31, 31, 35

6. The enrollments of the 13 public universities in the state of Ohio are listed below:

College                           Enrollment
University of Akron               25,942
Bowling Green State University    18,989
Central State University          1,820
University of Cincinnati          36,415
Cleveland State University        15,664
Kent State University             34,056
Miami University                  17,161
Ohio State University             59,091
Shawnee State University          4,300
University of Toledo              20,775
Wright State University           18,786
Youngstown State University       14,682

a. Is the given data a population or a sample?
b. What is the mean enrollment?
c. What is the median enrollment?
d. What is the range of enrollment?
e. Compute the standard deviation.
7. The Kentucky Derby is held on the first Saturday in May at Churchill Downs in Louisville,
Kentucky. The race track is one and one-quarter miles long. The following table shows the
winners since 1990, their margin of victory, the winning time, and the payoff on a $2
bet.

Year    Winner              Winning margin (lengths)    Winning time (minutes)    Payoff on $2 win bet
1990    Unbridled           3.5                         2.03333                   10.80
1991    Strike the Gold     1.75                        2.05000                   4.80
1992    Lil E. Tee          1                           2.05000                   16.80
1993    Sea Hero            2.5                         2.04000                   12.90
1994    Go for Gin          2                           2.06000                   9.10
1995    Thunder Gulch       2.25                        2.02000                   24.50
1996    Grindstone          Nose                        2.01667                   5.90
1997    Silver Charm        Head                        2.04000                   4.00
1998    Real Quiet          0.5                         2.03667                   8.40
1999    Charismatic         Neck                        2.05333                   31.30
2000    Fusaichi Pegasus    1.5                         2.02000                   2.30
2001    Monarchos           4.75                        1.99950                   10.50
2002    War Emblem          4                           2.01883                   20.50
2003    Funny Cide          1.75                        2.01983                   12.80
2004    Smarty Jones        2.75                        2.06767                   4.10
2005    Giacomo             0.5                         2.04583                   50.30
2006    Barbaro             6.5                         2.02267                   6.10
2007    Street Sense        2.25                        2.03617                   4.90
2008    Big Brown           4.75                        2.03033                   6.80
2009    Mine That Bird      6.75                        2.04433                   103.20
2010    Super Saver         2.50                        2.07417                   18.00

a. Determine the mean and median for the variables winning time and payoff on a $2
bet.
b. Determine the range and standard deviation of the variables winning time and payoff.
c. Refer to the variable winning margin. What is its level of measurement? What
measure of location would be most appropriate?
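For grouped data such as the price-earnings table in question 2, the frequency form of the variance, S² = ∑fx²/∑f − (∑fx/∑f)², can be sketched as below and used to check your own working after you have attempted the question. The sketch assumes that each class is represented by its midpoint and that we divide by ∑f (the population form used in these notes). The helper name `grouped_mean_and_std` is hypothetical.

```python
from math import sqrt


def grouped_mean_and_std(classes, frequencies):
    """Mean and standard deviation of grouped data, using class midpoints as x."""
    midpoints = [(low + high) / 2 for low, high in classes]
    total_f = sum(frequencies)
    sum_fx = sum(f * x for f, x in zip(frequencies, midpoints))
    sum_fx2 = sum(f * x * x for f, x in zip(frequencies, midpoints))
    mean = sum_fx / total_f
    variance = sum_fx2 / total_f - mean ** 2   # S^2 = sum(f x^2)/sum(f) - mean^2
    return mean, sqrt(variance)


# Class limits and frequencies copied from the price-earnings table in question 2.
classes = [(4.95, 8.95), (8.95, 12.95), (12.95, 16.95),
           (16.95, 20.95), (20.95, 24.95), (24.95, 28.95)]
category_x = [3, 5, 7, 6, 3, 1]
category_y = [4, 8, 8, 3, 3, 4]

for name, freqs in [("Category X", category_x), ("Category Y", category_y)]:
    mean, std = grouped_mean_and_std(classes, freqs)
    print(f"{name}: mean = {mean:.2f}, standard deviation = {std:.2f}")
```

The means printed by this sketch should agree with the 15.59 and 15.62 quoted in the question, which is a useful sanity check before comparing the two standard deviations.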
