(Business Statistics) Chapter 3 Part 1

Chapter 3 discusses numerical descriptive measures, focusing on central tendency, variation, and the shape of data distributions. It defines key concepts such as mean, median, mode, variance, and standard deviation, explaining their significance and how they are calculated. Additionally, it covers the relationships among these measures and introduces the Z-score for identifying outliers.

Chapter 3

Numerical Descriptive Measures
Summary Definitions
• The central tendency is the extent to which all the data values group around a typical or central value.
• The variation is the amount of the spread or variability or dispersion of the data values.
• The shape is the pattern of the distribution of values from the lowest value to the highest value.
Measures of Central Tendency:

• Arithmetic mean: \( \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} \)
• Median: middle value in the ordered array
• Mode: most frequently observed value
• Geometric mean: \( \bar{X}_G = (X_1 \times X_2 \times \cdots \times X_n)^{1/n} \), the rate of change of a variable over time
The Mean
• The arithmetic mean (often just called the “mean”) is the most common measure of central tendency
• For a sample of size n:

\[ \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n} \]

where \( \bar{X} \) (pronounced “x-bar”) is the sample mean, n is the sample size, and \( X_i \) is the ith observed value.
The Mean
• The most common measure of central tendency
• Mean = sum of values divided by the number of values
• Affected by extreme values (outliers)

Values 11, 12, 13, 14, 15: \( \bar{X} = \frac{11 + 12 + 13 + 14 + 15}{5} = \frac{65}{5} = 13 \)
Values 11, 12, 13, 14, 20: \( \bar{X} = \frac{11 + 12 + 13 + 14 + 20}{5} = \frac{70}{5} = 14 \)
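The effect of an extreme value on the mean can be checked with a few lines of Python. This is a minimal sketch; the function name `arithmetic_mean` is chosen here for illustration and is not part of the slides. The two lists are the five-value data sets from the example above.

```python
def arithmetic_mean(values):
    """Sum of the values divided by the number of values."""
    if not values:
        raise ValueError("mean of an empty sample is undefined")
    return sum(values) / len(values)


base = [11, 12, 13, 14, 15]
with_outlier = [11, 12, 13, 14, 20]  # same data, largest value replaced by 20

mean_base = arithmetic_mean(base)             # 65 / 5
mean_outlier = arithmetic_mean(with_outlier)  # 70 / 5
print(mean_base, mean_outlier)  # prints 13.0 14.0
```

Changing a single value from 15 to 20 moves the mean by a full unit, which is the sensitivity to outliers noted in the bullet list.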
The Median
• In an ordered array, the median is the “middle” number (50% above, 50% below)

[Figure: two dot plots on an axis from 11 to 20; Median = 13 in both]

• Less sensitive than the mean to extreme values
Locating the Median
• The location of the median when the values are in numerical order (smallest to largest):

\[ \text{Median position} = \frac{n + 1}{2} \text{ position in the ordered data} \]

• If the number of values is odd, the median is the middle number
• If the number of values is even, the median is the average of the two middle numbers (positions n/2 and n/2 + 1)
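The positioning rule can be sketched in Python as follows. The function name `median` and the even-sized sample are illustrative assumptions; the odd-sized sample reuses values from the mean example. Note that the slide's positions are 1-based while Python indexes from 0.

```python
def median(values):
    """Median via the (n + 1) / 2 positioning rule on the ordered data."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty sample is undefined")
    if n % 2 == 1:
        # Odd n: 1-based position (n + 1) / 2 is 0-based index n // 2.
        return ordered[n // 2]
    # Even n: average the values at 1-based positions n/2 and n/2 + 1,
    # which are 0-based indices n // 2 - 1 and n // 2.
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


odd_sample = [11, 12, 13, 14, 20]
even_sample = [3, 1, 4, 2]  # hypothetical data, deliberately unsorted

print(median(odd_sample))   # prints 13
print(median(even_sample))  # prints 2.5
```

Sorting first matters: the rule gives a position in the ordered data, not in the data as recorded.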
The Mode
• Value that occurs most often
• Not affected by extreme values
• Used for either numerical or categorical (nominal) data
• There may be no mode
• There may be several modes

[Figure: two dot plots; the first, on an axis from 0 to 14, has Mode = 9; the second, on an axis from 0 to 6, has no mode]
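A short Python sketch can illustrate the three cases (one mode, several modes, no mode). The function name `modes` and the sample lists are hypothetical. The code assumes that "no mode" means no value occurs more than once, which is one common convention.

```python
from collections import Counter


def modes(values):
    """Return all most-frequent values, or [] when no value repeats."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []  # assumption: every value occurring once means "no mode"
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([0, 1, 2, 3, 4, 5, 6]))      # prints []
print(modes([1, 1, 2, 2, 3]))            # prints [1, 2]
print(modes(["red", "blue", "red"]))     # prints ['red']
```

The last call shows why the mode also works for categorical data: it only counts occurrences and never does arithmetic on the values.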
The Geometric Mean
• Geometric mean
  ◦ Used to measure the rate of change of a variable over time

\[ \bar{X}_G = (X_1 \times X_2 \times \cdots \times X_n)^{1/n} \]

• Geometric mean rate of return
  ◦ Measures the average return of an investment per time period

\[ \bar{R}_G = [(1 + R_1) \times (1 + R_2) \times \cdots \times (1 + R_n)]^{1/n} - 1 \]

  ◦ where \( R_i \) is the rate of return in time period i
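As a rough illustration of the geometric mean rate of return, the sketch below uses hypothetical returns that are not from the slides: a 50% loss followed by a 100% gain. The function name `geometric_mean_return` is also an assumption made here.

```python
import math


def geometric_mean_return(rates):
    """[(1 + R1) * (1 + R2) * ... * (1 + Rn)] ** (1 / n) - 1"""
    if not rates:
        raise ValueError("at least one rate of return is required")
    growth = math.prod(1 + r for r in rates)
    return growth ** (1 / len(rates)) - 1


rates = [-0.50, 1.00]  # hypothetical: lose 50%, then gain 100%

arithmetic = sum(rates) / len(rates)      # 0.25, i.e. 25% per period
geometric = geometric_mean_return(rates)  # 0.0, i.e. 0% per period
print(arithmetic, geometric)  # prints 0.25 0.0
```

An investment that halves and then doubles ends where it started, so the geometric result of 0% describes the actual outcome, while the arithmetic mean of 25% overstates it.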
Questions
1) Which measure of central tendency can be used for both quantitative and qualitative variables?
A) Arithmetic mean B) Median
C) Mode D) Geometric mean
Answer: C) Mode

2) Which of the following statements about the median is not true?
A) It is more affected by outliers than the arithmetic mean
B) It is a measure of central tendency
C) It is equal to Q2
D) All of the above
Answer: A) It is more affected by outliers than the arithmetic mean
The data below represent the grams of carbohydrates per serving of breakfast in a sample of 11 different servings (for Questions 1 to 3):
11 15 23 29 19 22 21 20 15 25 17

1. The arithmetic mean of carbohydrates in this sample is:
a) 16.3 grams b) 19.7 grams c) 26 grams d) 34.5 grams
Answer: b) 19.7 grams

2. The median of carbohydrate amount is:
a) 15 grams b) 25 grams c) 29 grams d) 20 grams
Answer: d) 20 grams

3. The mode is:
a) 29 b) 23 c) 15 d) No mode
Answer: c) 15
Shape of a Distribution
• Describes how data are distributed
• Two useful shape-related statistics are:
  ◦ Skewness: measures the extent to which data values are not symmetrical
  ◦ Kurtosis: measures the peakedness of the curve of the distribution, that is, how sharply the curve rises approaching the center of the distribution
Shape of a Distribution (Skewness)
• Measures the extent to which data are not symmetrical

                     Left-Skewed      Symmetric        Right-Skewed
Mean vs. median      Mean < Median    Mean = Median    Median < Mean
Skewness statistic   < 0              0                > 0
Relationships Among Mean, Median, and Mode
• If mean = median = mode
  ⇒ the histogram would be symmetric with one peak, where the three measures lie at the center of the distribution.
• If mean > median > mode
  ⇒ the histogram would be positively skewed (skewed to the right)
• If mean < median < mode
  ⇒ the histogram would be negatively skewed (skewed to the left)
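The comparison of mean and median can be turned into a quick check in Python. This is only a rule-of-thumb sketch: it does not compute the skewness statistic, and equal mean and median do not by themselves guarantee a symmetric histogram. The function name `describe_shape` is an assumption; the first data set is the computer-monitor sample used in the questions of this chapter, and the others are hypothetical.

```python
import math
from statistics import mean, median


def describe_shape(values):
    """Classify shape by comparing the mean with the median (rule of thumb)."""
    m, med = mean(values), median(values)
    if math.isclose(m, med):
        return "symmetric"
    return "right-skewed" if m > med else "left-skewed"


monitors = [24, 31, 27, 25, 35, 33, 26, 40, 25, 28]  # mean 29.4, median 27.5
print(describe_shape(monitors))              # prints right-skewed
print(describe_shape([11, 12, 13, 14, 15]))  # prints symmetric
print(describe_shape([1, 10, 11, 12, 13]))   # prints left-skewed
```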


Shape of a Distribution -- Kurtosis
(measures how sharply the curve rises approaching the center of the distribution)

• Sharper peak than bell-shaped (Kurtosis > 0)
• Bell-shaped (Kurtosis = 0)
• Flatter than bell-shaped (Kurtosis < 0)
The following data are the numbers of computer monitors produced at the company for a sample of 10 days (for Questions 1 to 3):
24 31 27 25 35 33 26 40 25 28

1. The arithmetic mean is
a) 29.4 b) 37 c) 21 d) 31
Answer: a) 29.4

2. The median is
a) 24 b) 31 c) 27.5 d) 35
Answer: c) 27.5

3. The data in the sample are:
a) right-skewed b) left-skewed c) symmetrical d) None of the above
Answer: a) right-skewed

4. In a perfectly symmetrical bell-shaped “normal” distribution:
a) The median equals the arithmetic mean
b) The median equals the mode
c) The arithmetic mean equals the mode
d) All of the above
Answer: d) All of the above

5. The mean age of five members of a family is 40 years. The ages of four of the five members are 61, 60, 27, and 23. The age of the fifth member is:
a) 37 b) 29 c) 20 d) 42
Answer: b) 29

6. In a right-skewed distribution:
a) The median equals the arithmetic mean
b) The median is less than the arithmetic mean
c) The median is greater than the arithmetic mean
d) None of the above
Answer: b) The median is less than the arithmetic mean
Measures of Variation
• Range
• Variance
• Standard deviation
• Coefficient of variation

Measures of variation give information on the spread or variability or dispersion of the data values.

[Figure: two distributions with the same center but different variation]
The Range
• Simplest measure of variation
• Difference between the largest and the smallest values:

\[ \text{Range} = X_{\text{largest}} - X_{\text{smallest}} \]

• Why can the range be misleading?
  ◦ Does not account for how the data are distributed
    [Figure: two dot plots on an axis from 7 to 12; Range = 12 - 7 = 5 in both]
  ◦ Sensitive to outliers
    1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5
    Range = 5 - 1 = 4
    1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120
    Range = 120 - 1 = 119
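The sensitivity of the range to a single outlier is easy to reproduce. The sketch below uses the two data sets from the slide; the function name `data_range` is chosen here for illustration.

```python
def data_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)


without_outlier = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
                   2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 4, 5]
with_outlier = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
                2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 4, 120]

print(data_range(without_outlier))  # prints 4
print(data_range(with_outlier))     # prints 119
```

Only the last value differs between the two lists, yet the range grows from 4 to 119, because the range looks at nothing but the two extreme values.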
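The Variance
• Average (approximately) of squared deviations of values from the mean
• Sample variance:

\[ S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} \]

where \( \bar{X} \) = sample mean, n = sample size, and \( X_i \) = ith value of the variable X

• Population variance:

\[ \sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N} \]

where \( \mu \) = population mean and N = population size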
The Standard Deviation
• Most commonly used measure of variation
• Shows variation about the mean
• Is the square root of the variance
• Sample standard deviation:

\[ S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}} \]
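The sample variance and sample standard deviation can be computed directly from their definitions. The sketch below is illustrative: the function names are assumptions, and the data are the values 11 to 15 used earlier in the chapter for the mean.

```python
import math


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two values")
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


def sample_std(values):
    """Square root of the sample variance."""
    return math.sqrt(sample_variance(values))


sample = [11, 12, 13, 14, 15]  # mean 13, squared deviations 4 + 1 + 0 + 1 + 4
print(sample_variance(sample))       # prints 2.5
print(round(sample_std(sample), 4))  # prints 1.5811
```

Python's standard library offers `statistics.variance` and `statistics.stdev`, which use the same n - 1 denominator, so they can serve as a cross-check.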
Comparing Standard Deviation

[Figure: three dot plots on an axis from 11 to 21]
• Data A: Mean = 15.5, S = 3.338
• Data B: Mean = 15.5, S = 0.926
• Data C: Mean = 15.5, S = 4.567

Comparing Standard Deviation

[Figure: two distributions, one labeled “Smaller standard deviation” and the other “Larger standard deviation”]
Measures of Variation: Summary Characteristics
• The more the data are spread out, the greater the range, variance, and standard deviation.
• The more the data are concentrated, the smaller the range, variance, and standard deviation.
• If the values are all the same (no variation), all variation measures will be zero.
• None of these measures are ever negative.
Measures of Variation: The Coefficient of Variation
• Measures relative variation
• Always in percentage (%)
• Shows variation relative to mean
• Can be used to compare the variability of two or more sets of data measured in different units

\[ CV = \left( \frac{S}{\bar{X}} \right) \cdot 100\% \]
Measures of Variation: Comparing Coefficients of Variation
• Stock A:
  ◦ Average price last year = $50
  ◦ Standard deviation = $5

\[ CV_A = \left( \frac{S}{\bar{X}} \right) \cdot 100\% = \frac{\$5}{\$50} \cdot 100\% = 10\% \]

• Stock B:
  ◦ Average price last year = $100
  ◦ Standard deviation = $5

\[ CV_B = \left( \frac{S}{\bar{X}} \right) \cdot 100\% = \frac{\$5}{\$100} \cdot 100\% = 5\% \]

Both stocks have the same standard deviation, but stock B is less variable relative to its price.
Measures of Variation: Comparing Coefficients of Variation (cont’d)
• Stock A:
  ◦ Average price last year = $50
  ◦ Standard deviation = $5

\[ CV_A = \left( \frac{S}{\bar{X}} \right) \cdot 100\% = \frac{\$5}{\$50} \cdot 100\% = 10\% \]

• Stock C:
  ◦ Average price last year = $8
  ◦ Standard deviation = $2

\[ CV_C = \left( \frac{S}{\bar{X}} \right) \cdot 100\% = \frac{\$2}{\$8} \cdot 100\% = 25\% \]

Stock C has a much smaller standard deviation but a much higher coefficient of variation.
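The stock comparisons can be reproduced with a small helper. This is a sketch under the figures given on the slides for stocks A, B, and C; the function name `coefficient_of_variation` and the dictionary layout are assumptions made for illustration.

```python
def coefficient_of_variation(std_dev, mean):
    """CV = (S / X-bar) * 100, expressed as a percentage."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100 * std_dev / mean


# (average price, standard deviation) in dollars, as given on the slides
stocks = {"A": (50, 5), "B": (100, 5), "C": (8, 2)}

cv = {name: coefficient_of_variation(s, m) for name, (m, s) in stocks.items()}
print(cv)  # prints {'A': 10.0, 'B': 5.0, 'C': 25.0}
```

Ranking by standard deviation alone would put stock C last, while ranking by CV puts it first, which is why the CV is preferred when the means differ widely.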
The Z-Score
The Z-score indicates how many standard deviations a value lies above or below the mean, so it helps to identify outliers.

\[ Z = \frac{X - \bar{X}}{S} \]

where X represents the data value, \( \bar{X} \) is the sample mean, and S is the sample standard deviation.

• A value is considered an outlier if its Z-score is less than -3.0 or greater than +3.0.
• The larger the absolute value of the Z-score, the farther the data value is from the mean.
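The outlier rule can be sketched in Python with the standard library's `statistics` module. The function names `z_scores` and `find_outliers` are assumptions; the data are the computer-monitor sample used in the questions earlier in this chapter. The sketch assumes the values are not all identical, since S = 0 would make the Z-score undefined.

```python
from statistics import mean, stdev


def z_scores(values):
    """Z = (X - X-bar) / S for every value in the sample."""
    x_bar = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 denominator)
    return [(x - x_bar) / s for x in values]


def find_outliers(values, threshold=3.0):
    """Values whose Z-score is below -threshold or above +threshold."""
    return [x for x, z in zip(values, z_scores(values)) if abs(z) > threshold]


monitors = [24, 31, 27, 25, 35, 33, 26, 40, 25, 28]
print(round(max(z_scores(monitors)), 2))  # prints 2.03
print(find_outliers(monitors))            # prints []
```

The largest value, 40, lies about two standard deviations above the mean, which is inside the ±3.0 band, so this sample has no outliers under the rule.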
Questions
For the sample of data below, answer the following questions (Questions 1 to 6):
14 6 3 10 1 8 4 7

1. The variance is
a) 20 b) 17.13 c) 15.30 d) 10
Answer: b) 17.13

2. The standard deviation is
a) 4.138 b) 3.34 c) 2.04 d) 1.38
Answer: a) 4.138

3. The coefficient of variation is:
a) 75.3% b) 62.4% c) 50.4% d) 69.1%
Answer: b) 62.4% (the closest option; 4.138 / 6.625 × 100% ≈ 62.5%)

4. The Z-score for the value 10 is:
a) 0.6359 b) 0.8156 c) 0.5120 d) 0.4023
Answer: b) 0.8156

5. When extreme values are present in a set of data, which of the following descriptive summary measures is most appropriate?
a) coefficient of variation b) arithmetic mean c) median d) standard deviation
Answer: c) median

6. Are there any outliers in the above data?
a) 14 b) 10 c) 8 d) there are no outliers
Answer: d) there are no outliers

7. The more the data are concentrated around the arithmetic mean:
a) the smaller the coefficient of variation
b) the higher the standard deviation
c) the higher the value of the Z-score
d) All of the above
Answer: a) the smaller the coefficient of variation
