
Tutorial Wk3

The document provides an overview of key concepts in business statistics, including measures of central tendency (mean, median, mode), measures of variation (range, interquartile range, variance, and standard deviation), and distribution shape. It explains how to calculate these measures and their significance in analyzing data. Additionally, it introduces the five-number summary and boxplots as tools for visualizing data distribution.

Uploaded by

eelifluna11
Copyright
© All Rights Reserved

CB2200 Business Statistics

Topic 1
Introduction to Statistics
Tutorial Week3

1
Concept Review
 Measures of Central Tendency: Mean, Median and Mode
 Measures of Variation: Range, Interquartile Range, Variance and Standard Deviation
 Distribution Shape
 Five-Number Summary and Boxplot

2
Mean

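 Sample mean (pronounced "x-bar"), where n is the sample size:
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$
 Population mean (pronounced "mu"), where N is the population size:
$$\mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{X_1 + X_2 + \cdots + X_N}{N}$$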
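As a minimal sketch of the mean formula, the snippet below computes a mean by hand and compares it with Python's standard library. The data values are made up for illustration and do not come from the tutorial; the arithmetic is the same whether the values are a sample (n) or a population (N).

```python
import statistics

# Hypothetical data, for illustration only
values = [7, 8, 9, 10, 11, 12]

# Sum of all values divided by the number of values
mean_by_hand = sum(values) / len(values)

# statistics.mean performs the same calculation
mean_stdlib = statistics.mean(values)

print(mean_by_hand, mean_stdlib)
```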

3
Median
 In an ordered array, the median is the “middle” number (50% above, 50% below)
 If n or N is odd, the median is the middle number, i.e. the number ranked $\frac{N+1}{2}$ in the array
 If n or N is even, the median is the average of the 2 middle numbers
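The odd and even cases can be sketched in Python as follows. The function name `median_by_rank` and the data are hypothetical, chosen only to illustrate the rule; ranks are counted from 1, as on the slide.

```python
def median_by_rank(values):
    """Median via ranked positions in the ordered array (ranks start at 1)."""
    if not values:
        raise ValueError("median of an empty data set is undefined")
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        # Odd count: take the value ranked (n + 1) / 2
        rank = (n + 1) // 2
        return ordered[rank - 1]
    # Even count: average the two middle values (ranks n/2 and n/2 + 1)
    lower = ordered[n // 2 - 1]
    upper = ordered[n // 2]
    return (lower + upper) / 2

# Hypothetical data, for illustration only
odd_data = [3, 1, 7, 5, 9]
even_data = [3, 1, 7, 5]

print(median_by_rank(odd_data))
print(median_by_rank(even_data))
```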

4
Mode
 Value that occurs most often
 Not affected by extreme values (outliers)
 Used for both numerical and categorical data
 There may be no mode
 There may be several modes

[Figure: two dot plots. On the axis from 0 to 14: Modes = 5, 9 and 12. On the axis from 0 to 6: No Mode]
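A short Python sketch of these properties is given below. The helper `modes` is a hypothetical name, and it assumes one particular convention: when no value repeats, the data set is treated as having no mode. The data are illustrative only.

```python
from collections import Counter

def modes(values):
    """Return every most-frequent value, or [] when no value repeats."""
    if not values:
        raise ValueError("mode of an empty data set is undefined")
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        # Assumed convention: every value occurs once, so there is no mode
        return []
    # Several values may share the highest count (several modes)
    return [value for value, count in counts.items() if count == highest]

# Hypothetical data, for illustration only
print(modes([0, 1, 2, 3, 4, 5, 6]))          # no repeated value
print(modes([5, 5, 9, 9, 12, 12, 3]))        # three modes
print(modes(["red", "blue", "red"]))         # categorical data
```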
5
Comparison Table
Measure   Variable type              Can be non-existent?   Can be more than one?   Affected by outliers?
Mean      Numerical                  No                     No                      Yes
Median    Numerical                  No                     No                      No
Mode      Numerical & Categorical    Yes                    Yes                     No

6
Range
 Simplest measure of variation
 Difference between the largest and the smallest values: $\text{Range} = X_{\text{largest}} - X_{\text{smallest}}$
 Ignores the way in which data are distributed
[Figure: two dot plots on an axis from 7 to 12, each with Range = 12 - 7 = 5]
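To illustrate that the range ignores how the data are distributed, the sketch below uses two made-up data sets (not taken from the tutorial) that share the same smallest and largest values but are spread very differently.

```python
def value_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)

# Hypothetical data, for illustration only
evenly_spread = [7, 8, 9, 10, 11, 12]
clustered = [7, 10, 10, 10, 10, 12]

# Both data sets give the same range even though their shapes differ
print(value_range(evenly_spread), value_range(clustered))
```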

7
Quartiles

Calculating the quantile position for n values:


Q1 position: Q2 position: Q3 position:
r= r= r=

 If r is a whole number, it is the ranked position to use
 When r is not a whole number, the following linear interpolation steps can be used to determine the quantile value
1. $d = r - [r]$, where $[r]$ is the integer part of $r$
2. Quantile value $= X_{[r]} + d\,(X_{[r]+1} - X_{[r]})$, where $X_{[r]}$ is the value at ranked position $[r]$
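The interpolation steps can be sketched in Python as below. The position rule used here, r = q(n + 1) with q = 0.25, 0.5 and 0.75, is an assumption: it is a common textbook convention and agrees with the median rank (N + 1)/2 given earlier, but other conventions exist. The function name `quantile_value` and the data are hypothetical.

```python
import math

def quantile_value(values, q):
    """Quantile by linear interpolation at ranked position r.

    Assumes the position rule r = q * (n + 1); ranks start at 1.
    """
    ordered = sorted(values)
    n = len(ordered)
    r = q * (n + 1)
    whole = math.floor(r)   # [r], the integer part of r
    d = r - whole           # step 1: d = r - [r]
    if whole < 1:
        return ordered[0]   # position falls before the first value
    if whole >= n:
        return ordered[-1]  # position falls at or after the last value
    # Step 2: X[r] + d * (X[r]+1 - X[r]); list indices are rank - 1
    return ordered[whole - 1] + d * (ordered[whole] - ordered[whole - 1])

# Hypothetical data, for illustration only
data = [7, 8, 9, 10, 11, 12]
q1 = quantile_value(data, 0.25)  # r = 1.75
q2 = quantile_value(data, 0.50)  # r = 3.5
q3 = quantile_value(data, 0.75)  # r = 5.25
print(q1, q2, q3)
```

When r is a whole number, d is 0 and the function simply returns the value at that rank.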

8
Interquartile Range
Cont’d

 Interquartile range IQR = Q3 - Q1 measures the spread in the middle 50% of the data
[Figure: number line split by Q1, Q2 and Q3 into four segments of 25% each; the interquartile range spans Q1 to Q3]
 Interquartile range is also called the mid-spread because it covers the middle 50% of the data
 Not influenced by outliers or extreme values
 Usually, values that fall outside the range [Q1 - 1.5*IQR, Q3 + 1.5*IQR] are considered outliers
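The outlier rule can be sketched in Python as follows. The sketch relies on `statistics.quantiles` with its default "exclusive" method, which interpolates at positions based on n + 1; whether that matches the position rule intended in this tutorial is an assumption. The data are made up for illustration.

```python
import statistics

# Hypothetical data, for illustration only; 30 is a suspiciously large value
values = [7, 8, 9, 10, 11, 12, 30]

# Three cut points that split the ordered data into four parts
q1, q2, q3 = statistics.quantiles(values, n=4)

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Values outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR] are flagged as outliers
outliers = [x for x in values if x < lower_fence or x > upper_fence]

print(iqr, lower_fence, upper_fence, outliers)
```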

9
Variance
 Most preferred measure of variation due to its mathematical property
 It shows variation of each value from the mean
 Sample variance:
$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$
 Population variance (pronounced "sigma squared"):
$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$$
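The difference between the two divisors, n - 1 for a sample and N for a population, can be checked with a small Python sketch. The data are made up for illustration, and the same values are treated first as a sample and then as a whole population purely to show the arithmetic.

```python
import statistics

# Hypothetical data, for illustration only
values = [7, 8, 9, 10, 11, 12]
n = len(values)
mean = sum(values) / n

# Sum of squared deviations from the mean
squared_deviations = sum((x - mean) ** 2 for x in values)

sample_variance = squared_deviations / (n - 1)   # divide by n - 1
population_variance = squared_deviations / n     # divide by N

# The standard library offers both versions
print(sample_variance, statistics.variance(values))
print(population_variance, statistics.pvariance(values))
```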
10
Standard Deviation

 It is the square root of the variance
 It has the same units as the original data
 Sample standard deviation:
$$S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$
 Population standard deviation (pronounced "sigma"):
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$$
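A brief Python sketch, using made-up data, confirms that the standard deviation is the square root of the corresponding variance.

```python
import math
import statistics

# Hypothetical data, for illustration only
values = [7, 8, 9, 10, 11, 12]

# Square root of the sample and population variances
sample_sd = math.sqrt(statistics.variance(values))
population_sd = math.sqrt(statistics.pvariance(values))

# statistics.stdev and statistics.pstdev compute the same quantities directly
print(sample_sd, statistics.stdev(values))
print(population_sd, statistics.pstdev(values))
```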

11
Standard Deviation
[Figure: two distributions centred at $\mu$ or $\bar{X}$, one labelled "Smaller standard deviation" and the other "Larger standard deviation"]
 A smaller standard deviation means most values of X are closer to the mean. A larger standard deviation means the values of X are more spread out.

12
Distribution Shape
 Position of mean and median for a unimodal continuous distribution

                      Left-Skewed      Symmetric        Right-Skewed
Mean vs. median       Mean < Median    Mean = Median    Median < Mean
Skewness statistic    < 0              0                > 0

If data are skewed, the median may be a more appropriate measure of central tendency
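The relationship between skewness and the positions of the mean and median can be illustrated with the Python sketch below. The tutorial does not state which skewness formula it uses, so the adjusted sample skewness computed here (the form used by many spreadsheet packages) is an assumption, as are the function name `sample_skewness` and the data.

```python
import statistics

def sample_skewness(values):
    """Adjusted sample skewness (an assumed definition, see lead-in)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three values")
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation
    cubed = sum(((x - mean) / s) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * cubed

# Hypothetical data, for illustration only
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]      # long tail to the right
left_skewed = [20 - x for x in right_skewed]  # mirror image

for data in (right_skewed, left_skewed):
    print(statistics.mean(data), statistics.median(data), sample_skewness(data))
```

In the right-skewed set the single large value pulls the mean above the median, while the median stays near the bulk of the data.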

13
The Five Number Summary and
Boxplot
 The five numbers that help describe the center, spread and shape of data are
Xsmallest -- Q1 -- Median -- Q3 -- Xlargest
 Boxplot
[Figure: boxplot with whiskers from Xsmallest (or Q1 - 1.5 (IQR)) to Xlargest (or Q3 + 1.5 (IQR)) and a box from Q1 to Q3 split at the median; each of the four segments holds 25% of the data]
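A five-number summary can be sketched in Python as below. The quartiles come from `statistics.quantiles` with its default "exclusive" method, which interpolates at positions based on n + 1; treating that as the tutorial's quartile rule is an assumption. The function name and the data are hypothetical.

```python
import statistics

def five_number_summary(values):
    """Return (Xsmallest, Q1, Median, Q3, Xlargest)."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return min(values), q1, median, q3, max(values)

# Hypothetical data, for illustration only
data = [7, 8, 9, 10, 11, 12]
summary = five_number_summary(data)
print(summary)
```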

14
Distribution Shape and Boxplot
[Figure: boxplot marked with Xsmallest, Q1, Q2, Q3 and Xlargest, with three pairs of lengths labelled A and B, C and D, E and F]
 If A = B, then Symmetric; if A > B, then Left-Skewed; if A < B, then Right-Skewed
 If C = D, then Symmetric; if C > D, then Left-Skewed; if C < D, then Right-Skewed
 If E = F, then Symmetric; if E > F, then Left-Skewed; if E < F, then Right-Skewed
 Look at all three pairs of comparisons and go with the majority
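The majority rule can be sketched in Python as follows. The three pairs compared here are an assumption: each pair sets a left-hand distance on the boxplot against its right-hand counterpart (median to extremes, quartiles to extremes, and quartiles to median), and which of these corresponds to A and B, C and D, or E and F is not confirmed by the slide. Function names and data are hypothetical.

```python
import math
from collections import Counter

def compare(left, right):
    """Classify one pair of left-hand and right-hand distances."""
    if math.isclose(left, right):
        return "Symmetric"
    return "Left-Skewed" if left > right else "Right-Skewed"

def shape_from_summary(smallest, q1, q2, q3, largest):
    """Majority vote over three left/right comparisons (assumed pairs)."""
    votes = [
        compare(q2 - smallest, largest - q2),  # median to each extreme
        compare(q1 - smallest, largest - q3),  # each whisker
        compare(q2 - q1, q3 - q2),             # each half of the box
    ]
    label, count = Counter(votes).most_common(1)[0]
    # With three different verdicts there is no majority
    return label if count >= 2 else "Inconclusive"

# Hypothetical five-number summaries, for illustration only
print(shape_from_summary(1, 2, 3, 6, 20))
print(shape_from_summary(1, 2, 3, 4, 5))
print(shape_from_summary(1, 15, 18, 19, 20))
```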
Distribution Shape and Boxplot

[Figure: three boxplots labelled Left-Skewed, Symmetric and Right-Skewed, each marked with Q1, Q2 and Q3]

16
Quick Check
 Kahoot! (20s for each question)
Please search for Kahoot on Google and click the first link.

17
Tutorial Question

 Topic 1
 Questions 9 & 10.

18
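Formula Reference

Mean: sample $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n}$; population $\mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{X_1 + X_2 + \cdots + X_N}{N}$

Median: the $\frac{n+1}{2}$th ranked value (n odd); average of the $\frac{n}{2}$th and $\left(\frac{n}{2}+1\right)$th ranked values (n even)

Range $= X_{\text{largest}} - X_{\text{smallest}}$

Quantile: if r is not an integer, quantile value $= X_{[r]} + d\,(X_{[r]+1} - X_{[r]})$, where $d = r - [r]$

Interquartile Range = Q3 - Q1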

19
Formula Reference

20
Summary

21
