
ST02 DescriptiveStat

The document discusses descriptive statistics including graphs like bar charts and histograms, frequency distributions, measures of central tendency and dispersion, skewness, kurtosis, and the empirical rule. It provides examples and guidelines for constructing frequency distributions and interpreting summary statistics.

Uploaded by

pjk040319

Descriptive Statistics

(Chapters 2, 3, 4)

Kyung Sam Park


Professor of LSOM
Korea University Business School
[email protected]
Contents

• Graphs
  • Bar charts, line charts, etc.
  • Frequency distribution and histogram

• Summary statistics
  • Average, variance, etc.
Graph: Examples

Graph: Ethical Issue

• Line charts

Graph: Ethical Issue

• Bar charts
Frequency Distribution: Example

• Dataset: 17 numbers for service years

  4  3  2  10  6  6  5  8  4
  8  4  6  2  3  3  7  5

• Example frequency distribution (each class [a, b) includes a and excludes b)

  Class      Tally     Frequency   Percent
  [1, 3)     //            2         12%
  [3, 5)     //////        6         35%
  [5, 7)     /////         5         29%
  [7, 9)     ///           3         18%
  [9, 11)    /             1          6%
  Total                   17        100%
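The frequency table for the service-years data can be reproduced with a short script. This is a minimal sketch, not part of the original slides: the function name `frequency_distribution` and its parameters are illustrative, and it assumes half-open classes [lower, upper) of equal width starting at 1.

```python
from collections import Counter

# Service-years dataset from the slide (17 values).
service_years = [4, 3, 2, 10, 6, 6, 5, 8, 4, 8, 4, 6, 2, 3, 3, 7, 5]


def frequency_distribution(data, start, width, num_classes):
    """Count values per half-open class [lower, lower + width).

    Hypothetical helper for illustration; assumes every value falls
    inside one of the num_classes classes.
    """
    # Map each value to the index of the class it belongs to.
    counts = Counter((x - start) // width for x in data)
    table = []
    for i in range(num_classes):
        lower = start + i * width
        freq = counts.get(i, 0)
        percent = round(100 * freq / len(data))
        table.append((lower, lower + width, freq, percent))
    return table


table = frequency_distribution(service_years, start=1, width=2, num_classes=5)
for lower, upper, freq, percent in table:
    # One tally mark per observation, as in the hand-built table.
    print(f"[{lower}, {upper})  {'/' * freq:<6}  {freq:>2}  {percent:>3}%")
```

Using integer division to find the class index works only because the classes have equal width, which matches the equal-interval guideline for frequency distributions.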
Frequency Distribution: Histogram

[Figure: histogram (left) and frequency polygon (right) of the service-years data. Horizontal axis: length of service (years); vertical axis: number of employees.]
Frequency Distribution: Guidelines

• Determine the number of classes: the "2 to the k" rule
  • k is the number of classes, chosen so that $2^k \ge n$, where n is the data size.
  • Example 1: n = 17. With k = 4, $2^4 = 16$ (just under n); with k = 5, $2^5 = 32$. Therefore, 4 or 5 is good.
  • Example 2: What if n = 60?

• Determine the class interval

  $$ \text{Class interval} \ge \frac{\text{Highest value } (H) - \text{Lowest value } (L)}{\text{Number of classes } (k)} $$

  • Example: If H = 10, L = 2, and k = 5, then (10 − 2)/5 = 1.6. Therefore, 2 is good.

• For every class
  • Use equal class intervals.
  • Avoid overlapping classes.
  • Avoid gaps between classes (disconnected classes).
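The two guideline calculations can be sketched in a few lines. The helper names `number_of_classes` and `class_interval` are hypothetical, and the code assumes the strict reading of the rule (the smallest k with 2^k at least n) and rounding the interval up to a whole number; the slide itself treats the rule more loosely, accepting either 4 or 5 classes for n = 17.

```python
import math


def number_of_classes(n):
    """Smallest k with 2**k >= n (assumed strict reading of the 2-to-the-k rule)."""
    k = 1
    while 2 ** k < n:
        k += 1
    return k


def class_interval(highest, lowest, k):
    """Return the raw interval (H - L) / k and that value rounded up to an integer."""
    raw = (highest - lowest) / k
    return raw, math.ceil(raw)


print(number_of_classes(17))     # classes suggested for the 17 service-year values
print(number_of_classes(60))     # classes suggested for Example 2 (n = 60)
print(class_interval(10, 2, 5))  # raw and rounded-up interval for H = 10, L = 2, k = 5
```

Rounding up rather than to the nearest integer keeps the k classes wide enough to cover the whole range from L to H.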
Summary Statistics

• Central tendency (or average)
  • Mean (full name: arithmetic mean)
  • Median: the midpoint of the sorted values
    • Odd number of values (58, 62, 85, 87, 91): choose the single middle value, 85.
    • Even number of values (58, 62, 85, 87, 91, 99): choose the two middle values, 85 and 87, and average them, so the median is 86.
  • Mode: the value that occurs most often, so it appears at the apex (highest point) of the graph
    • First prepare a frequency table and histogram, then find the apex (note that the vertical axis represents the frequency or likelihood).

• Dispersion (or variation)
  • Variance and standard deviation

    $$ s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}, \qquad s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}} $$

  • Range: largest value − smallest value
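These summary statistics are all available in Python's standard `statistics` module. The sketch below reuses the median examples from this slide and, as an assumption for illustration, applies the mode to the 17 service-years values from the frequency distribution example. `statistics.variance` and `statistics.stdev` use the n − 1 denominator, matching the sample formulas for $s^2$ and $s$.

```python
import statistics

odd = [58, 62, 85, 87, 91]        # odd number of values
even = [58, 62, 85, 87, 91, 99]   # even number of values
service_years = [4, 3, 2, 10, 6, 6, 5, 8, 4, 8, 4, 6, 2, 3, 3, 7, 5]

# Central tendency
mean_odd = statistics.mean(odd)
median_odd = statistics.median(odd)     # the single middle value
median_even = statistics.median(even)   # average of the two middle values
modes = statistics.multimode(service_years)  # every value tied for most frequent

# Dispersion (sample versions, dividing by n - 1)
variance_odd = statistics.variance(odd)
stdev_odd = statistics.stdev(odd)
range_odd = max(odd) - min(odd)

print(mean_odd, median_odd, median_even, sorted(modes))
print(variance_odd, stdev_odd, range_odd)
```

`multimode` is used instead of `mode` because a dataset can have several values tied for the highest frequency, as the service-years data does.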
Other Summary Statistics: Skewness

• How skewed the data are
  • Symmetric: mean = median
  • Positively skewed: mean > median
  • Negatively skewed: mean < median

• Example: Dataset (90, 92, 94, 98, 350): mean = 724/5 = 144.8, median = 94. Since mean > median, the data are positively skewed.

• Skewness measure, with values in [−3.0, +3.0]:

  $$ sk = \frac{3(\bar{X} - \text{Median})}{s} $$
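As a worked check, the skewness measure can be computed for the example dataset (90, 92, 94, 98, 350). This is a minimal sketch; the function name `pearson_skewness` is illustrative, and it assumes s is the sample standard deviation (n − 1 denominator), as in the variance formula on the Summary Statistics slide.

```python
import statistics


def pearson_skewness(data):
    """sk = 3 * (mean - median) / s, with s the sample standard deviation."""
    mean = statistics.mean(data)
    median = statistics.median(data)
    s = statistics.stdev(data)  # assumption: n - 1 denominator
    return 3 * (mean - median) / s


data = [90, 92, 94, 98, 350]
sk = pearson_skewness(data)
print(round(sk, 2))  # positive, because the outlier 350 pulls the mean above the median
```

The single large value 350 raises the mean to 144.8 while leaving the median at 94, which is why the measure comes out clearly positive.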
Interpretations: Skewness & Kurtosis

• Skewness: how skewed the dataset is.
  • −2 or around: very negatively skewed.
  • −1 or around: a little negatively skewed.
  • 0 or around: not skewed.
  • 1 or around: a little positively skewed.
  • 2 or around: very positively skewed.

• Kurtosis: how high the mountain is, measured so that a normal distribution scores 0.
  • −2 or around: a negative mountain (volcanic vent).
  • −1 or around: no mountain (flat distribution).
  • 0 or around: a standard mountain (that of the standard normal distribution).
  • 1 or around: a little sharper mountain.
  • 2 or around: much sharper mountain.

• Consider the two simultaneously
  • Graphical interpretation possible
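The slides do not state which kurtosis formula they use. The sketch below assumes the population excess kurtosis, $m_4 / m_2^2 - 3$, where $m_2$ and $m_4$ are the second and fourth central moments; this choice scores a normal distribution at 0, consistent with the scale above. Statistical software often applies a sample-size correction, so values from other tools may differ slightly. The function name and the three small datasets are hypothetical illustrations.

```python
def excess_kurtosis(data):
    """Population excess kurtosis m4 / m2**2 - 3 (assumed formula, no bias correction)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in data) / n  # fourth central moment
    return m4 / m2 ** 2 - 3


two_point = [0, 1, 0, 1]                    # all mass at two extremes, hollow middle
flat = [1, 2, 3, 4, 5]                      # evenly spread values
peaked = [-5, 0, 0, 0, 0, 0, 0, 0, 0, 5]    # tall centre with two far-out values

for name, data in [("two_point", two_point), ("flat", flat), ("peaked", peaked)]:
    print(name, round(excess_kurtosis(data), 2))
```

The three datasets land near −2, −1, and +2 respectively, which lines up with the "volcanic vent", "flat", and "much sharper mountain" readings.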
Empirical Rule

• For a normal distribution,
  • About 68% of the data lie within Mean ± 1 × Standard deviation
  • About 95% of the data lie within Mean ± 2 × Standard deviation
  • About 99.7% of the data lie within Mean ± 3 × Standard deviation

• Example, where Mean = 100 and Stdev = 10:
  • 100 ± 10 = [90, 110] covers 68%
  • 100 ± 20 = [80, 120] covers 95%
  • 100 ± 30 = [70, 130] covers 99.7%
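The three percentages can be verified numerically from the normal cumulative distribution function. This is a minimal sketch using `statistics.NormalDist` from the Python standard library with the slide's example values (Mean = 100, Stdev = 10); the helper name `coverage` is illustrative.

```python
from statistics import NormalDist

# Normal distribution with the example parameters from the slide.
dist = NormalDist(mu=100, sigma=10)


def coverage(k):
    """Return the interval mean ± k·stdev and the probability mass inside it."""
    lower = dist.mean - k * dist.stdev
    upper = dist.mean + k * dist.stdev
    return lower, upper, dist.cdf(upper) - dist.cdf(lower)


for k in (1, 2, 3):
    lower, upper, prob = coverage(k)
    print(f"[{lower:.0f}, {upper:.0f}] covers {100 * prob:.1f}%")
```

The computed coverages are slightly more precise than the rounded 68%, 95%, and 99.7% figures, and they do not depend on the particular mean and standard deviation chosen.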
