Part 2 - Descriptive Statistics
Graphical Presentations
❖ Summarizing Categorical Data
• Frequency Distributions
• Bar Charts & Pareto Diagrams
• Pie Charts
❖ Summarizing Quantitative Data
• Frequency Distributions
• Dot plots
• Histograms and Skewness
• Cumulative Distributions
• Ogives
Section 2 - (Continued)
Topics
❖ Descriptive Statistics
❖ Summarizing Data
❖ Frequency Distributions
Descriptive Statistics
Summarizing Data
Summarizing Categorical Data
❖ Frequency Distribution
❖ Relative Frequency Distribution
❖ Percent Frequency Distribution
❖ Bar Chart
❖ Pie Chart
Frequency Distribution
❖ Example of a Motel:
Guests staying at a motel are usually asked to rate the quality
of their accommodation. Here is a sample of 20 ratings:
❖ Motel Example:
Rating           Frequency
Poor                 2
Below Average        3
Average              5
Above Average        9
Excellent            1
Total               20
Relative Frequency Distribution
Percent Frequency Distribution
Relative and Percent Frequency Distributions
❖ Example: Motel
Rating           Relative Frequency   Percent Frequency
Poor                    .10                  10
Below Average           .15                  15
Average                 .25                  25
Above Average           .45                  45
Excellent               .05                   5
Total                  1.00                 100
For example, for the Poor rating: 2/20 = .10 and .10(100) = 10.
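The arithmetic behind the relative and percent columns can be sketched in a few lines of Python. This is a minimal illustration, assuming the frequencies from the motel table; the function name `relative_and_percent` is hypothetical and chosen only for this example.

```python
# Frequencies from the motel ratings table (sample of 20 ratings).
frequencies = {
    "Poor": 2,
    "Below Average": 3,
    "Average": 5,
    "Above Average": 9,
    "Excellent": 1,
}


def relative_and_percent(freq):
    """Return {category: (relative frequency, percent frequency)}."""
    n = sum(freq.values())  # total number of observations
    return {
        category: (count / n, 100 * count / n)
        for category, count in freq.items()
    }


table = relative_and_percent(frequencies)
for rating, (rel, pct) in table.items():
    print(f"{rating:<15}{rel:>6.2f}{pct:>6.0f}")
```

Relative frequency is the class frequency divided by the sample size, and percent frequency is the relative frequency multiplied by 100.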
Topics
❖ Bar Charts
❖ Pareto Diagrams
❖ Pie Charts
Bar Chart
[Figure: bar chart "Ratings for the Motel"; vertical axis: Frequency, 1 to 10]
❖ In quality control, bar charts are used to identify the most important
causes of problems.
❖ When the bars are arranged in descending order of height from left
to right (with the most frequently occurring cause appearing first)
the bar chart is called a Pareto diagram.
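The ordering that turns a bar chart into a Pareto diagram can be sketched as follows. The defect causes and counts here are hypothetical, invented only for illustration, and `pareto_order` is an illustrative helper name, not something from the slides.

```python
# Hypothetical defect counts by cause (illustrative only, not from the slides).
defect_counts = {
    "Scratch": 12,
    "Misalignment": 30,
    "Crack": 5,
    "Discoloration": 18,
}


def pareto_order(counts):
    """Sort causes by descending frequency, most frequent cause first."""
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


pareto = pareto_order(defect_counts)
for cause, count in pareto:
    # One '#' per occurrence gives a rough text version of the bars.
    print(f"{cause:<15}{'#' * count} {count}")
```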
Pie Chart
Pie Chart of Motel Ratings
Excellent         5%
Poor             10%
Below Average    15%
Average          25%
Above Average    45%
Insights Gained from the Pie Chart
Topics
❖ Summarizing Quantitative Data
❖ Steps in drawing Frequency Distribution
❖ Example of Frequency Distribution
Summarizing Quantitative Data
❖ Frequency Distribution
❖ Relative Frequency and Percent Frequency Distributions
❖ Dot Plot
❖ Histogram and Skewness
❖ Cumulative Distributions
❖ Ogive
Frequency Distribution
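The three steps necessary to define the classes with quantitative
data are:
• Step 1: Determine the number of classes.
• Step 2: Determine the width of each class.
• Step 3: Determine the class limits.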
Step 1: Classes
Step 2: Class Width
❖ Guidelines for Determining the Width of Each Class
• Use classes of equal width.
• Approximate Class Width = (Largest data value - Smallest data value) / Number of classes
Using the same width for every class keeps the distribution
simple and easy to interpret.
Step 3: Class Limits
❖ How to determine Class Limits
• Class limits must be chosen so that each data item belongs
to one and only one class.
• The lower class limit identifies the smallest possible data
value assigned to the class.
• The upper class limit identifies the largest possible data
value assigned to the class.
Example: A-Z Superstore
The manager of A-Z Superstore wants to have a better
understanding of how much customers are spending in
her store. She examines 50 customer invoices.
91 78 93 57 75 52 99 80 97 62
71 69 72 89 66 75 79 75 72 76
104 74 62 68 97 105 77 65 80 109
85 97 88 68 83 68 71 69 67 74
62 82 98 101 79 105 79 69 62 73
Frequency Distribution
❖ Example: A-Z Superstore
If we choose six classes:
Approximate Class Width = (109 - 52)/6 = 9.5 ≈ 10
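The class-width calculation and the resulting tally can be sketched in Python. This is a minimal illustration using the 50 invoice amounts listed above; starting the first class at 50 (so the classes are 50-59, 60-69, and so on) is an assumption of this sketch, and the variable names are illustrative.

```python
import math

# The 50 customer invoice amounts from the A-Z Superstore example.
spending = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]

num_classes = 6
approx_width = (max(spending) - min(spending)) / num_classes  # (109 - 52) / 6
class_width = math.ceil(approx_width)  # round 9.5 up to a convenient 10

# Assumption: the first class starts at 50, giving 50-59, 60-69, ..., 100-109.
first_lower = 50
classes = [
    (first_lower + i * class_width, first_lower + (i + 1) * class_width - 1)
    for i in range(num_classes)
]

# Each value falls in exactly one class because the limits do not overlap.
frequency = {}
for lower, upper in classes:
    frequency[(lower, upper)] = sum(lower <= x <= upper for x in spending)

for (lower, upper), count in frequency.items():
    print(f"{lower}-{upper}: {count}")
```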
Relative and Percent Frequency Distributions
Insights from the Frequency Distribution
Topics
❖ Dot Plots
❖ Histograms
❖ Skewness in Histograms
Dot Plot
[Figure: dot plot; horizontal axis: Spending ($), 50 to 110]
Histogram
[Figure: histogram; vertical axis: Frequency, 2 to 18]
Skewness in Histograms
❖ Symmetric
• Left tail is the mirror image of the right tail
• Examples: heights and weights of people
Skewed Histograms
❖ Moderately Skewed towards Left
• A longer tail to the left
• Example: Scores for an easy exam ☺
[Figure: histogram moderately skewed to the left; vertical axis: Relative Frequency, 0 to .35]
ACT scores of students
Skewed Histograms
❖ Moderately Skewed towards Right
• A longer tail to the right
• Example: housing values
[Figure: histogram moderately skewed to the right; vertical axis: Relative Frequency, 0 to .35]
Prices of houses in the US
❖ Highly Skewed towards Right
Executive salaries in the US
❖ Highly Skewed towards Right
Topics
❖ Cumulative Distributions
❖ Ogives
Cumulative Distributions
❖ A-Z Superstore
Spending ($)   Cumulative Frequency   Cumulative Relative Frequency   Cumulative Percent Frequency
≤ 59                    2                        .04                              4
≤ 69                   15                        .30                             30
≤ 79                   31                        .62                             62
≤ 89                   38                        .76                             76
≤ 99                   45                        .90                             90
≤ 109                  50                       1.00                            100
For example, for the ≤ 69 row: 2 + 13 = 15, 15/50 = .30, and .30(100) = 30.
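The running totals in the cumulative table can be sketched with `itertools.accumulate`. This is a minimal illustration; the class frequencies below are the differences between successive cumulative frequencies in the table, and the variable names are illustrative.

```python
from itertools import accumulate

# Upper class limits and class frequencies for the A-Z Superstore data.
upper_limits = [59, 69, 79, 89, 99, 109]
class_frequencies = [2, 13, 16, 7, 7, 5]

n = sum(class_frequencies)                      # 50 invoices in total
cum_freq = list(accumulate(class_frequencies))  # running total of frequencies
cum_rel = [c / n for c in cum_freq]             # cumulative relative frequency
cum_pct = [100 * c / n for c in cum_freq]       # cumulative percent frequency

for upper, f, r, p in zip(upper_limits, cum_freq, cum_rel, cum_pct):
    print(f"<= {upper:<4}{f:>4}{r:>7.2f}{p:>6.0f}")
```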
Ogive
❖ It is the graph of a cumulative distribution.
❖ The data values are shown on the horizontal axis.
❖ Shown on the vertical axis are the:
• cumulative frequencies, or
• cumulative relative frequencies, or
• cumulative percent frequencies
❖ The chosen cumulative frequency of each class is plotted as a point.
❖ A-Z Superstore
• Because the class limits for the customer spending data are 50-59, 60-69,
and so on, there appear to be one-unit gaps from 59 to 60, 69 to
70, and so on.
• These gaps are closed by plotting each point halfway between adjacent
class limits. Thus, 59.5 is used for the 50-59 class, 69.5 is used for
the 60-69 class, and so on.
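The plotting positions for the ogive can be sketched as a list of (x, y) pairs. This is a minimal illustration using the cumulative percent frequencies for the A-Z Superstore data; starting the curve at (49.5, 0) is an assumption of this sketch, based on no value falling below the first class.

```python
# Upper class limits and cumulative percent frequencies (A-Z Superstore).
upper_limits = [59, 69, 79, 89, 99, 109]
cum_percent = [4, 30, 62, 76, 90, 100]

# Place each point halfway across the one-unit gap above the upper limit.
ogive_points = [(upper + 0.5, pct) for upper, pct in zip(upper_limits, cum_percent)]

# Assumption: anchor the curve at height 0 just below the first class.
ogive_points.insert(0, (49.5, 0))

for x, y in ogive_points:
    print(f"({x}, {y})")
```

Connecting these points with straight lines gives the ogive; the point (89.5, 76), for instance, says that 76% of the invoices were for $89 or less.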
Ogive with Cumulative Percent Frequencies
[Figure: ogive for A-Z Superstore; horizontal axis: Spending ($), 50 to 110; vertical axis: cumulative percent frequency, up to 100; labeled point (89.5, 76)]
Topics
❖ Exploratory Data Analysis
❖ Stem-and-Leaf Display
❖ Example
Exploratory Data Analysis
Stem-and-Leaf Display
❖ A stem-and-leaf display shows both the rank order
and shape of the distribution of the data.
❖ It is like a histogram, but with the advantage of
showing the actual data values.
❖ The first digits of each data item are arranged to the
left of a vertical line.
❖ To the right of the vertical line, we record the last
digit for each item in sequence.
❖ Each row in the display is referred to as a stem.
❖ Each digit on a stem is a leaf.
Example: A-Z Superstore
The manager of A-Z Superstore wants to have a better
understanding of how much customers are spending in
her store. She examines 50 customer invoices.
91 78 93 57 75 52 99 80 97 62
71 69 72 89 66 75 79 75 72 76
104 74 62 68 97 105 77 65 80 109
85 97 88 68 83 68 71 69 67 74
62 82 98 101 79 105 79 69 62 73
Stem-and-Leaf Display
 5 | 2 7
 6 | 2 2 2 2 5 6 7 8 8 8 9 9 9
 7 | 1 1 2 2 3 4 4 5 5 5 6 7 8 9 9 9
 8 | 0 0 2 3 5 8 9
 9 | 1 3 7 7 7 8 9
10 | 1 4 5 5 9
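The construction of this display can be sketched in Python. This is a minimal illustration for whole-number data with a leaf unit of 1; `stem_and_leaf` is an illustrative helper name, not something from the slides.

```python
from collections import defaultdict

# The 50 customer invoice amounts from the A-Z Superstore example.
spending = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]


def stem_and_leaf(values):
    """Map each stem (all digits but the last) to its sorted leaves (last digits)."""
    rows = defaultdict(list)
    for value in sorted(values):        # sorting puts the leaves in rank order
        stem, leaf = divmod(value, 10)  # e.g. 104 -> stem 10, leaf 4
        rows[stem].append(leaf)
    return dict(rows)


display = stem_and_leaf(spending)
for stem in sorted(display):
    leaves = " ".join(str(leaf) for leaf in display[stem])
    print(f"{stem:>2} | {leaves}")
```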
Stretched Stem-and-Leaf Display
❖ Example: A-Z Superstore
 5 | 2
 5 | 7
 6 | 2 2 2 2
 6 | 5 6 7 8 8 8 9 9 9
 7 | 1 1 2 2 3 4 4
 7 | 5 5 5 6 7 8 9 9 9
 8 | 0 0 2 3
 8 | 5 8 9
 9 | 1 3
 9 | 7 7 7 8 9
10 | 1 4
10 | 5 5 9
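The stretched display splits each stem into two rows, one for leaves 0-4 and one for leaves 5-9. A minimal Python sketch of that split follows; `stretched_stem_and_leaf` is an illustrative helper name.

```python
# The 50 customer invoice amounts from the A-Z Superstore example.
spending = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]


def stretched_stem_and_leaf(values):
    """Return (stem, leaves) rows with leaves 0-4 and 5-9 on separate rows."""
    rows = {}
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        half = 0 if leaf <= 4 else 1  # 0 = lower half of the stem, 1 = upper half
        rows.setdefault((stem, half), []).append(leaf)
    return [(stem, leaves) for (stem, _), leaves in sorted(rows.items())]


stretched = stretched_stem_and_leaf(spending)
for stem, leaves in stretched:
    print(f"{stem:>2} | {' '.join(str(leaf) for leaf in leaves)}")
```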
Stem-and-Leaf Display
❖ Leaf Units
• A single digit is used to define each leaf.
• In the preceding example, the leaf unit was 1.
• Leaf units may be 100, 10, 1, 0.1, and so on.
Example: Leaf Unit = 0.1
If we have data with values such as
8.6 11.7 9.4 9.1 10.2 11.0 8.8
a stem-and-leaf display of these data will be
Leaf Unit = 0.1
 8 | 6 8
 9 | 1 4
10 | 2
11 | 0 7
Example: Leaf Unit = 10
If we have data with values such as
1806 1717 1974 1791 1682 1910 1838
a stem-and-leaf display of these data will be
Leaf Unit = 10
16 | 8
17 | 1 9
18 | 0 3
19 | 1 7
The 82 in 1682 is rounded down to 80 and is represented as an 8.
Some accuracy is lost.
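The effect of a leaf unit larger than 1 can be sketched with integer division. This is a minimal illustration that assumes whole-number data and a whole-number leaf unit; `stem_and_leaf_with_unit` is an illustrative helper name.

```python
def stem_and_leaf_with_unit(values, leaf_unit):
    """Build {stem: leaves}, discarding digits smaller than the leaf unit."""
    rows = {}
    for value in sorted(values):
        scaled = value // leaf_unit      # 1682 // 10 = 168, so the final 2 is dropped
        stem, leaf = divmod(scaled, 10)  # 168 -> stem 16, leaf 8
        rows.setdefault(stem, []).append(leaf)
    return rows


data = [1806, 1717, 1974, 1791, 1682, 1910, 1838]
display = stem_and_leaf_with_unit(data, leaf_unit=10)

print("Leaf Unit = 10")
for stem in sorted(display):
    print(f"{stem} | {' '.join(str(leaf) for leaf in display[stem])}")
```

Integer division truncates rather than rounds, which matches the way 1682 is shown with a leaf of 8 in the display.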
Topics
❖ Cross-tabulation and Scatter Diagrams
❖ Cross-tabulation
❖ Row and Column Percentages
Cross-tabulations and Scatter Diagrams
Cross-tabulation
❖ Example: Motel Rooms
The daily rents and ratings of 150 motel rooms are shown below. Rating is
the categorical variable and Room Rent ($) is the quantitative variable.
                      Room Rent ($)
Rating       <50   50-100   100-150   >150   Total
Good          21     20        1        0      42
Very Good     17     32       23        3      75
Excellent      1      7       14       11      33
Total         39     59       38       14     150
Cross-tabulation: Row & Column percentages
Cross-tabulation: Row Percentage
Note: Some row totals are not exactly 100.00 due to rounding.
Example: (Good rating and rent <50) / (All with 'Good' rating) x 100 = (21/42) x 100 = 50.00
Cross-tabulation: Column Percentage
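Row and column percentages for the motel rooms cross-tabulation can be sketched as follows. This is a minimal illustration using the counts from the example; the function names are illustrative, and rounding to two decimals is an assumption made to mirror the "100.00" in the note above.

```python
# Counts from the motel rooms cross-tabulation, one list per rating,
# ordered by rent class.
rent_classes = ["<50", "50-100", "100-150", ">150"]
counts = {
    "Good": [21, 20, 1, 0],
    "Very Good": [17, 32, 23, 3],
    "Excellent": [1, 7, 14, 11],
}


def row_percentages(table):
    """Each cell as a percentage of its row (rating) total."""
    return {
        rating: [round(100 * cell / sum(row), 2) for cell in row]
        for rating, row in table.items()
    }


def column_percentages(table):
    """Each cell as a percentage of its column (rent class) total."""
    column_totals = [sum(column) for column in zip(*table.values())]
    return {
        rating: [round(100 * cell / total, 2) for cell, total in zip(row, column_totals)]
        for rating, row in table.items()
    }


row_pct = row_percentages(counts)
col_pct = column_percentages(counts)
for rating in counts:
    print(rating, row_pct[rating], col_pct[rating])
```

Because each cell is rounded separately, a row can add to 100.01 or 99.99 rather than exactly 100.00.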
Topics
❖ Scatter Diagram
❖ Trendline
❖ Summary of Section-2
Scatter Diagram and Trendline
Scatter Diagram
A Positive Relationship
Scatter Diagram
A Negative Relationship
Scatter Diagram
No apparent Relationship
Scatter Diagram
❖ Example: Ice-cream Sales vs. Temperature
The manager of an ice-cream parlor is interested in investigating the
relationship, if any, between Sales and outside Temperature.
x = Temperature    y = Number of Ice-creams Sold
     71                      37
     66                      27
     79                      61
     67                      35
     77                      49
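A trendline for these five points can be sketched with an ordinary least-squares fit. The slides do not say how the trendline is computed, so the least-squares method here is an assumption, and `least_squares_line` is an illustrative helper name.

```python
# Temperature (x) and number of ice-creams sold (y) from the example.
temperature = [71, 66, 79, 67, 77]
sold = [37, 27, 61, 35, 49]


def least_squares_line(x, y):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


slope, intercept = least_squares_line(temperature, sold)
direction = "positive" if slope > 0 else "negative or flat"
print(f"slope = {slope:.2f}, intercept = {intercept:.2f} ({direction} relationship)")
```

A positive slope means that sales tend to rise with temperature.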
Scatter Diagram
Example: Ice-cream Sales vs. Temperature
❖ Insights Gained from the Preceding Scatter Diagram
Tabular and Graphical Methods
[Figure: tree diagram of tabular and graphical methods; Data branches into Categorical Data and Quantitative Data]