
Chapter 2, Lecture Notes

Chapter 2 covers descriptive statistics, focusing on categorical and quantitative data types, and methods for summarizing data through frequency distributions, graphical displays like histograms and pie charts, and crosstabulations for two variables. It also discusses data visualization techniques and how to create various charts in Excel. The chapter emphasizes the importance of understanding data representation to analyze and interpret statistical information effectively.

Uploaded by

jaedon.ayven
Copyright
© All Rights Reserved

Chapter 2. Descriptive Statistics: Tabular and Graphical Displays

1
► There are two types of data: categorical and quantitative.

► Categorical data use labels or names to identify categories of like items.

► Quantitative data are numerical values that indicate how much or how many.

2
► Example 1:
Coca-cola  Pepsi  Pepsi  Coca-cola  Sprite
Coca-cola  Sprite  Pepsi  Coca-cola  Pepsi

► Example 2: Number of days required to audit each account
10  5  2  9  8  11  3  12  4  7

► Data visualization is a term used to describe the use of graphical displays to summarize data or present information about a data set.
3
4
2.1 Summarizing Data for a Categorical Variable

Frequency, Relative Frequency, and Percent Frequency Distributions

Category     Frequency   Relative frequency   Percent frequency
Coca-Cola    4           4/10 = 0.4           (0.4)(100) = 40%
Pepsi        4           4/10 = 0.4           (0.4)(100) = 40%
Sprite       2           2/10 = 0.2           (0.2)(100) = 20%

Cumulative Distributions

Category     Cum. frequency   Cum. relative frequency   Cum. percent frequency
Coca-Cola    4                0.4                       40%
Pepsi        8                0.8                       80%
Sprite       10               1.0                       100%
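
The tables above can be reproduced with a short script. This is only a sketch using the Python standard library; the names purchases, frequency_table, and cum_freq are chosen here for illustration and are not part of the lecture.

```python
from collections import Counter
from itertools import accumulate

# Soft drink labels from Example 1
purchases = [
    "Coca-Cola", "Pepsi", "Pepsi", "Coca-Cola", "Sprite",
    "Coca-Cola", "Sprite", "Pepsi", "Coca-Cola", "Pepsi",
]

def frequency_table(data, categories):
    """Return one row per category: (category, freq, relative freq, percent freq)."""
    counts = Counter(data)
    n = len(data)
    return [(c, counts[c], counts[c] / n, 100 * counts[c] / n) for c in categories]

categories = ["Coca-Cola", "Pepsi", "Sprite"]
table = frequency_table(purchases, categories)

# Cumulative frequencies are running totals of the frequency column.
cum_freq = list(accumulate(freq for _, freq, _, _ in table))

for (cat, freq, rel, pct), cum in zip(table, cum_freq):
    print(f"{cat:<10} freq={freq}  rel={rel:.1f}  pct={pct:.0f}%  cum. freq={cum}")
```

Dividing each cumulative frequency by the number of observations gives the cumulative relative frequency column in the same way.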

6
► The relative frequencies add up to 1.

► The percent frequencies add up to 100%.

7
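Histogram (= Bar Chart)

The y axis of a histogram is relative frequency. (Frequency or percent frequency can be plotted instead; for categorical data such as Example 1, the display is usually called a bar chart.)

Example 1:
Coca-cola  Pepsi  Pepsi  Coca-cola  Sprite
Coca-cola  Sprite  Pepsi  Coca-cola  Pepsi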

8
Shapes of Histograms

9
Pie Chart

Draw a circle to represent all the data. Then subdivide the circle into parts that correspond to the relative frequency for each class.

10
Creating Frequency Distribution and Bar Chart in Excel
Step 1: Select any cell in the data set.
Step 2: Click the Insert tab on the ribbon.
Step 3: Click Recommended Charts.
Step 4: Click OK.
Step 5: Select the Frequency Distribution table and click on the bar chart icon.

11
2.2 Summarizing Data for a Quantitative Variable

Example 2: 10, 5, 2, 9, 8, 11, 3, 12, 4, 7

Frequency, Relative Frequency, and Percent Frequency Distributions

Range    Frequency   Relative frequency   Percent frequency
0-4      3           3/10 = 0.3           (0.3)(100) = 30%
5-9      4           4/10 = 0.4           (0.4)(100) = 40%
10-14    3           3/10 = 0.3           (0.3)(100) = 30%

Cumulative Distributions

Range    Cum. frequency   Cum. relative frequency   Cum. percent frequency
0-4      3                0.3                       30%
5-9      7                0.7                       70%
10-14    10               1.0                       100%
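
For quantitative data the only extra step is assigning each value to a class. The sketch below assumes the three classes shown above, with both class limits inclusive; class_frequencies and rows are illustrative names.

```python
# Audit times (days) from Example 2
audit_days = [10, 5, 2, 9, 8, 11, 3, 12, 4, 7]

# Classes used on the slide; both limits are treated as inclusive.
classes = [(0, 4), (5, 9), (10, 14)]

def class_frequencies(data, classes):
    """Count how many data values fall into each (lower, upper) class."""
    return [sum(lower <= x <= upper for x in data) for lower, upper in classes]

freqs = class_frequencies(audit_days, classes)
n = len(audit_days)

# Build (range, freq, relative freq, cumulative freq, cumulative relative freq).
rows = []
running = 0
for (lower, upper), f in zip(classes, freqs):
    running += f
    rows.append((f"{lower}-{upper}", f, f / n, running, running / n))

for label, f, rel, cum, cum_rel in rows:
    print(f"{label:<6} freq={f}  rel={rel:.1f}  cum={cum}  cum rel={cum_rel:.1f}")
```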

12
Histogram (= Bar Chart)

The y axis of a histogram is relative frequency.

Example 2: 10, 5, 2, 9, 8, 11, 3, 12, 4, 7

13
How to Create Frequency Distribution and Bar Chart in Excel

Use a PivotTable.

Example files: softdrink.xlsx, Audit.xlsx

14
Dot Plot

A horizontal axis shows the range for the data. Each data value is represented by a dot placed above the axis.

Example: 12, 15, 16, 13, 12, 14, 15, 12, 14
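
A rough text version of the dot plot for this example can be printed as follows. The sketch assumes integer data values; dot_plot_lines and its width parameter are hypothetical helpers introduced here.

```python
from collections import Counter

values = [12, 15, 16, 13, 12, 14, 15, 12, 14]

def dot_plot_lines(data, width=4):
    """Build a text dot plot: one column per integer value, one dot per observation."""
    counts = Counter(data)
    axis = range(min(data), max(data) + 1)
    lines = []
    # Draw from the tallest stack of dots down to the row just above the axis.
    for level in range(max(counts.values()), 0, -1):
        lines.append("".join(("." if counts[v] >= level else " ").center(width) for v in axis))
    # The last line is the horizontal axis with the data values as labels.
    lines.append("".join(str(v).center(width) for v in axis))
    return lines

plot = dot_plot_lines(values)
print("\n".join(plot))
```

The value 12 occurs three times, so its column holds three dots, while 13 and 16 each hold one.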

15
Stem–and–Leaf Display

► Arrange the leading digits of each data value to the left of a vertical line.
► To the right of the vertical line, we record the last digit for each data value.

112  97  107  92  86  126  128  118  127  124
104  92  108  96  100  92  115  91  102  81

16
► Put the digits on each line in order.
► Use a rectangle to contain the leaves of each stem.
► Rotating this page counterclockwise onto its side
provides a picture that is similar to a histogram.
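
The construction steps can be sketched in code using the 20 values listed on the stem-and-leaf slide. The sketch assumes non-negative integers, with the last digit as the leaf and the remaining leading digits as the stem; stem_and_leaf is an illustrative name.

```python
from collections import defaultdict

scores = [112, 97, 107, 92, 86, 126, 128, 118, 127, 124,
          104, 92, 108, 96, 100, 92, 115, 91, 102, 81]

def stem_and_leaf(data):
    """Map each stem (all digits but the last) to its ordered list of leaves (last digit)."""
    stems = defaultdict(list)
    # Sorting the data first puts the leaves on each line in order.
    for x in sorted(data):
        stems[x // 10].append(x % 10)
    return dict(stems)

display = stem_and_leaf(scores)
for stem in sorted(display):
    leaves = " ".join(str(leaf) for leaf in display[stem])
    print(f"{stem:>2} | {leaves}")
```

The length of each printed line matches the class frequency that a histogram with classes 80-89, 90-99, and so on would show.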

18
2.3 Summarizing Data for Two Variables Using Tables

A crosstabulation is a tabular summary of data for two variables.

Quality rating and meal price data for 10 LA restaurants:

Restaurant   Quality Rating   Meal Price ($)
1            Good             18
2            Very Good        22
3            Good             28
4            Excellent        38
5            Very Good        33
6            Good             28
7            Very Good        19
8            Very Good        11
9            Very Good        23
10           Good             13

19
                 Meal Price
Quality Rating   $10–19   $20–29   $30–39   Total
Good             2        2        0        4
Very Good        2        2        1        5
Excellent        0        0        1        1
Total            4        4        2        10
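
The counts in this crosstabulation can be checked by tallying each (quality rating, price class) pair from the 10 restaurants. This is a minimal sketch; price_class, cell_counts, and crosstab are names assumed here, and the price classes are taken from the column headings.

```python
from collections import Counter

# (quality rating, meal price in $) for the 10 restaurants on the slide
restaurants = [
    ("Good", 18), ("Very Good", 22), ("Good", 28), ("Excellent", 38),
    ("Very Good", 33), ("Good", 28), ("Very Good", 19), ("Very Good", 11),
    ("Very Good", 23), ("Good", 13),
]

ratings = ["Good", "Very Good", "Excellent"]
price_classes = [(10, 19), (20, 29), (30, 39)]

def price_class(price):
    """Return the (lower, upper) class that contains the given meal price."""
    for lower, upper in price_classes:
        if lower <= price <= upper:
            return (lower, upper)
    raise ValueError(f"price {price} is outside every class")

# Count each (rating, price class) pair, then lay the counts out row by row.
cell_counts = Counter((rating, price_class(price)) for rating, price in restaurants)
crosstab = {r: [cell_counts[(r, c)] for c in price_classes] for r in ratings}
column_totals = [sum(col) for col in zip(*crosstab.values())]

for rating, row in crosstab.items():
    print(f"{rating:<10} {row}  total = {sum(row)}")
print(f"{'Total':<10} {column_totals}  total = {sum(column_totals)}")
```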

21
How to Create a Crosstabulation in Excel

Quality Rating.xlsx

Use a PivotTable.

Count of Restaurant   Column Labels
Row Labels     10-19   20-29   30-39   40-49   Grand Total
Good           42      40      2               84
Very Good      34      64      46      6       150
Excellent      2       14      28      22      66
Grand Total    78      118     76      28      300

22
We obtain column percentages by dividing each element in a particular column by the total for that column.

                 Meal Price
Quality Rating   $10–19   $20–29   $30–39
Good             50%      50%      0%
Very Good        50%      50%      50%
Excellent        0%       0%       50%
Total            100%     100%     100%

24
We obtain row percentages by dividing each element in a particular row by the total for that row.

                 Meal Price
Quality Rating   $10–19   $20–29   $30–39   Total
Good             50%      50%      0%       100%
Very Good        40%      40%      20%      100%
Excellent        0%       0%       100%     100%
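
Both kinds of percentages follow from the same table of counts. The sketch below assumes no row or column total is zero; row_percentages and column_percentages are illustrative helper names.

```python
# Counts from the crosstabulation of quality rating by meal price
# (columns: $10-19, $20-29, $30-39).
crosstab = {
    "Good":      [2, 2, 0],
    "Very Good": [2, 2, 1],
    "Excellent": [0, 0, 1],
}

def row_percentages(table):
    """Divide each cell by its row total and express the result as a percent."""
    return {label: [100 * x / sum(row) for x in row] for label, row in table.items()}

def column_percentages(table):
    """Divide each cell by its column total and express the result as a percent."""
    totals = [sum(col) for col in zip(*table.values())]
    return {label: [100 * x / t for x, t in zip(row, totals)]
            for label, row in table.items()}

row_pct = row_percentages(crosstab)
col_pct = column_percentages(crosstab)

for label in crosstab:
    print(f"{label:<10} row %: {row_pct[label]}  column %: {col_pct[label]}")
```

Row percentages describe the price mix within each quality rating, while column percentages describe the quality mix within each price class.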

26
Simpson's Paradox

Simpson's paradox occurs when conclusions drawn from separate crosstabulations are reversed when the data are aggregated.

             Judge
Verdict      Luckett       Kendall       Total
Upheld       129 (86%)     110 (88%)     239
Reversed     21 (14%)      15 (12%)      36
Total (%)    150 (100%)    125 (100%)    275

Which judge performed better overall?

Judge Luckett
Verdict      Common Pleas   Municipal Court   Total
Upheld       29 (91%)       100 (85%)         129
Reversed     3 (9%)         18 (15%)          21
Total (%)    32 (100%)      118 (100%)        150

Judge Kendall
Verdict      Common Pleas   Municipal Court   Total
Upheld       90 (90%)       20 (80%)          110
Reversed     10 (10%)       5 (20%)           15
Total (%)    100 (100%)     25 (100%)         125

What caused the paradox?
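
One way to explore the two questions is to compute the upheld rates per court and overall, along with each judge's share of cases per court. This is a sketch built from the counts in the tables; verdicts, upheld_rate, by_court, and overall are names assumed here.

```python
# (upheld, reversed) verdict counts by judge and court
verdicts = {
    "Luckett": {"Common Pleas": (29, 3), "Municipal Court": (100, 18)},
    "Kendall": {"Common Pleas": (90, 10), "Municipal Court": (20, 5)},
}

def upheld_rate(upheld, reversed_):
    """Percent of verdicts that were upheld."""
    return 100 * upheld / (upheld + reversed_)

# Upheld rate within each court, per judge.
by_court = {
    judge: {court: upheld_rate(*counts) for court, counts in courts.items()}
    for judge, courts in verdicts.items()
}

# Upheld rate after aggregating the two courts, per judge.
overall = {}
for judge, courts in verdicts.items():
    upheld = sum(u for u, _ in courts.values())
    reversed_ = sum(r for _, r in courts.values())
    overall[judge] = upheld_rate(upheld, reversed_)

for judge, courts in verdicts.items():
    total_cases = sum(u + r for u, r in courts.values())
    for court, (u, r) in courts.items():
        share = 100 * (u + r) / total_cases
        print(f"{judge} / {court}: upheld {by_court[judge][court]:.0f}%, "
              f"{share:.0f}% of this judge's cases")
    print(f"{judge} overall: upheld {overall[judge]:.0f}%")
```

The output shows Luckett with the higher upheld rate in each court but Kendall with the higher aggregate rate. Luckett heard most cases in Municipal Court, where both judges have lower upheld rates, and that difference in case mix drives the reversal.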


27
2.4 Summarizing Data for Two Variables Using Graphical Displays

A scatter diagram is a graphical display of the relationship between two quantitative variables. A trendline is a line that provides an approximation of that relationship.

Example:

No. of Commercials   Weekly Sales (in thousands of dollars)
2                    50
5                    57
3                    54

[Scatter diagram: weekly sales plotted against number of commercials]
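
The slides do not say how the trendline is fitted. A common choice, assumed here, is the least-squares line. The sketch uses only the three data points shown above; least_squares_line is an illustrative name.

```python
commercials = [2, 5, 3]   # number of commercials
sales = [50, 57, 54]      # weekly sales, in thousands of dollars

def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = least_squares_line(commercials, sales)
print(f"sales = {intercept:.2f} + {slope:.2f} * commercials")
```

A positive slope indicates that weekly sales tend to rise with the number of commercials in this small sample.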

28
Shapes of Scatter Diagrams

(P = positive relationship, N = negative relationship, NA = no apparent relationship)

1. Study hours vs test score (P)
2. Demand vs price (N)
3. Supply vs price (P)
4. Age and test score (NA)

29
Side-by-Side Bar Chart

A side-by-side bar chart is a graphical display for depicting multiple bar charts on the same display.

                 Meal Price
Quality Rating   $10–19   $20–29   $30–39   $40–49
Good             53.8%    33.9%    2.6%     0%
Very Good        43.6%    54.2%    60.5%    21.4%
Excellent        2.6%     11.9%    36.8%    78.6%
Total            100%     100%     100%     100%
30
31
Stacked Bar Chart

A stacked bar chart is a bar chart in which each bar is broken into rectangular segments of a different color showing the relative frequency of each class.

                 Meal Price
Quality Rating   $10–19   $20–29   $30–39   $40–49
Good             53.8%    33.9%    2.6%     0%
Very Good        43.6%    54.2%    60.5%    21.4%
Excellent        2.6%     11.9%    36.8%    78.6%
Total            100%     100%     100%     100%
32
33
How to Create Side-by-Side and Stacked Bar Charts in Excel

Quality Rating.xlsx

Use a PivotTable.

34
Choosing the Type of Graphical Display

Group the following charts into three categories:

► Displays Used to Show the Distribution of Data
Bar Chart, Pie Chart, Dot Plot, Histogram, Stem-and-Leaf Display
► Displays Used to Make Comparisons
Side-by-Side Bar Chart, Stacked Bar Chart
► Displays Used to Show Relationships
Scatter Diagram, Trendline

36
