2-Visualizing The Data - Part B

This document provides an overview of descriptive statistics techniques for tabular and graphical presentation of data, including stem-and-leaf displays, crosstabulations, scatter diagrams, and trendlines. It explains how to create and interpret these visualizations to explore relationships between variables and gain insights from data. Examples are given for each technique to demonstrate their use.

Uploaded by

Kashish Miglani

Visualizing Data

Part B

Chapter 2

Slide 1
Chapter 2
Descriptive Statistics:
Tabular and Graphical Presentations
Part B
■ Exploratory Data Analysis
■ Crosstabulations and Scatter Diagrams

Slide 2
Exploratory Data Analysis

■ The techniques of exploratory data analysis consist of simple arithmetic and easy-to-draw pictures that can be used to summarize data quickly.
■ One such technique is the stem-and-leaf display.

Slide 3
Stem-and-Leaf Display

■ A stem-and-leaf display shows both the rank order and shape of the distribution of the data.
■ It is similar to a histogram on its side, but it has the advantage of showing the actual data values.
■ The first digits of each data item are arranged to the left of a vertical line.
■ To the right of the vertical line we record the last digit for each item in rank order.
■ Each line in the display is referred to as a stem.
■ Each digit on a stem is a leaf.

Slide 4
Example: Hudson Auto Repair

The manager of Hudson Auto would like to have a better understanding of the cost of parts used in the engine tune-ups performed in the shop. She examines 50 customer invoices for tune-ups. The costs of parts, rounded to the nearest dollar, are listed on the next slide.

Slide 5
Example: Hudson Auto Repair

■ Sample of Parts Costs for 50 Tune-ups

91 78 93 57 75 52 99 80 97 62
71 69 72 89 66 75 79 75 72 76
104 74 62 68 97 105 77 65 80 109
85 97 88 68 83 68 71 69 67 74
62 82 98 101 79 105 79 69 62 73

Slide 6
Stem-and-Leaf Display

 5 | 2 7
 6 | 2 2 2 2 5 6 7 8 8 8 9 9 9
 7 | 1 1 2 2 3 4 4 5 5 5 6 7 8 9 9 9
 8 | 0 0 2 3 5 8 9
 9 | 1 3 7 7 7 8 9
10 | 1 4 5 5 9

Each line is a stem; each digit to the right of the vertical line is a leaf.
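The construction described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `stem_and_leaf` is hypothetical, and it assumes whole-number data with a leaf unit of 1.

```python
from collections import defaultdict

# Parts costs (dollars) for the 50 tune-ups, copied from the sample above.
costs = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]


def stem_and_leaf(values):
    """Return {stem: leaves in rank order} (hypothetical helper, leaf unit 1)."""
    rows = defaultdict(list)
    for v in sorted(values):
        # The last digit is the leaf; the remaining leading digits are the stem.
        stem, leaf = divmod(v, 10)
        rows[stem].append(leaf)
    return dict(rows)


display = stem_and_leaf(costs)
for stem in sorted(display):
    leaves = " ".join(str(leaf) for leaf in display[stem])
    print(f"{stem:>2} | {leaves}")
```

Sorting the values before splitting them means each stem's leaves come out in rank order without a second sort.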

Slide 7
Stretched Stem-and-Leaf Display

■ If we believe the original stem-and-leaf display has condensed the data too much, we can stretch the display by using two stems for each leading digit(s).
■ Whenever a stem value is stated twice, the first value corresponds to leaf values of 0 to 4, and the second value corresponds to leaf values of 5 to 9.

Slide 8
Stretched Stem-and-Leaf Display

 5 | 2
 5 | 7
 6 | 2 2 2 2
 6 | 5 6 7 8 8 8 9 9 9
 7 | 1 1 2 2 3 4 4
 7 | 5 5 5 6 7 8 9 9 9
 8 | 0 0 2 3
 8 | 5 8 9
 9 | 1 3
 9 | 7 7 7 8 9
10 | 1 4
10 | 5 5 9
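As a rough sketch of the stretching rule, the Python below splits each stem into a row for leaves 0 to 4 and a row for leaves 5 to 9. The function name `stretched_stem_and_leaf` is hypothetical, and the sketch simply omits a half-stem that has no leaves, which does not occur in this data set.

```python
# Parts costs (dollars) for the 50 tune-ups in the Hudson Auto Repair example.
costs = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]


def stretched_stem_and_leaf(values):
    """Return a list of (stem, leaves) rows, up to two rows per stem."""
    rows = {}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        half = 0 if leaf <= 4 else 1  # 0: leaves 0-4, 1: leaves 5-9
        rows.setdefault((stem, half), []).append(leaf)
    # Sorting by (stem, half) puts the 0-4 row before the 5-9 row.
    return [(stem, leaves) for (stem, _half), leaves in sorted(rows.items())]


stretched = stretched_stem_and_leaf(costs)
for stem, leaves in stretched:
    print(f"{stem:>2} | {' '.join(str(leaf) for leaf in leaves)}")
```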

Slide 9
Stem-and-Leaf Display

■ Leaf Units
• A single digit is used to define each leaf.
• In the preceding example, the leaf unit was 1.
• Leaf units may be 100, 10, 1, 0.1, and so on.
• Where the leaf unit is not shown, it is assumed to equal 1.

Slide 10
Example: Leaf Unit = 0.1

If we have data with values such as

8.6 11.7 9.4 9.1 10.2 11.0 8.8

a stem-and-leaf display of these data will be

Leaf Unit = 0.1

 8 | 6 8
 9 | 1 4
10 | 2
11 | 0 7

Slide 11
Example: Leaf Unit = 10

If we have data with values such as

1806 1717 1974 1791 1682 1910 1838

a stem-and-leaf display of these data will be

Leaf Unit = 10

16 | 8
17 | 1 9
18 | 0 3
19 | 1 7

The 82 in 1682 is rounded down to 80 and is represented as an 8.
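The two leaf-unit examples can be reproduced with one small function. The sketch below is an assumption about how one might implement the rule, not something given in the slides: it expresses each value in leaf units, drops any digits beyond the leaf (so 1682 becomes 168), and then splits off the last digit. `Decimal` is used so that 8.6 divided by 0.1 gives exactly 86 rather than a floating-point near miss.

```python
from decimal import Decimal


def stem_and_leaf(values, leaf_unit):
    """Return {stem: sorted leaves} for a given leaf unit (hypothetical helper)."""
    rows = {}
    for v in values:
        # Value expressed in whole leaf units; digits beyond the leaf are dropped.
        scaled = int(Decimal(str(v)) // Decimal(str(leaf_unit)))
        stem, leaf = divmod(scaled, 10)
        rows.setdefault(stem, []).append(leaf)
    return {stem: sorted(leaves) for stem, leaves in sorted(rows.items())}


tenths = stem_and_leaf([8.6, 11.7, 9.4, 9.1, 10.2, 11.0, 8.8], 0.1)
tens = stem_and_leaf([1806, 1717, 1974, 1791, 1682, 1910, 1838], 10)

for label, display in (("Leaf Unit = 0.1", tenths), ("Leaf Unit = 10", tens)):
    print(label)
    for stem, leaves in display.items():
        print(f"{stem:>2} | {' '.join(str(leaf) for leaf in leaves)}")
```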

Slide 12
Crosstabulations and Scatter Diagrams

■ Thus far we have focused on methods that are used to summarize the data for one variable at a time.
■ Often a manager is interested in tabular and graphical methods that will help understand the relationship between two variables.
■ Crosstabulation and a scatter diagram are two methods for summarizing the data for two (or more) variables simultaneously.

Slide 13
Crosstabulation

■ A crosstabulation is a tabular summary of data for two variables.
■ Crosstabulation can be used when:
• one variable is qualitative and the other is quantitative,
• both variables are qualitative, or
• both variables are quantitative.
■ The left and top margin labels define the classes for the two variables.

Slide 14
Crosstabulation

■ Example: Finger Lakes Homes

The number of Finger Lakes homes sold for each style and price for the past two years is shown below. Price range is the quantitative variable; home style is the qualitative variable.

Price                      Home Style
Range        Colonial    Log    Split    A-Frame    Total
≤ $99,000       18         6      19        12        55
> $99,000       12        14      16         3        45
Total           30        20      35        15       100

Slide 15
Crosstabulation

■ Insights Gained from Preceding Crosstabulation
• The greatest number of homes in the sample (19) are a split-level style and priced at less than or equal to $99,000.
• Only three homes in the sample are an A-Frame style and priced at more than $99,000.

Slide 16
Crosstabulation

Price                      Home Style
Range        Colonial    Log    Split    A-Frame    Total
≤ $99,000       18         6      19        12        55
> $99,000       12        14      16         3        45
Total           30        20      35        15       100

The Total column is the frequency distribution for the price variable. The Total row is the frequency distribution for the home style variable.
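To make the point about the margins concrete, here is a minimal Python sketch that recovers both frequency distributions from the eight cell counts. The dictionary layout and variable names are illustrative choices, and the first price label is written as "<= $99,000" following the slide text that describes that class as "less than or equal to $99,000".

```python
styles = ["Colonial", "Log", "Split", "A-Frame"]

# Cell counts from the Finger Lakes Homes crosstabulation, one list per price class.
counts = {
    "<= $99,000": [18, 6, 19, 12],
    "> $99,000": [12, 14, 16, 3],
}

# Row totals: the frequency distribution for the price variable.
price_totals = {price: sum(row) for price, row in counts.items()}

# Column totals: the frequency distribution for the home style variable.
style_totals = {
    style: sum(row[i] for row in counts.values())
    for i, style in enumerate(styles)
}

grand_total = sum(price_totals.values())

print(price_totals)
print(style_totals)
print(grand_total)
```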

Slide 17
Crosstabulation: Row or Column Percentages

■ Converting the entries in the table into row percentages or column percentages can provide additional insight about the relationship between the two variables.

Slide 18
Crosstabulation: Row Percentages

Price                      Home Style
Range        Colonial    Log      Split    A-Frame    Total
≤ $99,000     32.73     10.91     34.55     21.82      100
> $99,000     26.67     31.11     35.56      6.67      100

Note: row totals are actually 100.01 due to rounding.

(Colonial and > $99K)/(All > $99K) x 100 = (12/45) x 100 = 26.67

Slide 19
Crosstabulation: Column Percentages

Price                      Home Style
Range        Colonial    Log      Split    A-Frame
≤ $99,000     60.00     30.00     54.29     80.00
> $99,000     40.00     70.00     45.71     20.00
Total          100       100       100       100

(Colonial and > $99K)/(All Colonial) x 100 = (12/30) x 100 = 40.00
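Both percentage tables follow from the same cell counts: divide each cell by its row total for row percentages, or by its column total for column percentages. The Python below is a small sketch of that arithmetic; the function names are hypothetical and values are rounded to two decimals as in the slides.

```python
styles = ["Colonial", "Log", "Split", "A-Frame"]

# Cell counts from the Finger Lakes Homes crosstabulation.
counts = {
    "<= $99,000": [18, 6, 19, 12],
    "> $99,000": [12, 14, 16, 3],
}


def row_percentages(table):
    """Each cell as a percentage of its row total."""
    return {
        label: [round(100 * cell / sum(row), 2) for cell in row]
        for label, row in table.items()
    }


def column_percentages(table):
    """Each cell as a percentage of its column total."""
    column_totals = [sum(column) for column in zip(*table.values())]
    return {
        label: [round(100 * cell / total, 2) for cell, total in zip(row, column_totals)]
        for label, row in table.items()
    }


row_pct = row_percentages(counts)
col_pct = column_percentages(counts)

for label in counts:
    print(label, "row %:", row_pct[label], "column %:", col_pct[label])
```

Summing a rounded row, for example 32.73 + 10.91 + 34.55 + 21.82, gives 100.01, which is the rounding effect noted on the row percentages slide.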

Slide 20
Scatter Diagram and Trendline

■ A scatter diagram is a graphical presentation of the relationship between two quantitative variables.
■ One variable is shown on the horizontal axis and the other variable is shown on the vertical axis.
■ The general pattern of the plotted points suggests the overall relationship between the variables.
■ A trendline is an approximation of the relationship.

Slide 21
Scatter Diagram

■ A Positive Relationship

Slide 22
Scatter Diagram

■ A Negative Relationship

Slide 23
Scatter Diagram

■ No Apparent Relationship

Slide 24
Example: Student Marks

■ Scatter Diagram
The Management School is interested in investigating the relationship, if any, between study hours and marks scored.

x = Number of Study Hours    y = Marks Scored
            1                       14
            3                       24
            2                       18
            1                       17
            3                       30

Slide 25
Scatter Diagram

[Scatter diagram: Marks Scored (y-axis, 0 to 35) plotted against Number of Study Hours (x-axis, 0 to 4)]

Slide 26
Example: Student Marks

■ Insights Gained from the Preceding Scatter Diagram
• The scatter diagram indicates a positive relationship between the number of study hours and the marks scored.
• Higher marks are associated with a higher number of study hours.
• The relationship is not perfect; the plotted points do not all lie on a straight line.
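The slides describe a trendline only as an approximation of the relationship and do not say how it is drawn. One common choice, assumed here, is the least-squares line. The sketch below fits it to the five (study hours, marks) pairs; the function name `least_squares_line` is hypothetical.

```python
# Study hours (x) and marks scored (y) from the Student Marks example.
hours = [1, 3, 2, 1, 3]
marks = [14, 24, 18, 17, 30]


def least_squares_line(x, y):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = least_squares_line(hours, marks)
print(f"marks = {intercept:.2f} + {slope:.2f} * hours")
```

For these five points the fitted line is approximately marks = 9.1 + 5.75 × hours. The positive slope agrees with the positive relationship noted above, and the fact that the observed marks differ from the fitted values reflects that the relationship is not perfect.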

Slide 27
Tabular and Graphical Procedures

Data

Qualitative Data
  Tabular Methods
  • Frequency Distribution
  • Rel. Freq. Dist.
  • Percent Freq. Distribution
  • Crosstabulation
  Graphical Methods
  • Bar Graph
  • Pie Chart

Quantitative Data
  Tabular Methods
  • Frequency Distribution
  • Rel. Freq. Dist.
  • Cum. Freq. Dist.
  • Cum. Rel. Freq. Distribution
  • Stem-and-Leaf Display
  • Crosstabulation
  Graphical Methods
  • Dot Plot
  • Histogram
  • Ogive
  • Scatter Diagram

Slide 28
Observation  Occupation          Satisfaction Score    Observation  Occupation          Satisfaction Score
 1           Lawyer              42                    21           Physical Therapist  80
 2           Physical Therapist  86                    22           Systems Analyst     64
 3           Lawyer              42                    23           Physical Therapist  55
 4           Systems Analyst     55                    24           Cabinetmaker        64
 5           Lawyer              38                    25           Cabinetmaker        59
 6           Cabinetmaker        79                    26           Cabinetmaker        54
 7           Lawyer              44                    27           Systems Analyst     76
 8           Systems Analyst     41                    28           Systems Analyst     60
 9           Physical Therapist  55                    29           Physical Therapist  59
10           Systems Analyst     66                    30           Cabinetmaker        78
11           Lawyer              53                    31           Physical Therapist  60
12           Cabinetmaker        65                    32           Physical Therapist  50
13           Lawyer              74                    33           Cabinetmaker        79
14           Physical Therapist  52                    34           Systems Analyst     62
15           Physical Therapist  78                    35           Lawyer              45
16           Systems Analyst     44                    36           Cabinetmaker        84
17           Systems Analyst     71                    37           Physical Therapist  62
18           Lawyer              50                    38           Systems Analyst     73
19           Lawyer              48                    39           Cabinetmaker        60
20           Cabinetmaker        69                    40           Lawyer              64

Slide 29
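The slide gives no instructions for this data set. Given the chapter's topic, one natural use is a crosstabulation of occupation against satisfaction score. The Python sketch below does that under an assumption that is not in the slides: scores are grouped into classes of width 10 (30-39, 40-49, and so on). The function name `score_class` is hypothetical.

```python
from collections import Counter

# (occupation, satisfaction score) for observations 1 to 40, in order.
observations = [
    ("Lawyer", 42), ("Physical Therapist", 86), ("Lawyer", 42),
    ("Systems Analyst", 55), ("Lawyer", 38), ("Cabinetmaker", 79),
    ("Lawyer", 44), ("Systems Analyst", 41), ("Physical Therapist", 55),
    ("Systems Analyst", 66), ("Lawyer", 53), ("Cabinetmaker", 65),
    ("Lawyer", 74), ("Physical Therapist", 52), ("Physical Therapist", 78),
    ("Systems Analyst", 44), ("Systems Analyst", 71), ("Lawyer", 50),
    ("Lawyer", 48), ("Cabinetmaker", 69), ("Physical Therapist", 80),
    ("Systems Analyst", 64), ("Physical Therapist", 55), ("Cabinetmaker", 64),
    ("Cabinetmaker", 59), ("Cabinetmaker", 54), ("Systems Analyst", 76),
    ("Systems Analyst", 60), ("Physical Therapist", 59), ("Cabinetmaker", 78),
    ("Physical Therapist", 60), ("Physical Therapist", 50), ("Cabinetmaker", 79),
    ("Systems Analyst", 62), ("Lawyer", 45), ("Cabinetmaker", 84),
    ("Physical Therapist", 62), ("Systems Analyst", 73), ("Cabinetmaker", 60),
    ("Lawyer", 64),
]


def score_class(score, width=10):
    """Lower limit of the class containing score (assumed class width of 10)."""
    return (score // width) * width


# Count observations in each (occupation, score class) cell.
table = Counter((occ, score_class(score)) for occ, score in observations)

occupations = sorted({occ for occ, _ in observations})
classes = sorted({score_class(score) for _, score in observations})
labels = [f"{c}-{c + 9}" for c in classes]

print(f"{'Occupation':<20}" + "".join(f"{lab:>7}" for lab in labels) + f"{'Total':>7}")
for occ in occupations:
    row = [table[(occ, c)] for c in classes]
    print(f"{occ:<20}" + "".join(f"{n:>7}" for n in row) + f"{sum(row):>7}")
```

A different class width would give a different table, so the choice of 10 should be read as one reasonable option rather than the intended answer.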
End of Chapter 2, Part B

Slide 30
