Session 2
Bar Charts
Pie Charts
Pareto Charts
Bar Chart
Uses horizontal or vertical bars to show the distribution of
a categorical variable.
The bars have lengths proportional to the values that they
represent.
The length of the bar indicates the size of the group defined
by the column label.
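A bar chart is best suited to categorical data. Numerical
data can be shown this way only after it has been grouped
into categories.
Bar charts look a lot like histograms, and the two are often
mistaken for each other. In a histogram the bars represent
intervals of a numerical variable rather than categories.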
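The idea that bar length is proportional to the value can be sketched in a few lines of Python. This is only an illustration: the function name `text_bar_chart` and the category counts are hypothetical and do not come from the slides.

```python
def text_bar_chart(counts, width=40):
    """Return one text line per category; bar length is proportional to the count."""
    largest = max(counts.values())
    lines = []
    for label, count in counts.items():
        # Scale every bar against the largest category, which gets `width` marks.
        bar = "#" * round(width * count / largest)
        lines.append(f"{label:<12} {bar} {count}")
    return lines


# Hypothetical counts for a categorical variable (not from the slides).
category_counts = {"Category A": 40, "Category B": 30, "Category C": 20, "Category D": 10}
chart = text_bar_chart(category_counts)
print("\n".join(chart))
```

Because every bar is scaled against the largest category, a category with half the count gets a bar half as long.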
Which hosts send the most visitors to
Amazon’s Web site?
[Chart: visitors to Amazon’s Web site by host. Two series
(Series1, Series2) on dual axes, 0 to 8,000 and 0% to 120%.
Host labels not recoverable.]
Example: ROLLING OVER
Question:
Are certain types of vehicles more prone to roll-over
accidents than others?
Method:
Data gathered from the Fatality Analysis Reporting System
(FARS) for roll-over accidents on interstate highways.
Cases that make up the rows are accidents resulting in
roll-overs in 2000.
The column of interest is the model of the car involved.
Frequency table
Bar Graph
Inference
Question:
Apple, Google and Research in Motion (RIM) aggressively
compete to sell their smartphones to businesses. RIM has
dominated with its Blackberry line, but has that success held
up to the intense competition from Apple and Google?
Example: Selling Smartphones to Businesses
Inference
Ordered Array
Frequency Distributions
Cumulative Distributions
Stacked Or Unstacked Format
This is an issue when you have a categorical variable that
may be used to group your numerical variable for analysis.
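As a rough sketch of the two layouts: in stacked format every observation is a row holding a group label and a value, while in unstacked format each group has its own column of values. The data and the helper names `unstack` and `stack` below are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

# Stacked format: one row per observation, (group label, numerical value).
# Hypothetical data; not from the slides.
stacked = [("A", 12.0), ("B", 15.5), ("A", 11.2), ("B", 14.8), ("A", 13.1)]


def unstack(rows):
    """Convert stacked rows into one list of values per group."""
    groups = defaultdict(list)
    for label, value in rows:
        groups[label].append(value)
    return dict(groups)


def stack(groups):
    """Convert per-group lists back into (label, value) rows."""
    return [(label, value) for label, values in groups.items() for value in values]


unstacked = unstack(stacked)
print(unstacked)
```

Converting in either direction keeps every observation; only the layout changes.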
Advantages:
You must give attention to selecting the appropriate number of class groupings
for the table, determining a suitable width of a class grouping, and establishing
the boundaries of each class grouping to avoid overlapping.
The number of classes depends on the number of values in the data. With a
larger number of values, typically there are more classes. In general, a
frequency distribution should have at least 5 but no more than 15 classes.
To determine the width of a class interval, you divide the range (Highest value–
Lowest value) of the data by the number of class groupings desired.
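The width calculation described above can be sketched as follows. The function names are hypothetical, and rounding up to a convenient step (here 10) is an assumption about how the raw width is usually tidied. The lowest and highest values used, 12 and 58, are those of the marks data later in this session.

```python
import math


def class_width(values, num_classes):
    """Raw class width: range of the data divided by the number of classes."""
    data_range = max(values) - min(values)
    return data_range / num_classes


def round_up_to(width, step):
    """Round a raw width up to the next multiple of `step`."""
    return math.ceil(width / step) * step


raw = class_width([12, 58], num_classes=5)  # (58 - 12) / 5
tidy = round_up_to(raw, step=10)
print(raw, tidy)
```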
Why Use a Frequency Distribution?
91 78 93 57 75 52 99 80 97 62
71 69 72 89 66 75 79 75 72 76
104 74 62 68 97 105 77 65 80 109
85 97 88 68 83 68 71 69 67 74
62 82 98 101 79 105 79 69 62 73
Tabular Summary: Frequency and Percent Frequency

Parts Cost ($)   Frequency   Percent Frequency
50-59                 2               4    (2/50)100
60-69                13              26
70-79                16              32
80-89                 7              14
90-99                 7              14
100-109               5              10
Total                50             100
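The tabular summary can be reproduced from the 50 parts costs with a short Python sketch. The function name `frequency_table` is hypothetical, and the code assumes whole-dollar values and classes of width 10 starting at 50, as in the table.

```python
parts_cost = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]


def frequency_table(values, lower, width, num_classes):
    """Return (class label, frequency, percent frequency) for each class."""
    counts = [0] * num_classes
    for v in values:
        counts[(v - lower) // width] += 1  # index of the class containing v
    rows = []
    for i, count in enumerate(counts):
        start = lower + i * width
        label = f"{start}-{start + width - 1}"
        rows.append((label, count, 100 * count / len(values)))
    return rows


table = frequency_table(parts_cost, lower=50, width=10, num_classes=6)
for label, freq, pct in table:
    print(f"{label:<8} {freq:>3} {pct:>5.0f}")
```

Each percent frequency is the class frequency divided by the total of 50 and multiplied by 100, which is the (2/50)100 calculation shown for the first class.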
Given here are the marks scored by students in a
Mathematics test. Prepare a frequency distribution with
class width 10. Also draw a histogram, ogive and
frequency polygon for the data.
24, 35, 17, 21, 24, 37, 26, 46, 58, 30, 32, 13, 12, 38, 41, 43, 44, 27, 53, 27
Sort raw data in ascending order:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
Find range: 58 - 12 = 46
Select number of classes: 5 (usually between 5 and 15)
Compute class interval (width): 46/5 = 9.2, rounded up to 10
Determine class boundaries (limits), with no overlap:
Class 1: 10 to under 20
Class 2: 20 to under 30
Class 3: 30 to under 40
Class 4: 40 to under 50
Class 5: 50 to under 60
Compute class midpoints: 15, 25, 35, 45, 55
Count observations & assign to classes
Frequency Distribution of Marks
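The counting step can be checked with a minimal Python sketch. It assumes each class includes its lower boundary and excludes its upper one, so a mark of 30 falls in the 30 to 40 class. The variable names are illustrative. The cumulative frequencies are the values an ogive would plot, and the midpoints are the x-positions for a frequency polygon.

```python
marks = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
         32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

lower, width, num_classes = 10, 10, 5

# Count how many marks fall in each class [start, start + width).
frequencies = [0] * num_classes
for m in marks:
    frequencies[(m - lower) // width] += 1

# Midpoint of each class.
midpoints = [lower + i * width + width / 2 for i in range(num_classes)]

# Running total of the frequencies.
cumulative = []
running = 0
for f in frequencies:
    running += f
    cumulative.append(running)

for i in range(num_classes):
    start = lower + i * width
    print(f"{start} to {start + width}: {frequencies[i]} (cumulative {cumulative[i]})")
```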