
Chapter 02 - Fundamentals of Data Visualization

Chapter 02 of TM1151 covers the fundamentals of data visualization, focusing on techniques for categorical and numerical variables. It discusses various visualization methods such as bar charts, pie charts, histograms, frequency polygons, ogives, box plots, and stem-and-leaf plots, providing examples and steps for creating each type. The chapter emphasizes the importance of applying basic statistics to analyze business problems effectively.

TM1151 - Business Statistics and Software Applications

Chapter 02 - Fundamentals of data visualization

Learning Outcomes
At the end of this lesson, you will be able to:
1. Apply basic statistics in analyzing business problems.

Data Visualization Techniques based on Variables
Variable type:
• One categorical variable: one-way frequency tables, bar charts, pie charts
• One numerical variable: histogram, frequency polygon, ogive, box plots, stem-and-leaf plots

One-Way Frequency Table
Categorical Variable

Numerical Variable

Bar Chart
• In a bar chart, each bar represents one level of the categorical variable. Bars can be drawn vertically or horizontally.

• Frequency, cumulative frequency, or percentage can be used for the y-axis, while the x-axis represents the categorical variable.

• The length of each bar is proportional to the value it represents.

• There are several types of bar charts, including:
1. Simple bar chart
2. Multiple/Clustered bar chart
3. Stacked/Component bar chart
4. Percentage component bar chart

1. Simple Bar Chart

2. Multiple/Clustered Bar Chart

3. Stacked/Component Bar Chart

4. Percentage Component Bar Chart

Example 01
The following table shows newspaper sales over four years. Plot a bar chart for this data set:

Year    Sales (1000's)
2004    250
2005    150
2006    350
2007    450
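
As a minimal sketch of the idea that bar length is proportional to value, the Example 01 data can be drawn as a text bar chart. The scale of one `#` per 50 (thousand) units and the helper names `bar_length` and `text_bar_chart` are assumptions made for this illustration.

```python
# Sales figures from Example 01 (in thousands).
sales = {2004: 250, 2005: 150, 2006: 350, 2007: 450}

UNITS_PER_SYMBOL = 50  # assumed scale: one '#' per 50 (thousand) units


def bar_length(value, units_per_symbol=UNITS_PER_SYMBOL):
    """Number of symbols in a bar, proportional to the value."""
    return round(value / units_per_symbol)


def text_bar_chart(data):
    """Return one horizontal bar per category, as lines of text."""
    return [f"{label} | {'#' * bar_length(value)} {value}" for label, value in data.items()]


for line in text_bar_chart(sales):
    print(line)
```

Each year is one category level, so each gets exactly one bar; the same data could equally be drawn with vertical bars.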

Example 02
Plot a bar chart for the following data set:

Gender    Hair color    Count
Male      Black         10
Male      White         16
Male      Grey          7
Female    Black         15
Female    White         9
Female    Grey          8
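
With two categorical variables, one option is a percentage component bar chart. The sketch below only computes the bar segments for that chart from the Example 02 counts; treating gender as the bar and hair color as the component is an assumption, as is the helper name `percentage_components`.

```python
# Counts from Example 02, grouped by gender.
counts = {
    "Male": {"Black": 10, "White": 16, "Grey": 7},
    "Female": {"Black": 15, "White": 9, "Grey": 8},
}


def percentage_components(groups):
    """Convert the counts in each group to percentages of the group total."""
    result = {}
    for group, parts in groups.items():
        total = sum(parts.values())
        result[group] = {name: 100 * value / total for name, value in parts.items()}
    return result


percentages = percentage_components(counts)
for gender, parts in percentages.items():
    row = ", ".join(f"{colour}: {pct:.1f}%" for colour, pct in parts.items())
    print(f"{gender}: {row}")
```

Using the raw counts instead of percentages as segment heights would give a stacked bar chart, and placing the three counts side by side would give a clustered bar chart.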

Pie Chart

• Pie charts are used to analyze one categorical variable.

• In a pie chart, the area of each sector is proportional to the value of the category it represents.

• A pie chart is appropriate when the variable has only a few categories or when the values of the categories vary widely.

Example 03
The number of fruits sold by a seller is as follows. Draw a pie chart based on the following data.

Fruit Type     No. of units sold
Watermelons    5
Pears          20
Apples         15
Oranges        10
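
Since the area of a sector is proportional to its central angle, one way to prepare the pie chart for Example 03 is to convert each count into an angle out of 360 degrees. This is a sketch of that arithmetic only; the helper name `sector_angles` is hypothetical.

```python
# Units sold per fruit type, from Example 03.
units_sold = {"Watermelons": 5, "Pears": 20, "Apples": 15, "Oranges": 10}


def sector_angles(data):
    """Central angle in degrees for each category, proportional to its value."""
    total = sum(data.values())
    return {name: 360 * value / total for name, value in data.items()}


angles = sector_angles(units_sold)
for fruit, angle in angles.items():
    print(f"{fruit}: {angle:.0f} degrees")
```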

Histogram
• First, divide the data set into a suitable number of classes (intervals/categories) of equal width.
• The classes together with their frequencies (counts) are called a frequency distribution.
• Frequency, relative frequency, or percentage can be used for the y-axis, while the x-axis represents the classes of the variable.
• In a histogram, each bar represents one class, and the length of the bar is proportional to the frequency of that class.
• In a histogram, adjacent bars touch each other (there are no gaps between bars).

Example 04
Draw a histogram for the following data set:

42 74 40 60 82 115 41 61 75 83 63
53 110 76 84 50 57 78 77 63 65 95
68 69 104 80 79 79 54 73 59 81 100
56 49 77 90 84 76 42 64 69 70 80
72 50 79 52 103 96 51 86 73 94 71

Step 01:

Arrange the data set in ascending (smallest to largest) or descending (largest to smallest) order.

Step 02:
• Calculate the range of the data set.
• The range is the difference between the largest and smallest observations.

Step 03: Estimate the number of class intervals

• The number of class intervals depends on the data set.

• The number of class intervals, k,
  • can be guessed, or
  • can be taken as the smallest integer k such that $2^k \geq n$ (n is the sample size).
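
The condition on k is not legible in this copy of the slide. The sketch below assumes the commonly used rule "smallest integer k with 2^k ≥ n", which is consistent with the six classes used in the following steps for the 55 observations of Example 04. The function name `number_of_classes` is hypothetical.

```python
def number_of_classes(n):
    """Smallest integer k such that 2**k >= n (assumed rule, see lead-in)."""
    k = 0
    while 2 ** k < n:
        k += 1
    return k


n = 55  # sample size of the Example 04 data set
k = number_of_classes(n)
print(k)
```

The result is only a starting point; the number of classes can still be adjusted to suit the data.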

Step 04: Estimate the Class Width

• The approximate class width can be found as follows. It can be rounded off to a convenient number.

$$\text{Class Width} = \frac{\text{Range}}{\text{No. of Intervals}}$$
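
As a worked sketch for the Example 04 data, the range and class width can be computed as follows. Using six intervals and rounding the width up to the next whole number are assumptions; any other convenient rounding would also fit the rule above.

```python
import math

# Data set from Example 04.
data = [
    42, 74, 40, 60, 82, 115, 41, 61, 75, 83, 63,
    53, 110, 76, 84, 50, 57, 78, 77, 63, 65, 95,
    68, 69, 104, 80, 79, 79, 54, 73, 59, 81, 100,
    56, 49, 77, 90, 84, 76, 42, 64, 69, 70, 80,
    72, 50, 79, 52, 103, 96, 51, 86, 73, 94, 71,
]

NUM_CLASSES = 6  # assumed number of class intervals

data_range = max(data) - min(data)   # largest minus smallest observation
raw_width = data_range / NUM_CLASSES
class_width = math.ceil(raw_width)   # assumed rounding: up to a whole number

print(data_range, raw_width, class_width)
```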

Step 05: Class intervals

Class Class Interval


1
2
3
4
5
6

Step 06: Frequency table

The number of observations in a particular class is called the class frequency.

Class Class Interval Tally marks Frequency


1
2
3
4
5
6
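
One possible way to fill in the table for Example 04 is sketched below. The choice of six classes of width 13 starting at 40 (the smallest observation) is an assumption, and the helper names `class_limits` and `frequency_table` are hypothetical; other class choices give a different but equally valid table.

```python
# Data set from Example 04.
data = [
    42, 74, 40, 60, 82, 115, 41, 61, 75, 83, 63,
    53, 110, 76, 84, 50, 57, 78, 77, 63, 65, 95,
    68, 69, 104, 80, 79, 79, 54, 73, 59, 81, 100,
    56, 49, 77, 90, 84, 76, 42, 64, 69, 70, 80,
    72, 50, 79, 52, 103, 96, 51, 86, 73, 94, 71,
]

FIRST_LOWER_LIMIT = 40  # assumption: start at the smallest observation
CLASS_WIDTH = 13        # assumption: range / 6 rounded up
NUM_CLASSES = 6         # assumption


def class_limits(first_lower, width, count):
    """Inclusive (lower limit, upper limit) pairs for whole-number data."""
    return [(first_lower + i * width, first_lower + (i + 1) * width - 1)
            for i in range(count)]


def frequency_table(values, limits):
    """Count the observations falling inside each class interval."""
    return [sum(low <= v <= high for v in values) for low, high in limits]


limits = class_limits(FIRST_LOWER_LIMIT, CLASS_WIDTH, NUM_CLASSES)
frequencies = frequency_table(data, limits)

for number, ((low, high), freq) in enumerate(zip(limits, frequencies), start=1):
    print(f"{number}  {low:>3}-{high:<3}  {'|' * freq:<15}  {freq}")
```

The frequencies must add up to the sample size, which is a quick check that no observation was missed or counted twice.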
Step 07: Class boundary

$$\text{Lower Class Boundary} = \frac{1}{2}\left(\text{Upper Class Limit of Lower Class} + \text{Lower Class Limit of Given Class}\right)$$

$$\text{Upper Class Boundary} = \frac{1}{2}\left(\text{Lower Class Limit of Upper Class} + \text{Upper Class Limit of Given Class}\right)$$
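
The two formulas can be sketched in code as follows. The class limits are the assumed whole-number classes of width 13 starting at 40; for the first and last class, which have no neighbouring class, the sketch assumes the same gap between classes as elsewhere. The function name `class_boundaries` is hypothetical.

```python
# Assumed class limits for the Example 04 data (six classes of width 13).
limits = [(40, 52), (53, 65), (66, 78), (79, 91), (92, 104), (105, 117)]


def class_boundaries(limits):
    """Apply the boundary formulas to a list of (lower, upper) class limits."""
    gap = limits[1][0] - limits[0][1]  # distance between neighbouring classes
    boundaries = []
    for i, (low, high) in enumerate(limits):
        # Upper limit of the class below; extrapolated for the first class.
        prev_upper = limits[i - 1][1] if i > 0 else low - gap
        # Lower limit of the class above; extrapolated for the last class.
        next_lower = limits[i + 1][0] if i < len(limits) - 1 else high + gap
        boundaries.append(((prev_upper + low) / 2, (next_lower + high) / 2))
    return boundaries


boundaries = class_boundaries(limits)
for (low, high), (lb, ub) in zip(limits, boundaries):
    print(f"{low}-{high}: {lb} to {ub}")
```

Each upper boundary coincides with the lower boundary of the next class, which is what removes the gaps between histogram bars.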

Step 08: Mid point

$$\text{Class Mid Point (Class Mark)} = \frac{1}{2}\left(\text{Upper Class Boundary} + \text{Lower Class Boundary}\right)$$
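
A short sketch of the midpoint formula, applied to class boundaries that assume six classes of width 13 starting at 40 for the Example 04 data. The function name `class_midpoint` is hypothetical.

```python
# Assumed class boundaries for the Example 04 data.
boundaries = [(39.5, 52.5), (52.5, 65.5), (65.5, 78.5),
              (78.5, 91.5), (91.5, 104.5), (104.5, 117.5)]


def class_midpoint(lower_boundary, upper_boundary):
    """Class mark: the average of the two class boundaries."""
    return (upper_boundary + lower_boundary) / 2


midpoints = [class_midpoint(low, high) for low, high in boundaries]
print(midpoints)
```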

Class    Class Interval    Class Boundaries    Class Mid Value (x_i)    Frequency (f_i)

Frequency Polygon
• A frequency polygon is another way of displaying data graphically.

How to draw a frequency polygon?

Step 01: First, create the frequency table.

Step 02: Plot each class frequency against its class midpoint.

Step 03: Add one class to the left of the first midpoint and one class to the right of the last midpoint, each with a frequency of zero.

Step 04: Connect the points with straight lines, so that the polygon begins and ends with a frequency of zero.
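
The steps above can be sketched as a list of points to plot and join. The midpoints and frequencies assume six classes of width 13 starting at 40 for the Example 04 data; the function name `polygon_points` is hypothetical.

```python
# Assumed class midpoints and frequencies for the Example 04 data.
midpoints = [46, 59, 72, 85, 98, 111]
frequencies = [9, 11, 15, 12, 6, 2]
CLASS_WIDTH = 13  # assumed class width


def polygon_points(midpoints, frequencies, width):
    """(x, y) points of the frequency polygon, closed with zero-frequency ends."""
    left = (midpoints[0] - width, 0)    # extra class to the left, frequency 0
    right = (midpoints[-1] + width, 0)  # extra class to the right, frequency 0
    return [left, *zip(midpoints, frequencies), right]


points = polygon_points(midpoints, frequencies, CLASS_WIDTH)
print(points)
```

Joining consecutive points with straight lines gives the polygon.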
Ogive Curve (Cumulative Frequency Curve)
• There are two types of ogive:
1. Less-than ogive
2. More-than ogive

How to Draw a Less-Than Ogive Curve?

Step 01: First, create the frequency table.
Step 02: Calculate the cumulative frequencies.
Step 03: Take the cumulative frequencies along the y-axis (vertical axis) and the upper class boundaries on the x-axis (horizontal axis).
Step 04: Plot each cumulative frequency against its upper class boundary.
Step 05: Connect the points with straight lines.
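
A minimal sketch of the less-than ogive points, assuming six classes of width 13 starting at 40 for the Example 04 data. The variable names are chosen for this illustration only.

```python
from itertools import accumulate

# Assumed upper class boundaries and frequencies for the Example 04 data.
upper_boundaries = [52.5, 65.5, 78.5, 91.5, 104.5, 117.5]
frequencies = [9, 11, 15, 12, 6, 2]

# Running total: number of observations below each upper class boundary.
less_than_cumulative = list(accumulate(frequencies))

# Points to plot: (upper class boundary, cumulative frequency).
less_than_points = list(zip(upper_boundaries, less_than_cumulative))
print(less_than_points)
```

The last cumulative frequency equals the sample size, so the curve rises from left to right and ends at n.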
How to Draw a More-Than Ogive Curve?
Step 01: First, create the frequency table.
Step 02: Calculate the "more than" cumulative frequencies (the number of observations at or above each lower class boundary).
Step 03: Take the cumulative frequencies along the y-axis (vertical axis) and the lower class boundaries on the x-axis (horizontal axis).
Step 04: Plot each cumulative frequency against its lower class boundary.
Step 05: Connect the points with straight lines.
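
For the more-than ogive, the cumulative frequencies are accumulated from the last class backwards. The sketch below assumes six classes of width 13 starting at 40 for the Example 04 data; the variable names are illustrative.

```python
from itertools import accumulate

# Assumed lower class boundaries and frequencies for the Example 04 data.
lower_boundaries = [39.5, 52.5, 65.5, 78.5, 91.5, 104.5]
frequencies = [9, 11, 15, 12, 6, 2]

# Accumulate from the last class backwards, then restore the class order,
# giving the number of observations above each lower class boundary.
more_than_cumulative = list(accumulate(reversed(frequencies)))[::-1]

# Points to plot: (lower class boundary, cumulative frequency).
more_than_points = list(zip(lower_boundaries, more_than_cumulative))
print(more_than_points)
```

Here the first cumulative frequency equals the sample size, so the curve falls from left to right.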

Example 05
Draw a frequency polygon, a less-than ogive, and a more-than ogive for the data set in Example 04.

Box Plot
• First, identify the five-number summary and any outliers for the variable.

• Five-number summary:
  • Minimum
  • Maximum
  • Q1 (1st quartile)
  • Q2 (median/2nd quartile)
  • Q3 (3rd quartile)
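
As a sketch, the five-number summary can be computed with Python's `statistics.quantiles`, here for the data set used in Example 06. Several quartile conventions exist and can give slightly different values; this example assumes the function's default ("exclusive") method, which may differ from the method used in the course. The function name `five_number_summary` is hypothetical.

```python
import statistics

# Data set from Example 06.
data = [53, 43, 30, 38, 30, 42, 12, 46, 39, 37, 34, 46, 32, 18, 5]


def five_number_summary(values):
    """Minimum, quartiles and maximum (quartiles by the default 'exclusive' method)."""
    q1, q2, q3 = statistics.quantiles(values, n=4)
    return {"min": min(values), "q1": q1, "median": q2, "q3": q3, "max": max(values)}


summary = five_number_summary(data)
print(summary)
```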

• A box is used to represent the middle half of the data.

• Line segments (whiskers) are used to represent the other half of the data.

• A box plot is used to identify the distribution pattern of the data.

Outliers
• Outliers should be identified before drawing the box plot.

• A limit should be defined for the accepted range of values, using the interquartile range $IQR = Q_3 - Q_1$:

$$\text{Upper Bound} = Q_3 + 1.5 \times IQR$$

$$\text{Lower Bound} = Q_1 - 1.5 \times IQR$$

• Values outside this range are considered outliers and are marked with asterisks (*).

• Q1, the median, and Q3 are marked as a box.

• The minimum and maximum values that are not outliers are the end points of the whiskers of the box plot.
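
The outlier bounds and whisker end points can be sketched for the Example 06 data as follows. The quartiles again assume the default ("exclusive") method of `statistics.quantiles`; another quartile convention would shift the bounds slightly. The function name `outlier_bounds` is hypothetical.

```python
import statistics

# Data set from Example 06.
data = [53, 43, 30, 38, 30, 42, 12, 46, 39, 37, 34, 46, 32, 18, 5]


def outlier_bounds(values):
    """Lower and upper bounds: Q1 - 1.5 * IQR and Q3 + 1.5 * IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # assumed quartile method
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


lower_bound, upper_bound = outlier_bounds(data)

# Values outside the bounds are outliers (drawn as asterisks).
outliers = sorted(v for v in data if v < lower_bound or v > upper_bound)

# Whiskers end at the smallest and largest values that are not outliers.
inside = [v for v in data if lower_bound <= v <= upper_bound]
whisker_ends = (min(inside), max(inside))

print(lower_bound, upper_bound, outliers, whisker_ends)
```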

Example 06
Draw a box plot for the following data set:

53, 43, 30, 38, 30, 42, 12, 46, 39, 37, 34, 46, 32, 18, 5

Stem and Leaf Plot

• A stem-and-leaf plot is useful when the data set is very small.

• First, arrange the data set in ascending order.

• Next, split each data value into two parts, known as the “stem” and the “leaf”.

• The “leaf” is usually the last digit of the number.

• The other digits, to the left of the “leaf”, form the “stem”.

Example 07
Draw a stem-and-leaf plot for the following data set:

53, 43, 30, 38, 30, 42, 12, 46, 39, 37, 34, 46, 32, 18, 5
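
A minimal sketch of a text stem-and-leaf plot for this data set, assuming non-negative whole numbers, the last digit as the leaf, and one row for every stem between the smallest and largest (including stems with no leaves). The function name `stem_and_leaf` is hypothetical.

```python
# Data set from Example 07.
data = [53, 43, 30, 38, 30, 42, 12, 46, 39, 37, 34, 46, 32, 18, 5]


def stem_and_leaf(values):
    """Return the plot as lines of text, one line per stem."""
    stems = {}
    for value in sorted(values):        # ascending order keeps leaves sorted
        stem, leaf = divmod(value, 10)  # leaf = last digit, stem = the rest
        stems.setdefault(stem, []).append(leaf)
    lines = []
    for stem in range(min(stems), max(stems) + 1):
        leaves = " ".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())
    return lines


for line in stem_and_leaf(data):
    print(line)
```

The single-digit value 5 is treated as stem 0 with leaf 5.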

In Next Chapter…
• Descriptive statistics will be discussed.

Thank You

Rajika Gunarathne

Department of Management of Technology


Faculty of Business
University of Moratuwa

Email: [email protected]
