Graphical Representation of Statistical Data
GRAPHICAL REPRESENTATION:
Tabulation is a good method of condensing and representing data in a readily
understandable form, but many people have no taste for figures. They would
prefer a form of representation in which figures are avoided. This purpose is
achieved by presenting statistical data in a visual form. The visual display
of statistical data in the form of points, lines, areas and other geometrical forms
and symbols is, in the most general terms, known as Graphical Representation.
With this method, statistical data can be studied without going through figures
presented in the form of tables.
Such visual representations are described in the sections that follow. The basic
difference between a graph and a diagram is that a graph is a representation of
data by a continuous curve, usually shown on graph paper, while a diagram is
any other one-, two- or three-dimensional form of visual representation.
a) Categorical Data:-
1) Example: - Squash World Open Champions from 1976-2001.
Microsoft Encarta
Deserts Sq Km
Sahara 9100000
Gobi 1300000
Patagonian 670000
Rub al Khali 650000
Great Sandy 390500
Great Victoria 390500
b) Numerical Data:-
1) Example: - The following data show the consumption of raw
material by industry in Pakistan during 1999-2004.
Source:-APTMA
Period Consumption ('000 kg)
1999-00 1,566,348
2000-01 1,673,280
2001-02 1,755,669
2002-03 1,943,197
2003-04 1,938,678
1. The following table gives the birth rate per thousand of different countries over a
certain period of time.
COUNTRY BIRTH RATE PER THOUSAND
INDIA 33
GERMANY 15
UK 20
CHINA 40
DENMARK 30
SWEDEN 15
DIAGRAM:
2. The following table shows the price list of various brands of cars with 2 Liter
engine.
DIAGRAM:
1. The table below gives data relating to the exports and imports of a certain country
X (in thousands of dollars) during the four years ending in 1930 - 31.
DIAGRAM:
2. The table given below shows the values of shares of P.T.C.L.A, OGDCL, & PPL
in KSE on trading days of a week.
DIAGRAM:
DIAGRAM:
DIAGRAM:
4) Pie Chart:-
A pie-diagram, also known as sector diagram, is a graphic device consisting of a
circle divided into sectors or pie-shaped pieces whose areas are proportional to the
various parts into which the whole quantity is divided. The sectors are shaded or
coloured differently to show the relationship of parts to the whole.
Procedure for construction of pie chart: Draw a circle of any convenient
radius. As a circle consists of 360°, the whole quantity to be displayed is
equated to 360°. The proportion that each component part or category bears to the
whole quantity will be the corresponding proportion of 360°. These
corresponding proportions, i.e. angles, are calculated by
Angle = (component part / whole quantity) × 360°
Then divide the circle into different sectors by constructing angles at the
centre by means of a protractor and draw the corresponding radii.
1) Example: - The data show the Soccer World Cup wins of different teams
from 1930-1998. Source:-www.google.com
Country        No. of times won   Percentage (%)   Out of 360°
Uruguay        2                  13               46.8
Italy          3                  19               68.4
West Germany   3                  19               68.4
Brazil         4                  24               86.4
England        1                  6                21.6
Argentina      2                  13               46.8
France         1                  6                21.6
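The angle rule can be checked with a short sketch. It applies angle = part / whole × 360 directly to the win counts in the table; the function name `pie_angles` is an illustrative choice, not something from the original notes. Because the table rounds each share to a whole percentage before multiplying by 3.6, its angles differ slightly from the directly computed ones (for example 46.8 in the table against 45.0 here for Uruguay).

```python
# Win counts per country, copied from the table above.
wins = {
    "Uruguay": 2,
    "Italy": 3,
    "West Germany": 3,
    "Brazil": 4,
    "England": 1,
    "Argentina": 2,
    "France": 1,
}


def pie_angles(parts):
    """Return {category: sector angle in degrees} using part / whole * 360."""
    whole = sum(parts.values())
    return {name: value / whole * 360 for name, value in parts.items()}


angles = pie_angles(wins)
for name, angle in angles.items():
    print(f"{name:<13} {angle:6.1f}")
```

Whatever rounding is used, the sector angles should add up to 360, which is a quick way to catch arithmetic slips before drawing.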
2) Example:-The given data show exports of the USA in billion dollars from 1975-2000.
Source:-Microsoft Encarta
Year   Exports (Billion Dollars)   Percentage (%)   Out of 360°
1975   132.6                       4                14.4
1980   271.8                       9                32.4
1985   288.8                       9                32.4
1990   537.2                       17               61.2
1995   794.2                       26               93.6
2000   1065.7                      35               126
1. The following table shows the yearly expenditure of Mr. Ted, a college
undergraduate, in various categories.
BOARDING
TRANSPORTATION 3000 10 36
INSURANCE 1000 3.33 12
SUNDRY EXPENSES 4000 13.33 48
TOTAL 30000
DIAGRAM:
2. The pie chart below shows the fractions of dogs in a dog competition in seven
different groups of dog breeds. Suppose 1000 dogs entered the competition in all.
DIAGRAM:
5) Frequency Polygons:-
1. The following table shows the Frequency distribution for quantity of Glucose in
100 people.
92 14
96 18
100 11
104 18
108 6
112 8
116 5
120 3
124 1
128 0
132 0
136 0
140 0
144 0
DIAGRAM:
2. The following chart shows the number of lunatics and their frequency.
DIAGRAM:
2) Example: -The data shows wickets taken by a cricket player in his debut.
Wickets Cumulative
taken(x) No. of times(f) Frequency(c.f)
1 22 22
2 19 41
3 20 61
4 12 73
5 9 82
6 3 85
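The cumulative frequency column is a running total of the frequency column. A minimal sketch using the values from the table above (the variable names are illustrative):

```python
from itertools import accumulate

# Values copied from the table above.
wickets = [1, 2, 3, 4, 5, 6]
frequency = [22, 19, 20, 12, 9, 3]

# Each cumulative frequency is the sum of all frequencies up to that row.
cumulative = list(accumulate(frequency))

for x, f, cf in zip(wickets, frequency, cumulative):
    print(f"{x:>2} {f:>3} {cf:>3}")
```

The last cumulative frequency should equal the total number of observations, here 85.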
1. The following table shows the frequency distribution for quantity of Glucose in
100 people.
DIAGRAM:
2. The following chart shows the number of lunatics and their frequency
7) Pareto Diagrams:-
Definition: A bar graph used to arrange information in such a way that
priorities for process improvement can be established.
Purposes:
Pareto diagrams are named after Vilfredo Pareto, an Italian sociologist and
economist, who invented this method of information presentation toward
the end of the 19th century. The chart is similar to the histogram or bar
chart, except that the bars are arranged in decreasing order from left to
right along the abscissa. The fundamental idea behind the use of Pareto
diagrams for quality improvement is that the first few (as presented on the
diagram) contributing causes to a problem usually account for the majority
of the result. Thus, targeting these "major causes" for elimination results
in the most cost-effective improvement scheme.
How to Construct:
1. Determine the categories and the units for comparison of the data,
such as frequency, cost, or time.
2. Total the raw data in each category, then determine the grand total
by adding the totals of each category.
3. Re-order the categories from largest to smallest.
4. Determine the cumulative percent of each category (i.e., the sum of
each category plus all categories that precede it in the rank order,
divided by the grand total and multiplied by 100).
5. Draw and label the left-hand vertical axis with the unit of
comparison, such as frequency, cost or time.
6. Draw and label the horizontal axis with the categories. List from
left to right in rank order.
7. Draw and label the right-hand vertical axis from 0 to 100 percent.
The 100 percent should line up with the grand total on the left-
hand vertical axis.
8. Beginning with the largest category, draw in bars for each category
representing the total for that category.
9. Draw a line graph beginning at the right-hand corner of the first
bar to represent the cumulative percent for each category as
measured on the right-hand axis.
10. Analyze the chart. Usually the top 20% of the categories will
comprise roughly 80% of the cumulative total.
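Steps 2 to 4 of the construction can be sketched in a few lines. The category names and counts below are hypothetical, invented only to illustrate the arithmetic, and `pareto_table` is an illustrative helper name rather than part of any library.

```python
def pareto_table(totals):
    """Return rows (category, total, cumulative percent), largest total first."""
    grand_total = sum(totals.values())
    # Step 3: re-order the categories from largest to smallest.
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    rows = []
    running = 0
    for category, total in ranked:
        # Step 4: running sum divided by the grand total, times 100.
        running += total
        rows.append((category, total, 100 * running / grand_total))
    return rows


# Hypothetical complaint counts, for illustration only.
complaints = {"billing": 10, "late delivery": 50, "wrong item": 15, "damaged item": 25}

rows = pareto_table(complaints)
for category, total, cumulative_percent in rows:
    print(f"{category:<14} {total:>4} {cumulative_percent:6.1f}%")
```

The bars are drawn from the totals in this order, and the line graph is drawn from the cumulative percent column against the right-hand axis.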
Tips:
1. The table given below shows the problems with the computer.
DIAGRAM:
DIAGRAM:
8) Pictographs:-
1) Example: - The following data shows Bikes assembled by a company
on different week days from Monday to Saturday.
Scale: - 1 Bike picture represents 5 bikes
Day Production
Monday 12
Tuesday 23
Wednesday 17
Thursday 27
Friday 10
Saturday 22
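With the stated scale of 1 picture per 5 bikes, the number of pictures for each day follows from a division with remainder. The sketch below prints a rough text version of the pictograph; the letter `B` stands in for the bike picture and `pictures_needed` is an illustrative helper name.

```python
SCALE = 5  # one bike picture represents 5 bikes, as stated in the example

# Daily production, copied from the table above.
production = {
    "Monday": 12,
    "Tuesday": 23,
    "Wednesday": 17,
    "Thursday": 27,
    "Friday": 10,
    "Saturday": 22,
}


def pictures_needed(value, scale=SCALE):
    """Return (whole pictures, leftover bikes to draw as a partial picture)."""
    return divmod(value, scale)


for day, bikes in production.items():
    whole, leftover = pictures_needed(bikes)
    partial = f" + {leftover}/{SCALE} of a picture" if leftover else ""
    print(f"{day:<10} {'B' * whole}{partial}")
```

Monday, for example, needs 2 whole pictures plus a partial picture standing for the remaining 2 bikes.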
9) Fishbone diagram:-
Dr. Kaoru Ishikawa, a Japanese quality control statistician, invented the fishbone
diagram. Therefore, it may be referred to as the Ishikawa diagram. The fishbone diagram
is an analysis tool that provides a systematic way of looking at effects and the causes that
create or contribute to those effects. Because of the function of the fishbone diagram, it
may be referred to as a cause-and-effect diagram. The design of the diagram looks much
like the skeleton of a fish. Therefore, it is often referred to as the fishbone diagram.
Whatever name you choose, remember that the value of the fishbone diagram is to assist
teams in categorizing the many potential causes of problems or issues in an orderly way
and in identifying root causes.
When should a fishbone diagram be used?
Does the team...
Need to study a problem/issue to determine the root cause?
Want to study all the possible reasons why a process is beginning to have
difficulties, problems, or breakdowns?
Need to identify areas for data collection?
Want to study why a process is not performing properly or producing the desired
results?
EXAMPLE # 1:
Draw a fishbone diagram of a doorknob by showing its parts.
GRAPH:
EXAMPLE # 2:
Draw a fishbone diagram of Biological Warfare Disease.
GRAPH:
1. The CEO of a call center wants to know why not all calls are answered
and also wants to improve the ability to handle calls.
DIAGRAM:
2. The following fishbone diagram illustrates the reasons why a project
deadline was not met.
DIAGRAM:
10) Histogram:-
1) Example: - The following data shows the number of workers in different factories.
No of workers Factories h
70-75 15 5
75-80 9 5
80-85 25 5
85-90 18 5
90-95 27 5
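Since every class in the table has the same width h = 5, the bar heights of the histogram can be taken directly from the frequencies. The following sketch checks the class widths and prints a rough text histogram; it is only an illustration of the table above, not a drawing procedure from the original notes.

```python
# (lower limit, upper limit, number of factories), copied from the table above.
classes = [(70, 75, 15), (75, 80, 9), (80, 85, 25), (85, 90, 18), (90, 95, 27)]

# Collect the distinct class widths; a single value means equal-width classes.
widths = {upper - lower for lower, upper, _ in classes}
equal_width = len(widths) == 1

total_factories = sum(freq for _, _, freq in classes)

# With equal widths, one star per factory gives bars proportional to frequency.
for lower, upper, freq in classes:
    print(f"{lower}-{upper} | {'*' * freq} ({freq})")
```

If the widths were unequal, the heights would have to be frequency divided by class width instead, so that bar areas stay proportional to the frequencies.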
1. Divide each observation in the data set into two parts, the Stem and the
Leaf.
2. List the stems in order in a column, starting with the smallest stem and
ending with the largest.
3. Proceed through the data set, placing the leaf for each observation in the
appropriate stem row.
Depending on the data, a display can use one, two or five lines per stem. Among
the different stems, two-line stems are widely used.
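The three steps above can be sketched as follows for a one-line-per-stem display. The sketch assumes non-negative whole numbers, takes the last digit as the leaf and the remaining digits as the stem, and uses a hypothetical data set invented for illustration; `stem_and_leaf` is an illustrative helper name.

```python
def stem_and_leaf(values):
    """Return {stem: leaves in ascending order}, one row per stem."""
    ordered = sorted(values)
    # Step 2: list every stem from the smallest to the largest.
    stems = range(ordered[0] // 10, ordered[-1] // 10 + 1)
    rows = {stem: [] for stem in stems}
    for value in ordered:
        # Step 1: split each observation into a stem and a leaf.
        stem, leaf = divmod(value, 10)
        # Step 3: place the leaf in the row of its stem.
        rows[stem].append(leaf)
    return rows


# Hypothetical observations, for illustration only.
scores = [23, 31, 18, 45, 27, 36, 12, 29, 40, 33]

display = stem_and_leaf(scores)
for stem, leaves in display.items():
    print(f"{stem} | {''.join(str(leaf) for leaf in leaves)}")
```

Sorting first means the leaves in each row come out in ascending order, which makes the display easier to read than placing leaves in the order the data were recorded.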
2. A stem and leaf display arranges the data in an orderly fashion and makes
it easy to determine certain numerical characteristics to be discussed in
the following chapter.
The classes and the numbers falling in them are quickly determined once we have
selected the digits that we want to use for the stems and leaves.
122 132 128 140 110 170 165 112 145 167
176 189 190 156 188 166 165 145 178 133
118 192 185 173 169 154 148 122 113 111
170 122 132 137 109 165 197 129 167 178
109 118 187 176 151 176 145 189 145 123
NO#2
Now find the median of all the numbers. Notice that since there are
13 numbers, the middle one will be the seventh number:
This must be the median (middle number) because there are six
numbers on each side.
The next step is to find the lower median. This is the middle of the
lower six numbers. The exact centre is half-way between 8 and 9 ...
which would be 8.5
Now find the upper median. This is the middle of the upper six
numbers. The exact centre is half-way between 14 and 14 ... which
must be 14
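The three medians can be computed as in the sketch below. The original list of fish lengths is not reproduced in this text, so the 13 values used here are hypothetical, chosen only so that they agree with the figures quoted in this section (smallest 5, largest 20, median 12, lower median 8.5, upper median 14). The helper name `three_medians` is illustrative.

```python
from statistics import median


def three_medians(values):
    """Return (lower median, median, upper median).

    For an odd number of values, the middle value is left out of both halves,
    as in the description above.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if len(ordered) % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)


# Hypothetical fish lengths in cm, consistent with the numbers quoted in the text.
lengths = [5, 7, 8, 9, 10, 11, 12, 13, 14, 14, 14, 18, 20]

lower_median, main_median, upper_median = three_medians(lengths)
print(lower_median, main_median, upper_median)
```

Other conventions for quartiles exist (for example, including the middle value in both halves), and they can give slightly different box edges for the same data.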
Now you are ready to construct the actual box & whisker graph. First
you will need to draw an ordinary number line that extends far
enough in both directions to include all the numbers in your data:
First, locate the main median 12 using a vertical line just above your
number line:
Now locate the lower median 8.5 and the upper median 14 with
similar vertical lines:
Next, draw a box using the lower and upper median lines as
endpoints:
Finally, the whiskers extend out to the data's smallest number 5 and
largest number 20:
But what does it mean? What information about the data does this
graph give you?
Well, it's obvious from the graph that the lengths of the fish were as
small as 5 cm, and as long as 20 cm. This gives you the range of the
data ... 15.
You also know the median, or middle value was 12 cm.
Since the medians (three of them) represent the middle points, they
split the data into four equal parts. In other words:
one quarter of the data numbers are less than 8.5
one quarter of the data numbers are between 8.5 and 12
one quarter of the data numbers are between 12 and 14
one quarter of the data numbers are greater than 14
Here is a picture of the quarter of the data that is between 8.5 and 12.
Notice that the data is more spread out here:
This picture is showing where half the data numbers are. Half of all
the fish caught had a length between 8.5 and 14 centimetres:
Graphical representation of
Textile Related data
1. SIMPLE BAR DIAGRAM
a) Exported man made fiber
Years Quantity of fiber exported
1993-94 25,422
1994-95 61,485
1995-96 28,714
1996-97 48,484
1997-98 34,015
1998-99 34,515
1999-00 22,716
2000-01 28,524
2001-02 45,665
2002-03 66,653
2003-04 54,878
b) Consumption of cotton
Year Cotton
1990-91 1,128,978
1991-92 1,257,399
1992-93 1,318,892
1993-94 1,511,610
1994-95 1,412,732
1995-96 1,509,955
1996-97 1,444,368
1997-98 1,471,169
1998-99 1,441,923
1999-00 1,566,348
2000-01 1,673,280
2001-02 1,755,669
2002-03 1,943,197
2003-04 1,938,678
DIAGRAM: consumption of cotton (bar chart; vertical axis from 0 to 2,500,000; horizontal axis by year)
a) Production of cloth
province wise
a) Production of yarn
1997-98 23 45 27 4
1998-99 26 47 24 2
1999-00 25 49 23 2
2000-01 25 50 22 3
2001-02 26 49 21 3
2002-03 25 46 25 4
2003-04 20 50 25 5
B) Production of cloth
4. FREQUENCY POLYGONS
Years Quantity
1995-96 455,693
1996-97 212,452
1997-98 512,862
1998-99 421,481
1999-00 512,971
2000-01 512,467
2001-02 544,217
2002-03 519,329
2003-04 458,962
B) Export of cloth
Year Quantity of cloth exported
1971-72 409808
1979-80 545768
1989-90 1017868
1996-97 1257430
1997-98 1271272
1998-99 1355166
1999-00 1574876
2000-01 1735824
2001-02 1957353
2002-03 2036321
2003-04 2378900
5. CUMULATIVE FREQUENCY POLYGONS
b) Export of cloth
6. PARETO DIAGRAM
A) Export of yarn
Year Quantity of yarn exported % quantity Cumulative %
1996-97 1,411,519 18.89990509 18.89990509
1997-98 1,153,542 15.44565416 34.34555925
7. PIE CHART
a) Worldwide fiber production in the year 2004
(Quantity measured in 1000 tonnes)
2.
Summary stats for 1986
NumNumeric = 55
NumNonNumeric = 0
NumCases = 55
Mean = 6.0673
Median = 6.1000
Std Deviation = 0.47339
Range = 2.4000
Minimum = 4.6000
Maximum = 7
75-th %ile = 6.4000
Which of the following is NOT CORRECT?
a. The 25th percentile is about 5.9.
b. Some outliers appear to be present below a pH of 5.4.
c. About 95% of the observations have pH values in the approximate range
6±1.
d. About 10% of the values are in the range 5.8 to 6.0.
e. About 75% of the values are less than 6.4. (d)
3. The following is a histogram showing the actual frequency of the closing prices
on the New York exchange of a particular stock.
Based on the above frequency histogram for New York Stock exchange, the class
that contains the 80th percentile is:
a. 20-30
b. 10-20
c. 40-50
d. 50-60 (e)
e. 30-40
5. The weights of the male and female students in a class are summarized in the
following boxplots:
(e)
6. Consider the following box plots of the grades in a course in statistics for each sex
drawn according to the convention that the whiskers reach the 10th and 90th
percentiles.
7. Consider the following box-plot of the yield of barley drawn using the convention
that the whiskers reach the 10th and 90th percentiles.
e. 15%
10. Forest companies routinely take samples from tracts that have been replanted to
monitor the growth of the trees. Suppose that in a recent sample of two tracts, the
diameter of the trees was measured with the following results:
Tract A B
Trees 75 210
Range 232-315 215-250 (mm)
12. For each student in a class, the sex and weight (in kilograms) are recorded.
Consider the following SAS program:
DATA STUDENTS;
INPUT SEX $ WEIGHT;
DATALINES;
F 62
.
. (<-- more data here)
.
M 78
;
PROC SORT DATA=STUDENTS;
BY SEX;
PROC CHART DATA=STUDENTS;
VBAR WEIGHT/TYPE=PERCENT MIDPOINTS=55 TO 95 BY 10;
BY SEX;
0|28
1|2245
2|01333358889
3|001356679
4|22444466788
5|000
a. 75
b. 44
c. 32
d. 37.5 (b)
e. 30
0|9
1|225
2|013335889
3|00136679
4|02244478
5|0
a. 30.5
b. 30.0
c. 25.0
d. 28.5 (a)
e. 44.0
16. Rainwater was collected in water collectors at thirty different sites near an
industrial basin and the amount of acidity (pH level) was measured. The following
stem-and-leaf diagram shows the pH values, which ranged from 2.6 to 6.3.
Stems Leaves
2 679
3 237789
4 1222446899
5 0556788
6 0233
a. 4.2
b. 4.4
c. 4.5
d. 4.6 (c)
e. Average of 15 and 16.
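The question line for this item did not survive in the text; the options suggest that it asks for the median, and the sketch below works under that assumption. It rebuilds the thirty pH values from the stem-and-leaf diagram above and averages the 15th and 16th ordered values. Values are held in tenths of a pH unit to avoid floating-point noise; the variable names are illustrative.

```python
from statistics import median

# Stems and leaves copied from the diagram above (stem = units, leaf = tenths).
stem_leaf = {2: "679", 3: "237789", 4: "1222446899", 5: "0556788", 6: "0233"}

# Rebuild each observation in tenths of a pH unit, e.g. stem 4, leaf 4 -> 44.
values_tenths = [
    stem * 10 + int(leaf)
    for stem, leaves in stem_leaf.items()
    for leaf in leaves
]

ordered = sorted(values_tenths)
n = len(ordered)

# With an even number of values, the median is the mean of the two middle ones.
median_ph = median(ordered) / 10
print(n, ordered[14] / 10, ordered[15] / 10, median_ph)
```

With thirty values, the two middle observations are the 15th and 16th, which explains why option e refers to them by position rather than by value.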
17. Refer to the previous question. Which of the following box-plots is correct:
(e)
d. 3.77 (e)
e. 1.855
19. The following is a stem-plot of the birth weights of male babies born to the
smoking group. The stems are in units of kg.
Stems Leaves
2 3,4,6,7,7,8,8,8,9
3 2,2,3,4,6,7,8,9
4 1,2,2,3,4
5 3,5,5,6
a. 13.5
b. 3.2
c. 3.5
d. 3.7
e. Average of 13 and 14. (c)
20. Refer to the previous question. The first quartile (25th percentile) of the weights is
a. 2.3
b. 2.7
c. .25
d. 6.5 (e)
e. 2.8
23. The following is a display comparing the favorite TV shows selected from a
specified set by gender. Each person had to select one preferred show from the
three shows given below:
Female:sssssssssssffffffffffffffffffffffffffffffbbbbbbbbbb 300
24. An experiment was conducted to investigate the effect of a new weed killer to
suppress weed germination in onion crops. Two chemicals were used, the standard
weed killer (C) and the new chemical (W). Both chemicals were tested at high
and low concentrations. Measurements of the % weed germination were made on each
of 50 plots for each treatment combination. Here are some box-plots of
the results where the whiskers extend to the min and max of the data.
0 10 20 30 40 50
|---------|--------|---------|---------|---------|
|
| _______________
W-low conc. | -----------|________|______|--------
|
| ___________
C-low conc. | ----|_____|_____|----
|
| ______________
W-high conc | -----|_______|______|---------- * * *
|
| _________
C-high conc. | --|____|____|---
a. At either high or low concentrations, the new chemical (W) gives better
control of weed germination than the control (C).