Arba Minch University Basic Statistics

2. VISUAL DESCRIPTION OF DATA

❖ Objectives of this chapter
At the end of this chapter, students should be able to:
• Construct a frequency distribution and a histogram.
• Construct a stem-and-leaf plot, dot plot, and scatter diagram to represent data.
• Visually represent data by using graphs and charts.

2.1. Methods of Data Presentation

After collecting and organizing data, what do we do next? The data must be presented so that
they can be of use. Collected data, also known as ‘raw data’, are always in an unorganized
form and need to be organized and presented in a meaningful and readily comprehensible
form in order to facilitate further statistical analysis. Raw data: recorded information in its
original collected form, whether counts or measurements, is referred to as raw data.
Classification is a preliminary step that prepares the ground for proper presentation of data.

After collecting and organizing data, the next important task is the effective presentation of
bulky data. The major objectives of data presentation are:
➢ To present data in a visual and more understandable form.
➢ To make the data more attractive to the reader.
➢ To facilitate quick comparisons using measures of location and dispersion.
➢ To enable the reader to determine the shape and nature of the distribution in order to make
statistical inferences.
➢ To facilitate further statistical analysis.

There are three methods of data presentation, namely tables (e.g., frequency distribution),
graphs (e.g., histogram), and diagrams (e.g., bar chart). They are commonly used to summarize
both qualitative and quantitative data.

2.2. Tabular Presentation of Data

Tabulation is the process of summarizing classified or grouped data in the form of a table so
that it is easily understood and an investigator is quickly able to locate the desired
information.
Tables are important for summarizing a large volume of data in a more understandable way.
Based on the characteristics they present, tables are:
i. Simple (one-way) table: a table which presents one characteristic, for example age
distribution.
ii. Two-way table: a table which presents two characteristics in columns and rows, for
example age versus sex.
iii. Higher order table: a table which presents three or more characteristics in one table.
In statistics we usually use a frequency distribution table for different types of data.

Frequency Distribution: is the organization of raw data in table form, using classes and
frequencies, where the frequency (f) is the number of values in a specific class of the
distribution.

There are three basic types of frequency distributions, and there are specific procedures for
constructing each type. The three types are categorical, ungrouped and grouped frequency
distributions.

2.2.1. Categorical Frequency Distribution (CaFD)

The CaFD is used for data which can be placed in specific categories such as nominal or
ordinal level data. For example, for data such as political affiliation, religious affiliation,
blood type, or major field of study categorical frequency distribution is appropriate.

❖ Steps of constructing CaFD

1. Identify that the data are measured on a nominal or ordinal scale.
2. Make a table as shown below:
   A        B        C            D
   Class    Tally    Frequency    Percent

3. Put the distinct values of the data set in column A.
4. Tally the data and place the result in column B.
5. Count the tallies and place the results in column C.
6. Find the percentage of values in each class by using the formula (f / n) × 100%, and place
the results in column D, where f is the frequency of the class and n is the total number of values.

Example 2.1: Twenty-five army inductees were given a blood test to determine their blood
type. The data set is given as follows:
A B B AB O
O O B AB B
B B O A O
A O O O AB
AB A O B A

Construct a frequency distribution for the above data.

❖ Solution:
The data are nominal, so we use a categorical frequency distribution to present them.
Following the six steps above gives the frequency distribution below.
A        B             C            D
Class    Tally         Frequency    Percent
A        /////         5            20
B        ///// //      7            28
O        ///// ////    9            36
AB       ////          4            16
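
The six steps above can be sketched in a few lines of Python. This is only an illustration, not part of the original solution: the function name categorical_frequency is hypothetical, and the tally column is replaced by a direct count.

```python
from collections import Counter

# Blood types of the 25 inductees from Example 2.1.
blood_types = (
    "A B B AB O "
    "O O B AB B "
    "B B O A O "
    "A O O O AB "
    "AB A O B A"
).split()

def categorical_frequency(values):
    """Return {class: (frequency, percent)} for nominal or ordinal data."""
    n = len(values)
    counts = Counter(values)  # replaces the manual tally in steps 4 and 5
    return {cls: (f, round(f / n * 100, 2)) for cls, f in counts.items()}

table = categorical_frequency(blood_types)
for cls in ("A", "B", "O", "AB"):
    f, pct = table[cls]
    print(f"{cls:<4}{f:>3}{pct:>8}")
```

The percentages are computed as (f / n) × 100, exactly as in step 6.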

2.2.2. Ungrouped Frequency Distribution (UFD)

It is a table of all the raw values that could possibly occur in the data, along with the
number of times each actually occurred. In other words, a UFD is a distribution that uses
individual data values along with their frequencies. It is often constructed for a small set of
numerical data on a discrete variable, when the range of the data is small.
The major components of this type of frequency distribution are class, tally, frequency, and
cumulative frequency.
Cumulative frequencies (CF): are used to show how many values are accumulated up
to and including a specific class. There are less than and more than cumulative
frequencies.
Less than Cumulative Frequency (LCF): is the total number of observations in and below
a specified class.
More than Cumulative Frequency (MCF): is the total number of observations in and above
a specified class.

❖ Steps of Constructing UFD:

1. First find the smallest and largest raw score in the collected data.
2. Arrange the data in order of magnitude and count the frequency.
3. To facilitate counting one may include a column of tallies.
4. Put respective frequency and cumulative frequency (LCF and MCF) along each
ordered data.

Example 2.2: A demographer is interested in the number of children a family may have.
He/she took a sample of 30 families and obtained the following observations.
Number of children in a sample of 30 families
4 2 4 3 2 8
3 4 4 2 2 8
5 3 4 5 4 5
4 3 5 2 7 3
3 6 7 3 8 4
Construct a frequency distribution for this data.
❖ Solution:
• Find the range: Range = Max − Min = 8 − 2 = 6
• The individual observations can be arranged in ascending or descending order of
magnitude, in which case the series is called an array. The array of the number of children
in the 30 families is:
2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 6, 7, 7, 8, 8, 8
• The frequency distribution of the number of children in the 30 families is as follows:

No. of children   No. of families
(Class)           (Frequency)       LCF                        MCF
2                 5                 5                          30 = 5+(7+8+4+1+2+3)
3                 7                 12 = 7+(5)                 25 = 7+(8+4+1+2+3)
4                 8                 20 = 8+(5+7)               18 = 8+(4+1+2+3)
5                 4                 24 = 4+(5+7+8)             10 = 4+(1+2+3)
6                 1                 25 = 1+(5+7+8+4)           6 = 1+(2+3)
7                 2                 27 = 2+(5+7+8+4+1)         5 = 2+(3)
8                 3                 30 = 3+(5+7+8+4+1+2)       3
Each individual value is presented separately, which is why it is called an ungrouped
frequency distribution.
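
As a cross-check of the table for Example 2.2, the frequencies and both cumulative frequencies can be computed as in the following sketch. The function name ungrouped_frequency is an illustrative choice, not something defined in this chapter.

```python
from collections import Counter
from itertools import accumulate

# Number of children in the 30 sampled families (Example 2.2).
children = [4, 2, 4, 3, 2, 8, 3, 4, 4, 2, 2, 8, 5, 3, 4, 5, 4, 5,
            4, 3, 5, 2, 7, 3, 3, 6, 7, 3, 8, 4]

def ungrouped_frequency(values):
    """Return rows of (value, frequency, LCF, MCF), sorted by value."""
    counts = Counter(values)
    classes = sorted(counts)
    freqs = [counts[c] for c in classes]
    lcf = list(accumulate(freqs))                  # running total from the first class
    mcf = list(accumulate(reversed(freqs)))[::-1]  # running total from the last class
    return list(zip(classes, freqs, lcf, mcf))

rows = ungrouped_frequency(children)
for value, f, less, more in rows:
    print(f"{value:>2}{f:>4}{less:>4}{more:>4}")
```

The LCF of the last class and the MCF of the first class both equal the number of observations, which is a quick way to check a hand-built table.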

2.2.3. Grouped Frequency Distribution (GFD)

It is a frequency distribution in which several values are grouped into one class; each class
is more than one unit in width. We use this type of frequency distribution when the range of
the data is large, and for data from a continuous variable.
Some of the basic terms most frequently used with grouped frequency distributions are:
Class limits (CL): separate one class in a GFD from another. The limits can actually
appear in the data, and there are gaps between the upper limit of one class and the lower
limit of the next.
Lower Class Limits (LCL): are the smallest numbers that can belong to the different
classes.
Upper Class Limits (UCL): are the largest numbers that can belong to the different
classes.
Units of measurement (U): the distance between two possible consecutive measures. It
is usually taken as 1, 0.1, 0.01, 0.001, and so on.
Class Boundaries (CB) (true class limits): are the numbers used to separate classes, but
without the gaps created by class limits.
Lower Class Boundary (LCB): is found by subtracting U/2 from the corresponding LCL.
Upper Class Boundary (UCB): is found by adding U/2 to the corresponding UCL.
Class Mark/Mid Points (CM): are the midpoints of the classes. Each class midpoint can
be found by adding the LCL to the UCL (or the LCB to the UCB) and dividing the sum by 2.
Class Width (W): is the difference between two consecutive LCLs or two consecutive
LCBs.
Cumulative Frequency Distribution (CFD): is the tabular arrangement of class
intervals together with their corresponding cumulative frequencies (CF). It can be of the
more than type (MCF) or the less than type (LCF), depending on the type of CF used.
Relative frequency (rf): is the frequency divided by the total frequency.
Relative cumulative frequency (rcf): is the CF divided by the total frequency.

❖ Steps in constructing GFD

1. Find the Highest (H) and the Lowest (L) values
2. Find the Range; Range = Maximum − Minimum or R = H − L

3. Select the number of classes desired. There are two ways to get the desired
number of classes:
i. Use Sturges' rule, K = 1 + 3.32 log n, where K is the number of classes, n is the
number of observations, and log is the logarithm to base 10. Round the result up to the
next integer.
ii. Select the number of classes arbitrarily, conventionally between 5 and 20. This
method is appropriate when K cannot be calculated by Sturges' rule.
When choosing the number of classes, keep the following criteria in mind:
i. There should be between 5 and 20 classes.
ii. The classes must be mutually exclusive. This means that no data value can fall into
two different classes.
iii. The classes must be all inclusive or exhaustive. This means that all data values
must be included.
iv. The classes must be continuous. There are no gaps in a frequency distribution. The
only exception occurs when the class with a zero frequency is the first or last. A
class with a zero frequency at either end can be omitted without affecting the
distribution.
v. The classes must be equal in width. The exception here is the first or last class. It is
possible to have a "below ..." or "... and above" class. This is often used with ages.

4. Find the Class Width (W) by dividing the range by the number of classes:
W = R / K, that is, W = Range / Number of Classes
Note that the value of W is rounded up to the nearest whole number if there is a remainder.
For instance, 4.7 ≈ 5 and 4.12 ≈ 5.
5. Select the starting point as the LCL of the first class. This is usually the lowest score
(observation). Add the width to that score to get the LCL of the next class. Keep adding until
you reach the desired number of classes (K) calculated in step 3.
6. Find the UCL: subtract the unit of measurement (U) from the LCL of the second class to
get the UCL of the first class. Then add W to each UCL to get the next UCL.
Unit of measurement (U): is the smallest possible difference between consecutive
observations. For instance, for the data 28, 23, 52 the unit of measurement is one:
take one datum arbitrarily, say 23; the next possible value is 24, so
U = 24 − 23 = 1. If the data set is 24.12, 30, 21.2, give priority to the datum with
the most decimal places. Take 24.12 and find the next possible value, which is 24.13.
Therefore, U = 24.13 − 24.12 = 0.01.
Note that U = 1 is the largest unit of measurement considered, and it is the value used
when nothing more is known about the data.
7. Find the Class Boundaries (CB): Lower Class Boundary = Lower Class Limit − U/2 and
Upper Class Boundary = Upper Class Limit + U/2.
8. Find the Class Mark/Mid Points (CM)
9. Tally the data and write the numerical values for tallies in the frequency column.

10. Find Cumulative Frequency (LCF and MCF)

11. If necessary, find Relative frequency (rf): it is the frequency (f) divided by the total
frequency (Tf) and find Relative cumulative frequency (rcf): it is the CF divided by the
total frequency (Tf).
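
Steps 1 to 9 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name grouped_frequency and its dictionary layout are hypothetical, the unit of measurement is passed in rather than detected, Sturges' rule and the class width are both rounded up, and the first LCL is the lowest observation. The data are those of Example 2.3.

```python
import math

def grouped_frequency(data, unit=1):
    """Build class limits, boundaries, marks and frequencies (steps 1 to 9)."""
    low, high = min(data), max(data)                 # step 1
    k = math.ceil(1 + 3.32 * math.log10(len(data)))  # step 3: Sturges' rule, rounded up
    width = math.ceil((high - low) / k)              # steps 2 and 4: W = R / K, rounded up
    if low + k * width - unit < high:
        # Happens when R / K is already a whole number; a wider class is then needed.
        raise ValueError("classes do not cover the maximum; increase the width")
    rows = []
    for i in range(k):
        lcl = low + i * width                        # step 5
        ucl = lcl + width - unit                     # step 6
        rows.append({
            "limits": (lcl, ucl),
            "boundaries": (lcl - unit / 2, ucl + unit / 2),  # step 7
            "mark": (lcl + ucl) / 2,                         # step 8
            "f": sum(1 for x in data if lcl <= x <= ucl),    # step 9
        })
    return rows

data = [11, 29, 6, 33, 14, 21, 18, 17, 22, 38,
        31, 22, 27, 19, 22, 23, 26, 39, 34, 27]
rows = grouped_frequency(data)
for row in rows:
    print(row)
```

The guard before the loop is a design choice of this sketch: the rounding rule in step 4 does not by itself guarantee that the largest observation falls inside the last class, so the function reports that case instead of silently dropping a value.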

Example 2.3: Consider the following set of data and construct the frequency distribution.
11 29 6 33 14 21 18 17 22 38
31 22 27 19 22 23 26 39 34 27
❖ Solution:
Using the steps for constructing a grouped frequency distribution:
1. Highest value = 39, Lowest value = 6
2. R = 39 − 6 = 33
3. K = 1 + 3.32 log 20 = 5.32 ≈ 6, which is the desired number of classes
4. W = R / K = 33 / 6 = 5.5 ≈ 6
5. Select the starting point. Take the minimum, which is 6, then add the width 6 to it to get
the LCL of the next class.
➢ Lower class limit of the first class: LCL1 = 6
➢ Lower class limit of the second class: LCL2 = LCL1 + W = 6 + 6 = 12
➢ LCL3 = LCL2 + W = 12 + 6 = 18
➢ LCL4 = LCL3 + W = 18 + 6 = 24, and so on.
The lower class limits are therefore 6, 12, 18, 24, 30, 36.

6. Find the Upper Class Limits (UCL). The unit of measurement (U) is one.

➢ Upper class limit of the first class: UCL1 = LCL2 − U = 12 − 1 = 11
➢ UCL2 = LCL3 − U = 18 − 1 = 17 or UCL1 + W = 11 + 6 = 17
➢ UCL3 = LCL4 − U = 24 − 1 = 23 or UCL2 + W = 17 + 6 = 23, and so on.
The upper class limits are therefore 11, 17, 23, 29, 35, 41.

Therefore, 6 − 11 is the first class, 12 − 17 is the second class, and so on.
Class (K)    Class limit (CL)
1            6 - 11
2            12 - 17
3            18 - 23
4            24 - 29
5            30 - 35
6            36 - 41
7. Find the Class Boundaries (CB). Take the formula in step 7. LCBi = LCLi - 0.5 and
UCBi = UCLi + 0.5
➢ LCB1=LCL1-U/2= 6-0.5 =5.5
➢ LCB2=LCL2-U/2= 12-0.5 =11.5 or LCB1 + W= 5.5+6= 11.5… continue
➢ UCB1=UCL1+U/2= 11+0.5 =11.5
➢ UCB2=UCL2+U/2= 17+0.5 =17.5 or UCB1+W= 11.5+6= 17.5.… continue

8. Using either the class limits or the class boundaries, the class mark (CM) of each class is:
➢ CM1 = (LCL1 + UCL1)/2 = (6 + 11)/2 = 8.5 or (LCB1 + UCB1)/2 = (5.5 + 11.5)/2 = 8.5,
and so on.
9. Tally the data and write the numerical values for tallies in the frequency column.
Classes desired( K) Class limit (CL) Frequency (f)
1 6 - 11 2
2 12 - 17 2
3 18 - 23 7
4 24 - 29 4
5 30 - 35 3
6 36 - 41 2
Total frequency 20

10. Find Cumulative Frequency (LCF and MCF)

K    Frequency (f)    LCF                    MCF
1    2                2                      20 = 2+(2+7+4+3+2)
2    2                2+2 = 4                18 = 2+(7+4+3+2)
3    7                2+2+7 = 11             16 = 7+(4+3+2)
4    4                2+2+7+4 = 15           9 = 4+(3+2)
5    3                2+2+7+4+3 = 18         5 = 3+(2)
6    2                2+2+7+4+3+2 = 20       2

11. Find Relative frequency (rf): it is the frequency (f) divided by the total frequency (Tf)
and find Relative cumulative frequency (rcf): it is the CF divided by the total
frequency (Tf).
➢ rf1= f1/Tf= 2/20= 0.1, rf3= f3/Tf= 7/20= 0.35......continue
➢ less than type rcf1= LCF1/Tf= 2/20= 0.1,
➢ less than type rcf2= LCF2/Tf= 4/20= 0.2........continue
➢ more than type rcf1= MCF1/Tf= 20/20= 1,
➢ more than type rcf2= MCF2/Tf= 18/20= 0.9.....continue
K    f    LCF    MCF    rf      rcf (less than type)    rcf (more than type)
1    2    2      20     0.10    0.10                    1.00
2    2    4      18     0.10    0.20                    0.90
3    7    11     16     0.35    0.55                    0.80
4    4    15     9      0.20    0.75                    0.45
5    3    18     5      0.15    0.90                    0.25
6    2    20     2      0.10    1.00                    0.10

Therefore, the overall grouped frequency distribution of the given data set is shown below.
K    CL         CB             CM      f    LCF    MCF    rf      rcf (less than type)    rcf (more than type)
1    6 - 11     5.5 - 11.5     8.5     2    2      20     0.10    0.10                    1.00
2    12 - 17    11.5 - 17.5    14.5    2    4      18     0.10    0.20                    0.90
3    18 - 23    17.5 - 23.5    20.5    7    11     16     0.35    0.55                    0.80
4    24 - 29    23.5 - 29.5    26.5    4    15     9      0.20    0.75                    0.45
5    30 - 35    29.5 - 35.5    32.5    3    18     5      0.15    0.90                    0.25
6    36 - 41    35.5 - 41.5    38.5    2    20     2      0.10    1.00                    0.10
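
The relative and relative cumulative frequency columns of Example 2.3 follow directly from the class frequencies. The short sketch below recomputes them; the variable names are illustrative only.

```python
from itertools import accumulate

freqs = [2, 2, 7, 4, 3, 2]   # class frequencies from Example 2.3
total = sum(freqs)           # total frequency (Tf)

lcf = list(accumulate(freqs))                  # less than cumulative frequency
mcf = list(accumulate(reversed(freqs)))[::-1]  # more than cumulative frequency
rf = [round(f / total, 2) for f in freqs]      # relative frequency
rcf_less = [round(c / total, 2) for c in lcf]  # rcf, less than type
rcf_more = [round(c / total, 2) for c in mcf]  # rcf, more than type

for row in zip(freqs, lcf, mcf, rf, rcf_less, rcf_more):
    print(row)
```

Two quick checks on any such table: the relative frequencies sum to 1, and the less than rcf of the last class equals the more than rcf of the first class, both being 1.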

Example 2.4: The following data are the percentage forest coverage of countries in Africa.
Construct a frequency distribution showing CL, CB, CM, and f by using Sturges' rule.
30, 25, 23, 41, 39, 27, 41, 24, 32, 29, 35, 31, 36, 33, 36, 42, 35, 37, 41, and 29
❖ Solution
1. Given the number of observations n = 20, the number of classes is
K = 1 + 3.32 log10(20) = 5.32 ≈ 5, where K is the number of classes. Here K is rounded to
the nearest whole number; five classes of width 4 still cover the whole range from 23 to 42.
2. Class width: W = (highest value − lowest value) / K = (42 − 23) / 5 = 3.8 ≈ 4

Classes    Class boundary    Class mark    Frequency
23 - 26    22.5 - 26.5       24.5          3
27 - 30    26.5 - 30.5       28.5          4
31 - 34    30.5 - 34.5       32.5          3
35 - 38    34.5 - 38.5       36.5          5
39 - 42    38.5 - 42.5       40.5          5
Total                                      20

2.3. Graphical Presentation of Data

Graphical presentation is most often used for continuous data, for the results of a grouped
frequency distribution, and for continuous variables distributed over time.

2.3.1. Histogram
✓ A histogram is a special type of bar graph in which the horizontal scale represents class
intervals of data values and the vertical scale represents frequencies (f).
✓ The heights of the bars correspond to the frequency values, and the bars are drawn
adjacent to each other (without gaps).
✓ We can construct a histogram after we have first completed a frequency distribution
table for a data set.
Example 2.4: The histogram for the data in Example 2.3 is shown in Figure 2.1.
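
For readers without plotting software, a rough text version of the histogram can be produced as in the sketch below. It is an illustration only: the function name text_histogram is hypothetical, and the bars are drawn horizontally, one row per class, rather than vertically as in a conventional histogram.

```python
# Class boundaries and frequencies from Example 2.3.
boundaries = [5.5, 11.5, 17.5, 23.5, 29.5, 35.5, 41.5]
freqs = [2, 2, 7, 4, 3, 2]

def text_histogram(boundaries, freqs, symbol="#"):
    """Return one line per class: its boundaries followed by a bar of symbols."""
    lines = []
    # Consecutive boundaries share an end point, so the classes leave no gaps.
    for lower, upper, f in zip(boundaries, boundaries[1:], freqs):
        lines.append(f"{lower:>5} - {upper:<5}| {symbol * f} ({f})")
    return lines

lines = text_histogram(boundaries, freqs)
for line in lines:
    print(line)
```

Using class boundaries instead of class limits is what makes adjacent bars touch.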

2.3.2. Frequency Polygon

✓ A frequency polygon uses line segments connected to points located directly above the
class midpoint (class mark) values.
✓ A histogram can easily be transformed into a frequency polygon by joining the mid-
points of the tops of the rectangles by straight lines.
✓ The heights of the points correspond to the class frequencies, and the line segments
are extended to the left and right so that the graph begins and ends on the horizontal
axis, at the positions where the midpoints of the previous and next classes would be located.

Example 2.5: The frequency polygon for the data in Example 2.3 is shown in Figure 2.2.
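
The vertices of the frequency polygon can be listed before drawing it. The following sketch assumes equal class widths; the function name polygon_points is illustrative.

```python
def polygon_points(midpoints, freqs):
    """Return (x, y) vertices of a frequency polygon, closed on the horizontal axis."""
    width = midpoints[1] - midpoints[0]  # assumes all classes have the same width
    xs = [midpoints[0] - width, *midpoints, midpoints[-1] + width]
    ys = [0, *freqs, 0]                  # zero frequency at the two added midpoints
    return list(zip(xs, ys))

midpoints = [8.5, 14.5, 20.5, 26.5, 32.5, 38.5]  # class marks from Example 2.3
freqs = [2, 2, 7, 4, 3, 2]
points = polygon_points(midpoints, freqs)
print(points)
```

The two extra points, at 2.5 and 44.5, are the midpoints of the empty classes just before the first class and just after the last one.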

Figure 2.1: Histogram (vertical axis: frequency)

Figure 2.2: Frequency polygon (horizontal axis: class midpoints from 2.5 to 44.5; vertical axis: frequency)

2.3.3. Ogive Graph (Cumulative Frequency Curve)

✓ An ogive (pronounced “oh-jive”) is a line that depicts cumulative frequencies, just
as the cumulative frequency distribution lists cumulative frequencies.
✓ A cumulative frequency distribution enables us to know how many observations are
above or below a certain value.
✓ Note that the ogive uses class boundaries along the horizontal scale, and the graph begins
with the lower boundary of the first class and ends with the upper boundary of the last
class.
✓ There are two types of ogive, namely the less than ogive and the more than ogive.
✓ A ‘less than’ ogive rises as it moves to the right, while a ‘more than’ ogive
declines as it moves to the right.
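
The points of both ogives for the data of Example 2.3 can be computed as in the sketch below. The function name ogive_points is hypothetical, and the convention used (number of observations below, or at and above, each class boundary) is one common way of plotting the two curves.

```python
from itertools import accumulate

def ogive_points(boundaries, freqs):
    """Return (less_than, more_than) lists of (boundary, cumulative frequency)."""
    total = sum(freqs)
    below = [0, *accumulate(freqs)]  # observations below each class boundary
    less_than = list(zip(boundaries, below))
    more_than = [(b, total - c) for b, c in less_than]
    return less_than, more_than

boundaries = [5.5, 11.5, 17.5, 23.5, 29.5, 35.5, 41.5]
freqs = [2, 2, 7, 4, 3, 2]
less_than, more_than = ogive_points(boundaries, freqs)
print(less_than)
print(more_than)
```

The less than ogive starts at 0 on the lower boundary of the first class and ends at the total frequency; the more than ogive does the reverse.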

2.4. Diagrammatic Presentation of the Data

In the last lesson we observed that the technique of tabulation helps us to put unorganized
collected data in an orderly form so that it is easily understood and the needed information is
quickly located. However, grouped data or too many figures in a table do not always
appeal to the common reader, as too many figures are generally confusing and fail to convey
the definite pattern or trend of the figures. A picture is said to be worth 10,000 words, i.e.,
through pictorial presentation data can be presented in an interesting form. Importance:

➢ They have greater attraction.
➢ They facilitate comparison.
➢ They are easily understandable.
➢ Diagrams are appropriate for presenting discrete data.

The most commonly used diagrammatic presentations for discrete as well as qualitative data
are line diagrams, pie charts, bar charts, and pictograms.

2.4.1. Line diagram

This is the simplest form of diagram. The height of each line indicates the value of an item
that is being measured. The line diagram is drawn taking a suitable scale.
Example 2.5: The following data represent the sales by product, 1957 to 1959, of a given
company for three products A, B, and C.
Product    Sales ($) in 1957    Sales ($) in 1958    Sales ($) in 1959
A          12                   14                   18
B          24                   21                   18
C          24                   35                   54
Draw a line diagram to represent the sales by product from 1957 to 1959.
Solution:

Figure 2.3: Line diagram of the three products (horizontal axis: product; vertical axis: sales ($); one series per year, 1957 to 1959)

2.4.2. Pie Chart

A pie chart can be used to compare the whole with its components. A pie chart is a circular
diagram in which the areas of the sectors of a circle represent the components. To
construct a pie chart (sector diagram), we draw a circle with radius proportional to the
square root of the total. The total angle of the circle is 360°.
The angle and percentage of each component are calculated by the formulas:

Angle of Sector = (Component Part / Total) × 360°;  Percentage of Sector = (Component Part / Total) × 100

These angles are marked in the circle by means of a protractor to show the different components.
Example 2.6: The following table gives the details of the monthly budget of a family.
Represent these figures by a suitable diagram.
Item of Expenditure    Family Budget
Food                   $600
Clothing               $100
House Rent             $400
Fuel and Lighting      $100
Miscellaneous          $300
Total                  $1500

Solution: The necessary computations are given below:

➢ Angle of Sector (Food) = (600 / 1500) × 360° = 144°, Percentage of Food = (600 / 1500) × 100 = 40%
Item of Expenditure    Family Budget    Angle of Sector    %
Food                   $600             144°               40
Clothing               $100             24°                6.67
House Rent             $400             96°                26.67
Fuel and Lighting      $100             24°                6.67
Miscellaneous          $300             72°                20
Total                  $1500            360°               100
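
The sector angles and percentages for Example 2.6 can be recomputed with the two formulas given earlier, as in the following sketch. The function name pie_sectors is illustrative.

```python
# Monthly family budget in dollars, from Example 2.6.
budget = {"Food": 600, "Clothing": 100, "House Rent": 400,
          "Fuel and Lighting": 100, "Miscellaneous": 300}

def pie_sectors(parts):
    """Return {item: (angle in degrees, percentage)} for each component."""
    total = sum(parts.values())
    return {item: (round(value / total * 360, 2), round(value / total * 100, 2))
            for item, value in parts.items()}

sectors = pie_sectors(budget)
for item, (angle, pct) in sectors.items():
    print(f"{item:<20}{angle:>8}{pct:>8}")
```

The angles must add up to 360°. The rounded percentages may differ from 100 by a small amount, as they do here (100.01).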

Figure 2.4: Pie chart of the monthly budget of a family

2.4.3. Bar Charts

A bar chart is a set of bars (thick lines or narrow rectangles) representing some magnitude
over time or space.
Bar charts are useful for comparing aggregates over time or space.
Bars can be drawn either vertically or horizontally.
There are different types of bar charts. The most common are:
➢ Simple bar chart
➢ Component or subdivided bar chart
➢ Multiple bar chart
➢ Deviation or two-way bar chart

a. Simple Bar Chart:

✓ It is used to represent data involving only one variable classified on a spatial, quantitative
or temporal basis.
✓ In a simple bar chart, we make bars of equal width but variable length, i.e. the
magnitude of a quantity is represented by the height or length of the bars.
Example 2.7: Draw simple bar charts to represent the sales by product in 1957 and 1958 using
the data in Example 2.5.

Figure 2.5: Simple bar charts of the three products in 1957 and 1958 (one chart per year; horizontal axis: products A, B, C; vertical axis: sales ($))

b. Subdivided or Component Bar chart:

✓ When there is a desire to show how a total (or aggregate) is divided into its
component parts, we use a component bar chart.
✓ The bars represent the total value of a variable, with each total broken into its component
parts; different colours or designs are used for identification.
Example 2.7: Draw a component bar chart to represent the sales by product from 1957 to
1959 using the data in Example 2.5.
Solution:

Figure 2.6: Component bar charts of the three products in 1957, 1958 and 1959
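
In a component bar chart each component starts where the previous one ends. The sketch below computes those start and end heights for the sales data of Example 2.5; the function name stacked_segments and the stacking order A, B, C are assumptions made for illustration.

```python
sales = {  # sales ($) by product, from Example 2.5
    "1957": {"A": 12, "B": 24, "C": 24},
    "1958": {"A": 14, "B": 21, "C": 35},
    "1959": {"A": 18, "B": 18, "C": 54},
}

def stacked_segments(components):
    """Return [(name, bottom, top)] for one bar, stacking components in order."""
    segments, bottom = [], 0
    for name, value in components.items():
        segments.append((name, bottom, bottom + value))
        bottom += value  # the next component starts on top of this one
    return segments

bars = {year: stacked_segments(parts) for year, parts in sales.items()}
for year, segments in bars.items():
    print(year, segments)
```

The top of the last segment of each bar is the total for that year: 60, 70 and 90.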

c. Multiple Bars chart:

✓ When two or more interrelated series of data are depicted by a bar diagram, such
a diagram is known as a multiple-bar diagram.
✓ Suppose we have export and import figures for a few years. We can display them by two
bars close to each other, one representing exports and the other representing imports.
✓ Multiple bar diagrams are particularly suitable where some comparison is involved.
Example 2.8: Draw a multiple bar chart to represent the sales by product from 1957 to
1959 using the data in Example 2.5.
Solution:
Figure 2.7: Multiple bar charts of the three products in 1957, 1958 and 1959 (horizontal axis: year of production; vertical axis: sales in $)

d. Deviation Bar Diagram:

✓ A deviation bar diagram is used when the data contain both positive and negative values,
such as data on net profit, net expense, percent change, etc.
✓ It is possible that in one or two years, instead of earning a net profit, the company might
have sustained a net loss. In such a case, the data on net profit are displayed above the
base line and the data on net loss below it.
Example 2.9: Suppose we have the following data relating to the net profit (percent) of
three commodities.

Commodity    Net profit
Soap         80
Sugar        -90
Coffee       125
Solution:
Figure 2.8: Deviation bar diagram of the three commodities

2.4.4. Pictogram
In this diagram, we represent data by means of picture symbols. We choose a
suitable picture to represent a definite number of units in which the variable is measured.

Example 2.10: Draw a pictogram to represent the following population of a town.

Year          1989    1990    1991    1992
Population    2000    3000    5000    7000
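
A text pictogram for Example 2.10 can be sketched as below. The choice of one symbol per 1000 people is an assumption made for this illustration, not a value given in the example, and the function name pictogram is hypothetical.

```python
population = {1989: 2000, 1990: 3000, 1991: 5000, 1992: 7000}

def pictogram(data, units_per_symbol=1000, symbol="*"):
    """Return one line per year with one symbol per `units_per_symbol` units."""
    # Integer division is enough here because every value is a multiple of 1000.
    return [f"{year}  {symbol * (value // units_per_symbol)}"
            for year, value in data.items()]

lines = pictogram(population)
for line in lines:
    print(line)
print("Key: * = 1000 people")
```

A pictogram should always carry a key stating how many units one symbol stands for.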

2.5. The Stem-and-Leaf Display and the Dotplot

Stem and Leaf Plots

The stem and leaf plot is a method of organizing data and is a combination of sorting and
graphing. It has the advantage over a grouped frequency distribution of retaining the actual
data while showing them in graphical form. A stem and leaf plot is a data plot that uses part
of the data value as the stem and part of the data value as the leaf to form groups or classes.

Example:
At an internet business center, the number of customers served each day for 20 days is shown.
Construct a stem and leaf plot for the data.
25 31 20 32 13
14 43 02 57 23
36 32 33 32 44
32 52 44 51 45
Solution:

Step 1: Arrange the data in order:

02, 13, 14, 20, 23, 25, 31, 32, 32, 32,
32, 33, 36, 43, 44, 44, 45, 51, 52, 57

Step 2: Separate the data according to the first digit, as shown.

02    13, 14    20, 23, 25    31, 32, 32, 32, 32, 33, 36    43, 44, 44, 45    51, 52, 57

Step 3: A display can be made by using the leading digit as the stem and the trailing digit as
the leaf. For example, for the value 32, the leading digit, 3, is the stem and the trailing digit, 2,
is the leaf. For the value 14, the 1 is the stem and the 4 is the leaf. Now a plot can be
constructed as shown below.

Stem | Leaf
0    | 2
1    | 3 4
2    | 0 3 5
3    | 1 2 2 2 2 3 6
4    | 3 4 4 5
5    | 1 2 7

The plot shows that the distribution peaks in the center and that there are no gaps in the data.
For 7 of the 20 days, the number of customers visiting the internet center was between 31 and
36. The plot also shows that the center served from a minimum of 2 customers to a maximum
of 57 customers in any one day.
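
The three steps can be sketched in Python as follows. The function name stem_and_leaf is illustrative, and the sketch assumes non-negative whole numbers with the tens digit as stem and the units digit as leaf.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return {stem: [leaves]} using the tens digit as stem and units digit as leaf."""
    plot = defaultdict(list)
    for v in sorted(values):        # step 1: arrange the data in order
        stem, leaf = divmod(v, 10)  # steps 2 and 3: split each value into stem and leaf
        plot[stem].append(leaf)
    return dict(plot)

# Customers served per day for 20 days, from the example above (02 is entered as 2).
customers = [25, 31, 20, 32, 13, 14, 43, 2, 57, 23,
             36, 32, 33, 32, 44, 32, 52, 44, 51, 45]
plot = stem_and_leaf(customers)
for stem in sorted(plot):
    print(stem, "|", " ".join(str(leaf) for leaf in plot[stem]))
```

Unlike a grouped frequency distribution, the result keeps every original value: stem 3 with leaves 1 2 2 2 2 3 6 stands for 31, 32, 32, 32, 32, 33 and 36. Note that this sketch only lists stems that occur in the data, so an empty stem would not be printed.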

The Dotplot

A dotplot is a statistical graph in which each data value is plotted as a point (dot) above the
horizontal axis. If a data value occurs more than once, the corresponding points are plotted
above one another. Dotplots are used to show how the data values are distributed and to see
if there are any extremely high or low data values.
Example
The data show the number of named storms each year for the last 40 years. Construct and
analyse a dotplot for the data.
19 15 14 7 6 11 11
9 16 8 8 11 9 8
16 12 13 14 13 12 7
15 15 19 11 4 6 13
10 15 7 12 6 10
28 12 8 7 12 9

Step 1: Find the lowest and highest data values, and decide what scale to use on the
horizontal axis. The lowest data value is 4 and the highest data value is 28, so a scale from 4
to 28 is needed.
Step 2: Draw a horizontal line, and draw the scale on the line.
Step 3: Plot each data value above the line. If the value occurs more than once, plot the other
point above the first point.

The graph shows that in the majority of the years there were between 6 and 16 named storms.
There were only 3 years with 19 or more named storms.
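
Steps 1 to 3 can be sketched as a text dotplot. The function name dotplot_rows is hypothetical, and the sketch assumes whole-number data so that each integer on the scale gets its own column.

```python
from collections import Counter

# Number of named storms per year for 40 years, from the example above.
storms = [19, 15, 14, 7, 6, 11, 11, 9, 16, 8, 8, 11, 9, 8,
          16, 12, 13, 14, 13, 12, 7, 15, 15, 19, 11, 4, 6, 13,
          10, 15, 7, 12, 6, 10, 28, 12, 8, 7, 12, 9]

def dotplot_rows(values, dot="o"):
    """Return text rows, top to bottom, with one column per integer on the axis."""
    counts = Counter(values)
    axis = range(min(values), max(values) + 1)  # step 1: scale from lowest to highest
    rows = []
    for level in range(max(counts.values()), 0, -1):
        # step 3: a dot appears at this height if the value occurs at least `level` times
        cells = [dot if counts[x] >= level else " " for x in axis]
        rows.append("".join(f"{c:>3}" for c in cells))
    rows.append("".join(f"{x:>3}" for x in axis))  # step 2: the scale itself
    return rows

rows = dotplot_rows(storms)
for row in rows:
    print(row)
```

The isolated dot at 28 stands apart from the rest of the data, which is the kind of extremely high value a dotplot is meant to reveal.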

Exercise 2.1: Draw ogive curves (less than ogive and more than ogive) for the following
data set:

Weekly Earnings (Br)    Number of Employees
Below 550               5
550-600                 10
600-650                 22
650-700                 30
700-750                 16
750-800                 12
800-850                 15

Exercise 2.2: The following is the distribution of the size of certain farms selected at random
from a district. Draw the histogram, frequency polygon and ogive curves (less than ogive and
more than ogive).
Size of farms    No. of farms
5-15             8
15-25            12
25-35            17
35-45            29
45-55            31
55-65            5
65-75            3

Frequency Distributio2
No ratings yet
Frequency Distributio2
12 pages
The Principles of Risk Management
50% (2)
The Principles of Risk Management
3 pages
Final 3I CHAPTER1-5
No ratings yet
Final 3I CHAPTER1-5
87 pages
Flood Line Analysis
No ratings yet
Flood Line Analysis
15 pages
Unit 1: Structural Genomics
No ratings yet
Unit 1: Structural Genomics
4 pages
Project Report On Customer Service
No ratings yet
Project Report On Customer Service
13 pages
Investigate The Use of Solar Energy Into Small Houses For Sustainable Living
No ratings yet
Investigate The Use of Solar Energy Into Small Houses For Sustainable Living
27 pages
Reviewer4thq g7
No ratings yet
Reviewer4thq g7
3 pages
Understanding Multi Rule QC JUN23
No ratings yet
Understanding Multi Rule QC JUN23
12 pages
Developing Sustainable Tourism Product For Sailing Sapa Homestay - Rational Report
No ratings yet
Developing Sustainable Tourism Product For Sailing Sapa Homestay - Rational Report
73 pages
English-Social Justice Unit Plan
No ratings yet
English-Social Justice Unit Plan
19 pages
Emsat Math Achieve (General Track) : Total Time For Test: 50 Questions: 1.5 Hours
No ratings yet
Emsat Math Achieve (General Track) : Total Time For Test: 50 Questions: 1.5 Hours
10 pages
Professional Analysis of Characteristics Competencies Servin
No ratings yet
Professional Analysis of Characteristics Competencies Servin
2 pages
Difference Between Forecast Linear and Forecast ETS
No ratings yet
Difference Between Forecast Linear and Forecast ETS
6 pages
Community Scale Sustainability
No ratings yet
Community Scale Sustainability
25 pages
Quality Attributes Chewable Tablets
No ratings yet
Quality Attributes Chewable Tablets
16 pages
Literature Review of Education in Nigeria
100% (1)
Literature Review of Education in Nigeria
5 pages
The Contribution of Cognitive Psychology To The ST
No ratings yet
The Contribution of Cognitive Psychology To The ST
18 pages
Crag in 2004
No ratings yet
Crag in 2004
9 pages
ECA Report
No ratings yet
ECA Report
6 pages
Shapesplosion Game Instructions2
No ratings yet
Shapesplosion Game Instructions2
2 pages
Info Pack ICD 3.0 2014 March
No ratings yet
Info Pack ICD 3.0 2014 March
6 pages
De-Mystifying Math and Stats for Machine Learning: Mastering the Fundamentals of Mathematics and Statistics for Machine Learning
From Everand
De-Mystifying Math and Stats for Machine Learning: Mastering the Fundamentals of Mathematics and Statistics for Machine Learning
Seaport AI Madhavan
No ratings yet
Business Statistics I Essentials
From Everand
Business Statistics I Essentials
Louise Clark
5/5 (5)
Statistics I Essentials
From Everand
Statistics I Essentials
Emil G. Milewski
No ratings yet
Business Statistics For Dummies
From Everand
Business Statistics For Dummies
Alan Anderson
No ratings yet

CH 2

Uploaded by

CH 2

Uploaded by

2.2.Tabular Presentation of Data

2.2.1. Categorical Frequency Distribution (CaFD)

❖ Steps of constructing CaFD

3. Put distinct values of a data set in column A

Construct a frequency distribution for the above data.

2.2.2. Ungrouped Frequency Distribution (UFD)

❖ Steps of Constructing UFD:

(Class) (Frequency) LCF MCF

2.2.3. Grouped Frequency Distribution (GFD)

❖ Steps in constructing GFD

10. Find Cumulative Frequency (LCF and MCF)

6. Upper Class Limit (UCL). Since unit of measurement (U) is one.

Therefore, 6 − 11 is the first class limit; 12 – 17 is the second class

10. Find Cumulative Frequency (LCF and MCF)

5 30 - 35 29.5– 35.5 32.5 3 18 5 0.15 0.90 0.25
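The class limits, boundaries and class marks quoted in the fragments above can be reproduced with a short sketch. It assumes a lowest class limit of 6, a class width of 6 (inferred from the classes 6 - 11 and 12 - 17) and a unit of measurement U = 1; the function name `build_classes` and the choice of five classes are illustrative, not taken from the notes.

```python
def build_classes(lowest, width, count, unit=1):
    """Return (LCL, UCL, LCB, UCB, class mark) for each class.

    Assumes equal class widths; `unit` is the unit of measurement U.
    """
    rows = []
    for i in range(count):
        lcl = lowest + i * width      # lower class limit
        ucl = lcl + width - unit      # upper class limit
        lcb = lcl - unit / 2          # lower class boundary
        ucb = ucl + unit / 2          # upper class boundary
        mark = (lcl + ucl) / 2        # class mark (midpoint)
        rows.append((lcl, ucl, lcb, ucb, mark))
    return rows


# Illustrative values only: lowest limit 6, width 6, five classes.
classes = build_classes(lowest=6, width=6, count=5)
for lcl, ucl, lcb, ucb, mark in classes:
    print(f"{lcl} - {ucl}   {lcb} - {ucb}   {mark}")
```

Under these assumptions the fifth class comes out as 30 - 35 with boundaries 29.5 - 35.5 and class mark 32.5, which agrees with the table row above.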

Classes Class boundary Class mark Frequency

2.3.Graphical Presentation of Data

2.3.2. Frequency Polygon

Figure 2.1: Histogram

2.5 8.5 14.5 20.5 26.5 32.5 38.5 44.5

Figure 2.2: Frequency polygon

2.3.3. Ogive Graph (Cumulative Frequency Curve)

2.4.Diagrammatic Presentation of the Data

➢ They have greater attraction.

2.4.1. Line diagram

Figure 2.3: Line diagram of the three products

2.4.2. Pie Chart

Component Part Component Part

Solution: The necessary computations are given below:

Figure 2.4: Pie chart of the monthly budget of a family

2.4.3. Bar Charts

a. Simple Bar Chart:

Sales($) in 1957 Sales($) in 1958

b. Subdivided or Component Bar Chart:

c. Multiple Bar Chart:

d. Deviation Bar Diagram:

Commodity Net profit

Example 2.10: Draw a pictogram to represent the following population of a town.

2.5.The Stem-and-Leaf Display and the Dotplot

Stem and Leaf Plots

Step 1: Arrange the data in order:

32, 33, 36, 43, 44, 44, 45, 51, 52, 57

Step 2: Separate the data according to the first digit, as shown.
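The two steps can be sketched in a few lines of Python using the ten values listed in Step 1. This is only an illustration: it assumes two-digit data, with the tens digit as the stem and the units digit as the leaf, and the function name `stem_and_leaf` is hypothetical.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group two-digit values by tens digit (stem); leaves are units digits."""
    groups = defaultdict(list)
    for v in sorted(values):          # Step 1: arrange the data in order
        stem, leaf = divmod(v, 10)    # Step 2: separate by the first digit
        groups[stem].append(leaf)
    return dict(groups)


data = [32, 33, 36, 43, 44, 44, 45, 51, 52, 57]
plot = stem_and_leaf(data)
for stem in sorted(plot):
    leaves = " ".join(str(leaf) for leaf in plot[stem])
    print(f"{stem} | {leaves}")
```

For this data set the display has three stems: 3 with leaves 2 3 6, 4 with leaves 3 4 4 5, and 5 with leaves 1 2 7.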

Weekly Earnings (Br) Number of Employees
