CH II Stat I
DATA COLLECTION & PRESENTATION
Types of Data
Data sets can consist of two types of
data: Qualitative data and
Quantitative data.
DATA
• Qualitative data: consists of attributes, labels, or nonnumeric entries.
• Quantitative data: consists of numerical measurements or counts.
Qualitative and Quantitative Data
Example: The grade point averages of five
students are listed in the table. Which data
are qualitative data and which are
quantitative data?
Student GPA
Sara 3.22
Berhan 3.98
Mahlet 2.75
Tsehay 2.24
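Hana 3.84
Answer: The student names are qualitative data (labels); the GPAs are
quantitative data (numerical measurements).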
Nominal Scale
• The values of a nominal attribute are just different
names, i.e., nominal attributes provide only enough
information to distinguish one object from another.
• Qualities with no ranking or ordering; no numerical
or quantitative value. These types of data consist
of names, labels, and categories.
• It is a scale for grouping individuals into different
categories.
Example: Eye color: brown, black, etc.
Sex: Male, Female.
• In this scale, one value is only different from another.
• Arithmetic operations (+, -, *, ÷) are not applicable, and
order comparisons (<, >) are not meaningful; only equality
and inequality (=, ≠) can be checked.
Ordinal Scale
• Defined as nominal data that can be ordered or ranked.
• Can be arranged in some order, but the differences
between the data values are meaningless.
• Data consisting of an ordering or ranking of measurements
are said to be on an ordinal scale of measurement.
• It provides enough information to order objects.
• One value is different from, and greater, better, or less
than, another.
• Arithmetic operations (+, -, *, ÷) are not applicable;
comparison (<, >, ≠, etc.) is possible.
Example: Letter grading (A, B, C, D, F),
Rating scales (excellent, very good, good, fair, poor),
Military rank (general, colonel, lieutenant, etc.).
Interval Level
• Data are defined as ordinal data in which the differences
between data values are meaningful. However, there is no
true zero, or starting point, and ratios of data values are
meaningless.
• Note: Celsius & Fahrenheit temperature readings have no
meaningful zero and ratios are meaningless. For example,
a temperature of zero degrees (on Celsius and Fahrenheit
scales) does not mean a complete absence of heat.
• One value is different from, and better or greater than,
another by a definite amount.
• Possible to add and subtract. For example: 80°C - 50°C =
30°C, 70°C - 40°C = 30°C.
• Multiplication and division are not meaningful. For
example: 60°C = 3 × (20°C). But this does not imply that an
object which is 60°C is three times as hot as an object
which is 20°C.
• Most common examples are: IQ, temperature.
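The point about ratios on an interval scale can be checked numerically. The sketch below is illustrative only: it converts the 60°C and 20°C readings from the bullet above to Fahrenheit and to Kelvin and compares the ratios. The helper names `c_to_f` and `c_to_k` are introduced here for illustration. Kelvin is included because it has a true zero and is therefore a ratio scale.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


def c_to_k(celsius):
    """Convert degrees Celsius to kelvins (a scale with a true zero)."""
    return celsius + 273.15


hot_c, cool_c = 60.0, 20.0

# The ratio depends on the arbitrary zero point of each interval scale.
ratio_c = hot_c / cool_c                      # 3.0 on the Celsius scale
ratio_f = c_to_f(hot_c) / c_to_f(cool_c)      # 140 / 68, about 2.06
ratio_k = c_to_k(hot_c) / c_to_k(cool_c)      # 333.15 / 293.15, about 1.14

# Differences survive a change of scale (up to the constant factor 9/5).
diff_c = hot_c - cool_c                       # 40 Celsius degrees
diff_f = c_to_f(hot_c) - c_to_f(cool_c)       # 72 Fahrenheit degrees

print(ratio_c, round(ratio_f, 2), round(ratio_k, 2))
print(diff_c, diff_f)
```

The same two objects give three different "times as hot" figures, so only the Kelvin ratio has a physical meaning, while the difference is consistent across scales.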
Ratio Scale
• Similar to interval, except there is a true zero
(absolute absence), or starting point, and the
ratios of data values have meaning.
• Arithmetic operations (+, -, *, ÷) are
applicable. For ratio variables, both
differences and ratios are meaningful.
• One value is different from, and larger, taller, better,
or less than, another by a definite amount and by so
many times.
• This measurement scale provides more
information than the interval scale of
measurement.
• Example: weight, age, number of students.
Summary of Levels of Measurement
1974 30
1986 52
1991 60
• Qualitative Classification: Data are
arranged according to attributes such as
color, religion, marital status, sex,
educational background, etc.
Employees in Factory X
• Educated: Male, Female
• Uneducated: Male, Female
• Quantitative Classification: The
statistical data are classified according to
some quantitative variable. The
variable may be either discrete or
continuous.
Where
n=Number of Classes
N=Total number of observations
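The formula these symbols belong to is not shown in the text above. As a minimal sketch, assuming it is Sturges' rule, n = 1 + 3.322 log10(N), the number of classes and a class width can be computed as follows. The function names, the choice to round up, and the sample inputs (30 observations ranging from 16 to 72) are assumptions made for illustration.

```python
import math


def sturges_classes(total_observations):
    """Number of classes by Sturges' rule (assumed formula), rounded up."""
    return math.ceil(1 + 3.322 * math.log10(total_observations))


def class_width(smallest, largest, n_classes):
    """Range divided by the number of classes, rounded up to a whole unit."""
    return math.ceil((largest - smallest) / n_classes)


n = sturges_classes(30)          # 1 + 3.322 * 1.477... = 5.9..., so 6 classes
cw = class_width(16, 72, n)      # range 56 over 6 classes = 9.33..., so 10
print(n, cw)
```

Rounding up rather than to the nearest integer is a design choice here: it guarantees that the classes cover the whole range of the data.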
4. Determine the class limits
Determine the lower class limit of the
first class (LCL1), then
• LCL2 = LCL1 + cw, LCL3 = LCL2 + cw, …, LCLi+1 = LCLi + cw
Determine the upper class limit of the
first class (UCL1), i.e.
UCL1 = LCL1 + cw - u,
where u = the unit of measurement,
then
UCL2 = UCL1 + cw, UCL3 = UCL2 + cw, …, UCLi+1 = UCLi + cw
Complete the grouped frequency distribution (GFD) with the respective
class frequencies.
• Example: The number of
customers on 30 consecutive days in
a supermarket was recorded as follows:
20 48 65 25 48 49
35 25 72 42 22 58
53 42 23 57 65 37
18 65 37 16 39 42
49 68 69 63 29 67
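A grouped frequency distribution for these 30 values can be sketched as below. The choices of 6 classes and a class width of 10 are assumptions (they follow from Sturges' rule and a range of 72 - 16 = 56), as are the choice of the smallest observation as LCL1 and the function name `grouped_frequency_distribution`. The unit of measurement u is 1 because the data are whole counts.

```python
customers = [
    20, 48, 65, 25, 48, 49,
    35, 25, 72, 42, 22, 58,
    53, 42, 23, 57, 65, 37,
    18, 65, 37, 16, 39, 42,
    49, 68, 69, 63, 29, 67,
]


def grouped_frequency_distribution(values, n_classes, width, unit=1):
    """Return a list of (LCL, UCL, frequency) rows.

    LCL1 is taken as the smallest observation (an assumption);
    each UCL is LCL + width - unit, and each next LCL is LCL + width.
    """
    lower = min(values)
    table = []
    for _ in range(n_classes):
        upper = lower + width - unit
        frequency = sum(lower <= v <= upper for v in values)
        table.append((lower, upper, frequency))
        lower += width
    return table


gfd = grouped_frequency_distribution(customers, n_classes=6, width=10)
for lcl, ucl, freq in gfd:
    print(f"{lcl}-{ucl}: {freq}")
```

Under these assumptions the classes are 16-25, 26-35, 36-45, 46-55, 56-65, and 66-75, and the class frequencies add up to the 30 observations.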
[Figure: histogram of the frequency distribution; horizontal axis: Class width]
FREQUENCY POLYGON
• It is a line graph of a frequency
distribution.
• Illustrates the shape of the data
more clearly than a histogram does.
• Connects the midpoints (class
marks) of the tops of the
histogram bars with a series of
straight lines.
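The points joined by a frequency polygon can be computed from the class limits. In the sketch below, the class limits and frequencies are illustrative values, and the function name `polygon_points` is hypothetical. Closing the polygon with a zero-frequency point one class width beyond each end is a common convention, assumed here.

```python
def polygon_points(classes, width):
    """Return (class mark, frequency) points for a frequency polygon.

    classes: list of (LCL, UCL, frequency) rows.
    The class mark is the midpoint (LCL + UCL) / 2. A zero-frequency
    point is added one class width before the first mark and one after
    the last, so that the polygon closes on the horizontal axis.
    """
    marks = [((lcl + ucl) / 2, freq) for lcl, ucl, freq in classes]
    first = (marks[0][0] - width, 0)
    last = (marks[-1][0] + width, 0)
    return [first] + marks + [last]


# Illustrative grouped frequency distribution (class width 10).
classes = [(16, 25, 7), (26, 35, 2), (36, 45, 6),
           (46, 55, 5), (56, 65, 6), (66, 75, 4)]

points = polygon_points(classes, width=10)
for mark, freq in points:
    print(mark, freq)
```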
[Figure: Frequency Polygon; vertical axis: Frequency; horizontal axis: Class mark, from 9.5 to 89.5 in steps of 10]
CUMULATIVE FREQUENCY CURVE,
(OGIVE)
• It is useful for determining the number
of values below or above some
particular value.
• Uses class boundaries along the
horizontal axis and cumulative frequencies
along the vertical axis.
• There are two types of ogive, namely the
less than ogive and the more than ogive.
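The points for both ogives can be derived from a grouped frequency distribution. The sketch below uses illustrative class limits and frequencies, and the function name `ogive_points` is hypothetical. Class boundaries are taken as the class limits shifted by half the unit of measurement, which is assumed to be 1.

```python
from itertools import accumulate


def ogive_points(classes, unit=1):
    """Return (less_than, more_than) lists of (boundary, cumulative frequency).

    classes: list of (LCL, UCL, frequency) rows.
    Boundaries are the first LCL minus unit/2 and each UCL plus unit/2.
    """
    half = unit / 2
    boundaries = [classes[0][0] - half] + [ucl + half for _, ucl, _ in classes]
    frequencies = [freq for _, _, freq in classes]

    # Number of observations below each boundary.
    less_than = [0] + list(accumulate(frequencies))
    total = less_than[-1]
    # Number of observations above each boundary.
    more_than = [total - count for count in less_than]

    return list(zip(boundaries, less_than)), list(zip(boundaries, more_than))


# Illustrative grouped frequency distribution.
classes = [(16, 25, 7), (26, 35, 2), (36, 45, 6),
           (46, 55, 5), (56, 65, 6), (66, 75, 4)]

less_than, more_than = ogive_points(classes)
print(less_than)
print(more_than)
```

The less than ogive rises from 0 to the total frequency, the more than ogive falls from the total to 0, and at every boundary the two cumulative frequencies add up to the total.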
[Figure: less than ogive and more than ogive; vertical axis: Cumulative Frequency (0 to 60); horizontal axis: class boundaries from 14.5 to 84.5]
[Figure: chart over the years 1986 to 1991; horizontal axis: Year]
VERTICAL LINE GRAPH
• It is a graphical representation of discrete data and
their frequencies.
• Vertical solid lines are used to indicate the
frequencies.
• Example: Draw a vertical line graph for the
following data.
Family              A B C D E
Number of children  2 1 5 4 3
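As a rough text-only sketch of this example, the family data can be drawn with one column of `|` characters per family. The function name `vertical_line_graph` and the layout are choices made here for illustration; a plotting library would normally be used instead.

```python
children = {"A": 2, "B": 1, "C": 5, "D": 4, "E": 3}


def vertical_line_graph(data):
    """Render category counts as vertical lines of '|' characters."""
    tallest = max(data.values())
    rows = []
    # Build the graph from the top level down to level 1.
    for level in range(tallest, 0, -1):
        cells = ["|" if count >= level else " " for count in data.values()]
        rows.append(f"{level} " + "  ".join(cells))
    # Category labels go under the lines.
    rows.append("  " + "  ".join(data))
    return "\n".join(rows)


graph = vertical_line_graph(children)
print(graph)
```

Each line's height equals the family's number of children, so family C has the tallest line and family B the shortest.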
BAR CHART (BAR DIAGRAM)
• Histograms, frequency polygons, and ogives are
used for data having an interval or ratio
level of measurement.
• A bar chart is a series of equally spaced bars
of uniform width where the height (length)
of a bar represents the frequency
corresponding to a category.
• Bars may be drawn horizontally or
vertically. Vertical bar graphs are preferred
as they make comparison with other bars easier.
• Example: Revenue (in millions of Birr) of
Company X from 1980 to 1982 is given
below.
Year    Revenue
1980    50
1981    150
1982    200
[Figure: bar chart of revenue by year, 1980 to 1982]
Year    Maize    Wheat
1980    40       80
1981    20       60
1982    60       100
[Figure: bar chart of maize and wheat (in quintals) by year, 1980 to 1982]
SUBDIVIDED BAR CHART
The number of quintals of wheat and maize produced by country X
Year    Wheat    Maize
1980    150      150
1981    300      200
1982    350      100
[Figure: subdivided bar chart; vertical axis: Number of quintals; horizontal axis: Year]
Example: percentage bar chart
Percentage of wheat and maize production from 1980-1982
Year    % of Wheat Production    % of Maize Production
1980    150/300 × 100 = 50       150/300 × 100 = 50
1981    300/500 × 100 = 60       200/500 × 100 = 40
1982    350/450 × 100 ≈ 78       100/450 × 100 ≈ 22
[Figure: percentage bar chart; vertical axis: Percentage produced; horizontal axis: Year]
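The percentages used in the percentage bar chart can be reproduced with a short calculation. The sketch below uses the wheat and maize figures from the example; the function name `percentage_shares` and the rounding to whole percentages are choices made here for illustration.

```python
production = {
    1980: {"wheat": 150, "maize": 150},
    1981: {"wheat": 300, "maize": 200},
    1982: {"wheat": 350, "maize": 100},
}


def percentage_shares(row):
    """Each component as a percentage of the row total, to the nearest whole."""
    total = sum(row.values())
    return {crop: round(quantity * 100 / total) for crop, quantity in row.items()}


shares = {year: percentage_shares(row) for year, row in production.items()}
for year, row in shares.items():
    print(year, row)
```

Every bar in a percentage bar chart has the same height (100%), so the chart compares the composition of each year's production rather than its size.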
PIE CHART
• A pie chart is a circle that is divided into
sections according to the percentage of
frequencies in each category of the distribution.
• Example: The monthly expenditure of a certain family
is given below.
[Figure: pie chart of monthly expenditure; categories: Food, House rent, Clothing, Misc.; slice values: 300, 350, 100, 250]
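The size of each sector follows from its share of the total: percentage = amount / total × 100 and angle = amount / total × 360°. The sketch below applies this to the four amounts in the example (300, 350, 100, 250). The pairing of amounts with categories is not recoverable from the text, so only the amounts are used, and the function name `pie_slices` is hypothetical.

```python
def pie_slices(amounts):
    """Return (amount, percentage, angle in degrees) for each sector."""
    total = sum(amounts)
    return [(a, a * 100 / total, a * 360 / total) for a in amounts]


expenditure = [300, 350, 100, 250]   # amounts from the example; total 1000

slices = pie_slices(expenditure)
for amount, percent, angle in slices:
    print(f"{amount}: {percent:.0f}% of the circle, {angle:.0f} degrees")
```

The percentages add up to 100 and the angles to 360 degrees, which is a quick check before drawing the chart.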
PICTOGRAPH (PICTOGRAM)
• A pictograph is a graph that uses symbols or
pictures to represent data.
• Example: In comparing the population of a
country from 1990 to 1992, we simply draw
pictures of people where each picture may
represent 1,000,000 people.
[Figure: pictograph with one row of symbols for each year, 1990, 1991, and 1992; key: one symbol = 1,000,000 people]