DBMS Concepts
Having collected and edited the data, the next important step is to organize it, that is, to
present it in a readily comprehensible, condensed form that helps in drawing
inferences from it. It is also necessary that like items be separated from unlike ones.
Tabular presentation
Diagrammatic and Graphic presentation.
Classification is a preliminary step that prepares the ground for proper presentation of data.
Definitions:
Frequency distribution: the organization of raw data in table form using classes
and frequencies.
Lecture notes on Introduction to probability and Statistics (Stat 2061) Chapter 1 methods of data
presentation
1) Categorical frequency Distribution:
-Used for data that can be placed in specific categories, such as nominal or ordinal data, e.g. marital
status.
Example: A social worker collected the following data on marital status for 25
persons (M = married, S = single, W = widowed, D = divorced).
M S D W D
S S M M M
W D S M M
W D D S S
S W W D D
Solution:
Since the data are categorical, discrete classes can be used. There are four types of marital
status: M, S, D, and W. These types will be used as the classes for the distribution. We follow
the procedure below to construct the frequency distribution.
Step 2: Tally the data and place the result in column (2).
Step 3: Count the tally and place the result in column (3).
Percentages are not normally a part of a frequency distribution, but they can be added since
they are used in certain types of diagrammatic presentation, such as pie charts.
Combining all the steps, one can construct the following frequency distribution.
Class   Tally       Frequency   Percent
M       ///// /     6           24
S       ///// //    7           28
D       ///// //    7           28
W       /////       5           20
2) Ungrouped frequency Distribution:
-A table of all the potential raw
score values that could possibly occur in the data, along with the number of times each
actually occurred.
First find the smallest and largest raw score in the collected data.
Arrange the data in order of magnitude and count the frequency.
Example:
80 76 90 85 80
70 60 62 70 85
65 60 63 74 75
76 70 70 80 85
Solution:
Value   Tally   Frequency
60      //      2
62      /       1
63      /       1
65      /       1
70      ////    4
74      /       1
75      /       1
76      //      2
80      ///     3
85      ///     3
90      /       1
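The tallying in the two examples above (the 25 marital status codes and the 20 raw scores) can be sketched in Python. This is only a minimal illustration, assuming the data are typed in exactly as listed in these notes; the function name `frequency_table` is hypothetical and not part of the notes.

```python
from collections import Counter

def frequency_table(values):
    """Return (value, frequency, percent) rows, sorted by value."""
    counts = Counter(values)
    n = len(values)
    return [(v, counts[v], 100 * counts[v] / n) for v in sorted(counts)]

# Marital status codes for the 25 persons, row by row as listed in the notes.
marital = ("M S D W D S S M M M W D S M M "
           "W D D S S S W W D D").split()

# The 20 raw scores of the ungrouped example, row by row.
scores = [80, 76, 90, 85, 80, 70, 60, 62, 70, 85,
          65, 60, 63, 74, 75, 76, 70, 70, 80, 85]

for name, data in (("marital status", marital), ("scores", scores)):
    print(name)
    for value, freq, pct in frequency_table(data):
        # One stroke per observation stands in for the hand-drawn tally.
        print(f"  {value!s:>3}  {'/' * freq:<8}  {freq:>2}  {pct:5.1f}%")
```

With the data as listed, the counts come out as M = 6, S = 7, D = 7 and W = 5 for the marital status example, and the 20 scores take 11 distinct values.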
3) Grouped frequency Distribution:
-When the range of the data is large, the data must be grouped into classes that are more than
one unit in width.
Definitions:
Units of measurement (U): the distance between two possible consecutive measures.
It is usually taken as 1, 0.1, 0.01, 0.001, and so on.
Class width: the difference between the upper and lower class boundaries of any
class. It is also the difference between the lower limits of any two consecutive classes
or the difference between any two consecutive class marks.
Class mark (midpoint): the average of the lower and upper class limits, or the
average of the lower and upper class boundaries.
Cumulative frequency above: it is the total frequency of all values greater than or
equal to the lower class boundary of a given class.
Cumulative frequency below: the total frequency of all values less than or equal
to the upper class boundary of a given class.
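As a small check on the definitions of class width and class mark, the sketch below computes the width in the three equivalent ways mentioned above. It is illustrative only: the function names are hypothetical, and the two classes 6-11 and 12-17 with U = 1 are borrowed from the worked example in these notes.

```python
def class_mark(lower, upper):
    """Average of the lower and upper limits (or boundaries) of a class."""
    return (lower + upper) / 2

def class_boundaries(limits, unit):
    """Boundaries lie U/2 below the lower limit and U/2 above the upper limit."""
    lower, upper = limits
    return (lower - unit / 2, upper + unit / 2)

def class_width(lower_boundary, upper_boundary):
    """Difference between the upper and lower class boundaries."""
    return upper_boundary - lower_boundary

U = 1                              # unit of measurement (whole numbers)
first, second = (6, 11), (12, 17)  # two consecutive classes, given by their limits

width_from_boundaries = class_width(*class_boundaries(first, U))
width_from_lower_limits = second[0] - first[0]
width_from_marks = class_mark(*second) - class_mark(*first)

print(class_boundaries(first, U), class_mark(*first))
print(width_from_boundaries, width_from_lower_limits, width_from_marks)
```

All three widths agree (6 for these classes), and the class mark is the same whether it is computed from the limits or from the boundaries.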
5. The classes must be equal in width. The exception here is the first or last class. It
is possible to have a "below ..." or "... and above" class. This is often used with
ages.
5. Pick a suitable starting point less than or equal to the minimum value. The starting
point is called the lower limit of the first class. Continue to add the class width to
this lower limit to get the rest of the lower limits.
6. To find the upper limit of the first class, subtract U from the lower limit of the
second class. Then continue to add the class width to this upper limit to find the
rest of the upper limits.
7. Find the boundaries by subtracting U/2 units from the lower limits and adding U/2
units to the upper limits. The boundaries are also half-way between the upper
limit of one class and the lower limit of the next class. It may not be necessary to
find the boundaries.
8. Tally the data.
9. Find the frequencies.
10. Find the cumulative frequencies. Depending on what you're trying to accomplish,
it may not be necessary to find the cumulative frequencies.
11. If necessary, find the relative frequencies and/or relative cumulative frequencies.
Example*:
11 29 6 33 14 31 22 27 19 20
18 17 22 38 23 21 26 34 39 27
Solutions:
Step 1: Find the highest and the lowest values: H = 39, L = 6.
Step 6: Find the upper class limits; e.g. the first upper class limit = 12 - U = 12 - 1 = 11.
11, 17, 23, 29, 35, 41 are the upper class limits.
So combining step 5 and step 6, one can construct the following classes.
Class limits
6 – 11
12 – 17
18 – 23
24 – 29
30 – 35
36 – 41
Class boundary
5.5 – 11.5
11.5 – 17.5
17.5 – 23.5
23.5 – 29.5
29.5 – 35.5
35.5 – 41.5
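The way the class limits and class boundaries listed above are generated (steps 5 to 7 of the procedure) can be sketched as follows. This is a minimal sketch, not the notes' own method: the function name `build_classes` is hypothetical, and the starting point 6, the width 6 and the number of classes 6 are read off the listed classes rather than derived here.

```python
def build_classes(start, width, n_classes, unit=1):
    """Return (class limits, class boundaries) following steps 5 to 7."""
    # Step 5: keep adding the class width to the first lower limit.
    lower_limits = [start + i * width for i in range(n_classes)]
    # Step 6: first upper limit = second lower limit - U, then keep adding the width.
    first_upper = start + width - unit
    upper_limits = [first_upper + i * width for i in range(n_classes)]
    # Step 7: subtract U/2 from lower limits, add U/2 to upper limits.
    limits = list(zip(lower_limits, upper_limits))
    boundaries = [(lo - unit / 2, up + unit / 2) for lo, up in limits]
    return limits, boundaries

limits, boundaries = build_classes(start=6, width=6, n_classes=6, unit=1)
for (lo, up), (lb, ub) in zip(limits, boundaries):
    print(f"{lo:>2} - {up:<2}   {lb:>4} - {ub:<4}")
```

Note that each upper boundary coincides with the next lower boundary, which is the "half-way" property stated in step 7.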
Step 9: Write the numeric values for the tallies in the frequency column.
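Class limit   Class boundary   Class mark   Tally      Freq.   Cf (less than type)   Cf (more than type)   rf     rcf (less than type)
6 – 11        5.5 – 11.5       8.5          //         2       2                     20                    0.10   0.10
12 – 17       11.5 – 17.5      14.5         //         2       4                     18                    0.10   0.20
18 – 23       17.5 – 23.5      20.5         ///// //   7       11                    16                    0.35   0.55
24 – 29       23.5 – 29.5      26.5         ////       4       15                    9                     0.20   0.75
30 – 35       29.5 – 35.5      32.5         ///        3       18                    5                     0.15   0.90
36 – 41       35.5 – 41.5      38.5         //         2       20                    2                     0.10   1.00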
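Steps 8 to 11 (tallying, frequencies, cumulative and relative frequencies) for the data of Example * can be sketched as below. This is an illustrative sketch using only the Python standard library; the variable names are assumptions of this sketch, and the class limits are the six classes constructed above.

```python
from itertools import accumulate

# The 20 observations of Example *, as listed in the notes.
data = [11, 29, 6, 33, 14, 31, 22, 27, 19, 20,
        18, 17, 22, 38, 23, 21, 26, 34, 39, 27]
limits = [(6, 11), (12, 17), (18, 23), (24, 29), (30, 35), (36, 41)]
n = len(data)

# Steps 8-9: count how many observations fall inside each class.
freq = [sum(lo <= x <= up for x in data) for lo, up in limits]

# Step 10: cumulative frequencies of the "less than" and "more than" type.
cf_less = list(accumulate(freq))
cf_more = list(accumulate(freq[::-1]))[::-1]

# Step 11: relative and relative cumulative ("less than") frequencies.
rf = [f / n for f in freq]
rcf_less = [c / n for c in cf_less]

for (lo, up), f, cl, cm, r, rc in zip(limits, freq, cf_less, cf_more, rf, rcf_less):
    print(f"{lo:>2} - {up:<2}  {f:>2}  {cl:>2}  {cm:>2}  {r:.2f}  {rc:.2f}")
```

The "more than" cumulative frequencies are obtained by accumulating the frequencies from the last class backwards, so the first class always carries the total number of observations.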
These are techniques for presenting data in visual displays using geometric figures and pictures.
Importance:
-The three most commonly used diagrammatic presentations for discrete as well as qualitative
data are:
Pie charts
Pictogram
Bar charts
Pie chart
Solutions:
Step 3: Using a protractor and compass, graph each section and write its name and corresponding
percentage.
Class   Frequency   Percent   Degree
Men     2500        25        90
Women   2000        20        72
Boys    1500        15        54
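The percent and degree figures in the rows above follow from two simple conversions, sketched below. The function names are hypothetical, and the grand total of 10,000 is an assumption of this sketch, inferred from 2,500 corresponding to 25%; it is not stated in the surviving text.

```python
def percent_of_total(frequency, total):
    """Share of the whole, in percent."""
    return 100 * frequency / total

def sector_degrees(percent):
    """Angle of the pie sector: the full circle (360 degrees) is 100%."""
    return 360 * percent / 100

# Rows as they appear in the notes: (class, frequency, percent, degrees).
rows = [("Men", 2500, 25, 90), ("Women", 2000, 20, 72), ("Boys", 1500, 15, 54)]

TOTAL = 10_000  # assumption: inferred from 2500 being 25% of the total

for name, freq, pct, deg in rows:
    print(name, percent_of_total(freq, TOTAL), sector_degrees(pct))
```

Each 1% of the data therefore takes up 3.6 degrees of the circle.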
[Figure: pie chart of CLASS with sectors Boys, Men, Girls, Women]
Pictogram
-In this diagram, we represent data by means of picture symbols. We decide
on a suitable picture to represent a definite number of units in which the variable is
measured.
Bar Charts:
- A set of bars (thick lines or narrow rectangles) representing some magnitude over
time or space.
- They are useful for comparing aggregates over time or space.
- Bars can be drawn either vertically or horizontally.
- There are different types of bar charts. The most common are:
Simple bar chart
Deviation or two way bar chart
Broken bar chart
Component or sub divided bar chart.
Multiple bar charts.
-They are thick lines (narrow rectangles) having the same breadth. The magnitude of a quantity
is represented by the height/length of the bar.
Example: The following data represent sales by product, 1957-1959, of a given company for three
products A, B, and C.
Solutions:
[Figure: simple bar chart; x-axis: product (A, B, C); y-axis: sales in $]
Draw a component bar chart to represent the sales by product from 1957 to 1959.
Solutions:
[Figure: component bar chart; x-axis: year of production (1957, 1958, 1959); y-axis: sales in $; components: Product A, Product B, Product C]
Solutions:
[Figure: bar chart; x-axis: year of production (1957, 1958, 1959); y-axis: sales in $; legend: Product A, Product B, Product C]
Represent the class boundaries (for the histogram or ogive) or the midpoints (for the
frequency polygon) on the X axis.
Plot the points.
Draw the bars or lines to connect the points.
Histogram
A graph which displays the data by using vertical bars of various heights to represent
frequencies. Class boundaries are placed along the horizontal axis. Class marks and class limits
are sometimes used as the quantity on the X axis.
Frequency Polygon:
- A line graph. The frequency is placed along the vertical axis and class midpoints
are placed along the horizontal axis. It is customary to add the next higher and next lower
class intervals, each with a corresponding frequency of zero, to make it a complete
polygon.
Example: Draw a frequency polygon for the above data (example *).
Solutions:
[Figure: frequency polygon; x-axis: value (class marks 2.5, 8.5, 14.5, 20.5, 26.5, 32.5, 38.5, 44.5); y-axis: frequency]
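The points of the frequency polygon for Example * can be computed as sketched below, including the two extra zero-frequency classes that close the polygon. This is a minimal sketch under the assumption that the six classes 6-11 to 36-41 are used; the variable names are illustrative only, and no plotting library is assumed.

```python
# The 20 observations of Example *, as listed in the notes.
data = [11, 29, 6, 33, 14, 31, 22, 27, 19, 20,
        18, 17, 22, 38, 23, 21, 26, 34, 39, 27]
limits = [(6, 11), (12, 17), (18, 23), (24, 29), (30, 35), (36, 41)]

# Class marks (midpoints) go on the horizontal axis, frequencies on the vertical.
marks = [(lo + up) / 2 for lo, up in limits]
freq = [sum(lo <= x <= up for x in data) for lo, up in limits]

# Add one class below and one above, each with frequency zero.
width = marks[1] - marks[0]
xs = [marks[0] - width] + marks + [marks[-1] + width]
ys = [0] + freq + [0]

polygon_points = list(zip(xs, ys))
print(polygon_points)
```

Joining these points in order with straight line segments gives the polygon; the first and last points sit on the horizontal axis.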
Ogive:
- A graph showing the cumulative frequency (less than or more than type) plotted
against upper or lower class boundaries respectively. That is, class boundaries are
plotted along the horizontal axis and the corresponding cumulative frequencies are
plotted along the vertical axis. The points are joined by a freehand curve.
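The points to plot for the two types of cumulative frequency curve can be sketched as follows for the data of Example *. The class frequencies below are those obtained by tallying its 20 values into the classes 6-11 to 36-41; the variable names are assumptions of this sketch.

```python
from itertools import accumulate

boundaries = [(5.5, 11.5), (11.5, 17.5), (17.5, 23.5),
              (23.5, 29.5), (29.5, 35.5), (35.5, 41.5)]
freq = [2, 2, 7, 4, 3, 2]  # class frequencies of Example *

# "Less than" type: running total from the first class, paired with upper boundaries.
cf_less = list(accumulate(freq))
less_than_points = [(upper, c) for (_, upper), c in zip(boundaries, cf_less)]

# "More than" type: running total from the last class, paired with lower boundaries.
cf_more = list(accumulate(freq[::-1]))[::-1]
more_than_points = [(lower, c) for (lower, _), c in zip(boundaries, cf_more)]

print(less_than_points)
print(more_than_points)
```

The "less than" curve rises to the total number of observations at the last upper boundary, while the "more than" curve starts at that total at the first lower boundary and falls.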