2022 Section 2a Data Presentation

This document discusses various primary and secondary methods for presenting raw data, including data arrays, stem-and-leaf diagrams, scatter plots, entry tables, frequency tables, histograms, ogives, line graphs, pie charts, bar charts, and cross-tables. These methods preserve original data values, arrange data in ascending or descending order, group data into classes to show frequencies, and investigate relationships between variables.

Uploaded by

GILVERT

• Raw data: data recorded in the sequence in which they are collected, before they are processed
• Data need to be presented in a way that will benefit the intended audience
• Two forms of data presentation exist:
1. Primary data presentation forms
2. Secondary data presentation forms
• Primary data presentation forms preserve the original data values, e.g.
a. Data array
b. Stem-and-leaf diagram
c. Scatter plot
d. Entry table
• Data array: data arranged in ascending/descending order
• Makes it easier to identify the highest/lowest observation(s), the mode, etc.
• Activity: construct a data array for the following:
11; 13; 11; 12; 14; 13; 17; 20; 12; 22; 20; 32; 33; 43; 50
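A minimal Python sketch of the activity, assuming the fifteen observations are read exactly as listed above. The variable names are illustrative only. It builds the data array in both orders and reads off the lowest value, the highest value and the mode(s).

```python
from statistics import multimode

# Observations from the activity, in the order they were recorded
raw = [11, 13, 11, 12, 14, 13, 17, 20, 12, 22, 20, 32, 33, 43, 50]

ascending = sorted(raw)                  # data array in ascending order
descending = sorted(raw, reverse=True)   # data array in descending order

lowest, highest = ascending[0], ascending[-1]
modes = multimode(ascending)             # every value tied for the highest count

print("Ascending: ", ascending)
print("Descending:", descending)
print("Lowest:", lowest, "Highest:", highest, "Mode(s):", modes)
```

Once the data are sorted, the extremes sit at the two ends of the array and repeated values sit next to each other, which is why the mode is easy to spot.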
• Stem-and-leaf diagram: each observation is divided into two portions, a stem and a leaf
• The leaves for each stem are shown separately in a display
• For two-digit observations, the first digit is the stem and the second digit is the leaf
• The first step is to construct a data array
• Then identify the stems from the data
• The next step is to match the leaves associated with each stem
• Layout: stem | leaves
• Include a key on the display, for easy interpretation of the presentation
• Prepare a stem-and-leaf diagram for the previous example
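The steps above can be sketched in Python as follows. This is an illustration for two-digit observations only, and the function name `stem_and_leaf` is hypothetical, not something the slides define.

```python
from collections import defaultdict

# Observations from the data array activity
raw = [11, 13, 11, 12, 14, 13, 17, 20, 12, 22, 20, 32, 33, 43, 50]

def stem_and_leaf(values):
    """Group two-digit observations: tens digit = stem, units digit = leaf."""
    stems = defaultdict(list)
    for value in sorted(values):         # step 1: data array
        stem, leaf = divmod(value, 10)   # steps 2 and 3: split stem and leaf
        stems[stem].append(leaf)
    return dict(stems)

diagram = stem_and_leaf(raw)
for stem, leaves in diagram.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
print("Key: 1 | 3 represents 13")
```

Sorting first means the leaves of every stem come out in ascending order, so the display doubles as a data array.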
• Scatter plot: graphical presentation of the (possible) linear relationship between two quantitative variables
• Major objective: investigate whether a linear relationship exists between the two variables under study
• Identify the dependent (output = Y) and independent (input = X) variables
• Each plotted point is for a pair of observations (X; Y) under study
• Plot these pairs of observations and interpret the graph
• Possibilities:
  • Perfect linear relationship (+ve/-ve)
  • Very strong linear relationship (+ve/-ve)
  • Strong linear relationship (+ve/-ve)
  • Mild linear relationship (+ve/-ve)
  • Weak linear relationship (+ve/-ve)
  • Very weak linear relationship (+ve/-ve)
  • No linear relationship
• NB: Read on the various types of relationships that can be depicted by scatter diagrams
• Find out if a relationship exists between training hours and productivity (output) for the following data:
Training hours   20  36  20  38  40  33  32  28  40  24
Output           40  70  44  56  60  48  62  54  63  38
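A hedged sketch for the training example: it pairs the observations as (X; Y) points, ready for plotting, and computes Pearson's correlation coefficient as a numerical check on what the plot suggests. The coefficient is not part of the slide; it is brought in here only as a supporting illustration, and the helper name `pearson_r` is hypothetical.

```python
from math import sqrt

training_hours = [20, 36, 20, 38, 40, 33, 32, 28, 40, 24]   # X (independent)
output = [40, 70, 44, 56, 60, 48, 62, 54, 63, 38]           # Y (dependent)

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

pairs = list(zip(training_hours, output))   # the points to plot
r = pearson_r(training_hours, output)

print("Points (X; Y):", pairs)
print(f"r = {r:.2f}")   # about 0.81 for these data
```

A value near +1 points to a positive linear relationship; which of the labels in the list above applies is still a judgment made from the plot itself.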
• Entry table: grouped data showing the classes, the entries for each class, and the frequency of each class inside brackets at the end of that class
• To construct an entry table, the following steps can be adopted:
1. Determine the number of classes (C)
• Use Sturges' rule: C = 1 + 3.33 log N, where log is to base 10 and N = total number of observations to be grouped
• The following are data on the number of cars sold per week at a certain car sale in Bulawayo over a period of 26 weeks:
10 12 11 12 11 13 15 11 25
10 19 11 36 18 15 40 15 13
12 14 10 14 10 18 11 13
• Present an entry table of the above data
NB: C must be an integer; round it up or down accordingly
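Step 1 can be checked with a short Python sketch, assuming the 26 weekly car-sales figures are read as listed and that "round up or down accordingly" means rounding to the nearest integer.

```python
from math import log10

# Cars sold per week over 26 weeks
cars_sold = [10, 12, 11, 12, 11, 13, 15, 11, 25,
             10, 19, 11, 36, 18, 15, 40, 15, 13,
             12, 14, 10, 14, 10, 18, 11, 13]

n = len(cars_sold)                 # N = total number of observations
c_exact = 1 + 3.33 * log10(n)      # Sturges' rule as stated in the notes
c = round(c_exact)                 # assumption: round to the nearest integer

print(f"N = {n}, C = {c_exact:.2f}, rounded to {c} classes")
```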
2. Then determine the width of each class (i), where

$$i = \frac{R + \delta}{C}$$

• Where R = range = highest (maximum) observation minus lowest (minimum) observation;
δ (delta) = 1 for discrete data to be grouped;
δ = 0.1 for observations rounded off to 1 decimal place;
δ = 0.01 for observations with 2 decimal places;
δ = 0.001 for observations with 3 decimal places; etc.
NB: the resulting value of i should always be rounded up to the nearest value/decimal place(s) to which the observations are recorded
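A small sketch of step 2 for the car-sales data, assuming C = 6 (Sturges' rule gives about 5.71 for N = 26) and δ = 1 because the counts are whole numbers. The helper name `class_width` is hypothetical.

```python
from math import ceil

cars_sold = [10, 12, 11, 12, 11, 13, 15, 11, 25,
             10, 19, 11, 36, 18, 15, 40, 15, 13,
             12, 14, 10, 14, 10, 18, 11, 13]

def class_width(values, c, delta=1):
    """Class width i = (R + delta) / C, rounded up to a multiple of delta."""
    r = max(values) - min(values)        # range R
    exact = (r + delta) / c
    return ceil(exact / delta) * delta   # round up to the recording precision

c = 6                                    # assumed number of classes
i = class_width(cars_sold, c)

print("R =", max(cars_sold) - min(cars_sold), "and i =", i)
```

Here R = 30, so (R + δ) / C = 31 / 6, roughly 5.17, which rounds up to 6. For a fractional δ such as 0.1 the same idea applies, but floating-point rounding would need extra care.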
3. Determine the lower and upper values for every class (class boundaries and/or limits) by applying the "Boundary Determination Formula". The formula states that:

$$B_K = \frac{R_S + i\,[2(K - 1) - C]}{2}$$

where B_K is the lower class boundary of the Kth class;
i = class interval (width)
C = number of classes (integer)
K = 1, 2, 3, 4, ..., C, C+1
R_S = highest observation plus lowest observation
For the first class (K = 1):

$$B_1 = \frac{R_S - iC}{2}$$
If B_1 is discrete, or has the same number of decimal places as the discrete/continuous data you are grouping, B_1 is a class limit and not a class boundary!
• In that case you need to apply the "Boundary Correction Formula":

$$B_1 = \frac{(R_S - \delta) - iC}{2}$$

The result will be the lower class boundary of the first class
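A hedged sketch of step 3 for the car-sales data, assuming i = 6, C = 6, δ = 1, a lowest observation of 10 and a highest of 40. Both function names are hypothetical. The correction formula is read as (R_S - δ - iC) / 2.

```python
def lower_boundary(k, r_s, i, c):
    """Boundary Determination Formula: lower boundary of the k-th class."""
    return (r_s + i * (2 * (k - 1) - c)) / 2

def corrected_first_boundary(r_s, i, c, delta=1):
    """Boundary Correction Formula, used when B1 turns out to be a class limit."""
    return ((r_s - delta) - i * c) / 2

r_s = 40 + 10          # highest observation plus lowest observation
i, c = 6, 6            # assumed class width and number of classes

b1 = lower_boundary(1, r_s, i, c)
b1_corrected = corrected_first_boundary(r_s, i, c)

print("B1 =", b1)                      # 7.0: a whole number, like the data
print("Corrected B1 =", b1_corrected)  # 6.5: a true class boundary
```

Because B1 = 7 is a whole number, just like the observations, it is a class limit, so the corrected value 6.5 is used as the first lower class boundary.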
4. Lastly, present the entry table. Adding i to B_1 gives the upper class boundary of the first class (B_2). Remember that the upper class boundary of the previous class becomes the lower class boundary of the next class, so add i to B_2 to get B_3 and proceed up to B_(C+1)
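Putting step 4 into a sketch: starting from an assumed corrected first boundary of 6.5 with i = 6 and C = 6, the code builds B_1 to B_(C+1) and lists the entries of each class, with the class frequency in brackets. The printed layout is one possible rendering, not a prescribed format.

```python
cars_sold = [10, 12, 11, 12, 11, 13, 15, 11, 25,
             10, 19, 11, 36, 18, 15, 40, 15, 13,
             12, 14, 10, 14, 10, 18, 11, 13]

b1, i, c = 6.5, 6, 6                                # assumed values from steps 1 to 3
boundaries = [b1 + k * i for k in range(c + 1)]     # B1, B2, ..., B(C+1)

entry_table = []
for lower, upper in zip(boundaries, boundaries[1:]):
    # Entries stay in their recorded order; no observation equals a boundary
    entries = [v for v in cars_sold if lower <= v < upper]
    entry_table.append((lower, upper, entries))

for lower, upper, entries in entry_table:
    listed = " ".join(str(v) for v in entries)
    print(f"{lower:>4} - {upper:<4}: {listed} ({len(entries)})")
```

Every observation falls in exactly one class because the boundaries carry one more decimal place than the data.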
• Secondary data presentation forms: the original data values are lost in the presentation process, e.g.
✓ Frequency tables (FT); relative frequency tables; histogram; frequency polygon; ogives; Lorenz/concentration curve; line graphs; pie charts; simple bar graphs; component/stacked bar graphs; pictograms; cross/contingency tables
• Frequency table (FT): a table with classes of values and their accompanying frequencies/counts
• Usually represented graphically using a histogram and a frequency polygon
• Visual representations let the audience interpret a data set quickly
• Construct an FT for the entry table data
• Relative frequency table (RFT): expresses the frequencies in a frequency distribution table as proportions of N (the total number of observations)
• Construct an RFT for the FT data
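A minimal sketch of a frequency table and relative frequency table for the car-sales data. The class boundaries 6.5, 12.5, ..., 42.5 are an assumption (six classes of width 6), not values given in the notes.

```python
cars_sold = [10, 12, 11, 12, 11, 13, 15, 11, 25,
             10, 19, 11, 36, 18, 15, 40, 15, 13,
             12, 14, 10, 14, 10, 18, 11, 13]

# Assumed class boundaries: six classes of width 6 starting at 6.5
boundaries = [6.5, 12.5, 18.5, 24.5, 30.5, 36.5, 42.5]
classes = list(zip(boundaries, boundaries[1:]))

frequencies = [sum(lower <= v < upper for v in cars_sold)
               for lower, upper in classes]
n = sum(frequencies)                                    # N
relative_frequencies = [f / n for f in frequencies]     # f / N

print("Class          f    r.f.")
for (lower, upper), f, rf in zip(classes, frequencies, relative_frequencies):
    print(f"{lower:>4} - {upper:<4} {f:>4}   {rf:.3f}")
```

The relative frequencies always add up to 1, which is a quick check on the table.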
• Histogram: a graphic display of a frequency distribution where:
• The bars are joined at the class boundaries (x-axis) and the frequency of each class is shown on the y-axis
• Interpret the data as displayed by the histogram
• Ogive (cumulative frequency curve): an accumulation of the frequencies with respect to the upper and/or lower class boundaries of the frequency distribution (FD)
• Gives the total number of values that fall below the upper boundary of each class, or above the lower boundary of each class, cumulatively
• Question:
• "What percentage of (or how many) observations fall below (or above) a given value of a random variable?"
• Answer: construct a less-than and/or a more-than ogive
• Less-than (<) ogive: create a class below the first class with zero frequency and, for each class, ask this question:
• "How many observations fall below this upper class boundary, cumulatively?"
• Plot the cumulative frequencies at the upper class boundary of each class
• More-than (>) ogive: shows the total number of observations counted above the lower class boundary of any given class
• Question:
"What percentage of (or how many) observations fall above (or are more than) a given value of the random variable?"
• Create an extra class above the last class
• For each class, ask: "How many observations fall above this lower class boundary, cumulatively?"
• For the > ogive, the cumulative frequencies are plotted (on graph paper) at the lower class boundary of each class
• Where both are needed, plot the two ogives on the same axes
• The two ogives are mirror images of each other
• You should be able to answer the questions raised, guided by the relevant ogive
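A hedged sketch of the points to plot for both ogives. The boundaries and frequencies are assumptions: they come from grouping the 26 car-sales observations into six classes of width 6 starting at 6.5.

```python
from itertools import accumulate

# Assumed grouping of the car-sales data (six classes of width 6)
boundaries = [6.5, 12.5, 18.5, 24.5, 30.5, 36.5, 42.5]
frequencies = [12, 10, 1, 1, 1, 1]
n = sum(frequencies)

# Less-than ogive: start at 0 (the extra empty class below the first class),
# then accumulate the frequencies up to each upper class boundary
less_than = [0] + list(accumulate(frequencies))

# More-than ogive: observations above each boundary; it ends at 0 for the
# extra empty class above the last class
more_than = [n - cf for cf in less_than]

print("Boundary   < ogive   > ogive")
for b, lt, mt in zip(boundaries, less_than, more_than):
    print(f"{b:>8} {lt:>9} {mt:>9}")
```

At every boundary the two cumulative frequencies add up to N, which is why the curves mirror each other. For example, 22 of the 26 weeks had sales below 18.5 cars and 4 weeks had more.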
• Line graph: plots values of ratio-scaled variables over time
• X-axis = time periods in time order
• Y-axis = values of the random variable for each period
• Used to identify trends in the movement of a random variable over time, e.g.
• Company profits, student numbers, daily share price over a particular time period
• Pie chart: a segmented circle
• Each segment is proportional to the importance or frequency of its data category, i.e. the relative importance/contribution of each category of a random variable
• Best suited to displaying categorical data
• Grouped continuous data can also be displayed in pie-chart form
• NB: Label all diagrams adequately with titles and legends/keys, to avoid misinterpretation
• Identify and show data sources
MALES   FEMALES   TOTAL
37      27        64
[Pie chart: Gender distribution for 2018 ECON105. Males, 58%; Females, 42%]
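The percentages on the pie chart can be reproduced with a short sketch. The segment angle (share × 360 degrees) is standard pie-chart arithmetic rather than something stated on the slide.

```python
# Counts from the gender distribution table
counts = {"Males": 37, "Females": 27}
total = sum(counts.values())

shares = {category: count / total for category, count in counts.items()}
angles = {category: share * 360 for category, share in shares.items()}

for category in counts:
    print(f"{category}: {shares[category]:.0%} of the total, "
          f"segment angle {angles[category]:.1f} degrees")
```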
• Simple bar chart: can display the same data as the pie chart, in detached bar form instead of segments of a circle
• Height of each bar is proportional to the frequency of that category of the random variable (r.v.)
• X-axis: categories of the r.v.
• Y-axis: frequency, relative frequency (r.f.) or percentage relative frequency (% r.f.); percentages are most preferred
• The (detached) bars must be of equal width, to avoid distortions
• Component (stacked) bar chart: displays, simultaneously on a single chart, the outcomes of two or more r.v.s, which overcomes the limitation of a simple bar chart
• A simple bar chart is converted into a component bar chart by splitting each category bar of one r.v. into sections/stacks
• Each section shows the relative importance of each category under study
             Gender
Store        Female   Male
Checkers     7        3
Pick’n’Pay   10       7
Spar         2        1
• Cross-table (contingency table): represents, in a table, two qualitative variables at a time
• Helps to establish whether or not a relationship exists between these two variables
• Your comment on the cross-table display should therefore clearly answer this question
             Gender
Store        Female   Male
Checkers     7        3
Pick’n’Pay   10       7
Spar         2        1
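One common way to comment on a cross-table is to compare row percentages, sketched below for the store-by-gender counts. This is an illustrative approach, not a method prescribed by the notes, and the variable names are hypothetical.

```python
# Store-by-gender counts from the cross-table
cross_table = {
    "Checkers":   {"Female": 7,  "Male": 3},
    "Pick'n'Pay": {"Female": 10, "Male": 7},
    "Spar":       {"Female": 2,  "Male": 1},
}

row_totals = {store: sum(row.values()) for store, row in cross_table.items()}

column_totals = {}
for row in cross_table.values():
    for gender, count in row.items():
        column_totals[gender] = column_totals.get(gender, 0) + count

# Share of each gender within each store, as a percentage of the row total
row_pct = {
    store: {gender: 100 * count / row_totals[store] for gender, count in row.items()}
    for store, row in cross_table.items()
}

for store, pct in row_pct.items():
    print(f"{store:<11} Female {pct['Female']:.1f}%  Male {pct['Male']:.1f}%  "
          f"(n = {row_totals[store]})")
print("Column totals:", column_totals)
```

If the gender split looks much the same in every row, the table gives little sign of a relationship between store and gender. With only 30 observations, any such reading is tentative.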
• For categorical (qualitative) data, tabular methods include:
✓ Frequency distributions (F.D.) / relative F.D. / percentage F.D.
✓ Cross-tables
• Graphical methods include:
o Bar charts
o Pie charts
o Component bar charts
o Pictograms
• For numerical (quantitative) data, tabular methods include:
✓ F.D. / relative F.D. / percentage F.D. / ogives
✓ Cross-tables, etc.
• Graphical methods include:
o Histogram
o Ogives
o Stem-and-leaf diagram
o Scatter diagram
o Frequency polygon
o Line graphs, etc.
• Section 2(b) is next
