Aphs .Presentation of Statistical Data - PPSX
1. Histogram
2. Frequency polygon
3. Frequency curve
4. Line chart or graph
5. Cumulative frequency diagram
6. Scatter or dot diagram
General rules for designing graphs
A graph should have a self-explanatory legend.
A graph should help the reader understand the data.
Axes should be labeled, with units of measurement indicated.
Scales are important. Start at zero; otherwise mark the axis with a break (//).
Avoid graphs that give a three-dimensional impression; they may
be misleading because readers visualize them less easily.
Histogram
It is a pictorial diagram of a frequency distribution.
The groups (class intervals) of the variable are
marked on the horizontal axis [x-axis], called the
“abscissa”.
Frequency, i.e., the no. of observations, is marked on the
vertical axis [y-axis], called the “ordinate”.
The area of each block or rectangle is proportional
to the frequency.
If the class intervals are uniform, the height of the
rectangle alone indicates the frequency.
If the class intervals differ in width, only the area
indicates the frequency.
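The area rule for unequal class intervals can be sketched as follows. This is a minimal illustration, assuming three hypothetical classes that are not taken from the slides: each bar's height is the frequency divided by the class width, so that height times width gives back the frequency.

```python
# Hypothetical classes with unequal widths (not from the slides):
# (lower limit, upper limit, frequency)
classes = [(0, 10, 20), (10, 20, 30), (20, 50, 30)]


def bar_heights(classes):
    """Return the frequency per unit of class width for each class."""
    return [freq / (upper - lower) for lower, upper, freq in classes]


heights = bar_heights(classes)
for (lower, upper, freq), height in zip(classes, heights):
    area = height * (upper - lower)  # the area recovers the class frequency
    print(f"{lower}-{upper}: frequency={freq}, height={height}, area={area}")
```

The last two classes have the same frequency, but the wider one gets a lower bar, which is why height alone cannot be read as frequency when widths differ.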
Grouped Frequency Distribution of Psychology Test Scores
Interval's Lower Limit   Interval's Upper Limit   Class Frequency
 39.5                     49.5                      3
 49.5                     59.5                     10
 59.5                     69.5                     53
 69.5                     79.5                    107
 79.5                     89.5                    147
 89.5                     99.5                    130
 99.5                    109.5                     78
109.5                    119.5                     59
119.5                    129.5                     36
129.5                    139.5                     11
139.5                    149.5                      6
149.5                    159.5                      1
159.5                    169.5                      1
E.g., Age Data
36 25 38 46 55 68 72 55 36 38
67 45 22 48 91 46 52 61 58 55
Bin      Frequency   Scores Included in Bin
20-30    2           25, 22
30-40    4           36, 38, 36, 38
40-50    4           46, 45, 48, 46
50-60    5           55, 55, 52, 58, 55
60-70    3           68, 67, 61
70-80    1           72
80-90    0           -
90-100   1           91
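The binning of the age data above can be reproduced with a short sketch. It assumes that a value falling exactly on a boundary (such as 30) would go to the upper bin (30-40); the slides do not state a rule, and no such value occurs in this data set.

```python
ages = [36, 25, 38, 46, 55, 68, 72, 55, 36, 38,
        67, 45, 22, 48, 91, 46, 52, 61, 58, 55]

BIN_WIDTH = 10

# One empty list per bin, keyed by the lower limit: 20-30, 30-40, ..., 90-100.
bins = {lower: [] for lower in range(20, 100, BIN_WIDTH)}

for age in ages:
    # Assumption: a boundary value such as 30 is placed in the bin 30-40.
    lower = (age // BIN_WIDTH) * BIN_WIDTH
    bins[lower].append(age)

frequencies = [len(scores) for scores in bins.values()]

for lower, scores in bins.items():
    print(f"{lower}-{lower + BIN_WIDTH}: {len(scores)} {scores}")
```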
Frequency polygon
It is again an area diagram of a frequency
distribution, developed over a histogram.
Joining the mid-points of the class intervals at the
height of their frequencies by straight lines gives a
polygon, i.e., a figure with many angles.
Frequency polygon for the
psychology test scores.
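The points joined by a frequency polygon can be computed directly from the class limits. The sketch below uses the psychology test score table and only calculates the (mid-point, frequency) pairs; drawing the lines is left to whatever plotting tool is at hand.

```python
# (lower limit, upper limit, frequency) from the psychology test score table
classes = [
    (39.5, 49.5, 3), (49.5, 59.5, 10), (59.5, 69.5, 53),
    (69.5, 79.5, 107), (79.5, 89.5, 147), (89.5, 99.5, 130),
    (99.5, 109.5, 78), (109.5, 119.5, 59), (119.5, 129.5, 36),
    (129.5, 139.5, 11), (139.5, 149.5, 6), (149.5, 159.5, 1),
    (159.5, 169.5, 1),
]

# Each vertex of the polygon sits above the class mid-point,
# at a height equal to the class frequency.
points = [((lower + upper) / 2, freq) for lower, upper, freq in classes]

for midpoint, freq in points:
    print(midpoint, freq)
```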
Frequency curve
When the no. of observations is very large and the
group interval is reduced, the frequency polygon
tends to lose its angulations, giving place to a
smooth curve known as a frequency curve.
Cumulative frequency diagram
An ogive is a graph of the cumulative relative
frequency distribution.
Cumulative frequency is the total no. of persons
in all groups from the lowest value of the
characteristic up to and including a given
group.
It is obtained by adding the frequencies of all
previous classes to that of the class in question.
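As an illustration of the cumulation step, the sketch below accumulates the class frequencies of the psychology test score table and divides by the total to get the cumulative relative frequencies that an ogive plots against the upper class limits.

```python
from itertools import accumulate

# Upper class limits and frequencies from the psychology test score table
upper_limits = [49.5, 59.5, 69.5, 79.5, 89.5, 99.5, 109.5,
                119.5, 129.5, 139.5, 149.5, 159.5, 169.5]
frequencies = [3, 10, 53, 107, 147, 130, 78, 59, 36, 11, 6, 1, 1]

# Running total: each entry includes all previous classes and the class itself.
cumulative = list(accumulate(frequencies))
total = cumulative[-1]

# Cumulative relative frequency, the quantity plotted on an ogive.
relative = [count / total for count in cumulative]

for limit, count, share in zip(upper_limits, cumulative, relative):
    print(f"up to {limit}: {count} ({share:.3f})")
```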
Line chart or graph
This is a frequency polygon presenting variations
by a line.
Line charts are used to show the trend of events with
the passage of time, such as rising, falling, or
fluctuating.
Scatter or dot diagram
It is a graphic presentation made to show the
nature of the correlation between two variable
characters X and Y in the same person[s] or
group[s], such as height and weight in men aged 20 years.
Hence it is also called a correlation diagram.
Stem and leaf plots
Box plots
Other ways to describe nominal data
1. Percentages and proportions
2. Rates and ratios
Stem and leaf plots were developed in 1977 by Tukey, a
statistician interested in meaningful ways to
communicate by visual display.
37, 33, 33, 32, 29, 28, 28, 23, 22, 22,
22, 21, 21, 21, 20, 20, 19, 19, 18, 18,
18, 18, 16, 15, 14, 14, 14, 12, 12, 9, 6
Figure 1. Stem and leaf display of
the number of marks
3|2337
2|001112223889
1|2244456888899
0|69
Steps followed in drawing a stem and leaf plot
The first step in organizing data for a stem and
leaf plot is to decide on the no. of subdivisions,
called classes or intervals.
Draw a vertical line.
Place the first digit(s) of each class, called the
stem, on the left side of the line.
The numbers on the right side of the vertical line are the
2nd digit of each observation; they are the leaves.
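The steps above can be sketched in a few lines for the marks data shown in Figure 1. This is a minimal version, assuming classes of width 10 so that the tens digit is the stem and the units digit is the leaf.

```python
from collections import defaultdict

marks = [37, 33, 33, 32, 29, 28, 28, 23, 22, 22,
         22, 21, 21, 21, 20, 20, 19, 19, 18, 18,
         18, 18, 16, 15, 14, 14, 14, 12, 12, 9, 6]

leaves_by_stem = defaultdict(list)
for mark in sorted(marks):
    stem, leaf = divmod(mark, 10)  # tens digit is the stem, units digit the leaf
    leaves_by_stem[stem].append(leaf)

lines = []
for stem in sorted(leaves_by_stem, reverse=True):  # largest stem on top, as in Figure 1
    leaf_text = "".join(str(leaf) for leaf in leaves_by_stem[stem])
    lines.append(f"{stem}|{leaf_text}")

print("\n".join(lines))
```

Sorting the marks first puts the leaves of each stem in ascending order, which matches the display in Figure 1.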
Differences in total score on the functional autonomy measurement system for
patients aged 85 years or older
90   F   28   20   -8
88   F    8   11    3
88   F    6    9    3
90   F   22   18   -4
88   M    6    7    1
86   F    9    9    0
86   M   23   15   -8
85   F   12   40   28
For example, for the 2nd person, write the 3 [leaf] on the
right side of the vertical line opposite +1 to +5 [stem].
Stem          | Leaves
-9 to -5      | 8 8
-4 to 0       | 4 0
+1 to +5      | 3 3 1
+6 to +10     | -
+11 to +15    | -
+16 to +20    | -
+21 to +25    | -
+26 to +30    | 8
+31 to +35    |
When an observation has two digits,
however, such as the score of 28 for
subject 8, only the 2nd digit, or 8 in this case,
is considered.
It is generally preferred to have equal
class widths and to avoid open-ended
intervals.
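The stem and leaf plot of the score differences can be reproduced with the sketch below. The helper `stem_range` is a hypothetical name introduced here; it assumes stems of width 5 ending on multiples of 5 (-9 to -5, -4 to 0, +1 to +5, and so on), and the leaf is taken as the last digit of the absolute difference, as described above.

```python
# Differences taken from the last column of the table (patients aged 85 or older)
differences = [-8, 3, 3, -4, 1, 0, -8, 28]

WIDTH = 5  # equal class width for every stem


def stem_range(value, width=WIDTH):
    """Return (lower, upper) of the stem containing value (hypothetical helper)."""
    upper = -(-value // width) * width  # round up to the next multiple of width
    return upper - width + 1, upper


plot = {}
for diff in differences:
    leaf = abs(diff) % 10  # only the last digit is kept as the leaf
    plot.setdefault(stem_range(diff), []).append(leaf)

for (lower, upper), leaves in sorted(plot.items()):
    print(f"{lower:+d} to {upper:+d} | {' '.join(map(str, leaves))}")
```

Only stems that contain at least one observation are printed; the empty stems shown in the slide would have to be added separately.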
Box plots
A box plot is sometimes called a “box and whisker plot”.
It is constructed from information in a stem and
leaf plot.
The median and the first and third quartiles of
the distribution are used in constructing a box plot.
A box plot is drawn with the top of the box at the
third quartile and the bottom at the first
quartile; quartiles are sometimes referred to as
hinges in box plots.
The length of the box is a visual representation of
the interquartile range, i.e., the box represents the middle
50% of the data.
The location of the midpoint, or median, of the
distribution is indicated with a horizontal line in
the box.
Exercise
Carl works at a computer store. He also
recorded the number of sales he made each
month. In the past 12 months, he sold the
following numbers of computers:
51, 17, 25, 39, 7, 49, 62, 41, 20, 6, 43, 13.
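One way to check the values needed for a box plot of Carl's sales is sketched below. It assumes the quartiles are taken as the medians of the lower and upper halves of the sorted data; other quartile conventions exist and can give slightly different values.

```python
from statistics import median

sales = [51, 17, 25, 39, 7, 49, 62, 41, 20, 6, 43, 13]

ordered = sorted(sales)
half = len(ordered) // 2

# Assumption: quartiles are the medians of the lower and upper halves.
q1 = median(ordered[:half])   # bottom of the box
q2 = median(ordered)          # horizontal line inside the box
q3 = median(ordered[-half:])  # top of the box
iqr = q3 - q1                 # length of the box

print("minimum:", ordered[0])
print("first quartile:", q1)
print("median:", q2)
print("third quartile:", q3)
print("maximum:", ordered[-1])
print("interquartile range:", iqr)
```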
Proportion:
a proportion is the no. of observations [a] with a given
characteristic divided by the total no. of
observations [a+b].
proportion = a/(a+b)
Ratio:
a ratio is the no. of observations in a given group with a
given characteristic [a] divided by the no. of observations
without the characteristic [b].
ratio = a/b
Rates: they are similar to proportions except that a
multiplier (e.g., 1000, 10,000… )
is used and they are computed over a specified period
of time. The multiplier is called the base.
rate = [a/(a+b)] × base
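The three formulas can be compared side by side in a short sketch. The counts a = 25 and b = 75 and the base of 1000 are hypothetical values chosen only for illustration.

```python
def proportion(a, b):
    """Observations with the characteristic divided by all observations."""
    return a / (a + b)


def ratio(a, b):
    """Observations with the characteristic divided by those without it."""
    return a / b


def rate(a, b, base):
    """A proportion multiplied by a base such as 1000 or 10,000."""
    return a / (a + b) * base


# Hypothetical counts: 25 with the characteristic, 75 without it.
a, b = 25, 75

print("proportion:", proportion(a, b))
print("ratio:", ratio(a, b))
print("rate per 1000:", rate(a, b, 1000))
```

The specified time period over which a rate is computed is part of how the counts are collected, so it does not appear in the arithmetic itself.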