Summarizing Data

This document discusses various methods for summarizing data including tabular, graphical, and numerical methods. It covers frequency distribution tables, histograms, frequency polygons, bar charts, pie charts and scatter plots. Examples are provided for each method.

BIOSTATISTICS (BIO 03)

SUMMARIZING DATA

By Dr. NANA AYEGUA HAGAN SENEADZA

METHODS OF SUMMARIZING DATA

• LECTURE OUTLINE
• Methods of summarizing data
❖TABULAR
❖GRAPHICAL and
❖NUMERICAL methods-
- Simple frequencies,
- Measures of central tendency,
- Measures of spread
• Other methods
❖Rates and Ratios
❖Measures of morbidity
❖Measures of mortality
Tabular method

• Before one can display the data graphically, one has to organize
the data in the form of tables, which summarize data into
compact and readily comprehensible form
Eg. frequency distribution table.
Tabular method

Tables should
• Be well labelled (rows and columns)
• Have a title
• Indicate the source of the data

Cross-tabular presentation (two-dimensional tables) is used for two variables
• E.g. age and sex distribution of Level 400 students
Tabular method

• A frequency distribution table shows the number of observations at different values of the variable.

• Its purpose is to display a meaningful pattern. It can be used for all types of data, discrete or continuous.

• The categories must be mutually exclusive and collectively exhaustive. For example, each disease must belong to one, and only one, category of the table.

• Avoid open-ended intervals.

• Limit the number of classes to between 10 and 20.
• Classes can be of equal or unequal widths.
Summarizing data – TALLY

Table 1: NATIONALITY OF MINE WORKERS IN TOWN A

Nationality      Tally                  Frequency

GHANAIANS        //// //// //// ////    19
OTHER ECOWAS     //// //// ////         14
EUROPE           //// /                 6
AMERICAS         //// ////              9
AFRICA           ///                    3
ASIA             ////                   4

Total                                   55
Table 2: Disease pattern at an outpatient clinic

Disease                Frequency    Relative frequency (%)

Malaria                186          31.0
Pneumonia              132          22.0
Measles                48           8.0
Diarrhoeal diseases    54           9.0
Malnutrition           60           10.0
Others                 120          20.0

TOTAL                  600          100.0
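Each relative frequency in Table 2 is the category's frequency expressed as a percentage of the total. The short Python sketch below reproduces that column from the frequencies; the variable names are illustrative and are not part of the lecture.

```python
# Frequencies taken from Table 2 (disease pattern at an outpatient clinic).
frequencies = {
    "Malaria": 186,
    "Pneumonia": 132,
    "Measles": 48,
    "Diarrhoeal diseases": 54,
    "Malnutrition": 60,
    "Others": 120,
}

total = sum(frequencies.values())

# Relative frequency (%) = frequency / total x 100, rounded to one decimal place.
relative = {name: round(100 * f / total, 1) for name, f in frequencies.items()}

for name, pct in relative.items():
    print(f"{name:<22}{frequencies[name]:>5}{pct:>8.1f}")
print(f"{'TOTAL':<22}{total:>5}{sum(relative.values()):>8.1f}")
```

Because the categories are mutually exclusive and cover every patient, the percentages should add up to 100.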

Table 3. Age structure of patients

Age interval (years)    Males    Females

< 1                     36       57
1–4                     191      196
5–14                    369      367
15–34                   263      384
35–54                   180      204
55–64                   64       71
65–99                   45       28
TOTAL                   1148     1307
Graphical presentation

• a. Continuous data set

• i Histogram
• ii Line graph
• iii Frequency polygon

• b. Discrete data set

• i Pie chart
• ii Bar diagram

• c. Other
• i Scatter diagram
• ii Spot diagram
Basic terms for frequency distribution

• Class limit, boundary, interval, width and midpoint

For the first class (300-399)

• Lower class limit = ??????
• Upper class limit = ??????
• Lower class boundary = ??????
• Upper class boundary = ??????
• The class width = ??????
• Class midpoint = ???????
Basic terms for frequency distribution

• Class limit, boundary, interval, width and midpoint

For the first class (300-399)

• Lower class limit = 300
• Upper class limit = 399
• Lower class boundary = 299.5
• Upper class boundary = 399.5
• The class width = upper class boundary – lower class boundary = 399.5 – 299.5 = 100
• Class midpoint = (upper class limit + lower class limit)/2 OR (upper class boundary + lower class boundary)/2 = 349.5
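The same terms can be computed for any class once its limits are known. The sketch below is a minimal Python illustration; the function name `class_summary` and the `unit` parameter are assumptions introduced here, with `unit=1` meaning the data are recorded to the nearest whole number, so each boundary lies half a unit beyond the limit.

```python
def class_summary(lower_limit, upper_limit, unit=1):
    """Return boundaries, width and midpoint of a class from its limits.

    `unit` is the recording precision of the data (assumed to be 1 here).
    """
    lower_boundary = lower_limit - unit / 2
    upper_boundary = upper_limit + unit / 2
    return {
        "lower_boundary": lower_boundary,
        "upper_boundary": upper_boundary,
        "width": upper_boundary - lower_boundary,
        "midpoint": (lower_limit + upper_limit) / 2,
    }


# First class from the slide: 300-399
first_class = class_summary(300, 399)
print(first_class)
```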
• HISTOGRAM

• In constructing a histogram, the area of each bar must correspond to the frequency of its interval.

• In the case of data with unequal interval widths, the heights on the y-axis must be adjusted.

• It is important to avoid open-ended intervals.

• The bars are not separated, i.e. there are no spaces between the bars.

• The y-axis gives the frequency of individuals and the x-axis gives the classes into which the data have been grouped.

• The axes should be properly defined and clearly labelled, with the scale clearly shown.

• A footnote must be provided if the data come from another source.
Histogram: Weight of luggage
[Figure: histogram; x-axis: weight group (1-5, 6-10, 11-15, ..., 56-60); y-axis: frequency (0 to 80)]
• HISTOGRAM

• In the case of data with unequal interval widths, the heights on the y-axis must be adjusted.

• The frequency distribution below gives the masses of 48 objects.

Mass (g)                           10–19          20–24       25–34          35–50          51–55
Frequency                          6              4           12             18             8
Class width                        10             5           10             15             5
Width on the x-axis                2 × standard   standard    2 × standard   3 × standard   standard
Rectangle's height in histogram    6 ÷ 2 = 3      4           12 ÷ 2 = 6     18 ÷ 3 = 6     8
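The height adjustment in the table amounts to dividing each frequency by the number of standard widths its class spans, so that bar area (height × width) stays proportional to frequency. A minimal Python sketch follows; it uses the class widths exactly as stated in the table, and taking the narrowest class as the "standard" width is an assumption made here for illustration.

```python
# Values copied from the mass table above.
frequencies = [6, 4, 12, 18, 8]
class_widths = [10, 5, 10, 15, 5]  # as stated in the table

# Assumption: the narrowest class defines the standard width.
standard_width = min(class_widths)

# Height = frequency / (class width expressed in standard widths).
heights = [f / (w / standard_width) for f, w in zip(frequencies, class_widths)]

# Area of each bar, in units of (standard width x height), recovers the frequency.
areas = [h * (w / standard_width) for h, w in zip(heights, class_widths)]

print(heights)
```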

Frequency polygons

Steps
• Create a histogram.
• Find the midpoint of the top of each bar on the histogram.
• Place a point on the x-axis before the first bar and after the last bar, so that the polygon starts and ends at zero frequency.
• Connect the points with straight lines.
• For intervals of unequal widths, the heights on the y-axis must be readjusted.

• You can also create the polygon without first creating the histogram.
LINE GRAPH:
Median HIV Prevalence 2000 – 2009
[Figure: line graph; x-axis: year (2000 to 2009); y-axis: HIV prevalence, with plotted values between 2.2 and 3.6]
Multiple line graphs

• Figure 2. Cell phone use in Ghana, 1996 to 2002

Any advantages of the frequency polygon over the histogram?
GRAPHICAL DISPLAYS FOR DISCRETE DATA

• BAR DIAGRAM
• The bars are separated and the widths are equal for the
respective categories. Numbers or frequencies or percentages
can be used.

• Bars can be vertical or horizontal

• Bars can be used to represent multiple categories of the
variable
Bar chart-types

COMPOSITE BAR CHART

• One bar is used for each group of the descriptive attribute, e.g. the distribution of patients who responded yes/no/unknown to having comorbid conditions in addition to their present diagnosis.
Pie chart

• A circular diagram cut into several segments representing the various groups of a descriptive attribute.
• It is drawn to show the percentage composition of a descriptive attribute.
• Drawing it by hand may require a compass and a protractor.
• It is not well suited to comparing two or more distributions, i.e. it is visually difficult to relate the sizes of segments.
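When a pie chart is drawn with a protractor, each segment's angle is the category's share of the full 360°. As an illustration only, the Python sketch below computes those angles for the disease frequencies in Table 2; pairing that table with a pie chart is a choice made here, not something the slides do.

```python
# Frequencies from Table 2 (disease pattern at an outpatient clinic).
frequencies = {
    "Malaria": 186,
    "Pneumonia": 132,
    "Measles": 48,
    "Diarrhoeal diseases": 54,
    "Malnutrition": 60,
    "Others": 120,
}

total = sum(frequencies.values())

# Segment angle (degrees) = frequency / total x 360
angles = {name: 360 * f / total for name, f in frequencies.items()}

for name, angle in angles.items():
    print(f"{name:<22}{angle:6.1f} degrees")
```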

Pie chart
[Figure: pie chart "Arrivals at KIA"; segments: Ghanaians, ECOWAS, Africans, Americas, EU, Asia]
Scatter Plots

A scatter plot is a graphical tool for exploring the relationship between two variables
• The response (outcome, dependent) variable, Y, is on the vertical axis
• The predictor (covariate, independent) variable, X, is on the horizontal axis

• Question: What to look for in a scatter plot?

Nature of Relationship – Linear?
• OTHER GRAPHICAL METHODS

• Spot diagrams and area graphs are often used in epidemiology to display the geographical distribution and the intensity of disease, respectively.

[Figure: spot map showing location of HIV sentinel sites in Ghana. Source: NACP/GHS]
[Figure: Ebola outbreak in West Africa]
Numerical or mathematical methods of data presentation

• Introduction
• It is often important to be able to describe the raw
data with one or two summary figures.
Numerical methods

• The appropriate summary measure depends on the type of data.
✓ Numeric data, e.g. parity, systolic BP, are summarized using measures of location/central tendency (mean, median, mode) and dispersion (standard deviation, range, etc.)
✓ Non-numeric data, e.g. sex, tribe, etc., are summarized by proportions or percentages
Numerical methods

Measures of Central Tendency

• Mean (arithmetic, geometric etc.)
• Median
• Mode
• Measures of spread/ dispersion
• Range
• Variance
• Standard deviation (square root of variance)
• Standard error of means
• Coefficient of variation [ sd /mean x 100%]
• Percentiles, quintiles, quartiles, tertiles etc
Numerical methods

• Proportions or percentages have no units,

• the measures of location and dispersion are in the same units
as the data eg average age in years
• except the variance which is in square units
• and the co-efficient of variation which has no units.
Calculating summary measures

• Proportions
• If N = number of subjects in a sample and n = number within the same sample having an attribute, then the proportion with the attribute is n/N
E.g. in a survey of 150 medical students, 20 tested positive for Hepatitis B infection.
The proportion of students with Hepatitis B infection is 20/150 = 0.13 or 13%
• MEASURES OF CENTRAL TENDENCY.

• The most common measures of central tendency are the mean,

median, mode and the geometric mean.
• Each has its advantages and disadvantages as a measure of
location.
• MEAN

• The sum of the individual items divided by the total number of items. It is amenable to mathematical manipulation but is easily affected by extreme observations.

• For grouped data, the mean is obtained by multiplying the value of each item by its frequency and then summing the products to obtain the numerator; the denominator is the sum of all the frequencies.

• In the case of data classified into intervals, the frequencies are multiplied by the class midpoint, because it is not known exactly where the observations are located within the classes.

• The class midpoint is obtained by adding the two class limits and dividing by two.
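• Mean
If X1, X2, X3, …, Xn are numeric observations made on n subjects, then the mean is
$$\bar{X} = \frac{X_1 + X_2 + X_3 + \dots + X_n}{n} = \frac{\sum X}{n}$$
When each value X occurs with frequency f,
$$\bar{X} = \frac{\sum fX}{\sum f}$$
where f is the frequency of observation X.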
AGE IN YEARS (X)    FREQUENCY (f)    X²    fX    fX²

21                  38
22                  35
23                  28
24                  24
25                  28

Note: X can be the class midpoint when using classes (intervals)

• ΣfX = ?
• Σf = ?
• Mean = ?
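One way to check the completed table is to compute the weighted sums directly. The Python sketch below applies Mean = ΣfX / Σf to the ages and frequencies above; the variable names are illustrative.

```python
# Ages (X) and their frequencies (f) from the table above.
ages = [21, 22, 23, 24, 25]
freqs = [38, 35, 28, 24, 28]

# Numerator: sum of frequency x value; denominator: sum of frequencies.
sum_fx = sum(f * x for x, f in zip(ages, freqs))
sum_f = sum(freqs)

mean_age = sum_fx / sum_f

print(f"Sum of fX = {sum_fx}")
print(f"Sum of f  = {sum_f}")
print(f"Mean      = {mean_age:.2f} years")
```

The same code works for data grouped into intervals if `ages` is replaced by the class midpoints.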
• Median
The item located at the midpoint when all the observations are arranged in ascending or descending order.

• It is the middle-ranked observation.

• It is less influenced by extreme values; however, it is not easily amenable to mathematical manipulation.

• It is the best measure of central tendency for skewed data.

First locate the middle position = (n+1)/2

Odd vs even number of observations: when n is odd, the median is the observation at position (n+1)/2; when n is even, it is the mean of the two middle observations (positions n/2 and n/2 + 1).
For grouped data,
$$\text{Median} = L_M + \frac{\frac{n}{2} - F_{M-1}}{F_M} \times C_i$$
• Where
$L_M$ = lower class boundary of the median class
$n$ = total number of observations
$F_{M-1}$ = cumulative frequency below the median class
$F_M$ = median class frequency
$C_i$ = median class interval
The median is the best measure of central tendency for skewed data.
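The grouped-data formula can be sketched in Python as follows. The function name `grouped_median` and the frequency table are hypothetical, invented purely to show the arithmetic; equal class widths are assumed.

```python
def grouped_median(lower_boundaries, frequencies, class_width):
    """Median of grouped data: L_M + (n/2 - F_below) / f_median * C_i."""
    n = sum(frequencies)
    if n == 0:
        raise ValueError("the table contains no observations")
    cumulative = 0  # cumulative frequency below the current class
    for lower, f in zip(lower_boundaries, frequencies):
        if cumulative + f >= n / 2:
            # This is the median class.
            return lower + (n / 2 - cumulative) / f * class_width
        cumulative += f


# Hypothetical table: classes 10-19, 20-29, 30-39, 40-49, 50-59
lower_boundaries = [9.5, 19.5, 29.5, 39.5, 49.5]
frequencies = [5, 8, 12, 9, 6]

median = grouped_median(lower_boundaries, frequencies, class_width=10)
print(round(median, 2))
```

Here n = 40, so n/2 = 20 falls in the third class (cumulative frequency below it is 13), giving 29.5 + (20 − 13)/12 × 10.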
• GEOMETRIC MEAN
• It is a useful summary statistic for antibody assays and microbial counts, and for skewed data.

• It is defined as the Nth root of the product of N observations.

• It is not used if any of the observations is zero or negative.

• Example: results of measles antibody measurements of 5 children.

• 4 8 16 16 64 (VERY SKEWED)
• GM = fifth root of (4 × 8 × 16 × 16 × 64)
• Taking logs (base 10) on both sides:
• 5 log GM = log 4 + log 8 + log 16 + log 16 + log 64
• = 5.72
• GM = antilog of (5.72/5)
• = 13.9
• On the other hand, the arithmetic mean = 21.6, the median = 16 and the mode = 16.
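The worked example can be checked numerically both ways, as the Nth root of the product and through base-10 logarithms. This is a small Python sketch using only the standard library; the variable names are illustrative.

```python
import math

# Measles antibody measurements of 5 children (from the example above).
titres = [4, 8, 16, 16, 64]
n = len(titres)

# Definition: Nth root of the product of N observations.
gm_root = math.prod(titres) ** (1 / n)

# Equivalent route via logs: log GM = (sum of logs) / N, then take the antilog.
log_sum = sum(math.log10(x) for x in titres)
gm_logs = 10 ** (log_sum / n)

arithmetic_mean = sum(titres) / n

print(f"Sum of logs     = {log_sum:.2f}")
print(f"Geometric mean  = {gm_root:.1f}")
print(f"Arithmetic mean = {arithmetic_mean:.1f}")
```

The geometric mean sits well below the arithmetic mean because the single large value (64) pulls the arithmetic mean upwards.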
• Mode
This is the most frequently occurring observation.
For grouped data,
$$\text{Mode} = L + \frac{f_z - f_l}{2f_z - (f_l + f_h)} \times i$$

• Where
$L$ = lower class boundary of the modal class
$f_z$ = frequency of the modal class
$f_l$ = frequency in the adjacent lower class
$f_h$ = frequency in the adjacent higher class
$i$ = modal class interval
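A minimal Python sketch of the grouped-mode formula is given below. The function name and the frequency table are hypothetical; treating a missing neighbouring class as having frequency 0 is an assumption made here so the formula still works when the modal class is first or last.

```python
def grouped_mode(lower_boundaries, frequencies, class_width):
    """Mode of grouped data: L + (fz - fl) / (2*fz - (fl + fh)) * i."""
    # Index of the modal class (the first class with the highest frequency).
    z = max(range(len(frequencies)), key=frequencies.__getitem__)
    fz = frequencies[z]
    # Assumption: a neighbouring class that does not exist has frequency 0.
    fl = frequencies[z - 1] if z > 0 else 0
    fh = frequencies[z + 1] if z < len(frequencies) - 1 else 0
    return lower_boundaries[z] + (fz - fl) / (2 * fz - (fl + fh)) * class_width


# Hypothetical table: classes 10-19, 20-29, 30-39, 40-49, 50-59
lower_boundaries = [9.5, 19.5, 29.5, 39.5, 49.5]
frequencies = [5, 8, 12, 9, 6]

mode = grouped_mode(lower_boundaries, frequencies, class_width=10)
print(round(mode, 2))
```

With these figures the modal class is 30–39, so the mode is 29.5 + (12 − 8)/(24 − 17) × 10.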
Distributions
MEASURES OF DISPERSION OR SPREAD OR
VARIATION

• The data below represent the post-evaluation results of three groups of participants at a workshop, supervised by different facilitators.

• Which of the three groups would you have liked to have been assigned to?

Group    Scores
I        70 29 48 90 92 61 30
II       68 72 65 50 58 63 44
III      59 59 58 60 60 61 63
• MEASURES OF DISPERSION OR SPREAD OR VARIATION

• The mean score per group is identical (60).

• However, it is important to know whether the observations are all close to the mean or whether they scatter widely in each direction.

• Information about the variation within groups provides useful additional statistics which could help to rate the strength of each group.
Dispersion /variation

• The degree to which numerical data tend to spread about an

average value
Measures of dispersion/variation/spread

• These include
o Range
o Variance
o Standard deviation
o Coefficient of variation
o Standard error of mean
o Inter-quartile range etc.
• RANGE:
• It is the simplest measure of spread,
• defined as the difference between the highest and the lowest observations.
Range = maximum observation − minimum observation
• It tends to increase as the number of observations increases.
• It is not easily used for statistical inference.
• It uses only 2 of the observations and ignores the information about variation carried by the rest.
• Variance and standard deviation
• The variance (σ²) is defined as the sum of the squared distances of each term in the distribution from the mean (μ), divided by the number of terms in the distribution (N).
$$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$

• For a sample, the VARIANCE is the mean square deviation about the sample mean, with divisor n − 1:
$$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$$

• Table 5. Example (Group I scores, mean X̄ = 60)

X        (X − X̄)    (X − X̄)²
70       10          100
29       −31         961
48       −12         144
90       30          900
92       32          1024
61       1           1
30       −30         900
TOTAL    0           4030

• VARIANCE = 4030/6 = 671.7
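The columns of Table 5 can be reproduced step by step. The Python sketch below computes the deviations, the squared deviations and the sample variance (divisor n − 1) for the Group I scores, then cross-checks the result against the standard library's `statistics.variance`; the variable names are illustrative.

```python
import statistics

# Group I scores from Table 5.
group_1 = [70, 29, 48, 90, 92, 61, 30]
n = len(group_1)

mean_1 = sum(group_1) / n

# Deviations from the mean and their squares (the two right-hand columns).
deviations = [x - mean_1 for x in group_1]
squared_deviations = [d ** 2 for d in deviations]

# Sample variance uses the divisor n - 1; the standard deviation is its square root.
variance_1 = sum(squared_deviations) / (n - 1)
sd_1 = variance_1 ** 0.5

print(f"Sum of deviations         = {sum(deviations):.0f}")
print(f"Sum of squared deviations = {sum(squared_deviations):.0f}")
print(f"Variance                  = {variance_1:.1f}")
print(f"Standard deviation        = {sd_1:.1f}")
```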

• COEFFICIENT OF VARIATION

• It is the expression of the standard deviation as a percentage of the mean.
• Useful in comparing variations of different attributes, e.g. variations in weight, height, and age of a study population.
• It can enable one to conclude that weights are more spread than heights of preschool children.
• It is a dimensionless statistic.
• CV = (standard deviation / mean) × 100%
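As a rough illustration of the CV formula, the Python sketch below applies it to the three workshop groups from the dispersion example. Reusing those scores here is a choice made for illustration; because the three groups share the same mean, their CVs rank them exactly as their standard deviations do.

```python
import statistics

# Post-evaluation scores of the three workshop groups.
groups = {
    "I": [70, 29, 48, 90, 92, 61, 30],
    "II": [68, 72, 65, 50, 58, 63, 44],
    "III": [59, 59, 58, 60, 60, 61, 63],
}

# CV (%) = sample standard deviation / mean x 100
cv = {
    name: 100 * statistics.stdev(scores) / statistics.mean(scores)
    for name, scores in groups.items()
}

for name, value in cv.items():
    print(f"Group {name:<4} CV = {value:.1f}%")
```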
Measures of Location/position

• The median divides the ranked dataset into 2 equal parts
• Quartiles divide a ranked set of data into four equal parts
• Deciles divide a ranked set of data into 10 equal parts
• Percentiles divide a ranked set of data into 100 equal parts

Measures of Location/position

Measure of position      Position in the ranked dataset
Q1 (lower quartile)      ¼ (n+1)
Q2 (median)              ½ (n+1)
Q3 (upper quartile)      ¾ (n+1)
D7, D9, Dk               7/10 (n+1), 9/10 (n+1), k/10 (n+1)
Pk                       k/100 (n+1)
Example

• Dataset of ages of participants receiving the MMR vaccine

1, 27, 16, 7, 31, 7, 30, 3, 21, 15, 13, 11, 5

Find Q1, Q2,Q3, D5, D8, P80
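The position rule k/parts × (n+1) can be sketched in Python as below. The slides give only the position; interpolating linearly between the two neighbouring ranked values when the position is not a whole number is an assumption made here, and the function name `position_measure` is hypothetical.

```python
def position_measure(data, k, parts):
    """Value at position k/parts * (n + 1) in the ranked data.

    Assumption: a fractional position is resolved by linear interpolation
    between the two neighbouring ranked values.
    """
    ranked = sorted(data)
    n = len(ranked)
    position = k * (n + 1) / parts
    position = min(max(position, 1), n)  # keep the position within 1..n
    rank = int(position)                 # whole part of the position
    fraction = position - rank
    if fraction == 0:
        return ranked[rank - 1]
    return ranked[rank - 1] + fraction * (ranked[rank] - ranked[rank - 1])


# Ages of participants receiving the MMR vaccine
ages = [1, 27, 16, 7, 31, 7, 30, 3, 21, 15, 13, 11, 5]

q1 = position_measure(ages, 1, 4)
q2 = position_measure(ages, 2, 4)
q3 = position_measure(ages, 3, 4)
d5 = position_measure(ages, 5, 10)
d8 = position_measure(ages, 8, 10)
p80 = position_measure(ages, 80, 100)

print(q1, q2, q3, d5, round(d8, 1), round(p80, 1))
```

With n = 13 the positions are 3.5, 7 and 10.5 for the quartiles and 11.2 for both D8 and P80, so D5 coincides with the median and D8 with P80.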

• QUARTILES

• These are the values which divide a ranked set of data into four equal parts.

• The value below which 1/4 of the ordered observations fall is called the lower or first quartile, Q1.

• The value which is exceeded by 1/4 of the ordered observations is called the upper or third quartile, Q3.

• The distance between the lower and the upper quartiles is called the inter-quartile range: IQR = Q3 − Q1

• The semi-inter-quartile range, (Q3 − Q1)/2, is also a measure of variation and, unlike the range, is not affected by extreme values.
Percentiles
• Finding percentiles in grouped data:

Recall
For grouped data,
$$\text{Median} = L_M + \frac{\frac{n}{2} - F_{M-1}}{F_M} \times C_i$$
• Where
$L_M$ = lower class boundary of the median class
$n$ = total number of observations
$F_{M-1}$ = cumulative frequency below the median class
$F_M$ = median class frequency
$C_i$ = median class interval
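The median formula extends to any percentile by replacing n/2 with kn/100 and locating the class that contains that cumulative position. The slides recall only the median case, so the generalisation below is offered as a sketch under that assumption; the function name and the frequency table are hypothetical, and equal class widths are assumed.

```python
def grouped_percentile(lower_boundaries, frequencies, class_width, k):
    """k-th percentile of grouped data: L + (k*n/100 - F_below) / f * C_i."""
    n = sum(frequencies)
    if n == 0:
        raise ValueError("the table contains no observations")
    if not 0 <= k <= 100:
        raise ValueError("k must be between 0 and 100")
    target = k * n / 100  # cumulative position of the percentile
    cumulative = 0        # cumulative frequency below the current class
    for lower, f in zip(lower_boundaries, frequencies):
        if f > 0 and cumulative + f >= target:
            return lower + (target - cumulative) / f * class_width
        cumulative += f


# Hypothetical table: classes 10-19, 20-29, 30-39, 40-49, 50-59
lower_boundaries = [9.5, 19.5, 29.5, 39.5, 49.5]
frequencies = [5, 8, 12, 9, 6]

p25 = grouped_percentile(lower_boundaries, frequencies, 10, 25)
p50 = grouped_percentile(lower_boundaries, frequencies, 10, 50)
p75 = grouped_percentile(lower_boundaries, frequencies, 10, 75)

print(p25, round(p50, 2), round(p75, 2))
```

Setting k = 50 reproduces the grouped median; k = 25 and k = 75 give the lower and upper quartiles.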
Q. The following data represent the number of correct responses in a statistics examination by 50 medical students, selected systematically from the list of all students in the Medical School.

72 72 93 70 59 78 74 65 73 80
57 67 72 57 83 76 74 56 68 67
74 76 79 72 61 72 73 76 67 49
71 53 67 65 100 83 69 61 72 68
65 51 75 68 75 66 77 61 64 74
a. Prepare the frequency distribution table and the frequency histogram for this data set.
b. Compute the sample mean, sample median, sample range R, and sample variance.
c. Does the data set represent a sample or a population? If it is a sample, describe the population from which it has been drawn.
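One possible way to check answers to parts (a) and (b) is sketched below in Python, using the standard library. The class width of 5 (classes 45–49, 50–54, ..., 100–104) is an arbitrary choice made here so that the number of classes falls within the 10 to 20 recommended earlier; other groupings are equally valid.

```python
import statistics
from collections import Counter

scores = [
    72, 72, 93, 70, 59, 78, 74, 65, 73, 80,
    57, 67, 72, 57, 83, 76, 74, 56, 68, 67,
    74, 76, 79, 72, 61, 72, 73, 76, 67, 49,
    71, 53, 67, 65, 100, 83, 69, 61, 72, 68,
    65, 51, 75, 68, 75, 66, 77, 61, 64, 74,
]

# (a) Frequency distribution with classes of width 5 (an assumed grouping).
class_counts = Counter(5 * (score // 5) for score in scores)
for lower in range(45, 105, 5):
    print(f"{lower:>3}-{lower + 4:<3} {class_counts[lower]:>3}")

# (b) Summary measures; statistics.variance uses the n - 1 divisor.
sample_mean = statistics.mean(scores)
sample_median = statistics.median(scores)
sample_range = max(scores) - min(scores)
sample_variance = statistics.variance(scores)

print(f"mean = {sample_mean:.2f}, median = {sample_median}, "
      f"range = {sample_range}, variance = {sample_variance:.2f}")
```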

Other summary measures

Rates and Ratios can be used to summarize data

• What are the differences between rates, ratios and proportions?
Rates
• Crude rates
❖ Crude birth rate, Crude death rate
• Specific rates-age or sex specific rates
❖ Infant mortality, Under 5 mortality, Maternal Mortality etc.
• Standardized rates
❖ Direct and Indirect standardized rates (for comparing two or more
different population groups)
Ratios
• Male: Female ratio
• Maternal Mortality ratio etc
Some Measures of morbidity

• Incidence rate
$$\text{Incidence rate} = \frac{\text{Number of new cases of illness in a defined period}}{\text{Average number of persons exposed to risk}}$$
• Prevalence rate
$$\text{Prevalence rate} = \frac{\text{Number of persons who are sick at a given time}}{\text{Average number of persons exposed to risk}}$$
• THANK YOU
