01 - Scales of Measurement - Summarising Numeric Data

Uploaded by

Opolot Julius

BIOSTATISTICS 1

• Introduction to Biostatistics
• Scales of measurement
• Summarizing numerical data with numbers

Joaniter Nankabirwa_Wandera
Introduction to statistics

• Statistics:

– Recorded data

– Characteristics calculated for a set of data

– Methods of collection, presentation, analysis and interpretation of data

• Biostatistics:

– Application of statistical methods to biological, medical and public health data
Sources of data
• Surveys
– Examples of national surveys (MIS, DHS etc)

• Surveillance
– DHIS
– UMSP

• Records
– Hospitals
– Government institutions

• Planned studies
– NGOs
– Universities
– Researchers
VARIABLE TYPES IN STATISTICS
• Categorical variables
– Take on values that are names or labels
– E.g. sex, education status, disease severity

• Binary/dichotomous variables
– Categorical variables that take on only 2 values
– E.g. sex (male or female), Yes/No responses

• Continuous variables
– Can take any value within a range, including decimals
– E.g. age, weight, height

• Discrete variables
– Take on only specific, countable values (usually whole numbers)
– E.g. number of children, number of pregnancies
Scales of measurement
Ways in which variables/numbers are defined and
categorized
Nominal scale:
• Used when data fit into categories
• The data do not represent a quantity or an amount
• We count the number of observations in each category (e.g. sex)
• Often summarized as percentages or proportions

Ordinal scale:
• A special kind of nominal scale
• Used when there is an inherent order among the categories
• E.g. stages of tumors, socioeconomic status (SES), level of education
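A nominal variable is summarized by counting observations per category and turning the counts into percentages. The sketch below does this for sex, using the ten values from the class data table; the variable names are illustrative only.

```python
from collections import Counter

# Sex of the ten class members, copied from the class data table.
sex = ["F", "M", "F", "M", "M", "F", "M", "M", "M", "F"]

counts = Counter(sex)   # number of observations in each category
total = len(sex)

# Percentage of observations falling in each category.
percentages = {category: 100 * count / total for category, count in counts.items()}

print(counts)
print(percentages)
```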
Scales of measurement
Interval scale:
• Data are quantitative and measured in intervals
• There is order, and the difference between two values is meaningful
• Distances between adjacent points are equal along the whole scale
• E.g. temperature (in °C), pH

Ratio scale:
• Data is also quantitative
• Has all properties of an interval scale
• In addition, has a true zero, meaning that none of the quantity is present
• An example is weight
Numeric data
• Numeric data are expressed in numbers
• Measured on either an interval or a ratio scale
• Variables can be continuous or discrete
• Summarised by measuring the middle and the dispersion
Summarizing numeric data
Measures of the middle
• Mean ($\bar{X}$) / average
• Add all observations and divide by the number of observations
• $\bar{X} = \sum X / n$

• Example: mean age of the CEU class (activity, n = 10)


Age:
-
Data
Name      ID   Age   Sex   Weight
Dorothy    1    25   F     65
Anthony    2    27   M     65
Janet      3    27   F     58
Jovan      4    24   M     55
Rogers     5    26   M     72
Joselyn    6    24   F     48
Bernard    7    30   M     76
Daniel     8    34   M     78
Seti       9    29   M     62
Aziida    10    31   F     75
Mean
• Mean = Sum values/total # of values

mean Age
=(25+27+27+24+26+24+30+34+29+31)/10

=277/10
=27.7
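The same calculation can be checked in a few lines of Python. This is a minimal sketch using the ten ages from the data table; the variable names are illustrative.

```python
# Ages of the ten class members, in ID order.
ages = [25, 27, 27, 24, 26, 24, 30, 34, 29, 31]

total = sum(ages)       # sum of all observations
n = len(ages)           # number of observations
mean_age = total / n    # arithmetic mean

print(total, n, mean_age)
```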
The mean is sensitive to extreme values (outliers), especially when the sample size is small (check it out)

ID Age Sex Weight

1 100 F 65
2 27 M 65
3 27 F 58
4 24 M 55
5 26 M 72
6 24 F 48
7 30 M 76
8 34 M 78
9 29 M 62
10 31 F 75
Mean
• Mean = Sum values/total # of values

mean Age
=(100+27+27+24+26+24+30+34+29+31)/10
=352/10
=35.2
Without the outlier, the mean is 27.7.
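To see how much a single extreme value moves the mean, the sketch below compares the two versions of the age data. It assumes, as in the tables, that the only change is ID 1 going from 25 to 100.

```python
from statistics import mean

ages = [25, 27, 27, 24, 26, 24, 30, 34, 29, 31]
ages_with_outlier = [100] + ages[1:]   # ID 1 recorded as 100 instead of 25

mean_without = mean(ages)
mean_with = mean(ages_with_outlier)
shift = mean_with - mean_without       # how far one value drags the mean

print(mean_without, mean_with, shift)
```

With only ten observations, one changed value shifts the mean by 7.5 years.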
If the original observations are not available, use a weighted average computed from a frequency table.
Weighted mean = $\sum fX / n$, where $X$ is the class midpoint, $f$ is the class frequency and $n = \sum f$ (work it out)

Age class   Frequency (f)   Midpoint (X)   fX
20 - 24     2               22             44
25 - 29     5               27             135
30 - 34     3               32             96
Total       10                             275

Mean = $\sum fX / n$

Mean = 275/10

= 27.5
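The frequency-table calculation can be sketched as follows. The class limits and frequencies come from the table; representing each class as a (lower, upper, frequency) tuple is just an illustrative choice.

```python
# Each class is (lower limit, upper limit, frequency), as in the frequency table.
classes = [(20, 24, 2), (25, 29, 5), (30, 34, 3)]

n = sum(f for _, _, f in classes)                              # total frequency
sum_fx = sum(f * (low + high) / 2 for low, high, f in classes)  # sum of f * midpoint
grouped_mean = sum_fx / n

print(n, sum_fx, grouped_mean)
```

The grouped mean (27.5) differs slightly from the mean of the raw ages (27.7) because every observation in a class is treated as if it sat at the class midpoint.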
Measures of the middle continued (numeric data)

• Median
o Is the middle observation
o Point at which half the observations are smaller and half are
larger

• Steps
o Arrange observations from smallest to largest
o Count to find the middle value
o Median is the middle value for odd number of observations and
the mean of the two middle values for an even number of
observations
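• Activity using our data
o Position: with n = 10 (an even number), the median is the mean of the values in positions n/2 and n/2 + 1 (the 5th and 6th)
o Ordered ages: 24, 24, 25, 26, 27, 27, 29, 30, 31, 34
o Median = (27 + 27)/2 = 27
o With the outlier: 24, 24, 26, 27, 27, 29, 30, 31, 34, 100; median = (27 + 29)/2 = 28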
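Deviations from the mean (27.7), for the lower half (24, 24, 25, 26, 27) and the upper half (27, 29, 30, 31, 34) of the ordered data:
• Lower half: 27.7 − 24 = 3.7
• Upper half: 27.7 − 27 = 0.7; 27.7 − 29 = −1.3; 27.7 − 30 = −2.3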
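The median steps (sort, find the middle, average the two middle values when n is even) can be sketched as a small function. The function name is illustrative; Python's standard library also offers `statistics.median`, which gives the same result.

```python
def median(values):
    """Return the middle value of the data (mean of the two middle values if n is even)."""
    ordered = sorted(values)           # step 1: arrange from smallest to largest
    n = len(ordered)
    middle = n // 2                    # step 2: locate the middle
    if n % 2 == 1:
        return ordered[middle]         # odd n: the single middle value
    return (ordered[middle - 1] + ordered[middle]) / 2   # even n: mean of the two middle values

ages = [25, 27, 27, 24, 26, 24, 30, 34, 29, 31]
ages_with_outlier = [100] + ages[1:]   # ID 1 recorded as 100 instead of 25

print(median(ages), median(ages_with_outlier))
```

The outlier moves the median only from 27 to 28, whereas it moves the mean from 27.7 to 35.2.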
Measures of the middle continued (numeric data)

• Mode
• Is the value that occurs most frequently
• When a data set has two modes, it is called bimodal
• In frequency tables, the class with the highest frequency is called the modal class
• Work it out using our data: 24, 24, 25, 26, 27, 27, 29, 30, 31, 34
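One way to check the mode is `statistics.multimode` (available in Python 3.8 and later), which returns every value tied for the highest frequency. A minimal sketch with the class ages:

```python
from statistics import multimode

ages = [25, 27, 27, 24, 26, 24, 30, 34, 29, 31]

modes = sorted(multimode(ages))   # all values that occur most often
is_bimodal = len(modes) == 2

print(modes, is_bimodal)
```

For these ages two values each occur twice, so the data set is bimodal.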
Measures of the middle continued (numeric data)

• Geometric mean (GM)


• Not used as often as arithmetic mean
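• It is the nth root of the product of the n observations
• $GM = \sqrt[n]{X_1 \times X_2 \times X_3 \times \cdots \times X_n}$
• It is a common way of summarising data measured on a logarithmic scale (e.g. lab data)
$GM = \sqrt[10]{24 \times 24 \times 25 \times 26 \times 27 \times 27 \times 29 \times 30 \times 31 \times 34}$
$= \sqrt[10]{2.50 \times 10^{14}}$
$\approx 27.5$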
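The geometric mean can be computed directly as the nth root of the product, or equivalently by averaging the logarithms and exponentiating. The sketch below does both for the class ages; it uses `math.prod`, which requires Python 3.8 or later.

```python
import math

ages = [25, 27, 27, 24, 26, 24, 30, 34, 29, 31]
n = len(ages)

product = math.prod(ages)        # product of all n observations
gm = product ** (1 / n)          # nth root of the product

# Equivalent form: exponentiate the mean of the natural logs.
gm_from_logs = math.exp(sum(math.log(x) for x in ages) / n)

print(product, round(gm, 1), round(gm_from_logs, 1))
```

The log form is the one usually preferred for long data sets, because the raw product grows very quickly.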
When to use mean, median and mode
• Mean
– Used only when data is symmetrical (normally
distributed)

• Median
– Used when data is skewed

• Mode:
– Used for bimodal data

• Geometric mean
– Used for observations measured on a logarithmic scale
Measures of spread (Numeric data)

• Range
o Difference between the largest and smallest observations
o May also be presented as the smallest and largest observations themselves rather than their difference
o Work out the range using our data

(24, 24, 25, 26, 27, 27, 29, 30, 31, 34): Range = 34 − 24 = 10 (or 24 to 34)
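Both ways of reporting the range, as a single difference or as the two extreme values, follow from the minimum and maximum. A minimal sketch with the class ages:

```python
ages = [25, 27, 27, 24, 26, 24, 30, 34, 29, 31]

smallest, largest = min(ages), max(ages)
value_range = largest - smallest   # range reported as a single number

print(value_range, (smallest, largest))
```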
Measures of spread (Numeric data)

• Standard deviation (SD)
• Commonest measure of spread
• Describes the spread of observations around the mean
• Roughly the typical distance of an individual observation from the mean (the square root of the average squared deviation)
• $SD = \sqrt{\sum (X - \bar{X})^2 / (n - 1)}$

Work out the standard deviation (our mean = 27.7)
(Worksheet: one row per ID, 1 to 10)
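The standard deviation can be worked out step by step, mirroring the formula: subtract the mean from each observation, square, sum, divide by n − 1 and take the square root. This sketch uses the ten ages from the data table, whose mean is 27.7; variable names are illustrative.

```python
import math

ages = [25, 27, 27, 24, 26, 24, 30, 34, 29, 31]
n = len(ages)
mean_age = sum(ages) / n

# Squared deviation of each observation from the mean.
squared_deviations = [(x - mean_age) ** 2 for x in ages]
sum_of_squares = sum(squared_deviations)

# Sample standard deviation: divide by n - 1, then take the square root.
sd = math.sqrt(sum_of_squares / (n - 1))

print(round(sum_of_squares, 1), round(sd, 2))
```

`statistics.stdev(ages)` from the standard library uses the same n − 1 divisor and should agree with this result.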
Measures of spread (Numeric data)

• Variance
• Variance = $SD^2$
• The SD is often preferred because it is in the same units as the data
• Work it out using our data

• Coefficient of variation (CV)
• Often used in biological sciences
• $CV = (SD / \bar{X}) \times 100$
• Work it out
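Variance and the coefficient of variation both follow from the SD. The sketch below uses the standard library's sample (n − 1) versions on the class ages.

```python
from statistics import mean, stdev, variance

ages = [25, 27, 27, 24, 26, 24, 30, 34, 29, 31]

var = variance(ages)            # sample variance, equal to SD squared
sd = stdev(ages)                # sample standard deviation
cv = sd / mean(ages) * 100      # coefficient of variation, in percent

print(round(var, 2), round(sd, 2), round(cv, 1))
```

Because the CV is a percentage of the mean, it has no units, which is what allows comparison across variables measured on different scales.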
Measures of spread (Numeric data)
• Percentiles
o A percentile tells you where a score stands relative to the other scores
o E.g. in the growth chart for girls aged 3 years, the 90th percentile is 15 kg
o This means that 90% of girls aged 3 years weigh 15 kg or less
How to get percentiles & The Interquartile range
• Arrange the data in increasing order
• Say we want the 70th percentile for our 10 observations
• Position = 70/100 × 10 = 7, so find the value in the 7th position
• Work it out
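The position rule above can be sketched as a small function. Rounding the position up when it is not a whole number is an assumption made here for illustration; textbooks and software use several different conventions for that case.

```python
import math

def percentile_by_position(values, p):
    """Value at position p/100 * n in the ordered data (hypothetical helper)."""
    ordered = sorted(values)                   # arrange in increasing order
    n = len(ordered)
    position = math.ceil(p * n / 100)          # 1-based position; rounded up (assumption)
    position = max(position, 1)                # guard against position 0 for very small p
    return ordered[position - 1]

ages = [25, 27, 27, 24, 26, 24, 30, 34, 29, 31]

p70 = percentile_by_position(ages, 70)         # 7th position in the ordered ages
print(p70)
```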

Interquartile range (IQR)

• Makes use of percentiles
• It is the difference between the 75th and 25th percentiles (i.e. the third and first quartiles); usually the actual values are stated
• Work it out with our data
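As a sketch, the quartiles can be obtained with `statistics.quantiles` (Python 3.8 and later). Its default "exclusive" method interpolates between neighbouring values, so the quartiles may differ slightly from a hand calculation that uses a different percentile convention.

```python
from statistics import quantiles

ages = [25, 27, 27, 24, 26, 24, 30, 34, 29, 31]

# n=4 splits the data into quarters and returns the three cut points.
q1, q2, q3 = quantiles(ages, n=4)
iqr = q3 - q1

print(q1, q3, iqr)
```

Reporting the pair (q1, q3) alongside the median follows the slide's advice to state the actual values.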


Uses of the measures of spread (Numeric data)

• Whenever you use a measure of the center, you need to include a measure of spread
• SD: used when the mean is used (normally distributed data)
• Percentiles and IQR: used when the median is used
• Range: used when the aim is to emphasize extremes
• Coefficient of variation: used when the aim is to compare distributions measured on different scales
