DEPARTMENT OF ORTHODONTICS
AND
DENTOFACIAL ORTHOPEDICS
COLLEGE OF DENTAL SCIENCES,
DAVANGERE
Seminar on
BIOSTATISTICS
Prof. and Head Dr.NAVEEN KUMAR C
Dept. of Orthodontics PG Student.
CONTENTS
Introduction
What is statistics?
Biostatistics
Uses of Biostatistics
Data
Sample & Sampling designs
Presentation of data
Measures of central tendency
Measures of variability
Probability
Statistical Significance (Tests of significance )
Correlation & Regression
Conclusion
References
‘Statistic’ or ‘Datum’: in the singular, a measured or counted fact or piece of information stated as a figure.
‘Statistics’ or ‘Data’: the plural of the same, stated as more than one figure.
Origin of the word: Statista (Italian word), statesman;
Statistik (German word), political state.
John Graunt (1620-1674) is regarded as the father of health statistics.
….. Run towards the best materials in the
field …..
DEFINITION
Statistics:
Principles and methods for collection, presentation, analysis and
interpretation of numerical data.
Biostatistics:
The tools of statistics applied to data derived from the biological sciences.
Why do we need biostatistics?
We need biostatistics to:
Define normalcy
Test the difference between two populations
Study the correlation or association between two or more attributes
Evaluate the efficacy of vaccines, sera etc. by controlled studies
Locate, define & measure the extent of disease
Evaluate achievements
Fix priorities
The five fundamental processes involved in organization of oral health
care services.
1. Acquisition of information.
2. Dissemination of information.
3. Application of knowledge and skill.
4. Judgement or evaluation.
5. Administration.
Uses of biostatistics in Public Health Dentistry:
Assess the state of oral health in community
Indicate basic factors underlying state of oral health
Determine success or failure of specific oral health care programmes
or to evaluate the programme action
Promote health legislation and in creating administrative standards for
oral health.
DATA
Data – collective recording of observations.
Variable- characteristic which varies from one person to another.
Sources;
1. Experiments
2. Surveys
3. Records
TYPES OF DATA
Depending upon the source of collection:
Primary data: Interview, Examination, Questionnaire
Secondary data: Records, Census data
Data can be of two types:
Qualitative (discrete data):
Subjects with the same characteristic are counted
The characteristic remains the same
Eg. deaths, sex, malocclusion
Quantitative (continuous data):
The characteristic varies (variable) from subject to subject
The frequency also varies
Eg. height, arch length
SAMPLE
Population – Group of all individuals who are the focus of investigation.
Sample – Group of sampling units (individuals) that form part of population
generally selected so as to be representative of the population whose
variables are under study
Sampling units – Individuals who form the focus of study
Sampling frame or sampling list - List of sampling units
SAMPLING METHODS
A. Probability Sampling
The sample is selected at random
Every unit in the population has a known, non-zero probability (chance) of being chosen in the sample; in simple random sampling these probabilities are equal.
Types;
1. Simple Random sampling
2. Stratified Random sampling
3. Cluster sampling
4. Systematic sampling
5. Multistage sampling
6. Multiphase sampling
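As a rough illustration of the first, second and fourth designs above, the sketch below draws a simple random, a systematic and a stratified random sample. The sampling frame of 100 patient IDs, the rural/urban strata and the 10% sampling fraction are all assumptions made up for the example.

```python
import random

def simple_random_sample(frame, n, rng):
    """Draw n units without replacement; every unit is equally likely."""
    return rng.sample(frame, n)

def systematic_sample(frame, n, rng):
    """Pick a random start, then take every k-th unit (k = N // n)."""
    k = len(frame) // n
    start = rng.randrange(k)
    return [frame[start + i * k] for i in range(n)]

def stratified_sample(strata, fraction, rng):
    """Draw the same fraction at random from each stratum."""
    sample = []
    for units in strata.values():
        size = max(1, round(len(units) * fraction))
        sample.extend(rng.sample(units, size))
    return sample

rng = random.Random(0)        # fixed seed only so the sketch is repeatable
frame = list(range(1, 101))   # hypothetical sampling frame: patient IDs 1..100

srs = simple_random_sample(frame, 10, rng)
systematic = systematic_sample(frame, 10, rng)

# Hypothetical strata: the first 60 IDs are rural, the remaining 40 urban.
strata = {"rural": frame[:60], "urban": frame[60:]}
stratified = stratified_sample(strata, 0.1, rng)
```

The stratified draw keeps the rural/urban split of the frame (6 and 4 units here), which a simple random sample of the same size does not guarantee.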
B. Non Probability sampling
(Deliberate /Purposive)
Units in the sample are collected with no specific probability
structure.
Types;
1. Convenience / purposive sampling
Sample size Formulae
Formulae used in determining sample size.
n = Z² σ² / e² : Z = constant (e.g. 1.96 for 95% confidence),
σ = SD of population,
e = acceptable error
n = Z² pq / e² : p = sample proportion, q = 1 - p
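A minimal sketch of the two formulae, assuming the result is rounded up to the next whole subject. The function names and the input values (Z = 1.96, σ = 10, p = 0.5 and the error margins) are illustrative assumptions, not figures from the seminar.

```python
import math

def sample_size_for_mean(z, sigma, e):
    """n = Z^2 * sigma^2 / e^2, rounded up to a whole subject."""
    return math.ceil(z ** 2 * sigma ** 2 / e ** 2)

def sample_size_for_proportion(z, p, e):
    """n = Z^2 * p * q / e^2 with q = 1 - p, rounded up."""
    q = 1 - p
    return math.ceil(z ** 2 * p * q / e ** 2)

# Illustrative inputs: 95% confidence (Z = 1.96).
n_mean = sample_size_for_mean(z=1.96, sigma=10, e=2)              # 96.04 -> 97
n_prop = sample_size_for_proportion(z=1.96, p=0.5, e=0.05)        # 384.16 -> 385
```

Halving the acceptable error e roughly quadruples the required sample size, since e enters the formula squared.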
ERRORS IN SAMPLING
Sampling errors
1. Faulty sampling design.
2. Small size of sample.
Non-sampling errors
1. Coverage errors.
2. Observational errors.
3. Processing errors.
PRESENTATION OF DATA
A. Tables
B. Diagrams/graphs
RULES FOR TABLE
1. Relevant title
2. Number of class intervals: 5 to 25
3. Class interval of equal width
4. Well defined class limits
5. Rows & columns clear
6. Units of measurements specified
7. Source of data mentioned
8. Groups tabulated in order
9. Reason for omission of certain data mentioned
DIAGRAMS / FREQUENCY DISTRIBUTION DRAWINGS
One of the most convincing & appealing ways of presenting data.
Makes it easy to study the relative values of frequencies and gives a “bird’s eye view”
Basic rules are :
Self explanatory title
Value of variable on x-axis , frequency on y- axis
Diagram clear, simple & consistent with data
Scale of presentation should be mentioned
COMMON DIAGRAMS
A. Quantitative/ continuous / measured data
1. Histogram
2. Frequency polygon
3. Frequency curve
4. Line chart/ graph
5. Cumulative frequency diagram
6. Scatter / dot diagram
B. Qualitative/ discrete / counted data
1. Bar diagram
2. Pie/sector diagram
3. Pictogram
4. Map diagram / spot diagram
BAR GRAPH;
Represent only one variable
Represent qualitative data
MULTIPLE BAR GRAPH;
• Compare qualitative data with respect to single variable
Multiple bar graph showing protein content of common foods in g per 100 g
of edible portion
PROPORTIONAL BAR DIAGRAM;
Represents a qualitative data
Comparison of data
Populations or groups compared with respect to single variable
Compare only the proportion of subgroups between major groups of
observations.
Area-wise prevalence of caries (Rural /Urban)
COMPONENT BAR DIAGRAM;
Represent both no of cases in a group & subgroup simultaneously
Division of bars proportional to no of cases in subgroups
PIE DIAGRAM / SECTOR DIAGRAM;
Shows the percentage breakdown of qualitative data
Angle of each sector (and hence its area) denotes frequency
Angle = (class frequency / total observations) × 360°
Cannot be used to represent 2 or more data sets
Grading of malocclusion
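The sector-angle rule for the pie diagram can be sketched as below. The malocclusion grades and their counts are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def sector_angles(frequencies):
    """Angle of each sector = class frequency / total observations * 360."""
    total = sum(frequencies.values())
    return {label: f / total * 360 for label, f in frequencies.items()}

# Hypothetical grading of malocclusion in 100 children.
grades = {"mild": 50, "moderate": 30, "severe": 20}
angles = sector_angles(grades)   # mild 180, moderate 108, severe 72 degrees
```

The angles always add up to 360°, whatever the class frequencies are.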
LINE DIAGRAM / GRAPH;
Simplest means of representing data
Useful in representing trends over time
X –axis represent time
Y –axis , value of any variable under study
HISTOGRAM;
Depict quantitative data of continuous type
Represents frequency distribution
X axis-size of the observation
Y axis- frequency
FREQUENCY POLYGON;
Represents frequency distributions
Comparative analysis
Area diagram developed over a histogram
Point marked over mid point of class interval
FREQUENCY CURVE;
Represents frequency distributions of quantitative data
Large no of observations , small group intervals
Continuous graph giving relative frequencies
CUMULATIVE FREQUENCY DIAGRAM;
Graph of cumulative relative frequency distribution
Ordinary frequency distribution table needs conversion to relative
cumulative frequency table
Cumulative frequency is total frequency, obtained by cumulating the
frequency of previous classes.
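The conversion from an ordinary frequency table to a cumulative and a relative cumulative one can be sketched as follows. The five class frequencies are hypothetical.

```python
from itertools import accumulate

def cumulative_relative_frequency(frequencies):
    """Running total of the class frequencies, divided by the grand total."""
    total = sum(frequencies)
    return [c / total for c in accumulate(frequencies)]

freqs = [2, 5, 10, 6, 2]                 # hypothetical class frequencies
cum = list(accumulate(freqs))            # [2, 7, 17, 23, 25]
rel = cumulative_relative_frequency(freqs)
```

The last cumulative frequency equals the total number of observations, so the last relative cumulative frequency is always 1.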
SCATTER OR DOT DIAGRAM;
Frequencies of two variables are represented.
Graphic presentation to show nature of correlation
Characters read on base and vertical axis and perpendicular drawn
from these readings meet to give one scatter point.
CARTOGRAMS OR SPOT MAP;
Used to show geographical distribution of frequencies of character
PICTOGRAM OR PICTURE DIAGRAM;
To impress the frequency of occurrence of health related events
MEASURES OF CENTRAL TENDENCY;
Value or parameter which serves as single estimate of a series of data
Summarizes the data
Enables comparison
One central value around which all other observations are dispersed
Concentration of all observations around the central value
Types
1. MEAN
2. MEDIAN
3. MODE
MEAN
ARITHMETIC AVERAGE
Mean = sum of all the observations / total number of observations
For grouped data:
Mean = Σ (value of variable × frequency) / total frequency
For grouped data with range (class intervals):
Mean = Σ (mid-point of class interval × frequency) / total frequency
x̄ = Σ Xi / n
Example:
ESR of 7 patients is 7,5,3,4,6,4,5 mins
Mean (x̄) = (7+5+3+4+6+4+5) / 7 = 34 / 7 = 4.86
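A short sketch of the ungrouped and grouped formulae. The ESR series is the one from the example; the grouped DMFT values and their frequencies are hypothetical.

```python
def mean(values):
    """Sum of all the observations divided by their number."""
    return sum(values) / len(values)

def grouped_mean(values, frequencies):
    """Sum of (value x frequency) divided by the total frequency."""
    total = sum(x * f for x, f in zip(values, frequencies))
    return total / sum(frequencies)

esr = [7, 5, 3, 4, 6, 4, 5]
esr_mean = round(mean(esr), 2)            # 34 / 7 = 4.86

# Hypothetical grouped data: DMFT scores 0..3 seen in 4, 3, 2 and 1 children.
dmft_mean = grouped_mean([0, 1, 2, 3], [4, 3, 2, 1])   # 10 / 10 = 1.0
```

For data grouped into class intervals, the same grouped_mean function applies with the class mid-points passed as the values.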
MEDIAN
Arrange the observations in ascending or descending order. The
middle observation is the median.
Examples:
DMFT of 7 children is 7,4,5,6,7,3,4
arranged in order = 3,4,4,5,6,7,7
median is 5
With an even number of observations, the median is the mean of the two middle values:
DMFT of 8 children is 10,9,4,5,8,3,7,6
arranged in order = 3,4,5,6,7,8,9,10
median = (6 + 7) / 2 = 6.5
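Both median examples can be checked with a small function. This is a sketch; Python's built-in statistics.median does the same thing.

```python
def median(values):
    """Middle observation of the ordered series; mean of the two middle ones if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_median = median([7, 4, 5, 6, 7, 3, 4])         # 5
even_median = median([10, 9, 4, 5, 8, 3, 7, 6])    # 6.5
```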
MODE
Mode: the value which occurs with the greatest frequency in a series of observations
For a moderately skewed distribution, an empirical relation gives: Mode ≈ 3 median - 2 mean
Example:
1,2,2,8,5,2,7,3,2
Mode is 2
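Finding the mode is a counting exercise, sketched below. The function returns a list because a series can have more than one mode; that choice is an assumption of this sketch, not something the seminar specifies.

```python
from collections import Counter

def modes(values):
    """Return every value that occurs with the greatest frequency."""
    counts = Counter(values)
    top = max(counts.values())
    return [value for value, count in counts.items() if count == top]

series_modes = modes([1, 2, 2, 8, 5, 2, 7, 3, 2])   # [2]
```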
MEASURES OF LOCATION – PERCENTILES
Centiles or percentiles: values in a series of observations arranged in ascending order of magnitude which divide the distribution into 100 equal parts (99 in number)
Median is the 50th centile
Quartiles: 3 in number, divide the distribution into 4 parts; the median is Q2
Quintiles: 4 in number, divide the distribution into 5 parts
Deciles: 9 in number, divide the distribution into 10 parts
Median = Q2 = D5 = 50th percentile (P50)
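The counts of cut points and the identity Median = Q2 = D5 = P50 can be checked with the standard library. This sketch assumes Python 3.8 or later, where statistics.quantiles is available, and reuses the ordered DMFT series 3,4,4,5,6,7,7. Other software may interpolate differently, so cut points other than the median can vary slightly between packages.

```python
import statistics

dmft = [3, 4, 4, 5, 6, 7, 7]

quartiles = statistics.quantiles(dmft, n=4)       # 3 cut points
quintiles = statistics.quantiles(dmft, n=5)       # 4 cut points
deciles = statistics.quantiles(dmft, n=10)        # 9 cut points
percentiles = statistics.quantiles(dmft, n=100)   # 99 cut points

q2 = quartiles[1]        # second quartile
d5 = deciles[4]          # fifth decile
p50 = percentiles[49]    # fiftieth percentile
```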
APPLICATION OF PERCENTILES
Location of percentiles give an idea about frequency distribution
Preparation of standards
Comparison of samples
MEASURES OF VARIABILITY
Dispersion is the degree of spread or variation of the variable about a
central value.
Uses :
Determine reliability of an average
Serve as a basis of control of variability
Comparison of two or more series
Facilitate further statistical analysis
A good measure of dispersion should be simple , easy to compute ,
based on all items , amenable for further analysis and not affected by
extreme values.
Of individual observations -
1. Range
2. Interquartile range
3. Mean deviation
4. Standard deviation
5. Coefficient of variation
Variability of samples-
1. Standard error of mean
2. Standard error of difference between 2 means
3. Standard error of proportion
4. Standard error of difference between 2 proportions
5. Standard error of correlation coefficient
6. Standard deviation of regression coefficient
RANGE
Difference between the values of the smallest & largest items
Range defines the normal limits of a biologic characteristic
Simple to calculate
Not based on all items
Subject to fluctuations from sample to sample
Eg. 3,4,4,5,6,7,7 RANGE = 7 - 3 = 4
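A brief sketch of the range for the series 3,4,4,5,6,7,7, whose largest item is 7 and smallest is 3. The extra value of 20 is a hypothetical outlier, added only to show how a single extreme observation changes the result.

```python
def value_range(values):
    """Largest item minus smallest item."""
    return max(values) - min(values)

dmft = [3, 4, 4, 5, 6, 7, 7]
dmft_range = value_range(dmft)                 # 7 - 3 = 4

# One hypothetical extreme observation changes the range completely.
range_with_outlier = value_range(dmft + [20])  # 20 - 3 = 17
```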
STANDARD DEVIATION
Root mean square deviation
It is a measure of the differences of the individual observations from the mean of all observations
The greater the standard deviation, the greater the magnitude of dispersion from the mean
A small S.D indicates a higher degree of uniformity of the observations
Calculation of S.D
Calculate the mean = x̄
Find the difference of each observation from the mean (deviation): d = Xi - x̄
Square these = d²
Total these = Σ d²
Divide this by the number of observations minus 1: variance = Σ d² / (n - 1)
Square root of this variance is the SD: SD = √( Σ d² / (n - 1) )
For grouped data: SD = √( Σ (Xi - x̄)² f / (N - 1) )
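The calculation steps for ungrouped data can be followed line by line in the sketch below. It reuses the ESR series 7,5,3,4,6,4,5 from the mean example; applying the steps to that series is an illustration added here, not a result quoted from the seminar.

```python
import math

def sample_sd(values):
    """Standard deviation with the n - 1 divisor, following the listed steps."""
    n = len(values)
    mean = sum(values) / n                             # calculate the mean
    deviations = [x - mean for x in values]            # d = Xi - mean
    squared = [d ** 2 for d in deviations]             # square these
    sum_of_squares = sum(squared)                      # total these
    variance = sum_of_squares / (n - 1)                # divide by n - 1
    return math.sqrt(variance)                         # square root gives the SD

esr = [7, 5, 3, 4, 6, 4, 5]
esr_sd = sample_sd(esr)   # about 1.35
```

Python's statistics.stdev uses the same n - 1 divisor, so it can serve as a cross-check.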
Fluoride concentration of water supply
Class interval   Frequency (f)   Mid-point (Xi)   f Xi    Xi - x̄   (Xi - x̄)²   (Xi - x̄)² f
0.2-0.3          1               0.25             0.25    -0.8     0.64        0.64
0.4-0.5          1               0.45             0.45    -0.6     0.36        0.36
0.6-0.7          1               0.65             0.65    -0.4     0.16        0.16
0.8-0.9          5               0.85             4.25    -0.2     0.04        0.20
1.0-1.1          10              1.05             10.50    0       0           0
1.2-1.3          4               1.25             5.00     0.2     0.04        0.16
1.4-1.5          1               1.45             1.45     0.4     0.16        0.16
1.6-1.7          1               1.65             1.65     0.6     0.36        0.36
1.8-1.9          0               1.85             0        0.8     0.64        0
2.0-2.1          1               2.05             2.05     1.0     1.00        1.00
Total            25                               26.25                        3.04
Mean x̄ = 26.25 / 25 = 1.05
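The grouped-data calculation can be recomputed directly from the class mid-points and frequencies of the fluoride table. This is a sketch: the variable names are assumptions, and the SD it produces is derived here rather than quoted from the seminar.

```python
import math

# Mid-points and frequencies of the ten class intervals in the fluoride table.
midpoints = [0.25, 0.45, 0.65, 0.85, 1.05, 1.25, 1.45, 1.65, 1.85, 2.05]
frequencies = [1, 1, 1, 5, 10, 4, 1, 1, 0, 1]

n = sum(frequencies)                                                  # 25
mean = sum(f * x for x, f in zip(midpoints, frequencies)) / n         # 26.25 / 25 = 1.05

# Sum of squared deviations from the mean, each weighted by its class frequency.
sum_sq = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, frequencies))

sd = math.sqrt(sum_sq / (n - 1))                                      # about 0.36
```

Summing the last column row by row gives 3.04, so the grouped SD is √(3.04 / 24), roughly 0.36.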