Introduction to Biostatistics
Biostatistics
• General concepts and terminology
• Sampling methods
• Measurement of location, scale and shape
• Contingency tables and chi-square test
• Comparison of means, t-test, multiple range test
• Simple experimental design and analysis of variance
• Correlation and regression analysis
• Introduction to multivariate methods
Introduction to Biostatistics
There are lies, damned lies, and statistics
‘It is easy to lie with statistics, but easier to lie without them.’
-Frederick Mosteller
Statistics
• A field of study concerned with the collection, analysis, presentation and interpretation
of data, and with the drawing of inferences about a body of data when only a part of the
data is observed.
Data
• Categorical data consist of the classification of the members of the sample into a limited
number of classes on the basis of some property of the members
o Flower colour
o Gender
• Results from the process of counting
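As a minimal sketch of how data that result from counting can be summarised, the following Python snippet tallies a small, made-up sample of flower colours. The sample values and variable names are illustrative assumptions, not data from these notes.

```python
from collections import Counter

# Hypothetical sample: flower colour recorded for ten plants (illustrative only).
flower_colours = [
    "red", "white", "red", "pink", "white",
    "red", "pink", "red", "white", "red",
]

# Categorical data are summarised by counting the members in each class.
counts = Counter(flower_colours)

# Relative frequency of each class: class count divided by sample size.
n = len(flower_colours)
proportions = {colour: count / n for colour, count in counts.items()}

# Print the classes from most to least frequent.
for colour, count in counts.most_common():
    print(f"{colour}: {count} ({proportions[colour]:.0%})")
```

The counts, not the colour labels themselves, are the numbers that later analyses such as contingency tables work with.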
Measurement Data
Data Collection
Surveys: In surveys, the researcher is typically interested in describing some population - there
is usually no attempt to manipulate units within the population.
Experiments: In experiments, units from the population are manipulated in some fashion and a
response to the manipulation is observed.
Variables
Qualitative - differs in kind or quality but not in amount. Example: eye colour
Scales of Measurement
Non-parametric scales: nominal, ordinal. Parametric scales: interval, ratio.
• Nominal or Categorical
o allow for only qualitative classification
o we cannot quantify or even rank order those categories
o gender, race, ethnic background, colour, city, etc.
o may be represented by "dummy codes"
• Ordinal or Rank
o Ordinal variables allow us to rank order the items we measure in terms of which
has less and which has more of the quality represented by the variable, but they
still do not allow us to say "how much more." Examples: socioeconomic status
of families, hardness of rocks, beauty
• Interval
o a parametric scale (i.e., it has a fixed unit of measurement)
o has an arbitrary zero point (which means the zero point does not truly reflect
absence of the characteristic).
o temperature, as measured in degrees Fahrenheit or Celsius, constitutes an
interval scale.
o We can say that a temperature of 40 degrees is higher than a temperature of 30
degrees, and that an increase from 20 to 40 degrees is twice as much as an
increase from 30 to 40 degrees.
• Ratio
o A metric scale because it has an absolute zero (which truly reflects absence of
the characteristic).
o Ratio variables are very similar to interval variables; in addition to all the
properties of interval variables, they feature an identifiable absolute zero point,
thus they allow for statements such as x is two times more than y.
o Kelvin temperature, speed, height
o The Kelvin temperature scale is a ratio scale: not only can we say that a
temperature of 200 K is higher than one of 100 K, we can also correctly
state that it is twice as high.
Interval vs Ratio Scales
Interval scales do not have the ratio property. Most statistical data analysis procedures do
not distinguish between the interval and ratio properties of the measurement scales.
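The difference between interval and ratio scales can be illustrated with a short Python sketch. It uses the standard Celsius-to-Kelvin offset of 273.15; the function and variable names are assumptions chosen for this example.

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Convert an interval-scale reading (Celsius) to the ratio-scale Kelvin."""
    return celsius + 273.15


t_low_c, t_high_c = 20.0, 40.0

# Differences are meaningful on both scales and have the same size.
diff_c = t_high_c - t_low_c
diff_k = celsius_to_kelvin(t_high_c) - celsius_to_kelvin(t_low_c)

# Ratios are only meaningful on the ratio scale, where zero is absolute.
naive_ratio_c = t_high_c / t_low_c  # 2.0, but 40 C is not "twice as hot" as 20 C
ratio_k = celsius_to_kelvin(t_high_c) / celsius_to_kelvin(t_low_c)

print(f"Difference: {diff_c:.1f} C = {diff_k:.1f} K")
print(f"Naive Celsius ratio: {naive_ratio_c:.2f}")
print(f"Kelvin ratio: {ratio_k:.3f}")
```

The Celsius ratio changes if the arbitrary zero point is moved, while the Kelvin ratio does not, which is why only the latter supports statements such as "twice as high".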
Relationships between Variables
Population vs Sample
Parameter vs Statistic
• Provided the sample adequately represents the population (is sufficiently large and
unbiased), the sample statistics should be reliable estimates of the population
parameters of interest.
• Most statistical procedures assume that sample observations have been drawn
randomly from populations, to maximize the likelihood that the sample will truly
represent the population.
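The relationship between population parameters and sample statistics can be sketched with a small simulation. The population below is simulated (heights in cm with an assumed mean of 170 and standard deviation of 10), and the sample size of 500 is an arbitrary choice for illustration.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: 10,000 simulated heights in cm (assumed values).
population = [random.gauss(170, 10) for _ in range(10_000)]

# Population parameters: computed from every member of the population.
mu = statistics.mean(population)
sigma = statistics.pstdev(population)

# Simple random sample drawn without replacement.
sample = random.sample(population, 500)

# Sample statistics: estimates of the parameters above.
x_bar = statistics.mean(sample)
s = statistics.stdev(sample)

print(f"Population mean {mu:.2f}, sample mean {x_bar:.2f}")
print(f"Population SD   {sigma:.2f}, sample SD   {s:.2f}")
```

With a random sample of this size, the sample mean and standard deviation should fall close to the population values; a small or biased sample would give no such assurance.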
The science of statistics is all about measurement and variation. If there were no variation,
there would be no need for statistical methods.
Statistical software
Graphical software
Suggested Readings