1 - Biol 605 Summer - Introduction


Biol 605-BIOSTATISTICS

By:

Shibru Temesgen (PhD)


Associate Professor
Department of Statistics, Addis Ababa University,
Email: [email protected]
Mobile: 0911 -17 65 17
BIOL 605-BIOSTATISTICS: COURSE CONTENT

1)Introduction
2)Parametric Tests
3)Nonparametric Methods
4)Regression Models
INTRODUCTION
Focus Points

• Identify variables in a study.


• Identify populations and samples.
• Distinguish between parameters and statistics.
• Distinguish between quantitative and qualitative variables.
• Determine the level of measurement.
• Compare descriptive and inferential statistics.
• Identify sources of measurement variability.
• Identify data collection methods (reading assignment).
• Identify sampling techniques.
Data and Statistics
Data consists of information coming from observations, counts,
measurements, or responses.

Statistics is the subject that deals with the collection, [organization],


presentation, analysis and interpretation of numerical data.

 Statistics is a set of scientific principles and techniques that are useful in


reaching conclusions about populations and processes when the available
information is both limited and variable; that is, statistics is the science of
learning from data.
Biostatistics
• Application of mathematical statistics in the field of life
science
 Concerned with interpretation of biological data & the
communication of information derived from these data
 Has central role in life science investigations
Why do we need to study Biostatistics?
Three reasons:
(1)Basic requirement of life science research.
(2)Update your life science knowledge.
(3)Data management and treatment.
Branches of Statistics
Depending on how data are used, statistics has two major
branches: descriptive statistics and inferential statistics.

• Descriptive statistics: involves the organization, summarization, and display of data.
• Inferential statistics: involves using a sample to draw conclusions about a population.
Descriptive vs. Inferential Statistics
(1) If the intent of the study is to examine and
explore the information obtained for its own
intrinsic interest only, the study is
descriptive.

(2) If the information is obtained from a sample of


a population and the intent of the study is to
use that information to draw conclusions
about the population, the study is
inferential.
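
As a rough illustration of the distinction, the Python sketch below treats one sample in both ways. The measurements are invented for illustration, and the interval uses the large-sample normal multiplier 1.96 as a simplifying assumption (a t multiplier would suit a sample this small better).

```python
import math
import statistics

# Hypothetical sample: plant heights in cm (values invented for illustration).
sample = [12.1, 13.4, 11.8, 12.9, 13.0, 12.5, 11.9, 13.2, 12.7, 12.4]

# Descriptive use: summarize the sample for its own sake.
mean = statistics.mean(sample)
sd = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)

# Inferential use: draw a conclusion about the population mean.
# Assumption: normal approximation, multiplier 1.96 for roughly 95% coverage.
se = sd / math.sqrt(len(sample))
ci_low, ci_high = mean - 1.96 * se, mean + 1.96 * se

print(f"sample mean = {mean:.2f}, sample sd = {sd:.2f}")
print(f"approximate 95% interval for the population mean: ({ci_low:.2f}, {ci_high:.2f})")
```

The first two numbers describe only the ten observed values; the interval is a statement about the population the sample was drawn from.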
Types of Data
Data sets can consist of two types of data: qualitative data and
quantitative data.

• Qualitative (categorical) data: consists of attributes, labels, or nonnumeric entries.
• Quantitative data: consists of numerical measurements or counts.
Types of Quantitative variables
Quantitative variables are of two types:

• Discrete: can assume only certain values, and there are usually "gaps" between the values, such as the number of bedrooms in your house.
• Continuous: can assume any value within a specific range, such as the air pressure in a tire.
Caution!!!

• Most quantitative variables can be asked about in such a way as to make them categorical variables.
– Example: Age:
• Please choose the category that describes how old you are
– 0-19 20-39 40-59 60-79 80 and over
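
The recoding in the age example can be sketched in Python as follows. The function name `age_category` is hypothetical, and the last label is written as "80 and over" so that age 80 itself falls into a category.

```python
def age_category(age):
    """Map an age in completed years to one of five ordered categories."""
    if age < 0:
        raise ValueError("age must be non-negative")
    if age <= 19:
        return "0-19"
    if age <= 39:
        return "20-39"
    if age <= 59:
        return "40-59"
    if age <= 79:
        return "60-79"
    return "80 and over"


# Hypothetical ages, chosen to hit every category and the boundaries.
ages = [4, 19, 20, 45, 63, 80, 91]
categories = [age_category(a) for a in ages]
print(categories)
```

Once recoded, the exact ages cannot be recovered from the categories, so the variable has lost information.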
There are three kinds of lies:
Lies, Damned Lies and (Misused) Statistics
Uses and Misuses of Statistics in Research

• It is clear that statistics plays a fundamental role in scientific research; however, there are some common areas of misuse:
A. Measurement and Scaling
B. The concept of sampling
C. Descriptive vs. Inferential Statistics
D. Graphs and data visualizations
E. Regression Related Misuses
F. Data Reduction: Multivariate Aspect
IMPORTANCE OF STATISTICS IN SCIENTIFIC RESEARCH

1) Statistics guide researchers in the direction for


proper characterization, summarization,
presentation and interpretation of the results from
their research.

2) Statistics is very important when it comes to the


conclusion of a research project.

3) Statistical tests help to answer research questions.


MEASUREMENT
Levels of Measurement
The level of measurement determines which statistical
calculations are meaningful. The four levels of measurement are:
nominal, ordinal, interval, and ratio.

Levels of measurement, lowest to highest:
Nominal
Ordinal
Interval
Ratio
Levels of Measurement

1. Nominal Scale
2. Ordinal Scale
3. Interval Scale
4. Ratio Scale
Nominal Scale
A nominal scale is a figurative labeling scheme in which the numbers
serve only as labels or tags for identifying and classifying objects.

Classifies data according to a category only.


 E.g., which color people select.
Colors differ qualitatively not quantitatively.

A number could be assigned to each color, but it


would not have any value.
The number serves only to identify the color.

No assumptions are made that any color has


more or less value than any other color.
Nominal Scale
• Assigns subjects to groups or categories
• No order or distance relationship
• No arithmetic origin
• Only counts of the numbers in each category are meaningful
• Only percentages of categories can be presented
• Chi-square is the most often used test of statistical significance
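
To make the chi-square remark concrete, here is a minimal sketch of a goodness-of-fit statistic for the color example. The counts are invented, the null hypothesis of equal preference is an assumption made for illustration, and 5.991 is the standard 5% critical value of the chi-square distribution with 2 degrees of freedom.

```python
# Hypothetical counts of the color each of 60 people selected.
observed = {"red": 30, "blue": 20, "green": 10}

# Assumed null hypothesis: every color is equally likely to be chosen.
total = sum(observed.values())
expected = total / len(observed)

# Chi-square statistic: sum of (observed - expected)^2 / expected.
chi_square = sum((count - expected) ** 2 / expected for count in observed.values())

# Degrees of freedom = number of categories - 1 = 2.
CRITICAL_5PCT_DF2 = 5.991
reject_equal_preference = chi_square > CRITICAL_5PCT_DF2

print(f"chi-square = {chi_square:.2f}, reject at 5% level: {reject_equal_preference}")
```

Only the category counts enter the calculation, which is why the test is suited to nominal data.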
Ordinal Scale
An ordinal scale is a ranking scale in which numbers are assigned to
objects to indicate the relative extent to which the objects possess some
characteristic.

Classifies data according to some order or rank, e.g.
names ordered alphabetically.
With ordinal data, it is fair to say that one response
is greater or less than another.

E.g. if people were asked to rate the hotness of 3 chili


peppers, a scale of "hot", "hotter" and "hottest" could
be used. Values of "1" for "hot", "2" for "hotter" and
"3" for "hottest" could be assigned.
The gap between the
items is unspecified.
Ordinal Scale

Can include opinion and preference scales


Median but not mean
No unique, arithmetic origin
Items can be ranked, but the distances between ranks are not meaningful
Ordering is the sole property
In some research practice, ordinal scale variables are often
treated as interval scale variables
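
The "median but not mean" point can be checked with a small Python sketch based on the chili pepper example. The ratings and the alternative coding (1, 2, 10) are invented for illustration; any coding that preserves the order is equally legitimate for ordinal data.

```python
import statistics

labels = {1: "hot", 2: "hotter", 3: "hottest"}

# Hypothetical ratings from seven people, coded 1, 2, 3.
ratings = [1, 1, 2, 2, 2, 3, 3]

# An equally valid ordinal coding: same order, different gaps.
recode = {1: 1, 2: 2, 3: 10}
recoded = [recode[r] for r in ratings]
decode = {new: old for old, new in recode.items()}

# The median identifies the same category under both codings.
median_original = labels[statistics.median(ratings)]
median_recoded = labels[decode[statistics.median(recoded)]]

# The mean depends on the arbitrary gaps between codes.
mean_original = statistics.mean(ratings)
mean_recoded = statistics.mean(recoded)

print(median_original, median_recoded)
print(mean_original, mean_recoded)
```

Treating ordinal codes as interval values therefore amounts to assuming that the gaps between codes are equal.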
Interval Scale
Numerically equal distances on the scale represent equal
values in the characteristic being measured. An interval
scale contains all the information of an ordinal scale, but it
also allows you to compare the differences between objects.
Assumes that the measurements are made in equal units.
i.e. gaps between whole numbers on the scale are equal.
e.g. Fahrenheit and Celsius temperature scales
An interval scale does not have a true zero.
e.g. A temperature of "zero" does not mean that there is no
temperature...it is just an arbitrary zero point.
Can't perform the full range of arithmetic operations: ratios of values are not meaningful.
40 degrees is not twice as hot as 20 degrees
Permissible statistics: count/frequencies, mode, median, mean,
standard deviation
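
The temperature example can be verified numerically. The sketch below uses the standard Celsius-to-Fahrenheit and Celsius-to-Kelvin conversions; the Kelvin comparison is an addition for contrast, since Kelvin has a true zero and is therefore a ratio scale.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


def c_to_k(celsius):
    """Convert degrees Celsius to kelvins."""
    return celsius + 273.15


warm, cool, mid = 40.0, 20.0, 30.0

# Ratios of values change with the (arbitrary) zero point.
ratio_c = warm / cool                    # 2.0 in Celsius
ratio_f = c_to_f(warm) / c_to_f(cool)    # 104 / 68 in Fahrenheit
ratio_k = c_to_k(warm) / c_to_k(cool)    # ratio on a scale with a true zero

# Ratios of differences do not: intervals are meaningful.
diff_ratio_c = (warm - cool) / (mid - cool)
diff_ratio_f = (c_to_f(warm) - c_to_f(cool)) / (c_to_f(mid) - c_to_f(cool))

print(f"value ratios: C={ratio_c:.3f}, F={ratio_f:.3f}, K={ratio_k:.3f}")
print(f"difference ratios: C={diff_ratio_c:.3f}, F={diff_ratio_f:.3f}")
```

The "twice as hot" claim holds only in Celsius, while the comparison of differences gives the same answer in both interval scales.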
Ratio Scale
In ratio scales we can identify or classify objects, rank the
objects, and compare intervals or differences. It is also
meaningful to compare ratios of scale values.

Indicates actual amount of variable


o Shows magnitude of differences between points on scale
o Shows proportions of differences
All statistical techniques useable
Most powerful with most meaningful answers
Allows comparisons of absolute magnitudes
Allows you to compare differences between numbers.
If a train journey takes 2 hours and 30 min, then this is half as
long as a journey which takes 5 hours.
Summary of Levels of Measurement

Level of      Categorical  Arrange data  Subtract     Determine if one data value
measurement   data         in order      data values  is a multiple of another
Nominal       Yes          No            No           No
Ordinal       Yes          Yes           No           No
Interval      No           Yes           Yes          No
Ratio         No           Yes           Yes          Yes
USAGE POTENTIAL OF VARIOUS
LEVELS OF DATA

Ratio
Interval
Ordinal
Nominal
Data Level, Operations, and Statistical Methods

Data Level   Meaningful Operations                                                        Statistical Methods
Nominal      Classifying and Counting                                                     Nonparametric
Ordinal      All of the above plus Ranking                                                Nonparametric
Interval     All of the above plus Addition, Subtraction, Multiplication but no Division  Parametric
Ratio        All of the above and Division                                                Parametric
A study can only be as good as the data . . .

-Martin Bland
Reproducibility vs Validity
• Reproducibility
– the degree to which a measurement provides the
same result each time it is performed on a given
subject or specimen

• Validity
– from the Latin validus - strong
– the degree to which a measurement truly
measures (represents) what it purports to
measure (represent)
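
The two concepts can be separated in a small simulation. In the sketch below, the true value, the two instruments, and their bias and noise settings are all hypothetical; the mean error stands in for validity and the standard deviation of repeated readings stands in for reproducibility.

```python
import random
import statistics

TRUE_VALUE = 100.0  # hypothetical true value of the quantity being measured


def simulate(bias, noise_sd, n, rng):
    """Return n repeated readings with a fixed bias and Gaussian noise."""
    return [TRUE_VALUE + bias + rng.gauss(0, noise_sd) for _ in range(n)]


def summarize(readings):
    """Return (mean error, spread): proxies for validity and reproducibility."""
    mean_error = statistics.mean(readings) - TRUE_VALUE
    spread = statistics.stdev(readings)
    return mean_error, spread


rng = random.Random(605)  # fixed seed so the run is repeatable

# Instrument A: good reproducibility, poor validity (tight but off target).
error_a, spread_a = summarize(simulate(bias=5.0, noise_sd=0.1, n=1000, rng=rng))

# Instrument B: poor reproducibility, good validity (on target but scattered).
error_b, spread_b = summarize(simulate(bias=0.0, noise_sd=5.0, n=1000, rng=rng))

print(f"A: mean error = {error_a:.2f}, spread = {spread_a:.2f}")
print(f"B: mean error = {error_b:.2f}, spread = {spread_b:.2f}")
```

Repeating a measurement reveals poor reproducibility, but it cannot reveal poor validity unless the true value is known from another source.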
Reproducibility vs Validity
• Reproducibility
– reliability, repeatability, precision, variability,
dependability, consistency, stability

• Validity
– accuracy
Relationship Between Reproducibility and Validity

[Figure: four panels showing the possible combinations]
• Good reproducibility, poor validity
• Poor reproducibility, good validity
• Good reproducibility, good validity
• Poor reproducibility, poor validity
Sources of Measurement Variability

• Observer
– within-observer
– between-observer
• Instrument
– within-instrument
– between-instrument
• Subject
– within-subject
Experimental Design
Designing a Study
GUIDELINES:
1. Identify the variable(s) of interest (the focus) and
the population of the study.
2. Develop a detailed plan for collecting data. If you
use a sample, make sure the sample is
representative of the population.
3. Collect the data.
4. Describe the data.
5. Interpret the data and make decisions about the
population using inferential statistics.
6. Identify any possible errors.
Methods of Data Collection

In an observational study, a researcher observes and measures


characteristics of interest of part of a population.
In an experiment, a treatment is applied to part of a population,
and responses are observed.
A simulation is the use of a mathematical or physical model to
reproduce the conditions of a situation or process.
A survey is an investigation of one or more characteristics of a
population.
A census is a measurement of an entire population.

A sampling is a measurement of part of a population.


Reasons for Sampling
– Reduced cost
– Greater speed
– Greater accuracy
– Greater scope
– Avoids destructive test
– The only option when the population is infinite
Sometimes taking a census makes more sense than using a
sample. Some of the reasons include:
• Universality
• Qualitativeness
• Detailedness
• Non-representativeness
Sampling Techniques
• There are two types of sampling techniques:
A) Random Sampling
• Simple Random Sampling
• Stratified Random Sampling
• Cluster Random Sampling
• Systematic Random Sampling
B) Non-Random Sampling
• Judgment Sampling
• Convenience Sampling
• Quota Sampling
Simple Random Sampling

– Selected by using chance or


random numbers
– Each individual subject (human
or otherwise) has an equal
chance of being selected
– Examples:
• Drawing names from a hat
• Random Numbers
How to Use the Random Number Table
• Number each member of the population 1 to N.
• Determine the population size and the sample size.
• Select a starting point on the random number table.
• Choose a direction in which to read (top to bottom, left to right, or right
to left).
• Select the first n numbers (n being your sample size) whose last X digits
are between 1 and N, where X is the number of digits in N. For instance, if N
is a 3-digit number, then X would be 3. Skip any number that falls outside
this range or that has already been selected.
• Continue this way through the table until you have selected your
entire sample, whatever your n is. The numbers you selected then
correspond to the numbers assigned to the members of your
population, and those selected become your sample.
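How to Select an SRS Using a Computer Program
• Several computer programs can assign numbers and select n random numbers
quickly and easily. In Excel, for example, =RANDBETWEEN(1, N) returns a random
integer between 1 and N; repeated calls can return the same number, so
duplicates need to be redrawn.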
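
As one possible computer-based alternative, the sketch below draws a simple random sample in Python. The function name `simple_random_sample` and the population and sample sizes are hypothetical; `random.sample` selects without replacement, so no number is chosen twice.

```python
import random


def simple_random_sample(population_size, sample_size, seed=None):
    """Draw sample_size distinct numbers from 1..population_size."""
    if not 0 <= sample_size <= population_size:
        raise ValueError("sample_size must be between 0 and population_size")
    rng = random.Random(seed)
    # Members are numbered 1 to N; each has an equal chance of selection.
    return sorted(rng.sample(range(1, population_size + 1), sample_size))


# Hypothetical study: choose 10 of 250 numbered members.
selected = simple_random_sample(population_size=250, sample_size=10, seed=1)
print(selected)
```

Passing a seed makes the selection repeatable, which can help when the sampling procedure has to be documented.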
Stratified Random Sampling
A stratified sample has members from each segment of a
population. This ensures that each segment from the population is
represented.
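
A minimal sketch of stratified selection is given below. The teacher list, the department names, and the choice of an equal number per stratum are assumptions made for illustration; proportional allocation is another common choice.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, per_stratum, seed=None):
    """Randomly select up to per_stratum units from every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for name in sorted(strata):
        members = strata[name]
        # Every stratum contributes, so each segment is represented.
        sample.extend(rng.sample(members, min(per_stratum, len(members))))
    return sample


# Hypothetical population: (teacher, department) pairs.
teachers = [(f"T{i}", dept)
            for i, dept in enumerate(["Biology"] * 6 + ["Chemistry"] * 4 + ["Physics"] * 5)]

chosen = stratified_sample(teachers, stratum_of=lambda t: t[1], per_stratum=2, seed=3)
print(chosen)
```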
Cluster Random Sampling
A cluster sample has all members from randomly selected
segments of a population. This is used when the population falls
into naturally occurring subgroups.

All members in
each selected
group are used.
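
Cluster selection can be sketched in the same style. The departments and their members below are hypothetical; the point of the sketch is that the random choice is made among whole clusters, and every member of a chosen cluster is then included.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select n_clusters clusters and keep all of their members."""
    rng = random.Random(seed)
    chosen_names = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen_names}


# Hypothetical naturally occurring subgroups: departments and their teachers.
departments = {
    "Biology": ["B1", "B2", "B3"],
    "Chemistry": ["C1", "C2"],
    "Physics": ["P1", "P2", "P3", "P4"],
    "Statistics": ["S1", "S2"],
}

sampled = cluster_sample(departments, n_clusters=2, seed=7)
print(sampled)
```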

 Used extensively by government and private research organizations


Systematic Random Sampling
A systematic sample is a sample in which each member of the
population is assigned a number. A starting number is randomly
selected and sample members are selected at regular intervals.

For example, with an interval of four, every fourth member is chosen.
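
The procedure can be sketched as follows. The function name `systematic_sample`, the population size of 30, and the interval of 4 are illustrative assumptions; the random start is drawn from the first interval so that every member has a chance of selection.

```python
import random


def systematic_sample(population_size, interval, seed=None):
    """Pick a random start in 1..interval, then every interval-th member."""
    if interval < 1:
        raise ValueError("interval must be at least 1")
    rng = random.Random(seed)
    start = rng.randint(1, interval)  # randint includes both endpoints
    return list(range(start, population_size + 1, interval))


# Hypothetical population of 30 numbered members, taking every fourth one.
picked = systematic_sample(population_size=30, interval=4, seed=2)
print(picked)
```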


Convenience Samples
A convenience sample consists only of available members of the
population.
• Uses subjects that are easily accessible
Examples:
• Using family members or students in a classroom
• Mall shoppers
Example
You are doing a study to determine the number of years of water treatment
education each teacher at your college has. Identify the sampling technique
used if you select the samples listed.
1.) You randomly select two different departments and survey each teacher
in those departments.
Answer: cluster sample

2.) You select only the teachers you currently have this semester.
Answer: convenience sample

3.) You divide the teachers up according to their department and then
choose and survey some teachers in each department.
Answer: stratified sample
