Lecture-1 - Ch-1 - Basic Concept

The document outlines a biostatistics course, covering key topics such as data organization, measures of central tendency, probability, sampling techniques, and hypothesis testing. It defines statistics and biostatistics, emphasizing their importance in health-related research and decision-making. Additionally, it discusses various data collection methods and the applications, uses, and limitations of statistics in real-world scenarios.


Biostatistics

Abebe N.
Department of Statistics, Jimma University
Email: [email protected]
October, 2024

Jimma, Ethiopia

Outline of the course (chapters)

1. Introduction
2. Methods of data organization and presentation
3. Measures of central tendency and measures of variation
4. Elementary probability and probability distribution
5. Sampling and sampling technique
6. Estimation and hypothesis testing

Chapter One: Introduction

At the end of this chapter, students will be able to:

o Define statistics, population, census, sample survey, parameter, and variable.
o Distinguish between descriptive statistics and inferential statistics.
o Identify the types of variables and levels of measurement.
o Identify applications, uses, and limitations of statistics.
o Describe the types and methods of data collection.

1. Introduction
What is statistics?
o Statistics is defined as the systematic collection, organization,
analysis, and interpretation of data for the purpose of:
 Answering a question
 Solving a problem, or
 Adding to the body of knowledge

o Statistics is a crucial and indispensable tool for arriving at
evidence-based policy and can help to solve complex problems related to
health, food, and livelihood. In short, it is the science of learning
from data.
o Researchers use it to generate valid evidence for program planning, policy
making, intervention, evaluation, and decision making.

What is biostatistics? It is the branch of statistics directed
toward applications in the biological sciences and medicine.
o It is treated as a branch of its own because some statistical methods are
more heavily used in health applications than elsewhere.
o Biostatistics provides the most fundamental tools and techniques
of the scientific method for generating information. It is used for:
 Gathering and summarizing data
 Forming and testing hypotheses
 Designing experimental and observational studies
 Drawing inferences from data
 Helping scientists and health professionals make informed
decisions based on data
Example: Determining the major risk factors for heart disease,
lung disease, and cancer; testing new drugs to combat AIDS.
Classification of statistics
Depending on how the data are used, statistics is divided into:
o Descriptive statistics: summarizes and describes the main features of
the observed data using statistical measures (e.g., sample
mean, sample variance, sample proportion, ...) and displays (e.g.,
graphs, charts, tables).
o Inferential statistics: generalizes the findings from the sample to
the population. It includes:
 Estimation and hypothesis testing
 Determining relationships
 Making predictions
o Inferential statistics uses probability theory to estimate how likely
the conclusions drawn from the sample are to be true for the
entire population.
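
The descriptive measures named above (sample mean, sample variance, sample proportion) can be sketched in a few lines of Python. The blood pressure readings and the 130 mm Hg cut-off below are made-up values used only for illustration; they do not come from the lecture.

```python
import statistics

# Hypothetical systolic blood pressure readings (mm Hg) from a sample of 8 patients.
systolic_bp = [118, 125, 132, 140, 121, 135, 128, 145]

# Sample mean and sample variance (statistics.variance divides by n - 1).
sample_mean = statistics.mean(systolic_bp)
sample_variance = statistics.variance(systolic_bp)

# Sample proportion of patients at or above an assumed cut-off of 130 mm Hg.
high_count = sum(1 for bp in systolic_bp if bp >= 130)
sample_proportion = high_count / len(systolic_bp)

print(f"mean = {sample_mean}")
print(f"variance = {sample_variance:.2f}")
print(f"proportion >= 130 = {sample_proportion}")
```

These three numbers only describe the eight observed patients; saying anything about all patients is the job of inferential statistics.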
Inference …. Cont’d
o Inference is important because statistical data usually arise from a
sample.
o It requires a strong link between the
sample and the population about which one wishes to draw
conclusions.
Valid inferential statistics requires:
 Correct statistical methodology
 Correct interpretation of results
 Statistical techniques based on probability theory
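
As a rough illustration of how probability theory links a sample to its population, the sketch below computes a confidence interval for a population mean. It assumes a large-sample normal approximation (for small samples a t critical value would normally replace z), and both the function name `mean_confidence_interval` and the haemoglobin values are hypothetical.

```python
import math
import statistics

def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for a population mean."""
    n = len(sample)
    mean = statistics.mean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    # Two-sided critical value from the standard normal distribution.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * standard_error, mean + z * standard_error

# Hypothetical haemoglobin values (g/dL) from a sample of 12 patients.
haemoglobin = [13.1, 12.4, 14.0, 11.8, 13.5, 12.9, 13.7, 12.2, 14.3, 13.0, 12.6, 13.4]

lower, upper = mean_confidence_interval(haemoglobin)
print(f"95% CI for the population mean: ({lower:.2f}, {upper:.2f})")
```

The interval is only as trustworthy as the link between sample and population: if the sample is not representative, the arithmetic is still correct but the conclusion is not.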

Stages of scientific investigation
1. Collection of data: the process of measuring or gathering
raw data to meet predefined objectives.
 Care should be exercised at this stage (garbage data in,
garbage results out).
Data can be collected in a variety of ways; the most common
choices include:
 Observation
 Interview: face-to-face vs. telephone
 Self-administered questionnaire
2. Organization of data: arranging and organizing the data based on
some common characteristics. It is necessary to:
 Edit the data to minimize recording errors
 Classify the data, and
 Tabulate the data.
Cont’d……stage
3. Presentation of the data:
 Provides an overview of what the data actually look like.
 Makes the data easier for the audience to understand and interpret.
 Allows the audience to make informed decisions, draw conclusions, or take
appropriate actions based on the data.
 Can be done in the form of tables and graphs/diagrams.
4. Analysis of data:
o Extracts useful, relevant information from the data for
decision-making (e.g., mean, median, mode,
range, and variance).
o Involves the processes of cleaning, transforming, and modeling data to
discover useful information.

Cont’d……stage
5. Interpretation of data:
o Concerned with drawing conclusions from the collected and
analyzed data and giving meaning to the results.
o Generalizes the findings from the sample to the target
population.
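
The first four stages can be walked through on a tiny data set. This is a minimal sketch under stated assumptions: the ages, the plausibility limits (0 to 120 years), and the two age classes are invented for illustration.

```python
from collections import Counter
import statistics

# 1. Collection: hypothetical ages as recorded (None = missing, 250 = recording error).
raw_ages = [34, 41, None, 29, 250, 41, 37, 29, 41]

# 2. Organization: edit out missing or implausible values, then classify and tabulate.
clean_ages = [age for age in raw_ages if age is not None and 0 <= age <= 120]

def age_group(age):
    """Classify an age into one of two assumed classes."""
    return "under 35" if age < 35 else "35 and over"

frequency_table = Counter(age_group(age) for age in clean_ages)

# 3. Presentation: a plain-text frequency table.
for group, count in sorted(frequency_table.items()):
    print(f"{group:<12}{count}")

# 4. Analysis: the summary measures named in the text.
summary = {
    "mean": statistics.mean(clean_ages),
    "median": statistics.median(clean_ages),
    "mode": statistics.mode(clean_ages),
    "range": max(clean_ages) - min(clean_ages),
}
print(summary)
```

Stage 5, interpretation, is not something the code can do: deciding what these numbers mean for the target population remains the analyst's task.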

Definition of some basic terms
o Target population: the entire group of individuals about which inferences are to
be made. It represents the target of an investigation, and the objective of the
investigation is to draw conclusions about this population.
E.g. A researcher conducted a study on the prevalence of HIV among orphan
children in Ethiopia; a random sample of orphan children in some
selected towns was included. All orphan children in Ethiopia form the target
population.
o Study population: the group of individuals from the target population who are
available and willing to participate.
E.g. Orphan children in the selected towns of Ethiopia.
 In research, it is usually not practical to include all members of a population.
o Sample: a subset of the study population, about which information is actually
obtained.
E.g. The orphan children selected from those towns who participate in the study.
o Sampling: the process or method of selecting a sample from the
population.

Cont’d…….Basic term

[Diagram: information is collected from a comparatively small sample and used to draw conclusions about a rather large population.]

o To draw valid conclusions from your results, you have to carefully decide
how you will select a sample that is representative of the target population.
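
One common way to make the selection step reproducible is simple random sampling without replacement. The sketch below is only an illustration: the population of 500 ID numbers, the sample size of 50, and the seed are assumptions, and other sampling techniques may suit a given study better.

```python
import random

# Hypothetical study population: ID numbers of 500 children available to participate.
study_population = list(range(1, 501))
sample_size = 50

# A fixed seed makes the draw reproducible, so the selection can be audited.
rng = random.Random(2024)

# Simple random sampling without replacement: every ID has the same chance of selection.
sample = rng.sample(study_population, sample_size)

print(sorted(sample)[:10])
```

Because `random.Random.sample` draws without replacement, no child can be selected twice.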
Defn…Cont’d
 Sample size: the number of elements or observations to be
included in the sample.
 Survey: a method of collecting data from a subset of individuals
to gather insights about a larger population. Its advantages include:
 Reduced cost
 Savings in time and energy
 Greater accuracy
 Census: the complete enumeration of every individual in the entire
population, i.e., the collection of data from every element in the
population.
 Parameter: a characteristic or measure obtained from a
population.
 Statistic: a value computed from observed sample data and used to
estimate the parameter, because it may not always be feasible to
measure the parameter directly.
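
The difference between a parameter and a statistic can be seen by simulating a population that, unlike in real studies, is fully known. Everything in this sketch is hypothetical: the 1,000 simulated birth weights, their assumed mean and spread, and the sample size of 40.

```python
import random
import statistics

rng = random.Random(1)

# Hypothetical population: 1,000 simulated birth weights (kg).
population = [round(rng.gauss(3.2, 0.5), 2) for _ in range(1000)]

# Parameter: the population mean, obtainable only through a census.
parameter = statistics.mean(population)

# Statistic: the mean of a random sample of 40, used to estimate the parameter.
sample = rng.sample(population, 40)
statistic = statistics.mean(sample)

print(f"parameter (population mean) = {parameter:.3f}")
print(f"statistic (sample mean)     = {statistic:.3f}")
```

The two values are close but not identical; that gap is the sampling error that inferential methods quantify.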
Defn…Cont’d
Variable: a characteristic or quantity that can be measured or
quantified. It can take different values in different persons,
places, or things.
 Any aspect of an individual or object that is measured (e.g., BP)
or recorded (e.g., age, sex) and can vary from one individual to another.
 E.g., variables in a study of TB treatment outcome: hospital name, date
of birth, sex, date of diagnosis, weight (kg), smear result
(positive, negative, or uncertain), culture result (negative,
positive), cured after 6 months (yes/no).

Defn…Cont’d
 Qualitative variables: describe qualities or characteristics and are
used to categorize or label distinct groups.
 They are non-numeric variables and cannot be measured numerically.
Example: gender of patients (M, F), marital status, patients’
health status, hair color.
 They can be coded with numeric values (e.g., male = 1, female = 2),
but they remain intrinsically qualitative.
 Quantitative variables: numerical variables that can be
measured or counted.
Example: patients’ age, patients’ weight, blood pressure (BP) of patients in the
hospital.
Quantitative variables can be classified as:
 Discrete variables
 Continuous variables
Defn…Cont’d
o Discrete variable: a variable that can take only countable values.
 Takes distinct, separate values with no intermediate values in
between.
e.g. Number of daily admissions to a hospital
o Continuous variable: a variable whose set of possible values includes all
values in an interval of the real line.
 Can take any value within a range, including fractional or
decimal values.
Examples:
 Heights of pre-school children
 Weights of participants
 Ages of a group of individuals
 Blood pressure readings (mm Hg)
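
The three variable types behave differently once they are stored as numbers. In this sketch all values are invented; the coding male = 1, female = 2 follows the convention mentioned earlier in the slides.

```python
from collections import Counter
import statistics

# Qualitative variable stored as numeric codes (1 = male, 2 = female).
sex_codes = [1, 2, 2, 1, 2, 2]
sex_labels = {1: "male", 2: "female"}

# Counting categories is meaningful; averaging the codes is not.
sex_counts = Counter(sex_labels[code] for code in sex_codes)

# Discrete quantitative variable: whole-number counts only.
daily_admissions = [12, 9, 15, 11]

# Continuous quantitative variable: any value in an interval, limited only by the instrument.
heights_cm = [98.4, 102.1, 95.7, 100.3]

print(sex_counts)
print("mean admissions per day:", statistics.mean(daily_admissions))
print("mean height (cm):", round(statistics.mean(heights_cm), 2))
```

A software package would happily report a mean of about 1.67 for `sex_codes`, which is why the analyst, not the software, has to know the type of each variable.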

Defn…Cont’d
Variables can also be classified into two broad categories:

o Outcome variable
 Also called the response or dependent variable
 It is the focus of the research
 Affected by other (independent) variables

o Predictor variable
 Also called the explanatory or independent variable
 Affects the outcome variable; it is manipulated or categorized
to examine its influence.


Scales of Measurement
o A measurement scale refers to the way values are assigned to
the data, based on the properties of order, distance, and fixed
zero.
1. The Nominal Scale: consists of ‘naming’ observations or
classifying them into various mutually exclusive categories.
 The values fall into unordered categories or classes
 Numbers are often used to represent the categories
Property:
 Categories are mutually exclusive and exhaustive
 No logical ordering
o Examples: in a certain study, males might be assigned the
value 1 and females the value 0. Marital status, ethnicity,
religion, health situation, and blood type are also nominal.

Scale of …Cont’d
2. The Ordinal Scale
 When the order among categories becomes important, the
observations are referred to as ordinal data.
Property:
 Ordinal data possess the property of order, but not the properties
of distance and fixed zero.
 Arithmetic operations are not applicable, but relational
operations (<, >) are applicable.
 Ordering is the sole property of the ordinal scale.
 For example, injuries may be classified as severe, moderate, or
minor.
Other examples: blood pressure (high, good, low), patient
satisfaction rating, education level, etc.

Scale of …Cont’d
3. The Interval Scale: assigns each measurement to one of an
unlimited number of categories that are equally spaced.
 Possesses the property of equal intervals between values,
but not the property of fixed zero (zero does not indicate the
absence of the quantity being measured).
 Addition and subtraction are applicable, but multiplication and
division are not, because ratios are not meaningful without a true zero.
 Relational operations are also possible.
Examples
 IQ
 Temperature in °C or °F
 Grading scale
Scale of …Cont’d
4. The Ratio Scale: is characterized by meaningful comparison
in terms of ratios as well as equality of intervals, and is
usually used for quantitative data.
 Level of measurement that classifies data that can be
ranked, for which differences are meaningful, and which has a true zero.
 True ratios exist between the different units of measure.
 All arithmetic and relational operations are applicable.

Examples: time, the serum cholesterol level of a patient, height,
weight, length, blood pressure, etc.
Question: When should nominal, ordinal, interval, and ratio scales
of measurement be used? Which type of statistical analysis is
appropriate for each scale of measurement?
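
As one possible starting point for the question above, the sketch below lists summaries that are commonly treated as meaningful at each level. The mapping is an assumed textbook convention rather than something stated in the lecture, and all data values are invented.

```python
import statistics
from collections import Counter

# Assumed convention: each level keeps the summaries of the levels before it.
SUMMARIES_BY_SCALE = {
    "nominal": ["frequency", "mode"],
    "ordinal": ["frequency", "mode", "median"],
    "interval": ["frequency", "mode", "median", "mean", "standard deviation"],
    "ratio": ["frequency", "mode", "median", "mean", "standard deviation", "ratio of values"],
}

# Nominal: hypothetical blood types, summarized by the most frequent category.
blood_types = ["O", "A", "O", "B", "AB", "O", "A"]
modal_blood_type = Counter(blood_types).most_common(1)[0][0]

# Ordinal: hypothetical injury severities, summarized by the median category.
severity_order = ["minor", "moderate", "severe"]
injuries = ["minor", "severe", "moderate", "moderate", "minor"]
ranks = [severity_order.index(level) for level in injuries]
median_severity = severity_order[statistics.median_low(ranks)]

# Ratio: hypothetical weights (kg); both means and ratios are meaningful.
weights_kg = [60.0, 75.0, 90.0]
mean_weight = statistics.mean(weights_kg)
heaviest_to_lightest = max(weights_kg) / min(weights_kg)

print(modal_blood_type, median_severity, mean_weight, heaviest_to_lightest)
```

The ranks are used only to order the severity categories; the distance between "minor" and "moderate" is not assumed to equal the distance between "moderate" and "severe".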
Applications, Uses and Limitations of statistics
Why do we study biostatistics, and where can we use it in real life?
 It is essential for designing and analyzing clinical trials.
 Testing the efficacy and safety of new treatments or interventions.
 It is a crucial tool for determining the effectiveness of a new drug or an innovative
biological device.
 Helping to determine sample sizes, randomization processes, and outcome measures.

 It helps to study the distribution and determinants of health-related issues in populations.
 Identifying risk factors for diseases
 Investigating disease outbreaks and evaluating public health interventions

 It helps to analyze time-to-event data, such as:
 Time until disease recurrence or death
 Understanding disease progression and treatment effectiveness

 Overall, statistics provides a systematic and objective way to analyze data and make
decisions based on those data, and it is widely used in a variety of fields to inform
decision-making and improve outcomes.
Uses of statistics
 The following are some uses of statistics:

 Presenting facts in a definite and precise form.
 Data reduction/dimension reduction.
 Measuring the magnitude of variation in data.
 Estimating unknown population characteristics.
 Formulating and testing hypotheses.
 Studying the relationship between two or more variables.
 Forecasting future events.

Limitations of statistics

 Statistics deals with aggregates of facts, so it does not apply to
a single observation; i.e., statistical conclusions are true on average only.

 Statistical data are only approximately true. They are always
subject to a certain amount of error/uncertainty.

 Statistics deals with quantitative data only. Even qualitative data
are converted into numerical data by the method of ranking,
scoring, or scaling.

Cont’d…limitation
 Statistics can be misused in the following ways:

 They can be used for the wrong purposes.
 The data can be collected incorrectly and so be biased.
 The data can be analyzed carelessly, so that the results obtained
are misleading.

 Therefore, statistics should be used by experts.

2. Types and Methods of Data Collection
Data: the raw material of statistics.
 Data are numerical facts and can be obtained either by measurement or
counting.
 Statistical data (numerically expressed aggregates of facts, collected in a
systematic manner for a predetermined purpose and estimated
according to reasonable standards of accuracy) may be classified under
two categories depending upon the source:
Primary data:
 are data collected by the investigator for the
purpose of a specific inquiry or study.
 are unique until published; no one else has access to them.
Secondary data:
 are data that have already been collected by
others and are then used by an investigator.
Note: Data which are primary for one investigator may be secondary for another.

Primary data collection
Planning:
 Identify the source and elements of the data.
 Decide whether to use a sample or a census.
 If sampling is preferred, decide on the sample size, selection method, etc.
 Decide on the measurement procedure.
 Set up the necessary organizational structure.

Methods of primary data collection:
 Observation
 Questionnaire
 Interviews (Face-to-Face, Telephone)
 Focus Group Discussion (FGD)
 Experiment

Observation
Watching people engaged in activities and recording what occurs,
i.e., jotting down the required information.
Advantages:
 Gives relatively more accurate data on behavior and activities
 Collects information on facts
Disadvantages:
 Subject to the investigator’s or observer’s own bias, prejudice
(discrimination), and desires
 Needs more resources and skilled manpower when
sophisticated equipment is used

Interview
A. Personal interview (face to face)
Data collection through oral conversation
Advantages:
 Serious approach by the respondent, resulting in accurate
information
 Good response rate
 Complete and immediate
 Interviewer is in control and can give help if there is a problem
 Can use recording equipment
 Characteristics of the respondent can be assessed – tone of voice, facial
expression, hesitation, etc.

Interview… Cont’d
Disadvantages:
 Time consuming
 Geographic limitations
 Can be expensive
 Normally needs a set of questions
 Respondent bias – tendency to please or impress, create a false
personal image, or end the interview quickly
 Embarrassment is possible if personal questions are asked
 Interviewer training is required

Interview… Cont’d
B. Telephone interview
Advantages:
 Quick
 Can cover reasonably large numbers of people or organizations
 Wide geographic coverage
 High response rate – calls continue until the required number is reached
 No waiting
 Spontaneous response
 Help can be given to the respondent

Interview… Cont’d
Disadvantages:
 Not everyone has a telephone
 Questionnaire required
 Repeat calls are inevitable
 Straightforward questions are required
 Respondent has little time to think
 Cannot use visual aids
 Can cause irritation
 Good telephone manner is required

Experiment
 The desired information can also be collected by conducting an
experiment in laboratories or at experimental sites.
 Involves manipulating one or more independent variables to determine
their effect on a dependent variable.
 Biologists, physicists, chemists, and other natural scientists obtain
the required data from laboratories.
 A scientist may take a blood sample and examine the blood
group, the hemoglobin content, and the nature and number of red
and white blood cells.
 An agriculturist may study the soil ingredients in a particular area.
Example: A teacher wants to study whether a new
teaching methodology is superior to the old one. He/she may
divide the students into experimental and control groups.
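
The slide does not say how the students are divided; a common choice, assumed here, is random assignment, so that the two groups differ only by chance before the new method is applied. The 20 student labels and the seed are hypothetical.

```python
import random

# Hypothetical class list of 20 students.
students = [f"student_{i:02d}" for i in range(1, 21)]

# Shuffle a copy with a fixed seed so the assignment can be reproduced.
rng = random.Random(7)
shuffled = students[:]
rng.shuffle(shuffled)

# First half receives the new teaching method, second half the old one.
half = len(shuffled) // 2
experimental_group = shuffled[:half]
control_group = shuffled[half:]

print("experimental:", sorted(experimental_group))
print("control:     ", sorted(control_group))
```

Here the teaching method is the independent variable being manipulated, and the students' later performance would be the dependent variable.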

Focus group discussion
 Method used to gather insights, opinions, and perceptions from
a targeted group of participants.
 It is usually conducted by inviting six to ten people to gather for
a few hours with a trained moderator to talk about a product,
service or organization.
 The moderator needs objectivity, knowledge of the subject and
industry, and some understanding of group and consumer
behavior.
 The moderator starts with a broad question before moving to
more specific issues, encouraging open and easy discussion to
bring out true feelings and thoughts.
 The meeting is held in a pleasant place, and refreshments are
served to create a relaxed environment.

Cont’d….FGD
Advantages:
 Quick results and cost-effective
 Groups may raise important issues
 Ideas on how to proceed with the study may be generated
Disadvantages:
 The topic of discussion may be missed
 The discussion may be manipulated by the moderator
 Needs well-trained professionals

Questionnaire
 An instrument (form) consisting of a series of questions designed
to gather information from respondents
 A series of written questions/items in a fixed, rational order.
Advantages:
 Can cover a large number of people or organizations
 No prior arrangements are needed
 No interviewer bias
 Great impersonality
Disadvantages:
 Little opportunity to use visual aids
 Low response rate
 Cannot reach all types of people

Designing a questionnaire
 Questions should be simple
 Questions should be unambiguous
 The best kinds of questions are those which allow a pre-printed
answer to be ticked
 The questionnaire should be as short as possible
 Questions should be neither irrelevant nor too personal
 Questions should have a logical sequence.
 Leading questions should be avoided
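Example of a leading question: “How would you describe the taste of our new ice-cream?”
with only the following response categories:
Super, Excellent, Great, Pretty good
(every option is favorable, which pushes the respondent toward a positive answer).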

Types of questions
Closed ended questions
 A question is asked and then a number of possible answers are
provided for the respondent. The respondent selects the answer
which is appropriate.
 Sex: Male [ ] Female [ ]
 Did you watch television last night? Yes [ ] No [ ]
Open ended questions
 Allow the respondent to elaborate upon an earlier, more
specific question.
 Permit free responses
 No possible answers are provided to choose from.
 Mostly used to investigate facts with which the
researcher is not familiar.
Secondary data collection
 Data that have already been collected by someone else for a
different purpose.
For example, annual company reports, government statistics, and
health care records.
Where have the data come from?
In this case, data are obtained from already collected sources such as
newspapers, magazines, CSA, DHS, and hospital records, and from
existing data such as:
 Mortality reports
 Morbidity reports
 Epidemic reports
 Reports of laboratory utilization (including laboratory test
results)