
Chapter 1 Introduction To Biostatistics

The document serves as a reference book on biostatistics, focusing on the foundational concepts and methodologies used in health sciences research. It covers key topics such as data types, sampling methods, measurement scales, and the scientific method in experimental design. The learning outcomes aim to equip students with the ability to understand biostatistical terminology, select samples, and utilize computer tools for data analysis.

Uploaded by

alaasameh
Copyright
© All Rights Reserved

BIOSTATISTICS AND RESEARCH METHODS

Reference Book: BIOSTATISTICS: A Foundation for Analysis in the Health Sciences

CHAPTER 1
INTRODUCTION TO BIOSTATISTICS

SOME BASIC CONCEPTS

SAMPLING AND STATISTICAL INFERENCE

THE SCIENTIFIC METHOD AND THE DESIGN OF EXPERIMENTS

COMPUTERS AND BIOSTATISTICAL ANALYSIS

LEARNING OUTCOMES

After studying this chapter, the student will:

1. understand the basic concepts and terminology of biostatistics, including the various
kinds of variables, measurement, and measurement scales.
2. be able to select a simple random sample and other scientific samples from a
population of subjects.
3. understand the processes involved in the scientific method and the design of
experiments.
4. appreciate the advantages of using computers in the statistical analysis of data
generated by studies and experiments conducted by researchers in the health sciences.

1.1 SOME BASIC CONCEPTS
Data are characteristics or information, usually numerical, collected through
observation. Data are the raw material of statistics.
For example, when a nurse weighs a patient or takes a patient’s temperature, a
measurement consisting of a number, such as a weight of 150 pounds (68 kg) or a temperature of
100 degrees Fahrenheit (37.8 °C), is obtained (continuous data). Sometimes the data are
categories rather than numbers, such as gender (male or female), nationality, or skin
colour (qualitative, categorical data).
Quite a different type of number is obtained when a hospital administrator counts the
number of patients, perhaps 20, discharged from the hospital on a given day (discrete data).
Statistics: the science concerned with methods for collecting, organizing,
summarizing, presenting, and analyzing data, as well as drawing valid conclusions and making
relevant decisions on the basis of such analysis.
More concretely, we may say that statistics is a field of study concerned with
(1) the collection, organization, summarization, and analysis of data; and
(2) the drawing of inferences about a body of data when only a part of the data is observed.
The tools of statistics are employed in many fields, including business, education, psychology,
agriculture, and economics, to mention only a few.
Biostatistics: when the data analyzed are derived from the biological sciences and medicine,
we use the term biostatistics to distinguish this particular application of statistical tools.
Biometry: a branch of biology that studies biological phenomena and observations by means
of statistical analysis; that is, the use of mathematical or statistical techniques to
provide insight into biological questions.
Variable: if, as we observe a characteristic, we find that it takes on different values in
different persons, places, or things, we label the characteristic a variable.
In other words, a variable is anything that varies within a set of data.
Examples of variables include diastolic blood pressure, heart rate, the heights of adult
males, the weights of preschool children, and the ages of patients seen in a dental clinic.
A constant is a value that remains unchanged.
Sources of Data: Historical Data OR Field Data

Routinely kept records: Hospital medical records, for example, contain immense amounts
of information on patients.
Surveys: The administrator of a clinic wishes to obtain information regarding the mode of
transportation used by patients to visit the clinic.
Experiments: A nurse may wish to know which of several strategies is best for
maximizing patient compliance.
External sources: The data needed to answer a question may already exist in the form of
published reports, commercially available data banks, or the research literature.

Variables

Qualitative Variables:
Some characteristics can be categorized only, as, for example, when an ill person
is given a medical diagnosis, a person is designated as belonging to an ethnic
group, or a person, place, or object is said to possess or not to possess some
characteristic of interest. Measurements made on qualitative variables convey
information regarding attribute.

Quantitative Variables:
A quantitative variable is one that can be measured in the usual sense. We can, for
example, obtain measurements on the heights of adult males, the weights of preschool
children, and the ages of patients seen in a dental clinic. Measurements made on
quantitative variables convey information regarding amount.

Random Variable: Whenever we determine the height, weight, or age of an individual, the
result is frequently referred to as a value of the respective variable. When the values
obtained arise as a result of chance factors, so that they cannot be exactly predicted in
advance, the variable is called a random variable.

Population:
A population of entities (objects) is the largest collection of entities (objects) for which we
have an interest at a particular time; equivalently, it is the total set of observations that can
be made. For example, if we are studying the weight of adult women in Cairo, the population is
the set of weights of all the adult women in Cairo. If we are studying the grade point average
(GPA) of students at Al-Azhar, the population is the set of GPAs of all the students at Al-Azhar.
A population of values is the largest collection of values of a random variable for which
we have an interest at a particular time.
Finite and infinite populations: if a population of values consists of a fixed number of
these values, the population is said to be finite. If, on the other hand, a population consists
of an endless succession of values, the population is an infinite one.
A sample may be defined simply as a part of a population.
Suppose our population consists of the weights of all the elementary school children
enrolled in a certain county school system. If we collect for analysis the weights of only a
fraction of these children, we have only a part of our population of weights, that is, we have
a sample.

1.2 MEASUREMENT AND MEASUREMENT SCALES
The Nominal Scale
The lowest measurement scale is the nominal scale.
As the name implies it consists of “naming” observations or classifying them into various
mutually exclusive and collectively exhaustive categories.
Some examples include such dichotomies as male–female, well–sick, under 60 years
of age–60 and over, child–adult, married–not married, and smoker–non-smoker.
The nominal scale organizes data into mutually exclusive categories, but the categories have
no rank, order, or value.

The Ordinal Scale


Whenever observations are not only different from category to category but can be ranked
according to some criterion, they are said to be measured on an ordinal scale.
For example, a much improved patient is in better health than one classified as improved,
while a patient who has improved is in better condition than one who has not improved.
The ordinal scale organizes data into mutually exclusive categories that are rank ordered
based on a criterion, but the difference between ranks is not necessarily equal in value.


The Interval Scale


The interval scale is a more sophisticated scale than the nominal or ordinal in that with this
scale not only is it possible to order measurements, but also the distance between any two
measurements is known.
For example, a temperature can be −10 degrees Celsius; the zero point of the scale is
arbitrary and does not indicate an absence of temperature.
Interval data can be added or subtracted, but ratios of interval data (multiplication or
division) are not meaningful.
A simple example of interval data: the difference between 100 degrees Fahrenheit and 90
degrees Fahrenheit is the same as the difference between 70 degrees Fahrenheit and 60
degrees Fahrenheit. However, the energy content at 80 degrees Fahrenheit is not twice that
at 40 degrees Fahrenheit.

The Ratio Scale


The ratio scale is the fourth (highest) level of measurement and possesses a true zero point (origin).
An example of ratio data is the measurement of height. Height can be measured in
centimeters, meters, inches, or feet, and it is not possible to have a negative height.
Statements such as “one object has twice the length of the other” or “is twice as long”
are therefore meaningful.
On a ratio scale, values can be meaningfully added, subtracted, multiplied, and divided (ratios).
The ratio scale is a measurement that contains all the characteristics of the other scales and
also has an absolute zero point determined by nature.
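
The contrast between interval and ratio scales can be checked numerically. The following is a minimal sketch in Python; the helper names `fahrenheit_to_celsius` and `inches_to_cm` and the example heights are assumptions chosen for illustration, not part of the original slides. It shows that temperature differences stay equal under a change of units while temperature ratios do not, whereas height ratios are preserved.

```python
def fahrenheit_to_celsius(f: float) -> float:
    """Convert a Fahrenheit reading to Celsius (illustrative helper)."""
    return (f - 32.0) * 5.0 / 9.0


def inches_to_cm(inches: float) -> float:
    """Convert a length in inches to centimeters (illustrative helper)."""
    return inches * 2.54


# Interval scale: equal differences remain equal after changing units.
diff_high = fahrenheit_to_celsius(100) - fahrenheit_to_celsius(90)
diff_low = fahrenheit_to_celsius(70) - fahrenheit_to_celsius(60)

# Interval scale: ratios depend on the arbitrary zero point.
ratio_f = 80 / 40                                                # 2.0 in Fahrenheit
ratio_c = fahrenheit_to_celsius(80) / fahrenheit_to_celsius(40)  # about 6 in Celsius

# Ratio scale: height has a true zero, so ratios survive a change of units.
ratio_in = 72 / 36
ratio_cm = inches_to_cm(72) / inches_to_cm(36)

print(f"Temperature differences (C): {diff_high:.2f} and {diff_low:.2f}")
print(f"Temperature ratio: {ratio_f:.1f} in F, {ratio_c:.1f} in C")
print(f"Height ratio: {ratio_in:.1f} in inches, {ratio_cm:.1f} in cm")
```

Because "twice as hot" changes from 2 to about 6 when the unit changes, the ratio carries no physical meaning on an interval scale; the height ratio is 2 in either unit.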
1.3 SAMPLING AND STATISTICAL INFERENCE

DEFINITION:
Statistical inference is the procedure by which we reach a conclusion about a population on
the basis of the information contained in a sample that has been drawn from that population.

There are many kinds of samples that may be drawn from a population.

DEFINITION (simple random sample):


If a sample of size (n) is drawn from a population of size (N) in such a way that every
possible sample of size n has the same chance of being selected, the sample is called a simple
random sample.
Example: consider a study designed to examine the effectiveness on smoking cessation of
bupropion SR, a nicotine patch, or both, when co-administered with cognitive-behavioral
therapy. Consecutive consenting patients assigned themselves to one of the three treatments.
For illustrative purposes, let us consider all these subjects to be a population of size N = 189.
We wish to select a simple random sample of size n = 10 from this population, whose ages are
shown in the following table.

Ages of 189 Subjects Who Participated in a Study on Smoking Cessation
ID Age ID Age ID Age ID Age ID Age ID Age ID Age ID Age
1 48 26 65 51 43 76 59 101 63 126 77 151 50 176 53
2 35 27 67 52 47 77 57 102 50 127 76 152 53 177 61
3 46 28 38 53 46 78 52 103 59 128 71 153 54 178 54
4 44 29 37 54 57 79 54 104 54 129 43 154 61 179 51
5 43 30 46 55 52 80 53 105 60 130 47 155 61 180 62
6 42 31 44 56 54 81 62 106 50 131 48 156 61 181 57
7 39 32 44 57 56 82 52 107 56 132 37 157 64 182 50
8 44 33 48 58 53 83 62 108 68 133 40 158 53 183 64
9 49 34 49 59 64 84 57 109 66 134 42 159 53 184 63
10 49 35 30 60 53 85 59 110 71 135 38 160 54 185 65
11 44 36 45 61 58 86 59 111 82 136 49 161 61 186 71
12 39 37 47 62 54 87 56 112 68 137 43 162 60 187 71
13 38 38 45 63 59 88 57 113 78 138 46 163 51 188 73
14 49 39 48 64 56 89 53 114 66 139 34 164 50 189 66
15 49 40 47 65 62 90 59 115 70 140 46 165 53
16 53 41 47 66 50 91 61 116 66 141 46 166 64
17 56 42 44 67 64 92 55 117 78 142 48 167 64
18 57 43 48 68 53 93 61 118 69 143 47 168 53
19 51 44 43 69 61 94 56 119 71 144 43 169 60
20 61 45 45 70 53 95 52 120 69 145 52 170 54
21 53 46 40 71 62 96 54 121 78 146 53 171 55
22 66 47 48 72 57 97 51 122 66 147 61 172 58
23 71 48 49 73 52 98 50 123 68 148 60 173 62
24 75 49 38 74 54 99 50 124 71 149 53 174 62
25 72 50 44 75 61 100 55 125 69 150 53 175 54

Simple Random Sample

Example: a simple random sample of 10 ages, selected by drawing 10 random numbers from the
subject ID numbers
Sample subject number    1    2    3    4    5    6    7    8    9   10
Random number          137  114  155  183  185  028  085  181  018  164
Corresponding age       43   66   61   64   65   38   59   57   57   50
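
With a computer, the random numbers table can be replaced by a pseudorandom number generator. The sketch below uses Python's standard `random` module; the function name `simple_random_sample` and the seed value are assumptions made for illustration. It draws 10 distinct subject IDs from 1 to 189, and then looks up the ages of the IDs used in the example above (copied from the table of 189 ages).

```python
import random

N = 189  # population size
n = 10   # sample size


def simple_random_sample(population_size: int, sample_size: int, seed=None) -> list[int]:
    """Draw distinct subject IDs so that every subset of the given size is equally likely."""
    rng = random.Random(seed)  # a fixed seed only makes the sketch reproducible
    return rng.sample(range(1, population_size + 1), sample_size)


sample_ids = simple_random_sample(N, n, seed=2024)
print("Drawn subject IDs:", sample_ids)

# Ages of the IDs used in the slide's example, copied from the age table.
age_by_id = {137: 43, 114: 66, 155: 61, 183: 64, 185: 65,
             28: 38, 85: 59, 181: 57, 18: 57, 164: 50}
example_ids = [137, 114, 155, 183, 185, 28, 85, 181, 18, 164]
example_ages = [age_by_id[i] for i in example_ids]
print("Ages for the example's IDs:", example_ages)
```

Sampling is done without replacement, so no subject can appear twice. A different seed (or no seed) gives a different but equally valid simple random sample.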

Systematic Sampling
In systematic sampling, a random numbers table is employed to select a starting point in the
file of records.
The record located at this starting point is called record 𝑥. A second number,
determined by the number of records desired, is selected to define the sampling
interval (call this interval 𝑘).
Consequently, the data set would consist of records 𝑥, 𝑥 + 𝑘, 𝑥 + 2𝑘, 𝑥 + 3𝑘, and
so on, until the necessary number of records is obtained.

Example: a sample of 10 ages selected from the 189 ages using systematic sampling
Sample subject number    1    2    3    4    5    6    7    8    9   10
Selected subject ID      4   22   40   58   76   94  112  130  148  166
Age                     44   66   47   53   59   56   68   47   60   64

With starting point 𝑥 = 4 and 10 observations needed from N = 189 subjects,
𝑘 = 189/10 = 18.9, which we round down to 𝑘 = 18 so that the last selected record,
4 + 9(18) = 166, does not exceed 189.
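
The selection rule 𝑥, 𝑥 + 𝑘, 𝑥 + 2𝑘, ... can be sketched in a few lines of Python. The function name `systematic_sample_ids` is an assumption for illustration, and the interval is taken here as the integer part of N/n with N = 189, which is one common convention rather than the only possible choice. The ages are copied from the table of 189 ages.

```python
def systematic_sample_ids(start: int, interval: int, sample_size: int) -> list[int]:
    """Return record numbers start, start + interval, start + 2 * interval, ..."""
    return [start + i * interval for i in range(sample_size)]


N, n = 189, 10
k = N // n   # integer part of 18.9, i.e. 18
x = 4        # starting point; in practice chosen at random

ids = systematic_sample_ids(start=x, interval=k, sample_size=n)

# Ages for the selected IDs, copied from the age table.
age_by_id = {4: 44, 22: 66, 40: 47, 58: 53, 76: 59,
             94: 56, 112: 68, 130: 47, 148: 60, 166: 64}
ages = [age_by_id[i] for i in ids]

print("Selected IDs:", ids)
print("Ages:", ages)
```

Rounding the interval down guarantees that the last selected record number stays within the population.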
Stratified Random Sampling: it may be desirable to partition a population of
interest into groups, or strata, in which the sample units within a particular stratum
are more similar to each other than they are to the sample units that compose the
other strata. After the population is stratified, it is customary to take a random
sample independently from each stratum. This technique is called stratified random
sampling.

The resulting sample is called a stratified random sample.
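
As a rough sketch of the idea, the Python code below draws an independent simple random sample from each stratum. The function name `stratified_random_sample`, the three strata, and the ID ranges assigned to them are all hypothetical; they do not correspond to the real ages in the smoking cessation data.

```python
import random


def stratified_random_sample(strata: dict[str, list[int]], per_stratum: int,
                             seed=None) -> dict[str, list[int]]:
    """Draw a simple random sample of the same size independently from each stratum."""
    rng = random.Random(seed)  # a fixed seed only makes the sketch reproducible
    return {name: rng.sample(members, per_stratum)
            for name, members in strata.items()}


# Hypothetical strata: subject IDs grouped into three made-up age bands.
strata = {
    "under 50": list(range(1, 61)),
    "50 to 64": list(range(61, 141)),
    "65 and over": list(range(141, 190)),
}

sample = stratified_random_sample(strata, per_stratum=3, seed=7)
for name, ids in sample.items():
    print(name, ids)
```

Equal allocation (the same number from each stratum) is used here only for simplicity; allocating in proportion to stratum size is another common choice.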


1.4 THE SCIENTIFIC METHOD AND THE DESIGN OF EXPERIMENTS
DEFINITION: A research study is a scientific study of a phenomenon of interest. Research
studies involve designing sampling protocols, collecting and analyzing data, and providing valid
conclusions based on the results of the analyses.

DEFINITION: Experiments are a special type of research study in which observations are
made after specific manipulations of conditions have been carried out; they provide the
foundation for scientific research.
DEFINITION: The scientific method is a process by which scientific information is
collected, analyzed, and reported in order to produce unbiased and replicable results in an
effort to provide an accurate representation of observable phenomena.
The scientific method is recognized universally as the only truly acceptable way to produce
new scientific understanding of the world around us. It is based on an empirical approach, in
that decisions and outcomes are based on data. There are several key elements associated with
the scientific method, and the concepts and techniques of statistics play a prominent role in
all these elements.
The first step: Making an Observation.
An observation is made of a phenomenon or a group of phenomena. This observation leads to
the formulation of questions or uncertainties that can be answered in a scientifically rigorous way.
For example, it is readily observable that regular exercise reduces body weight in many people. It is
also readily observable that changing diet may have a similar effect. In this case there are two
observable phenomena, regular exercise and diet change, that have the same endpoint. The nature
of this endpoint can be determined by use of the scientific method.

The second step: Formulating a Hypothesis.


A hypothesis is formulated to explain the observation and to make quantitative
predictions of new observations. Often hypotheses are generated as a result of
extensive background research and literature reviews.
Hypotheses may be stated as either research hypotheses or statistical hypotheses.
From the weight-loss example :
A research hypothesis would be a statement such as,
“Exercise appears to reduce body weight.”
There is certainly nothing incorrect about this conjecture, but it lacks a truly
quantitative basis for testing.
A statistical hypothesis may be stated using quantitative terminology as follows:
“The average (mean) loss of body weight of people who exercise is greater than the
average (mean) loss of body weight of people who do not exercise.”
In this statement a quantitative measure, the “average” or “mean” value, is
hypothesized to be greater in the sample of patients who exercise.
The role of the statistician in this step of the scientific method is to state the
hypothesis in a way that valid conclusions may be drawn and to interpret correctly
the results of such conclusions.
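
In symbols, the statistical hypothesis from the weight-loss example might be written as follows. The notation is an assumption introduced here for illustration: \mu_E denotes the mean loss of body weight among people who exercise and \mu_N the mean loss among people who do not, with the claim of interest placed in the alternative hypothesis H_A.

```latex
H_0:\ \mu_E \le \mu_N
\qquad \text{versus} \qquad
H_A:\ \mu_E > \mu_N
```

Written this way, the research statement "exercise appears to reduce body weight" becomes a comparison of two means that can be tested with data.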

The third step : Designing an Experiment


The third step of the scientific method involves designing an experiment that will yield
the data necessary to validly test an appropriate statistical hypothesis. This step of
the scientific method, like that of data analysis, requires the expertise of a
statistician. Improperly designed experiments are the leading cause of invalid results
and unjustified conclusions.

The last step: Conclusion


In the execution of a research study or experiment, one would hope to have
collected the data necessary to draw conclusions, with some degree of confidence,
about the hypotheses that were posed as part of the design. It is often the case that
hypotheses need to be modified and retested with new data and a different design.
Whatever the conclusions of the scientific process, however, results are rarely
considered to be conclusive. That is, results need to be replicated, often a large
number of times, before scientific credence is granted them.

1.5 COMPUTERS AND BIOSTATISTICAL ANALYSIS

The widespread use of computers has had a tremendous impact on health sciences
research in general and biostatistical analysis in particular. The necessity to
perform long and tedious arithmetic computations as part of the statistical analysis
of data lives only in the memory of those researchers and practitioners whose
careers antedate the so-called computer revolution. Computers can perform more
calculations faster and far more accurately than can human technicians. The use of
computers makes it possible for investigators to devote more time to the
improvement of the quality of raw data and the interpretation of the results.
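
As a small illustration of how little effort such computations now take, the sketch below summarizes the 10 ages from the simple random sample example using Python's standard `statistics` module. The choice of Python and of these three summary measures is an assumption made for illustration; any statistical package would serve equally well.

```python
import statistics

# Ages of the 10 subjects in the simple random sample example.
sample_ages = [43, 66, 61, 64, 65, 38, 59, 57, 57, 50]

mean_age = statistics.mean(sample_ages)      # arithmetic mean
median_age = statistics.median(sample_ages)  # middle value of the sorted ages
sd_age = statistics.stdev(sample_ages)       # sample standard deviation (n - 1 divisor)

print(f"mean = {mean_age}, median = {median_age}, sd = {sd_age:.2f}")
```

For these data the mean is 56 and the median is 58; the same three lines would work unchanged for all 189 ages.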

QUESTIONS AND EXERCISES
True or False.
A random sample is a sample in which all elements have an equal chance of being selected.
(A) True (B) False
A population of entities is defined as the largest collection of entities for which we have an
interest at a particular time.
(A) True (B) False
A sample is defined simply as a part of a population.
(A) True (B) False
The population is the aim of the study, while the sample is what is actually studied.
(A) True (B) False
A statistic is a value calculated from a sample of observations.
(A) True (B) False
Biostatistics is the application of statistical techniques to scientific research, including social
science and commercial fields, and the development of new tools to study these areas.
(A) True (B) False
A population is defined as the largest collection of entities for which we have an interest at a
particular time.
(A) True (B) False
Biostatistics is defined as the application of statistical principles in medicine, public health, or biology
(A) True (B) False
A random sample is defined as the largest collection of entities for which we have an interest at a particular
time.
(A) True (B) False
A random sample is a part of a population.
(A) True (B) False
The raw material of statistics is data.
(A) True (B) False
Age, Weight, and Height are categorical variables
(A) True (B) False
Data at the ordinal level are qualitative only
(A) True (B) False
The different methods of teeth whitening (photobleaching, chemical bleaching, mechanical method,
endobleaching, laser whitening) are an example of a categorical variable.
(A) True (B) False
If 20 persons are selected randomly from among 500 smokers to determine the number of decayed teeth, the
sample size equals 500.
(A) True (B) False
Dentists believe that “in general Fluoride suffers from tooth decay”; this belief represents a statistical
hypothesis.
(A) True (B) False
Statistical hypotheses involve restating the research hypotheses in such a way that they may be addressed
by statistical techniques.
(A) True (B) False
Summary statistics are a subset of descriptive statistics.
(A) True (B) False
Frequency distributions are a subset of inferential statistics.
(A) True (B) False
A population is a part of a random sample.
(A) True (B) False
The nominal scale organizes data into mutually exclusive categories, but the categories have no rank, order, or
value
(A) True (B) False
The ordinal scale organizes data into mutually exclusive categories that are rank ordered based on a criterion, but
the difference between ranks is not necessarily equal in value
(A) True (B) False
The ratio scale is a measurement that contains all the characteristics of other scales and also has an absolute zero
point determined by nature
(A) True (B) False
Age, Weight, and Height are quantitative variables
(A) True (B) False
Data at the ratio level are qualitative only
(A) True (B) False
The purpose of descriptive statistics is to simplify and organize the data from a study.
(A) True (B) False
If 20 persons are selected randomly from among 500 smokers to determine the number of decayed teeth, the sample
size equals 20.
(A) True (B) False
Dentists believe that “in general Fluoride suffers from tooth decay”; this belief represents a research
hypothesis.
(A) True (B) False
MCQ
Which of the following involves the use of data analysis and interpretation in health care research?
A: Biostatistics B: Correlation C: Data analysis D: Interpretation
Patient classification (0, 1, 2, 3, 4) is an example of which scale of measurement?
A: Interval B: Ordinal C: Ratio D: Nominal
Dental hygiene (DH) degree possessed (AAS, AS, BS, MS)
What are the four scales of measurement for data?
A: Caliber, nominal, ratio, and interval B: Index, ordinal, interval, and ratio
C: Ordinal, nominal, ratio, and interval D: Scale, ordinal, ratio, and interval
Which of the following is used to demonstrate response to dental hygiene therapy?
A: Interpretation B: Biostatistics C: Data analysis D: Correlation
Which scale has characteristics of ordinal scales, plus equal distance between any two adjacent
units of measurement, and has no meaningful zero point?
A: Ordinal B: Ratio C: Interval D: Nominal
Which scale organizes data into mutually exclusive categories that are rank ordered based on a
criterion, but the difference between ranks is not necessarily equal in value?
A: Ordinal B: Ratio C: Interval D: Nominal
Which scale organizes data into mutually exclusive categories, but the categories have no rank,
order, or value?
A: Ordinal B: Ratio C: Interval D: Nominal
Which scale is a measurement that contains all the characteristics of other scales and also has an
absolute zero point determined by nature?
A: Ordinal B: Ratio C: Interval D: Nominal
1: Define:
(a) Statistics, (b) Biostatistics, (c) Population, (d) Sample and (e) Simple random sample
2: For each of the following situations (A and B) , answer questions a through e:
(a) What is the sample in the study?
(b) What is the population?
(c) What is the variable of interest?
(d) How many measurements were used in calculating the reported results?
(e) What measurement scale was used?
Situation A. A study of 300 households in a small southern town revealed that 20 percent
had at least one school-age child present.
Situation B. A study of 250 patients admitted to a hospital during the past year revealed that,
on the average, the patients lived 15 miles from the hospital.
3: For each of the following questions, choose the correct answer
(i) Number of cups of coffee served at a restaurant is an example of what type of data?
A: Discrete B: Continuous C: Ordinal D: Nominal
(ii) The lifetime (in hours) of 12 flashlight batteries is an example of …... data
A: Discrete B: Continuous C: Ordinal D: Nominal
(iii) Nationality of people who are living in Egypt is an example of ……..data?
A: Discrete B: Continuous C: Ordinal D: Nominal
(iv) Ranking of Football teams is an example of ……..data?
A: Discrete B: Continuous C: Ordinal D: Nominal
For each of the following examples, choose the measurement scale that is appropriate for the data.
(i) The starting salaries of graduates from a Computer program is:
A nominal B ordinal C interval D ratio
(ii) The month of highest sales for each firm in a sample is:
A nominal B ordinal C interval D ratio
(iii) The weekly closing price of gold throughout the year is:
A nominal B ordinal C interval D ratio
(iv) The size of soft drink (small, medium, or large) ordered by a sample of customers is:
A nominal B ordinal C interval D ratio
(v) Method of payment (cash, check, credit card) is:
A nominal B ordinal C interval D ratio
(vi) The time to start a football game is:
A nominal B ordinal C interval D ratio
(vii) The final letter grades received by students in a statistics course is:
A nominal B ordinal C interval D ratio
(viii) The amount of crude oil imported monthly by a certain country is:
A nominal B ordinal C interval D ratio
(ix) The temperature in Cairo tomorrow is:
A nominal B ordinal C interval D ratio