Chapter 1 Introduction To Biostatistics
Reference Book: Biostatistics: A Foundation for Analysis in the Health Sciences
CHAPTER 1
INTRODUCTION TO BIOSTATISTICS
LEARNING OUTCOMES
After studying this chapter, the student will:
1. understand the basic concepts and terminology of biostatistics, including the various
kinds of variables, measurement, and measurement scales.
2. be able to select a simple random sample and other scientific samples from a
population of subjects.
3. understand the processes involved in the scientific method and the design of
experiments.
4. appreciate the advantages of using computers in the statistical analysis of data
generated by studies and experiments conducted by researchers in the health sciences.
1.1 SOME BASIC CONCEPTS
Data are characteristics or pieces of information, usually numerical, that are collected
through observation. Data are the raw material of statistics.
For example, when a nurse weighs a patient or takes a patient's temperature, a
measurement consisting of a number, such as 150 pounds (68 kg) for weight or 100 degrees
Fahrenheit (37.8 °C) for temperature, is obtained (continuous data). Sometimes the
observation is a category rather than a number, such as gender (male or female),
nationality, or skin colour (qualitative, categorical data).
Quite a different type of number is obtained when a hospital administrator counts the
number of patients, perhaps 20, discharged from the hospital on a given day (discrete data).
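The unit conversions quoted in the example above can be checked with a short Python sketch. The conversion formulas are the standard ones; the function and variable names are illustrative assumptions and do not come from the reference book.

```python
# Illustrative helpers; the names are assumptions, not taken from the text.
LB_TO_KG = 0.45359237  # definition of the international pound in kilograms


def pounds_to_kg(pounds: float) -> float:
    """Convert a weight in pounds to kilograms."""
    return pounds * LB_TO_KG


def fahrenheit_to_celsius(deg_f: float) -> float:
    """Convert a temperature in degrees Fahrenheit to degrees Celsius."""
    return (deg_f - 32.0) * 5.0 / 9.0


weight_kg = pounds_to_kg(150)        # continuous measurement (weight)
temp_c = fahrenheit_to_celsius(100)  # continuous measurement (temperature)
discharged = 20                      # discrete count of discharged patients

print(f"150 lb is about {weight_kg:.0f} kg")
print(f"100 °F is about {temp_c:.1f} °C")
```

The two measurements are stored as floating-point numbers because they can take any value in a range, while the count of patients is an integer.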
Statistics: the science concerned with scientific methods for collecting, organizing,
summarizing, presenting, and analyzing data, as well as drawing valid conclusions and making
relevant decisions on the basis of such analysis.
More concretely, however, we may say that statistics is a field of study concerned with
(1) the collection, organization, summarization, and analysis of data; and
(2) the drawing of inferences about a body of data when only a part of the data is observed.
The tools of statistics are employed in many fields—business, education, psychology,
agriculture, and economics, to mention only a few.
Biostatistics: when the data analyzed are derived from the biological sciences and medicine,
we use the term biostatistics to distinguish this particular application of statistical tools.
Biometry: a branch of biology that studies biological phenomena and observations by means
of statistical analysis; that is, the study of biology using mathematical or statistical
techniques to answer biological questions.
Variables
Population:
We define a population of entities (objects) as the largest collection of entities for which
we have an interest at a particular time; it is the total set of observations that can be made.
For example, if we are studying the weight of adult women in Cairo, the population is the set
of weights of all the adult women in Cairo. If we are studying the grade point average (GPA)
of students at Al-Azhar, the population is the set of GPAs of all the students at Al-Azhar.
We define a population of values as the largest collection of values of a random variable for
which we have an interest at a particular time.
Finite and infinite populations: if a population of values consists of a fixed number of
these values, the population is said to be finite. If, on the other hand, a population consists
of an endless succession of values, the population is an infinite one.
Sample: a sample may be defined simply as a part of a population.
Suppose our population consists of the weights of all the elementary school children
enrolled in a certain county school system. If we collect for analysis the weights of only a
fraction of these children, we have only a part of our population of weights, that is, we have
a sample.
1.2 MEASUREMENT AND MEASUREMENT SCALES
The Nominal Scale
The lowest measurement scale is the nominal scale.
As the name implies, it consists of “naming” observations or classifying them into various
mutually exclusive and collectively exhaustive categories.
Some examples include such dichotomies as male–female, well–sick, under 60 years of
age–60 and over, child–adult, married–not married, and smoker–non-smoker.
The nominal scale organizes data into mutually exclusive categories, but the categories have
no rank, order, or value.
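As a minimal sketch of nominal classification, the Python snippet below assigns each observation to exactly one category and then counts the categories. The cut-off of 60 years comes from the example above; the function name and the list of ages are hypothetical and serve only as an illustration.

```python
from collections import Counter


def age_group(age: int) -> str:
    """Assign an age to exactly one of two categories.

    Every age falls into one category (collectively exhaustive)
    and never into both (mutually exclusive).
    """
    return "under 60" if age < 60 else "60 and over"


# Hypothetical ages, used only to illustrate the classification.
ages = [48, 35, 61, 75, 59, 60]
groups = [age_group(a) for a in ages]

# On the nominal scale the labels are treated only as names,
# so counting category frequencies is the natural summary.
counts = Counter(groups)
print(counts)
```

Because each age maps to a single label, the category counts always add up to the number of observations.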
DEFINITION:
Statistical inference is the procedure by which we reach a conclusion about a population on
the basis of the information contained in a sample that has been drawn from that population.
There are many kinds of samples that may be drawn from a population.
Ages of 189 Subjects Who Participated in a Study on Smoking Cessation
ID Age ID Age ID Age ID Age ID Age ID Age ID Age ID Age
1 48 26 65 51 43 76 59 101 63 126 77 151 50 176 53
2 35 27 67 52 47 77 57 102 50 127 76 152 53 177 61
3 46 28 38 53 46 78 52 103 59 128 71 153 54 178 54
4 44 29 37 54 57 79 54 104 54 129 43 154 61 179 51
5 43 30 46 55 52 80 53 105 60 130 47 155 61 180 62
6 42 31 44 56 54 81 62 106 50 131 48 156 61 181 57
7 39 32 44 57 56 82 52 107 56 132 37 157 64 182 50
8 44 33 48 58 53 83 62 108 68 133 40 158 53 183 64
9 49 34 49 59 64 84 57 109 66 134 42 159 53 184 63
10 49 35 30 60 53 85 59 110 71 135 38 160 54 185 65
11 44 36 45 61 58 86 59 111 82 136 49 161 61 186 71
12 39 37 47 62 54 87 56 112 68 137 43 162 60 187 71
13 38 38 45 63 59 88 57 113 78 138 46 163 51 188 73
14 49 39 48 64 56 89 53 114 66 139 34 164 50 189 66
15 49 40 47 65 62 90 59 115 70 140 46 165 53
16 53 41 47 66 50 91 61 116 66 141 46 166 64
17 56 42 44 67 64 92 55 117 78 142 48 167 64
18 57 43 48 68 53 93 61 118 69 143 47 168 53
19 51 44 43 69 61 94 56 119 71 144 43 169 60
20 61 45 45 70 53 95 52 120 69 145 52 170 54
21 53 46 40 71 62 96 54 121 78 146 53 171 55
22 66 47 48 72 57 97 51 122 66 147 61 172 58
23 71 48 49 73 52 98 50 123 68 148 60 173 62
24 75 49 38 74 54 99 50 124 71 149 53 174 62
25 72 50 44 75 61 100 55 125 69 150 53 175 54
1.3 SAMPLING AND STATISTICAL INFERENCE
Example: a simple random sample of 10 ages, selected by drawing 10 random numbers from the
subject numbers (1 to 189).
Sample subject number   1    2    3    4    5    6    7    8    9    10
Random number           137  114  155  183  185  028  085  181  018  164
Corresponding age       43   66   61   64   65   38   59   57   57   50
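The selection in this example can be sketched in Python. This is an illustration only: it uses Python's pseudo-random generator instead of a random numbers table, and the function name, the seed, and the variable names are assumptions. The ten subject numbers and ages in `example_sample` are copied from the example above.

```python
import random
from statistics import mean

N_SUBJECTS = 189   # subjects are numbered 1..189 in the table of ages
SAMPLE_SIZE = 10


def simple_random_sample(population_size: int, n: int, seed=None) -> list:
    """Draw n distinct subject numbers from 1..population_size.

    Every subject has the same chance of being selected, and no
    subject can be selected twice (sampling without replacement).
    """
    rng = random.Random(seed)  # a fixed seed only makes the draw repeatable
    return rng.sample(range(1, population_size + 1), n)


# A fresh draw; the subject numbers will differ from those in the example.
drawn_ids = simple_random_sample(N_SUBJECTS, SAMPLE_SIZE, seed=1)
print(drawn_ids)

# The sample shown in the example: subject number -> age from the table.
example_sample = {137: 43, 114: 66, 155: 61, 183: 64, 185: 65,
                  28: 38, 85: 59, 181: 57, 18: 57, 164: 50}

# The sample mean is one quantity that could be used to draw an
# inference about the mean age of all 189 subjects.
sample_mean_age = mean(list(example_sample.values()))
print(sample_mean_age)
```

Sampling without replacement matches the example, in which no subject number appears twice.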
Systematic Sampling
In systematic sampling, records are selected from an ordered file at regular intervals.
A random numbers table is employed to select a starting point in the file.
The record located at this starting point is called record 𝑥. A second number,
determined by the number of records desired, is selected to define the sampling
interval (call this interval 𝑘).
Consequently, the data set would consist of records 𝑥, 𝑥 + 𝑘, 𝑥 + 2𝑘, 𝑥 + 3𝑘, and
so on, until the necessary number of records is obtained.
Example: a sample of 10 ages selected from the table of ages using a systematic sample
(starting point 𝑥 = 4, sampling interval 𝑘 = 18).
Sample subject number     1    2    3    4    5    6    7    8    9    10
Selected subject number   4    22   40   58   76   94   112  130  148  166
Age                       44   66   47   53   59   56   68   47   60   64
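The rule 𝑥, 𝑥 + 𝑘, 𝑥 + 2𝑘, ... can be sketched in a few lines of Python. The starting point 4 and the interval 18 are read off the subject numbers listed in the example above; the function name and its argument names are assumptions made for illustration.

```python
def systematic_sample(start: int, interval: int, n: int, population_size: int) -> list:
    """Return the record numbers x, x + k, x + 2k, ... for n records."""
    if n < 1 or start < 1 or interval < 1:
        raise ValueError("start, interval and n must all be at least 1")
    selected = [start + i * interval for i in range(n)]
    if selected[-1] > population_size:
        raise ValueError("the requested sample runs past the end of the file")
    return selected


# Starting point x = 4 and interval k = 18, as in the example.
ids = systematic_sample(start=4, interval=18, n=10, population_size=189)
print(ids)
```

In practice the starting point would be chosen at random, for instance from a random numbers table, rather than fixed in advance as it is here.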
1.4 THE SCIENTIFIC METHOD AND THE DESIGN OF EXPERIMENTS
DEFINITION: Experiments are a special type of research study in which observations are
made after specific manipulations of conditions have been carried out; they provide the
foundation for scientific research.
DEFINITION: The scientific method is a process by which scientific information is
collected, analyzed, and reported in order to produce unbiased and replicable results in an
effort to provide an accurate representation of observable phenomena.
The scientific method is recognized universally as the only truly acceptable way to produce
new scientific understanding of the world around us. It is based on an empirical approach, in
that decisions and outcomes are based on data. There are several key elements associated with
the scientific method, and the concepts and techniques of statistics play a prominent role in
all these elements.
The first step: making an observation.
An observation is made of a phenomenon or a group of phenomena. This observation leads to
the formulation of questions or uncertainties that can be answered in a scientifically rigorous way.
For example, it is readily observable that regular exercise reduces body weight in many people. It is
also readily observable that changing diet may have a similar effect. In this case there are two
observable phenomena, regular exercise and diet change, that have the same endpoint. The nature
of this endpoint can be determined by use of the scientific method.
1.5 COMPUTERS AND BIOSTATISTICAL ANALYSIS
The widespread use of computers has had a tremendous impact on health sciences
research in general and biostatistical analysis in particular. The necessity to
perform long and tedious arithmetic computations as part of the statistical analysis
of data lives only in the memory of those researchers and practitioners whose
careers antedate the so-called computer revolution. Computers can perform more
calculations faster and far more accurately than can human technicians. The use of
computers makes it possible for investigators to devote more time to the
improvement of the quality of raw data and the interpretation of the results.
QUESTIONS AND EXERCISES
True or False.
A random sample is a sample in which all elements have an equal chance of being selected.
(A) True (B) False
A population of entities is the largest collection of entities for which we have an interest at
a particular time.
(A) True (B) False
A sample is defined simply as a part of a population.
(A) True (B) False
The population is the target of the study, while the sample is the part actually studied.
(A) True (B) False