CHAPTER ONE
INTRODUCTION TO STATISTICS
1.1 Definition and classification of Statistics
The word statistics is defined in different ways depending on its use in the plural and singular
sense.
In the plural sense:- statistics is defined as a collection of numerical facts or figures (or the raw
data themselves).
2. The statement "the average mark of students in the statistics course is 70%" would be considered
statistics, whereas "Abebe got 90% in the statistics course" is not statistics.
Remark: statistics are aggregates of facts. Single, isolated figures are not statistics because they
cannot be compared and are unrelated to one another.
In its singular sense:- the word Statistics is the subject that deals with the methods of collecting,
organizing, presenting, analyzing and interpreting statistical data.
Classification of Statistics
Statistics is broadly divided into two categories based on how the collected data are used.
Descriptive Statistics:- deals with describing the collected data without drawing any further
conclusions.
Example 1.1: Suppose that the marks of 6 students in the Statistics course for Mathematics are
40, 45, 50, 60, 70 and 80. The average mark of the 6 students is 57.5, and it is considered a
descriptive statistic.
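The average in Example 1.1 can be checked with a short Python sketch. It uses only the marks given in the example and the standard-library function statistics.mean; the variable names are illustrative.

```python
from statistics import mean

# Marks of the 6 students from Example 1.1
marks = [40, 45, 50, 60, 70, 80]

# Descriptive step: summarize the observed data, nothing more
average_mark = mean(marks)  # (40 + 45 + 50 + 60 + 70 + 80) / 6

print(average_mark)  # 57.5
```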
Inferential Statistics:- deals with making inferences and/or drawing conclusions about a population
based on data obtained from a sample of observations. It consists of performing hypothesis tests,
determining relationships among variables and making predictions.
Example 1.2: In the above example, if we say that the average mark of all Mathematics students in
the Statistics course is 57.5, then we are talking about inferential statistics (drawing a conclusion
about the population based on the sample observations).
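The difference between describing a sample and inferring about a population can be sketched in Python. Everything below is an assumption made for illustration: the population of 200 marks is randomly generated (not real data), and the seed, the mark range and the sample size of 6 are arbitrary choices.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: marks of 200 Mathematics students (made-up values)
population = [rng.randint(30, 95) for _ in range(200)]

# Inferential step: observe only 6 students, then use their average
# as an estimate of the average mark of the whole population
sample = rng.sample(population, 6)
sample_mean = mean(sample)
population_mean = mean(population)

print(f"sample mean (estimate): {sample_mean:.1f}")
print(f"population mean (usually unknown): {population_mean:.1f}")
```

In practice the population mean is not available, which is why the sample mean is used to draw a conclusion about it; the two values will generally differ somewhat.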
Fundamental Biostatistics Chapter One Introduction to Statistics
A statistical investigation passes through the following five stages: collection, organization,
presentation, analysis and interpretation of data.
Collection of data: This is the process of obtaining measurements, counts or other raw data.
Data can be collected in a variety of ways; one of the most common is a sample or census
survey. Surveys can in turn be conducted in different ways; three of the most common
methods are:
Telephone survey
Mailed questionnaire
Personal interview.
Organization of data: - Data collected from published sources are generally already in organized form.
However, if an investigator has collected data through a survey, it is necessary to edit these data in
order to correct any apparent inconsistencies, ambiguities and recording errors.
This phase also includes grouping the data into classes and tabulating them.
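Grouping data into classes and tabulating them can be sketched as follows. The marks are those of Example 1.1; the helper name tabulate, the class width of 20 and the starting point of 40 are assumptions chosen for illustration, and the class limits assume whole-number marks.

```python
from collections import Counter

def tabulate(values, class_width, start):
    """Return (lower limit, upper limit, frequency) for each non-empty class."""
    counts = Counter()
    for v in values:
        # Lower limit of the class that contains v
        lower = start + ((v - start) // class_width) * class_width
        counts[lower] += 1
    return [(lower, lower + class_width - 1, counts[lower])
            for lower in sorted(counts)]

marks = [40, 45, 50, 60, 70, 80]
table = tabulate(marks, class_width=20, start=40)

for lower, upper, freq in table:
    print(f"{lower}-{upper}: {freq}")
# 40-59: 3
# 60-79: 2
# 80-99: 1
```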
Presentation of data:- After the data have been collected and organized they can be presented in
the form of tables, charts, diagrams and graphs. This presentation in an orderly manner facilitates
the understanding as well as analysis of data.
Analysis of data: - The basic purpose of data analysis is to extract useful information for decision
making. The analysis may simply be a critical observation of the data to draw some meaningful
conclusions about them, or it may involve highly complex and sophisticated mathematical techniques.
Interpretation of data: - Interpretation means drawing conclusions from the data collected and
analyzed. Correct interpretation leads to a valid conclusion of the study and thus can aid
decision making.
1.3 Definition of some statistical terms
Population: - It is the totality of objects under study. The population represents the target of an
investigation, and the objective of the investigation is to draw conclusions about the population;
hence it is sometimes called the target population. The word population does not necessarily refer
to people.
Examples:
All clients of a telephone company
Limitations of Statistics
The field of statistics, though widely used in all areas of human knowledge and widely applied in
a variety of disciplines such as engineering, economics and research, has its own limitations. Some
of these limitations are:
a) It does not deal with individual values: as discussed earlier, statistics deals with aggregates of
facts. For example, the wage earned by an individual worker at any one time, taken by itself, is not
statistics.
b) It does not deal with qualitative characteristics directly: statistics is not applicable to
qualitative characteristics such as beauty, honesty, poverty, standard of living and so on, since these
cannot be expressed in quantitative terms. These characteristics can, however, be dealt with
statistically if quantitative values can be assigned to them according to some logical criterion. For
example, intelligence may be compared to some degree by comparing IQs or other scores on certain
intelligence tests.
c) Statistical conclusions are not universally true: unlike the natural sciences, statistics is not
an exact science, so statistical conclusions are true only under certain assumptions.
d) It can be misused: statistics cannot be used to full advantage in the absence of proper
understanding of the subject matter.
1.5 Levels of Measurement
Proper knowledge about the nature and type of data to be dealt with is essential in order to specify
and apply the proper statistical method for their analysis and inferences.
Scale Types
Measurement is the assignment of values to objects or events in a systematic fashion. Four levels
of measurement scales are commonly distinguished: nominal, ordinal, interval and ratio, and each
possesses different properties of measurement systems. The first two are qualitative while the last
two are quantitative.
Nominal scale: The values of a nominal attribute are just different names, i.e., nominal attributes
provide only enough information to distinguish one object from another. They are qualities with no
ranking or ordering and no numerical or quantitative value. Data of this type consist of names, labels
and categories. This is a scale for grouping individuals into different categories.
Example 1.3: Eye color (brown, black, etc.), sex (male, female).
In this scale, one value is simply different from another.
Arithmetic operations (+, -, *, ÷) are not applicable, and order comparisons (<, >) are impossible;
values can only be checked for being the same or different (=, ≠).
Ordinal scale: - defined as nominal data that can be ordered or ranked.
The values can be arranged in some order, but the differences between the data values are
meaningless.
Data consisting of an ordering or ranking of measurements are said to be on an ordinal
scale of measurement. That is, the values of an ordinal scale provide enough information
to order objects.
One value is different from, and greater/better/less than, another.
Arithmetic operations (+, -, *, ÷) are impossible, comparison (<, >, ≠, etc.) is possible.
Example 1.4: Letter grading (A, B, C, D, F), rating scales (excellent, very good, good, fair,
poor), military rank (general, colonel, lieutenant, etc.).
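A minimal Python sketch of what an ordinal scale allows, using the rating scale from Example 1.4. The names RATINGS, RANK and is_better are illustrative, and the rank numbers are arbitrary codes that carry order only; their differences have no meaning.

```python
# Rating scale from Example 1.4, listed from lowest to highest
RATINGS = ["poor", "fair", "good", "very good", "excellent"]
RANK = {label: position for position, label in enumerate(RATINGS)}

def is_better(a, b):
    """Order comparison is meaningful on an ordinal scale."""
    return RANK[a] > RANK[b]

print(is_better("excellent", "good"))  # True
print(is_better("fair", "good"))       # False

# Sorting relies only on order, so it is also legitimate
responses = ["good", "poor", "excellent", "fair"]
ordered = sorted(responses, key=RANK.get)
print(ordered)  # ['poor', 'fair', 'good', 'excellent']
```

Subtracting two rank codes would produce a number, but it would not say how much better one rating is than another, which is why arithmetic is avoided here.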
Interval Level: data are defined as ordinal data for which the differences between data values are
meaningful. However, there is no true zero, or starting point, and the ratios of data values are
meaningless. Note: Celsius and Fahrenheit temperature readings have no meaningful zero, so their
ratios are meaningless.
In this measurement scale:-
One value is different from, and better/greater than, another by a certain amount of difference.
Possible to add and subtract. For example, 80°C - 50°C = 30°C and 70°C - 40°C = 30°C.
Multiplication and division are not meaningful. For example, 60°C = 3(20°C) arithmetically, but this
does not imply that an object at 60°C is three times as hot as an object at 20°C.
Most common examples are: IQ, temperature.
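The temperature example can be checked numerically. The sketch below converts the same readings to Fahrenheit and to Kelvin (a scale with an absolute zero) using the standard conversion formulas; the function and variable names are illustrative. Equal differences stay equal after conversion, but the ratio 60/20 = 3 does not survive a change of scale.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

def c_to_k(celsius):
    """Convert degrees Celsius to kelvins."""
    return celsius + 273.15

# Differences: equal in Celsius, and still equal in Fahrenheit
diff_1 = c_to_f(80.0) - c_to_f(50.0)
diff_2 = c_to_f(70.0) - c_to_f(40.0)
print(diff_1, diff_2)  # 54.0 54.0

# Ratios: depend on where the scale places its zero
hot_c, cool_c = 60.0, 20.0
ratio_c = hot_c / cool_c                    # 3.0
ratio_f = c_to_f(hot_c) / c_to_f(cool_c)    # 140 / 68, about 2.06
ratio_k = c_to_k(hot_c) / c_to_k(cool_c)    # 333.15 / 293.15, about 1.14
print(ratio_c, round(ratio_f, 2), round(ratio_k, 2))
```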
Ratio scale: Similar to interval, except that there is a true zero (absolute absence), or starting
point, and the ratios of data values have meaning.
Arithmetic operations (+, -, *, ÷) are applicable. For ratio variables, both differences and
ratios are meaningful.
One value is different from, and larger/taller/better/less than, another by a certain amount of
difference and by so many times.
This measurement scale provides more information than the interval scale of measurement.
Example 1.5: weight, age, number of students.
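By contrast with the interval scale, a ratio on a ratio scale does not depend on the unit used. The following sketch uses two hypothetical weights (80 kg and 40 kg, chosen only for illustration) and an approximate kilogram-to-pound conversion factor.

```python
import math

KG_TO_LB = 2.20462  # approximate conversion factor

# Hypothetical weights, for illustration only
heavy_kg, light_kg = 80.0, 40.0

ratio_kg = heavy_kg / light_kg
ratio_lb = (heavy_kg * KG_TO_LB) / (light_kg * KG_TO_LB)

# Both units share the same true zero (no weight), so the ratio is preserved
print(ratio_kg)                          # 2.0
print(math.isclose(ratio_kg, ratio_lb))  # True
```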