Introduction To Statistics
Geography
1.1 Introduction
This introductory section defines statistics and describes its types. It also sets out the
purpose of statistical analysis, especially the meaning of statistical significance in
geography, and demonstrates that even proper use of statistics does not guarantee useful results.
The section also introduces a number of new terms, concepts and symbols that are used often
in statistics.
1.3 Definition
Webster’s New Dictionary gives two definitions of the word statistics:
a) Facts or data of a numerical kind, assembled, classified and tabulated so as to present
significant information about a given subject.
b) The science of assembling, classifying and tabulating such facts or data.
However, statistics entails much more than what the above definitions can convey. Statisticians
assemble, classify and tabulate data. They also analyze data in order to make generalizations
and decisions (Weiss 1996). For example, a city council may decide where to locate a dumping
site based on environmental impact assessment statements, which include a variety of statistical
data.
According to Levin (1978), statistics is the application of methods used in collecting, processing,
summarizing and analyzing data. The figures that result from statistical analysis and summary,
such as the mean, mode, median and standard deviation, among others, are also referred to as
statistics.
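As a minimal sketch of the summary figures named above, the following Python example computes the mean, median, mode and standard deviation with the standard-library statistics module. The rainfall values and the variable names are hypothetical, invented purely for illustration.

```python
import statistics

# Hypothetical data: monthly rainfall totals (mm) at one station.
# These values are made up for illustration only.
rainfall_mm = [12, 30, 45, 30, 80, 95, 30, 20]

mean_value = statistics.mean(rainfall_mm)      # arithmetic mean
median_value = statistics.median(rainfall_mm)  # middle value of the sorted data
mode_value = statistics.mode(rainfall_mm)      # most frequent value
stdev_value = statistics.stdev(rainfall_mm)    # sample standard deviation (n - 1 denominator)

print("mean:", mean_value)
print("median:", median_value)
print("mode:", mode_value)
print("standard deviation:", round(stdev_value, 2))
```

Each printed figure is itself a "statistic" in the second sense described above: a number that summarizes the collected data.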
Inferential statistics consists of methods for drawing conclusions about a population, and for
measuring the reliability of those conclusions, based on information obtained from a sample of
the population. Opinion polling on voting preferences provides a good example of inferential
statistics. It would be expensive and unrealistic to interview all Kenyans on their voting
preferences in a coming General Election. Statisticians who want to gauge the sentiment of all
Kenyan voters can afford to interview only a carefully chosen group of a few thousand, or tens
of thousands, of voters. This group is referred to as a sample of the Kenyan voters. The
statisticians can then analyze the information obtained from the sample to draw conclusions
about the preferences of the entire voting population. Inferential statistics thus provides
methods for making such inferences.
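The polling example can be sketched in Python under some loudly stated assumptions: the population is simulated, the "true" level of support (55%), the population size and the sample size are all hypothetical, and the sample is a simple random sample. The 95% interval uses the usual normal approximation for a proportion; it is one common way to express reliability, not the only one.

```python
import math
import random

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population of voters: 1 = prefers candidate A, 0 = does not.
# TRUE_SUPPORT is an assumption for the simulation; a real pollster never knows it.
TRUE_SUPPORT = 0.55
POPULATION_SIZE = 200_000
population = [1 if random.random() < TRUE_SUPPORT else 0
              for _ in range(POPULATION_SIZE)]

# Interview only a sample of a few thousand voters (simple random sample).
SAMPLE_SIZE = 2_000
sample = random.sample(population, SAMPLE_SIZE)

# Sample proportion: the estimate of support in the whole population.
p_hat = sum(sample) / SAMPLE_SIZE

# Approximate 95% confidence interval (normal approximation).
standard_error = math.sqrt(p_hat * (1 - p_hat) / SAMPLE_SIZE)
margin = 1.96 * standard_error
low, high = p_hat - margin, p_hat + margin

print(f"estimated support: {p_hat:.3f}")
print(f"approximate 95% interval: {low:.3f} to {high:.3f}")
```

The interval is the "measure of reliability" part of inferential statistics: it states how far the sample estimate might plausibly be from the population value.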
ii) Statistical methods enable geographers to establish precisely the form, nature and degree of
relationships between spatially covariant phenomena. In their simplest form the methods are
concerned with processes that always produce identical results in identical circumstances.
For instance, armed with this kind of information geographers should be able to
predict the relationship between temperature and altitude at a certain location. It should,
however, be noted that very few geographical processes are deterministic in nature. As noted
by Edbon (1978), they frequently behave in different ways at different times, and one can hardly
predict with certainty the outcome of a process, even under carefully controlled conditions.
iii) The techniques minimize subjective judgment, thus increasing objectivity and precision
in explaining the spatial trends of geographical distributions and relationships. Note that most
geographical data is obtained from samples rather than from the whole population.
In such cases it is assumed that the sample is representative of the population, or universe,
from which it has been drawn. This is where inferential statistics is applied: within certain
defined limits, one can make statements about the characteristics of a population based only
on data collected from a sample (refer to voting preferences under types of statistics above).
iv) Testing for statistical significance is one of the most powerful uses of statistics. It helps
to decide whether an observed relationship between two sets of sample data is statistically
significant. Statistical significance is concerned with whether an observed difference, for
instance between two samples, reflects a real difference within the population from which the
samples are drawn. In other words, is the difference appearing in the samples merely a result
of chance in the sampling procedure?
v) The statistical techniques help geographers to relate geographical studies to other scientific
disciplines. For instance, the use of hypotheses to be tested by statistical methods encourages
geographers to have original thoughts (Ogonda 1991). It also results in contributions to a wider
community of interest for the purpose of planning and development. Reasoning and judgment
are exercised, and knowledge gained from one situation can be applied to another discipline of
a different nature with similar results. Statistical techniques thus hold an important position
in connecting geographical models, theories and concepts with the physical and human
environment in which we live.
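The question raised under point iv), whether a difference between two samples could be merely a result of chance, can be illustrated with a permutation test. This is one of several possible significance tests and is chosen here only because it needs no formulas beyond the mean. The function name permutation_test, the temperature readings and the site labels are all hypothetical.

```python
import random
import statistics


def permutation_test(sample_a, sample_b, n_permutations=5000, seed=0):
    """Return (observed difference in means, approximate two-sided p-value).

    The p-value is the share of random relabellings of the pooled data whose
    difference in means is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(statistics.mean(sample_a) - statistics.mean(sample_b))
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # pretend the group labels were assigned by chance
        diff = abs(statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    return observed, at_least_as_extreme / n_permutations


# Hypothetical mean daily temperatures (degrees Celsius) at two groups of sites.
lowland_sites = [24.1, 25.3, 23.8, 26.0, 24.7, 25.5]
highland_sites = [18.2, 17.5, 19.1, 18.8, 17.9, 18.4]

difference, p_value = permutation_test(lowland_sites, highland_sites)
print(f"observed difference in means: {difference:.2f}")
print(f"approximate p-value: {p_value:.4f}")
```

A small p-value suggests the observed difference would rarely arise from chance relabelling alone; a large one suggests the difference could easily be a sampling accident.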
Symbols
Σ = The symbol Σ (the upper-case version of the Greek letter sigma) implies the operation
of summing. If x stands for a variable, then Σx means “sum all the observations” of x.
x̄ = The sample mean, obtained by summing all the observed values of x and dividing by
the number of observed values. The formula for this is x̄ = Σx/n,
where n is the number of observed values.
We pronounce x̄ as “x bar”.
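The sample mean formula (the sum of all observed values of x, divided by the number n of observed values) can be written out step by step in Python. This is only a sketch: the function name sample_mean and the example altitudes are hypothetical.

```python
def sample_mean(observations):
    """Compute the sample mean: the sum of the observations divided by n."""
    n = len(observations)        # n: the number of observed values
    if n == 0:
        raise ValueError("the sample mean is undefined for an empty sample")
    total = sum(observations)    # the summation step: add up every observed x
    return total / n             # "x bar"


# Hypothetical altitudes (metres) of four survey points.
altitudes_m = [1500, 1700, 1600, 1800]
print(sample_mean(altitudes_m))
```

Writing the summation and the division as separate steps mirrors the two symbols defined above: the summing operation and the resulting mean.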
1.8 Summary
• Statistics has been defined as numerical facts pertaining to a body of objects or people, or
as a science that deals with methods of data collection, analysis and drawing of inferences.
• It is possible to perform a descriptive study on a sample as well as on a population. It is
only when an inference is made about the population based on information from the
sample that the study becomes inferential. In other words, descriptive and inferential
statistics are interrelated.
• It is almost always necessary to employ techniques of descriptive statistics to organize
and summarize the information from the sample before carrying out any inferential
analysis. Moreover, the preliminary descriptive analysis of a sample often reveals
features that lead to the choice of, or reconsideration of the choice of the appropriate
inferential method.
1.9 Exercise
(1) Using a relevant illustration, explain the relationship between a sample and a population.
(3) Kenya Planters Cooperative Union (KPCU) appointed two companies to audit its annual
financial accounts for the year 2001. The two companies arrived at different
conclusions, and yet they used the same data. In the light of the above statement,
comment on how statistics might have been misused.