1.STA 112 Session 1
MEANING OF STATISTICS
In its original conception, statistics was the collection of data on the
population and on socio-economic activities vital to a nation, state or province.
A nation requires, for example, information on the number of taxable adults in
order to project its total income reliably. In this century, statistics has
advanced far beyond that narrow conception. In general, the word “statistics”
has three possible connotations, depending on the context of application: as a
subject, as a piece of information, and as a mathematical function (or entity).
TYPES OF STATISTICS
Like almost all fields of study, statistics has two aspects: theoretical and
applied. Theoretical, or mathematical, statistics deals with the development,
derivation and proof of statistical theorems, formulas, rules and laws.
Applied statistics involves the application of those theorems, formulas, rules and
laws to solve real-world problems. This course book is concerned with
applied statistics and not with theoretical statistics. By the time you finish
studying this book, you will have learned how to think statistically and to make
educated, scientific guesses.
STATISTICAL VARIABLE
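A quantitative variable is one that can assume numerical values. Examples are
height, age, weight, parity, the number of students in STA 112 and STA 113, and
so on. Quantitative variables can be further classified into two major groups:
discrete variables, which take countable values (for example, parity or the
number of students), and continuous variables, which can take any value within a
range (for example, height or weight).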
A qualitative variable is one that cannot be characterized by numerical values.
Examples are educational status, marital status, gender and so on. Qualitative
variables can be further classified into two groups: (i) those with only two
possible outcomes, such as No or Yes, Dead or Alive, Present or Absent and so
on; and (ii) those with more than two possible outcomes. For example, marital
status takes the values never married, married, divorced or widowed.
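The classification of variables can be sketched in code. The following Python is a minimal illustration and not part of the course material: the function name `classify_variable` and the sample values are assumptions made up for the example. It labels a variable quantitative when every observed value is a number, and otherwise qualitative, separating two-outcome variables from those with more than two outcomes.

```python
from numbers import Number


def classify_variable(values):
    """Classify a variable from its observed values (illustrative helper)."""
    # bool is a subclass of int in Python, so exclude it explicitly
    is_numeric = all(
        isinstance(v, Number) and not isinstance(v, bool) for v in values
    )
    if is_numeric:
        return "quantitative"
    # Qualitative: count the distinct outcomes observed
    outcomes = set(values)
    if len(outcomes) <= 2:
        return "qualitative (two outcomes)"
    return "qualitative (more than two outcomes)"


# Hypothetical observations, made up for illustration
heights_cm = [160.5, 172.0, 181.3]
present = ["Yes", "No", "Yes"]
marital_status = ["never married", "married", "divorced", "widowed"]

print(classify_variable(heights_cm))
print(classify_variable(present))
print(classify_variable(marital_status))
```

Note that this sketch judges a variable by the values actually observed. A variable with many possible outcomes would be labelled two-outcome if only two of them happen to appear in the sample, so in practice the set of possible outcomes should come from the variable's definition.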
DATA SETS
EXAMPLES
(i) The financial section of a daily paper contains price data for securities
and commodities.
(ii) An economic report shows inflation rates for different countries.
(iii) A newspaper article contains data on achievement test scores for all
schools in a city.
Consider the following table showing 2006 sales revenues and industrial
classification for the companies included in a study:
Company   2006 sales revenue   Industrial classification
ACC       243.9                Chemicals
The data set in the above table represents a collection of facts and figures. More
technically, the data set contains observations on the variables of interest for
the elements in the data set.
A data set represents a collection of elements, and for each element information
on one or more characteristics of interest is included in the data set. In the above
table, the element is a company and the characteristics of interest are the
company’s 2006 sales revenue and industrial classification.
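The vocabulary of elements, variables and observations can be made concrete with a short sketch. The Python below is an illustration only: the field names `company`, `sales_2006` and `industry` are assumed names, and the single record is the one table row given above (ACC, 243.9, Chemicals). The unit of the sales figure is not stated in the source.

```python
# A data set as a list of elements; each element holds one
# observation per variable. Field names are assumed for illustration.
data_set = [
    {"company": "ACC", "sales_2006": 243.9, "industry": "Chemicals"},
]

# The elements are the companies
elements = [record["company"] for record in data_set]

# The variables are the characteristics recorded for each element
variables = [name for name in data_set[0] if name != "company"]

# Total observations = number of elements x number of variables
n_observations = len(data_set) * len(variables)

print(elements)
print(variables)
print(n_observations)
```

Here `sales_2006` is a quantitative variable and `industry` is a qualitative one, so a single data set can mix both kinds of variable.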
DATA SOURCES
Statistics is not only concerned with organizing and analyzing data once they are
assembled, but also with the sources of data and how data are collected for
study. The initial stage of any investigation involves a specification or definition
of the problem to be studied. From this specification comes an identified need for
particular types of data to clarify the problem. At this point, the question of
where to obtain the necessary data is posed.
ILLUSTRATION
SECONDARY SOURCE: A store location analyst for a retail outlet chain who is
concerned with crime patterns in different cities might find the necessary
data in external sources such as government reports on criminal and judicial
statistics or computerized police records.
REMARK: Primary data are data collected directly from the field of inquiry, that
is, data collected by means of surveys or censuses, whereas secondary data are
data obtained from the published or unpublished results or reports of
organisations.
LIMITATIONS IN THE USE OF DATA SOURCES
When using any data source, the user should be thoroughly acquainted with the
nature and limitations of the data. Limitations may include imperfect or
improper methods of data collection, recording and classification as well as
errors of omission or commission when data are transferred from one record to
another.
The user needs to determine whether the definitions employed in compiling the
data are appropriate for the purpose. It is also important to check whether
changes in concepts, definitions and data-collection methods have occurred over
the time period of interest and, if so, to determine the effect of these changes
on the data.
The user should have access to pertinent information and explanatory comments
about the manner in which the data were compiled, the data’s accuracy and how
the data should be interpreted.