Topic 1 SSC201
COURSE OUTLINE
Topic 1: Nature and Objectives of Statistics
Topic 2: Data Organization and Presentation
Topic 3: Sample and Sampling Techniques
Topic 4: Measures of Central Tendency/Partition
Topic 5: Measures of Dispersion
Topic 6: Elementary Probability Theory
Topic 7: Random Variable and Probability Distribution
Topic 8: Hypothesis Testing
Recommended Text:
Schaum’s Outlines: Statistics by Spiegel, M.R. and Stephens, L.J. (Fourth or Sixth Edition)
PDE715: STATISTICAL TECHNIQUES IN ECONOMICS
Department of Economics
Faculty of Social Sciences
Obafemi Awolowo University
Ile-Ife, Nigeria
Introduction
It is a well-known fact that decision-making is crucial to every human being. Many decisions have to
be made to keep life going, and these decisions are often based on numerical information called
DATA. The data can be supplied or implied and can be used to assess different outcomes. The aspect
of making decisions based on numerical information is called STATISTICS.
There are many ways in which statistics can be defined according to different scholars:
• Statistics is a science which deals with the collection, analysis, and interpretation of numerical data.
• Statistics is concerned with methods for treating numerical data, collected through observation in
the form of measurements or counts, so that meaningful conclusions can be drawn from such data.
• In the real sense, statistics is a science that deals with the collection, compilation, analysis,
summarisation, presentation, and interpretation of data.
There could be many more definitions by scholars. We notice three important facts common to these
definitions:
• Statistics is concerned with the technique by which information is collected, organised, and
interpreted.
• Most information analysed is quantitative data collected from the process of sampling.
• The essence of statistical interpretation of processed data is decision-making under the condition
of uncertainty.
In summary, statistics can be referred to as the collection, organisation, presentation, analysis, and
interpretation of numerical data with the aim of drawing valid conclusions and making reasonable
decisions on the basis of the analysis.
From the above definition, it follows that statistics involves the following stages:
i. Collection of data: This can be done either by the investigator himself/herself or by another person.
Data obtained by the investigator himself/herself are known as primary data, while those collected
from another source are known as secondary data. Sources of primary data are interviews,
questionnaires, observations, and experiments, while sources of secondary data include the internet,
books/magazines, newspapers, Central Banks' statistical bulletin, annual abstract of statistics by IMF
(International Monetary Fund), and World Development Indicators (WDI) by the World Bank. Generally,
there are two types of errors in data collection:
Response Error: this can occur when the respondents supply wrong information to the investigator
or when the enumerator misquotes the response of the respondent.
Non-Response Error: this may be partial or total. It is total when the respondents refuse to answer
any of the questions asked, and partial when only some of the questions are answered.
ii. Organisation of data: There is a need for proper organisation of data before such data can be used.
This is because data collected are generally in an unorganised form, especially when they have been
collected from different sources. Organisation can be done by editing and classifying the data
according to some characteristics possessed by the variables that constitute the data.
iii. Presentation of data: After the data have been organised, the next step is to present them. Data
can be presented using tables, charts, and diagrams.
iv. Analysis of data: The aim of analysing data is to obtain information useful for decision-making.
There are different methods of analysing data, ranging from simple observation of the data to highly
sophisticated estimation techniques. However, only the most commonly used methods will be discussed.
v. Interpretation of data: The last stage in statistical investigation is interpretation, that is, drawing
conclusions from the data collected and analysed. If the data analysed are not properly interpreted, the
whole objective of the investigation may be defeated and fallacious conclusions may be drawn. Only
correct interpretation will lead to valid conclusions.
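The five stages above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the questionnaire responses, the field names (`age`, `income`), and the helper `classify_non_response` are all hypothetical and are not taken from any real survey.

```python
from statistics import mean

# Collection: hypothetical questionnaire responses (primary data).
# None marks a question the respondent did not answer.
responses = [
    {"id": 1, "age": 34, "income": 52000},
    {"id": 2, "age": None, "income": 41000},   # partial non-response
    {"id": 3, "age": None, "income": None},    # total non-response
    {"id": 4, "age": 45, "income": 67000},
]

QUESTIONS = ("age", "income")


def classify_non_response(record):
    """Return 'complete', 'partial', or 'total' for one respondent."""
    missing = sum(record[q] is None for q in QUESTIONS)
    if missing == 0:
        return "complete"
    return "total" if missing == len(QUESTIONS) else "partial"


# Organisation: edit the raw data by classifying each record.
status = {r["id"]: classify_non_response(r) for r in responses}

# Presentation: a simple frequency table of response status.
table = {}
for s in status.values():
    table[s] = table.get(s, 0) + 1

# Analysis: average income among respondents who reported one.
incomes = [r["income"] for r in responses if r["income"] is not None]
average_income = mean(incomes)

# Interpretation is left to the investigator: with one total and one
# partial non-response out of four, any conclusion should be cautious.
print(status)
print(table)
print(average_income)
```

Keeping the stages as separate steps makes it easy to see where a response or non-response error would enter and which later results it would affect.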
It should be clear that statistics is much more than just the tabulation of numbers and the graphical
presentation of these tabulated numbers. Statistics is the science of gaining information from
numerical (quantitative) and categorical (qualitative) data.
As noted above, Statistics is a body of procedures and techniques used to collect, present, and
analyse data, with applications in practically every profession today. This implies that statistics
can be used in various fields.
Statistical methods are widely used in Economics. For instance, an economist may use them to assess
the impact of government expenditure on economic growth. A doctor may use them to describe a
patient's condition or the effectiveness of a treatment. A social psychologist may use them to
summarise, analyse, or measure peer pressure among teenagers and to interpret its causes. A college
professor may use them to evaluate how much his/her students like or dislike his/her course, thereby
helping him/her to do better on the job. A quality control manager may use them to ascertain whether
or not the required quality standards are met.
Statistics is useful in the following ways:
1. It helps to gain precise knowledge about a particular topic since it quantifies information.
2. It helps in formulating suitable policies as well as testing hypotheses.
3. It can be used to predict a future event.
4. It helps the government in allocating resources efficiently.
5. It enables comparison to be made among various types of data.
6. It is used for describing data and making inferences.
7. It shows the interrelationship between variables in a clearer form and provides a good measure of
comparison.
8. It helps in understanding numerical information in a more readable form.
In summary, statistical techniques provide useful means of informed and unbiased decision-making in
the face of uncertainty in Economics, other social sciences, and other fields such as Medicine,
Agriculture, Biological Sciences, and the Physical Sciences. Indeed, the knowledge of the subject,
Statistics, enables us to get a picture of a problem where precise measurements or observations are
difficult to make or where events are not easily predictable.
Types of Statistics
Statistics is sub-divided into two: Descriptive and Inferential Statistics.
Descriptive Statistics: This is the aspect of statistics that describes and summarises a set of data in
order to make them readily comprehensible. How does descriptive statistics summarise data? Data
summarisation is done by finding one or more pieces of information that characterise the whole data set.
Descriptive statistics includes the construction of graphs, charts, tables, and the calculation of various
descriptive measures. Among the quantitative summary values are averages (mean, median, and mode) and
measures of dispersion. For instance, suppose we have data on the incomes of 1000 Nigerian families.
The body of data can be summarised by finding the average family income and by finding the spread
of these family incomes above or below the average. Again, how does descriptive statistics describe a
body of data? This is done by representing the data in forms such as tables, charts, and graphs
showing the number of families in each income class.
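As a minimal sketch of the family-income example, the snippet below summarises a small data set with an average, a median, and a measure of spread, and then describes it with a frequency table by income class. The ten income values, the class width of 100, and the helper `income_class` are assumptions made for illustration; they stand in for the 1000 families mentioned above.

```python
from collections import Counter
from statistics import mean, median, pstdev

# Hypothetical monthly incomes (in thousands of naira) for ten families.
incomes = [40, 55, 60, 60, 75, 80, 90, 120, 150, 270]

# Summarising: one value for the centre and one for the spread.
average = mean(incomes)
middle = median(incomes)
spread = pstdev(incomes)  # standard deviation, treating the ten families as the whole data set


def income_class(x, width=100):
    """Label the income class that x falls into, e.g. '0-99' (width is an assumption)."""
    lower = (x // width) * width
    return f"{lower}-{lower + width - 1}"


# Describing: number of families in each income class.
frequency = Counter(income_class(x) for x in incomes)

print("mean:", average)
print("median:", middle)
print("standard deviation:", round(spread, 2))
for label, count in sorted(frequency.items()):
    print(label, count)
```

The frequency table printed at the end is the kind of tabulation from which a bar chart or histogram of income classes could be drawn.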
Inferential Statistics: This is the process of reaching generalisations about the whole (called the
population) by examining a portion (called a sample). Inferential statistics includes those techniques by which
decisions about the statistical population can be made without observing or measuring all elements in
the population. In sum, inferential statistics consists of methods for drawing conclusions about a
population based on the information obtained from a sample of the population. Typically, inferential
statistics makes use of a random sample as the basis for statistical inference. It should be noted that
inferential statistics has two aspects: (a) Estimation and (b) Hypothesis testing. Furthermore,
inferential statistics involves inductive reasoning, that is, making generalisations based on known
facts. For such inferences to be valid, two conditions are required:
• The sample must be representative. This is to say that the sample must fully reflect the
characteristics and properties of the population from which it is drawn.
• The probability of error must be specified, since the possibility of error always exists in statistical
inference. An estimate or test of population properties or characteristics should be given together
with the chance or probability of being wrong. Thus, probability theory is an essential element in
statistical inference.
Consider again the sample of 1000 Nigerian families above. Definitely, there are more than
1000 families in Nigeria. If these 1000 families are representative of all Nigerian families, we
can estimate and test hypotheses about the average family income in Nigeria as a whole.
However, since these conclusions are subject to error, we would also have to indicate the
probability of error.
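The idea of estimating a population average from a sample and stating the probability of error can be sketched as follows. Everything here is an assumption for illustration: the "population" is synthetic (generated from a log-normal distribution chosen arbitrarily), and the interval uses the usual normal approximation with a 5% probability of error.

```python
import random
from statistics import NormalDist, mean, stdev

random.seed(42)  # fixed seed so the sketch is reproducible

# Synthetic "population" of 100,000 family incomes (purely illustrative values).
population = [random.lognormvariate(11, 0.6) for _ in range(100_000)]

# Simple random sample of 1000 families, drawn without replacement.
sample = random.sample(population, 1000)

n = len(sample)
sample_mean = mean(sample)                 # estimate of the population mean
standard_error = stdev(sample) / n ** 0.5  # estimated standard error of the mean

# Normal-approximation 95% confidence interval; 5% is the stated probability of error.
z = NormalDist().inv_cdf(0.975)
lower = sample_mean - z * standard_error
upper = sample_mean + z * standard_error

print(f"estimated mean income: {sample_mean:.0f}")
print(f"95% confidence interval: ({lower:.0f}, {upper:.0f})")
```

The interval is the estimate together with its probability of error: intervals built this way from repeated random samples would be expected to miss the true population mean about 5% of the time.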
Common Terms in Statistics
• Observation: In Statistics, an observation refers to the thing being observed. There could be an
observation about any object such as height, weight, plot, etc. The numerically recorded
observations, which are referred to as data, are the raw materials with which statisticians work.
• Population: In Statistics, a population is the entire collection of individuals, objects, or items,
living or non-living, that are to be observed in a given problem situation. Consider a single toss of
a coin. There are two outcomes, Head (H) and Tail (T). Hence, the population consists of (H, T).
In throwing a Ludo die, the population is (1,2,3,4,5,6). It should be noted that a population
could be finite or infinite. The number of students registered for SSC 201 is a finite population
because the counting process can end. On the other hand, the population of stars in the sky is
treated as infinite.
• Census: This is the complete enumeration of a target population together with the collection of
important information on every element of the population. For instance, a census of people in Lagos
State will consist of the enumeration of everybody (children and adults) and other vital information
on all of them, such as sex, marital status, age, qualification, etc. Data from complete enumeration
are free from sampling error, but a census is time-consuming and expensive.
• Variable: is a feature possessed by the members of a population, e.g. age, weight, height, etc. A
variable may take on different values, which may be integers or any kind of real numbers. A
variable can be discrete or continuous. A discrete variable takes on countable values, each of
which can be identified exactly, e.g. the number of oranges, size of family, size of shoe, etc. A
continuous variable can assume any value within a given interval. It can take any real number, so
its value cannot be counted exactly. Hence, a continuous variable can only be measured. We usually
approximate or estimate its value, e.g. weight, speed, time, distance, height, etc.
• Parameter: is a feature of a population that helps to summarise information about the population
with regard to the variable under study, e.g. the mean and standard deviation of the population. A
parameter is to a population what a statistic is to a sample.
• Statistic: is a feature of a sample drawn from the population e.g. sample mean and standard
deviation. Indeed, it is a value or quantity obtained from a sample for the purpose of estimating a
population parameter.
• Data: These are the recorded observations made on a sample. They are raw facts about an
event that become information when processed. They may be in the form of numeric values (i.e.
quantitative data) or non-numerical perceptions or observations (i.e. qualitative data) made by man
or machine. Numeric data consist of values that can be quantified. They are generated by counting
and/or measuring. Numeric data may be discrete or continuous. Discrete data are defined as data
that can only take certain values (e.g. the number of students in a class). Continuous data are
defined as data that can take any value within a range (e.g. weight, distance, and speed). Non-numeric data are
those that have no numerical value given to them, that is, they cannot be quantified. Examples are
sex, socio-economic status, age group, race, etc. They can be classified as ordinal or
categorical data. Ordinal data have values that can be put on an ordinal scale, e.g. age group and
wrestlers’ weight class. Categorical data can only be put into categories, e.g. sex, marital status,
region, race, socio-economic status, etc. For instance, sex has two categories (female and male),
while socio-economic status has high, medium, and low socio-economic status.
• Information: refers to evaluated, validated, or processed data. Therefore, data that have been
processed into a useful form are known as information. For instance, a class average score computed
from examination grades provides useful information. To obtain such information, the
examination scores must be processed by calculating the class average score. Information is very
important for decision-making, and gathered data will not be useful until they are processed.
• Sample: is a subset of the population observed for the purpose of making scientific inferences,
such as generalisations or conclusions, about the population. Recall that in order for statistical
inference to be valid, it must be based on a representative sample, that is, one that fully reflects
the characteristics and properties of the population from which it is drawn. A representative sample
is usually obtained by random sampling, whereby each element of the population has an equal chance
of being included in the sample.
• Sampling Frame: contains the basic details of all members of the population from which samples
are to be drawn. Statisticians believe that without a complete sampling frame, a truly random
sample cannot be selected. Examples of sampling frames include a voters' register, a telephone
directory, and so on.
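Several of the terms above (sampling frame, population, parameter, sample, statistic) can be tied together in a short sketch. The register of 200 students and their heights is invented for illustration, and the names `sampling_frame`, `population_mean`, and `sample_mean` are hypothetical.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical sampling frame: a register listing every member of a finite
# population, here 200 students with heights recorded to the nearest cm.
sampling_frame = {f"student_{i:03d}": random.randint(150, 190) for i in range(1, 201)}

# Parameter: computed from the whole population (this would require a census).
population_mean = mean(list(sampling_frame.values()))

# Simple random sample: every member of the frame has an equal chance of selection.
chosen = random.sample(sorted(sampling_frame), 30)

# Statistic: computed from the sample and used to estimate the parameter.
sample_mean = mean([sampling_frame[name] for name in chosen])

print(f"parameter (population mean height): {population_mean:.1f} cm")
print(f"statistic (sample mean height):     {sample_mean:.1f} cm")
```

Without the complete register there would be no way to give every student an equal chance of selection, which is why the sampling frame matters for random sampling.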
Reasons why a sample is preferred to a population in most statistical enquiries and analysis
1. Analysis based on a representative sample can be almost as precise as that based on the entire population
2. Use of a sample is time-saving and cost-minimizing in terms of human and material cost
3. Use of the population to obtain some of its parameters may not be feasible, i.e. not practicable,
especially with an infinite population (i.e. a population whose number is too large to be known) or
when the observation process is disruptive or destructive, e.g. in testing the efficacy of a new
vaccine or new raw materials in production.
4. Sampling is faster than a complete enumeration of the population.
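The first two reasons can be illustrated with a small simulation. This is a sketch under stated assumptions, not a proof: the population is synthetic (normally distributed values chosen arbitrarily), and the helper `average_abs_error` is hypothetical. It compares how far the sample mean typically falls from the population mean for random samples of different sizes.

```python
import random
from statistics import fmean

random.seed(7)  # fixed seed so the sketch is reproducible

# Synthetic population of 50,000 values (illustrative only).
population = [random.gauss(50, 10) for _ in range(50_000)]
true_mean = fmean(population)


def average_abs_error(sample_size, repeats=200):
    """Average distance between the sample mean and the population mean."""
    errors = []
    for _ in range(repeats):
        sample = random.sample(population, sample_size)
        errors.append(abs(fmean(sample) - true_mean))
    return fmean(errors)


errors_by_size = {n: average_abs_error(n) for n in (25, 100, 400)}

for n, err in errors_by_size.items():
    print(f"sample size {n:>3}: average error {err:.2f}")
```

Even the largest sample here covers less than 1% of the population, yet its mean typically lands close to the population mean, which is why a well-drawn sample is usually preferred to a costly complete enumeration.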