Statistics For Computing COM 114
INTRODUCTION
We all make decisions in one way or another. The decision may be about what to buy or
not to buy, the course of study to take at the Polytechnic, the categories of personnel
needed in an establishment, the level of stock an establishment can hold, the relationship
between the costs and methods of production in industry, or when, where and how to
establish new health centres, schools, prisons, industrial sites etc.
However, making a decision can be very simple or very difficult, depending on the
situation at hand. Difficult decisions require numerical information, or statistical data,
before any reasonable and accurate decision can be made.
Numerical information, otherwise known as statistical data, is often preferred in decision
making to any other available information. This is because such information is used in
assessing the possible consequences of the decisions that are made.
1.1 Definition
Primary data are first-hand data generated by the researcher himself or herself for the
sole aim of meeting the researcher’s objectives. For instance, to conduct research on the
spending habits of students on campus, the researcher will have to prepare a
questionnaire (containing questions likely to achieve his or her aim) by himself or
herself.
At a secondary source, the data do not originate from the researcher; rather, the
researcher collects them from an existing source or record. Hence, a secondary source
deals with references, documents and bibliographies recorded by someone else. For
illustration, data obtained
from a University Teaching Hospital, the Central Bank of Nigeria, the World Bank and the
National Bureau of Statistics are secondary data.
The areas where statistical data are observed and analysed are endless. Statistics plays a
vital role in every field of human activity. It holds a central position in the following fields
of study: Business, Economics, Banking, Administration, Auditing, Natural and Social
Sciences, Medicine, Education etc.
(a). Business
Statistics helps a businessman to plan production according to the tastes of customers.
The quality of products can be checked more efficiently using statistical methods.
In fact, a businessman can make sound decisions about the location of the business, the
marketing of the product and financial resources based on the available statistical
information.
(b). Economics
Statistics plays a vital role in Economics. It is used to prepare National Income Accounts,
which are multipurpose indicators for economists. Statistical methods are used for
collecting and analysing data and for testing hypotheses in economic research.
The relationship between demand and supply, inflation rates, exchange rates, per capita
income etc. are problems that require a good knowledge of statistics.
(c). Banking
Banks operate on the principle that not all customers who deposit money with them
withdraw their money at the same time. Banks earn profit, in the form of interest, on the
money they lend to their customers. Banks use statistical methods to estimate the number
of depositors and their claims on a given day.
(d). Administration
The policies of the government of a country are based on statistics, as statistical data are
used in taking administrative decisions. For example, if the government wants to review
the salary scales of employees in view of an increase in the cost of living, statistical
methods will be used to determine the rise in the cost of living.
Also, government budgets and planning depend on statistics, as it helps in estimating the
expected expenditure and revenue from different sources.
(e). Auditing
Marketing firms use statistics to determine the number of people they need to survey,
measure or track. The data obtained are then used to make general assumptions and
predictions about a larger group.
(g). Astronomy
Astronomy deals with the measurement of the distances, sizes, masses and densities of
heavenly bodies by means of observations. In the process of taking these measurements,
errors are unavoidable. Thus, the most probable measurements are obtained using
statistical methods (e.g. the least squares method).
Statistics plays an important role in almost all the Natural and Social Sciences. For
example, statistical methods are commonly used to analyse the results of experiments and
to test their significance in Physics, Chemistry, Biology, Sociology etc.
Statistics is also applied in journalism (e.g. news reports), meteorology (e.g. weather
forecasts), educational research etc. It is also used to measure the strength of the
relationship or association between two or more variables.
iii. Computers help to reveal relationships between data variables and real-world
objects.
1.5 Uses of statistical data
Statistical data is used in many different fields to help researchers, businesses, and
organizations better understand and analyze information. Some of the key uses of
statistical data include:
i. Making predictions and forecasts: Statistical data can be used to make predictions
and forecasts about future trends and events. For example, a business might use
statistical data to predict sales and revenue over the next quarter.
ii. Testing hypotheses: Statistical data is often used to test hypotheses and determine
the likelihood that a particular theory is true. This can be useful for scientific
research, as well as for making business decisions.
iii. Identifying patterns and relationships: Statistical data can be used to identify
patterns and relationships within a dataset. This can help researchers and analysts
gain insight into complex systems and make more informed decisions.
v. Informing policy decisions: Statistical data is often used to inform policy decisions
at the local, state, and national levels. Policymakers can use this data to better
understand the needs of their constituents and make more informed decisions.
1.6 Quantitative data
These are observations measured on a numerical scale; that is, they are in numeric form.
Examples are the scores of students in an examination, the ages of students in a school,
the number of leaves on a tree etc.
1.7 Scales of measurement
There are four basic levels or scales of measurement in statistical analysis. They are:
(i). Nominal scale (ii). Ordinal scale (iii). Interval scale (iv). Ratio scale
(ii). Ordinal scale
This scale places events in order. For instance, 1st position is given to the best student
in a class, 2nd to the second-best student in the same class, and so on.
Observations are said to be measured on an interval scale when the distance between any
two adjacent units of measurement is constant. The observations here are numerical in
nature (quantitative), unlike the first two levels of measurement, which are qualitative.
For example, when we measure temperature in Fahrenheit, the distance from 30° to 40° is
equal to that between 70° and 80°.
This is a quantitative scale with a true zero and equal intervals between neighbouring
points. For instance, if a man is 60 years old while his son is 20 years old, the man is
three times as old as his son.
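The difference between the interval and ratio scales can be checked with a little arithmetic. The sketch below is only an illustration: the helper name `fahrenheit_to_celsius` and the chosen temperatures (30, 40, 70 and 80 degrees Fahrenheit) are assumptions made for this example. It shows that differences are meaningful on an interval scale, that ratios on an interval scale change with the unit used, and that ratios on a ratio scale (age) do not.

```python
def fahrenheit_to_celsius(f):
    """Convert a temperature from Fahrenheit to Celsius."""
    return (f - 32) * 5 / 9

# Interval scale: equal distances anywhere on the scale are comparable.
diff_low = 40 - 30
diff_high = 80 - 70
print("Equal intervals:", diff_low == diff_high)

# Interval scale: ratios are NOT meaningful, because zero is arbitrary.
# 80 F looks like "twice" 40 F, but the same two temperatures in Celsius
# give a completely different ratio.
ratio_f = 80 / 40
ratio_c = fahrenheit_to_celsius(80) / fahrenheit_to_celsius(40)
print("Ratio in Fahrenheit:", ratio_f)
print("Ratio in Celsius:", round(ratio_c, 2))

# Ratio scale: age has a true zero, so the ratio is the same in any unit.
age_ratio_years = 60 / 20
age_ratio_months = (60 * 12) / (20 * 12)
print("Age ratio (years):", age_ratio_years)
print("Age ratio (months):", age_ratio_months)
```

Changing the unit of temperature changes the ratio, while changing the unit of age from years to months leaves the ratio unchanged.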
CHAPTER TWO
(ii) provide adequate data for analyzing the various subpopulations, and
(iii) enable different research methods and procedures to be used in different strata
In stratified random sampling, the population of N units is first divided into
non-overlapping, homogeneous subpopulations (blocks of units, districts, states etc.) of
$N_1, N_2, \ldots, N_L$ units respectively, such that $\sum_{i=1}^{L} N_i = N$.
These subpopulations are called strata. Within each stratum, a simple random sample
without replacement of size $n_h$, $h = 1, 2, 3, \ldots, L$, is drawn such that
$\sum_{h=1}^{L} n_h = n$, the selections being made independently in different strata.
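The selection procedure just described can be sketched in Python. This is a minimal illustration, not a prescribed implementation: the function name `stratified_sample`, the stratum labels and the population are all hypothetical, and the stratum sample sizes are simply set by hand. Python's `random.sample` is used because it draws without replacement.

```python
import random

def stratified_sample(strata, sizes, seed=None):
    """Draw a simple random sample without replacement from each stratum.

    strata: dict mapping a stratum label to the list of its units.
    sizes:  dict mapping the same labels to the sample size for that stratum.
    """
    rng = random.Random(seed)
    sample = {}
    for label, units in strata.items():
        n_h = sizes[label]
        if n_h > len(units):
            raise ValueError(f"sample size {n_h} exceeds stratum {label!r}")
        # rng.sample never picks the same unit twice (without replacement);
        # each stratum is sampled separately from the others.
        sample[label] = rng.sample(units, n_h)
    return sample

# Hypothetical population of N = 100 units divided into L = 3 strata.
strata = {
    "north": list(range(0, 50)),    # 50 units
    "south": list(range(50, 80)),   # 30 units
    "west": list(range(80, 100)),   # 20 units
}
# Stratum sample sizes chosen by hand; they add up to n = 20.
sizes = {"north": 10, "south": 6, "west": 4}

chosen = stratified_sample(strata, sizes, seed=1)
total = sum(len(units) for units in chosen.values())
print("Total sample size:", total)
```

Passing a `seed` makes the draw repeatable, which is convenient when the selection has to be documented or checked.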
There are three common ways of allocating the sample units to strata. They are:
first member of the sample is selected using the method of simple random sampling. The
others are selected thereafter by taking every kth item on the list, where k is a positive
integer such that $k = N/n$, N is the total number of units in the population and n is the
sample size. For illustration, suppose N = 1000 and n = 50; then $k = 1000/50 = 20$.
We then use the table of random numbers to select the first item of the sample (this is
achieved by choosing a random number between 1 and 20) and thereafter choose every
kth unit. If, for illustration, 18 is chosen, then the sample will consist of the units
numbered 18, 38, 58, 78, 98, ..., 998.
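The illustration above can be reproduced with a short Python sketch. The function name `systematic_sample` and its `start` parameter are assumptions made for this example, and Python's `random` module stands in for the table of random numbers. The sketch assumes that N is an exact multiple of n, as in the illustration.

```python
import random

def systematic_sample(N, n, start=None, seed=None):
    """Return the unit numbers (from 1 to N) of a systematic sample of size n.

    If start is not given, the first unit is chosen at random between 1 and k.
    """
    if N % n != 0:
        raise ValueError("this sketch assumes N is an exact multiple of n")
    k = N // n  # sampling interval
    if start is None:
        # randint includes both end points, so start lies between 1 and k
        start = random.Random(seed).randint(1, k)
    if not 1 <= start <= k:
        raise ValueError("start must lie between 1 and k")
    # Take the starting unit and every kth unit after it.
    return list(range(start, N + 1, k))

# The illustration from the text: N = 1000, n = 50, first unit 18.
sample = systematic_sample(1000, 50, start=18)
print(sample[:5], "...", sample[-1])
print("Sample size:", len(sample))

# The same design with a randomly chosen starting unit.
random_start_sample = systematic_sample(1000, 50, seed=0)
print("Random start:", random_start_sample[0])
```

Whatever starting unit is drawn, the sample always contains exactly n units spaced k apart.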
(ii). Questionnaire
(iii). Observation
(iv). Experimentation
(vi). Registration
2.2.1 Interview Method
It is the most appropriate method of data collection when dealing with a human
population. The interview may be conducted by the investigator (personal interview) or
by delegates (the use of enumerators in the interviewing process).
In the personal interview method, the investigator collects the data himself. The field he
can cover is naturally limited. The personal interview method has the advantage that the
data will be collected in a uniform manner. The danger (to be guarded against) is that the
investigator may be tempted to select data that accord with some of his preconceived
notions. The personal interview method is also useful in a pilot survey carried out prior
to the main survey, since personal interviews will reveal the problems that are likely to
occur.
Interview by enumerators (delegated investigators) becomes the only option when the
field to be covered is extensive and the task of collecting information is too great for one
person. In this case, a team of selected and trained investigators or interviewers is
employed. The enumerators must be properly trained and informed of the purposes of the
investigation; their instruments must be very carefully prepared to ensure that the results
are in accordance with the requirements of statistical data. If there are many interviewers,
personal biases may tend to cancel out. Care in allocating duties to the investigators can
reduce the risk of personal bias.
An interview (personal or delegated) may be formal or informal. It is formal when the
same set of questions is asked by the enumerator(s) and the responses are recorded in
standard form(s). It is informal when the enumerator(s) is (are) free in his (their)
approach and can vary the order of the questions put to the respondents. Generally,
interviewing requires trained personnel and consumes a lot of time and money. Despite
this, it still produces high response rates, as the enumerators are present to help the
respondents when additional information is required. It is reliable, efficient and
dependable. The information required is obtained faster and the questions are answered
by the right persons.
The sincerity of the respondents as they give their answers can be judged by the
interviewers. A few probing questions may be asked to clarify issues. Apart from the cost,
it is often difficult to reach certain people during normal interviewing hours, either
because of the geographical location of their homes or for some other reason. Also,
thought-provoking questions and questions that require mathematical computation and
analysis may not receive answers from some respondents (because of their level of
awareness or for some other reason). Furthermore, if care is not taken, biases may be
introduced by the
interviewers when they suggest possible answers to questions.
Interviews may also be conducted by phone or over the internet (for example via Zoom,
Google Meet, Skype, Teams etc).
2.2.2 Questionnaire
In some enquiries, the data are made up of information which must be supplied by a large
number of respondents. A very convenient way of collecting such data is to issue
questionnaire forms to the respondents concerned and ask them to fill in the answers to
a set of well-ordered, logically drawn and printed questions. This method is usually
cheaper than the interview method and can cover a wider geographical area.
(i) The questionnaire forms may be completed by respondents who may not be aware
of some of the requirements and who may place different interpretations on the questions
even when the questions are most carefully worded.
(ii) The respondents may give false or misleading information, either because they
have forgotten the material facts or because they wish to give favourable answers.
(iv) Many forms may not be returned, and this may be due to a lack of interest in the
subject matter on the part of the respondents concerned, or to respondents who are hostile
to the enquiry. The outcome is that we end up with completed schedules from only certain
kinds of respondents and thus have a biased sample.
If the questionnaire forms are distributed and collected by enumerators, a greater
response rate is expected and queries can be answered. Here, however, care must be
taken so that the enumerators do not lead respondents in any way. The success of a
questionnaire sent via the internet depends on the ability of the respondents and their
knowledge of the information required. Provided the respondents are able to answer the
questions correctly, the questionnaire method via the internet is generally quicker and
cheaper than other methods. It helps to cover a wider geographical area and gives room
for the respondents to supply answers at their convenience.
2.2.3 Observation
Data collection by physical observation or measurement consists of the physical
examination of the units or respondents and the recording of data, as a result of personal
judgment or the use of measuring instruments, by the investigators or enumerators.
Where practicable, this should ordinarily be the best method of collecting information,
since it is free from the memory errors of respondents, exaggeration and prestige effects.
Usually, the method may involve greater effort and cost. It is suitable for traffic censuses,
statistical quality control, market research and so on. It is also useful in studying small
communities to find out how people live, their attitudes and their relationships with one
another. Opinions differ as to whether the observer should be an active participant in the
life of the community or not. In any case, his success will depend on his skill and his
personality. One of the limitations of this method is that the observer may be subjective
(that is, his report takes shape in the mind and is modified by individual bias) in
reporting what he sees.
2.2.4 Experimentation
This is commonly employed in the sciences and seldom used in the social sciences. The
required information may be obtained by clinical examination or by adopting some
laboratory procedure. The success or failure of this method depends mainly on the skill
of the experimenter and the quality of the instruments used for measurement. If the
experimenter is thorough and the instruments are good, the results obtained should be
quite accurate and reliable.
The cost involved in adopting the registration method is much less than that of the
interview or questionnaire methods. This method is generally used only to collect
statistical data covered by statutory regulations. In such cases, the response rate is likely
to be better because of the fear of possible consequences and penalties, e.g. voter
registration.
2.2.7 Transcription from records
The method of transcription from records (published or unpublished) is used when the
data needed for a specific purpose are already available in registers maintained in one or
more places, making it no longer necessary to collect them directly from the original units
at much cost and effort. We only need to reassemble and reanalyse data which have
already been collected by someone else for some other purpose. This method is
extensively used, since a good deal of government and business statistics are collected as
a by-product of routine administrative operations. Obviously, the quality of the data
obtained through this method can at best equal the quality of the original data. When
using this method, it is very important to state clearly the definitions of terms and units.
The source of the information must be reliable and the information must be up to date.