
Definition of Statistics

This document provides an introduction to statistics, defining it and outlining its key concepts. Statistics can refer to either numerical data or the methods used to analyze quantitative data. It defines statistics as the aggregate of facts affected by multiple causes and numerically expressed in a systematic way. The main objectives of statistics are to make sense of large populations of data, take action based on available information, and draw conclusions. It functions to simplify complex data, enlarge individual experience, and indicate trends to aid decision making.


Intro. To Statistics & for Economics - Emma Charles 0753-236-367/0787-080-333

INTRODUCTION

Definition of statistics

The term 'Statistics' is used in both a plural and a singular sense. In the plural,
statistics means numerical facts that are presented systematically. In the singular, it
refers to the statistical methods and principles used for the classification and analysis
of quantitative data so as to arrive at valid conclusions. Statistics has been defined by
many writers from time to time, and it is practically impossible to enumerate all the
definitions given. However, below are some selected definitions:

A. Statistics as Numerical Data

"Statistics are the classified facts representing the conditions of the people in a state... specially
those facts which can be stated in number or in tables of numbers or in any tabular or
classified arrangement." - Webster.

"Statistics are numerical statement of facts in any department of enquiry placed in relation to
each other." - Bowley.

"By Statistics we mean quantitative data affected to a marked extent by multiplicity of
causes." - Yule and Kendall.

"Statistics may be defined as the aggregate of facts affected to a marked extent by multiplicity
of causes, numerically expressed, enumerated or estimated according to a reasonable standard
of accuracy, collected in a systematic manner, for a predetermined purpose and placed in
relation to each other." - Prof. Horace Secrist.

This definition seems to be the most exhaustive of all the definitions. By examining
this definition, we can draw the following essentials of "Statistics":

• Aggregate of Facts: Statistics are aggregates of facts relating to a particular
field of enquiry. No single or isolated item can be termed statistics.
• Affected by Multiplicity of Causes: The facts are hardly ever traceable to a single
cause. They are affected by a multiplicity of causes or factors. The joint effect of all
the causes on a single item is studied with the help of statistical techniques.
We are more concerned with the facts than with their causes.


• Numerically Expressed: Statistics are expressed quantitatively. Facts of a purely
qualitative character cannot be termed 'statistics'. However, qualitative
aspects that can be expressed numerically by assigning scores, ranks or
standards can be treated as 'statistics'.
• Enumerated or Estimated According to a Reasonable Standard of Accuracy:
Statistics are compiled to a reasonable standard of accuracy. For precise
results, statistics must be accurately compiled. Where that is not possible, a sampling
method is adopted and a reasonable standard of accuracy is maintained in
collecting, classifying and analyzing the data. The standard required depends upon the
nature and purpose of the enquiry which the statistics are to serve.
• Collected in a Systematic Manner: The facts about a particular phenomenon are
collected systematically. A proper plan is prepared, and it is executed effectively
by trained investigators.
• Collected for a Pre-determined Purpose: The objectives or purpose of the
enquiry must be defined in clear and concrete terms well in advance. This avoids
the collection of irrelevant data.
• Comparable: From a practical point of view, statistics must be capable of being
compared with other related figures. They are placed in relation to each
other, and they should be comparable, homogeneous and related to the same
phenomenon or subject.

Thus, from the definition of Prof. Horace Secrist and its discussions above, we may
conclude that:

"All statistics are numerical statements of facts but all numerical statements of facts are not
statistics." The numerical statements of facts, to be designated as statistics, must possess
some of the characteristics given in the definition of Prof. Horace Secrist.

B. Statistics as a Science

The term 'Statistics', as a singular noun, means a body of theories, techniques and
methods employed in analyzing numerical information. It is a branch of
scientific method that applies mathematical processes to raw data to yield a finished
product. It is concerned with the collection, classification, tabulation, presentation and
analysis of data relating to a particular field of study.


"Statistics may be defined as the science of collection, presentation, analysis and interpretation
of numerical data." - F.E. Croxton and D.J. Cowden.

This definition is the best of all the above definitions. It is satisfactory and complete
in giving the correct meaning of the term 'statistics'. Thus, a study and analysis of
various definitions of statistics helps us to draw the following essentials of
"Statistics":

i. It is a science and an art as well.
ii. It deals with quantitative mass data.
iii. It includes collection, classification, tabulation and presentation of data.
iv. It is a system of analysis and synthesis of numerical data.
v. It is a device for summarization, comparison, treatment and interpretation of
numerical data.
vi. It is a process of observing, recording, describing and enumerating
quantitative data.
vii. Its purpose is to obtain and explore knowledge.
viii. It involves a technique of drawing conclusions and making wise decisions in the
face of uncertainty.
ix. It is a body of methods for obtaining information (i.e. a branch of scientific
method).

Thus, the science of statistics is widely employed as a tool for analyzing problems
in the natural and social sciences. It provides tools and techniques for
research students; it is not studied for its own sake but for the sake of developing
new sciences. It is a method of research in which the statistical method (the counterpart
of the experimental method in the natural sciences) is adopted in studying the problems
of the social sciences.

Main Divisions of Statistics

The domain of statistics or the subject-matter of statistics can be generally classified
into two main divisions - Statistical Methods and Applied Statistics.

1. Statistical Methods: Statistical methods are concerned with the formulation of
the general rules and principles applicable in handling data at every stage:
collecting, classifying, organizing, tabulating, presenting, analyzing and
interpreting data relating to any field of enquiry. They are tools in the hands of
statistical investigators in achieving predetermined objectives. They help in
extracting truths which are hidden in a mass of data. They are again divided into
two groups:
• Descriptive Statistics: It is confined to the treatment of data for the purpose of
describing their characteristics. It involves techniques for summarizing data and
presenting them in a usable form.
• Inferential Statistics: It is inductive statistics, which involves making forecasts,
estimates or judgments about some larger group of data from sample data.
Inferences about a population drawn from sample measures may involve some
error or discrepancy.
2. Applied Statistics: Applied statistics deals with the application of statistical rules
and principles to concrete factors like wages, prices, income, trade, population
and other variables, as they exist. Quality control, sample surveys, quantitative
analysis for business decisions and other applications are included in this
division.
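
The difference between describing a sample and inferring about a population can be sketched in a few lines of Python. This is only an illustration: the wage figures are made up, and the multiplier 1.96 is a rough normal-approximation assumption (a t value would suit so small a sample better).

```python
import math
import statistics

# Hypothetical sample: monthly wages (arbitrary currency units) of 8 workers.
sample_wages = [310, 295, 340, 325, 300, 315, 330, 305]

# Descriptive statistics: summarize the sample itself.
sample_mean = statistics.mean(sample_wages)
sample_sd = statistics.stdev(sample_wages)  # sample standard deviation (n - 1)

# Inferential statistics: estimate the population mean from the sample.
# The standard error measures the error expected in that estimate.
standard_error = sample_sd / math.sqrt(len(sample_wages))

# Rough 95% interval; 1.96 is an assumed normal-approximation multiplier.
lower = sample_mean - 1.96 * standard_error
upper = sample_mean + 1.96 * standard_error

print(f"sample mean = {sample_mean}, sample sd = {sample_sd:.2f}")
print(f"approximate interval for the population mean: {lower:.1f} to {upper:.1f}")
```

The first two figures only describe the eight workers observed; the interval is a statement about the larger group they were drawn from, and it carries the "error or discrepancy" mentioned above.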

III. Objectives of Statistics

The main objective of statistics is to study the population and the variables therein
so as to reduce the mass of data to a manageable form that helps in
making decisions and solving problems. The various objectives of statistics are:

i. To make sense of the population or mass.
ii. To take action on the basis of available data.
iii. To throw light on the complexity of the problem.
iv. To forecast the future trend from the data.
v. To infer the unknown from the known data.
vi. To examine the changes in particular activities.
vii. To draw conclusions from the information.
viii. To provide a basis for the formation of knowledge relating to a particular field of
study.


Therefore, the quantitative data are processed for the purpose of doing something
and making use of them. They are utilized to examine the problems concerning a
field of enquiry in their true perspective, to find out the causes of changes and to
estimate their probable effects. Statistical methods are employed as a tool for
comparing past and present events so as to throw light on the reasons for
changes.

IV. Functions of Statistics

The main function of statistics is to enlarge our knowledge of complex phenomena and
to lend precision to ideas that would otherwise remain vague and indeterminate.
Statistics increases the field of mental vision just as an opera glass or telescope
increases the field of physical vision. It widens our knowledge through the following
functions:

i. It simplifies unwieldy (awkward) and complex masses of data in an intelligible
manner.
ii. It enlarges individual experience that helps in making decisions.
iii. It indicates tendencies or trends or positions or directions of changes in data.
iv. It collects the data systematically in a definite form, as information, useful for
various purposes.
v. It presents data in the most suitable manner so that they can be understood at a glance.
vi. It compares one set of data with another and discloses the comparative
position.
vii. It studies or establishes the relationship between two related aspects of a
particular phenomenon.
viii. It guides the management in formulating plans and policies.
ix. It acts as a guide in measuring the effects of government policies and business.
x. It assists in testing hypotheses and discovering new theories.
xi. It helps in estimating the present and forecasting future activities.

Thus, the important functions of statistics are simplifying, enlarging, indicating,
collecting, presenting, comparing, studying, guiding and helping in the process of statistical
investigation and interpretation. It discloses hidden facts and enlarges the field of all
the branches of human knowledge.

Importance of Statistics

Statistics has been termed the "Science of Kings". Indeed, in ancient times, it kept
kings informed about the manpower and riches of their domains. Nowadays, it is
treated as the "Arithmetic of Human Welfare". It is indispensable for a clear
appreciation of any problem affecting the welfare of mankind.

In modern times, planning without statistics cannot be imagined. Statistics is the light
bearer that illuminates the way through life's adventure. It unravels the crowded
complexities of life and thought. Without its support, man has to wander aimlessly
through this perplexing universe. It discloses causal connections between related
facts. Such a study of statistics is at the bottom of all sound human endeavors.

Statistics are the eyes of administration. All the businessmen and statesmen can tender
sound advice on a problem to their administrative machinery with the help of
adequate data before them to base their judgments.

Statistics are aids to supervision. In the present days of impersonal relationships between
employer and employees, statistics are becoming tools for the supervision of
work and for obtaining efficiency from employees. Supervisors or officers are
provided with accurate and concise information to supervise the work of their
subordinates.

Statistics are invaluable in business. To be successful, a businessman estimates the demand
for his products in the market. His business runs on estimates and probabilities.
Statistics help him in planning and policy making.

Statistical methods are indispensable in a quantitative study. They are useful in
marketing, accounting, producing and operating activities. They bring truth to light
and correct faulty observations. They are extensively applicable to all the
branches of human knowledge - governing, managing, accounting, business,
research, social studies, planning and other fields. They are closely associated with
the progress of civilization.


VI. Scope of Statistics

The scope of statistics stretches over all those branches of human knowledge in
which a grasp of the significance of large numbers is looked for. Its methods provide
an important means of measuring numerical changes in complex groups and
judging collective phenomena. Its scope is thus wide, the limiting factor being its
applicability to studies of quantitative aspects alone.

"Sciences without statistics bear no fruits; Statistics without sciences have no roots."

There is hardly any field of human knowledge where statistical methods are not
applicable. Thus the significance of statistics has increased from the "Science of
Kings" to the "Science of Universe".

VII. Limitations of Statistics

For proper use of statistics, it is essential to keep in mind the following limitations:

• Ignores Qualitative Aspects: Statistical methods cannot study those qualitative
phenomena which cannot be expressed quantitatively. However, a qualitative
phenomenon to which quantitative standards or scores can be assigned is
included in the study.
• Does not Study Individual Items: A single or isolated figure cannot be regarded
as 'statistics'. Statistical methods do not study or recognize an isolated fact.
Statistics is confined only to those problems where group characteristics are to be
studied.
• Statistical Laws are not Exact: Statistical laws are true only on the average or in
the long run. They are not like the exact laws of the physical sciences, which are said
to hold true in every individual case. Statistical laws only show approximate
tendencies.
• Statistics does not Reveal the Entire Story: Statistics provides a solution to a problem.
It does not study the problem itself. Its results depend upon many other kinds of
evidence. It cannot give the entire story of the problem. Too much dependence
on statistics may lead to wrong conclusions. "Statistics is only a means to an end but
not an end itself."


• Liable to be Misused: Statistical tools are dangerous in the hands of those who
do not know their use and deficiencies. They can be misused to serve one's own
purpose. Statistics is a useful servant, but it is of great value only to those who
understand its proper use.

Therefore, the use of statistics by experts, who are well experienced and skilled in
the analysis and interpretation of statistical data for drawing correct and valid
inferences, greatly reduces the chance of bringing this important and popular science
into disrepute.

DATA COLLECTION

"Collection of Data" is the process of counting, enumerating or measuring,
together with the recording of results. It is the preliminary step of a statistical
enquiry or investigation.

SOURCES OF DATA

The sources of data collection are classified, on the basis of the nature of data, into
two groups i.e. Primary and Secondary.

"Primary Data" are those data which are originally collected for the first time, for a
specific purpose by an investigator. They are collected directly from the people to
whom the enquiry is related. They are original in character. They are primary to the
institution which collects them; they are secondary to all other institutions which
refer to such data.

"Secondary Data" are those data which are already collected, processed and used by
someone else for their own purpose. They are either published or unpublished. They
are finished products. They are secondary to all those institutions except the one
which has collected them.

It may be observed that the distinction between primary and secondary data is a
matter of degree or relativity only. The same set of data may be primary in the
hands of one and secondary in the hands of others. The sources of collecting the
data, thus, are also classified as – Primary Sources and Secondary Sources.


PRIMARY SOURCES

The data which are originally collected for the first time are called "primary data." The
methods of collecting the primary data are as under:

i. Direct Personal Observation.
ii. Indirect Oral Interviews.
iii. Information through Local Agencies.
iv. Mailed Questionnaire Method.
v. Schedules through Enumerators.

Direct Personal Observation: This method involves collection of data personally by
the investigator or organizing agency, who has to go to the field personally to make
observations or enquiries and to solicit information from the informants or
respondents. Such a method is suited only if the enquiry is intensive rather than
extensive.

Advantages of Direct Personal Observation method

i. It is used only if the investigation is local or regional.
ii. It requires the personal attention of the investigator.
iii. The information gathered is original in nature.
iv. It involves personal, face-to-face contact with the informants or respondents.
v. The information is first-hand, and so more reliable and accurate.
vi. There is a good response from the informants because of the personal approach.
vii. The investigator can get additional information, if any, from the informants
and can adjust his plan of observation according to the nature and behaviour
of the informants.

Disadvantages

i. This method is restrictive in nature and is suited only for intensive studies
and not for extensive enquiry.
ii. It is handicapped due to lack of time, money and manpower since the
informants are to be approached at their residence or office at their
convenience.
iii. It requires a skilled and experienced investigator for the enquiry.

Indirect Oral Interviews: This method consists of the collection of data through
enumerators appointed for the purpose. A small list of questions pertaining to the
subject matter of the enquiry is prepared. These listed questions are then put to the
informants or witnesses who are closely connected or concerned with the subject
matter. The enumerators collect all the answers from the informants and record
them systematically. This method is adopted by the commissions appointed by the
government.

Advantages

i. The information collected by the enumerators is correct and relevant.
ii. It is less expensive and requires less time for conducting the enquiry.
iii. If necessary, expert views and suggestions can be obtained in order to
formulate and conduct the enquiry more efficiently and effectively.

Disadvantages

i. This method suffers from lack of personal touch and supervision by the
investigator as the information is collected through the enumerators from the
informants.
ii. The accuracy of the data collected depends largely upon the nature and skill of the
informants, who may give wrong information.
iii. The informants may be prejudiced (biased) persons.
iv. It is very difficult to find informants who are closely concerned with the
subject matter of the enquiry. Some of them may not even possess the
knowledge needed to answer the questions.

Information through Local Agencies: This method involves the appointment of
local agents or correspondents by the investigators in different parts of the field of
enquiry. These agencies or correspondents in different regions collect the
information according to their own ways, fashions, likings and decisions, and then
submit their reports periodically to the central or head office where the data are
finally collected. This method is usually employed by newspaper or periodical
agencies and the various departments of the government.

Advantages


This method works out to be very cheap and economical for extensive investigations
particularly when the data are obtained by part-time correspondents or agents.
Moreover, the information can be obtained expeditiously since only rough estimates
are required.

Disadvantages

This method suffers from personal prejudices and whims (sudden change of mind)
of the correspondents in different fields of the enquiry and consequently the data so
obtained will not be very reliable.

This method is not suitable for collecting the accurate and exact information.

Mailed Questionnaire Method: This method involves preparing a 'Questionnaire' (a
list of questions) relating to the field of enquiry and providing space for the answers
to be filled in by the respondents or informants or addressees to whom the
questionnaire is mailed with a request for a quick response within the specified time.

A very polite covering note or letter explaining the aims and objects of the enquiry is
attached. Respondents are also requested to extend their full co-operation by
furnishing information and returning the questionnaire duly filled in. Respondents
are taken into confidence by assuring them that the information they supply
will be kept strictly confidential and secret.

In order to ensure quick and better response, the return postal expenses are usually
borne by the investigating organization by sending the self-addressed stamped
envelope. This method is usually used by the research workers, private individuals,
non-official agencies and government departments.

In this method, the questionnaire plays an important role.

Characteristics of a good Questionnaire

i. Due skill, efficiency, care and experience are required in framing the
questionnaire.
ii. The questions asked must be clear, short, brief, corroborative, non-offending,
courteous, unambiguous and suitable for answering in the form of "Yes" or
"No".
iii. They should not be too confidential or hurt the feelings of the respondents.
iv. They must be arranged in a logical order.
v. They must suit the general educational level of the respondents.

Advantages

i. This method is by far the most economical method in terms of time, money
and manpower.
ii. It is used for extensive enquiries covering a very wide area.
iii. The information obtained is original and much more authentic.

Disadvantages

i. It can be used only if the audience is educated.
ii. It is not at all useful when the informants are illiterate. Even if they are
literate, they may not take interest in answering and returning the
questionnaire.
iii. Quite often, the correct information may be suppressed.
iv. It is very difficult to verify the authenticity of the answers given by
respondents in the questionnaire.
v. Sometimes the informants may hesitate to disclose their income, property and
habits in writing.
vi. There is also no scope for obtaining supplementary information by putting
additional questions to cross-check the answers.
vii. Informants are not given any opportunity to clear the doubts that are in
their minds.

Schedules through Enumerators: A schedule is a device for obtaining answers to
questions in a form which is filled in by the enumerators in a face-to-face
situation with the respondents. Schedules are sent through the enumerators.

A questionnaire is a list of questions which is sent directly to and filled in by the informants.

A schedule is a list of questions which is filled in by the enumerators.

The enumerators visit the respondents personally with the list of questions
(schedules), ask them the questions therein and record the answers from the
informants. This method is generally used by big business houses, large public
undertakings and research institutions.

Advantages

i. The enumerators can explain in detail the objective and scope of the enquiry
to the informants and impress upon them the need and utility of furnishing
the correct information.
ii. This method is useful in extensive enquiries and it yields reliable and
dependable results.
iii. There will be a good and quick response from the informants whether they
are literate or illiterate.
iv. The enumerators can effectively check the accuracy of the information by
cross-questioning and putting additional questions.

Disadvantages

i. This method is fairly expensive as the team of enumerators has to be highly paid
for the work.
ii. It is also more time consuming as compared to the "mailed questionnaire"
method.
iii. The success of the method largely depends upon efficient, skilled and
trained enumerators.
iv. Inefficient and biased enumerators may suppress or even twist the
information supplied by the informants or respondents.
v. There will be variation in the information recorded by different enumerators
with different personal views.
vi. If the schedule is not correctly and carefully framed, the enumerators may get
incomplete and wrong information from the respondents.

Apart from the above sources from which primary data can be collected, there are other
methods, such as local reports, enquiries, telephone calls and fax, which are used in
most of the advanced countries for collecting data.

SECONDARY SOURCES


The data which have already been collected by someone else, and which have gone
through the statistical process at least once, are called "secondary data". They may be
either finished or semi-finished and published or unpublished.

i. Published sources are those publications which belong to the government,
national and international organizations, trade associations and chambers of
commerce. The various government departments also publish Journals,
Bulletins and Periodicals which contain valuable and ready information. Even
research centers and educational institutions publish various types of
information.
ii. Unpublished sources are those publications which are not openly circulated
in the public. The data are just maintained, as records, by the private firms or
business concerns. They are not at all published as they are meant for private
purposes. Even the various units or departments of the government can
circulate the information within the departments. Such secondary data may
be made available to others. The secondary data should be used with extra
caution. The investigator must satisfy himself as regards the reliability,
accuracy, adequacy and suitability of the data.

TECHNIQUES OF COLLECTING DATA

Statistical data are collected for the purpose of analysis and interpretation. The data
are collected by one of two techniques: Census and Sample. The question of which
technique to apply arises only in the case of collecting primary data.

1. Census Method: In a census technique of collecting the data every element of the
population is included in the investigation.

Characteristics of Census Method

i. It involves a complete count in which information is collected about every
unit of the universe relating to the subject matter of the enquiry.
ii. It is the oldest method of collecting the data.
iii. It is a straightforward technique in which an investigator observes each and
every item within the scope of enquiry.
iv. Not a single item is left unnoticed by the investigator.
v. All the items of the population are observed and recorded.

Advantages

i. The results from such method are more accurate and reliable.
ii. This method helps in making available the intensive knowledge.
iii. It is useful (suitable) where units of diverse nature constitute universe or
population or mass.

Disadvantages

i. This technique is more expensive and requires more labour and time.
ii. It is a most difficult and gigantic task to count all the items in the population,
as they are always subject to change.

2. Sample Method: In the sample technique of collecting data, a part of the
universe is selected for the purpose of study.

Characteristics of sample method

i. Only the representative items from the mass are collected.
ii. The items selected generally represent the entire population.

It is essential to select an adequate number of items so that the sample reflects
the characteristics of the mass as far as possible. A finite subset of the
population is called a "sample". The number of units in the sample is called the
"sample size".
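
The contrast between a census and a sample can be sketched as follows. The population here is made-up data generated only for illustration, and the sample size of 50 is an arbitrary assumption.

```python
import random
import statistics

# Hypothetical population: household sizes for 1,000 households (made-up data).
rng = random.Random(42)  # fixed seed so the run is repeatable
population = [rng.randint(1, 9) for _ in range(1000)]

# Census: every unit of the population is observed.
census_mean = statistics.mean(population)

# Sample: a finite subset of the population; its length is the sample size.
sample_size = 50
sample = rng.sample(population, sample_size)
sample_mean = statistics.mean(sample)

print(f"census mean (N = {len(population)}): {census_mean:.2f}")
print(f"sample mean (n = {len(sample)}): {sample_mean:.2f}")
```

The sample mean will usually be close to the census mean but not equal to it, which is the price paid for observing 50 units instead of 1,000.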

Advantages

i. Sampling enables us to draw conclusions.
ii. This method is economical, time-saving, convenient and practicable.

Disadvantage

The samples selected may not represent the mass and they may be false and
influenced by the personal bias of the selector.


As the census method is too cumbersome and costly, in most of the cases the
sampling method is used in statistical enquiries. There are two methods of taking
the samples - Random sampling and Non-random sampling.

Random Sampling: It refers to selection by chance, but it does not mean that the
selection should be made in a haphazard manner. A few mechanical devices are used
in sample selection so that every item in the population has an equal
probability of entering the sample. It is blind chance alone that determines
whether one or the other item may or may not be selected.

The term "random", is not used in the sense of 'hit or miss' method. There shall be no
room for discrimination to be exercised by the investigator. In fact, sampling is a
scientific way of getting items selected from the mass. The selection of the items
should be free from personal bias or judgment. The various methods of sampling
may include:-

Drawing chips from a bowl, rotating a drum and using a table of random digits. The
most common method adopted nowadays is the "lottery" draw, which is based on purely
random sampling.
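The lottery idea described above can be sketched in Python. This is only an illustration: the list of 100 numbered households and the function name simple_random_sample are assumptions made for the example, not part of the notes.

```python
import random

def simple_random_sample(population, sample_size, seed=None):
    """Draw sample_size distinct items, each with the same chance of selection."""
    rng = random.Random(seed)  # a fixed seed only makes the draw repeatable
    return rng.sample(list(population), sample_size)

# Hypothetical universe of 100 numbered households.
households = list(range(1, 101))
sample = simple_random_sample(households, 10, seed=1)
print(sample)
```

The investigator exercises no judgment here: the random number generator plays the role of the bowl or drum.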

Non-random Sampling: It involves a deliberate selection of the samples and refers
to the personal bias and discretion of the investigator while selecting the specific
items. It is also called purposeful sampling, in which an investigator exercises
human judgment. There is scope for bias or personal interest and no element of
chance in selection of the items.

There are many more other methods of taking the samples. They are as under:

In "Systematic or Quasi-random" sampling, a list of the universe is prepared in
alphabetical, geographical or numerical order. The first item is selected at
random and the remaining items are taken at a fixed interval from it, for example
the first item, then the eleventh, then the twenty-first and so on.
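A minimal sketch of systematic selection in Python, assuming an ordered list of 100 hypothetical units and an interval of 10 (both chosen only for illustration):

```python
import random

def systematic_sample(items, step, seed=None):
    """Pick a random start within the first `step` items, then every step-th item."""
    rng = random.Random(seed)
    start = rng.randrange(step)  # random position 0 .. step-1
    return items[start::step]

# Hypothetical ordered list of the universe.
names = [f"unit-{i:03d}" for i in range(1, 101)]
chosen = systematic_sample(names, step=10, seed=7)
print(chosen)
```

Only the starting point is left to chance; every later item follows mechanically from it.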

In "Stratified sampling", the population is divided into different strata or segments
on the basis of homogeneous characteristics. Each stratum is called a sub-population,
from which the selection of items is made.
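The idea can be sketched in Python as follows. The notes do not say how many items to take from each stratum, so this sketch assumes the same fraction is drawn at random from every stratum; the worker data and the function name stratified_sample are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, fraction, seed=None):
    """Group units into strata, then draw the same fraction at random from each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = {}
    for name, members in strata.items():
        k = max(1, round(len(members) * fraction))  # at least one item per stratum
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical population: 60 urban and 40 rural workers.
workers = [("urban", i) for i in range(60)] + [("rural", i) for i in range(40)]
picked = stratified_sample(workers, stratum_of=lambda w: w[0], fraction=0.1, seed=3)
print({name: len(members) for name, members in picked.items()})
```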


In "Cluster sampling", various clusters having heterogeneous characteristics are
made. It involves selection of some of the clusters on a random basis.

In "Double Sampling", samples are taken in two stages. If the first set of samples is
inconclusive, the second set is added to the first one to strengthen the results.

In a "Multiple sampling", a sample procedure is carried out in several stages - the first
stage, the second stage and so on. For example, in the first stage a few districts from
the state are selected. In the second stage, a few divisions are selected from the
sample districts. In the third stage, a few villages are selected from the sample
divisions. In the next stage, a few families are selected from the sample villages. In
this process, the size of the sample goes on reducing at each stage.

In "Quota sampling", the investigator is directed to collect information from an
assigned quota or number of individuals in each of several groups characterized in
the universe.

In a "Convenience Sampling", a sample is obtained from the readily available lists such
as telephone directory, ration card-holders' list and voters' list.

In a "Sequential sampling", a number of samples are drawn from the universe one after
another depending on the results of previous samples. If the first sample is not
acceptable, the lot from which the first samples are made is rejected. Then the
second lot is taken from which again samples are selected. After each lot of items, we
have to decide whether to accept or reject or to reserve the judgement or continue
the process until a final decision is taken.

Thus, the choice of sampling technique depends upon the nature of the universe,
the sampling unit, the size of sample and the other factors.

PRINCIPLES OF SAMPLING

The following are the theories, principles or laws underlying the statistical methods:

• Theory of Probability: The method of random sampling is based on the
mathematical "theory of probability". The theory implies that if from a very large
number of items (population), a moderately large number of items are chosen at
random, such items are almost sure, on the whole, to possess the characteristics
of the population.
• Law of Statistical Regularity: The law indicates that a moderately large number
of items chosen at random from a large group are almost sure to possess, on an
average, the characteristics of the large group or population. The "Theory of
Probability" gave rise to the "Law of Statistical Regularity". The former tells us of
mathematical expectation of happening and not happening of events having
equal chances in a sample selection. The latter tells us that random selection
from the universe is likely to give a representative character of the population.
• Law of Inertia of Large Numbers: It is a corollary to the "Law of Statistical
Regularity". It states that, other things being equal, the larger the size of the sample,
the more accurate the results are likely to be.
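The tendency described by the Law of Inertia of Large Numbers can be observed with a small simulation. This is a sketch under assumptions: the universe of the numbers 1 to 10,000 and the sample sizes are invented for illustration, and a single run shows a tendency, not a guarantee.

```python
import random

def sample_mean_error(population, sample_size, rng):
    """Absolute difference between a random sample's mean and the population mean."""
    sample = rng.sample(population, sample_size)
    true_mean = sum(population) / len(population)
    return abs(sum(sample) / sample_size - true_mean)

rng = random.Random(42)
population = list(range(1, 10001))  # hypothetical universe
for n in (10, 100, 1000, 5000):
    print(n, round(sample_mean_error(population, n, rng), 2))
```

As the sample size approaches the size of the universe, the error generally falls towards zero; a sample containing every item has no sampling error at all.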

STATISTICAL ERRORS

An "error", in general, refers to a mistake. An "error", in statistics, denotes the
difference between the true value (a result of census) and the estimated value (a
result of sampling). So, in statistics, an error does not mean a mistake but a difference.

After the collection and editing of the data, there still remains the difference
between the true value and the estimated value. Such errors are called statistical
errors which arise due to inadequacy of data, unintentional acts of the investigators,
random sampling and editing of the data.

Types of Statistical errors

Statistical errors are of two types: sampling errors and non-sampling errors.

1. Sampling Errors: These errors arise due to drawing inference about the
population or universe on the basis of samples or observations, because samples
are seldom perfect miniature (very small in kind) of the population. These
sampling errors may be classified, further, as biased and unbiased (purposeful and
unpurposeful).

"Biased Errors" exist due to biased views of the informants and investigators,
defective measurements, defective sampling and prejudiced treatment of data
collected. These errors are constant, cumulative, persistent and systematic. They are
not compensating and they move in the same direction. They can be checked by
examining their sources carefully and taking proper precautions at each stage in the
process of investigation.

"Unbiased Errors" exist naturally and suddenly without any bias or prejudice. They
arise due to the principles of statistics - "Theory of Probability", "Law of Statistical
Regularity" and "Law of Inertia of Large Numbers". The excess errors in one
direction are almost balanced by the excess errors in the other direction. Such
errors, therefore, are compensating and non-cumulative.
They lie in both the directions - positive and negative.

Therefore, in a sampling enquiry we find both sampling and non-sampling errors.

2. Non-sampling Errors: These errors occur in the sampling enquiry and the census
enquiry as well. But in the census enquiry we find only non-sampling errors.
Non-sampling errors include both statistical errors and mistakes. They creep in during
the process of collecting the information.

Causes of Non-sampling errors

i. They arise due to the negligence of the investigators in asking the questions or
in recording the answers.
ii. They also arise due to the lack of knowledge in the investigation.
iii. They are also the result of the approximation, faulty definitions and defective
system of collecting the data.

They increase with the number of units included in the enquiry. They are therefore
greater in a census enquiry than in a sample.

All the statistical errors can be measured absolutely or relatively. So the statistical
errors may be either absolute or Relative.

a. Absolute Error: It is the difference between the true value and the estimated
value. It is the difference between the original figure and the approximated figure.
To put it in a formula form:


Absolute Error = Actual Value - Estimated Value

AE = AV – EV (The result may be positive or negative)


b. Relative Error: It is the ratio of the absolute error to the estimated value. It can
also be expressed as a percentage.

Relative Error = (Actual Value - Estimated Value) / Estimated Value

RE = (AV – EV) / EV = Absolute Error / Estimated Value

For example, suppose three workers have daily wages of Shs 44, Shs 84 and Shs 164,
and their estimated daily wages are Shs 40, Shs 80 and Shs 160 respectively.
Then the errors are:

Absolute Errors      Relative Errors

44 - 40 = 4          4/40 = 0.1 = 10%
84 - 80 = 4          4/80 = 0.05 = 5%
164 - 160 = 4        4/160 = 0.025 = 2.5%
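The wage example can be checked with a short Python sketch. The function names are illustrative; the relative error is taken against the estimated value, as in the definition above.

```python
def absolute_error(actual, estimated):
    """AE = AV - EV (may be positive or negative)."""
    return actual - estimated

def relative_error(actual, estimated):
    """RE = (AV - EV) / EV, the absolute error as a share of the estimate."""
    return (actual - estimated) / estimated

# (actual wage, estimated wage) in Shs, taken from the example above.
wages = [(44, 40), (84, 80), (164, 160)]
for actual, estimated in wages:
    ae = absolute_error(actual, estimated)
    re = relative_error(actual, estimated)
    print(f"AE = {ae}, RE = {re:.3f} ({re:.1%})")
```

The same absolute error of Shs 4 becomes a smaller relative error as the wage grows, which is why the relative measure is the more informative one when values differ in size.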
Thus, to make the secondary data fit for analysis and interpretation, we should edit
them to suit the purpose of enquiry. The errors and irregularities in them are to be
removed. The suitability, adequacy and reliability of the secondary data are examined
carefully. When they are satisfactory and reliable, they may be used for analysis and
interpretation.

CLASSIFICATION OF DATA

The data collected in any statistical enquiry are voluminous and in a raw form.
They are unwieldy and incomprehensible. Some process of condensation is
necessary to organize the data so that they can be easily understood.

There are three steps in organizing the raw data - Classification, Tabulation and
Analysis.

"Classification" is a process of arranging things (either actually or notionally) in
groups or classes according to their resemblances and affinities. It gives expression to
the unity of attributes that may subsist amongst a diversity of individuals. All the
items with similarities are brought together. Following are the definitions.

"Classification is the process of arranging data into sequences and groups according to their
common characteristics, or separating them into different but related parts". - Secrist

"Classification is a scheme for breaking a category into a set of parts, called classes, according
to some precisely defined differing characteristics possessed by all the elements of the category".-
Tuttle. A.M.

Thus, the process of classification involves the arrangements of the data into
different classes which are to be determined depending upon the nature, objectives
and scope of the enquiry.

Objectives of Classification

The objects of classification are as under:

• To present the facts in a simple form: Classification makes the unorganized mass of
data simple, brief, logical and understandable. It helps in understanding the
structure and nature of data.
• To bring out points of similarity and dissimilarity: Classification brings out clearly the
points of similarity and dissimilarity existing among the data. Different attributes
of data are placed in different classes.
• To facilitate comparison: Classification helps in making comparisons between
two series, which enables one to draw inferences. Without classification it is very
difficult to have a comparative study and take decisions.
• To bring out relationship: Classification helps to establish the relationship between
two series. With the help of such a relationship, one can conclude about the
cause and effect relationship.
• To present a mental picture: Classification enables one to have a mental picture of
the mass data. It gives a clear picture of the complex data in a condensed form.
• To prepare the basis for tabulation: Classification makes the process of tabulation
easy. Without classification it is very difficult to tabulate the data.
• To eliminate unnecessary data: Classification eliminates the unnecessary and
irrelevant data for further study. It gives prominence to the important
information only and drops out the unwanted elements.
• To bring in uniformity: Classification brings uniformity among attributes out of the
diversities persistent in the collected raw data. This helps to understand the data easily.

Functions of Classification

The functions of classification may be briefly summarized as follows:

• It condenses the data.
• It facilitates comparisons.
• It helps to study the relationships.
• It facilitates the statistical treatment of the data.

Bases of Classification

The basis or the criteria with respect to which the data are classified primarily
depend on the objectives and the purpose of the enquiry. Generally, the data are
classified as under:

1. Qualitative Classification: Under this method, data are classified according to
attributes, qualities, characteristics or properties. Generally, qualitative phenomena
such as honesty, beauty, employment, intelligence, occupation, sex and literacy
are not capable of quantitative measurement. Qualitative classification takes the
following forms:
a. "Simple Classification" involves the presence or absence of a single attribute.
There are only two classes with reference to a single attribute. Examples of
such classification are male and female, honest and dishonest, trained and
untrained or black and white. Only one pair is considered under the simple
classification.
b. "Composite Classification" involves the presence or absence of more than one
attribute. It starts with a simple (two-fold) classification and extends further with
other attributes of the same data. For example, the data are first classified as Male and
Female, and each of these classes is again divided into Married and
Unmarried, giving Male-Married, Male-Unmarried, Female-Married and Female-Unmarried.
c. "Arbitrary Classification" refers to those attributes which are not clearly defined
and in which no demarcation can be made. For example, tall persons and short
persons, literates and illiterates, young and old and so on. Such attributes require
specific definitions. They are called "notional attributes".
2. Quantitative Classification: Under this method data are classified according to
quantitative measurements like age, height, marks, weight, prices, production,
income, wages, expenditure, sales and profits. The quantitative phenomenon
under study is known as a 'variable', hence this classification is also sometimes
called "classification by variables".

A 'variable' is a quantitative phenomenon which can assume any numerical value
within a particular range. It is a thing or quantity or factor which varies in value
from one individual to another. It is an item which can be described and expressed
in numerical terms. E.g. Marks, Wages, Weight, Age, etc.

Types of variables

Variables are of two kinds - Discrete and Continuous.

1. Discrete Variable: is a variable which cannot take all the possible values within a
given specified range. It takes only integers (whole numbers) characterized by
jumps and gaps between one value and the next. The values, generally, change
with particular gap constantly. Thus a variable, which shows definite breaks
between one value and the other succeeding it, is called a "Discrete Variable."
2. Continuous Variable: The variable which takes all the possible values, integer as
well as fractional, is called a continuous variable. It is capable of passing from any
given value (integer and fractional) to the next value by infinitely small
gradations.

Therefore, values in a discrete variable are obtained just by counting and the values
in a continuous variable are obtained through specified measurements.

Following are the two examples of series based on discrete and continuous variables.

Discrete (Variable) Series              Continuous (Variable) Series

No. of Children   No. of Families       Weights (lbs)   No. of Students
1                 15                    80 - 90         50
2                 60                    90 - 100        45
3                 43                    100 - 110       160
4                 25                    110 - 120       90
5                 8                     120 - 130       40
6                 4                     130 - 140       15
Total             155                   Total           400

Seriation
Seriation is concerned with the logical listing or arrangement of the data into a
particular sequence. The data are listed orderly in different categories. The things or
attributes are measured, counted or weighted and then they are placed one after
another in the process of seriation. Thus, 'seriation' means a logical or systematic
arrangement of data with reference to time or area or condition.

Statistical series may be defined as things, events or attributes arranged in a
particular or logical or systematic order. There are three types of series based on
time, area or condition - Time Series, Geographical Series and Condition Series. Statistical
series are prepared to present the classified data in a properly arranged way.

• "Individual Series" is one in which data are shown as they are observed. The
values of variable are small in size or limited in number. The observations are just
recorded and presented as they are collected. Generally, they are in raw form and
unorganized.
• "Condition Series" is one in which data are arranged with reference to the
physical condition such as age, height and weight with respect to their occurrence
(frequency) at a given time. A frequency distribution is to be formed to reduce
the size of the data collected.

The condition series are of three types:


1. Discrete (Discontinuous or Ungrouped) Series: When the values of variable in
the data are individually distinct and integers, we can have the discrete form of
frequency distribution.

The distinct values of the variable are few as compared to the number of observations,
because each value occurs in the series repeatedly. There is a definite gap
between the values, which makes the series discontinuous. There are no
fractional values at all. Each value takes a definite frequency against it according to
the number of times it is repeated in the series.

2. Grouped Series: The discrete form of frequency distribution is possible only when
the distinct values of the variable are few as compared to the number of
observations. If the values of variable and the number of observations are both
large, the values are grouped for the purpose of condensing the data.

Grouped series are classified into different classes (class-intervals) with two limits for
each class-interval i.e. lower limit and upper limit.

Methods of counting grouped series

• Inclusive method: both the limits are included while counting the
observations in the particular class interval. The class intervals, under this series,
are called "inclusive class intervals". There is a gap of one value between the
upper limit of the preceding class interval and the lower limit of the succeeding
class interval.
• Exclusive method: under this method, only the lower limit of each class interval is
included while counting the observations in the particular class interval,
while the upper limit is excluded in the process of counting.
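The difference between the two counting rules can be sketched in Python. The marks below are hypothetical and chosen so that some observations fall exactly on a class limit.

```python
def count_inclusive(values, lower, upper):
    """Inclusive method: both the lower and the upper limit are counted."""
    return sum(1 for v in values if lower <= v <= upper)

def count_exclusive(values, lower, upper):
    """Exclusive method: the lower limit is counted, the upper limit is not."""
    return sum(1 for v in values if lower <= v < upper)

marks = [10, 15, 19, 20, 20, 25, 29, 30]  # hypothetical observations
print(count_inclusive(marks, 10, 19))  # inclusive class 10-19
print(count_exclusive(marks, 10, 20))  # exclusive class 10-20
print(count_exclusive(marks, 20, 30))  # exclusive class 20-30
```

Under the exclusive method, a mark of 20 is counted in the class 20-30 and not in 10-20, so no observation is counted twice.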

FREQUENCY DISTRIBUTION.

"Frequency" is the number of times each value of variable occurs in the series. It refers
to the number of repetitions of a particular value of variable. It is shown, with the
help of counting, against the particular value or class. So it is the rate of occurrence
of a particular event or thing or value.

"Frequency Distribution", on the other hand, is a summary presentation of values of
variable (or attributes) arranged according to their magnitudes either individually or
in groups or classes. It depicts the different values of variable in one column and
their respective frequencies in another. It is also called a "Frequency Table" or
"Frequency Chart" or "Frequency Series".

Types of Frequency distributions

On the basis of statistical series, we can construct three types of frequency
distributions as under:

1. Discrete (discontinuous) Frequency Distribution.


2. Grouped Frequency Distribution.
3. Continuous Frequency Distribution.

Discrete Frequency Distribution: in constructing the discrete frequency distribution, we
count the number of times each value of variable occurs in the series. This is
facilitated through the technique of "Tally bars" or "Tally marks".

"Tally bars" or "Tally marks" are small vertical bars scored parallel to each other. They
are put opposite to the particular value or class to facilitate the counting of the
frequencies.

Grouped Frequency Distribution: under this, the entire range of the values of variable is
condensed by dividing it into suitable groups or classes, each class having
two limits, i.e. a lower limit and an upper limit.

Note: The inclusive method is not in general usage. That is why the inclusive method
is converted into the exclusive method for all the purposes. There is a gap of 1
between the upper limit of the preceding class and the lower limit of the succeeding
class. This gap is adjusted by a correction factor, computed as under:

Correction factor = (lower limit of the succeeding class - upper limit of the preceding class) / 2


The correcting factor is generally 0.5 which is adjusted as under.

Inclusive method of classes    Converted into exclusive method of classes
(Marks)                        (Marks)
1 - 3                          0.5 - 3.5
4 - 6                          3.5 - 6.5
7 - 9                          6.5 - 9.5
10 - 12                        9.5 - 12.5
13 - 15                        12.5 - 15.5
16 - 18                        15.5 - 18.5
19 - 21                        18.5 - 21.5
Note: Correcting factor 0.5 is deducted from all the lower limits and added to all the
upper limits.

The magnitude of the class interval is 2, before the adjustment, but it is 3 after the
adjustment. However, the mid-value in both the cases remains the same.
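The conversion can be sketched in Python. The sketch assumes the gap between consecutive classes is the same throughout, so the correction factor is computed once from the first two classes; the function name is illustrative.

```python
def to_class_boundaries(inclusive_classes):
    """Convert inclusive (lower, upper) limits to exclusive class boundaries."""
    first_upper = inclusive_classes[0][1]
    second_lower = inclusive_classes[1][0]
    correction = (second_lower - first_upper) / 2  # 0.5 when the gap is 1
    return [(lower - correction, upper + correction)
            for lower, upper in inclusive_classes]

classes = [(1, 3), (4, 6), (7, 9), (10, 12)]
boundaries = to_class_boundaries(classes)
for (lo, hi), (blo, bhi) in zip(classes, boundaries):
    # The mid-value is unchanged by the adjustment.
    print(f"{lo}-{hi} -> {blo}-{bhi}, mid-value {(lo + hi) / 2} = {(blo + bhi) / 2}")
```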

Continuous Frequency Distribution: under this, the entire range of values of variable is
condensed by dividing it into suitable groups or classes, each class having two limits,
i.e. a lower limit and an upper limit. In this distribution, there is no gap
between the upper limit of the preceding class and the lower limit of the succeeding
class, because the two limits are the same.

CONSTRUCTION OF CONTINUOUS SERIES


Formation of continuous frequency distribution is most popular in practice
nowadays. There are no hard and fast rules laid down for constructing the
continuous series. A statistician has to use his discretion for classifying the data in a
continuous frequency distribution. Experience, wisdom, skill and aptness are
required for an appropriate classification of the data.

Terminologies used in construction of continuous frequency distribution


The following technical terms and guidelines are to be studied and borne in mind
while constructing the continuous frequency distribution:


Class interval: It is a group of the data beginning with the lower limit and
ending with the upper limit. Thus, it is a group constituted by the two limits. Examples
of inclusive class intervals are 10-19, 20-29, 30-39 and so on. Examples of exclusive
class intervals are 10-20, 20-30, 30-40 and so on.

Determination of number of class interval


There are no hard and fast rules regarding the determination of the number of class
intervals in any given set of observations. However, Prof. H.A. Sturges has given a
formula by which the ideal number of class intervals can be computed so as to form
any series. His formula is:

• n = 1 + 3.322 log N (log to base 10)
• n = number of class intervals
• N = number of observations

Accordingly, on the basis of the above Sturges Rule, we can determine the
magnitude of the class interval as under:

i = Range / No. of classes = (L - S) / (1 + 3.322 log N)

Where;

L = Largest Value,

S = Smallest Value.

i = Magnitude of the class interval

When we apply the Sturges Formula, the number of classes generally falls within
the range of 4 to 20. It is a fairly reasonable range within which we get
the number of class intervals. However, the number of classes should be a whole
number (integer), preferably 5 or a multiple of 5 such as 10, 15, 20 and so on. Such
integers are readily perceptible to the mind and quite convenient for numerical
computations in the analysis of data. Similarly, the magnitude of the class intervals
should also preferably be 5 or a multiple of 5, as in 0-5 or
15-25 or 20-40 and so on.
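Sturges' rule is easy to evaluate in Python. The sketch below uses the constant 3.322 (approximately 1/log10 2), the value usually quoted for the rule, with logarithms to base 10. The figures of 90 observations, a largest value of 94 and a smallest value of 11 are illustrative.

```python
import math

def sturges_classes(n_observations):
    """Number of class intervals suggested by Sturges' rule: 1 + 3.322 log10(N)."""
    return 1 + 3.322 * math.log10(n_observations)

def class_width(largest, smallest, n_observations):
    """Magnitude of the class interval: range divided by the number of classes."""
    return (largest - smallest) / sturges_classes(n_observations)

n = sturges_classes(90)
i = class_width(94, 11, 90)
print(f"classes = {n:.2f}, width = {i:.2f}")
```

In practice both results are then rounded to convenient whole numbers, as the guidance above suggests.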


Empty Class Interval: It is the class interval which is having no frequency. It means
there are no observations against this class interval.

Class Limits: They are the two ends by which the class interval is constituted. Each
class interval is having two limits - lower limit and upper limit. The lowest value is
the lower limit and the highest value the upper limit.

Class Boundaries: These are the new corrected class limits in a grouped series which
are formed by converting the inclusive method into the exclusive method.

Upper CB = upper class limit + d/2

Lower CB = lower class limit - d/2

Where; d= the difference between the lower class limit of the succeeding class and
the upper class limit of the preceding class.

Magnitude and Width of the Class: The "magnitude" is the difference between the lower
limit and the upper limit of the class interval, whereas the "width" is the difference
between the lower boundary and the upper boundary of the class interval.

Example of inclusive method of class interval and its conversion into the exclusive method
is as under:

Class intervals   Magnitude   Class boundary   Width
10 - 19           9           9.5 - 19.5       10
20 - 29           9           19.5 - 29.5      10
30 - 39           9           29.5 - 39.5      10
40 - 49           9           39.5 - 49.5      10
Mid-value (Mid-point or Class Mark): It is the central point of the class interval which
is exactly at the middle of the two extreme limits of the class interval. It is
determined as under:

Mid-value = (lower limit + upper limit) / 2


Class Frequency: It is the number of the observations corresponding to the specific
class. It is the rate of occurrence of a particular event or value relating to a particular
class interval.

Construction of a frequency distribution

Example 1

The following is a record of marks obtained by 90 candidates in a statistics
examination.

84, 91, 58, 72, 44, 87, 76, 43, 83, 40, 73, 86, 77, 75, 73, 71, 54, 46, 55, 43, 33, 76,
94, 65, 74, 50, 65, 80, 57, 73, 36, 33, 91, 53, 63, 59, 47, 29, 37, 11, 82, 40, 27, 84,
53, 19, 35, 72, 44, 19, 51, 67, 58, 76, 38, 16, 37, 74, 46, 50, 18, 59, 27, 92, 13, 45,
61, 86, 39, 78, 23, 12, 71, 62, 22, 41, 38, 27, 66, 51, 29, 63, 47, 39, 19, 22, 35, 39,
80, 37.

Required; represent the above data in a frequency distribution using;

• Exclusive method
• Inclusive method

Example 2:

The following is an extract from the monthly record of water consumption, in
thousand litres, of Nimz Enterprises Ltd.

57, 59, 60, 61, 60, 63, 67, 70, 74, 80, 50, 59, 60, 63, 62, 61, 61, 65, 70, 76, 80, 89,
54, 59, 63, 60, 64, 65, 65, 86, 75, 80, 90, 92, 56, 60, 60, 64, 66, 66, 69, 68, 78, 84,
86, 96, 99, 60.

Represent the above data in the form of a frequency distribution.

Example 3:
Group the following data in a frequency polygon by taking a class interval 5 – 10

100, 150, 149, 105, 109, 101, 160, 200, 105, 205, 210, 111, 114, 131, 121, 123,
133, 129, 149, 148, 152, 157, 159, 167, 168, 161, 171, 173, 181, 189, 182, 191,
192, 199, 197, 201, 209, 202, 207, 195, 185, 165, 155, 145, 125, 175, 115, 109,
105, 201, 176, 177, 168, 169, 159, 157, 100.

Stem and leaf plot of data

Example 1: Data: 23, 58, 62, 62, 63, 65, 67, 71, 71, 72, 80, 82, 82, 82.

Required: Display the above data in a stem and leaf plot
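One way to build such a plot is sketched below in Python, assuming two-digit values so that the tens digit is the stem and the units digit is the leaf. The function name stem_and_leaf is illustrative.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem (tens) to the sorted list of its leaves (units)."""
    plot = defaultdict(list)
    for v in sorted(values):
        plot[v // 10].append(v % 10)
    return dict(plot)

data = [23, 58, 62, 62, 63, 65, 67, 71, 71, 72, 80, 82, 82, 82]
plot = stem_and_leaf(data)
# Print every stem in the range, including stems that have no leaves.
for stem in range(min(plot), max(plot) + 1):
    leaves = " ".join(str(leaf) for leaf in plot.get(stem, []))
    print(f"{stem} | {leaves}")
```

Stems with no leaves are still printed so that gaps in the data remain visible.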

Example 2: the height of 20 students in cm is given below:

143, 160, 154, 159, 172, 165, 162, 171, 146, 165, 176, 145, 182, 175, 186, 160,
158, 167, 172.

Required:

a) Display the data in a stem & leaf plot


b) Find the range of height
c) What is the mode height?
d) What is the median height?

GRAPHIC REPRESENTATION OF DATA

A 'Graph' is a vivid form of presentation of data. It is the simplest
and most common aid to numerical reading, giving a picture of numbers in
such a way that the relations between two series can be easily compared.

Types of graphs

Graphs are generally classified into two categories on the basis of the characteristics
of data - Graphs of Time series and Graphs of Frequency Distribution.
Graphs of Time Series (Historical)
Time series or historical series stand for the numerical record of the changes in a
variable during a given period of time. Time units are placed on the OX-axis and the
values of variable are measured on the OY-axis. There are three types of time series
graphs as under;
• One - Variable Graph.
• Two - Variable Graph.
• Three - Variable Graph.


One -Variable Graph: In this graph, only one factor is shown on the Y-axis and the
time is measured on X- axis.

Example: the following is the production of rice (in '000' tonnes) in Mbale in
different years.

Year                                1991-92  1992-93  1993-94  1994-95  1995-96  1996-97  1997-98
Rice Production (in '000' tonnes)   220      270      220      290      250      280      240
Represent the above data in a one-variable time series graph

A GRAPH SHOWING RICE PRODUCTION IN MBALE DURING SEVEN YEARS

Two-Variable Graph: In this graph, there are two factors on the Y-axis and the time is
measured on X-axis.

Example: the following data relates to foreign trade of Uganda during the seven years.

Year               1991-92  1992-93  1993-94  1994-95  1995-96  1996-97  1997-98
Exports (in shs)   3300     4000     5700     6300     6700     6000     6500
Imports (in shs)   2000     2500     2800     3000     3500     3800     4000

A GRAPH SHOWING IMPORTS & EXPORTS OF UGANDA (IN SHS) DURING THE SEVEN YEARS

Three Variable Graph: In three variable graph, there are three factors on the Y-axis
and the time is measured on X-axis. Following is the data relating to Income and
Expenditure of the Sports Club villa for five years.

Year                         1993   1994   1995   1996   1997
Income (in '000' shs)        150    180    160    190    170
Expenditure (in '000' shs)   90     100    120    190    200
Profit/loss (in '000' shs)   +60    +80    +40    0      -30


GRAPH SHOWING INCOME, EXPENDITURE & BALANCE OF SPORTS CLUB VILLA DURING THE FIVE YEARS

GRAPHS OF FREQUENCY DISTRIBUTION

Frequency distribution, whether discrete or continuous, can be graphically
represented. The values of variable or mid-values of the class interval are measured
on X-axis and the frequencies on Y-axis.

Types of frequency graphs

There are four types of frequency graphs:

• Histogram
• Frequency Polygon
• Frequency Curve
• Ogive Curves

Histogram: The histogram is a device of graphic representation of a frequency
distribution. It is constructed by erecting a set of rectangles on the class intervals
marked on the horizontal axis, with heights proportional to the respective class
frequencies. Frequencies are shown on Y-axis.

The height of rectangle represents the frequency of the respective class interval. The
height should be measured on Y-axis from the point of the 'upper-class limit' of the
concerned class interval. The area of all the rectangles joined together represents the
total frequencies.

Unequal class intervals

When the class intervals are not equal, the density of the frequency has to be
calculated. Frequency density refers to the concentration of frequency in a unit of
value of the total size of the class interval. In unequal class intervals, the frequencies
of each class are adjusted to the width of the interval with the lowest size as follows:

Adjusted Frequency of the Highest C.I
= (Freq. of the highest C.I × Width of the lowest C.I) / Width of the highest C.I

Examples

Construct a Histogram for the following frequency distribution

Variable    35-40  40-45  45-50  50-55  55-65
Frequency   12     30     22     30     28
Solution;

Adjusted Frequency of the Highest C.I
= (Freq. of the highest C.I × Width of the lowest C.I) / Width of the highest C.I
= (28 × 5) / 10 = 14

A HISTOGRAM SHOWING THE FREQUENCY DISTRIBUTION


Example 2

The following data relates to marks obtained by 2nd year BECON students in
statistics for economics exam.

12, 30, 29, 22, 35, 39, 40, 42, 40, 16, 30, 18, 12, 5, 15, 3, 4, 9, 45, 50, 0, 5, 6, 8, 15,
25, 22, 3, 4, 9, 12, 5, 9, 3, 4, 30, 32, 29, 1, 8

Required;

a) Prepare a frequency distribution of the above data taking a class interval of 10 (exclusive method)
b) Draw a histogram

Solution:

Class Interval (marks) Tally Bars Frequencies


0-10 //// //// //// // 17
10-20 //// // 7
20-30 //// 5
30-40 //// / 6
40-50 //// 4
50-60 / 1
Total 40
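
The tallying above can also be sketched in code. The following is a minimal illustration, assuming Python; the function name `frequency_distribution` and its parameters are hypothetical names chosen for this sketch and are not part of the handout. Each class is treated as [lower, upper), which is the exclusive method.

```python
from collections import Counter

# Marks of the 40 students, as listed in the example.
marks = [12, 30, 29, 22, 35, 39, 40, 42, 40, 16, 30, 18, 12, 5, 15, 3, 4, 9, 45, 50,
         0, 5, 6, 8, 15, 25, 22, 3, 4, 9, 12, 5, 9, 3, 4, 30, 32, 29, 1, 8]

def frequency_distribution(values, width, start=0):
    """Count values per class [lower, upper): the upper limit is excluded."""
    counts = Counter((v - start) // width for v in values)
    last_class = max(counts)
    return [(start + k * width, start + (k + 1) * width, counts.get(k, 0))
            for k in range(last_class + 1)]

table = frequency_distribution(marks, width=10)
for lower, upper, f in table:
    print(f"{lower}-{upper}: {f}")
print("Total:", sum(f for _, _, f in table))
```

A mark of exactly 10 falls in the class 10-20, not 0-10, which is the behaviour the exclusive method requires.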


A HISTOGRAM SHOWING MARKS OBTAINED BY BECON STUDENTS IN A STATISTICS FOR ECONOMICS EXAM

Example

Represent the following data by means of a histogram

Weekly wages (in shs) 20-25 25-30 30-35 35-40 40-45 45-50
No. of workers 6 18 25 14 10 8


Example

From the following information, determine the mode graphically and verify the
results.

Weekly wages (shs) 10-15 15-20 20-25 25-30 30-40 40-60 60-80
No. of workers 7 19 27 15 12 12 8
Solution:

Reduce the last three class intervals to the common width of 5 as follows:

Class interval   Calculation   Adjusted frequency
30-40            12 × 5/10     6
40-60            12 × 5/20     3
60-80            8 × 5/20      2

The modal class is 20-25, with f0 = 19 (frequency before it), f1 = 27 (highest frequency) and f2 = 15 (frequency after it).

Mode, Z:

\[
Z = l_m + \frac{d_1}{d_1 + d_2} \times c
\]

\[
Z = 20 + \frac{27 - 19}{(27 - 19) + (27 - 15)} \times 5 = 20 + \frac{8}{20} \times 5 = 22
\]
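
As a cross-check of the calculation above, here is a small sketch, assuming Python. The helper names `adjusted_frequency` and `grouped_mode` are hypothetical and introduced only for illustration; the sketch also assumes that the modal class is neither the first nor the last class.

```python
def adjusted_frequency(freq, width, base_width):
    """Scale the frequency of a wide class down to the narrowest class width."""
    return freq * base_width / width

def grouped_mode(lower, width, f0, f1, f2):
    """Interpolation formula Z = lm + d1 / (d1 + d2) * c."""
    d1 = f1 - f0  # modal frequency minus the frequency before it
    d2 = f1 - f2  # modal frequency minus the frequency after it
    return lower + d1 / (d1 + d2) * width

# (lower limit, upper limit, number of workers) from the wages example
classes = [(10, 15, 7), (15, 20, 19), (20, 25, 27), (25, 30, 15),
           (30, 40, 12), (40, 60, 12), (60, 80, 8)]

base = min(upper - lower for lower, upper, _ in classes)
adjusted = [adjusted_frequency(f, upper - lower, base) for lower, upper, f in classes]

i = adjusted.index(max(adjusted))   # position of the modal class
lower, upper, _ = classes[i]
mode = grouped_mode(lower, upper - lower, adjusted[i - 1], adjusted[i], adjusted[i + 1])
print(adjusted, mode)
```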

Construct a Histogram from the following and then locate the mode

Marks 0-10 10-20 20-40 40-50 50-70


No. of students 10 30 80 64 56
Frequency polygon: This is another device for the graphic representation of a frequency distribution. It is a simple method of drawing the graph with the help of a histogram.

Steps:

• Construct the histogram.
• Plot the mid-point of the top of each rectangle.
• Connect the mid-points of the tops of all the rectangles by straight lines. This is done under the assumption that the frequencies in each class interval are evenly distributed.

The area of the frequency polygon is equal to the area of the histogram, as the area left outside is geometrically equal to the area included in it.

Example. Monthly profits of 100 shops are distributed as follows:

Profits per shop (in ‘000’ shs)   0-50   50-100   100-150   150-200   200-250   250-300
No. of shops                      12     18       27        20        17        6

Required:

Draw a histogram and a frequency polygon for the above data.

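Draw a histogram and show the frequency polygon for the following data:

Marks             40-50   50-60   60-70   70-80   80-100
No. of students   5       12      20      13      8

The last class (80-100) has a width of 20 while the others have a width of 10, so its frequency is adjusted:

\[
\text{Adjusted frequency of the highest C.I} = \frac{\text{Freq. of the highest C.I} \times \text{Width of the lowest C.I}}{\text{Width of the highest C.I}} = \frac{8 \times 10}{20} = 4
\]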

GRAPH SHOWING HISTOGRAM & FREQUENCY POLYGON OF MARKS SCORED BY THE STUDENTS


Frequency curve: This is drawn along the same lines as a frequency polygon, but the mid-points at the tops of the class-interval rectangles are joined by a smooth curve, rounding off the corners so that the curve follows the general trend. The only difference between a frequency polygon and a frequency curve therefore lies in how the points are joined.

(Refer to the above example)

Ogive curves: These are the continuous form of the cumulative frequency curves, namely the 'less than' cumulative frequency curve and the 'more than' cumulative frequency curve. This method of drawing curves is among the most useful, as it serves many purposes. Ogive curves are based on the cumulative frequencies. That is why they are also called "Cumulative Frequency Curves".

Types of Ogives

Less than Ogive: It consists in plotting the 'less than' frequencies against the upper
limit of the class interval or boundaries. The points so obtained are joined by a
smoothed curve. It is an increasing curve sloping upward from left to right of the
graph and it is in the shape of an elongated-(S).

More than Ogive: It consists in plotting the 'more than' frequencies against the lower
limit of the class interval or boundaries. The points so obtained are joined by a
smoothed curve. It is a decreasing curve sloping downward from left to right of the
graph and it is in the shape of an elongated upside down - (s).

Example 1: Draw an ogive and from it read the median and quartiles.

Marks             0-20   20-30   30-40   40-50   50-60   60-70   70-80
No. of students   21     19      60      42      24      18      17

Solution

To find the partition values, the data are first converted into a 'less than' cumulative frequency distribution:

Marks (less than)        20   30   40    50    60    70    80
No. of students (c.f.)   21   40   100   142   166   184   201


Example 2

Draw two ogives from the following data and locate the median value.

Marks             20-40   40-60   60-80   80-100   100-120   120-140   140-160
No. of students   4       6       10      16       12        7         3

Solution

To draw the ogive curves, the data are converted into two cumulative frequency distributions.

Marks (C.I)   No. of students   Marks           LT. Cf.   Marks           MT. Cf.
20-40         4                 Less than 40    4         More than 20    58
40-60         6                 Less than 60    10        More than 40    54
60-80         10                Less than 80    20        More than 60    48
80-100        16                Less than 100   36        More than 80    38
100-120       12                Less than 120   48        More than 100   22
120-140       7                 Less than 140   55        More than 120   10
140-160       3                 Less than 160   58        More than 140   3
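
The two cumulative columns in the table can be reproduced with a short sketch, assuming Python and its standard `itertools.accumulate`; the variable names are illustrative only.

```python
from itertools import accumulate

classes = [(20, 40), (40, 60), (60, 80), (80, 100), (100, 120), (120, 140), (140, 160)]
freqs = [4, 6, 10, 16, 12, 7, 3]

# 'Less than' c.f.: running total from the first class, plotted against upper limits.
less_than = list(accumulate(freqs))

# 'More than' c.f.: running total from the last class, plotted against lower limits.
more_than = list(accumulate(reversed(freqs)))[::-1]

for (lower, upper), lt, mt in zip(classes, less_than, more_than):
    print(f"Less than {upper}: {lt:>3}    More than {lower}: {mt:>3}")
```

The points (upper limit, less-than c.f.) and (lower limit, more-than c.f.) are the ones joined to form the two ogives; the curves cross at the median.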


Example 3

Draw a ‘Less Than Ogive’ from the following and locate the median

Size 10-20 20-30 30-40 40-50 50-60


Freq. 20 60 100 150 75
Example 4

Draw two ogives from the following data and locate the median

Class 100-200 200-300 300-400 400-500 500-600 600-700


Freq. 20 40 60 80 100 120
Example 5

Draw a less than ogive and find the value of quartiles for the following;

Class 20 40 60 80 100 120 140


Freq. 3 8 20 38 52 58 60

Thus, graphic representation is a powerful and effective medium for presenting statistical data. It is not, however, a complete substitute for the tabular form of presentation under all circumstances. Even so, graphs play an important role by facilitating the comparison of values, trends and relationships.


MEASURES OF CENTRAL TENDENCY OR LOCATION OR STATISTICAL AVERAGES (AVERAGES OF FIRST ORDER)

It is very difficult to understand a mass of data. In order to get a concise and complete picture of a large body of data, it is essential to obtain a single figure which represents the whole data. With the help of such a figure, data are compared and understood easily. A figure which represents the whole data is known as an average or a measure of central tendency. An average removes all the unnecessary details of the data and gives a concise picture of the data under investigation.

Meaning of statistical averages


An 'average' is a figure which represents a large number of observations in a single, concise numerical value. It is a typical size which describes the central tendency. It is a representative value around which all the values of the variable cluster or concentrate.

"Measures of Central Tendency", "Measures of Central Location" or "Averages of First Order" describe how a large number of values concentrate around a central value. An average is a summary figure. It is a single value which represents all the items. It is a single simple expression in which the net result of a complex group of numbers is concentrated.

Qualities of a good average

Following are the requisite properties of a good and ideal average:

• It should be easily understood.
• It should be simple to calculate.
• It should be based on all the observations.
• It should not be unduly affected by extreme values.
• It should be rigidly defined.
• It should be capable of further algebraic treatment.
• It should have sampling stability (three or four sampling sets must give the same or similar results).

Limitations of Statistical Averages:

Although an average is useful in studying complex data and is very widely used in almost all spheres of human activity, it is not without limitations that restrict its scope and applicability.

Following are the limitations of statistical averages:

• Extreme values, if any, affect the average disproportionately.
• The composition of the data cannot be seen from the average.
• The average does not always represent the characteristics of individual items.
• The average gives only a representative figure of the mass but fails to depict the entire picture of the data.

In spite of these limitations, statistical averages remain useful measures which play an important role in analysing mass data, as there is no alternative for summarising such data in a single figure.

Types of measures of central tendency (statistical averages)

Broadly speaking, there are five types of statistical averages which are commonly used in practice. They fall into two groups:

Mathematical averages:

• Arithmetic Mean (simple & weighted arithmetic mean)
• Geometric Mean
• Harmonic Mean

Positional averages:

• Median
• Mode

ARITHMETIC MEAN/MEAN/SIMPLE AVERAGE

This is the quotient of the sum of the values divided by their number; that is, it is obtained by dividing the sum of the values of the items in a variable by the number of items. It is expressed as follows:

Individual observations/series:

\[
\bar{X} = \frac{\sum x}{n}
\]

Discrete/continuous series:

\[
\bar{X} = \frac{\sum fx}{\sum f}
\]

Where: x is the value of each item in an individual or discrete series, and the mid-value of each class in a continuous series

∑ is the Greek letter sigma, which means "sum of the values"

n is the number of observations

∑f is the sum of the frequencies

COMPUTATION OF THE MEAN

The mean can be calculated by the help of two methods, i.e. Direct method &
Short-cut method but much emphasis in this handout is placed on direct method.

Individual series (ungrouped data)

The marks of 15 students of a college in statistics are as follows:

49, 62, 11, 74, 55, 80, 56, 64, 34, 42, 56, 67, 10, 68, & 46

Determine the mean marks.

Solution:

\[
\bar{X} = \frac{\sum x}{n} = \frac{49 + 62 + 11 + 74 + 55 + 80 + 56 + 64 + 34 + 42 + 56 + 67 + 10 + 68 + 46}{15} = \frac{774}{15} = 51.6 \text{ marks}
\]
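
The direct method for an individual series amounts to one division. A minimal sketch, assuming Python (the variable names are illustrative only):

```python
# Marks of the 15 students, as listed in the question.
marks = [49, 62, 11, 74, 55, 80, 56, 64, 34, 42, 56, 67, 10, 68, 46]

total = sum(marks)   # sum of x
n = len(marks)       # number of observations
mean = total / n     # mean = sum of x / n
print(total, n, round(mean, 1))
```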

Discrete series:

In these series, the value of each of the item is multiplied by the corresponding
frequency and the total of products is divided by the total number of frequencies.


Worked Examples:

Calculate arithmetic mean from the following data:

Marks 15 25 35 45 55 65 75 85 95
No. of students 1 2 3 4 5 7 12 16 10
Solution:

Marks (x)   No. of students (f)   fx
15          1                     15
25          2                     50
35          3                     105
45          4                     180
55          5                     275
65          7                     455
75          12                    900
85          16                    1,360
95          10                    950
            ∑f = 60               ∑fx = 4,290

\[
\bar{X} = \frac{\sum fx}{\sum f} = \frac{4{,}290}{60} = 71.5 \approx 72 \text{ marks}
\]

Example II

In the city, 30 members were surveyed as to how many domestic appliances they
have purchased, the replies were as follows:

1, 2, 5, 1, 5, 2, 1, 4, 2, 3, 4, 2, 4, 3, 2, 6, 3, 2, 4, 3, 6, 2, 2, 3, 3, 7, 2, 3, 0, 2

Required: prepare a discrete series and find the mean.

Solution:

Values (x)   Tally Bars   Frequencies (f)   fx
0            /            1                 0
1            ///          3                 3
2            //// ////    10                20
3            //// //      7                 21
4            ////         4                 16
5            //           2                 10
6            //           2                 12
7            /            1                 7
                          ∑f = 30           ∑fx = 89

\[
\bar{X} = \frac{\sum fx}{\sum f} = \frac{89}{30} = 2.9667 \text{ appliances per member}
\]

Continuous series:

The method of calculating the mean from continuous series is exactly the same as
that of discrete series with the exception that in continuous series, mid-points of
various class intervals are determined against the class intervals. These mid-point
values are then multiplied by the corresponding frequencies.

Example:

Determine the mean from the following data:

C.I 0-5 5-10 10-15 15-20 20-25 25-30 30-35 35-40


Freq. 17 30 35 49 45 33 26 15
Solution:

Class intervals   Mid-points (x)   f          fx
0-5               2.5              17         42.5
5-10              7.5              30         225
10-15             12.5             35         437.5
15-20             17.5             49         857.5
20-25             22.5             45         1,012.5
25-30             27.5             33         907.5
30-35             32.5             26         845
35-40             37.5             15         562.5
                                   ∑f = 250   ∑fx = 4,890

\[
\bar{X} = \frac{\sum fx}{\sum f} = \frac{4{,}890}{250} = 19.56 \approx 20
\]
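
The same formula serves both discrete and continuous series; only the meaning of x changes. A minimal sketch, assuming Python, where `grouped_mean` is a hypothetical helper name:

```python
def grouped_mean(xs, fs):
    """Mean of grouped data: sum(f * x) / sum(f)."""
    return sum(f * x for x, f in zip(xs, fs)) / sum(fs)

# Continuous series: x is the mid-point of each class interval.
classes = [(0, 5), (5, 10), (10, 15), (15, 20), (20, 25), (25, 30), (30, 35), (35, 40)]
freqs = [17, 30, 35, 49, 45, 33, 26, 15]
midpoints = [(lower + upper) / 2 for lower, upper in classes]
continuous_mean = grouped_mean(midpoints, freqs)

# Discrete series: x is the value itself (the marks example above).
discrete_mean = grouped_mean([15, 25, 35, 45, 55, 65, 75, 85, 95],
                             [1, 2, 3, 4, 5, 7, 12, 16, 10])
print(continuous_mean, discrete_mean)
```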

GEOMETRIC MEAN

This is the ‘n’th root of the product of the ‘n’ quantities of a series. In the field of business & management, various problems often arise relating to the “average percentage” rate of change over a period of time. In such cases, the geometric mean is suitably applied. It involves obtaining the log values of the items and determining the average of these values. It is obtained as follows:

Individual series:

\[
G.M = \text{antilog}\left(\frac{\sum \log x}{n}\right)
\]

Discrete/continuous series:

\[
G.M = \text{antilog}\left(\frac{\sum f \log x}{\sum f}\right)
\]

Where: G.M is the geometric mean

log x is the log value of each variable

n is the number of items under observation

f is the frequency

COMPUTATION OF GEOMETRIC MEAN

Individual series:

Compute geometric mean from the following data


75, 125, 350, 790, 1250, 1740, 2000

Solution:

x       log x
75      1.8751
125     2.0969
350     2.5441
790     2.8976
1250    3.0969
1740    3.2405
2000    3.3010
        ∑log x = 19.0521

\[
G.M = \text{antilog}\left(\frac{\sum \log x}{n}\right) = \text{antilog}\left(\frac{19.0521}{7}\right) = \text{antilog}(2.7217)
\]

∴ G.M = 527
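
The log-table procedure above can be mirrored in a few lines. This is a sketch only, assuming Python and base-10 logarithms from the standard `math` module; the function name `geometric_mean` is chosen for illustration.

```python
import math

def geometric_mean(values):
    """Antilog of the average of the base-10 logs of the values."""
    mean_log = sum(math.log10(v) for v in values) / len(values)
    return 10 ** mean_log   # antilog

data = [75, 125, 350, 790, 1250, 1740, 2000]
gm = geometric_mean(data)
print(round(gm))
```

Because the code uses full-precision logarithms instead of four-figure tables, the last decimal places may differ slightly from a hand calculation.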

Discrete Series:

Yield in maize mill (in tonnes) of 6 farms in Soroti District is given below:

Farm No. 85 95 105 120 125 145


Yield (tonnes) 5 8 25 14 20 9
Required:

Determine the Geometric mean.

Solution:

Farm No. (x)   log x    f         f log x
85             1.9294   5         9.6470
95             1.9777   8         15.8216
105            2.0212   25        50.5300
120            2.0792   14        29.1088
125            2.0969   20        41.9380
145            2.1614   9         19.4526
                        ∑f = 81   ∑f log x = 166.4980

\[
G.M = \text{antilog}\left(\frac{\sum f \log x}{\sum f}\right) = \text{antilog}\left(\frac{166.498}{81}\right) = \text{antilog}(2.0555)
\]

G.M = 114 tonnes

Continuous series:

Calculate Geometric mean from the following data:

Class Interval 0-10 10-20 20-30 30-40 40-50


Frequency 5 12 17 25 6
Solution:

Class intervals   Mid-points (x)   log x    f         f log x
0-10              5                0.6989   5         3.4945
10-20             15               1.1761   12        14.1132
20-30             25               1.3979   17        23.7643
30-40             35               1.5441   25        38.6025
40-50             45               1.6532   6         9.9192
                                            ∑f = 65   ∑f log x = 89.8937

\[
G.M = \text{antilog}\left(\frac{\sum f \log x}{\sum f}\right) = \text{antilog}\left(\frac{89.8937}{65}\right) = \text{antilog}(1.38298)
\]

G.M = 24.2
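
For grouped data the logs are simply weighted by the frequencies. A minimal sketch, assuming Python; `grouped_geometric_mean` is a hypothetical helper name, and class mid-points stand in for x as in the table above.

```python
import math

def grouped_geometric_mean(xs, fs):
    """Antilog of sum(f * log10(x)) / sum(f)."""
    weighted_log_sum = sum(f * math.log10(x) for x, f in zip(xs, fs))
    return 10 ** (weighted_log_sum / sum(fs))

classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
freqs = [5, 12, 17, 25, 6]
midpoints = [(lower + upper) / 2 for lower, upper in classes]

gm = grouped_geometric_mean(midpoints, freqs)
print(round(gm, 1))
```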


HARMONIC MEAN

This is the total number of items of a variable divided by the sum of the reciprocals of the values of the variable. It is a specialized average which solves problems involving variables expressed as rates, such as km per hour, hours per km, units per day, price per unit and so forth.

Note: the harmonic mean is suitable only when the time factor is variable and the act being performed remains constant. In that case, the harmonic mean provides the same result as the weighted arithmetic mean; otherwise it does not. It is denoted as:

Individual series:

\[
H.m = \frac{n}{\sum R_x}
\]

Grouped data (discrete & continuous series):

\[
H.m = \frac{\sum f}{\sum f R_x}
\]

Where: n = number of items

R_x = reciprocal of the x values, given by 1/x

f = frequencies

Worked examples:

Compute harmonic mean from the given data:

40, 80, 90, 120, 150, 180, 200

Solution:

x      R_x = 1/x
40     0.0250
80     0.0125
90     0.0111
120    0.0083
150    0.0067
180    0.0056
200    0.0050
       ∑R_x = 0.0742

\[
H.m = \frac{n}{\sum R_x} = \frac{7}{0.0742} \approx 94
\]

Discrete series:

Compute harmonic mean from the following:

𝒙 20 25 30 35 40 45 50
𝒇 3 8 12 15 10 7 5
Solution:

x     R_x = 1/x   f         f(1/x)
20    0.0500      3         0.1500
25    0.0400      8         0.3200
30    0.0333      12        0.4000
35    0.0286      15        0.4286
40    0.0250      10        0.2500
45    0.0222      7         0.1556
50    0.0200      5         0.1000
                  ∑f = 60   ∑f(1/x) = 1.8042

\[
H.m = \frac{\sum f}{\sum f R_x} = \frac{60}{1.8042} \approx 33.3
\]

Continuous series:


Determine harmonic mean from the given data:

Size 0-4 4-8 8-12 12-16 16-20 20-24 24-28 28-32


frequency 6 15 22 28 45 25 13 6
Solution:

Size     Mid-points (x)   R_x = 1/x   f          f(1/x)
0-4      2                0.5000      6          3.0000
4-8      6                0.1667      15         2.5000
8-12     10               0.1000      22         2.2000
12-16    14               0.0714      28         2.0000
16-20    18               0.0556      45         2.5000
20-24    22               0.0455      25         1.1364
24-28    26               0.0385      13         0.5000
28-32    30               0.0333      6          0.2000
                                      ∑f = 160   ∑f(1/x) = 14.0364

\[
H.m = \frac{\sum f}{\sum f R_x} = \frac{160}{14.0364} \approx 11.4
\]
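
The three harmonic-mean cases differ only in whether frequencies are supplied. The following is a sketch under that assumption, written in Python; `harmonic_mean` is a hypothetical helper and the frequencies default to 1 for an individual series. Reciprocals are kept at full precision here, so results can differ in the last digit from a table worked with rounded reciprocals.

```python
def harmonic_mean(values, freqs=None):
    """Harmonic mean: sum(f) / sum(f / x); f defaults to 1 for each value."""
    if freqs is None:
        freqs = [1] * len(values)
    return sum(freqs) / sum(f / x for x, f in zip(values, freqs))

# Individual series
hm_individual = harmonic_mean([40, 80, 90, 120, 150, 180, 200])

# Discrete series
hm_discrete = harmonic_mean([20, 25, 30, 35, 40, 45, 50], [3, 8, 12, 15, 10, 7, 5])

# Continuous series: x is the class mid-point
midpoints = [2, 6, 10, 14, 18, 22, 26, 30]
hm_continuous = harmonic_mean(midpoints, [6, 15, 22, 28, 45, 25, 13, 6])

print(round(hm_individual, 2), round(hm_discrete, 2), round(hm_continuous, 2))
```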

POSITIONAL AVERAGES

MEDIAN:

This is the value of that item in a series which divides the array into two equal parts, one consisting of all the values less than it and the other consisting of all the values more than it. This means that when the values of a variable are arranged in an array in ascending or descending order of magnitude, the value of the middle item of the array is the median. It is expressed as follows:

Individual series:

\[
Me = \text{size of the } \left(\frac{n+1}{2}\right)\text{th item}
\]


Note: when computing the median under individual series, first arrange the data in ascending order.

Computation of the median when the number of items is odd

Example:

In a company, there are 7 workers whose ages are 20, 25, 29, 19, 35, 40, 55 years.
Find the median age.

Solution:

Arrange the data in ascending order as follows:

S/No.   Workers’ Ages
1       19
2       20
3       25
4       29
5       35
6       40
7       55

\[
Me = \text{size of the } \left(\frac{n+1}{2}\right)\text{th item} = \text{size of the } \left(\frac{7+1}{2}\right)\text{th item} = \text{size of the 4th item}
\]

∴ median = 29 years

Computation of the median when the number of items is even

When the number of items is even, the following formula is used:

\[
Me = \frac{1}{2}\left(\text{size of the } \frac{n}{2}\text{th item} + \text{size of the } \frac{n+2}{2}\text{th item}\right)
\]

Example:


Compute the median value from the following data:

51, 65, 40, 44, 46, 55, 48, 62

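Solution:

Arrange the data in ascending order as follows:

S/No.   Values
1       40
2       44
3       46
4       48
5       51
6       55
7       62
8       65

\[
Me = \frac{1}{2}\left(\text{size of the } \frac{n}{2}\text{th item} + \text{size of the } \frac{n+2}{2}\text{th item}\right) = \frac{1}{2}\left(\text{size of the 4th item} + \text{size of the 5th item}\right)
\]

\[
Me = \frac{1}{2}(48 + 51) = \frac{99}{2} = 49.5
\]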
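
Both cases (odd and even number of items) can be covered by one small routine. This is a sketch, assuming Python; the function name `median_individual` is illustrative only. The key step is sorting before picking the middle position.

```python
def median_individual(values):
    """Median of an individual series: sort first, then take the middle item(s)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the (n + 1) / 2 th item (index mid when counting from 0).
        return ordered[mid]
    # Even n: average of the n/2 th and (n + 2)/2 th items.
    return (ordered[mid - 1] + ordered[mid]) / 2

ages = [20, 25, 29, 19, 35, 40, 55]
values = [51, 65, 40, 44, 46, 55, 48, 62]
print(median_individual(ages), median_individual(values))
```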

Discrete series:

In discrete series, the items are first arranged in ascending or descending order of magnitude and their corresponding frequencies are written against them. The frequencies are then cumulated, and the value of the middle item is located with reference to the first cumulative frequency that covers its position. The formula, however, becomes:

\[
Me = \text{size of the } \left(\frac{\sum f + 1}{2}\right)\text{th item}
\]

Example:

Determine the median from the data given below:



x 31 32 33 34 35 36 37 38 39 40
Freq. 5 18 21 26 29 22 15 12 10 4
Solution:

x     f          Cumulative freq.
31    5          5
32    18         23
33    21         44
34    26         70
35    29         99
36    22         121
37    15         136
38    12         148
39    10         158
40    4          162
      ∑f = 162

\[
Me = \text{size of the } \left(\frac{\sum f + 1}{2}\right)\text{th item} = \text{size of the } \left(\frac{162 + 1}{2}\right)\text{th item} = \text{size of the 81.5th item}
\]

The 81.5th item lies within the cumulative frequency 99, and so the median = 35.

Continuous series:

Under this series, since the class intervals are already in the form of an array, the median class is located using the (n/2)th item, with reference to the first cumulative frequency that covers it. After locating the median class, the median value is determined using the interpolation formula below:

\[
Me = L_m + \left(\frac{\frac{n}{2} - Cf_b}{f_m}\right) \times C
\]

Where: L_m = lower limit of the median class

Cf_b = cumulative frequency before the median class

f_m = frequency of the median class

C = magnitude of the median class (upper limit − lower limit)

Note: While calculating the median, always make sure that the data under observation are in exclusive form. If the data are of the inclusive type, convert them into exclusive form first by obtaining the class boundaries, since the limits used in the formula are taken from the class boundaries.

Worked examples:

The data below shows marks obtained by students in a statistic exam.

Marks 30-35 35-40 40-45 45-50 50-55 55-60


Students 5 8 10 6 3 2
Required: find the median mark

Solution:

Marks (C.I)   Students (f)   C.f
30-35         5              5
35-40         8              13
40-45         10             23
45-50         6              29
50-55         3              32
55-60         2              35
              ∑f = 35

\[
Me = \text{size of the } \frac{n}{2}\text{th item} = \frac{35}{2} = 17.5\text{th item}
\]

The 17.5th item lies within the cumulative frequency 23, so the median class is 40-45.

\[
Me = L_m + \left(\frac{\frac{n}{2} - Cf_b}{f_m}\right) \times C = 40 + \left(\frac{17.5 - 13}{10}\right) \times 5 = 40 + \frac{4.5 \times 5}{10}
\]

\[
= 40 + 2.25 = 42.25 \text{ marks}
\]

Example II

Find the median value of the following data:

C.I 4-7 8-11 12-15 16-19 20-23 24-27


Freq. 12 23 40 65 17 3
Solution:

The data are inclusive, so the class intervals must first be converted into exclusive form (class boundaries).

Class Intervals   C.B         f          C.f
4-7               3.5-7.5     12         12
8-11              7.5-11.5    23         35
12-15             11.5-15.5   40         75
16-19             15.5-19.5   65         140
20-23             19.5-23.5   17         157
24-27             23.5-27.5   3          160
                              ∑f = 160

\[
Me = \text{size of the } \frac{n}{2}\text{th item} = \frac{160}{2} = 80\text{th item}
\]

The 80th item lies within the cumulative frequency 140, so the median class is 15.5-19.5.

\[
Me = L_m + \left(\frac{\frac{n}{2} - Cf_b}{f_m}\right) \times C = 15.5 + \left(\frac{80 - 75}{65}\right) \times 4 = 15.5 + \frac{5 \times 4}{65}
\]

\[
= 15.5 + 0.31 = 15.81
\]

OTHER PARTITIONED VALUES

QUARTILES

These divide the entire distribution into four equal parts by means of three values, i.e. the 1st Quartile, 2nd Quartile & 3rd Quartile.

• 1st Quartile (lower quartile): this is a value below which there are one-fourth of the items and above which there are three-fourths of the items.
• 2nd Quartile (median): this divides the distribution into two halves.
• 3rd Quartile (upper quartile): this is a value below which there are three-fourths of the items and above which there are one-fourth of the items.

Determination of quartiles

Measure   Individual & discrete series         Continuous series
Q1        (n + 1)/4 th item                    n/4 th item
Q2        2(n + 1)/4 th or (n + 1)/2 th item   2n/4 th or n/2 th item
Q3        3(n + 1)/4 th item                   3n/4 th item

Note: always arrange the data in either ascending or descending order when calculating quartiles under individual series.

Worked examples: (individual series)

Example:

The data below shows marks scored by BECON students in an exam.

23, 48, 34, 68, 15, 36, 24, 54, 65, 75, 92, 10, 70, 61, 20, 47, 83, 19, 77

Find: First Quartile, Median & Third Quartile

Solution:

Arrange the marks in ascending order:

S/No.   Marks
1       10
2       15
3       19
4       20
5       23    (Q1)
6       24
7       34
8       36
9       47
10      48    (Q2)
11      54
12      61
13      65
14      68
15      70    (Q3)
16      75
17      77
18      83
19      92

Calculations:

\[
Q_1 = \frac{n+1}{4}\text{th item} = \frac{19+1}{4}\text{th item} = 5\text{th item} = 23
\]

\[
Q_2 = \frac{n+1}{2}\text{th item} = \frac{19+1}{2}\text{th item} = 10\text{th item} = 48
\]

\[
Q_3 = \frac{3(n+1)}{4}\text{th item} = \frac{3(19+1)}{4}\text{th item} = 15\text{th item} = 70
\]

Computing Quartiles from Discrete series:

Example:

Locate the median and quartiles from the following data.

Shoe sizes 4.0 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0
Frequencies 20 36 44 50 80 30 30 16 14
Solution:

Shoe size (x)   f     C.f
4.0             20    20
4.5             36    56
5.0             44    100   (Q1)
5.5             50    150
6.0             80    230   (Q2)
6.5             30    260   (Q3)
7.0             30    290
7.5             16    306
8.0             14    320

Calculations:

\[
Q_1 = \frac{n+1}{4}\text{th item} = \frac{320+1}{4}\text{th item} = 80.25\text{th item}
\]

The 80.25th item lies within c.f 100; ∴ Q1 = 5.0

\[
Q_2 = \frac{n+1}{2}\text{th item} = \frac{320+1}{2}\text{th item} = 160.50\text{th item}
\]

The 160.50th item lies within c.f 230; ∴ Q2 = 6.0

\[
Q_3 = \frac{3(n+1)}{4}\text{th item} = \frac{3(320+1)}{4}\text{th item} = 240.75\text{th item}
\]

The 240.75th item lies within c.f 260; ∴ Q3 = 6.5

Computation of Quartiles from Continuous series:

Example:

Compute quartiles from the following data:

C.I     0-10   10-20   20-30   30-40   40-50   50-60   60-70   70-80
Freq.   5      8       7       12      28      20      10      10

Solution:

Class intervals   f          C.f
0-10              5          5
10-20             8          13
20-30             7          20
30-40             12         32
40-50             28         60
50-60             20         80
60-70             10         90
70-80             10         100
                  ∑f = 100

First quartile:

\[
Q_1 = \frac{n}{4}\text{th item} = \frac{100}{4} = 25\text{th item, which lies within c.f 32 (class 30-40)}
\]

\[
Q_1 = L_m + \left(\frac{\frac{n}{4} - Cf_b}{f_m}\right) \times C = 30 + \left(\frac{25 - 20}{12}\right) \times 10 = 30 + \frac{5 \times 10}{12} = 34.2
\]

Second quartile (median):

\[
Q_2 = \frac{n}{2}\text{th item} = \frac{100}{2} = 50\text{th item, which lies within c.f 60 (class 40-50)}
\]

\[
Q_2 = L_m + \left(\frac{\frac{n}{2} - Cf_b}{f_m}\right) \times C = 40 + \left(\frac{50 - 32}{28}\right) \times 10 = 40 + \frac{18 \times 10}{28} = 40 + 6.43 = 46.43
\]

Third quartile:

\[
Q_3 = \frac{3n}{4}\text{th item} = \frac{3 \times 100}{4} = 75\text{th item, which lies within c.f 80 (class 50-60)}
\]

\[
Q_3 = L_m + \left(\frac{\frac{3n}{4} - Cf_b}{f_m}\right) \times C = 50 + \left(\frac{75 - 60}{20}\right) \times 10 = 50 + \frac{15 \times 10}{20} = 50 + 7.5 = 57.5
\]
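
All three quartiles of a continuous series follow the same interpolation pattern, with only the target position changing (n/4, n/2, 3n/4). A minimal sketch, assuming Python; `grouped_quantile` is a hypothetical helper name.

```python
def grouped_quantile(classes, freqs, fraction):
    """Interpolated partition value: Lm + ((fraction * n - Cfb) / fm) * C."""
    target = fraction * sum(freqs)
    cf_before = 0
    for (lower, upper), f in zip(classes, freqs):
        if cf_before + f >= target:  # first cumulative frequency covering the target
            return lower + (target - cf_before) / f * (upper - lower)
        cf_before += f

classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60), (60, 70), (70, 80)]
freqs = [5, 8, 7, 12, 28, 20, 10, 10]

q1 = grouped_quantile(classes, freqs, 0.25)
q2 = grouped_quantile(classes, freqs, 0.50)
q3 = grouped_quantile(classes, freqs, 0.75)
print(round(q1, 2), round(q2, 2), round(q3, 2))
```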

THE MODE

This is the value which occurs with the maximum frequency; that is, it is the most common value of the variable. For example, manufacturers produce in greater quantity those items which are purchased most frequently by customers; this shows the application of the mode in business. The mode is denoted by the symbol ‘Z’.

Determination of the Mode

Individual series:

Determine the mode from the following variables.

54, 66, 42, 64, 44, 86, 104, 94, 100, 80, 72, 64, 64, 44, 64, 72, 54, 48, 52, 50. The
mode is 64 as it occurs most frequently.

Continuous series:

Under this, the modal class is located with the help of the highest frequency. After locating the modal class, the following interpolation formula is used to determine the modal value.

\[
Z = l_m + \left(\frac{d_1}{d_1 + d_2}\right) \times c
\]

Where: d_1 = difference between the modal frequency and the frequency preceding it (before)

d_2 = difference between the modal frequency and the frequency succeeding it (after)

l_m = lower class limit of the modal group

c = magnitude of the modal class interval

Example I:

Calculate the Mode from the following data:

Value 1 2 3 4 5 6 7
Freq. 7 11 31 17 16 5 2
Solution:

Z = 3 since the highest frequency is 31

Example II

The following data shows marks of all students of 50 schools in a city.

Marks                   35 & above   30-35   25-30   20-25   15-20   <15
No. of schools          7            10      15      9       5       4
Avge. No. of students   200          250     300     200     150     100

Required: compute the mode marks & mean marks

Solution:

The number of students in each class (f) is the number of schools multiplied by the average number of students. The open-ended classes are taken as 10-15 and 35-40.

Marks (C.I)    No. of schools   Avge. No. of students   No. of students (f)   C.I     x
Less than 15   4                100                     400                   10-15   12.5
15-20          5                150                     750                   15-20   17.5
20-25          9                200                     1,800                 20-25   22.5
25-30          15               300                     4,500                 25-30   27.5
30-35          10               250                     2,500                 30-35   32.5
35 & above     7                200                     1,400                 35-40   37.5
                                                        ∑f = 11,350

Mode:

\[
Z = l_m + \left(\frac{d_1}{d_1 + d_2}\right) \times c = 25 + \left(\frac{4500 - 1800}{(4500 - 1800) + (4500 - 2500)}\right) \times (30 - 25)
\]

\[
= 25 + \frac{2{,}700 \times 5}{2{,}700 + 2{,}000} = 25 + 2.9 = 27.9 \text{ marks}
\]

Mean:

\[
\bar{X} = \frac{\sum fx}{\sum f} = \frac{316{,}125}{11{,}350} \approx 27.9 \text{ marks}
\]

Exercises:

1. From the following data, estimate the mode and quartiles (1,2 & 3).

C.I 1-5 6-10 11-15 16-20 21-25 26-30


Freq. 5 15 35 30 5 18
2. Draw a histogram from the following data and locate the mode:

C.I 0-10 10-20 20-30 30-40 40-50 50-60


Freq. 14 23 35 20 8 5


MEASURES OF DISPERSION (VARIATION)

Dispersion is the degree to which numerical data tend to spread about an average value (Spiegel). It may also be defined as the extent of scatteredness of items around a measure of central tendency.

Types of measures of dispersion:

1. Positional (based on limits): Range & Quartile Deviation
2. Arithmetic (based on difference): Mean Deviation & Standard Deviation
3. Graphic (based on graph): Lorenz curve

RANGE

This is the difference between the largest and the smallest value of the series. It gives an extremely simple indicator of the variability of a set of observations. Note that, where data are grouped into a frequency distribution, the range is equal to the difference between the upper boundary of the highest class and the lower boundary of the lowest class. Thus, the range is given by:

\[
\text{Range} = \text{largest value} - \text{smallest value}
\]

\[
\text{Coefficient of range} = \frac{\text{largest value} - \text{smallest value}}{\text{largest value} + \text{smallest value}}
\]

Therefore, a distribution with a smaller range is less dispersed, and vice versa.

Examples:

Individual series:

Series   Values of variables
A        10, 11, 12, 13, 14
B        40, 41, 42, 43, 44
C        100, 101, 102, 103, 104

Required: compute the range and coefficient of range of the series, and state which one is more dispersed and which is more uniform.

Solution:

                                       A                          B                          C
Range = L − S                          14 − 10 = 4                44 − 40 = 4                104 − 100 = 4
Coefficient of range = (L − S)/(L + S)   (14 − 10)/(14 + 10) = 0.17   (44 − 40)/(44 + 40) = 0.05   (104 − 100)/(104 + 100) = 0.02

Thus, although all three series have the same range, Series C has the smallest coefficient of range and is the least dispersed and most uniform, while Series A is the least uniform and most dispersed.
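
The comparison of the three series can be sketched in code, assuming Python; the function names `value_range` and `coefficient_of_range` are illustrative only.

```python
def value_range(values):
    """Range = largest value - smallest value."""
    return max(values) - min(values)

def coefficient_of_range(values):
    """Coefficient of range = (L - S) / (L + S)."""
    largest, smallest = max(values), min(values)
    return (largest - smallest) / (largest + smallest)

series = {
    "A": [10, 11, 12, 13, 14],
    "B": [40, 41, 42, 43, 44],
    "C": [100, 101, 102, 103, 104],
}

for name, values in series.items():
    print(name, value_range(values), round(coefficient_of_range(values), 2))
```

The range is an absolute measure and is identical for the three series; the coefficient is a relative measure, which is why it can rank them.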

Discrete series:

Example: from the following distribution, find the range and coefficient of range

X 6 12 18 24 30 36 42
Freq. 20 130 80 60 210 1500 600
Solution:

\[
\text{Range} = L - S = 42 - 6 = 36
\]

\[
\text{CoR} = \frac{L - S}{L + S} = \frac{42 - 6}{42 + 6} = \frac{36}{48} = 0.75
\]

Continuous series:

Example;

C.I 120-130 130-140 140-150 150-160 160-170


Freq. 2 9 16 12 5
Solution:

\[
\text{Range} = L - S = 170 - 120 = 50
\]

\[
\text{CoR} = \frac{L - S}{L + S} = \frac{170 - 120}{170 + 120} = \frac{50}{290} = 0.17
\]

QUARTILE DEVIATION (Semi-Interquartile range)

This measures half the difference between the values of quartile three and quartile one. Q.D is based on partition values (the quartiles) rather than on every item, so it describes only the middle half of the distribution. Thus, the smaller the value of Q.D, the smaller the dispersion of the middle half of the distribution around the median, and vice versa. It is symbolically denoted as:

\[
Q.D = \frac{Q_3 - Q_1}{2}
\]

\[
\text{Coefficient of } Q.D = \frac{Q_3 - Q_1}{Q_3 + Q_1}
\]

Individual series:

From the following marks of 7 students, compute the quartile deviation and its coefficient.

20, 28, 40, 12, 30, 15, 50

Solution:

Arrange the marks in ascending order:

S/No.   1    2    3    4    5    6    7
Marks   12   15   20   28   30   40   50

\[
Q_1 = \frac{n+1}{4}\text{th item} = \frac{7+1}{4}\text{th item} = 2\text{nd item} = 15
\]

\[
Q_3 = \frac{3(n+1)}{4}\text{th item} = \frac{3(7+1)}{4}\text{th item} = 6\text{th item} = 40
\]

\[
\therefore Q.D = \frac{Q_3 - Q_1}{2} = \frac{40 - 15}{2} = 12.5
\]

\[
\text{Coefficient of } Q.D = \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{40 - 15}{40 + 15} = \frac{25}{55} = 0.45
\]


Discrete series:

Compute quartile deviation and its coefficient from the following data:

x 58 59 60 61 62 63 64 65 66
Freq. 15 20 32 35 33 22 20 10 8
Solution:

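x     Freq.      C.f
58    15         15
59    20         35
60    32         67
61    35         102
62    33         135
63    22         157
64    20         177
65    10         187
66    8          195
      ∑f = 195

\[
Q_1 = \frac{n+1}{4}\text{th item} = \frac{195+1}{4}\text{th item} = 49\text{th item, which lies within c.f 67; } Q_1 = 60
\]

\[
Q_3 = \frac{3(n+1)}{4}\text{th item} = \frac{3(195+1)}{4}\text{th item} = 147\text{th item, which lies within c.f 157; } Q_3 = 63
\]

\[
\therefore Q.D = \frac{Q_3 - Q_1}{2} = \frac{63 - 60}{2} = 1.5
\]

\[
\text{Coefficient of } Q.D = \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{63 - 60}{63 + 60} = \frac{3}{123} = 0.02
\]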
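
Locating an item "with reference to the cumulative frequency that covers it first" is a search over the running totals. The sketch below assumes Python and uses the standard `bisect` and `itertools` modules; `item_value` is a hypothetical helper name.

```python
from bisect import bisect_left
from itertools import accumulate

def item_value(xs, freqs, position):
    """Value of x whose cumulative frequency is the first to reach `position`."""
    cumulative = list(accumulate(freqs))
    return xs[bisect_left(cumulative, position)]

xs = [58, 59, 60, 61, 62, 63, 64, 65, 66]
freqs = [15, 20, 32, 35, 33, 22, 20, 10, 8]
n = sum(freqs)

q1 = item_value(xs, freqs, (n + 1) / 4)       # (n + 1)/4 th item
q3 = item_value(xs, freqs, 3 * (n + 1) / 4)   # 3(n + 1)/4 th item

quartile_deviation = (q3 - q1) / 2
coefficient = (q3 - q1) / (q3 + q1)
print(q1, q3, quartile_deviation, round(coefficient, 2))
```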

Continuous series:

From the following data, compute quartile deviation and its coefficient


Wages 4-8 8-12 12-16 16-20 20-24 24-28 28-32 32-36 36-40
Workers 6 10 18 30 15 12 10 6 2
Solution:

Wages (shs)   No. of workers (f)   C.f
4-8           6                    6
8-12          10                   16
12-16         18                   34
16-20         30                   64
20-24         15                   79
24-28         12                   91
28-32         10                   101
32-36         6                    107
36-40         2                    109
              ∑f = 109

\[
Q_1 = \frac{n}{4}\text{th item} = \frac{109}{4} = 27.25\text{th item, which lies within c.f 34 (class 12-16)}
\]

\[
Q_1 = L_m + \left(\frac{\frac{n}{4} - Cf_b}{f_m}\right) \times c = 12 + \frac{27.25 - 16}{18} \times 4 = 12 + \frac{45}{18} = 12 + 2.5 = 14.5
\]

\[
Q_3 = \frac{3n}{4}\text{th item} = \frac{3(109)}{4} = 81.75\text{th item, which lies within c.f 91 (class 24-28)}
\]

\[
Q_3 = L_m + \left(\frac{\frac{3n}{4} - Cf_b}{f_m}\right) \times c = 24 + \frac{81.75 - 79}{12} \times 4 = 24 + \frac{11}{12} = 24 + 0.92 = 24.92
\]

\[
\therefore Q.D = \frac{Q_3 - Q_1}{2} = \frac{24.92 - 14.5}{2} = 5.21
\]

\[
\text{Coefficient of } Q.D = \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{24.92 - 14.5}{24.92 + 14.5} = \frac{10.42}{39.42} = 0.26
\]

Exercise:

Compute quartile deviation and its coefficient from the following:

Size 5-7 8-10 11-13 14-16 17-19


Freq. 14 24 38 20 4

MEAN DEVIATION (𝜹)/Average deviation/First moment of dispersion

Mean deviation is the average of the absolute differences between the items in a
series and the mean, median or mode of that series. It measures the extent to which
the values are dispersed about the chosen average. It is found by averaging all the
deviations from that measure of central tendency. The deviations enter the
computation without regard to sign (i.e., all the deviations are treated as
positive).

Theoretically, the deviations are preferably taken from the median rather than from
the mean or the mode. The median is considered the most suitable central tendency
for this purpose because the sum of the absolute deviations from the median is never
greater than the sum of the absolute deviations from the mean. It is not common
practice to take deviations from the mode, as its value is sometimes not clearly
defined.

In aggregating the deviations, the algebraic negative signs are not taken into
account: all the deviations are treated as positive. Mean deviation is then given by
the following formulae:

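Individual series:

$\delta = \frac{\sum |x - \bar{x}|}{n}$

Discrete & continuous series:

$\delta = \frac{\sum f|x - \bar{x}|}{\sum f}$

When the deviations are taken from the median, $Me$ replaces $\bar{x}$ in these formulae.

$\text{Coefficient of } \delta = \frac{\delta}{Me} \text{ or } \frac{\delta}{\bar{x}} \text{ or } \frac{\delta}{Z}$

The divisor is the average from which the deviations were taken: the median ($Me$), the mean ($\bar{x}$) or the mode ($Z$).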

Individual series:

Example:

Given below are marks obtained by 9 students:

45, 32, 37, 46, 39, 36, 41, 48 & 36

Required: find the mean deviation and coefficient of mean deviation from the mean
& median.

Solution:

S/No. 𝑥 |𝑥 − 𝑥̅ | |𝒙 − 𝑴𝒆|

1 32 8 7
2 36 4 3
3 36 4 3
4 37 3 2
5 39 1 0
6 41 1 2
7 45 5 6
8 46 6 7
9 48 8 9
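n = 9   ∑|x − x̄| = 40   ∑|x − Me| = 39

From the mean:

$\bar{x} = \frac{\sum x}{n} = \frac{360}{9} = 40$

$\therefore \delta = \frac{\sum |x - \bar{x}|}{n} = \frac{40}{9} = 4.4$

From the median:

$Me = \left(\frac{n+1}{2}\right)^{\text{th}} \text{ item} = \left(\frac{9+1}{2}\right)^{\text{th}} \text{ item} = 5\text{th item} = 39$

$\therefore \delta = \frac{\sum |x - Me|}{n} = \frac{39}{9} = 4.3$

$\text{Thus, coefficient of } \delta = \frac{\delta}{Me} = \frac{4.3}{39} = 0.11 \text{ or } \frac{\delta}{\bar{x}} = \frac{4.4}{40} = 0.11$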
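
The same individual-series calculation can be reproduced with Python's standard `statistics` module. This is a minimal sketch; the helper `mean_deviation` is a name chosen here for illustration.

```python
from statistics import mean, median


def mean_deviation(data, centre):
    """Average of the absolute deviations of `data` from `centre`."""
    return sum(abs(x - centre) for x in data) / len(data)


marks = [45, 32, 37, 46, 39, 36, 41, 48, 36]

x_bar = mean(marks)   # arithmetic mean of the nine marks
me = median(marks)    # middle (5th) item after sorting

md_mean = mean_deviation(marks, x_bar)
md_median = mean_deviation(marks, me)

print(round(md_mean, 2), round(md_median, 2))
print(round(md_mean / x_bar, 2), round(md_median / me, 2))
```

Using `abs()` is what "ignoring the negative signs" means in practice: without it the deviations from the mean would sum to zero.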

Discrete series:

Example:

Determine mean deviation from the median and mean from the following data:

Wages 20 18 16 14 12 10 6 4
Freq. 2 4 9 18 27 25 14 1
Solution:

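Wages (x)   f     c.f   fx      |x − x̄|   |x − Me|   f|x − x̄|   f|x − Me|
20          2     2     40      8.3       8          16.6       16
18          4     6     72      6.3       6          25.2       24
16          9     15    144     4.3       4          38.7       36
14          18    33    252     2.3       2          41.4       36
12          27    60    324     0.3       0          8.1        0
10          25    85    250     1.7       2          42.5       50
6           14    99    84      5.7       6          79.8       84
4           1     100   4       7.7       8          7.7        8
Total       ∑f = 100    ∑fx = 1,170                  ∑f|x − x̄| = 260   ∑f|x − Me| = 254

From the mean:

$\bar{x} = \frac{\sum fx}{\sum f} = \frac{1170}{100} = 11.7$

$\therefore \delta = \frac{\sum f|x - \bar{x}|}{\sum f} = \frac{260}{100} = 2.6$

From the median:

$Me = \left(\frac{n+1}{2}\right)^{\text{th}} \text{ item} = \left(\frac{100+1}{2}\right)^{\text{th}} \text{ item} = 50.5\text{th item}$, which falls within c.f 60, so $Me = 12$

$\therefore \delta = \frac{\sum f|x - Me|}{\sum f} = \frac{254}{100} = 2.54$

$\text{Thus, coefficient of } \delta = \frac{\delta}{Me} = \frac{2.54}{12} = 0.21 \text{ or } \frac{\delta}{\bar{x}} = \frac{2.6}{11.7} = 0.22$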
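
For a frequency distribution, each absolute deviation is weighted by its frequency. The sketch below (the function name `weighted_mean_deviation` is an assumption) works with the unrounded mean of 11.7 and with the median of 12 read from the cumulative frequencies.

```python
def weighted_mean_deviation(values, freqs, centre):
    """Frequency-weighted average of absolute deviations from `centre`."""
    total = sum(f * abs(x - centre) for x, f in zip(values, freqs))
    return total / sum(freqs)


wages = [20, 18, 16, 14, 12, 10, 6, 4]
freqs = [2, 4, 9, 18, 27, 25, 14, 1]

# Weighted mean: sum(fx) / sum(f) = 1170 / 100
x_bar = sum(f * x for x, f in zip(wages, freqs)) / sum(freqs)
me = 12  # median taken from the c.f column (50.5th item)

md_mean = weighted_mean_deviation(wages, freqs, x_bar)
md_median = weighted_mean_deviation(wages, freqs, me)

print(round(x_bar, 1), round(md_mean, 2), round(md_median, 2))
```

Carried at this precision, the mean deviation is about 2.6 from the mean and 2.54 from the median.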

Continuous Series:

Example


From the following data, calculate mean deviation from the median and the mean

Class 1-5 6-10 11-15 16-20 21-25 26-30 31-35 36-40 41-45
Freq. 7 10 16 32 24 18 10 5 1
Solution:

Class   CB          x    f    c.f   fx    |x − x̄|   |x − Me|   f|x − x̄|   f|x − Me|
1-5     0.5-5.5     3    7    7     21    17.4      15         121.8      105
6-10    5.5-10.5    8    10   17    80    12.4      10         124        100
11-15   10.5-15.5   13   16   33    208   7.4       5          118.4      80
16-20   15.5-20.5   18   32   65    576   2.4       0          76.8       0
21-25   20.5-25.5   23   24   89    552   2.6       5          62.4       120
26-30   25.5-30.5   28   18   107   504   7.6       10         136.8      180
31-35   30.5-35.5   33   10   117   330   12.6      15         126        150
36-40   35.5-40.5   38   5    122   190   17.6      20         88         100
41-45   40.5-45.5   43   1    123   43    22.6      25         22.6       25
Total                    ∑f = 123   ∑fx = 2,504     ∑f|x − x̄| = 876.8   ∑f|x − Me| = 860

From the mean:

$\bar{x} = \frac{\sum fx}{\sum f} = \frac{2504}{123} = 20.4$

$\therefore \delta = \frac{\sum f|x - \bar{x}|}{\sum f} = \frac{876.8}{123} = 7.13$

From the median:

$Me = \left(\frac{n}{2}\right)^{\text{th}} \text{ item} = \frac{123}{2} = 61.5\text{th item}$, which falls within c.f 65, i.e. class (16-20)

The median is taken here as the mid-point of the median class, so $Me = 18$.

$\therefore \delta = \frac{\sum f|x - Me|}{\sum f} = \frac{860}{123} = 6.99$

$\text{Thus, coefficient of } \delta = \frac{\delta}{Me} = \frac{6.99}{18} = 0.39 \text{ or } \frac{\delta}{\bar{x}} = \frac{7.13}{20.4} = 0.35$

STANDARD DEVIATION

Standard deviation is the square root of the sum of the squares of the deviations
divided by their number. It is also called 'Mean Error Deviation', 'Mean Square Error
Deviation' or 'Root Mean Square Deviation'. It is derived from the second moment of
dispersion. Since the sum of the squares of the deviations is at its minimum when they
are taken from the mean, the deviations are taken only from the mean (not from the
median or mode).

Standard deviation is the root-mean-square average of all the deviations from the
mean. It was proposed by Prof. Karl Pearson in 1893, and it is denoted by σ (sigma).

Note: If the S.D is:

• Zero, all the numbers are identical (there is no dispersion)
• Large, the numbers are widely dispersed
• Small, the series has a high degree of uniformity & homogeneity

The following formulae are used to determine the S.D.

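1. Individual series:

$\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$ ... Absolute measure

2. Discrete & Continuous series:

$\sigma = \sqrt{\frac{\sum f(x - \bar{x})^2}{\sum f}}$ ... Absolute measure

Coefficient of Variation (C.V):

$C.V = \frac{\sigma}{\bar{x}} \times 100$ ... Relative measure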
Worked Examples:

Individual series:

Ten students of B.A. class have obtained the following marks in Economics major
out of 100. Calculate standard deviation of marks obtained.

7, 8, 10, 13, 14, 19, 20, 25, 26, 28

Solution:

x      (x − x̄)   (x − x̄)²
7      -10       100
8      -9        81
10     -7        49
13     -4        16
14     -3        9
19     2         4
20     3         9
25     8         64
26     9         81
28     11        121
∑x = 170         ∑(x − x̄)² = 534

$\bar{x} = \frac{\sum x}{n} = \frac{170}{10} = 17 \text{ marks}$

$\therefore \sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n}} = \sqrt{\frac{534}{10}} = \sqrt{53.4} = 7.31$

$C.V = \frac{\sigma}{\bar{x}} \times 100 = \frac{7.31}{17} \times 100 = 0.43 \times 100 = 43\%$
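
As a cross-check of this worked example, the individual-series formula can be written out directly in Python. This is a sketch under the document's definition (division by n, not n − 1); the function name `std_dev` is chosen here for illustration.

```python
from math import sqrt


def std_dev(data):
    """Root-mean-square deviation from the mean (divides by n)."""
    x_bar = sum(data) / len(data)
    return sqrt(sum((x - x_bar) ** 2 for x in data) / len(data))


marks = [7, 8, 10, 13, 14, 19, 20, 25, 26, 28]

x_bar = sum(marks) / len(marks)
sigma = std_dev(marks)
cv = sigma / x_bar * 100  # coefficient of variation, in percent

print(x_bar, round(sigma, 2), round(cv))
```

Because the deviations are squared, the sign of each deviation does not affect the result.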

The following are marks obtained by two students ‘Miracle’ & ‘Fatumah’ in 10 tests
of 100 marks each.

Tests 1 2 3 4 5 6 7 8 9 10
Miracle 44 80 76 48 52 72 72 51 60 54
Fatumah 48 75 54 60 63 69 72 51 57 66
Required:

Find who is better in studies and if consistency is the criterion for awarding a prize,
who should get the prize?

Solution:

Miracle                           Fatumah
x     (x − x̄)   (x − x̄)²          x     (x − x̄)   (x − x̄)²
44    -16.9     285.61            48    -13.5     182.25
80    19.1      364.81            75    13.5      182.25
76    15.1      228.01            54    -7.5      56.25
48    -12.9     166.41            60    -1.5      2.25
52    -8.9      79.21             63    1.5       2.25
72    11.1      123.21            69    7.5       56.25
72    11.1      123.21            72    10.5      110.25
51    -9.9      98.01             51    -10.5     110.25
60    -0.9      0.81              57    -4.5      20.25
54    -6.9      47.61             66    4.5       20.25
∑x = 609   ∑(x − x̄)² = 1516.90    ∑x = 615   ∑(x − x̄)² = 742.50
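Miracle:

$\bar{x} = \frac{\sum x}{n} = \frac{609}{10} = 60.9 \text{ marks}$

$\therefore \sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n}} = \sqrt{\frac{1516.90}{10}} = \sqrt{151.69} = 12.32$

$C.V = \frac{\sigma}{\bar{x}} \times 100 = \frac{12.32}{60.9} \times 100 = 0.202 \times 100 = 20.2\%$

Fatumah:

$\bar{x} = \frac{\sum x}{n} = \frac{615}{10} = 61.5 \text{ marks}$

$\therefore \sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n}} = \sqrt{\frac{742.50}{10}} = \sqrt{74.25} = 8.62$

$C.V = \frac{\sigma}{\bar{x}} \times 100 = \frac{8.62}{61.5} \times 100 = 0.140 \times 100 = 14\%$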

Therefore, Fatumah is better in studies, since her mean mark is higher (61.5 against
60.9), and she should also get the prize for consistency, since her coefficient of
variation is lower (14% against 20.2%).
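
The comparison of the two students can also be sketched with the standard library, where `statistics.pstdev` is the population standard deviation (division by n, as in the formula used here). The helper `coefficient_of_variation` is a hypothetical name.

```python
from statistics import fmean, pstdev


def coefficient_of_variation(data):
    """Population standard deviation as a percentage of the mean."""
    return pstdev(data) / fmean(data) * 100


scores = {
    "Miracle": [44, 80, 76, 48, 52, 72, 72, 51, 60, 54],
    "Fatumah": [48, 75, 54, 60, 63, 69, 72, 51, 57, 66],
}

for name, marks in scores.items():
    print(name, round(fmean(marks), 1), round(pstdev(marks), 2),
          round(coefficient_of_variation(marks), 1))

# The more consistent student is the one with the smaller C.V
most_consistent = min(scores, key=lambda s: coefficient_of_variation(scores[s]))
print(most_consistent)
```

The C.V is used rather than σ alone because it is a relative measure, so series with different means can be compared fairly.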

Exercise:

1. Prices of a particular commodity in five years in two cities are given below:

City A 20 22 19 23 16
City B 10 20 18 12 15


Required; determine which city had more stable prices.

2. The following are the runs scored by two batsmen Miklos & Ojara

Miklos 70 90 80 50 40
Ojara 70 90 60 50 30
Required: find who is a better run getter & more consistent player.

Discrete series:

The following table gives the age distribution of boys & girls in a University.

Age (years) 20 21 22 23 24
Boys 12 15 15 5 3
Girls 13 10 12 2 1
Required: determine which of the two groups is more variable in age.

Solution:
In the tables below, d = x − x̄.

BOYS:

Age (x)   f     fx     d       d²       fd²
20        12    240    -1.44   2.0736   24.8832
21        15    315    -0.44   0.1936   2.904
22        15    330    0.56    0.3136   4.704
23        5     115    1.56    2.4336   12.168
24        3     72     2.56    6.5536   19.6608
∑f = 50   ∑fx = 1072   ∑fd² = 64.32

$\bar{x} = \frac{\sum fx}{\sum f} = \frac{1072}{50} = 21.44$

$\therefore \sigma = \sqrt{\frac{\sum f(x - \bar{x})^2}{\sum f}} = \sqrt{\frac{64.32}{50}} = \sqrt{1.2864} = 1.1342$

$C.V = \frac{\sigma}{\bar{x}} \times 100 = \frac{1.13}{21.44} \times 100 = 0.053 \times 100 = 5.3\%$

GIRLS:

Age (x)   f     fx     d      d²     fd²
20        13    260    -1.2   1.44   18.72
21        10    210    -0.2   0.04   0.4
22        12    264    0.8    0.64   7.68
23        2     46     1.8    3.24   6.48
24        1     24     2.8    7.84   7.84
∑f = 38   ∑fx = 804    ∑fd² = 41.12

$\bar{x} = \frac{\sum fx}{\sum f} = \frac{804}{38} = 21.2$

$\therefore \sigma = \sqrt{\frac{\sum f(x - \bar{x})^2}{\sum f}} = \sqrt{\frac{41.12}{38}} = \sqrt{1.0821} = 1.0402$

$C.V = \frac{\sigma}{\bar{x}} \times 100 = \frac{1.04}{21.2} \times 100 = 0.049 \times 100 = 4.9\%$

Thus, the age of the boys is more variable, since its coefficient of variation is higher (5.3% against 4.9%).
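
The discrete-series formula weights each squared deviation by its frequency. A minimal Python sketch follows; `freq_std_dev` is a name introduced for illustration, and the means are kept unrounded, so the girls' figures may differ from the hand calculation in the last decimal place.

```python
from math import sqrt


def freq_std_dev(values, freqs):
    """Return (mean, standard deviation) of a frequency distribution."""
    n = sum(freqs)
    x_bar = sum(f * x for x, f in zip(values, freqs)) / n
    variance = sum(f * (x - x_bar) ** 2 for x, f in zip(values, freqs)) / n
    return x_bar, sqrt(variance)


ages = [20, 21, 22, 23, 24]
groups = {"Boys": [12, 15, 15, 5, 3], "Girls": [13, 10, 12, 2, 1]}

cv = {}
for group, freqs in groups.items():
    x_bar, sigma = freq_std_dev(ages, freqs)
    cv[group] = sigma / x_bar * 100  # coefficient of variation in percent
    print(group, round(x_bar, 2), round(sigma, 2), round(cv[group], 1))
```

The group with the larger C.V is the more variable one; here that is the boys.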

Exercise:

Compute standard deviation from the following data:

X 48 49 50 51 52 53 54 55 56 57
Freq. 3 7 11 14 18 17 13 8 5 4

Continuous series:

A factory produces two types of tyres. In an experiment in the working life of these
tyres the following results are obtained.

Length of life (000) hrs 15-17 17-19 19-21 21-23 23-25
Type A 50 100 260 100 80
Type B 40 300 120 80 60
Required: determine which type of tyre is more stable.

Solution:
In the table below, x is the class mid-point and d = x − x̄.

                 TYPE A                                  TYPE B
C.I (hrs)   x    f     fx      d      d²      fd²        f     fx      d      d²      fd²
15-17       16   50    800     -4.2   17.64   882        40    640     -3.4   11.56   462.4
17-19       18   100   1800    -2.2   4.84    484        300   5400    -1.4   1.96    588
19-21       20   260   5200    -0.2   0.04    10.4       120   2400    0.6    0.36    43.2
21-23       22   100   2200    1.8    3.24    324        80    1760    2.6    6.76    540.8
23-25       24   80    1920    3.8    14.44   1155.2     60    1440    4.6    21.16   1269.6
Total            ∑f = 590   ∑fx = 11920   ∑fd² = 2855.6  ∑f = 600   ∑fx = 11640   ∑fd² = 2904
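TYPE A:

$\bar{x} = \frac{\sum fx}{\sum f} = \frac{11920}{590} = 20.2$

$\therefore \sigma = \sqrt{\frac{\sum f(x - \bar{x})^2}{\sum f}} = \sqrt{\frac{2855.6}{590}} = \sqrt{4.84} = 2.2$

$C.V = \frac{\sigma}{\bar{x}} \times 100 = \frac{2.2}{20.2} \times 100 = 0.109 \times 100 = 10.9\%$

TYPE B:

$\bar{x} = \frac{\sum fx}{\sum f} = \frac{11640}{600} = 19.4$

$\therefore \sigma = \sqrt{\frac{\sum f(x - \bar{x})^2}{\sum f}} = \sqrt{\frac{2904}{600}} = \sqrt{4.84} = 2.2$

$C.V = \frac{\sigma}{\bar{x}} \times 100 = \frac{2.2}{19.4} \times 100 = 0.113 \times 100 = 11.3\%$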

Thus, type A tyres are more stable as their variation is less than that of Type B.
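
For a continuous series the only extra step is replacing each class by its mid-point. The sketch below assumes the Type A frequencies used in the solution table (50, 100, 260, 100, 80); the function name `grouped_cv` is hypothetical.

```python
from math import sqrt

# Class limits in thousands of hours; each class is represented by its mid-point
classes = [(15, 17), (17, 19), (19, 21), (21, 23), (23, 25)]
midpoints = [(low + high) / 2 for low, high in classes]


def grouped_cv(mids, freqs):
    """Return (mean, standard deviation, C.V in percent) for grouped data."""
    n = sum(freqs)
    x_bar = sum(f * x for x, f in zip(mids, freqs)) / n
    sigma = sqrt(sum(f * (x - x_bar) ** 2 for x, f in zip(mids, freqs)) / n)
    return x_bar, sigma, sigma / x_bar * 100


type_a = [50, 100, 260, 100, 80]  # frequencies as in the solution table
type_b = [40, 300, 120, 80, 60]

mean_a, sigma_a, cv_a = grouped_cv(midpoints, type_a)
mean_b, sigma_b, cv_b = grouped_cv(midpoints, type_b)

print(round(mean_a, 1), round(sigma_a, 1), round(cv_a, 1))
print(round(mean_b, 1), round(sigma_b, 1), round(cv_b, 1))
```

Both types have practically the same σ, so it is the difference in means that makes the C.V of Type A slightly lower.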

Characteristics of Standard Deviation

Standard Deviation and Coefficient of Variation possess all those properties, which
a good measure of dispersion should possess. The process of squaring the deviations
eliminates negative signs, and thus makes the mathematical manipulation of figures
easy.

Merits of Standard Deviation:

The following are the merits of standard deviation:

• It is based on all the observations given.
• It can be smoothly handled algebraically.
• It is a well-defined and definite measure of dispersion.
• It is of great importance when comparing the variability of two series.

Demerits of Standard Deviation:

In spite of its merits, the standard deviation suffers from the following
limitations:

• It is difficult to calculate and understand.
• It gives more weight to extreme values, as the deviations are squared.
• It is not useful in economic studies.

Exercise:

1. An agent obtained samples of bulbs from the two companies. He had them
tested for durability and got the following results:

Durability in ‘00’ hrs Company A Company B


17-19 100 30
19-21 160 420
21-23 260 120
23-25 80 30
Required: which company bulbs are more uniform?

2. Find which of the two classes is more consistent in scoring marks from the
following table:

Marks 20-30 30-40 40-50 50-60 60-70


Class A 7 10 20 18 7
Class B 5 9 21 15 6

3. Two brands of tyres are tested for their life and the following results were
obtained:

Life (months) 20-25 25-30 30-35 35-40 40-45


Tyres X 1 22 64 10 3
Tyres Y 3 21 74 1 1
If consistency is the criterion, which brand of tyres would you prefer?

INDEX NUMBERS
