
Statistics Module 1 (Intro,Statistical Data, Primary and Secondary Data and Sources of Data)[1]

Uploaded by

praveenam137
Copyright
© All Rights Reserved

Contents

Module - 1 Introduction to Statistics


Introduction
Statistical Data

Primary and Secondary Data


Sources of Data
Classification of Data
Frequency Distribution
Diagrammatic Representation
Graphic Representation of Data
Graphs
Advantages of Diagrams and Graphs
Limitations of Diagrams and Graphs
Tabulation
Types of Tables
Construction of One Way and Two Way Tables
Measures of Central Tendency
Mean
Median

Mode
Measures of Dispersion
Range
Mean Deviation
Standard Deviation
Coefficient of Variation
Skewness
Kurtosis
Review Questions
Practical Problems with Answer
Statistics for Management

1 Introduction
TOPIC
Origin and Growth of Statistics
The word 'statistics' is said to have been derived either from the Latin word
'status', the Italian word 'statista' or the German word 'Statistik', each of which
means a political state. Statistics began as a by-product of administrative activity.
Governments maintain records of various types of numerical data on population,
births, deaths, literates, illiterates, employment, unemployment, income, taxes,
imports, exports etc. Statistics was used as a technique to collect periodical data
to ascertain the manpower and material strength for military and fiscal purposes.
The theoretical development of statistics has its origin in the mid-seventeenth
century, and many gamblers and mathematicians of France, Germany and England
are credited with its development. Pascal and P. Fermat, the two great French
mathematicians, made innovative efforts to solve the famous 'Problem of Points', which
was posed by the famous French gambler Chevalier de Méré. Their contribution
became the foundation stone of the Science of Probability. James Bernoulli (1654-1705)
stated the 'Law of Large Numbers', and De Moivre (1667-1754) developed the 'Normal
Curve'. The use of the word 'Statistics' was popularized by Sir John Sinclair in his
work Statistical Account of Scotland (1791-1799). The Modern Theory of Statistics
was gradually developed during the 18th, 19th and 20th centuries by
mathematicians. Laplace (1749-1827) gave the principle of 'Least Squares' and
established the 'Normal Law of Errors'. The famous statisticians Sir Francis Galton
(1822-1911), Karl Pearson (1857-1936) and W.S. Gosset contributed to the study of
Regression Analysis, of Correlation Analysis and the Chi-square test of Goodness of
Fit, and of the t-test respectively. R.A. Fisher, who is called the "Father of
Statistics", developed statistics for use in genetics, biometry, agriculture,
psychology and education. He also contributed to Estimation Theory, Sampling
Distributions, Analysis of Variance (ANOVA) and the design of experiments. Thus
Prof. Ronald A. Fisher is the real exponent in the development of the "Theory of
Statistics".
Meaning of Statistics
'Statistics' is used as a general name for a large group of mathematical tools,
aiming not at absolutely accurate results but at approximate results based on the
theory of probability, used to collect, analyse and interpret numerical facts for
solving specific problems. Facts that are dealt with in statistics must be capable of
numerical expression. Otherwise they do not come within the purview of statistics.
Statistics is also concerned with a group of numerical data as, for example, the
population of a country or the price of the finished goods produced by a concern etc.
The word 'statistics' can be used in either a plural or a singular sense.
When statistics is used in the plural sense,
it is a systematic collection and presentation of numerical facts and figures. These
figures may be with regard to production of food grains, per capita income in a
particular state at different times, population etc., and they are generally published
in trade journals, newspapers etc.
When statistics is used in the singular sense,
it implies statistical methods. Thus, it is a body or technique of methods relating
to the collection, classification, presentation, analysis and interpretation of
information. In this sense, statistics can be defined as:
"Statistics is the science of estimates and probabilities" - Boddington
"Statistics may rightly be called the science of averages" - Bowley
"The science which deals with the collection, analysis and interpretation of
numerical data" - Croxton & Cowden
All definitions clearly point out the four aspects of statistics: (i) collection of data,
(ii) analysis of data, (iii) presentation of data and (iv) interpretation of data.
Definitions (based on Singular Noun)
Prof. A.L. Bowley has given a series of definitions. At one place Bowley says
"Statistics may be called the science of counting". This view is neither complete nor
correct. Statistics is not concerned with counting only; it deals more with estimates.
At another place he says that "Statistics may rightly be called the science of
averages". But calling statistics a science of counting or of averages confines the
scope of statistics. Bowley himself realized this drawback and stated that statistics
cannot be confined to any one sense.
Webster defined statistics as "the classified facts respecting the condition of the
people in a state, especially those facts which can be stated in numbers or in tables
of numbers or in any tabular or classified arrangement". This definition
limits the scope of statistics. It relates statistics only to those facts which are
concerned with the condition of the people in a state. The definition is not
exhaustive because it does not take into account all aspects of human activity.
Seligman defines statistics as the science which deals with the methods of
collecting, classifying, presenting, comparing and interpreting numerical data
collected to throw some light on any sphere of enquiry.
Definitions (based on Plural Noun/Numerical Data)
According to Horace Secrist, "By statistics we mean aggregates of facts affected to
a marked extent by multiplicity of causes, numerically expressed, enumerated or
estimated according to reasonable standards of accuracy, collected in a systematic
manner for a predetermined purpose and placed in relation to each other".

Characteristics of Statistics in the Plural Sense or Numerical Data


The basic features of statistics as quantitative or numerical data run as follows:
1. Aggregate of Facts: Statistics does not refer to a single figure but to a
series of figures, i.e., the totality of facts is called statistics. Example: A single
weight of 50 kg is not statistics, but a series relating to the weights of a group of
persons is called statistics.
2. Affected by Multiplicity of Causes: Statistics are not affected by one factor only;
rather they are affected by a large number of factors. E.g. prices are affected by
conditions of demand, supply, money supply, imports, exports and various other
factors.
3. Numerically expressed: Qualitative expressions like young, old, good, bad etc.
are not statistics. To all statistics a numerical value must be attached, as in the
statement "There are 916 females per 1,000 males". Furthermore, such
numerical expressions are a precise, meaningful and convenient form of
communication.
4. Enumerated or Estimated according to Reasonable Standards of Accuracy:
If the numerical statements are precise and accurate, they can be enumerated.
But if the number of observations is very large, the figures are estimated.
It is obvious that estimated figures cannot be absolutely
accurate and precise. The accuracy, of course, depends on the purpose for which
the statistics are collected. There cannot be a uniform standard of accuracy for all
types of enquiries. E.g., enumeration refers to an exact count, as in "there are ten
students of statistics"; it is a 100% accurate statement. On the other hand, estimation
refers to a rounded-off figure, as when we say that two lakh people participated in
the rally. There can be a few hundred more or less. Thus statistical results are true
only on average.
5. Collected in a Systematic Manner: For accuracy or reliability of data, the
figures should be collected in a systematic manner. If the figures are collected in a
haphazard manner, the reliability of such data will decrease.
6. Collected for a Pre-determined Purpose: The purpose of collecting data must be
decided well in advance. Besides, the objective should be concrete and specific. For
example, if we want to collect data on prices, then we must be clear whether we
have to collect whole-sale or retail prices. If we want data on retail prices, then we
have to see the number of goods required to serve the objective.
7. Placed in Relation to each other: The collection of data is generally done with
the motive to compare. The figures collected should be homogeneous for
Comparison and not heterogeneous. In case of heterogeneity, the figures cannot be
placed in relation to each other.

Objectives of Statistics
The objectives of statistics are as follows:
(i) To present facts in numerical form.
(ii) To simplify, classify and analyse numerical data so that their significance may be
clearly understood.
(iii) To bring out the broad characteristics of a group in which the individual members
exhibit variation in attributes.
(iv) To throw light on general economic and social conditions as a guide to
administrators in deciding administrative policy.
(v) To indicate business trends and tendencies so that it may be possible to plan ahead
without any difficulties.
Uses of Statistics
1. Uses of statistics in economic planning: The main aim of economic planning is
to develop the various sectors of the economy like agriculture, industry, transport,
irrigation etc. rapidly and systematically. Planning is not possible without statistics.
The planning commission requires information relating to the demand for and
supply of various commodities like food, cloth, sugar, iron and steel in order to
prepare a plan. It must also estimate the demand for various products and services
after a certain period, say 5 years. It must possess accurate information relating to
the availability of natural, capital and human resources in the country. It must
prepare a detailed plan for the development of each sector of the economy. These
detailed plans cannot be prepared without adequate statistical data.
2. Uses of statistics to a business: Statistics are of great use to traders and
manufacturers. They help them make maximum profits. A producer estimates
the demand for his product or products. On the basis of such an estimate, he
decides about the quality and quantity of goods to be produced. Similarly, a trader
buys goods on the basis of his estimates of the likely demand for his goods. If the
estimates are correct, the businessman will make the profit he estimated. If he
overestimates the demand, he may find it difficult to sell his goods at a profit and
may even incur a loss. If he underestimates the demand, he loses an
opportunity of making some more profit.
Statistics are very useful to banks, insurance companies, railway companies and
other public utility concerns. Statistics helps in the efficient organisation and
supervision of business concerns. That is why big business concerns maintain
statistical departments.
3. Uses of statistics in the field of research: Statistics is indispensable in research
work. Most of the advancement in knowledge has taken place because of
experiments conducted with the help of statistical methods. For example,
experiments about crop yields with different types of fertilizers and different types
of soils, or about the growth of animals under different diets and environments, are
frequently designed and analyzed with the help of statistical methods. Statistical
methods also affect research in medicine and public health. In fact, there is hardly
any research work today that one can find complete without statistical data and
statistical methods.

Functions of Statistics
The various applications or functions performed by statistics are as under:
1. Simplification of Complex Facts/Classification of Data: This is the process of
splitting up a huge collection of numerical data into certain parts, which helps in the
matters of comparison and interpretation of the various features of the data. This
is done by various improved techniques of statistics.
2. Comparison: After simplifying the data, it can be correlated or compared by
certain mathematical quantities like averages, ratios, coefficients etc. to ascertain
the changes which have taken place and the effect of such changes in the future.
3. Relationship between Facts: Statistical methods are used to investigate the
cause and effect relationship between two or more facts. The relationship between
demand and supply, or between money supply and price level, can be best understood
with the help of statistical methods.
4. Measurement of Effects/Guidance in the formulation of economic policies:
Statistical methods act as a guide to measure the effect of a policy. For example,
the effect of a change in bank rate or a change in income tax etc. can best be
judged by statistical methods.
5. Forecasting: Statistical methods are of great use in predicting the future course of
action of a phenomenon.
Scope of Statistics
The importance of statistics makes it clear that the science of statistics covers all
quantitative analysis concerned with any department of enquiry. Its scope, therefore, is
stretched over all those branches of human knowledge in which a grasp of the
significance of large numbers is looked for. Its methods provide an important
means of measuring numerical changes in complex groups and judging collective
phenomena. Its scope is thus wide, the limiting factor being its applicability to studies
of quantitative aspects alone.
"Sciences without statistics bear no fruits, statistics without sciences have no roots."
There is hardly any field of human knowledge where statistical methods are not
applicable. Thus, the significance of statistics has increased from the "science of kings"
to the "science of universe".
(a) Statistics and the state: The state collects statistics on several problems. These
statistics help in framing suitable policies. All ministries and departments of
government, whether they be finance, transport, defence, railways, food, commerce
etc., depend heavily on factual data for their efficient functioning.
(b) Statistics and Business: With growing size and ever increasing competition,
the problems of business enterprises are becoming complex and they are using
more and more statistics in decision making. A businessman who has to deal in an
atmosphere of uncertainty can no longer adopt the method of trial and error in
making decisions. If he is to be successful in his decision making, he must be able to
deal systematically with the uncertainty itself by careful evaluation and
application of statistical methods concerning the business activities.
(c) Statistics and Economics: Economics is concerned with the production and
distribution of wealth as well as with the complex institutional set-up connected
with the consumption, saving and investment of income. Statistical data and
statistical methods are of immense help in the proper understanding of
economic problems and in the formulation of economic policies.
(d) Statistics and physical sciences: Physical sciences seem to be making
increasing use of statistics, especially in astronomy, chemistry, biology,
engineering, meteorology, geology and certain branches of physics.
(e) Statistics and natural sciences: Statistical techniques have proved to be
extremely useful in the study of all natural sciences like astronomy, biology,
medicine, meteorology, zoology, botany etc.
(f) Statistics and research: Statistics is indispensable in research work. Most of the
advancement in knowledge has taken place because of experiments conducted
with the help of statistical methods.
(g) Statistics and other uses: Statistics is useful to bankers, brokers, insurance
companies, social workers, labour unions, trade associations, chambers of
commerce and to politicians.
Scope of Statistics in Economics and Business
In the field of Economics, Statistics entered rather late. The relationship among
supply of, demand for and price of commodities was established with the help of
statistical analysis by the end of the 17th century. The classical economists laid more
stress upon the deductive method of reasoning, and economic laws were reasoned out in
the abstract. The economic laws were verified from observations and proved by the
inductive method only during the first decade of the 20th century. Quantitative
analysis was introduced along with qualitative analysis in the methods of study of
economics. Thus, the inductive method was introduced in addition to the classical
deductive method in the science of economics, and various problems of economics were
solved by the end of the 20th century. Statistics of production help in adjusting
supply to demand. Statistics of consumption help us to find out the way in which people
of different strata of society spend their income. Statistics are very useful in knowing
the standard of living and taxable capacity of people. Statistical methods help not only
in formulating appropriate economic policies but also in evaluating their effect.
Econometrics, which comprises the application of statistical methods to theoretical
economic models, is widely used in empirical economic research.
In the field of business, decision-making is the most important function of
management. Modern business is complex and vast, involving a number of specialised
activities, government interference and cut-throat competition. A modern manager
cannot solve complex business problems without the assistance of statistical
methods. The entire business planning depends upon forecasting future trends,
which is performed only by the science of statistics. Statistics is helpful in preparing
various budgets, conducting market surveys, fixing the wage structure based on
cost of living indices and controlling business activities. Statistics and statistical
methods have provided the businessman with one of his most valuable tools for
decision-making. The use of statistics in business can be extended to production, sales,
purchase, finance, personnel, accounting, marketing and product research, and quality
control.

Limitations of Statistics
The limitations of statistics are as follows:
1. Statistics does not study qualitative phenomena, e.g. honesty, intelligence
etc.
2. Statistics does not study individual data; statistics deals with aggregates of facts.
3. Statistical laws are not exact laws.
4. Statistics does not reveal the entire information.
5. Statistics is liable to be misused.
6. Statistical conclusions are valid only on an average basis.
7. Statistics may lead to wrong or misleading conclusions if figures are quoted
without context.
8. The greatest limitation is that statistical data can be used properly only by persons
having thorough knowledge of the methods of statistics and proper training.
9. If statistical data are not uniform and homogeneous, study of the problem is not
possible. Homogeneity of data is essential for their proper study.
10. Statistical methods are not the only method for studying a problem. There are
other methods also. A problem can be studied in various ways.
Distrusts of Statistics
There exists today a great prejudice or personal feeling against statistics. Some
people have simply blind faith in it and are ready to swear by it. Others have no faith
at all and think adversely of statistics. The science of statistics is often commented
upon by some as follows:
"An ounce of truth will produce tons of statistics."
"Statistics can prove anything."
"Figures do not lie."
"What statistics reveal is ordinary but what they hide is important."
"History asserts without evidence, while statistics asserts contrary to the evidence."
Following are two interesting statements made by great men: "There are
three kinds of lies: lies, damned lies and statistics." - Mark Twain.
"Statistics can be compared to miniskirts because they are short enough to make a
thing interesting and long enough to cover the subject-matter."
All the above statements bring discredit to the science of statistics. Statistics by
nature is innocent and easily believed. It may be incomplete, inaccurate and
manipulated deliberately by prejudiced persons. The difficulty lies not with the science
of statistics but with the users who misuse it. It is merely a tool in the hands of a
statistician, just as operating tools are in the hands of a surgeon. The derogatory
statements about the science of statistics are due to ignorance, resulting from the
unhappy experience of being misled by figures in the past. One may knowingly mishandle
the statistical tools to serve his own purpose, but statistics is not meant to prove or
disprove anything. It is only a tool liable to be misused. One should not blame the
tools; the operator of the tools should be blamed. With the study of statistics as a
science, with the recognition of its limitations and with improvements in its
techniques, the cause for its distrust is gradually waning.
Important Terminologies in Statistics
Data
It is the collection of observations expressed in numerical figures. This collection
may be done in 2 ways: (i) complete enumeration (ii) sample survey.
Raw Data
Raw data are collected directly from the statistical units that are the object of
study. When people are the subject of an investigation, the collection may take the
form of a survey, an observation or an experiment.
Raw data is a set of information that was delivered from a certain data entity to
the data provider and has not yet been processed by either machine or human. This
information is gathered from online sources to deliver deep insight into users' online
behavior.

Primary Data
Primary data is data that has not been previously published, i.e., the data is
derived from a new or original research study and collected at the source. E.g., in
marketing, it is information that is obtained directly from first-hand sources by means
of surveys, observation or experimentation. It is also called first-hand information.
Secondary Data
Secondary data are second-hand information which has already been
collected and analysed by someone else for some purpose. Secondary data are not
pure in character. For example, the Economic Survey of India is secondary data because
its figures are collected by more than one organization, like the Bureau of Statistics,
the Board of Revenue, the banks etc.

Population
In statistics the word population denotes the totality of the set of objects under
review. For Example
(i) All workers in a plant.
(ii) All items produced by a machine on a particular day.
(iii) All shares traded in a particular stock exchange in the course of a given month etc.
Census
A census is a study that obtains data from each and every unit of a population. It is
also called complete enumeration. In most studies, a census is not practical
because of the cost and/or time required.
Survey
Survey is defined as the act of examining a process or questioning a selected
sample of individuals to obtain data about a service, product, or process. Data
collection surveys collect information from a targeted group of people about their
opinions, behavior, or knowledge. Common types of surveys are written questionnaires,
face-to-face or telephone interviews, focus groups, and electronic surveys.
Sample Survey
A sample survey is a survey which is carried out using a sampling method, i.e., in
which only a portion, and not the whole population, is surveyed.
Sampling
Under this technique some representative units or informants are selected from
the universe. These selected units are called samples. Based upon the data collected
from these samples, conclusions are drawn about the whole universe.
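The idea of drawing conclusions about the whole universe from a few selected units can be sketched in Python. This is only an illustration under stated assumptions: the universe of 1,000 daily wages is generated for the example, and `draw_sample` is a hypothetical helper that uses simple random sampling, which is just one of several sampling methods.

```python
import random
import statistics

# Hypothetical universe: daily wages (Rs.) of 1,000 workers,
# generated here purely for illustration.
rng = random.Random(42)
universe = [rng.randint(200, 800) for _ in range(1000)]


def draw_sample(units, size, seed=0):
    """Select `size` units from the universe without replacement."""
    return random.Random(seed).sample(units, size)


sample = draw_sample(universe, 50)

# A census uses every unit; a sample survey uses only the selected units.
census_mean = statistics.mean(universe)
sample_mean = statistics.mean(sample)

print(f"mean from census of {len(universe)} units: {census_mean:.1f}")
print(f"mean from sample of {len(sample)} units:  {sample_mean:.1f}")
```

The sample mean is only an estimate of the census figure; a different seed or sample size gives a slightly different value.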
Parameter
A parameter is a useful component of statistical analysis. It refers to the
characteristics that are used to define a given population. It is used to describe a
specific characteristic of the entire population
The most commonly used parameters are the measures of central tendency. These
measures include mean, median, and mode, and they are used to describe how data
behaves in a distribution.
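As a small sketch of these three measures, the snippet below computes them with Python's standard `statistics` module. The list of marks is a hypothetical population made up for the example.

```python
import statistics

# Hypothetical population: marks of five students.
marks = [40, 50, 50, 70, 90]

mean_marks = statistics.mean(marks)      # arithmetic average of all values
median_marks = statistics.median(marks)  # middle value of the ordered data
mode_marks = statistics.mode(marks)      # most frequently occurring value

print("mean:", mean_marks)
print("median:", median_marks)
print("mode:", mode_marks)
```

Here the mean (60) is pulled above the median and mode (both 50) by the single high mark of 90, which shows why the three measures can describe the same data differently.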
Statistical Units
Statistical unit is the basis of collection of statistics in a statistical inquiry. These
are the units in terms of which data are collected. For Example
(i) For production of sugar 'tonnes' is used as statistical unit.
(ii) For weight of persons 'kilograms' (kg) is the statistical unit etc.
Variable
A variable is an attribute that describes a person, place, thing, or idea. The value
of the variable can "vary" from one entity to another.
The term variable is derived from the word 'vary', which means to differ or change.
Hence, variable means a characteristic which varies or differs or changes from person
to person, time to time, place to place etc. Or,
a variable refers to a quantity or attribute whose value varies from one
investigation to another.
For example:
1. "Price" is a variable as prices of different commodities are different.
2. "Age" is a variable as the age of different students varies.
3. Some more examples are Height, Weight, Wages, Expenditure, Imports,
Production, etc.

Attribute
Non-measurable characteristics reflecting quality aspects, for example religion,
occupation etc., are known as attributes.

Frequency
A series of data that is formed along with the frequencies of their occurrences is
called a frequency series. A series is, again, of three types, viz. 1. Individual
series, 2. Discrete series, and 3. Continuous series.
Seriation
Seriation is the process of finding an arrangement of all objects in a set, in a
linear order, given a loss function. The main goal is exploratory, to reveal structural
information.

Individual Series
An individual series is one in which each value of the variable occurs only
once. In other words, the frequency of occurrence of all the values in such a series is
only one. As such, these series are essentially displayed without the frequency column.
The following are examples of individual series.

Example 1                     Example 2     Example 3
Marks    No. of Students      Marks         Roll No.    Marks
50       1                    90            1           80
60       1                    80            2           90
70       1                    70            3           70
80       1                    60            4           50
90       1                    50            5           60

An individual series may be arranged either in ascending, or in descending, or in
any other order as it would suit the desired analysis. In example 1 above, the
series has been arranged in ascending order and in example 2 in descending order,
while in example 3 the series has been arranged in order of the roll numbers of the
students.
Discrete Series
A discrete series is one in which the different values of a variable are shown in a
discontinuous manner along with their respective frequencies and at least one of the
values has a frequency of more than one. Such a series can also be arranged either in
ascending, or in descending order.
The following are examples of discrete series.

Example (in ascending order)            Example (in descending order)
Weekly wages (Rs.)   No. of Workers     Marks    No. of students
70                   15                 90       15
84                   24                 80       17
105                  50                 70       22
140                  12                 60       13
175                  4                  50       [illegible]
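How an individual series turns into a discrete series can be sketched in a few lines of Python. The raw marks below are hypothetical, and `discrete_series` and `is_individual_series` are illustrative helper names, not part of the text.

```python
from collections import Counter

# Hypothetical raw marks of ten students, listed one observation at a time.
raw_marks = [70, 90, 80, 70, 60, 70, 80, 50, 60, 70]


def discrete_series(values, descending=False):
    """Pair each distinct value with its frequency, in sorted order."""
    counts = Counter(values)
    return sorted(counts.items(), reverse=descending)


def is_individual_series(values):
    """True when every value occurs exactly once (all frequencies are 1)."""
    return all(freq == 1 for freq in Counter(values).values())


series = discrete_series(raw_marks)

print("Marks  No. of students")
for value, freq in series:
    print(f"{value:<6} {freq}")
```

Because the mark 70 occurs four times, the data cannot be shown as an individual series; grouping equal values with their frequencies gives the discrete form.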
Continuous Series
A continuous series is one in which the different values of the variables are stated in
a continuous manner along with their respective frequencies. Such series can be
arranged either in ascending, or in descending order. Further, such series can be stated
either in the form of exclusive, or in the form of inclusive class intervals along with
their respective class frequencies. Furthermore, such series can also be presented
either in non-cumulative, or in cumulative form (less than, or more than) along with
their respective frequencies.
Example - 1

Marks (x)    Number of students (f)
0-20         10
20-40        13
40-60        26
60-80        20
80-100       6

Example - 2
Write the following frequency distribution in exclusive form.

x    0-9.99    10-19.99    20-29.99    30-39.99    40-49.99    50-59.99
f    2         5           10          12          6           1
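The two presentations mentioned above, cumulative form and exclusive class intervals, can be sketched in Python using the figures of Examples 1 and 2. The helper names are hypothetical. The conversion assumes the common textbook correction-factor convention (half the gap between one class's upper limit and the next class's lower limit); the source does not show its own worked answer, so treat this as one possible solution rather than the intended one.

```python
from decimal import Decimal
from itertools import accumulate

# Example 1 frequencies for the classes 0-20, 20-40, 40-60, 60-80, 80-100.
freqs = [10, 13, 26, 20, 6]


def less_than_cumulative(frequencies):
    """Running total: number of items below each class's upper limit."""
    return list(accumulate(frequencies))


def more_than_cumulative(frequencies):
    """Number of items at or above each class's lower limit."""
    total = sum(frequencies)
    return [total - below for below in accumulate([0] + frequencies[:-1])]


def to_exclusive(classes):
    """Convert inclusive (lower, upper) classes to exclusive form.

    Assumes equal gaps between classes and applies the correction factor
    (next lower limit - previous upper limit) / 2 to both limits.
    """
    limits = [(Decimal(lo), Decimal(up)) for lo, up in classes]
    correction = (limits[1][0] - limits[0][1]) / 2
    return [(lo - correction, up + correction) for lo, up in limits]


# Example 2 classes, kept as strings so Decimal stores them exactly.
inclusive = [("0", "9.99"), ("10", "19.99"), ("20", "29.99"),
             ("30", "39.99"), ("40", "49.99"), ("50", "59.99")]
exclusive = to_exclusive(inclusive)

print("less than:", less_than_cumulative(freqs))
print("more than:", more_than_cumulative(freqs))
for lo, up in exclusive:
    print(f"{lo} to {up}")
```

`Decimal` is used instead of floats so that limits such as 9.995 stay exact and each class's upper limit coincides with the next class's lower limit.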
Characteristics
Qualities possessed by an individual person, object or item of a population. For
example, the height of a person and so on.

Statistical Data

Introduction
Data in statistical data analysis consists of variable(s). The data may be
univariate or multivariate. Depending upon the number of variables, the researcher
applies different statistical techniques. Statistical data analysis is a procedure of
performing various statistical operations. It is a kind of quantitative research, which
seeks to quantify the data and typically applies some form of statistical analysis.
Quantitative data basically involves descriptive data, such as survey data and
observational data. If the data involve several variables, then several multivariate
analyses can be performed, such as factor analysis, discriminant analysis, etc.
Similarly, if the data involve a single variable, then univariate statistical data
analysis is performed. This includes the t test for significance, z test, F test,
one-way ANOVA, etc.
Meaning of Statistical Data
Statistical data analysis generally involves some form of statistical tools, which a
layman cannot use without having some statistical knowledge. There are various
software packages to perform statistical data analysis. These include Statistical
Analysis System (SAS), Statistical Package for the Social Sciences (SPSS), StatSoft,
etc.

Types of Data
1. Structured data
   (a) Time Series Data
   (b) Cross-Section Data
   (c) Longitudinal Data
2. Unstructured data

1. Structured Data
As the word "structured" suggests, this is data which is highly organized and
neatly formatted. Structured data is organized in tabular format (i.e. rows and columns)
and there is a relationship between different rows and columns. As such, it is easy to
store, process, and access. There are three types of structured data: time series,
cross-section and longitudinal.
Time Series Data
These are data from a unit (or a group of units) observed in several successive
periods. Time series data of a variable have a set of observations on values at
different points of time. They are usually collected at fixed intervals, such as daily,
weekly, monthly, quarterly, annually, etc. Time series econometrics has applications in
macroeconomics, but mainly in financial economics, where it is used for price analysis
of stocks, derivatives, currencies, etc.
Cross-Section Data
Cross section data are collected at the same point of time for several individuals.
These are data from units observed at the same time or in the same time period. The
data may be single observations from a sample survey or from all units in a
population. Examples are opinion polls, income distribution, data on GNP per capita in
all European countries, etc.
Longitudinal
Panel, longitudinal or micropanel data is a type of data that is pooled in nature. The
difference is that we measure repeatedly over the same cross-sectional units, such as
individuals, households, firms, etc. This branch of econometrics is called micro
econometrics.
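The difference between the three kinds of structured data can be sketched with one small table in Python. All figures and country labels below are hypothetical, and the helper names are chosen only for this illustration.

```python
# Hypothetical GNP-per-capita figures keyed by (country, year).
# The full table is panel (longitudinal) data: the same units in every period.
panel = {
    ("A", 2021): 100, ("A", 2022): 104, ("A", 2023): 109,
    ("B", 2021): 80,  ("B", 2022): 83,  ("B", 2023): 85,
    ("C", 2021): 120, ("C", 2022): 118, ("C", 2023): 125,
}


def time_series(data, unit):
    """One unit observed over several successive periods."""
    return {year: value for (u, year), value in sorted(data.items()) if u == unit}


def cross_section(data, year):
    """Several units observed in the same period."""
    return {u: value for (u, y), value in sorted(data.items()) if y == year}


print("time series for A:", time_series(panel, "A"))
print("cross-section for 2022:", cross_section(panel, 2022))
```

Fixing the unit gives a time series, fixing the period gives a cross-section, and keeping both dimensions gives the panel.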
2. Unstructured Data
Unstructured data is data which is not organized in any predefined manner. It
can be textual, numbers, dates, etc. Irregularities and disorganization within
unstructured data make it difficult to handle and understand.

Classification of Data
The data can be classified into two categories:

A. Primary Data
Primary data is data that has not been previously published, i.e., the data is
derived from a new or original research study and collected at the source. E.g., in
marketing, it is information that is obtained directly from first-hand sources by means
of surveys, observation or experimentation. It is also called first-hand information.
Importance of Primary Data
The importance of primary data cannot be neglected. Research can be conducted
without secondary data, but research based only on secondary data is less reliable
and may have biases because secondary data has already been manipulated by human
beings. In statistical surveys it is necessary to get information from primary sources
and work on primary data: for example, the statistical records of female population in a
country cannot be based on newspapers, magazines and other printed sources. Firstly,
such sources are old, and secondly, they contain limited information and can be
misleading and biased.
(i) Validity: Validity is one of the major concerns in research. Validity is the quality
of research that makes it trustworthy and scientific. Validity is the use of
scientific methods in research to make it logical and acceptable. Using primary
data in research can improve the validity of research. First-hand information
obtained from a sample that is representative of the target population will yield
data that will be valid for the entire target population.
Sources of Data

A. Primary Data Collection
1. Personal Investigation
2. Indirect Oral Interviews
3. Mailed Questionnaire Method
4. Interview Schedule Method
5. From Local Agents
6. Through Telephone
7. Google Form

B. Secondary Data Collection
1. Internal: i) Accounting Records ii) Sales Force Reports iii) Miscellaneous Records iv) Internal Experts
2. External: i) Database ii) Associations iii) Government Agencies iv) Syndicate Services v) Directories vi) Other Published Sources

(ii) Authenticity: Authenticity is the genuineness of the research. Authenticity can
be at stake if the researcher introduces personal biases or uses misleading information
in the research. Primary research tools and data can become more authentic if
the methods chosen to analyze and interpret data are valid and reasonably
suitable for the data type. Primary sources are more authentic because the facts
have not been overdone. A primary source can be less authentic if the source hides
information or alters facts due to some personal reasons. There are methods that
can be employed to ensure factual yielding of data from the source.
(iii) Reliability: Reliability is the certainty that the research is true enough to be
trusted. For example, if a research study concludes that junk food consumption
does not increase the risk of cancer and heart diseases, this conclusion should
be drawn from a sample whose size, sampling technique and variability are
not questionable. Reliability improves with the use of primary data. In the
research mentioned above, if the researcher uses the experimental method and
questionnaires, the results will be highly reliable. On the other hand, if he relies on
the data available in books and on the internet, he will collect information that does
not represent the real facts.

Merits of Primary Data


i) Degree of accuracy is quite high.
ii) It does not require extra caution.
iii) It depicts the data in great detail.
iv) Primary source of data collection frequently includes definitions of various terms
and units used.
v) For some investigations, secondary data are not available.
Demerits of Primary Data
i) Collection of data requires a lot of time.
ii) It requires a lot of finance.
iii) In some enquiries it is not possible to collect primary data.
iv) It requires a lot of labor.
v) It requires a lot of skill.
Methods of Collection of Primary Data
Primary data are collected by the following methods:
1. Personal Investigation/enquiry or Direct Personal Observation
The information is collected by direct personal interviews. The data collected in this
way is usually accurate and reliable. The investigator must keep
observing while collecting data. The information thus obtained is first-hand or
original in character. This method is useful when the scope of enquiry is small.
Merits
(i) Response is more encouraging and more people are willing to supply information
when approached personally.
(ii) Accurate and additional information may be collected.
Demerits
(i) It is expensive and time-consuming.
(ii) There is a chance of personal prejudice and bias.
2. Indirect Oral Interviews
This is an indirect method of collecting primary data. Here information is not
collected directly from the source but by interviewing persons closely related with the
problem. This method is applied to apprehend culprits in cases of theft, murder etc.
Information relating to one's personal life, or which the informant hesitates to reveal, is
better collected by this method. Here the investigator prepares a small list of questions
relating to the enquiry. The answers (information) are collected by interviewing
persons well connected with the incident. The investigator should cross-examine the
informants to get correct information.
Merits
(i) This method is time saving and involves relatively less cost.
(ii) Accurate and additional information may be collected.
(iii) Vast area can be covered.
Demerits
(i) The third party may not be willing to co-operate with the investigation.
(ii) The accuracy of the information largely depends upon the integrity of the
investigator.
3. Mailed Questionnaire Method
A questionnaire is a document prepared by the investigator containing a set of
questions. These questions relate to the problem of enquiry directly or indirectly. The
questionnaires are mailed to the informants with a formal request to answer the
questions and send them back within a specified time. For better response the
investigator should bear the postal charges. The questionnaire should carry a note
explaining the aims and objectives of the enquiry and the definitions of various terms and
concepts used there.
Drafting of questionnaire: Success of this method greatly depends upon the
way in which the questionnaire is drafted. So the investigator must be very careful
while framing the questions. The questions should be
(i) Short and clear.
(ii) Few in number.
(iii) Simple and intelligible.
(v) Impersonal, non-aggressive type.
(vi) Simple alternative, multiple-choice or open-end type.
(a) In the simple alternative question type, the respondent has to choose between
alternatives such as 'Yes or No', 'right or wrong' etc.
4. Interview Schedule Method
In case the informants are largely uneducated and non-responsive, data cannot be
collected by the mailed questionnaire method. In such cases, the schedule method is used to
collect data. Here the interview schedules are sent through the enumerators to collect
information. Enumerators are persons appointed by the investigator for the purpose.
They directly meet the informants with the interview schedules. They explain the scope
and objective of the enquiry to the informants and solicit their co-operation. The
enumerators ask the questions to the informants and record their answers in the
interview schedule and compile them. The success of this method depends on the
sincerity and efficiency of the enumerators.
5. From Local Agents
Primary data are collected from local agents or correspondents. These agents are
appointed by the sponsoring authorities. They are well conversant with the local
conditions like language, communication, food habits, traditions etc. Being on the
spot and well acquainted with the nature of the enquiry, they are capable of furnishing
reliable information. The accuracy of the data collected by this method depends on the
honesty and sincerity of the agents. The method is generally used by government
agencies, newspapers, periodicals etc.
6. Through Telephone
The researchers get information through telephone. This method is quick and gives
accurate information.

7. Google Form
It is an online form which has recently come into use for primary data collection. Google
Forms is a survey administration app that is included in the Google Drive office suite
and Google Classroom along with Google Docs, Google Sheets, and Google Slides.
Forms features all of the collaboration and sharing features found in Docs, Sheets, and
Slides.

B. Secondary Data
Secondary data are second-hand information which has already been
collected and analysed by someone else for some purpose. Secondary data are not
pure in character. For example, the Economic Survey of India is secondary data because
its figures are collected by more than one organization, like the Bureau of Statistics, the Board of
Revenue, the banks etc.
Secondary data analysis saves time that would otherwise be spent collecting data
and, particularly in the case of quantitative data, provides larger and higher-quality
databases that would be unfeasible for any individual researcher to collect on their
own. In addition, analysts of social and economic change consider secondary data
essential, since it is impossible to conduct a new survey that can adequately capture
past change and/or developments.
Advantages of Secondary Sources of Data
1. Ease of Access: There are many advantages to using secondary data. This
includes the relative ease of access to many sources of secondary data. In the past,
secondary data accumulation required marketers to visit libraries or wait for
reports to be shipped by mail. Now, with the availability of online access, secondary
data is more openly accessible. This offers convenience and generally standardized
usage methods for all sources of secondary data.
2. Low Cost to Acquire: The use of secondary data has allowed researchers access
to valuable information for little or no cost to acquire. Therefore, this information
is much less expensive than if the researchers had to carry out the data
collection themselves.
3. Clarification of Data Question: The use of secondary data may help the
researcher to clarify the data question. Secondary data is often used prior to
primary data to help clarify the data focus.
4. May Answer Researcher Question: Secondary data collection is often
used to help align the focus of large scale primary data collection. When focusing on
secondary data, the researcher may realize that the exact information they were
looking to uncover is already available through secondary sources. This would
effectively eliminate the need and expense to carry out their own primary data collection.
5. May Show Difficulties in Conducting Primary Research: In many cases, the
originators of secondary data include details of how the information was collected.
This may include information detailing the procedures used in data collection and
difficulties encountered in conducting the primary research. Therefore, the detailed
difficulties may persuade the researcher to decide that the potential information
obtained is not worth the potential difficulties in conducting the research.
Uses of Secondary Data
The secondary data may be used in three ways by a researcher:
i) Some specific information from secondary sources may be used for reference
purposes.
ii) Secondary data may be used as a benchmark against which the findings of a research may
be tested.
iii) Secondary data may be used as the sole source of information for a research
project. Such studies as Securities Market Behavior, Financial Analysis of
Companies, Trends in Credit Allocation in Commercial Banks, sociological
studies on crimes, historical studies, and the like depend primarily on secondary
data. Year books, statistical reports of government departments, reports of public
organisations like the Bureau of Public Enterprises, Census Reports etc. serve as major
data sources for such research studies.

Sources of Secondary Data Collection


There are mainly two sources of secondary data as follows:
1. Internal Sources of Secondary Data
2. External Sources of Secondary Data
1. Internal Sources of Secondary Data
Internal sources can be classified into four broad categories:
i) Accounting Records
ii) Sales Force Reports
iii) Miscellaneous Records
iv) Internal Experts
i) Accounting Records
The basis for accounting records concerned with sales is the sales invoice. The
usual sales invoice has a sizable amount of information on it, which generally includes
name of customer, location of customer, items ordered, quantities ordered, quantities
shipped, rupee extensions, back orders, discounts allowed and date.
In addition, the invoice often contains information on sales territory, sales
representative, and warehouse of shipment. This information, when supplemented by
data on costs and industry and product classification, as well as from sales calls,
provides the basis for a comprehensive analysis of sales by product, customer, industry,
geographic area, sales territory and sales representative, as well as the profitability of
each sales category. Unfortunately, most firms' accounting systems are designed
primarily for tax reasons rather than for decision support.
ii) Sales Force Reports
Sales force reports represent a rich and largely untapped potential source of
marketing information. The word potential is used because evidence indicates that
sales personnel generally do not report valuable marketing information.
Sales personnel often lack the motivation and/or the means to communicate key
information to marketing managers. To obtain the valuable data available from most
sales forces, several elements are necessary: (a) a clear, concise statement, repeated
frequently, of the types of information desired; (b) a systematic, simple process for
reporting the information; (c) financial and other rewards for reporting information;
and (d) concrete examples of the actual use of the data.

iii) Miscellaneous Reports

Miscellaneous reports represent the third internal data source. Previous
marketing research studies, special audits and reports purchased from outside for prior
problems may have relevance for current problems. The more diversified a firm becomes,
the more likely it is to conduct studies that may have relevance to problems in other
areas of the firm.
For example, P&G sells a variety of distinct products to identical or similar target
markets. An analysis of the media habits conducted for one product could be very
useful for a different product that appeals to the same target market. Again, this
requires an efficient marketing information system to ensure that those who need them
can find the relevant reports.

iv) Internal Experts


One of the most overlooked sources of internal secondary data is internal experts.
An internal expert is anyone employed by the firm who has special knowledge. The
following statement by a senior research manager at a major consumer goods firm
describes why his organization developed a research reports library and how they
ensure its use. On the average, each brand is assigned a new brand manager every
two years.
2. External Sources of Secondary Data


Numerous sources external to the firm may have data relevant to the firm's
requirements. Seven general categories of external secondary information are
described in the sections that follow: i) computerized databases, ii) associations, iii)
government agencies, iv) syndicated services, v) directories, vi) other published sources
and vii) external experts.
i) Database
A computerized database is a collection of numeric data and/or information that is
made available in computer-readable form for electronic distribution. There are more than 3,500
databases available from over 550 on-line service enterprises. Those that are available
are useful for bibliographic search, site location, media planning, market planning,
forecasting and for many other purposes of interest to marketing researchers.
ii) Associations
Associations frequently publish or maintain detailed information on industry
sales, operating characteristics, growth patterns, and the like. Furthermore, they may
conduct special studies of factors relevant to their industry. These materials may be
published in the form of annual reports, as part of a regular trade journal, or as special
reports. In some cases, they are available only on request from the association. Most
libraries maintain reference works, such as the Encyclopedia of Associations that list
the various associations and provide a statement of the scope of their activities.

iii) Government Agencies


Federal, state, and local government agencies produce a massive amount of data
that are of relevance to marketers. In this section, the nature of the data produced by
the federal government is briefly described. However, the researcher should not
overlook state and local government data.
There are also a number of specialized analytic and research agencies, numerous
administrative and regulatory agencies and special committees and reports of the
judicial and legislative branches of the government. These sources produce five broad
types of data of interest to marketers. There are data on (1) population, housing and
income; (2) agricultural, industrial and commercial product sales of manufacturers,
wholesalers, retailers and service organizations; (3) financial and other characteristics
of firms; (4) employment and (5) miscellaneous reports.
iv) Syndicate Services
A wide array of data on both consumer and industrial markets is collected and sold
by commercial organizations.
v) Directories
Any sound marketing strategy requires an understanding of existing and
potential competitors and customers. Suppose you were asked to prepare a report on
the forest products industry, to aid your organization in developing a sales and
marketing approach to lumber manufacturers. A number of services and directories
would prove useful. A general industry directory such as Thomas Register of American
Manufacturers is a sound starting place. This sixteen-volume set lists manufacturers'
products and services by product category. It provides the company name, address,
telephone number and an estimate of its asset size. It also contains an extensive
trademark listing and samples of company catalogs.
vi) Other Published Sources
There is a virtually endless array of periodicals, books, dissertations, special
reports, newspapers and the like that contain information relevant to marketing
decisions.
vii) External Experts
External experts are individuals outside your organization whose job provides
them with expertise on your industry or activity. State and government officials
associated with the industry, trade association officials, editors and writers for trade
publications, financial analysts focusing on the industry, government and
university researchers and distributors often have expert knowledge relevant to
marketing problems.
Techniques of Data Collection
Data collection is a methodical process of gathering and analysing
information to find solutions to relevant questions and evaluate the results. The major
techniques of data collection are census, sampling, interview, focus groups etc.
1. Census Method
A study that obtains data from each and every unit of a population is
called a census or complete enumeration. In most studies, a census is not practical
because of the cost and/or time required.
A census is the procedure of systematically acquiring and recording information
about the members of a given population. It is a regularly occurring and official count
of a particular population. The term is used mostly in connection with national
population and housing censuses; other common censuses include agriculture, business
and traffic censuses. In the latter cases the elements of the 'population' are farms,
businesses and so forth, rather than people. When the whole area or population of
persons is contacted, the method is known as the census method.
Merits of Census Method
1. Data is obtained from each and every unit of population.
2. The results obtained are likely to be more representative, accurate and reliable.
3. It can be used in various aspects of survey, e.g. the Indian population census survey.
4. It can be exploited as a basis for various surveys.
Demerits of Census Method
1. The effort, money and time required are extremely large.
2. It is not possible when the population is infinite.
3. Exhaustive and intensive study is impossible.
4. It is expensive and time consuming.
2. Sampling Method
Under this technique some representative units or informants are selected from
the universe. These selected units are called samples. Based upon the data collected
from these samples, conclusions are drawn upon the whole universe.
Principles of Sampling
The theory of sampling is based on two important principles:
(i) Principle of Statistical Regularity: If a sample is taken at random from a population,
it is likely to possess almost the same characteristics as that of the population.
(ii) Principle of Inertia of Large Numbers: Other things being equal, the larger the
size of the sample, the more accurate the results are likely to be, i.e. large numbers are
more stable as compared to small ones.
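The second principle can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population is a hypothetical set of 10,000 values, and the function name `spread_of_sample_means` is illustrative. It draws many random samples of a small and a large size and compares how much the sample means vary.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 values drawn uniformly from 0 to 100.
population = [random.uniform(0, 100) for _ in range(10_000)]
population_mean = statistics.mean(population)


def spread_of_sample_means(sample_size, repetitions=200):
    """Standard deviation of the sample mean across repeated random samples."""
    means = [
        statistics.mean(random.sample(population, sample_size))
        for _ in range(repetitions)
    ]
    return statistics.stdev(means)


spread_small = spread_of_sample_means(10)
spread_large = spread_of_sample_means(1000)

print(f"population mean: {population_mean:.2f}")
print(f"spread of sample means, n=10:   {spread_small:.2f}")
print(f"spread of sample means, n=1000: {spread_large:.2f}")
```

The means of the larger samples cluster much more tightly around the population mean, which is what "large numbers are more stable" refers to.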

Methods of Sampling
The various types of sampling can broadly be divided into two:

(A) Probability Sampling or Random Sampling
(i) Simple random sampling
(ii) Stratified random sampling
(iii) Cluster (area) sampling
(iv) A systematic random sampling
(v) Multi-stage sampling

(B) Non-Probability Sampling
(i) Convenience sampling
(ii) Judgment or purposive sampling
(iii) Quota sampling
(iv) Snowball sampling

(A) Probability Sampling or Random Sampling
(i) Simple Random sampling
Every member of the population has a known and equal chance of being selected.
For example: (a) the lottery method and (b) the use of a table of random numbers.
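A minimal sketch of the lottery method, assuming a hypothetical sampling frame of 50 students numbered 1 to 50. The helper name `lottery_sample` is illustrative; it relies on the standard library's `random.Random.sample`, which draws without replacement so that every unit has the same chance of selection.

```python
import random

# Hypothetical sampling frame: 50 students identified by roll number.
population = list(range(1, 51))


def lottery_sample(units, sample_size, seed=None):
    """Draw a simple random sample without replacement (lottery method)."""
    rng = random.Random(seed)
    return rng.sample(units, sample_size)


sample = lottery_sample(population, 5, seed=42)
print(sorted(sample))
```

Fixing the seed only makes the draw repeatable for checking; in a real enquiry the seed would be left unset.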
(ii) Stratified random sampling
All people in the sampling frame are divided into "strata" (groups or categories).
Within each stratum, a simple random sample or systematic sample is selected.
For example, if we want to ensure that a sample of 5 students from a group of 50 contains
both male and female students in the same proportions as in the full population (i.e. the
group of 50), we first divide that population into male and female. In this case, there
are 22 male students and 28 females. To work out the number of males and females in
the sample: No. of males in sample = (5/50) x 22 = 2.2, No. of females in sample = (5/50) x
28 = 2.8. Therefore we choose 2 males and 3 females in the sample. These would be
selected using simple random or systematic sample methods.
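The proportional allocation in this example can be sketched as follows. The function name `proportional_allocation` and the student labels are hypothetical. Simple rounding happens to add up to the sample size here; with other figures it may not, and an adjustment rule would then be needed.

```python
import random


def proportional_allocation(strata_sizes, sample_size):
    """Share out the sample across strata in proportion to stratum size."""
    total = sum(strata_sizes.values())
    exact = {name: sample_size * size / total
             for name, size in strata_sizes.items()}
    rounded = {name: round(value) for name, value in exact.items()}
    return exact, rounded


strata_sizes = {"male": 22, "female": 28}
exact, allocation = proportional_allocation(strata_sizes, 5)
print(exact)       # exact shares: 2.2 males and 2.8 females
print(allocation)  # rounded shares: 2 males and 3 females

# Within each stratum, draw a simple random sample of the allocated size.
rng = random.Random(1)
students = {
    "male": [f"M{i}" for i in range(1, 23)],
    "female": [f"F{i}" for i in range(1, 29)],
}
sample = [s for name, n in allocation.items()
          for s in rng.sample(students[name], n)]
print(sample)
```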
(iii) Cluster (area) sampling
The population is divided into mutually exclusive groups such as blocks, and the
investigator draws a sample of the groups to interview. For example, a cluster may be a
village, a school or a state. So you decide all the elementary schools in Karnataka State
are clusters. You want 20 schools selected. You can use simple or systematic random
sampling to select the schools and then every school selected becomes a cluster. If your
interest is to interview teachers on the opinion of some new program which has been
introduced, then all the teachers in a cluster must be interviewed.
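The school example can be sketched as below. The frame of 200 schools and the teacher counts are hypothetical; only the figure of 20 selected schools comes from the text.

```python
import random

rng = random.Random(7)

# Hypothetical frame: 200 schools (the clusters), each with 5 to 15 teachers.
schools = {
    f"school_{i}": [f"teacher_{i}_{j}" for j in range(rng.randint(5, 15))]
    for i in range(1, 201)
}

# Step 1: draw a simple random sample of 20 clusters (schools).
selected_schools = rng.sample(sorted(schools), 20)

# Step 2: every teacher in each selected cluster is interviewed.
interviewees = [teacher
                for school in selected_schools
                for teacher in schools[school]]

print(len(selected_schools), "schools,", len(interviewees), "teachers")
```

The random draw applies to schools only; no further sampling takes place inside a selected school.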
(iv) A systematic random sampling
A sample is obtained by selecting one unit on a random basis and choosing
additional elementary units at evenly spaced intervals, i.e., every nth item, until the
desired number of units is obtained. For example, there are 100 students in your class.
You want a sample of 20 from these 100. Divide 100 by 20 and you will get 5. Randomly
select any number between 1 and 5. Suppose the number you have picked is 4; that
will be your starting number. From there you will select every 5th name, i.e., the 4th name,
the 9th name (4 + 5), the 14th name (9 + 5) and so on until you reach the end of the one
hundred names. You will end up with 20 selected students.
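The worked example above can be sketched in a few lines. The function name `systematic_sample` is illustrative, and students are represented simply by their position numbers 1 to 100.

```python
import random


def systematic_sample(units, sample_size, start=None, seed=None):
    """Pick every k-th unit after a random start, where k = N // n."""
    interval = len(units) // sample_size
    if start is None:
        start = random.Random(seed).randint(1, interval)
    # Positions are 1-based, as in the worked example.
    positions = range(start, len(units) + 1, interval)
    return [units[p - 1] for p in positions][:sample_size]


students = list(range(1, 101))  # names numbered 1 to 100
sample = systematic_sample(students, 20, start=4)
print(sample[:3], "...", sample[-1], len(sample))
```

With a start of 4 and an interval of 5 this selects the 4th, 9th, 14th name and so on up to the 99th, giving 20 students. Leaving `start` unset picks the starting number at random between 1 and the interval.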
(v) Multi-stage sampling
As the name implies, this involves drawing several different samples. It does so in
such a way that the cost of final interviewing is minimised. For example, first draw a sample
of areas. Initially large areas are selected, then progressively smaller areas within each large
area are sampled. Eventually you end with a sample of households and use a method of
selecting individuals from these selected households.
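A minimal sketch of the stages described above, assuming a hypothetical frame of districts, villages, households and persons. The numbers drawn at each stage are arbitrary choices for illustration.

```python
import random

rng = random.Random(3)

# Hypothetical frame: districts -> villages -> households -> individuals.
frame = {
    f"district_{d}": {
        f"village_{d}_{v}": {
            f"household_{d}_{v}_{h}": [
                f"person_{d}_{v}_{h}_{p}" for p in range(1, 5)
            ]
            for h in range(1, 11)
        }
        for v in range(1, 6)
    }
    for d in range(1, 9)
}

selected = []
for district in rng.sample(sorted(frame), 2):                # stage 1: large areas
    villages = frame[district]
    for village in rng.sample(sorted(villages), 2):          # stage 2: smaller areas
        households = villages[village]
        for household in rng.sample(sorted(households), 3):  # stage 3: households
            # stage 4: one individual from each selected household
            selected.append(rng.choice(households[household]))

print(len(selected), "individuals selected")
```

Interviewing is confined to 2 districts and 4 villages, which is how the method keeps the cost of final interviewing down.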
(B) Non-Probability Sampling
(i) Convenience sampling
The investigator selects the easiest population members from which to obtain
information.
(ii) Judgment Sample or Purposive Sampling
A purposive sample is one which is selected by the investigator subjectively. The
investigator uses his/her judgement to select population members who are good
prospects for accurate information.
(iii) Quota sampling
The investigator finds and interviews a prescribed number of people in each of
several categories.
(iv) Snowball / Reference Sampling
With this approach, you initially contact a few potential respondents and then ask
them whether they know of anybody with the same characteristics that you are looking
for in your purpose.
For example, if you wanted to interview a sample of vegetarians / cyclists / people
with a particular disability etc., your initial contacts may well have knowledge of other
