Business Statistics
Chapter - 1
Business Statistics - What and Why?
The word Statistics seems to have been derived from the Latin word ‘Status’, the Italian word
‘Statista’ or the German word ‘Statistik’. All these words mean a political state.
It was necessary for the government of a state to collect data about:
Definition of Business Statistics: The focal point of Business Statistics is managerial decision
making. Business Statistics is therefore the body of statistical techniques by which quantitative data
relating to business are collected, organized, analyzed and interpreted for managerial decision making.
Examples of such data are price quotations, index numbers and the share prices on a stock exchange.
Functions of Statistics: The proper function of Statistics is to enlarge our knowledge of complex
phenomena and to lend precision to ideas that would otherwise remain vague and indeterminate. Our
knowledge of such things as:
i. National income
ii. Population
iii. Natural resources etc.
would not have been so definite and precise if there were no reliable statistics pertaining to each of them.
Statistics is able to widen our knowledge because of the following services that it
performs in forecasting business activities:
Importance and Uses: Statistics is very useful in Economics, Finance, Management, Marketing,
Accounting, Banking, Sociology, Social Work, Political Science, Biology, Psychology, Medical Science
and Engineering, and in other fields of world affairs.
Scope of Statistics:
⧫ Statistics and the State: Since ancient times, ruling kings and chiefs have relied heavily on
statistics in framing suitable military and fiscal policies. Most statistics, such as those of
crimes, military strength, population and taxes, are collected by the state.
⧫ Statistics in Business and Management: The use of statistical methods in the solution of
business problems dates almost exclusively to the 20th century. Applications of statistics pervade
virtually every area of activity in business and industry such as production, financial analysis,
distribution analysis, market research.
⧫ Statistics and Economics: Statistical methods help not only in formulating appropriate
economic policies but also in evaluating their effects.
Characteristics of Statistics:
→ In statistics all information must be expressed in quantitative terms (i.e. statistics must be
numerically expressed).
→ Statistics deals with aggregates of facts, not with an individual happening.
→ Statistical data should be collected in a systematic manner with a definite object in mind (for
a predetermined purpose).
→ Statistics can be affected by a multiplicity of causes.
→ Statistics is not an exact science. Conclusions are generally drawn from samples, so exactness
can’t be guaranteed.
→ Statistics must be enumerated or estimated according to a reasonable standard of accuracy.
→ Statistics should be placed in relation to each other so that cause-and-effect relationships can be established.
→ A statistical enquiry should pass through four stages:
i. Collection of Data.
ii. Classification and Tabulation of Data.
iii. Analysis of the Data.
iv. Interpretation of Data.
Limitation of Statistics:
→ Statistics deals only with those objects of enquiry which are capable of quantitative measurement
(i.e. numerical data).
→ Statistics is not suited to the study of qualitative phenomena, which cannot be expressed
numerically. Examples: beauty, honesty, nationality, culture, patriotism. Intelligence, on the other
hand, can be studied through test scores.
→ Statistics does not study individuals.
→ Statistical decisions are true only on average, and the average must be taken from a large
number of observations.
→ Statistical decisions should be made carefully by experts. The use of statistical tools
by untrained persons may lead to false conclusions. Statistics are like clay, from which one can make
a ‘God’ or a ‘Devil’ as one pleases. Misuse of Statistics has in fact created some distrust of
the subject. This is why we often hear comments like ‘Statistics can prove anything in this
universe’ or ‘There are three types of lies’:
i. Lies,
ii. Damned lies, and
iii. Statistics.
Chapter - 2
Depending on the source, we can have Secondary data, Internal data or Primary data. In this
chapter, we shall study the whole procedure of collecting statistical data.
i. Secondary Source.
ii. Internal Sources.
iii. Primary Sources.
i. Library Method
ii. Experimental Method
iii. Observation Method
iv. Questionnaire Method
➔ Statement of the purpose of the data collection: There are two things which must be
carefully considered before starting the work of data collection:
i. Statement of Purpose
ii. Object
Objective of the Statistical Enquiry: By statistical enquiry we mean a statistical
investigation in which relevant information is collected, analyzed and interpreted by the
application of statistical techniques.
i. Planning.
ii. Questionnaire.
iii. Collection of Data.
iv. Editing or Organization of Data.
v. Analysis and Interpretation of the Data.
vi. Reports.
• To supplement, disprove or simply test some theory or hypothesis which is current, or
• To discover a new theory or hypothesis, or
• To know the existing state of affairs, or
• To solve a problem involving the interrelation of several groups of facts.
Primary Data and Secondary Data:
➔ Primary Data (Raw Data): Primary data are collected for a specific purpose directly from the
field of enquiry (by a researcher or enumerator), so these data are original in nature.
Primary data may be published by the authorities who are themselves responsible for their
collection. Trade associations usually collect data from their member concerns, and government
organizations collect data from their subordinate offices; these are considered primary data.
An individual or an organization can also collect primary data from the field of inquiry directly or
by appointing enumerators. Primary data to which no statistical technique has been applied are called
raw data.
➔ Secondary Data: Secondary data are numerical information which was originally
collected as primary data by some agency for a specific purpose and is now compiled from that
source for use in a different connection. In fact, data collected for one purpose and used for another
may be termed secondary data. Such data are primary for the collecting authority but secondary for
those who use them. Care should be taken before using secondary data.
Questionnaire:
➔ In a statistical inquiry, the necessary information is generally collected on a printed sheet in the
form of a questionnaire. Such a sheet contains a set of questions which the investigators are
supposed to ask the informants, noting down the answer against each question. Educated
informants may also answer the questions in writing.
Characteristics of Questionnaire:
➔ Clarity: Each individual question should be as simple as possible so that the informants
find it easy:
✓ To Understand, and
✓ To Answer
✓ Name
✓ Father’s Name
✓ Mother’s Name
✓ Married/Unmarried etc.
What is Average? What are the desirable properties of an average or Characteristics of a good
average?
Ans: An average is a single value representing a group of values. It is desirable that such a value
satisfies the following properties:
It should be easy to understand, otherwise its use is bound to be very limited.
It should be simple to compute.
It should be based on all the observations.
It should be rigidly defined so that it has one and only one interpretation.
It should be capable of further algebraic treatment.
What is Central Tendency? Define Mean, Median and Mode or Define measures of Central
Tendency?
Ans: Central tendency means that the typical values of a variable lie near the central part of the
distribution and other values cluster around these central values. This concentration of the values
in the central part of the distribution is called the central tendency of the data.
i. Mean
ii. Median, &
iii. Mode
i. Mean: The most common and useful measure of central tendency is the mean. There are three
kinds of mean:
a. Arithmetic Mean: The arithmetic mean is the sum of all values of a variable divided by
the total number of values.
b. Geometric Mean: The geometric mean is defined as the Nth root of the product of the N
observations of a given data set. If there are two observations, we take the square root;
if there are three observations, we take the cube root, and so on.
c. Harmonic Mean: The harmonic mean is based on the reciprocals of the numbers
averaged. It is defined as the reciprocal of the arithmetic mean of the reciprocals of
the individual observations.
ii. Median: The median is an important measure of central tendency. It is the middle-most or
central value of an ordered series. The median divides the series into two equal parts in
such a manner that the number of items below it equals the number of items above it. It is
a positional average and is not affected by the presence of extremely small or large values.
iii. Mode: The mode is a measure of central tendency of a variable. The mode of a series of values of a
variable is the value which occurs with the maximum frequency, i.e. it is the most
common value. For some purposes, such as finding the most typical size or category, the
mode is more suitable than the mean or the median as a measure of central
tendency.
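Formulas of Central Tendency:
Arithmetic Mean
Ungrouped Data: $\bar{X} = \dfrac{\sum X}{N}$
Grouped Data: $\bar{X} = \dfrac{\sum fx}{N}$
Geometric Mean
Ungrouped Data: $G.M. = A.L.\left(\dfrac{\sum \log X}{N}\right)$
Grouped Data: $G.M. = A.L.\left(\dfrac{\sum f \log X}{N}\right)$
Harmonic Mean
Ungrouped Data: $H.M. = \dfrac{N}{\sum\left(\frac{1}{x}\right)}$
Grouped Data: $H.M. = \dfrac{N}{\sum\left(\frac{f}{x}\right)}$
Median
Ungrouped Data: Median = value of the $\dfrac{N+1}{2}$-th item of the ordered series
Grouped Data: $\text{Median} = L + \dfrac{\frac{N}{2} - p.c.f}{f} \times i$
Mode
Ungrouped Data: Mode = the value with the highest frequency
Grouped Data: $M_0 = L + \dfrac{\Delta_1}{\Delta_1 + \Delta_2} \times i$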
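The formulas of central tendency can be checked on a small data set. The following Python sketch is only an illustration: the function names and the sample figures are hypothetical, the geometric mean is computed as the antilog of the mean of base-10 logarithms, and in the grouped median L is taken as the lower limit of the median class, p.c.f as the cumulative frequency before that class, f as its frequency and i as the class width.

```python
import math
from collections import Counter


def arithmetic_mean(values):
    # Sum of all values divided by the number of values.
    return sum(values) / len(values)


def geometric_mean(values):
    # Antilog of the mean of the logarithms (all values must be positive).
    return 10 ** (sum(math.log10(v) for v in values) / len(values))


def harmonic_mean(values):
    # Reciprocal of the arithmetic mean of the reciprocals.
    return len(values) / sum(1 / v for v in values)


def median(values):
    # Middle value of the ordered series; for an even count,
    # the average of the two middle values is used.
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def mode(values):
    # Value with the highest frequency (first one found if there is a tie).
    return Counter(values).most_common(1)[0][0]


def grouped_median(lower_limits, freqs, width):
    # Median = L + (N/2 - p.c.f) / f * i for equal-width classes.
    n = sum(freqs)
    cumulative = 0  # cumulative frequency before the current class
    for lower, f in zip(lower_limits, freqs):
        if cumulative + f >= n / 2:
            return lower + (n / 2 - cumulative) / f * width
        cumulative += f


data = [2, 4, 4, 8, 16]  # hypothetical ungrouped observations
print(arithmetic_mean(data), geometric_mean(data), harmonic_mean(data))
print(median(data), mode(data))

# Hypothetical classes 0-10, 10-20, 20-30, 30-40 with frequencies 5, 8, 12, 5.
print(grouped_median([0, 10, 20, 30], [5, 8, 12, 5], 10))
```

For this data the arithmetic mean (6.8) is larger than the geometric mean, which in turn is larger than the harmonic mean, while the median and the mode are both 4.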
The variation or scattering or deviation of the different values of a variable from their average is
known as dispersion.
There are seven measures of dispersion of which four are absolute and three are relative measures of
dispersion.
A. Absolute Measures:
i. Range
ii. Quartile Deviation
iii. Mean Deviation
iv. Standard Deviation
B. Relative Measures:
i. Co-efficient of Quartile Deviation
ii. Co-efficient of mean Deviation
iii. Co-efficient of Variation
i. Range: Range is the simplest absolute measure of dispersion. It is the difference between the
highest and the lowest values of a variable. [ Range = Highest Value – Lowest Value ]
ii. Quartile Deviation: The interquartile range is expressed as the difference between the third and the
first quartiles. The quartile deviation is half of this range and can be written as:
Quartile Deviation (Q.D) = $\dfrac{Q_3 - Q_1}{2}$
iii. Mean Deviation: The arithmetic mean of the absolute deviations of a series from its average is called
the Mean Deviation.
Simple Series: $M.D = \dfrac{\sum |x - \bar{x}|}{n}$
For Frequency Distribution: $M.D = \dfrac{\sum f|x - \bar{x}|}{N}$
iv. Standard Deviation: The standard deviation of a set of values of a variable is the positive square
root of the arithmetic mean of the squares of all the deviations of the values from their
arithmetic mean. [ $\sigma = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n}}$ ]
vi. Co-efficient of mean deviation: It is a relative measure of dispersion and is defined by the
formula: Co-efficient of mean deviation = $\dfrac{\text{Mean deviation from mean}}{\text{Mean}} \times 100$
vii. Co-efficient of Variation: The co-efficient of variation is the most important relative measure of
dispersion. The ratio of the standard deviation to the mean, expressed as a percentage, is called the
co-efficient of variation.
[ Co-efficient of Variation (C.V) = $\dfrac{\text{Standard Deviation}}{\text{Mean}} \times 100$ ]
There are seven measures of dispersion, one of which is the Standard Deviation. The Standard Deviation is
the best measure of dispersion, because:
It is based on all values of the variable.
It is suitable for further algebraic treatment.
It is less affected by sampling fluctuation.
It is used in comparing two or more series.
It is a value which is never negative.
From the above discussion, we can say that the Standard Deviation is the best measure of dispersion.
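The measures of dispersion defined above can be sketched in a few lines of Python. This is an illustration only: the function names and the data are hypothetical, the standard deviation divides by n as in the definition above, and the quartiles are taken from the standard library's `statistics.quantiles` with its default method, which is an assumption because textbooks locate quartiles in slightly different ways.

```python
import math
import statistics


def value_range(values):
    # Range = highest value - lowest value.
    return max(values) - min(values)


def quartile_deviation(values):
    # Q.D = (Q3 - Q1) / 2.
    # Assumption: quartiles follow statistics.quantiles' default method.
    q1, _q2, q3 = statistics.quantiles(values, n=4)
    return (q3 - q1) / 2


def mean_deviation(values):
    # Arithmetic mean of the absolute deviations from the mean.
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)


def standard_deviation(values):
    # Positive square root of the mean of squared deviations from the mean.
    mean = sum(values) / len(values)
    return math.sqrt(sum((x - mean) ** 2 for x in values) / len(values))


def coefficient_of_variation(values):
    # Standard deviation as a percentage of the mean.
    mean = sum(values) / len(values)
    return standard_deviation(values) / mean * 100


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations with mean 5
print("Range:", value_range(data))
print("Quartile deviation:", quartile_deviation(data))
print("Mean deviation:", mean_deviation(data))
print("Standard deviation:", standard_deviation(data))
print("Co-efficient of variation:", coefficient_of_variation(data))
```

For this data the range is 7, the mean deviation is 1.5, the standard deviation is 2 and the co-efficient of variation is 40%. Because the co-efficient of variation is a pure percentage, it can be compared across series measured in different units.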
The following general guidelines help in interpreting the value of the correlation coefficient r:
→ When r = 1, it means there is perfect positive correlation between the variables.
→ When r = 0, it means there is no correlation between the variables.
→ When r = -1, it means there is perfect negative correlation between the variables.
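These three cases can be reproduced numerically. The sketch below assumes that r is the Pearson product-moment correlation coefficient, which the text above does not state explicitly; the function name and the data are hypothetical.

```python
import math


def correlation(xs, ys):
    # Assumed formula (Pearson's r):
    # sum of products of deviations divided by the square root of
    # the product of the two sums of squared deviations.
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


x = [1, 2, 3, 4]
r_positive = correlation(x, [2, 4, 6, 8])   # y rises exactly in line with x
r_negative = correlation(x, [8, 6, 4, 2])   # y falls exactly in line with x
r_none = correlation(x, [1, 3, 3, 1])       # no linear relationship
print(r_positive, r_negative, r_none)
```

The third series shows that r = 0 only rules out a linear relationship: y still depends on x there, but not along a straight line.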
Chapter - 8
Regression Analysis
Objectives of Regression:
i. The primary objective of regression analysis is to estimate the value of a random variable
(the dependent variable) given that the value of an associated variable (the independent
variable) is known.
ii. The second goal of regression analysis is to obtain a measure of the error involved in using
the regression line as a basis for estimation. For this purpose, the standard error of estimate is
calculated. If the line fits the data closely, that is, if there is relatively little scatter of the items
around the regression line, a good estimate can be made of the Y (dependent) variable. But if
there is a great deal of scatter of the items around the fitted regression line, then the line will
not produce accurate estimates of the dependent variable.
iii. Regression analysis may suggest a cause-and-effect relationship, but on its own it cannot tell us that
one variable is the cause and the other the effect. Regression shows only the functional
relationship between two variables.
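The first two objectives can be sketched as follows. This is a minimal illustration under stated assumptions, not a formula given in these notes: the line Y = a + bX is fitted by ordinary least squares, and the standard error of estimate divides the sum of squared residuals by n - 2 (some textbooks divide by n instead). The function names and the data are hypothetical.

```python
import math


def fit_line(xs, ys):
    # Least-squares estimates of intercept a and slope b for Y = a + bX.
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def standard_error_of_estimate(xs, ys, a, b):
    # Scatter of the observed Y values around the fitted line.
    # Assumption: n - 2 in the denominator.
    residual_ss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(residual_ss / (len(xs) - 2))


x = [1, 2, 3, 4, 5]  # independent variable (hypothetical)
y = [2, 4, 5, 4, 5]  # dependent variable (hypothetical)

a, b = fit_line(x, y)
se = standard_error_of_estimate(x, y, a, b)
estimate_at_6 = a + b * 6  # estimated Y when X = 6

print("a =", a, "b =", b)
print("standard error of estimate =", se)
print("estimated Y at X = 6:", estimate_at_6)
```

A small standard error relative to the size of Y indicates little scatter around the line, so estimates such as the one at X = 6 can be trusted more.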
Assumptions Underlying the Regression Analysis Model (Simple Linear Regression Model):
The general assumptions underlying the regression analysis model are:
Index Numbers
Definition: An index number is a number which is used as a device for comparison between the
price, quantity or value of a variable (or a group of related variables) in different situations i.e. at a
certain place or a period of time and that at another place or period of time. The representative
figure of the combined change is also called Index Number.
Generally, index numbers are stated in the form of percentage.
Importance of Index Number: Index numbers are used to feel the pulse of the economy and they
have come to be used as indicators of inflationary or deflationary tendencies. In fact, they are
described as barometers of economic activity, i.e. if one wants to get an idea of what is happening
to an economy, one should look at important indices like the index numbers of industrial production,
agricultural production, business activity, etc.
Index numbers are playing an increasingly significant role in business planning and in the
formulation of executive decision. Many businesses are often reluctant to give out information
concerning sales, profits, and the like. It is possible to present index numbers indicating whether a
firm’s profits or sales have increased or decreased over a period of years without revealing the total
amount of profits or sales.
Price Index Number: When the comparison is in respect of prices, then it is called an index number
of prices or price index number.
Quantity Index Number: When the comparison is in respect of physical quantities, then it is called
an index number of quantity or quantity index number. Similarly, other index numbers can be
defined.
Base Year and Current Year: The period of time with whose values the values of other periods are
compared is called the base year/base period, and the period of time whose values are compared to
the base year values is called the current year/current period.
Example: The statement that the index number of wholesale prices in Dhaka for the year 2007 was
150% compared to 2000 means that there was a net increase of 50% in the wholesale prices of
commodities in the Dhaka market. Similarly, an index of 90% in 2006 compared to 2000 means that there
was a net decrease of 10% in the wholesale prices of commodities in the Dhaka market.
Note: In comparing the prices in a given year with the prices of the base year, the price of the base year is
taken as 100, and the price of the given year for which the index number is required is expressed as a
percentage of the base year price. Thus if, for any item, P1 is the current year price and P0 is the base
year price, the corresponding index number is $\dfrac{P_1}{P_0} \times 100$.
Price Relative: A price relative is the ratio of the price of a certain commodity in the current year to
its price in the base year (generally expressed as a percentage).
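A price relative can be computed directly from the two prices. In the sketch below the function name and the prices are hypothetical; they are chosen only so that the results match index values of 150 and 90 like those in the Dhaka example.

```python
def price_relative(current_price, base_price):
    # Current year price expressed as a percentage of the base year price.
    return current_price * 100 / base_price


base_2000 = 40.0   # hypothetical base year price (P0)
price_2007 = 60.0  # hypothetical current year price (P1)
price_2006 = 36.0  # hypothetical current year price (P1)

index_2007 = price_relative(price_2007, base_2000)
index_2006 = price_relative(price_2006, base_2000)

# The net percentage change is the index minus the base value of 100.
print(index_2007, index_2007 - 100)  # 150.0 50.0
print(index_2006, index_2006 - 100)  # 90.0 -10.0
```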
i. Definition of the purpose for which the index number is being constructed.
ii. Selection of base year.
iii. Selection of commodity for inclusion in the group.
iv. Selection of sources of data.
v. Method of combining the data.
vi. System of weighting (selection of an appropriate weight).
vii. Choice of an average.
viii. Selection of an appropriate formula.
Classifications of Index Number: Broadly index numbers are divided under two heads:
A. Un-weighted indices:
B. Weighted indices:
Theory of Probability
Probability: In general meaning, the word ‘Probability’ is likelihood. If happening of an event is
certain, the probability is said to be unity (1) and where there is absolute impossibility of an event,
the probability is said to be zero (0). But in real life such cases are rare and the probability generally
lies between 0 and 1.
Thus, probabilities are always greater than or equal to zero (i.e. probabilities are never negative) and
less than or equal to one. The scale of probability therefore runs from
zero to one, and in symbolic form it can be stated as follows:
$0 \le P \le 1$
Concept of Probability: The term probability is difficult to define, although everyone has some idea
of its meaning. The dictionary meaning of probability is chance or likelihood, and
this meaning serves the purpose as far as general conversational language is concerned. But such a
vague meaning is inadequate for the purpose of scientific methodology, and we should be able to
ascribe a precise definition to the term for our purpose.
In everyday life, we often come across comments like “He is probably wrong”, “The chance of
rain today is high”, “There is little likelihood of rain tomorrow”, “His chances of success are very
small”, “The chances of winning the game are fifty-fifty” etc. In all these cases the commentator has a
good idea of probability in mind. So the idea of probability has its place in common sense and
finds use in common language. But these are not mathematically precise statements, in the sense that
we cannot form any definite idea about the occurrence or non-occurrence of the events. The statistician
applies these common-sense ideas more carefully and makes statements numerically instead of in
vague terms.
In mathematical terms, probability is a measure of the expectation that an event will or will not
occur. The probability of an event depends upon the total number of alternative cases and the
number of cases favorable to that event. Thus mathematically, probability is defined as the ratio of
the number of cases favorable to the event to the total number of mutually exclusive and equally likely
cases.
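This ratio definition can be illustrated with the throw of a fair six-sided die, where the six faces are the mutually exclusive and equally likely cases. The function name and the events are hypothetical; exact fractions are used so that the results stay on the scale from 0 to 1 without rounding.

```python
from fractions import Fraction


def classical_probability(favorable, sample_space):
    # Number of favorable cases divided by the number of all
    # mutually exclusive, equally likely cases.
    cases = set(sample_space)
    favorable_cases = set(favorable) & cases
    return Fraction(len(favorable_cases), len(cases))


die = range(1, 7)  # faces 1 to 6 of a fair die

p_even = classical_probability([2, 4, 6], die)      # an even number
p_impossible = classical_probability([7], die)      # a face that does not exist
p_certain = classical_probability(range(1, 7), die) # any face at all

print(p_even)        # 1/2
print(p_impossible)  # 0
print(p_certain)     # 1
```

The impossible event has probability 0 and the certain event has probability 1, which are the two ends of the probability scale.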