

Sinimole K.R.

“Statistics are the numbers of the


The numbers or symbols that formally

summarize our observations of the
Business statistics
Business Statistics is a science assisting
you to make business decisions under
uncertainties based on some numerical and
measurable scales.
Decision making processes must be based on
data, neither on personal opinion nor on
Statistics in business

The need for statistical information in smooth

functioning of an undertaking increases along
with its size. No business large or small, public
or private, can flourish in these days of large-scale
production and cut-throat competition
without the help of statistics.
Today's good decisions are driven by data. In
all aspects of our lives, and importantly in the
business context, an amazing diversity of data
is available for inspection and analytical
Business managers and professionals are
increasingly required to justify decisions on
the basis of data. They need statistical model-
based decision support systems. Statistical
skills enable them to intelligently collect,
analyze and interpret data relevant to their
Statistical concepts and statistical thinking
enable them to:
Solve problems in a diversity of contexts.
Add substance to decisions.
Reduce guesswork.
In short
In your future life and career, you will need to
be able to make good use of such information
to make sound decisions.
Art and Science of Collecting and Understanding DATA:
• DATA = Recorded Information
  • e.g., Sales, Productivity, Quality, Costs, Return, …
Why? Because you want:
• Best use of imperfect information:
  • e.g., 50,000 customers, 1,600 workers, 386,000 transactions, …
• Good decisions in uncertain conditions:
  • e.g., new product launch: Fail? OK? Make you rich?
• Competitive Edge
  • e.g., for you in the job market!
Activities of Statistics
1. Designing the study:
First step
Plan for data-gathering

2. Exploring the data:

First step (once you have data)
Look at, describe, summarize the data
Are you on the right track?
3. Modeling the data
A framework of assumptions and equations
Parameters represent important aspects of the
Helps with estimation and hypothesis testing

4. Estimating an unknown:
Best “guess” based on data
Wrong - by how much?
Confidence interval - “we’re 95% sure that
the unknown is between …”
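The estimation step can be illustrated with a minimal sketch. It assumes a hypothetical sample of daily sales figures (invented for illustration, not taken from the slides) and a normal approximation; the function name `confidence_interval` is likewise an assumption. It computes the best guess (the sample mean), a measure of how wrong that guess may be (the standard error) and an approximate 95% confidence interval.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def confidence_interval(sample, level=0.95):
    """Approximate confidence interval for the mean (normal approximation)."""
    n = len(sample)
    estimate = mean(sample)                     # best "guess" based on data
    std_error = stdev(sample) / sqrt(n)         # typical size of the error
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for a 95% level
    return estimate - z * std_error, estimate + z * std_error

# Hypothetical daily sales figures, invented for illustration only.
daily_sales = [102, 98, 110, 95, 105, 99, 101, 97, 108, 104]
low, high = confidence_interval(daily_sales)
print(f"We're 95% sure that the unknown mean is between {low:.1f} and {high:.1f}")
```

For a sample this small, an interval based on the t-distribution would be somewhat wider; the normal approximation is used here only to keep the sketch short.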
5. Hypothesis testing:
Data decide between two possibilities
Does “it” really work? [or is “it” just randomly
Is financial statement correct? [or is error
Data Mining
Search for patterns in large data sets
Business data: marketing, finance, production, ...
• Collected for some purpose, often useful for others
• From government or private companies
Makes use of
• Statistics – all the basic activities, and
  • Prediction, classification, clustering
• Computer science – efficient algorithms (instructions) for
  • Collecting, maintaining, organizing, analyzing data
• Optimization – calculations to achieve a goal
  • Maximize or minimize (e.g. sales or costs)
Descriptive Statistics


Inferential Statistics
Descriptive statistics help us to observe more; inferential statistics help us to formulate and test hypotheses.
Descriptive statistics are used to describe the basic features of the data in a study.
Descriptive Statistics
Single Numerical

of a batch

• It refers to a given batch of data, such as an average, rather than to a more general population or class.
• It leads to formulating laws or testing hypotheses.
Descriptive Statistics

Characterize your data

in simple but informative ways
Consider the following statements
1. A sample of ten students selected (one by
one ) from a class of hundred students
includes boys and girls in the order:
2. Six girls and four boys were included in a
sample of ten students from a class of
hundred students.
The first of the above statements provides
information (data) regarding the sex of ten
students chosen in a sample in its raw form.
• After making some computations, this
information is then put in a comprehensive
manner in the second statement.
The data in its original form as in (1) are
called statistics.
• The data in its secondary form as in (2) are
also called statistics.
• We can thus say that we compute statistics
(data as in (2)) from statistics (data in original
form as in (1)) with the help of statistics.
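The step from raw data (1) to summarized data (2) can be sketched in a few lines. The order of selection below is hypothetical: only the totals (six girls, four boys) come from the statements above, so the particular sequence is an assumption made for illustration.

```python
from collections import Counter

# Hypothetical order of selection; only the totals (six girls, four boys)
# come from the example, the sequence itself is an assumption.
raw_sample = ["G", "B", "G", "G", "B", "G", "B", "G", "B", "G"]

# Statistics as in (2), computed from statistics as in (1).
summary = Counter(raw_sample)
print(f"{summary['G']} girls and {summary['B']} boys in a sample of {len(raw_sample)}")
```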
Or we can say that
The science of statistics deals with the collection
of data called statistics, their classification,
presentation of data with the help of tables,
graphs, diagrams, the analysis involving
computations of data from the classified data and
the interpretation of the computed data called
statistics which leads one to the conclusions and
recommendations that can be implemented in
Importance of relevant information.
It is very important for a statistician
to ensure that any inference that he
draws or any recommendations that
he makes should be practical.

Suppose, for instance, you are furnished
with data about the minimum
temperature recorded in Mumbai on the
1st, 2nd, 3rd and 4th of January
last year: 20°C, 19°C, 18°C and 17°C.

You are asked to forecast the minimum
temperature that may be recorded on 1st
May of the same year.
If you study the data carefully, you find that
there is a decreasing trend, the decrease
being 1°C per day.
Hence one may expect that after 120 days from
1st January, i.e. on 1st May, the temperature
would be 20°C − 120 × 1°C = −100°C.

But this is practically impossible. Your estimate
should at least be practically feasible.
• For obtaining such an estimate, one requires
the relevant information.
For instance, in the above case the information
supplied to you is irrelevant. What you need is
the information about the temperature
recorded on 1st May of the last few years.
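The naive extrapolation in this example can be reproduced in a few lines, reading the four recorded temperatures as 20, 19, 18 and 17 °C. The year 2023 is an arbitrary non-leap year chosen here so that 1st May falls exactly 120 days after 1st January; the slides do not name the year.

```python
from datetime import date

# Minimum temperatures (°C) on 1st to 4th January, as given in the example.
temps = [20, 19, 18, 17]

# Average day-to-day change over the four readings: -1 °C per day.
daily_change = (temps[-1] - temps[0]) / (len(temps) - 1)

# Assumed non-leap year, so that 1st May is 120 days after 1st January.
days_ahead = (date(2023, 5, 1) - date(2023, 1, 1)).days

forecast = temps[0] + daily_change * days_ahead
print(days_ahead, forecast)   # 120 -100.0
```

The arithmetic is correct, yet the forecast is absurd, which is the point of the example: a four-day trend says nothing about a date four months later.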
Uses of statistics
Statistical methods can be fruitfully applied to any problem of decision making where
numerical data are available or can be made available.
Therefore, in business, industry and economics, statistical
techniques are applicable to problems like
• maintenance of trends of population,
• production of agricultural and non-agricultural industries,
• prices,
• internal and external trades,
• gross national product,
• taxation laws and rates,
• preparation of budgets,
• computation of consumer price indices from time to time
to revise the wage structures,
• preparation of price policies of new products,
• scheduling of the projects,
• quality control, etc.
Functions of statistics
The science of statistics is used to perform
the following functions:
Statistics helps in developing sound methods
of collecting data so that the data collected
can be used to draw valid inferences
regarding the desired objectives.
It presents the information in numerical form.
It helps in simplifying the complex data by
way of classification / tabulation/ graphical
It can be used to study the relationship
between two or more factors.
The data regarding a characteristic for a
series of past periods can be used to forecast
its value for a future period.
Limitations of statistics.
It is a very powerful science for studying
quantitative data. Qualitative data cannot
be studied with the help of statistics.
It is used to draw conclusions regarding a
group of units rather than single unit.
The sources of data can be classified into
two broad categories viz. primary sources and
secondary sources.
A source of data is primary for one who
creates such a source by collecting the data
through direct investigation and the same
source of data becomes secondary to anyone
else who makes use of it.

Primary data are collected
through personal investigation by asking
questions, whereas secondary data are
the published reports in journals, newspapers,
periodicals, etc.
Primary Data

With primary data, you collect the data yourself using methods
such as interviews and questionnaires.
The key point here is that the data you collect is
unique to you and your research and, until you
publish, no one else has access to it.
There are many methods of collecting primary data
and the main methods include:

focus group interviews
critical incidents
What is 'interpreting data'?
Data means information. So interpreting data just means
working out what the information is telling you.
Information is sometimes shown in tables, charts and
graphs to make the information easier to read.
It is important to read all the different parts of the
chart or graph.
Classification of Data

area-wise, e.g. cities, districts, etc.
on the basis of time.
according to some attributes.
in terms of Magnitudes
Classification is the process of arranging
things in groups or classes according to their
resemblance and affinities.
It serves the following purposes
1. eliminates unnecessary details
2. brings out clearly the points of similarity
3. allows comparisons & the drawing of
The characteristics of a population may be
broadly divided into two categories: variables
(quantitative) & attributes (qualitative).
Attributes – not capable of being described
numerically, e.g. sex, nationality, colour of eyes.
When classified on the basis of an attribute,
groups are differentiated either by the
absence or presence of the attribute or by its
differing qualities.
Variables – numerically described,
e.g. height, weight, turnover, age, exports, etc.
Quantitative data
• Data which assume numerical values are called quantitative data.
Qualitative data
The data representing the class to which a unit
belongs with respect to a qualitative character
(reli…
For example,
Consider the following for a sample of ten
students chosen from a class of hundred students.
• 1. Their ages in completed years are
• 2. The sample included boys and girls in the following
order: B G B B B G B G B.
• The age values in (1) are quantitative in nature,
whereas the sex codes B, G in case (2) represent qualitative
May be
Continuous or
Discrete and Continuous

Discrete data
If the data values are integers, such data
values are called discrete.
• Examples
  • 1, 2, 3, 4, 5, 6.
• Nominal & Ordinal
Continuous Data
If the data values are real numbers (any
numerical value within a certain range), such data
values are called continuous.
• Examples
  • 12.5, 13.6, 14.7, 16.8, 17.1.
• Variables like age, height, weight and rainfall records
always lead to continuous data
• Interval or ratio
Data types

• Qualitative / categorical (discrete): Nominal, Ordinal
• Quantitative / numerical (discrete or continuous): Interval, Ratio

Nominal & Ordinal
Nominal: can be classified into categories or groups, with labels to describe them.
No order is implied in the groups.
E.g. Yes or No, Pass or Fail, metropolitan, urban, semi-urban, etc.
Ordinal is essentially the same as nominal, except that there is now an
order within the groups into which the data are classified.
E.g. ratings of hotels, restaurants, movies, etc.
We can say 5 star is better than 4 star, but not by how much.
Interval & Ratio
Interval: can be measured on a numerical scale. However, the
zero point does not mean the absence of something.
E.g. temperature, time, longitude, latitude, etc.
Ratio: can be measured on a numerical scale, but
here the zero point does mean the absence of what is
being measured.
E.g. height, weight, life, price, sales, revenue, etc.
Organizing the quantitative
First step
Arranging the collected data to prepare an
Array – arranging the values by ascending or
descending order
Arranging Data
Second step
Organize data into similar categories or
classes and then count the number of
observations that fall into each category.
This method produces a frequency
The frequency
A frequency distribution is a table that
organizes data into classes, that is, into
groups of values describing one characteristic
of the data.
A frequency distribution shows the number of
observations from the data set that fall into
each of the classes.
Interpreting data

Tally marks and frequency tables

Tallying is a way of counting that uses groups of five, like this:

As you can see, using groups of five makes it easier to see the total!
•Frequency tables show the totals of the tally marks.
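A tally chart and its frequency table can be sketched as follows. The survey answers are hypothetical, and the ASCII rendering of a group of five as `||||/` (four strokes plus a cross stroke) is a convention chosen here for illustration; the helper name `tally` is an assumption as well.

```python
from collections import Counter

def tally(count):
    """Render a count as tally marks in groups of five, e.g. 7 -> '||||/ ||'."""
    groups, rest = divmod(count, 5)
    marks = ["||||/"] * groups + (["|" * rest] if rest else [])
    return " ".join(marks)

# Hypothetical survey answers, invented for illustration.
answers = ["tea", "coffee", "tea", "tea", "juice",
           "coffee", "tea", "tea", "tea", "tea"]

# Frequency table: item, tally marks, total of the tally marks.
for item, freq in Counter(answers).most_common():
    print(f"{item:<8}{tally(freq):<12}{freq}")
```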
Grouping data
When there are a large number of possible
outcomes, we will usually need to group the
data. To do this, we first need to decide on our
Decide on the type and number of classes for
dividing the data.
• by quantitative or qualitative measure
The range must be divided into equal classes.
The width of the interval from the beginning of
one class to the beginning of the next class
must be the same for every class.
Width of a class interval
width of the class intervals =
(next unit value after largest value in data − smallest value in data) / (total number of class intervals)
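The class-width formula can be checked with a short sketch. The data are hypothetical whole numbers running from 0 to 49, and the parameter `unit` (the step to the "next unit value", here 1) is an assumption introduced for illustration.

```python
def class_width(values, num_classes, unit=1):
    """(next unit value after largest - smallest) / number of class intervals."""
    next_after_largest = max(values) + unit
    return (next_after_largest - min(values)) / num_classes

# Hypothetical whole-number data ranging from 0 to 49, split into 10 classes.
data = [0, 3, 7, 12, 18, 21, 33, 49]
print(class_width(data, num_classes=10))   # 5.0
```

With a width of 5, the classes would be 0-4, 5-9, and so on up to 45-49.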
Steps 2 & 3
Sort the data points into classes and count
the number of points in each class
Illustrate the data in a chart.
Class intervals
Class limits
• The lowest and the highest values that can be included. 20-
Class intervals
• The difference between the upper and lower limits.
Class frequency
• The number of observations corresponding to the
particular class.
Class mid point
• The value lying half-way between the lower & upper limits.
Exclusive & Inclusive
Exclusive: the upper limit of one class is the lower limit of
the next class
• 20-30, 30-40, etc.
Inclusive: the upper limit of one class is included in that
class itself
• 20-29, 30-39, etc.
General rules
The number of classes should preferably be
between 5 & 15.
k = 1 + 3.322 log N (log to base 10)
where k = the approximate number of classes and
N = the total number of observations.
Whenever possible, all classes should be of the
same size.
If that is not possible, form open-end classes.
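The rule for the number of classes, commonly known as Sturges' rule, can be evaluated directly. The sketch below assumes the logarithm is to base 10 and rounds the result to the nearest whole number; the rounding and the function name `approximate_classes` are assumptions made for illustration.

```python
import math

def approximate_classes(n_observations):
    """k = 1 + 3.322 * log10(N), rounded to the nearest whole number."""
    k = 1 + 3.322 * math.log10(n_observations)
    return round(k)

for n in (30, 100, 1000):
    print(n, approximate_classes(n))
```

For these sample sizes the rule stays within the recommended range of 5 to 15 classes.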
Imagine that you are carrying out a survey to
determine the number of CDs bought by your
classmates in the past year. The possible
answers are likely to range from 0 to 50
(about one a week!), so you might draw a tally
chart similar to the one below:
Number of CDs    Frequency
0 - 4            8
5 - 9            5
10 - 14          7
15 - 19          4
20 - 24          1
25 - 29
30 - 34          1
35 - 39
40 - 44
45 - 49          2
more than 49
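The grouped table can be summarized with a short sketch that adds up the frequencies and expresses each class as a share of the total. Classes left blank in the table are treated as zero here, which is an assumption.

```python
# Class frequencies from the CD survey table; classes left blank in the
# table are treated as zero here, which is an assumption.
cd_table = {
    "0-4": 8, "5-9": 5, "10-14": 7, "15-19": 4, "20-24": 1, "25-29": 0,
    "30-34": 1, "35-39": 0, "40-44": 0, "45-49": 2, "more than 49": 0,
}

total = sum(cd_table.values())
print("Classmates surveyed:", total)

# Class, frequency and relative frequency (share of all classmates).
for cls, freq in cd_table.items():
    print(f"{cls:<14}{freq:>3}{freq / total:>8.1%}")
```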
A table is a systematic arrangement of
statistical data in columns and rows.
Parts of a table
Table number
Title of the table
Caption
Body of the table
Head note
Foot note
Caption ….
Column headings. It explains what the column
Stub ….
Row headings
Head notes ….
Brief explanatory statement
Placed below the title enclosed in brackets.
Bar charts
Bar charts are one way of showing the
information from a frequency table
Types of bar Diagrams
Simple bar diagrams
Subdivided bar diagrams
Multiple bar diagrams
Percentage bar diagrams
Deviation bars
Broken bars
Simple bar diagrams
Only one variable.

[Bar chart: Funds flow in Rs. (crores) of ABC Ltd., horizontal axis from 0 to 250]
Sub divided bar diagrams
Distribution of assistance sanctioned, Rs. (crores)

           1999-00   2000-01   2001-02   2002-03   2003-04
Series3    283.4     473.8     402.8     536.6     686.8
Series2    365.3     484.7     668.6     992.8     1067.4
Series1    233.8     301.8     303.2     365.3     416.4
Multiple bar diagrams
[Multiple bar chart: Corporate sector profits (gross profits, profit before tax, profit after tax, retained profits) for 2002-03 and 2003-04]
Percentage bar diagrams

[Percentage bar chart: Sales & Net Profits for bars 1, 2 and 3, with labelled values 14%, 11% and 17%]
Pie charts
Pie charts are circles divided into segments,
where each segment represents a fraction of
the total amount.
Pie charts…
This pie chart shows the 20 bikes sold at the
bike shop.
The segment for Mountain bikes is one half of the chart.
This is because 10 Mountain bikes were sold, which is exactly
half the number of bikes sold in total (20 bikes).
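The link between counts and pie segments can be sketched as follows. Only the Mountain figure (10 of 20 bikes) comes from the example; the other two categories and their counts are hypothetical and chosen so that the total is 20.

```python
# Only the Mountain figure (10 of 20) comes from the example; the other
# two categories and their counts are hypothetical.
bikes_sold = {"Mountain": 10, "Road": 6, "BMX": 4}

total = sum(bikes_sold.values())

# Each segment's angle is its fraction of the total, out of 360 degrees.
angles = {kind: 360 * count / total for kind, count in bikes_sold.items()}

for kind, angle in angles.items():
    print(f"{kind}: {bikes_sold[kind]}/{total} of sales -> {angle:.0f} degrees")
```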
A histogram is a series of rectangles, each
proportional in width to the range of values
within a class and proportional in height to the
number of items falling in the class.
If the classes are of equal width, then the
vertical bars in the histogram are also of equal width.
Histograms are similar to bar charts apart from the
consideration of areas. In a bar chart, all of
the bars are the same width and the only
thing that matters is the height of the bar. In a
histogram, the area is the important thing.
Example of a Histogram
Class          Freq.
15 but < 25    3
25 but < 35    5
35 but < 45    2
[Histogram: frequency (count) on the vertical axis, lower boundaries 15, 25, 35, 45, 55 on the horizontal axis]

The next slide contains a real-world example of
a histogram. It plots the relative frequency of
the heights of some students based on the
data below.

Height (feet)    Frequency (number of pupils)
0-2              0
2-4              1
4-5              4
5-6              8
6-8              2
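Because these height classes have unequal widths, the bar heights have to be scaled so that area, not height, reflects the frequency. The sketch below uses frequency density (frequency divided by class width), a common convention that is assumed here rather than taken from the slide.

```python
# Height classes in feet as (lower, upper, number of pupils), from the table.
classes = [(0, 2, 0), (2, 4, 1), (4, 5, 4), (5, 6, 8), (6, 8, 2)]

# Bar height = frequency / class width, so that bar area = frequency.
densities = [freq / (upper - lower) for lower, upper, freq in classes]

for (lower, upper, freq), height in zip(classes, densities):
    print(f"{lower}-{upper} ft: width {upper - lower}, frequency {freq}, bar height {height}")
```

The 6-8 class holds two pupils but is two feet wide, so its bar is only 1.0 high, lower than a plain bar chart of the frequencies would suggest.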
Real World Example

• This histogram is courtesy of http://www.gcsemaths.fsnet.co.uk/page5.html

Frequency polygons
To construct a frequency polygon
Mark the frequencies on the vertical axis and
the values of the variable on the horizontal axis.
Plot each frequency by drawing a dot above its
class midpoint, and
connect the successive dots with straight lines
to form a polygon.
Add two new classes containing zero observations, one at each end.
This allows the polygon to reach the horizontal axis
at both ends of the distribution.
Class          Freq.
15 but < 25    3
25 but < 35    5
35 but < 45    2
[Frequency polygon: frequency (count) on the vertical axis, horizontal axis marked 0, 10, 20, 30, 40, 50, 60, with a fictitious class at each end]
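The construction steps for a frequency polygon can be sketched with the three classes of this example. The code only computes the points to be joined; drawing them is left to whichever plotting tool is at hand.

```python
# Classes as (lower boundary, upper boundary, frequency) from the example.
classes = [(15, 25, 3), (25, 35, 5), (35, 45, 2)]
width = classes[0][1] - classes[0][0]

# One dot per class, placed above the class midpoint.
points = [((lo + hi) / 2, freq) for lo, hi, freq in classes]

# Fictitious zero-frequency classes at both ends bring the polygon
# down to the horizontal axis.
first_mid, last_mid = points[0][0], points[-1][0]
polygon = [(first_mid - width, 0)] + points + [(last_mid + width, 0)]
print(polygon)   # [(10.0, 0), (20.0, 3), (30.0, 5), (40.0, 2), (50.0, 0)]
```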
Relative frequency
A frequency polygon that uses the relative
frequency of data points in each of the classes
is called a relative frequency polygon.
A graph of a cumulative frequency
distribution is called an ogive.
There are two kinds: a cumulative less-than ogive and
a cumulative greater-than ogive.
Cumulative % Polygon
Class          Cum. %
15 but < 25    0%
25 but < 35    30%
35 but < 45    80%
45 but < 55    100%
[Cumulative % polygon: cumulative % (0% to 100%) on the vertical axis, lower boundaries 15, 25, 35, 45, 55 on the horizontal axis, with a fictitious class]
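The cumulative percentages plotted in an ogive can be derived from the class frequencies of the earlier example (3, 5 and 2 observations). This is a minimal sketch; the variable names are illustrative.

```python
from itertools import accumulate

# Class boundaries and frequencies from the example (15-25, 25-35, 35-45).
boundaries = [15, 25, 35, 45]
frequencies = [3, 5, 2]
total = sum(frequencies)

# Cumulative frequency below each boundary: nothing lies below the first one.
cumulative = [0] + list(accumulate(frequencies))

less_than = [(b, 100 * c / total) for b, c in zip(boundaries, cumulative)]
greater_than = [(b, 100 - pct) for b, pct in less_than]

print(less_than)      # [(15, 0.0), (25, 30.0), (35, 80.0), (45, 100.0)]
print(greater_than)   # [(15, 100.0), (25, 70.0), (35, 20.0), (45, 0.0)]
```

The less-than values match the cumulative % column of the example: 0%, 30%, 80% and 100% at the lower boundaries 15, 25, 35 and 45.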
Errors in Presenting
1. Using ‘Chart Junk’
2. No Relative Basis in Comparing Data
3. Compressing the Vertical Axis
4. No Zero Point on the Vertical Axis
‘Chart Junk’
Minimum Wage
1960: $1.00
1970: $1.60
1980: $3.10
1990: $3.80
[Figure: "Bad Presentation" and "Good Presentation" of the minimum wage ($) for 1960, 1970, 1980 and 1990]
No Relative Basis
[Figure: A’s by Class. The "Bad Presentation" plots frequencies (axis 0 to 300); the "Good Presentation" plots percentages (axis 0% to 30%).]
Compressing Vertical
[Figure: Quarterly Sales ($) for Q1 to Q4. The "Bad Presentation" vertical axis runs from 0 to 200; the "Good Presentation" axis runs from 0 to 50.]
No Zero Point on Vertical Axis
[Figure: Monthly Sales ($). The "Bad Presentation" vertical axis runs from 36 to 45 with no zero point; the "Good Presentation" axis runs from 0 to 60.]
