Basics of Stats

This document provides an overview of business statistics concepts. It introduces statistics and its applications in marketing and managerial decision making such as demand analysis, customer satisfaction analysis, and trend forecasting. It defines key statistical terminology like descriptive statistics, inferential statistics, discrete and continuous variables. It also explains measures of central tendency including the mean, median and mode. Finally, it covers measures of variation or dispersion like the range, variance and standard deviation.

Uploaded by

DISHA SARAF
Copyright
© All Rights Reserved

Business Statistics

Session 1
 Introduction to statistics – What & Why
 Scope of statistics and its applications in
marketing & managerial decision making
 Demand analysis
 Customer satisfaction/preference analysis
 Sales and advertising analysis
 Relationship between variables
 Trend forecasting
Books
 Statistics for Management by Richard I. Levin & David S. Rubin, Pearson, 7th Edition
 Business Statistics by S. P. Gupta
Definition
 Statistics is a standard method for
collecting, organizing, summarizing,
presenting, and analyzing data for drawing
conclusions and making decisions based
upon the analyses of these data.
 Statistics are used extensively by engineers, managers, governments, businessmen, and others throughout the world.
Functions of Statistics
 Presents facts in a definite form
 Simplifies mass of figures
 Facilitates comparison
 Helps in formulating and testing
hypothesis
 Helps in prediction
 Helps in the formulation of suitable
policies.
Is a single statistical figure conclusive?
Descriptive Statistics
 Descriptive Statistics include the techniques
that are used to summarize and describe
numerical data. These methods can either
be graphical or involve computational
analysis.
Inferential Statistics
 Inferential statistics include those techniques by which decisions about a statistical population are made based on an observed sample or, possibly, on managerial judgment. Because such statements are made under conditions of uncertainty, the use of probability concepts is required.
 Discrete Variable
 A discrete variable can take observed values only at isolated points along a scale of values. In business statistics, such data typically arise through the process of counting; hence the values are generally expressed as integers (whole numbers only).
 Continuous Variable
 A continuous variable can assume a value at any
fractional point along a specified interval of
values. Continuous data are generated by the
process of measuring.
Working on data….
 Data Collection
 Primary
 Secondary
 Data classification
 Frequency distribution
 Presentation of data
Populations and Samples
 A population is a complete set of all of
the possible instances of a particular
object
 for example, students in this College.
 A sample is a subset of the population
 for example, any one of the classes.
 We use samples to draw conclusions
about the parent population.
Measures of Central Tendency
 If you have to declare a single value to
represent a population or a sample, what do
you use?
 The most commonly used value is the mean, also called the average or the expected value.
 Another common choice is the mode, the most likely (most frequent) value.
 Another choice is the median, the “middle” of the data set.
Measures of Central Tendency

 Mean
 This is the mathematical average of a set of numbers
 Median
 This is the middle value of a set of data that has been arranged
from lowest to highest
 Mode
 The value that occurs the most in a set of data
 Expenditure is a good way of discussing these three measures. Suppose we want to know the average expenditure of NIFT students, and imagine that we took a random sample of the monthly expenditure of NIFT students.
What is the Mean?
 The mean is the sum of all of the
values in the data set divided by the
number of values.
 The equation for calculating the mean is
the same for both samples and
populations.

$$\bar{x} = \frac{\sum x}{n}$$
Mean
Sample Mean
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

Where:
 $\bar{x}$ (x-bar) is the sample mean
 $x_i$ are the data points
 $n$ is the sample size
Population Mean
$$\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$$

Where:
 $\mu$ is the population mean
 $x_i$ are the data points
 $N$ is the total number of observations in the population
Measures of Central Tendency
(ungrouped)
 The sample gives these values:
 5000, 6000, 30000, 110000, 15000, 6000, 17000, 13000, 12000, 11000, 8000, 6000, 15000, 6000, 11500
 The Mean
 This is the average.
 Sum of values = 271500
 Total N = 15
 Mean = 18100
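The calculation on this slide can be checked with a few lines of Python. This is only a sketch: the variable names are illustrative and the data are the fifteen expenditure values listed above.

```python
from statistics import mean

# Monthly expenditure sample from the slide (15 NIFT students).
expenditure = [5000, 6000, 30000, 110000, 15000, 6000, 17000, 13000,
               12000, 11000, 8000, 6000, 15000, 6000, 11500]

total = sum(expenditure)       # sum of all values
n = len(expenditure)           # number of values
sample_mean = total / n        # mean = sum of values / number of values

# The library function should agree with the hand calculation.
assert sample_mean == mean(expenditure)
print(total, n, sample_mean)
```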
Advantages/Disadvantages of the Arithmetic Mean
 Advantages:
 1) Familiar and intuitively clear to most people
 2) Every data set has one and only one mean
 3) Useful for performing statistical procedures
 Disadvantages:
 1) May be affected by extreme values
 2) Tedious to compute
 3) Difficult to compute for data sets with open-ended classes
Solve it….
 The average weekly wage for a group of 25 persons working in a factory was calculated to be Rs. 378.40. It was later discovered that one figure was misread as Rs. 160 instead of the correct value, Rs. 200. Calculate the correct average wage.
 Mean of two or more groups
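One way to approach the wage exercise is to recover the reported total from the reported mean, swap the misread figure for the correct one, and divide again. The sketch below follows that reasoning; the variable names are illustrative, and floating-point rounding means the result is only approximately Rs. 380.

```python
# Figures taken from the exercise.
n = 25                      # number of workers
reported_mean = 378.40      # mean computed with the misread figure
wrong_value = 160           # figure as it was misread
correct_value = 200         # figure as it should have been

reported_total = n * reported_mean                       # total implied by the reported mean
corrected_total = reported_total - wrong_value + correct_value
corrected_mean = corrected_total / n

print(round(corrected_mean, 2))
```

The correction adds Rs. 40 to the total, so the mean rises by 40 / 25 = Rs. 1.60.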
What is the Median?
 If the data has been sorted (ascending or
descending), the median is the middle value (for
an odd number of points) or the average of the
two middle values (for an even number of points).
 The median is used to characterize data sets with a few extreme values that distort the relevance of the mean, such as house values or family incomes.

$$\text{Median} = \left(\frac{n+1}{2}\right)\text{th item in the data array}$$
Measures of Central Tendency
(ungrouped)
 The sample gives these values:
 5000, 6000, 30000, 110000, 15000, 6000, 17000, 13000, 12000, 11000, 8000, 6000, 15000, 6000, 11500
 The Median
 This is the middle value of the sorted data:
 5000, 6000, 6000, 6000, 6000, 8000, 11000, 11500, 12000, 13000, 15000, 15000, 17000, 30000, 110000
 The median here is 11500
 In cases where there are two middle values, we average the two.
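The sort-then-pick-the-middle rule can be sketched as a small function. The helper name `median_by_position` is made up for this illustration; the standard library's `statistics.median` is used only as a cross-check.

```python
from statistics import median

def median_by_position(values):
    """Return the middle item of the sorted data, averaging the two middle items if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                      # the ((n + 1) / 2)th item, counting from 1
    return (ordered[mid - 1] + ordered[mid]) / 2  # average of the two middle items

expenditure = [5000, 6000, 30000, 110000, 15000, 6000, 17000, 13000,
               12000, 11000, 8000, 6000, 15000, 6000, 11500]

sample_median = median_by_position(expenditure)
assert sample_median == median(expenditure)
print(sample_median)
```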
What is the Mode?
 If the data is discrete, or has been grouped
into discrete intervals, the mode is that value
that occurs the most often.
 In other words, it is the value most likely to occur.
Measures of Central Tendency
(ungrouped)
 The sample gives these values:
 5000, 6000, 30000, 110000, 15000, 6000, 17000, 13000, 12000, 11000, 8000, 6000, 15000, 6000, 11500
 The Mode
 This is the most numerous value:
 5000, 6000, 6000, 6000, 6000, 8000, 11000, 11500, 12000, 13000, 15000, 15000, 17000, 30000, 110000
 The Mode here is 6000.
 Sometimes there is no mode…or even two modes!
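Counting how often each value occurs makes the mode easy to read off. The sketch below uses `collections.Counter` and `statistics.multimode` (available from Python 3.8); the two-mode list at the end is a made-up illustration, not data from the slides.

```python
from collections import Counter
from statistics import multimode

expenditure = [5000, 6000, 30000, 110000, 15000, 6000, 17000, 13000,
               12000, 11000, 8000, 6000, 15000, 6000, 11500]

counts = Counter(expenditure)        # value -> number of occurrences
modes = multimode(expenditure)       # list of the most frequent value(s)
print(modes, counts[modes[0]])

# Hypothetical data with two equally frequent values: both are reported.
two_modes = multimode([1, 1, 2, 2, 3])
print(two_modes)
```

When every value occurs equally often, `multimode` returns all of them, which corresponds to the "no mode" case on the slide.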
Measures of Central Tendency
(ungrouped)
 So given these values…
 5000, 6000, 6000, 6000, 6000, 8000, 11000, 11500, 12000, 13000, 15000, 15000, 17000, 30000, 110000
 …what is the best measure of central tendency for this random sample of NIFT students?
 Mean?...18100
 Median?...11500
 Mode?...6000
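One way to think about this question is to see how much each measure moves when the single extreme value (110000) is left out. Dropping that value is purely an illustration added here, not a step from the slides.

```python
from statistics import mean, median

expenditure = [5000, 6000, 6000, 6000, 6000, 8000, 11000, 11500,
               12000, 13000, 15000, 15000, 17000, 30000, 110000]

# Same sample with the one extreme value removed (illustration only).
without_extreme = [v for v in expenditure if v != 110000]

mean_all, median_all = mean(expenditure), median(expenditure)
mean_trimmed, median_trimmed = mean(without_extreme), median(without_extreme)

print(mean_all, median_all)
print(round(mean_trimmed, 2), median_trimmed)
```

The mean falls by several thousand rupees while the median shifts only slightly, which is why the median is often preferred for data with a few extreme values.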
Measure of central tendency
of grouped data
 Grouped data,
 Class limits
 Class mid point
 Class interval
 Inclusive
 Exclusive
Mean of grouped data

$$\bar{x} = \frac{1}{n}\sum f \cdot x$$

x is the mid point of the group
f is the frequency of the group
n is the total number of observations (the sum of the frequencies)
Computation of Median for grouped Data

$$\text{Median} = L + \frac{N/2 - C.F.}{F} \times h$$

where L is the lower limit of the median class, N is the total frequency, C.F. is the cumulative frequency of the class preceding the median class, F is the frequency of the median class, and h is the class width.
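The grouped-median formula can be sketched as follows. The function name and the frequency table are hypothetical (exclusive classes 0–10, 10–20, …, chosen only for illustration), and equal class widths are assumed.

```python
def grouped_median(lower_limits, frequencies, width):
    """Median = L + (N/2 - C.F.) / F * h, for classes of equal width h."""
    total = sum(frequencies)                 # N, the total frequency
    if total <= 0:
        raise ValueError("total frequency must be positive")
    half = total / 2
    cumulative = 0                           # C.F. of the classes before the current one
    for lower, freq in zip(lower_limits, frequencies):
        if freq > 0 and cumulative + freq >= half:
            # This is the median class: L = lower, F = freq.
            return lower + (half - cumulative) / freq * width
        cumulative += freq
    raise ValueError("median class not found")

# Hypothetical frequency table: classes 0-10, 10-20, 20-30, 30-40, 40-50.
lower_limits = [0, 10, 20, 30, 40]
frequencies = [5, 8, 12, 10, 5]

median_value = grouped_median(lower_limits, frequencies, width=10)
print(round(median_value, 2))
```

Here N = 40, so N/2 = 20 falls in the 20–30 class (C.F. = 13, F = 12), giving 20 + (7 / 12) × 10.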
Computation of Mode for grouped Data
Summary of
Central Tendency Measures

Measure   Equation                                 Description
Mean      Σx / n                                   Balance point
Median    ((n+1)/2)th item in the ordered array    Middle value in the ordered array
Mode      none                                     Most frequent value
Measures of Variation
What Is the Range?
 Range: the distance between the lowest and the highest values in the set.
 For example, the time to drive to Churchgate is 2 hours plus or minus 15 minutes, that is, 105 to 135 minutes. Thus the range is 30 minutes.
Measures of Dispersion or Spread
(ungrouped)

 Range
 The highest value minus the lowest value….
 From our last example, the range would be:
110000 – 5000 = 105000
What is the Variance?
 The Variance of a population is the sum of
the squares of the differences between the
mean and the individual data points divided
by the number of data points.
 The Variance of a sample is the sum of the
squared differences divided by the number of
data points less one.
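The two definitions differ only in the divisor. The sketch below applies both to the NIFT expenditure sample used elsewhere in this deck and cross-checks against `statistics.pvariance` and `statistics.variance`; the variable names are illustrative.

```python
from statistics import pvariance, variance

data = [5000, 6000, 6000, 6000, 6000, 8000, 11000, 11500,
        12000, 13000, 15000, 15000, 17000, 30000, 110000]

n = len(data)
mean_value = sum(data) / n
sum_sq = sum((x - mean_value) ** 2 for x in data)   # sum of squared differences from the mean

population_variance = sum_sq / n          # divide by the number of data points
sample_variance = sum_sq / (n - 1)        # divide by the number of data points less one

# Cross-check against the standard library.
assert abs(population_variance - pvariance(data)) < 1e-3
assert abs(sample_variance - variance(data)) < 1e-3
print(round(population_variance, 2), round(sample_variance, 2))
```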
What is the Standard
Deviation?
 Standard Deviation
 This measures the typical distance of your values from the mean (strictly, the root-mean-square distance rather than the plain average distance).
 The Standard Deviation is the square root of the variance.
Computing Standard Deviation
 Population σ

$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2}$$

 Sample "s"

$$s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$$

The expression under the square root sign is the variance.
It is important that you recognize the difference between these two equations!
Measures of Dispersion or Spread
(ungrouped)
Standard Deviation
 Let’s return to our NIFT random sample…
 5000, 6000, 6000, 6000, 6000, 8000, 11000, 11500, 12000, 13000, 15000, 15000, 17000, 30000, 110000
 Follow the steps below while we calculate the standard deviation as a class on the board
1. Calculate the mean… which is 18100
2. Find the distance that each value has from the mean
3. Square the distance
4. Add up these squared distances and divide by the sample size – 1
5. Then we take the square root of this number
Standard Deviation
X        Mean (x-bar)   X - x-bar   (X - x-bar)² (× 10^4)
5000     18100          -13100      17161
6000     18100          -12100      14641
6000     18100          -12100      14641
6000     18100          -12100      14641
6000     18100          -12100      14641
8000     18100          -10100      10201
11000    18100          -7100       5041
11500    18100          -6600       4356
12000    18100          -6100       3721
13000    18100          -5100       2601
15000    18100          -3100       961
15000    18100          -3100       961
17000    18100          -1100       121
30000    18100          11900       14161
110000   18100          91900       844561
Standard Deviation

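 We sum (x – x-bar)², divide by n – 1, and take the square root. This is the standard deviation.
 Sum of squared deviations = 9,624,100,000; divided by 14, this gives approximately 687,435,714.
What is the square root of this number?
 Approx. 26,219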
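The board calculation can be reproduced step by step. Note that the sum of squared distances has to be divided by n - 1 before the square root is taken; the sketch below does this and compares the result with `statistics.stdev`. Variable names are illustrative.

```python
import math
from statistics import stdev

data = [5000, 6000, 6000, 6000, 6000, 8000, 11000, 11500,
        12000, 13000, 15000, 15000, 17000, 30000, 110000]

n = len(data)
mean_value = sum(data) / n                         # step 1: the mean
distances = [x - mean_value for x in data]         # step 2: distance from the mean
squared = [d ** 2 for d in distances]              # step 3: square each distance
sample_variance = sum(squared) / (n - 1)           # step 4: add up, divide by n - 1
sample_sd = math.sqrt(sample_variance)             # step 5: square root

# Cross-check against the standard library.
assert abs(sample_sd - stdev(data)) < 1e-6
print(round(sample_sd))
```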
Standard Deviation (Grouped data)

 f d   f d 
2
2

S .D    h
f  f 
 
Where f is frequency; d is deviation computed
as

xi  A
di= h
. A is assumed mean
x is mid value of class
SD for Grouped Data
The following data provides the traveling expenses (Rs) of 50 NIFT students.
 Find Mean and SD

Traveling expenses (Rs)   No. of Students
61 – 70                   2
71 – 80                   10
81 – 90                   20
91 – 100                  17
101 – 110                 1
S.D for Grouped Data
With assumed mean A = 85.5 and class width h = 10:

CI          Mid value (x)   Freq. (f)   d = (x - A)/h   f × d   f × d²
61 – 70     65.5            2           -2              -4      8
71 – 80     75.5            10          -1              -10     10
81 – 90     85.5            20          0               0       0
91 – 100    95.5            17          1               17      17
101 – 110   105.5           1           2               2       4
Total                       50                          5       39
S.D for Grouped Data
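$$\bar{x} = A + \frac{\sum_{i=1}^{n} f_i d_i}{\sum_{i=1}^{n} f_i} \times h = 85.5 + \frac{5}{50} \times 10 = 86.5$$

$$S.D. = \sqrt{\frac{\sum_{i=1}^{n} f_i d_i^2}{\sum_{i=1}^{n} f_i} - \left(\frac{\sum_{i=1}^{n} f_i d_i}{\sum_{i=1}^{n} f_i}\right)^2} \times h = \sqrt{\frac{39}{50} - \left(\frac{5}{50}\right)^2} \times 10 \approx 8.77$$

Dividing by n – 1 = 49 instead, as in the sample formula, gives approximately 8.86.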
Coefficient of Variation

 1. Measure of relative dispersion


 2. Always a %
 3. Shows variation relative to mean
 4. Used to compare 2 or more groups

Sample: $$CV = \frac{s}{\bar{x}} \times 100$$

Population: $$CV = \frac{\sigma}{\mu} \times 100$$
Coefficient of Variation
Example
Which technician shows more variability?

a40 b160
a5 b15
Solution


CV   (100)

Technician A Technician B
= 5 (100)
40  =
15 (100)
160
= 12.5%
= 9.4%
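The technician comparison can be written as a short sketch. The function name `coefficient_of_variation` is made up for this illustration; the means and standard deviations are those of the example.

```python
def coefficient_of_variation(std_dev, mean_value):
    """Standard deviation expressed as a percentage of the mean."""
    return std_dev / mean_value * 100

cv_a = coefficient_of_variation(5, 40)     # Technician A
cv_b = coefficient_of_variation(15, 160)   # Technician B

more_variable = "A" if cv_a > cv_b else "B"
print(cv_a, cv_b, more_variable)
```

Technician B has the larger standard deviation in absolute terms, but relative to the mean it is Technician A whose results vary more.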
A Valuable Tool
 The ideas behind the standard deviation go back to Gauss's work on the errors observed in measured star positions; the term "standard deviation" itself was introduced later, by Karl Pearson in the 1890s.
 Today it is used in everything from Quality Control to Measuring Risk in financial investments.
Measures of Central Tendency and Dispersion
(Grouped Data)
 Remember that grouped data is a collection of data that has been placed into categories…
 Thus we need to calculate the mean and standard deviation differently, but the idea is the same.
Comparing the Mean, Median,
and Mode

[Figure: distribution curves with the positions of the mode, median, and mean marked]
