Statistics For Management

Module-1: Statistics

Learning Objectives:

●● To introduce the types of statistics and their applications in different phases of business
●● To develop an understanding of data representation techniques
●● To understand the MS Excel applications of numerical measures of central tendency and dispersion

Learning Outcomes:
At the end of the course, the learners will be able to –

●● Arrange and describe statistical information using numerical and graphical procedures
●● Use MS Excel to answer business problems based on numerical measures

Croxton and Cowden define statistics as "the science of collection, presentation, analysis, and interpretation of numerical data from logical analysis".
1.1.1 Statistical Thinking and Analysis
Data is a collection of any number of related observations. We can collect the number of telephones installed in a given day by several workers, or the number of telephones installed per day over a period of several days by one worker, and call the results our data. A collection of data is called a data set, and a single observation is called a data point.

Statistics is not restricted to information about the State; it extends to almost every realm of business. Statistics is about scientific methods to gather, organize, summarize and analyze data. More important still is to draw valid conclusions and make effective decisions based on such analysis. To a large degree, company performance depends on the preciseness and accuracy of forecasts. Statistics is an indispensable instrument for manufacturing control and market research. Statistical tools are extensively used in business for time and motion study, consumer behaviour study, investment decisions, credit ratings, performance measurement and compensation, inventory management, accounting, quality control, distribution channel design, etc.

For managers, therefore, understanding statistical concepts and knowing how to use statistical tools is essential. With the increase in company size and in market uncertainty due to increased competition, the need for statistical knowledge and statistical analysis of various business circumstances has greatly increased. Earlier, when businesses were small and without much complexity, a single person, usually the owner or manager of the firm, used to take all decisions regarding the business. Example: A manager used to decide from where the necessary raw materials and other factors of production were to be acquired, how much output would be produced, where it would be sold, etc. This type of decision making was usually based on the experience and expectations of this single individual and as such had no scientific basis.
1.1.2 Limitations and Applications of Statistics
Statistical techniques, because of their flexibility, have become popular and are used in numerous fields. But statistics is not a cure-all technique and has a few limitations. It cannot be applied to all kinds of situations and cannot be made to answer all queries. The major limitations are:

1. Statistics deals only with those problems which can be expressed in quantitative terms and are amenable to mathematical and numerical analysis. It is not suitable for qualitative data such as customer loyalty, employee integrity, emotional bonding, motivation, etc.
2. Statistics deals only with groups of data; no importance is attached to an individual item.
3. Statistical results are only an approximation and not mathematically exact. There is always a possibility of random error.
4. Statistics, if used wrongly, can lead to misleading conclusions, and therefore should be used only after a complete understanding of the process and the conceptual base.
5. Statistical laws are not exact laws and are liable to be misused.
6. The greatest limitation is that statistical data can be used properly only by a professional. Only a person having thorough knowledge of statistical methods and proper training can arrive at sound conclusions.
7. If statistical data are not uniform and homogeneous, then the study of the problem is not possible. Homogeneity of data is essential for a proper study.
8. Statistical methods are not the only methods for studying a problem. There are other methods as well, and a problem can be studied in various ways.

1.1.3 Types of Statistical Methods: Descriptive & Inferential


The study of statistics can be categorized into two main branches: descriptive statistics and inferential statistics.

Descriptive statistics is used to summarize and graph the data for a chosen group. This method helps in understanding a particular collection of observations. Descriptive statistics describes only the data at hand; there is no uncertainty in the summary numbers, since they are computed directly from the individuals or items that were actually measured.

Descriptive statistics gives information that describes the data in some manner. For example, suppose a pet shop sells cats, dogs, birds and fish. If 100 pets are sold, and 35 out of the 100 were dogs, then one description of the data on the pets sold would be that 35% were dogs.

Inferential statistics consists of techniques that allow us to use samples to generalize about the populations from which the samples were drawn. Hence, it is crucial that the sample represents the population accurately. The method of achieving this is called sampling. Since inferential statistics aims at drawing conclusions from a sample and generalizing them to a population, we need to be sure that our sample accurately represents the population. This requirement affects our process. At a broad level, we must do the following:

●● Define the population we are studying.
●● Draw a representative sample from that population.
●● Use analyses that incorporate the sampling error.
1.1.4 Importance and Scope of Statistics

●● Condensation: Statistics compresses a mass of figures into small, meaningful information, for example, average sales, the BSE index, the growth rate, etc. It is impossible to get a precise idea about the profitability of a business from a mere record of income and expenditure transactions. Information such as Return On Investment (ROI), Earnings Per Share (EPS), profit margins, etc., however, can be easily remembered, understood and thus used in decision-making.
●● Forecast: Statistics helps in forecasting by analyzing trends, which are essential for planning and decision-making. Predictions based on gut feeling or hunch can be harmful for the business. For example, to decide the refining capacity for a petrochemical plant, it is required to predict the demand for the petrochemical product mix, the supply of crude oil, the cost of crude, substitution products, etc., for the next 10 to 20 years, before committing an investment.
●● Testing of hypotheses: Hypotheses are statements about population parameters based on past knowledge or information. They must be checked for validity in the light of current information. Inductive inference about the population based on sample estimates involves an element of risk; however, sampling keeps decision-making costs low. Statistics provides a quantitative base for testing our beliefs about the population.
●● Relationship between Facts: Statistical methods are used to investigate the cause and effect relationship between two or more facts. The relationships between demand and supply, or money supply and price level, can be best understood with the help of statistical methods.
●● Expectation: Statistics provides the basic building block for framing suitable policies. For example, how much raw material should be imported, how much capacity should be installed, or how much manpower should be recruited depends upon the expected value of the outcome of our present decisions.

1.1.5 Population and Sample


Sample
A sample consists of one or more observations drawn from the population. The sample is the group of people who actually took part in your research. They are the people who are questioned (for example, in a qualitative study) or who actually complete the survey (for example, in a quantitative study). People who could have been research participants but did not personally participate are not considered part of the sample.

A sample data set contains a part, or a subset, of a population. The size of a sample is always less than the size of the population from which it is taken. [Formulas use the count n - 1.]

Population
A population includes all of the elements from a set of data. The population is the broader group of people to whom you expect to generalize your study results. Your sample is just a part of the population. The size of your sample will depend on your exact population.

A population data set contains all members of a specified group (the entire list of possible data values). [Formulas use the count n.]

Example: The population may be "ALL people living in India".

For example – Mr. Tom wants to do a statistical analysis of students' final examination scores in his math class for the past year. Should he consider his data to be a population data set or a sample data set?

Mr. Tom is working only with the scores from his own class. There is no reason for him to generalize his results to all management students in the school. He has all of the data pertaining to his investigation, so it is a population data set.
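The n versus n - 1 distinction above can be checked directly in software. Below is a minimal Python sketch (the scores are hypothetical, used only for illustration) contrasting the population and sample formulas; in MS Excel the corresponding functions are, as far as I know, VAR.P/STDEV.P and VAR.S/STDEV.S.

import statistics

scores = [72, 85, 61, 90, 78]            # hypothetical exam scores

# Treat the data as a complete population: divide by n
pop_var = statistics.pvariance(scores)
pop_sd = statistics.pstdev(scores)

# Treat the data as a sample from a larger population: divide by n - 1
samp_var = statistics.variance(scores)
samp_sd = statistics.stdev(scores)

print("Population variance:", pop_var, "SD:", round(pop_sd, 2))
print("Sample variance:", samp_var, "SD:", round(samp_sd, 2))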
1.2.1 Importance of Graphical Representation of Data
The data obtained from the field needs to be processed and analyzed. The processing consists mainly of recording, labeling, classifying and tabulating the collected data so that it is consistent with the report. The data may be viewed either in tabular form or via charts. Effective use of the data collected primarily depends on how it is organized, presented, and summarized.

One of the most convincing and appealing ways in which statistical results may be represented is through graphs and diagrams. Diagrams and graphs are extensively used for the following reasons:

●● Diagrams and graphs attract the eye.
●● They have a stronger memorizing effect.
●● They facilitate easy comparison of data from one period to another.
●● Diagrams and graphs give a bird's eye view of the entire data and therefore convey meaning very quickly.

1.2.2 Bar Chart


In a bar diagram, only the length of the bar is taken into account and not the width. In other words, a bar is a thick line whose width is shown merely for presentation; since only the length of the bar carries meaning, it is called a one-dimensional diagram.

Simple Bar Diagram

It represents only one variable. Since the bars are of the same width and vary only in length (height), it becomes very easy to make a comparative study. Simple bar diagrams are very popular in practice. A bar chart can be either vertical or horizontal; for example, sales, production or population figures for various years may be shown by simple bar charts.

Illustration - 1

The following table gives the birth rate per thousand of different countries over a certain period of time.

Country    India  Germany  U.K.  New Zealand  Sweden  China
Birth rate   33      16     20       30         15     40

[Simple bar chart of the birth rates by country]

Comparing the sizes of the bars, China's birth rate is the highest and India's is next, whereas Germany and Sweden are almost equal at the lowest positions.
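A simple bar chart like the one described above can be drawn in a few lines of Python. This is a minimal sketch assuming the matplotlib library is available; it reproduces the birth-rate illustration.

import matplotlib.pyplot as plt

countries = ["India", "Germany", "U.K.", "New Zealand", "Sweden", "China"]
birth_rate = [33, 16, 20, 30, 15, 40]    # births per thousand

plt.bar(countries, birth_rate)           # one bar per country, equal widths
plt.ylabel("Birth rate per thousand")
plt.title("Birth rate of different countries")
plt.show()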
Sub-divided Bar Diagram

In a sub-divided bar diagram, each bar representing the magnitude of a given value is further subdivided into various components. Each component occupies a part of the bar proportional to its share in the total.

Illustration - 1

Present the following data in a sub-divided bar diagram.

Year/Faculty   Science  Humanities  Commerce
2014-2015        240       560        220
2015-2016        280       610        280

[Sub-divided bar chart of faculty-wise enrolment for the two years]
Multiple Bar Diagram
In a multiple bar diagram, two or more sets of related data are represented and the components are shown as separate adjoining bars. The height of each bar represents the actual value of the component. The components are shown by different shades or colours.

Illustration 1 - Construct a suitable bar diagram for the following data on the number of students in two different colleges in different faculties.

College  Arts  Science  Commerce  Total
A        1200    800       600     2600
B         700    500       600     1800

[Multiple bar chart comparing the two colleges across faculties]

Percentage Bar Diagram

In a percentage bar diagram, the length of the entire bar is kept equal to 100 (hundred). The various segments of each bar may change and represent the percentage of each component in the aggregate.

Illustration 1

Year  Men  Women  Children
1995  45%   35%     20%
1996  44%   34%     22%
1997  48%   36%     16%

[Percentage bar chart of the composition for each year]
1.2.3 Pie Chart
A pie chart or circle chart is a circular statistical graphic that is divided into slices to illustrate numerical proportion. In a pie chart, the arc length of each slice is proportional to the quantity it represents. While it is named for its resemblance to a pie which has been sliced, there are variations in the way it can be presented. In a pie chart, categories of data are represented by wedges in the circle that are proportional in size to the percentage of individuals in each category.

Pie charts are very widely used in the business world and the mass media. Pie charts are generally used to show percentage or proportional data, and usually the percentage represented by each category is provided next to the corresponding slice of the pie. Pie charts are good for displaying data for around six categories or fewer.
1.2.4 Histogram
A histogram is a graphical data display using bars of different heights. It is similar to a bar chart, but a histogram groups the data into ranges (class intervals). The height of each bar shows how many values fall within each range.

A histogram can be used when:

●● The data is numerical
●● The shape of the data's distribution is to be viewed, especially when determining whether the output of a process is distributed approximately normally
●● Analyzing whether a process can meet the customer's requirements
●● Analyzing what the output from a supplier's process looks like
●● Seeing whether a process change has occurred from one time period to another
●● Determining whether the outputs of two or more processes are different
●● You wish to communicate the distribution of data quickly and easily to others
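A quick way to check whether a process output looks approximately normal, as discussed above, is to plot a histogram. A minimal Python sketch assuming matplotlib; the measurements are hypothetical.

import random
import matplotlib.pyplot as plt

random.seed(1)
measurements = [random.gauss(50, 5) for _ in range(200)]   # hypothetical process output

plt.hist(measurements, bins=10, edgecolor="black")         # group the data into 10 ranges
plt.xlabel("Measured value")
plt.ylabel("Frequency")
plt.title("Histogram of process output")
plt.show()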
1.2.5 Frequency Polygon
These are the frequencies plotted against the mid-points of the class intervals, with the points thus obtained joined by line segments. Comparing a histogram and a frequency polygon, in a frequency polygon the points replace the bars (rectangles). Also, when several distributions are to be compared on the same graph paper, frequency polygons are better than histograms.

Illustration 1

Draw a histogram and frequency polygon from the following data.

Age in Years   Number of Persons
10-20                 3
20-30                16
30-40                22
40-50                35
50-60                24
60-70                15
70-80                 2

[Histogram of the age distribution with the frequency polygon superimposed]
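The frequency polygon for the age data above can be drawn by plotting the class mid-points against the frequencies and joining them with line segments. A minimal sketch assuming matplotlib.

import matplotlib.pyplot as plt

classes = [(10, 20), (20, 30), (30, 40), (40, 50), (50, 60), (60, 70), (70, 80)]
freq = [3, 16, 22, 35, 24, 15, 2]

midpoints = [(lo + hi) / 2 for lo, hi in classes]   # 15, 25, ..., 75

plt.plot(midpoints, freq, marker="o")               # points joined by line segments
plt.xlabel("Age in years (class mid-point)")
plt.ylabel("Number of persons")
plt.title("Frequency polygon")
plt.show()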

1.2.6 Ogives
When frequencies are added successively, they are called cumulative frequencies. The curve obtained by plotting cumulative frequencies is called a cumulative frequency curve or an ogive (pronounced "ojive").

To construct an ogive: (i) Add up the progressive totals of frequencies, class by class, to get the cumulative frequencies. (ii) Plot the classes on the horizontal (x) axis and the cumulative frequencies on the vertical (y) axis.

Less than Ogive: To plot a less than ogive, the data is arranged in ascending order of magnitude and the frequencies are cumulated from the top, i.e. by adding. Cumulative frequencies are plotted against the upper class limits. An ogive under this method gives a rising (positive) curve.

Greater than Ogive: To plot a greater than ogive, the data is arranged in ascending order of magnitude and the frequencies are cumulated from the bottom, or subtracted from the total from the top. Cumulative frequencies are plotted against the lower class limits. An ogive under this method gives a falling (negative) curve.

Uses: Certain values like the median, quartiles, quartile deviation, coefficient of skewness, etc. can be located using ogives. Ogives are also helpful in comparing two distributions.

Illustration 1 –

Draw less than and more than ogive curves for the following frequency distribution and obtain the median graphically. Verify the result.

CI   0-20  20-40  40-60  60-80  80-100  100-120  120-140  140-160
f      5     12     18     25     15       12        8        5

Less than ogive (cumulative frequency against the upper class limit):
Upper limit      20   40   60   80   100   120   140   160
Less than cf      5   17   35   60    75    87    95   100

More than ogive (cumulative frequency against the lower class limit):
Lower limit       0   20   40   60    80   100   120   140
More than cf    100   95   83   65    40    25    13     5

[Less than and more than ogive curves; the x-coordinate of their intersection gives the median]
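The cumulative frequencies in the table above, and the median read off the ogives, can be verified numerically. The sketch below builds the less-than cumulative frequencies and interpolates within the median class (the usual grouped-data formula), giving a median of 72 for this distribution.

lower = [0, 20, 40, 60, 80, 100, 120, 140]
width = 20
freq = [5, 12, 18, 25, 15, 12, 8, 5]

# Less-than cumulative frequencies against the upper class limits
cum = []
total = 0
for f in freq:
    total += f
    cum.append(total)
print("Less than cf:", cum)            # [5, 17, 35, 60, 75, 87, 95, 100]

# Median by interpolation: locate the class containing N/2 = 50
half = total / 2
for i, c in enumerate(cum):
    if c >= half:
        cf_before = cum[i - 1] if i > 0 else 0
        median = lower[i] + (half - cf_before) / freq[i] * width
        break
print("Median:", median)               # 60 + (50 - 35)/25 * 20 = 72.0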
1.2.7 Pareto Chart

A Pareto chart is a graph showing the frequency of defects and their cumulative effect. Pareto charts are helpful in identifying the defects that should be prioritized to achieve the greatest overall improvement.

The Pareto principle (also known as the 80/20 rule, the law of the vital few, or the principle of factor sparsity) states that, for many events, roughly 80% of the effects come from 20% of the causes.

[Example of a Pareto chart: bars for each defect category in descending order of frequency, with a line showing the cumulative percentage]

When to use a Pareto chart

●● When analyzing data about the frequency of problems or causes in a process
●● When there are many problems or causes and you want to focus on the most significant
●● When analyzing broad causes by looking at their specific components
●● When communicating with others about the data
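A Pareto chart can be built by sorting the defect counts in descending order and overlaying the cumulative percentage. This is a minimal sketch assuming matplotlib; the defect categories and counts are hypothetical.

import matplotlib.pyplot as plt

defects = {"Scratches": 45, "Dents": 30, "Misalignment": 15, "Paint": 7, "Other": 3}

# Sort categories by frequency, largest first
items = sorted(defects.items(), key=lambda kv: kv[1], reverse=True)
labels = [k for k, _ in items]
counts = [v for _, v in items]

total = sum(counts)
cum_pct = []
running = 0
for c in counts:
    running += c
    cum_pct.append(100 * running / total)

fig, ax = plt.subplots()
ax.bar(labels, counts)                        # frequency bars
ax.set_ylabel("Number of defects")
ax2 = ax.twinx()                              # second y-axis for the cumulative %
ax2.plot(labels, cum_pct, color="red", marker="o")
ax2.set_ylabel("Cumulative %")
plt.title("Pareto chart of defects")
plt.show()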
1.2.8 Stem-and-leaf display
A stem-and-leaf display or stem-and-leaf plot is a device for presenting quantitative data in a graphical format, similar to a histogram, to assist in visualizing the shape of a distribution. Stem-and-leaf plots are useful tools in exploratory data analysis.

A stem-and-leaf plot is a special table where each data value is split into a "stem" (the first digit or digits) and a "leaf", which is usually the last digit. The "stem" values are listed down, and the "leaf" values go right (or left) from the stem values. The "stem" is used to group the scores and each "leaf" shows the individual scores within each group.

For example –

Tom got his friends to do a long jump and got these results:

2.3, 2.5, 2.5, 2.7, 2.8, 3.2, 3.6, 3.6, 4.5, 5.0

The stem-and-leaf plot for the same will be –

Stem  Leaf
2     35578
3     266
4     5
5     0

●● Stem "2" Leaf "3" means 2.3
●● In this case each leaf is a decimal digit
●● It is OK to repeat a leaf value
●● 5.0 has a leaf of "0"
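The splitting of each value into a stem and a leaf can also be done programmatically. A minimal Python sketch that reproduces the long-jump plot above (here the stem is the whole-metre part and the leaf is the first decimal digit):

from collections import defaultdict

jumps = [2.3, 2.5, 2.5, 2.7, 2.8, 3.2, 3.6, 3.6, 4.5, 5.0]

plot = defaultdict(list)
for value in jumps:
    stem = int(value)                    # whole-metre part, e.g. 2 for 2.3
    leaf = int(round(value * 10)) % 10   # first decimal digit, e.g. 3 for 2.3
    plot[stem].append(leaf)

for stem in sorted(plot):
    leaves = "".join(str(l) for l in sorted(plot[stem]))
    print(stem, "|", leaves)
# 2 | 35578
# 3 | 266
# 4 | 5
# 5 | 0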

1.2.9 Cross tabulations


Cross tabulation is a method by which the relationship between multiple variables is quantitatively analysed. Cross tabulations, also known as contingency tables or cross tabs, explain the connection between the various variables. A cross tabulation also shows how the relationship varies from one group of a variable to another.

A cross-tabulation (or crosstab) is, for reference, a two- (or more) dimensional table which records the number (frequency) of respondents having the specific characteristics described in the table cells. Cross-tabulation tables offer a wealth of information on the variables' relationship. Cross-tabulation analysis goes by several names in the research world, including crosstab, contingency table, chi-square and data tabulation.

Importance of Cross Tabulation

●● Clean and Useable Data: Cross tabulation makes it simple to interpret data. The clarity offered by cross tabulation helps deliver clean data that can be used to improve decisions throughout an organization.
●● Easy to Understand: No advanced statistical degree is needed to interpret a cross tabulation. The results are easy to read and explain. This makes it useful in any type of presentation.
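A contingency table of the kind described above can be produced directly with the pandas library. The sketch below uses a small hypothetical survey (gender versus preferred product) purely to show the call; pd.crosstab counts the respondents falling in each cell.

import pandas as pd

# Hypothetical survey responses
data = pd.DataFrame({
    "gender": ["M", "F", "F", "M", "F", "M", "F", "M"],
    "product": ["A", "A", "B", "B", "A", "A", "B", "A"],
})

table = pd.crosstab(data["gender"], data["product"])   # frequency of each combination
print(table)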

O
1.2.10 Scatter plot and Trend line
A scatter diagram is the most fundamental graph plotted to show the relationship between two variables. It is a simple way to represent a bivariate distribution, i.e. the distribution of two random variables. The two variables are plotted one against each of the X and Y axes. Thus, every data pair (xi, yi) is represented by a point on the graph, x being the abscissa and y the ordinate of the point. From a scatter diagram we can find whether there is any relationship between x and y, and if yes, what type of relationship. A scatter diagram thus indicates the nature and strength of the correlation.

The pattern of points obtained by plotting the observed pairs is known as a scatter diagram.

It gives us two types of information:

●● Whether the variables are related or not.
●● If so, what kind of relationship or estimating equation describes the relationship.

If the dots cluster around a line, the correlation is called linear correlation. If the dots cluster around a curve, the correlation is called a non-linear or curvilinear correlation.

A scatter diagram is drawn to visualize the relationship between two variables. The values of the more important (independent) variable are plotted on the X-axis while the values of the other variable are plotted on the Y-axis. On the graph, dots are plotted to represent different pairs of data. When dots are plotted to represent all the pairs, we get a scatter diagram. The way the dots scatter gives an indication of the kind of relationship which exists between the two variables. While drawing a scatter diagram, it is not necessary to start the axes at the zero values of the X and Y variables; the minimum values of the variables considered may be taken as the origin.

●● When there is a positive correlation between the variables, the dots on the scatter diagram run from the bottom left-hand corner to the upper right-hand corner. In the case of perfect positive correlation, all the dots lie on a straight line.
●● When a negative correlation exists between the variables, the dots on the scatter diagram run from the upper left-hand corner to the bottom right-hand corner. In the case of perfect negative correlation, all the dots lie on a straight line.

Example: Figures on advertisement expenditure (X) and sales (Y) of a firm for the last ten years are given below. Draw a scatter diagram.

Advertisement cost in '000'   40  65  60  90  85  75  35  90  34  76
Sales in Lakh `               45  56  58  82  65  70  64  85  50  85

Solution:

[Scatter diagram of sales against advertisement expenditure, showing a positive relationship]

A scatter diagram gives two very useful types of information. First, we can observe patterns between variables that indicate whether the variables are related. Secondly, if the variables are related, we can get an idea of what kind of relationship (linear or non-linear) would describe the relationship. Correlation examines the first question of determining whether an association exists between the two variables, and if it does, to what extent. Regression examines the second question of establishing an appropriate relation between the variables.
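The advertisement/sales example above can be plotted in a few lines. A trend line is often added by fitting a straight line; the sketch below uses a simple least-squares fit, assuming numpy and matplotlib are available.

import numpy as np
import matplotlib.pyplot as plt

adv = [40, 65, 60, 90, 85, 75, 35, 90, 34, 76]     # advertisement cost in '000
sales = [45, 56, 58, 82, 65, 70, 64, 85, 50, 85]   # sales in lakh

plt.scatter(adv, sales)                            # one dot per (x, y) pair

# Optional trend line: least-squares straight line through the points
slope, intercept = np.polyfit(adv, sales, 1)
xs = np.linspace(min(adv), max(adv), 100)
plt.plot(xs, slope * xs + intercept, color="red")

plt.xlabel("Advertisement expenditure ('000)")
plt.ylabel("Sales (lakh)")
plt.title("Scatter diagram with trend line")
plt.show()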

1.3.1 Arithmetic mean - intro and application


The mean is the average of the numbers. It is easy to calculate: add up all the numbers, then divide by how many numbers there are. In other words, it is the sum divided by the count.

The arithmetic mean is defined as the value obtained by dividing the total of the values of all items in the series by their number. In other words, it is the sum of the given observations divided by the number of observations, i.e., add the values of all items together and divide this sum by the number of observations.

Symbolically – x̄ = (x1 + x2 + x3 + ... + xn)/n
Properties of Arithmetic Mean


1. The sum of the deviations of all the values of x from their arithmetic mean is zero.
2. The product of the arithmetic mean and the number of items gives the total of all items.
3. The combined arithmetic mean can be found when the means of different groups are given.

Demerits of Arithmetic Mean


1. Arithmetic mean is affected by extreme values.
2. Arithmetic mean cannot be determined by inspection and cannot be located graphically.
3. Arithmetic mean cannot be obtained if a single observation is lost or missing.
4. Arithmetic mean cannot be calculated when open-end class intervals are present in the data.

O
Arithmetic Mean for Ungrouped Data

Individual Series

1. Direct Method
The following steps are involved in calculating the arithmetic mean for an individual series using the direct method:

- Add up the values of all the items in the series.
- Divide the sum of the values by the number of items. The result is the arithmetic mean.

The following formula is used: X = Ʃx/N
Where, X = Arithmetic mean, Ʃx = Sum of the values, N = Number of items.

Illustration 1 – Value (x) – 125 128 132 135 140 148 155 157 159 161
Calculate the arithmetic mean.

Solution –

Total number of terms = N = 10
Ʃx = 125 + 128 + 132 + 135 + 140 + 148 + 155 + 157 + 159 + 161 = 1440
X = Ʃx/N = 1440/10
= 144
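The direct-method calculation above can be confirmed with Python's statistics module (in MS Excel the equivalent is, as far as I know, the AVERAGE function):

import statistics

values = [125, 128, 132, 135, 140, 148, 155, 157, 159, 161]

total = sum(values)              # 1440
mean = statistics.mean(values)   # 144
print("Sum:", total, "Mean:", mean)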

2. Short-cut Method or Indirect Method
The following steps are involved in calculating the arithmetic mean for an individual series using the short-cut or indirect method:

1. Assume one of the values in the series as an average. It is called the working mean or assumed average.
2. Find out the deviation of each value from the assumed average.
3. Add up the deviations.
4. Apply the following formula: X = A + Ʃd/N
where, X = Arithmetic mean, A = Assumed average, Ʃd = Sum of the deviations, N = Number of items.

Illustration - 1: Calculate the arithmetic average of the data given below using the short-cut method.

Roll No   1   2   3   4   5   6   7   8   9   10
Marks    43  48  65  57  31  60  37  48  78  59

Solution – (taking the assumed average A = 60)

Roll No   Marks Obtained   d = X - 60
1              43             -17
2              48             -12
3              65               5
4              57              -3
5              31             -29
6              60               0
7              37             -23
8              48             -12
9              78              18
10             59              -1
                           Ʃd = -74

X = A + Ʃd/N
= 60 + (-74/10) = 52.6 marks


Combined Arithmetic Mean

When the arithmetic means and the numbers of items of two or more related groups are known, the mean of the entire group, called the combined mean, can be computed. The combined average of two series is calculated by the formula –

Combined mean = (n1x1 + n2x2) / (n1 + n2)

Where, n1 = No. of items of the first group, n2 = No. of items of the second group,
x1 = A.M. of the first group, x2 = A.M. of the second group.

Example - From the following data ascertain the combined mean of a factory consisting of two branches, namely Branch A and Branch B. In Branch A the number of workers is 500 and their average salary is 300. In Branch B the number of workers is 1,000 and their average salary is 250.

Solution:

Let the no. of workers in Branch A be n1 = 500
Let the no. of workers in Branch B be n2 = 1000
Average salary x1 = 300
Average salary x2 = 250

Combined mean = (n1x1 + n2x2) / (n1 + n2)
= [500(300) + 1000(250)] / (500 + 1000)
= (1,50,000 + 2,50,000)/1500
= 266.67

Weighted Arithmetic Mean
Sometimes some observations are relatively more important than others. Weights must be assigned to such observations on the basis of their relative importance. In the weighted arithmetic mean, the value of each item is multiplied by its weight, the products are added, and the total is divided by the sum of the weights.

Symbolically: Xw = Ʃwx / Ʃw

Example – Calculate the simple and weighted average from the following data –

Month           Jan    Feb     March  April  May     June
Price           42.5   51.25   50     52     44.25   54
No. of tonnes   25     30      40     50     10      45

Solution:

Month    Price per tonne (x)   No. of tonnes purchased (w)      wx
Jan            42.5                      25                   1062.5
Feb            51.25                     30                   1537.5
March          50                        40                   2000
April          52                        50                   2600
May            44.25                     10                    442.5
June           54                        45                   2430
N = 6        Ʃx = 294                  Ʃw = 200             Ʃwx = 10072.5

Simple AM
X = Ʃx/n = 294/6 = 49

Weighted AM
Xw = Ʃwx/Ʃw = 10072.5/200 = 50.36

The correct average price paid is ` 50.36 and not ` 49, i.e., the weighted arithmetic mean is more accurate than the simple arithmetic mean.
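The simple and weighted averages above can be reproduced in Python (in Excel, SUMPRODUCT of the weights and values divided by the sum of the weights gives the same weighted figure):

prices = [42.5, 51.25, 50, 52, 44.25, 54]
tonnes = [25, 30, 40, 50, 10, 45]

simple_mean = sum(prices) / len(prices)                           # 294 / 6 = 49.0
weighted_mean = sum(w * x for w, x in zip(tonnes, prices)) / sum(tonnes)
print("Simple mean:", simple_mean)
print("Weighted mean:", round(weighted_mean, 2))                  # 10072.5 / 200 = 50.36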

1.3.2 Median - Intro and Application


Median is defined as the value of the item dividing the series into two equal halves, where one half contains all values less than (or equal to) it and the other half contains all values greater than (or equal to) it. It is also defined as the "central value" of the variable. To find the median, the values of the items must be arranged in order of their size or magnitude.

Median is a positional average. The term position refers to the place of a value in the series; the place of the median is such that an equal number of items lie on either side of it, and therefore it is also called a locative average.

Merits of Median
Following are the advantages of median:

●● It is rigidly defined.
●● It is easy to calculate and understand.
●● It can be located graphically.
●● It is not affected by extreme values like the arithmetic mean.
●● It can be found by mere inspection.
●● It can be used for qualitative studies.
●● Even if the extreme values are unknown, the median can be calculated if one knows the number of items.

Demerits of Median
Following are the disadvantages of median:

●● In the case of individual observations, the values have to be arranged in order of their size to locate the median. Such an arrangement of data is a tedious task if the number of items is large.
●● If the median is multiplied by the number of items, the total value of all the items cannot be obtained as in the case of the arithmetic average.
●● It is not suitable for complex algebraic or mathematical treatment.
●● It is more affected by sampling fluctuations.

Application of Median
Example – Determine the median from the following –

25, 15, 23, 40, 27, 25, 23, 25, 20

Solution - Arranging the figures in ascending order –

S.no   Value or Size
1          15
2          20
3          23
4          23
5          25
6          25
7          25
8          27
9          40

Median = (N + 1)/2 th term = (9 + 1)/2 = 5th term
= 25
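The positional rule used above is what Python's statistics.median applies for an odd number of items; a quick check:

import statistics

values = [25, 15, 23, 40, 27, 25, 23, 25, 20]

ordered = sorted(values)
position = (len(ordered) + 1) // 2             # (9 + 1)/2 = 5th item
print("5th item:", ordered[position - 1])      # 25
print("Median:", statistics.median(values))    # 25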
1.3.3 Mode - Intro and Application
The word "mode" is derived from the French word "la mode", which means fashion. So it can be regarded as the most fashionable item in the series or the group.

Croxton and Cowden regard mode as "the most typical of a series of values". As a result it can sum up the characteristics of a group more satisfactorily than the arithmetic mean or median.

Mode is defined as the value of the variable occurring most frequently in a distribution. In other words, it is the most frequent size of item in a series.

Merits of Mode
The following are the merits of mode:

●● The most important advantage of mode is that it is usually an actual value of the series.
●● In the case of a discrete series, mode can be easily located by inspection.
●● Mode is not affected by extreme values.
●● Mode can be determined even if extreme values are not given.
●● It is easy to understand, and this average is used by people in their everyday speech.

Demerits of Mode
The following are the demerits of mode:

●● It is not based on all the observations of the data.
●● In a number of cases there will be more than one mode in the series.
●● If the mode is multiplied by the number of items, the product will not be equal to the total value of the items.
●● It will not truly represent the group if there are a small number of items of the same size in a large group of items of different sizes.
●● It is not suitable for further mathematical treatment.

Applications of Mode

Mode in Ungrouped Data

Individual Series
The mode of this series can be obtained by mere inspection. The number which occurs most often is the mode.

Illustration - 1

Locate the mode in the data 7, 12, 8, 5, 9, 6, 10, 9, 4, 9, 9

Solution:
On inspection, it is observed that the number 9 has the maximum frequency, i.e., it is repeated 4 times, more than any other number. Therefore mode (Z) = 9.

Discrete Series
The mode is calculated by applying a grouping table and an analysis table.

●● Grouping Table: Consists of six columns including the frequency column; the 1st column is the frequency, the 2nd and 3rd columns are two-way groupings of the frequencies, and the 4th, 5th and 6th columns are three-way groupings of the frequencies.
●● Analysis Table: Consists of 2 columns, namely tally bar and frequency.

Steps in Calculating Mode in Discrete Series
The following steps are involved in calculating mode in a discrete series:

●● Group the frequencies in two's.
●● Leave the first frequency and group the other frequencies in two's.
●● Group the frequencies in three's.
●● Leave the frequency of the first size and add the frequencies of the other sizes in three's.
●● Leave the frequencies of the first two sizes and add the frequencies of the other sizes in three's.
●● Prepare an analysis table to know the size occurring the maximum number of times. Find out the size which occurs the largest number of times; that particular size is the mode.

Continuous Series
The following steps are involved in calculating mode in a continuous series.

Find out the modal class. The modal class can be easily found by inspection: the group containing the maximum frequency is the modal group. Where two or more classes appear to be the modal class, it can be decided by the grouping process and by preparing an analysis table, as discussed above for the discrete series.

The actual value of mode is calculated by applying the following formula:

Mo = l + [(fm – f1) / (2fm – f1 – f2)] × i

where l is the lower limit of the modal class, fm the frequency of the modal class, f1 the frequency of the class preceding it, f2 the frequency of the class succeeding it, and i the class interval.
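For an individual series the mode can be found by simple counting, and for a continuous series the interpolation formula above can be coded directly. A minimal Python sketch; the grouped-data call reuses the age distribution from Section 1.2.5, whose modal class is 40-50.

from collections import Counter

# Individual series: the value occurring most often
data = [7, 12, 8, 5, 9, 6, 10, 9, 4, 9, 9]
value, count = Counter(data).most_common(1)[0]
print("Mode:", value, "(occurs", count, "times)")    # Mode: 9 (occurs 4 times)

# Continuous series: Mo = l + (fm - f1) / (2*fm - f1 - f2) * i
def grouped_mode(l, fm, f1, f2, i):
    return l + (fm - f1) / (2 * fm - f1 - f2) * i

# Modal class 40-50: fm = 35, preceding f1 = 22, succeeding f2 = 24, width i = 10
print("Grouped mode:", round(grouped_mode(40, 35, 22, 24, 10), 2))   # about 45.42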

1.3.4 Partition values - Quartiles and Percentiles


A percentile is the value below which a given percentage of the data falls.

Example: You are the fourth tallest person in a group of 20.
80% of the people are shorter than you: that means you are at the 80th percentile.
If your height is 1.65 m, then "1.65 m" is the 80th percentile height in that group.

Quartiles are the values that split the data into quarters. Quartiles divide a (part of a) data table into four groups containing an approximately equal number of observations. The total of 100% is split into four equal parts: 25%, 50%, 75% and 100%.

The quartiles also divide the data into divisions of 25%, so:

●● Quartile 1 (Q1) can be called the 25th percentile
●● Quartile 2 (Q2) can be called the 50th percentile
●● Quartile 3 (Q3) can be called the 75th percentile

Example:
For 1, 3, 3, 4, 5, 6, 6, 7, 8, 8:

●● The 25th percentile = 3
●● The 50th percentile = 5.5
●● The 75th percentile = 7

The percentiles and quartiles are computed as follows:
1. The f-value of each value in the data table is computed:
   f = (i – 1/2) / n
   where i is the index (rank) of the value and n is the number of values.
2. The first quartile is computed by interpolating between the f-values immediately below and above 0.25, to arrive at the value corresponding to the f-value 0.25.
3. The third quartile is computed by interpolating between the f-values immediately below and above 0.75, to arrive at the value corresponding to the f-value 0.75.
4. Any other percentile is similarly calculated by interpolating between the appropriate values.
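The f-value method described above can be implemented directly; the sketch below reproduces the quartiles 3, 5.5 and 7 for the example data set. (Spreadsheet functions such as Excel's QUARTILE.INC use a slightly different interpolation rule and may give marginally different values.)

def percentile(data, p):
    # f-value method: f = (i - 0.5) / n, with linear interpolation between neighbours
    xs = sorted(data)
    n = len(xs)
    f = [(i - 0.5) / n for i in range(1, n + 1)]
    if p <= f[0]:
        return xs[0]
    if p >= f[-1]:
        return xs[-1]
    for k in range(1, n):
        if f[k] >= p:
            w = (p - f[k - 1]) / (f[k] - f[k - 1])
            return xs[k - 1] + w * (xs[k] - xs[k - 1])

data = [1, 3, 3, 4, 5, 6, 6, 7, 8, 8]
print(percentile(data, 0.25))   # 3.0  (Q1)
print(percentile(data, 0.50))   # 5.5  (Q2, the median)
print(percentile(data, 0.75))   # 7.0  (Q3)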

1.3.5 Measures of Dispersion - Range - intro and Application


A measure of dispersion or variation in any data shows the extent to which the numerical values tend to spread about an average. If the difference between items is small, the average represents and describes the data adequately. For large differences it is proper to supplement the information by calculating a measure of dispersion in addition to an average. Determining the dispersion of data is useful because it helps:

●● To compare the current results with the past results.
●● To compare two or more sets of observations.
●● To suggest methods to control variation in the data.

A study of variations helps us in knowing the extent of uniformity or consistency in any data. Uniformity in production is an essential requirement in industry. Quality control methods are based on the laws of dispersion.

Absolute and Relative Measures of Dispersion
The measures of dispersion can be either 'absolute' or 'relative'. Absolute measures of dispersion are expressed in the same units in which the original data are expressed. For example, if the series is expressed as marks of the students in a particular subject, the absolute dispersion will provide the value in marks. The only difficulty is that if two or more series are expressed in different units, the series cannot be compared on the basis of dispersion.

A 'relative' measure or 'coefficient' of dispersion is the ratio or the percentage of a measure of absolute dispersion to an appropriate average. The basic advantage of this measure is that two or more series can be compared with each other despite the fact that they are expressed in different units.

A precise measure of dispersion is one that gives the magnitude of the variation in a series, i.e. it measures, in numerical terms, the extent of the scatter of the values around the average.

When dispersion is measured in terms of the original units of a series, it is absolute dispersion or variability. It is difficult to compare absolute values of dispersion in different series, especially when the series are in different units or have different sets of values. A good measure of dispersion should have properties similar to those described for a good measure of central tendency.

Measure of Dispersion        Relative Measure of Variability
The Range                    Relative Range
The Quartile Deviation       Relative Quartile Deviation
The Mean Deviation           Relative Mean Deviation
The Standard Deviation       Coefficient of Variation
Graphical Method

Range

Definition: The 'Range' of the data is the difference between the largest value and the smallest value of the data.

This is an absolute measure of variability. However, if we have to compare two sets of data, the range may not give a true picture. In such a case, a relative measure of range, called the coefficient of range, is used.

Formulae: Range = L – S
Coefficient of Range = (L – S) / (L + S)
Where L is the largest value and S the smallest value.

In individual observations and discrete series, L and S are easily identified. In a continuous series, one of the following two methods is used:

Method 1: L - Upper boundary of the highest class; S - Lower boundary of the lowest class.
Method 2: L - Mid value of the highest class; S - Mid value of the lowest class.

Example: Find the range of the set of observations 10, 5, 8, 11, 12, 9.

Solution: L = 12, S = 5

Range = L – S
= 12 – 5
= 7

Coefficient of range = (L – S) / (L + S)
= (12 – 5) / (12 + 5)
= 7/17
= 0.4118
Interquartile Range and Deviations
ni

Inter-quartile range and deviations are described in the following sub-sections.

Inter-quartile Range

The inter-quartile range is the difference between the upper quartile (third quartile) and the lower quartile (first quartile). Thus, Inter Quartile Range = Q3 – Q1

Quartile Deviation
Quartile Deviation is half of the difference between the upper quartile and the lower quartile.

Thus, Quartile Deviation = QD = (Q3 – Q1)/2

Quartile Deviation (QD) also gives the average deviation of the upper and lower quartiles from the median.

QD = (Q3 – Q1)/2, and Coefficient of QD = (Q3 – Q1) / (Q3 + Q1)

Example: The weekly wages of labourers are given below. Calculate the Q.D. and the coefficient of Q.D.

Weekly wages    100  200  400  500  600  Total
No. of Weeks      5    8   21   12    6     52

Solution:

Weekly wages   No. of Weeks   Cumulative Frequency
100                  5                 5
200                  8                13
400                 21                34
500                 12                46
600                  6                52

N = 52

Q1 = (N + 1)/4
= (52 + 1)/4
= 13.25

Q1 = 13th value + 0.25 (14th value – 13th value)
= 200 + 0.25 (400 – 200)
= 200 + 0.25 × 200
= 200 + 50
= 250

Q3 = 3(N + 1)/4
= 3 × 13.25
= 39.75

Q3 = 39th value + 0.75 (40th value – 39th value)
= 500 + 0.75 (500 – 500)
= 500 + 0.75 × 0
= 500

Q.D. = (Q3 – Q1)/2
= (500 – 250)/2
= 250/2
= 125

Coefficient of Q.D. = (Q3 – Q1) / (Q3 + Q1)
= (500 – 250) / (500 + 250)
= 250/750
= 0.333
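The Q.D. calculation above can be verified by expanding the frequency table into the 52 individual weekly wages and applying the same (N + 1)/4 positional rule; a minimal Python sketch:

wages = [100, 200, 400, 500, 600]
weeks = [5, 8, 21, 12, 6]

# Expand the frequency distribution into the individual observations
values = []
for w, f in zip(wages, weeks):
    values.extend([w] * f)
values.sort()
n = len(values)                          # 52

def value_at_position(pos):
    # Value at a (possibly fractional) position, interpolating between neighbours
    lower = int(pos)                     # e.g. 13 for position 13.25
    frac = pos - lower
    return values[lower - 1] + frac * (values[lower] - values[lower - 1])

q1 = value_at_position((n + 1) / 4)          # 250.0
q3 = value_at_position(3 * (n + 1) / 4)      # 500.0
qd = (q3 - q1) / 2                           # 125.0
coeff = (q3 - q1) / (q3 + q1)                # 0.333
print(q1, q3, qd, round(coeff, 3))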

1.3.7 Standard Deviation and Variance


Variance is defined as the average of the squared deviations of the data points from their mean.

When the data constitute a sample, the variance is denoted by σx² and the averaging is done by dividing the sum of the squared deviations from the mean by 'n – 1'. When the observations constitute the population, the variance is denoted by σ² and we divide by N for the average.

Different formulas for calculating variance:

Sample Variance: Var(x) = σx² = Σ (xi – x̄)² / (n – 1),  summed over i = 1 to n

Population Variance: Var(x) = σ² = Σ (xi – µ)² / N

Where,
xi for i = 1, 2, ..., n are the observation values
x̄ = Sample mean
n = Sample size
µ = Population mean
N = Population size

The population variance can be expanded as follows (using Σxi = Nµ):

Var(x) = σ² = Σ(xi – µ)² / N
= Σ(xi² – 2µxi + µ²) / N
= [Σxi² – 2µ Σxi + µ² Σ1] / N
= Σxi²/N – 2µ² + µ²
= Σxi²/N – µ²

i.e., Var(x) = E(X²) – [E(X)]²

Standard deviation
Definition: Standard Deviation is the root mean square deviation of the values from their arithmetic mean. S.D. is denoted by the symbol σ (read sigma). The Standard Deviation (SD) of a set of data is the positive square root of the variance of the set. It is also referred to as the Root Mean Square (RMS) value of the deviations of the data points. The SD of a sample is the square root of the sample variance, i.e. equal to σx, and the Standard Deviation of a population is the square root of the variance of the population, denoted by σ.

The properties of standard deviation are:

●● It is the most important and widely used measure of variability.
●● It is based on all the observations.
●● Further mathematical treatment is possible.
●● It is affected least by sampling fluctuations.
●● It is affected by extreme values and gives more importance to values that are away from the mean.
●● Its main limitation is that we cannot compare the variability of different data sets given in different units.

Formula for Calculating S.D.

For the set of values x1, x2, ..., xn:

σ = √[ Σx²/n – (Σx/n)² ]

If an assumed value A is taken for the mean and d = X – A, then

σ = √[ Σd²/n – (Σd/n)² ]

For a frequency distribution (with step deviations d = (X – A)/C):

σ = √[ Σfd²/N – (Σfd/N)² ] × C

Where A is the assumed mean, C is the class interval and N is the total frequency.

Application of Standard Deviation

Example: Find the standard deviation for the following data:

Class Interval  0-10  10-20  20-30  30-40  40-50  50-60  60-70
Frequency         6     14     10      8      1      3      8

Solution: Direct Method (taking the assumed mean A = 30)

Class Interval  Class Mark mi  Frequency fi  fi × mi  di = (mi – A)   di²   fi × di²
0-10                  5             6           30         -25        625     3750
10-20                15            14          210         -15        225     3150
20-30                25            10          250          -5         25      250
30-40                35             8          280           5         25      200
40-50                45             1           45          15        225      225
50-60                55             3          165          25        625     1875
60-70                65             8          520          35       1225     9800
                               Σfi = 50       1500                            19250

Mean = 1500/50 = 30
SD = √(19250/50) = √385 = 19.62
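The grouped-data calculation above is easy to check in Python. Because the assumed mean 30 equals the actual mean here, the correction term (Σfd/N)² is zero and the SD reduces to √(Σfd²/N) = 19.62.

import math

midpoints = [5, 15, 25, 35, 45, 55, 65]
freq = [6, 14, 10, 8, 1, 3, 8]

n = sum(freq)                                                          # 50
mean = sum(f * m for f, m in zip(freq, midpoints)) / n                 # 1500 / 50 = 30
var = sum(f * (m - mean) ** 2 for f, m in zip(freq, midpoints)) / n    # 19250 / 50 = 385
sd = math.sqrt(var)
print("Mean:", mean, "SD:", round(sd, 2))                              # Mean: 30.0 SD: 19.62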

s
Combined Standard Deviation

Standard Deviation of Combined Means
The mean and S.D. of two groups are given in the following table:

Group   Mean   S.D.   Size
I        X1     σ1     n1
II       X2     σ2     n2

Let X̄ and σ be the mean and S.D. of the combined group of (n1 + n2) items. Then X̄ and σ are determined by the formulae:

X̄ = (n1X1 + n2X2) / (n1 + n2)

σ² = [n1σ1² + n2σ2² + n1d1² + n2d2²] / (n1 + n2),  or  σ = √{ [n1σ1² + n2σ2² + n1d1² + n2d2²] / (n1 + n2) }

where d1 = X1 – X̄ and d2 = X2 – X̄.

These results can be extended to 3 samples as follows:

X̄ = (n1X1 + n2X2 + n3X3) / (n1 + n2 + n3)

σ² = [n1σ1² + n2σ2² + n3σ3² + n1d1² + n2d2² + n3d3²] / (n1 + n2 + n3)
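A short sketch of the two-group formula in Python; the group sizes, means and SDs used in the call are hypothetical and serve only to show the arithmetic.

import math

def combined_mean_sd(n1, m1, s1, n2, m2, s2):
    # Combined mean and SD of two groups from their sizes, means and SDs
    mean = (n1 * m1 + n2 * m2) / (n1 + n2)
    d1, d2 = m1 - mean, m2 - mean
    var = (n1 * s1**2 + n2 * s2**2 + n1 * d1**2 + n2 * d2**2) / (n1 + n2)
    return mean, math.sqrt(var)

# Hypothetical groups: 40 items with mean 50, SD 6; 60 items with mean 60, SD 8
mean, sd = combined_mean_sd(40, 50, 6, 60, 60, 8)
print(round(mean, 2), round(sd, 2))    # 56.0 and about 8.76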


1.3.8 Relative measure of dispersion - Coefficient of variation


The coefficient of variation is defined as the ratio of the SD to the mean, multiplied by 100.

CV = (σ / µ) × 100

This is also called relative variability. A smaller value of CV indicates greater stability and lesser variability.

Example: Two batsmen A and B made the following scores in the preliminary round of a World Cup series of cricket matches.

A: 14, 13, 26, 53, 17, 29, 79, 36, 84 and 49
B: 37, 22, 56, 52, 28, 30, 37, 48, 20 and 40

Who will you select for the final? Justify your answer.

Solution: We will first calculate the mean, standard deviation and Karl Pearson's coefficient of variation. We will select the player based on the average score as well as consistency. We not only want the player who has been scoring at a high average but also one who is doing it consistently; the probability of his playing a good innings in the final is then high.

For Player 'A' (Using the Direct Method)

Score xi   Deviation (xi - µ)   (xi - µ)²    xi²
14              -26                676        196
13              -27                729        169
26              -14                196        676
53               13                169       2809
17              -23                529        289
29              -11                121        841
79               39               1521       6241
36               -4                 16       1296
84               44               1936       7056
49                9                 81       2401
Σxi = 400    Σ(xi - µ) = 0    Σ(xi - µ)² = 5974    Σxi² = 21974

Now,

Mean = µ = Σxi / N = 400/10 = 40
Variance = Var(x) = Σ(xi - µ)² / N = 5974/10 = 597.4
Standard Deviation = σ = √Var(x) = √597.4 = 24.44


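The same calculation can be carried out for batsman B to complete the comparison. A minimal Python sketch using the population formulas (divide by N), as in the worked solution above; B turns out to have the lower coefficient of variation, i.e. he is the more consistent scorer, while A has the higher average.

import statistics

a = [14, 13, 26, 53, 17, 29, 79, 36, 84, 49]
b = [37, 22, 56, 52, 28, 30, 37, 48, 20, 40]

for name, scores in [("A", a), ("B", b)]:
    mean = statistics.mean(scores)
    sd = statistics.pstdev(scores)           # population SD, dividing by N
    cv = sd / mean * 100
    print(name, "mean =", mean, "SD =", round(sd, 2), "CV =", round(cv, 1), "%")
# A: mean 40, SD 24.44, CV about 61.1 %
# B: mean 37, SD 11.66, CV about 31.5 %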

Key Terms

●● Sample: A sample consists of one or more observations drawn from the population. The sample is the group of people who actually took part in your research.
●● Population: A population includes all of the elements from a set of data. The population is the broader group of people to whom you expect to generalize your study results.
●● Frequency Polygon: The frequencies are plotted against the mid-points of the class intervals and the points thus obtained are joined by line segments.
●● Bar Diagram: Only the length of the bar is taken into account and not the width. In other words, a bar is a thick line whose width is shown merely for presentation; since only the length is considered, it is called a one-dimensional diagram.
●● Simple Bar Diagram: It represents only one variable. Since the bars are of the same width and vary only in length (height), it becomes very easy to make a comparative study. Simple bar diagrams are very popular in practice.
●● Percentage Bar Diagram: The length of the entire bar is kept equal to 100 (hundred). The various segments of each bar may change and represent the percentage of each component in the aggregate.
●● Range: The 'Range' of the data is the difference between the largest value and the smallest value of the data.

Check your progress
1. A frequency polygon is constructed by plotting the frequency of the class interval against the
a) Lower limit of the class
b) Upper limit of the class
c) Any value of the class
d) Mid-point of the class
2. Numerical methods and graphical methods are specialized procedures used in
a) Social Statistics
b) Descriptive Statistics
c) Education Statistics
d) Business Statistics
3. A histogram consists of a set of
a) Adjacent triangles
b) Adjacent rectangles
c) Non-adjacent rectangles
d) Adjacent squares
4. Component bar charts are used when data is divided into
a) Circles
b) Squares
c) Parts
d) Groups
5. A circle in which sectors represent various quantities is called a
a) Histogram
b) Pie chart
c) Frequency Polygon
d) Ogive

Questions and Exercises
1. What do you mean by statistics?
2. What are the various types of bar diagrams?
3. What are the merits of mean, median and mode?
4. What do you understand by standard deviation and combined standard deviation?
5. Find the standard deviation for the following data:

Class Interval  0-10  10-20  20-30  30-40  40-50  50-60  60-70
Frequency         8     14     10      6      4      3      8

Check your progress
1. d) Mid-point of the class
2. b) Descriptive Statistics
3. b) Adjacent rectangles
4. c) Parts
5. b) Pie chart

Further Readings
1. Richard I. Levin, David S. Rubin, Sanjay Rastogi and Masood Husain Siddiqui, Statistics for Management, Pearson Education, 7th Edition, 2016.
2. Prem S. Mann, Introductory Statistics, 7th Edition, Wiley India, 2016.
3. Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani, An Introduction to Statistical Learning with Applications in R, Springer, 2016.

Bibliography
1. Srivastava V. K. et al., Quantitative Techniques for Managerial Decision Making, Wiley Eastern Ltd.
2. Richard I. Levin and Charles A. Kirkpatrick, Quantitative Approaches to Management, McGraw-Hill Kogakusha Ltd.
3. Prem S. Mann, Introductory Statistics, 7th Edition, Wiley India, 2016.
4. Budnick, Frank S., Dennis McLeavey and Richard Mojena, Principles of Operations Research, AITBS, New Delhi.
5. Sharma J. K., Operations Research: Theory and Applications, Macmillan, New Delhi.
6. Kalavathy S., Operations Research, Vikas Publishing.
7. Gould F. J., Introduction to Management Science, Prentice Hall, Englewood Cliffs, N.J.
8. Naray J. K., Operations Research: Theory and Applications, Macmillan, New Delhi.
9. Taha Hamdy, Operations Research, Prentice Hall of India.
10. Tulsian, Quantitative Techniques, Pearson Education.
11. Vohra N. D., Quantitative Techniques in Management, TMH.
12. Stevenson W. D., Introduction to Management Science, TMH.


Module-2: Probability Theory


Learning Objectives:

●● To get familiarized with business problems associated with the concept of probability and probability distributions
●● To understand the MS Excel applications of Binomial, Poisson and Normal probabilities

Learning Outcomes:
At the end of the course, the learners will be able to –

●● Compute Binomial, Poisson and Normal probabilities through MS Excel
●● Understand various theorems and principles of probability

Prof. Boddington defined statistics as "the science of estimates and probabilities".
2.1.1 Probability – Introduction
er
A probability is the quantitative measure of risk. Statistician I.J. Good suggests, “The
theory of probability is much older than the human species, since the assessment of
uncertainty incorporates the idea of learning from experience, which most creatures do.”
v
Probability and sampling are inseparable parts of statistics. Before we discuss
probability and sampling distributions, we must be familiar with some common terms
ni

used in theory of probability. Although these terms are commonly used in business, they
have precise technical meaning.

Random Experiment: In theory of probability, a process or activity that results in


U

outcomes under study is called experiment, for example, sampling from a production lot.
Random experiment is an experiment whose outcome is not predictable in advance. There
is a chance or risk (sometimes also called as uncertainty) associated with each outcome.
ity

Sample Space: It is a set of all possible outcomes of an experiment. It is usually


represented as S.

Example: If the random experiment is rolling of a die, the sample space is a set, S
= {1, 2, 3, 4, 5, 6}.
m

Similarly, if the random experiment is tossing of three coins, the sample space is, S
= {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT} with total of 8 possible outcomes. (H is
heads, and T is Tails showing up.)
)A

If we select a random sample of 2 items from a production lot and check them for
defect, the sample space will be S = {DD, DS, DR, RS, RR, SS} where D stands for
defective, S stands for serviceable and R stands for re-workable.
(c

●● Event: One or more possible outcomes that belong to certain category of our
interest are called as event. A sub set E of the sample space S is an event. In
other words, an event is a favorable outcome.

Amity Directorate of Distance & Online Education


Statistics Management 33

●● Event space: It is a set of all possible events. It is usually represented as E. Note


Notes

e
that usually in probability and statistics; we are interested in number of elements in
sample space and number of elements in event space.

in
●● Union of events: If E and F are two events, then another event defined to include
all outcomes that are either in E or in F or in both is called as a union of events E
and F. It is denoted as E U F.

nl
●● Intersection of events: If E and F are two events, then another event defined to
include all outcomes that are in both E and F is called as an intersection of events
E and F. It is denoted as E∩ F.

●● Mutually exclusive events: The events E and F are said to be mutually exclusive
events if they have no outcome of the experiment common to them. In other
words, events E and F are said to be mutually exclusive events if E∩ F = φ, where
φ is a null or empty set.

●● Collectively exhaustive events: The events are collectively exhaustive if their
union is the sample space.
●● Complement of event: Complement of an event E is an event which consists of

all outcomes that are not in E. It is denoted as EC. Thus, E ∩ EC = φ and E U EC = S.
2.1.2 Types of Events
A probability event can be defined as a set of outcomes of an experiment. In other words, an event in probability is a subset of the respective sample space. The sample space is the entire set of possible outcomes of a random experiment. The chance that an event occurs is called its probability, and the probability of any event lies between 0 and 1.

For example –

The sample space for the tossing of three coins simultaneously is given by:

S = {(T, T, T), (T, T, H), (T, H, T), (T, H, H), (H, T, T), (H, T, H), (H, H, T), (H, H, H)}

Suppose, if we want to find only the outcomes which have at least two heads; then
the set of all such possibilities can be given as:

E = { (H , T , H) , (H , H ,T) , (H , H ,H) , (T , H , H)}

Thus, an event is a subset of the sample space, i.e., E is a subset of S.
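The listing of S and E above can also be generated mechanically. The following is a minimal illustrative sketch in Python (an assumption of this edit; the module's own tool is MS Excel) that enumerates the three-coin sample space and picks out the event "at least two heads".

from itertools import product

# Sample space for tossing three coins: all ordered triples of H/T
S = list(product("HT", repeat=3))          # 8 outcomes

# Event E: outcomes with at least two heads
E = [outcome for outcome in S if outcome.count("H") >= 2]

print(len(S))   # 8
print(E)        # [('H','H','H'), ('H','H','T'), ('H','T','H'), ('T','H','H')]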



There could be a lot of events associated with a given sample space. For any
event to occur, the outcome of the experiment must be an element of the set of event E.

By an event we mean one or more outcomes.

Example Events:

●● Getting a Tail when tossing a coin is an event


●● Rolling a “5” is an event.

An event can include several outcomes:


●● Choosing a “King” from a deck of cards (any of the 4 Kings) is also an event


●● Rolling an “even number” (2, 4 or 6) is an event


Events can be:

●● Independent (each event is not affected by other events),

●● Dependent (also called “Conditional”, where an event is affected by other events)
●● Mutually Exclusive (events can’t happen at the same time)

2.1.3 Algebra of Events
Events are outcomes of an experiment. The probability of an event occurring is the ratio of the number of favourable outcomes to the total number of possible outcomes. Sometimes two events happen together, and sometimes only one of them happens. The algebra of events builds a new event by performing certain operations on two given events. The operations are union, intersection, complement and difference of two events. As events are subsets of the sample space, these operations are performed as set operations.

Complementary Events

For an event A, there is a complementary event B such that B represents the set of outcomes which are not in the set A. For example, if two coins are tossed together, then the sample space will be {HT, TH, HH, TT}. Let A be the event of getting one head; then the set A = {HT, TH}. The complementary event of A is B = {HH, TT}.
v
Events with AND
AND stands for the intersection of two sets. An event is the intersection of two events if it has the members present in both events. For example, if a pair of dice is rolled, then the sample space will have 36 members. Suppose A is the event of both dice showing the same number and B is the event of the sum being 6.

A = {(1,1), (2,2), (3,3), (4,4), (5,5), (6,6)}
B = {(3,3), (1,5), (5,1), (2,4), (4,2)}

A AND B = {(3,3)}

Events with OR
OR stands for the union of two sets. An event is called the union of two events if it has members present in either of the sets. For example, if two coins are tossed together, the sample space is S = {HT, TH, TT, HH}. Let event A be the event of getting exactly one head and event B be the event of getting two heads.

A = {HT, TH}
B = {HH}

Union of A and B, A OR B = {HT, TH, HH}

Events with BUT NOT



For two events A and B, “A but not B” is the event having all the elements of A but excluding the elements of B. This can also be represented as A − B. Suppose there is an experiment of choosing 4 cards from a deck of 52 cards. The event A is having all cards as red cards and the event B is having all cards as kings. Then the event “A but not B” will have all red cards excluding the two red kings.
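Since events are sets, the four operations above map directly onto set operations. The sketch below is an illustrative Python example (assumed for this edit, with the coin and dice sets taken from the text) showing union, complement, intersection and difference.

# Two coins: union (OR) and complement
S_coins = {"HT", "TH", "HH", "TT"}
A = {"HT", "TH"}                  # exactly one head
B = {"HH"}                        # two heads

print(A | B)                      # union: {'HH', 'HT', 'TH'} (order may vary)
print(S_coins - A)                # complement of A: {'HH', 'TT'}

# Pair of dice: intersection (AND) and difference (BUT NOT)
S_dice = {(i, j) for i in range(1, 7) for j in range(1, 7)}   # 36 outcomes
same = {(i, j) for (i, j) in S_dice if i == j}                # doubles
sum6 = {(i, j) for (i, j) in S_dice if i + j == 6}            # sum equals 6

print(same & sum6)                # AND: {(3, 3)}
print(sum6 - same)                # BUT NOT: the four non-double pairs summing to 6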

2.1.4 Addition Rule of Probability
If one task can be done in n1 ways and another task can be done in n2 ways and

if these tasks cannot be done at the same time, then there are (n1+n2) ways of doing
one of these tasks (either one task or the other). When logical OR is used in deciding
outcomes of the experiment and events are mutually exclusive then the ‘Sum Rule’ is

applicable.

The Addition rule of probability states that:

1. If ‘A’ and ‘B’ are any two events then the probability of the occurrence of either

‘A’ or ‘B’ is given by:
P (A U B) = P (A) +P (B) – P (A∩B)
2. If ‘A’ and ‘B’ are two mutually exclusive events then the probability of occurrence
of either A or B is given by

P (A U B) = P (A) + P (B)
Example: An urn contains 10 balls of which 5 are white, 3 black and 2 red. If we
select one ball randomly, how many ways are there that the ball is either white or red?

Solution:
Answer is 5 + 2 = 7.

Example: In a triangular series the probability of Indian team winning match with
Zimbabwe is 0.7 and that with Australia is 0.4. If the probability of India winning both
matches is 0.3, what is the probability that India will win at least one match so that it can
enter the final?

Solution:

Now, given that the probability of the Indian team winning the match with Zimbabwe is P (A) = 0.7, with Australia is P (B) = 0.4 and with both is P (A ∩ B) = 0.3.



Therefore, probability that India will win at least one match is,

P (A U B) = P (A) + P (B) - P (A∩ B)



= 0.7 + 0.4 - 0.3

= 0.8

2.1.5 Multiplication Rule of Probability



Suppose that a procedure can be broken down into a sequence of two tasks. If
there are n1 ways to do first task and n2 ways to do second task after the first task


has been done. Then there are (n1 × n2) ways to do the procedure. In general, if r
experiments are to be performed such that the outcome of the first can occur in n1 ways, and having completed the first experiment, the outcome of the second experiment can occur in n2 ways, then similarly the outcome of the third experiment can occur in n3 ways, and so on. Then there
is a total of n1 × n2 × n3 ×…× nr possible outcomes of the r experiments.

Multiplicative rule is stated as:

If ‘A’ and ‘B’ are two independent events then the probability of occurrence of ‘A’
and ‘B’ is given by:

P (A∩B) = P (A) P (B)

It must be remembered that when the logical AND is used to indicate successive
experiments then, the ‘Product Rule’ is applicable.

Example: How many outcomes are there if we toss a coin and then throw a die?

Answer is 2 × 6 = 12.

Example: It has been found that 80% of all tourists who visit India visit Delhi, 70%

of them visit Mumbai and 60% of them visit both.

1. What is the probability that a tourist will visit at least one city?
2. Also, find the probability that he will visit neither city.

Solution:
Let D indicate visit to Delhi and M denote visit to Mumbai.
Given, P (D) = 0.8, P (M) = 0.7 and P (D ∩M) = 0.6

Probability that a tourist will visit at least one city is,


1. P (D U M) = P (D) + P (M) - P (D ∩ M) = 0.8 + 0.7 - 0.6 = 0.9

2. P (D′ ∩ M′) = 1 - P (D U M) = 1 - 0.9 = 0.1
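A quick numeric check of the addition and complement rules, written as an illustrative Python sketch (the variable names are assumptions of this edit):

p_delhi, p_mumbai, p_both = 0.8, 0.7, 0.6

# Addition rule: P(D U M) = P(D) + P(M) - P(D ∩ M)
p_at_least_one = p_delhi + p_mumbai - p_both
print(round(p_at_least_one, 2))          # 0.9

# Complement rule: P(neither) = 1 - P(D U M)
print(round(1 - p_at_least_one, 2))      # 0.1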

2.1.6 Conditional, Joint and Marginal Probability



As a measure of uncertainty, probability depends on the information available. If


we know occurrence of say event F, probability of event E happening may be different
as compared to original probability of E when we had no knowledge of the event
F happening. Probability that E occurs given that F has occurred is the conditional

probability and is denoted by P(E | F). If event F occurs, then our sample space is reduced
to the event space of F. Also now for event E to occur, we must have both events E and
F occur simultaneously. Hence probability that event E occurs, given that event F has
occurred, is equal to the probability of EF (that is E ∩ F) relative to the probability of F.

Thus,

P(E | F) = P(EF) / P(F)
Another variation of conditional probability rule is

P(EF) = P(E | F) × P(F)


Conditional probability satisfies all the properties and axioms of probabilities. Now
onwards, we would write (E ∩ F) as EF, which is a common convention.

Conditional probability is the probability that an event will occur given that another

event has already occurred. If A and B are two events, then the conditional probability of
A given B is written as P (A/B) and read as “the probability of A given that B has already
occurred.”

Example: The probability that a new product will be successful if a competitor
does not launch a similar product is 0.67. The probability that a new product will be
successful in the presence of a competitor’s new product is 0.42. The probability that

the competitor will launch a new product is 0.35. What is the probability that the product
will be a success?

Solution: Let S denote that the product is successful, L denote competitor will

launch a product and LC denotes competitor will not launch the product. Now, from
given data,

P(S | LC) = 0.67, P(S | L) = 0.42, P(L) = 0.35

Hence, P(LC ) = 1− P(L) = 1− 0.35 = 0.65

Now, using the conditional probability formula, the probability that the product will be a success, P(S), is

P(S) = P(S | L)P(L) + P(S | LC)P(LC)

= 0.42 × 0.35 + 0.67 × 0.65 = 0.5825
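The same calculation can be scripted. The following minimal Python sketch (names assumed for illustration) conditions on whether the competitor launches:

p_launch = 0.35                 # P(L)
p_success_given_launch = 0.42   # P(S | L)
p_success_no_launch = 0.67      # P(S | LC)

# P(S) = P(S|L)P(L) + P(S|LC)P(LC)
p_success = (p_success_given_launch * p_launch
             + p_success_no_launch * (1 - p_launch))
print(round(p_success, 4))      # 0.5825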


2.1.7 Bayes’ Theorem

Consider two events, E and F. Whatever the events may be, we can always say
that the probability of E is equal to the probability of intersection of E and F, plus, the

probability of the intersection of E and complement of F. That is,

P (E) = P (E ∩ F) + P (E ∩ FC)

Bayes’ Formula

Let E and F be events.


E = (E ∩ F) U (E ∩ FC)

Any element in E must be either in both E and F or in E but not in F. (E ∩ F)



and (E ∩ FC) are mutually exclusive, since the former must be in F and the latter must not be in F,
we have by Axiom 3,

P(E) = P(E ∩ F) + P(E ∩ FC) = P(E | F)×P(F) + P(E | FC)×P(FC)



= P(E | F)×P(F) + P(E | FC)×[1 − P(F)]

Suppose now that F1, F2, …, Fn are mutually exclusive and exhaustive events. If E has occurred and we are interested in determining the probability that a particular Fi has occurred, then using the above equations, we have the following proposition.

P(Fi | E) = P(EFi) / P(E) = [P(E | Fi) × P(Fi)] / [Σ i=1 to n of P(E | Fi) × P(Fi)]   for all i = 1, 2, …, n

This equation is known as Bayes’ formula. If we think of the events Fi as being


possible ‘hypotheses’ about some subject matter, say the market shares of competitors, then Bayes’ formula gives us how these should be modified by the new

evidence of the experiment, say a market survey.

Example: A bin contains 3 different types of lamps. The probability that a type 1
lamp will give over 100 hours of use is 0.7, with the corresponding probabilities for type

2 and 3 lamps being 0.4 and 0.3 respectively. Suppose that 20 per cent of the lamps in
the bin are of type 1, 30 per cent are of type 2 and 50 per cent are of type 3. What is the
probability that a randomly selected lamp will last more than 100 hours? Given that a
selected lamp lasted more than 100 hours, what are the conditional probabilities that it

is of type 1, type 2 and type 3?

Solution: Let type 1, type 2 and type 3 lamps be denoted by T1, T2 and T3
respectively. Also, we denote S if a lamp lasts more than 100 hours and SC if it does

not. Now, as per given data,

P(S | T1) = 0.7, P(S | T2) = 0.4, P(S | T3) = 0.3

P(T1) = 0.2, P(T2) = 0.3, P(T3) = 0.5

(a) Now, using the conditional probability formula,

P(S) = P(S | T1)P(T1) + P(S | T2)P(T2) + P(S | T3)P(T3)

= 0.7 × 0.2 + 0.4 × 0.3 +0.3 × 0.5

= 0.41
(b) Now, using Bayes’ formula

P(T1 | S) = P(S | T1)P(T1) / P(S) = (0.7 × 0.2) / 0.41 = 0.341

P(T2 | S) = P(S | T2)P(T2) / P(S) = (0.4 × 0.3) / 0.41 = 0.293

P(T3 | S) = P(S | T3)P(T3) / P(S) = (0.3 × 0.5) / 0.41 = 0.366
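Bayes’ formula is easy to wrap in a small function. The sketch below is an illustrative Python example (assumed by this edit) that reproduces the lamp calculation, with the type proportions as priors and the over-100-hour probabilities as likelihoods.

def bayes(priors, likelihoods):
    """Return posterior probabilities P(Fi | E) given P(Fi) and P(E | Fi)."""
    evidence = sum(p * l for p, l in zip(priors, likelihoods))   # P(E)
    return [p * l / evidence for p, l in zip(priors, likelihoods)]

priors = [0.2, 0.3, 0.5]          # P(T1), P(T2), P(T3)
likelihoods = [0.7, 0.4, 0.3]     # P(S | Ti)

posteriors = bayes(priors, likelihoods)
print([round(p, 3) for p in posteriors])   # [0.341, 0.293, 0.366]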

2.2.1 Random Variables - Introduction


In many practical situations, the random variable of interest follows a specific pattern.

Random variables are often classified according to the probability mass function in
case of discrete, and probability density function in case of continuous random variable.
When the distributions are entirely known, all the statistical calculations are possible. In

practice, however, the distributions may not be known fully. But the random variable can often be approximated by one of the known types of standard random variables by examining the processes that make it random. These standard distributions are also
called ‘probability models’ or sample distributions. Various characteristics of distribution
like mean, variance, moments, etc. can be calculated using known closed formulae. We

will study some of the common types of probability distributions. The normal distribution is
the backbone of statistical inference and hence we will study it in more detail.


There are broadly four theoretical distributions which are generally applied in
practice. They are:

1. Bernoulli distribution

2. Binomial distribution
3. Poisson distribution

4. Normal distribution

2.2.2 Mean/ Expected Value of Random Variable

In probability theory, the expected value of a random variable is a generalization
of the weighted average and intuitively is the arithmetic mean of a large number of
independent realizations of that variable. The expected value is also known as the
expectation, mathematical expectation, mean, average, or first moment.

A Random Variable is a set of possible values from a random experiment. The
mean of a discrete random variable X is a weighted average of the possible values
that the random variable can take. Unlike the sample mean of a group of observations,
which gives each observation equal weight, the mean of a random variable weights

each outcome xi according to its probability, pi. The common symbol for the mean (also
known as the expected value of X) is μ.
It is defined as –

μX = x1p1 + x2p2 + … + xkpk = Σ xipi

The formula changes slightly according to what kinds of events are happening. For
most simple events, either the Expected Value formula of a Binomial Random Variable
or the Expected Value formula for Multiple Events is used.
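For a simple discrete random variable the weighted-average definition can be applied directly. A minimal Python sketch, with values and probabilities chosen only for illustration:

# A discrete random variable: possible values and their probabilities
values = [0, 1, 2, 3]
probs  = [0.1, 0.3, 0.4, 0.2]

# Mean (expected value): mu = sum of x_i * p_i
mu = sum(x * p for x, p in zip(values, probs))
print(round(mu, 2))    # 1.7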

2.2.3 Variance and Standard Deviation of Random Variable


The variance is a numerical description of the spread, or the dispersion, of the
random variable. That is, the variance of a random variable X is a measure of how

spread out the values of X are, given how likely each value is to be observed.

Variance: Var(X)

The Variance is:



Var(X) = Σx²p − μ²

To calculate the Variance:



●● square each value and multiply by its probability


●● sum them up and we get Σx²p
●● then subtract the square of the Expected Value, μ²
Standard Deviation: σ

The Standard Deviation is the square root of the Variance:


σ = √Var(X)
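Continuing the same illustrative sketch, the variance follows from Σx²p − μ² and the standard deviation is its square root (Python, with the same assumed values as above):

from math import sqrt

values = [0, 1, 2, 3]
probs  = [0.1, 0.3, 0.4, 0.2]

mu = sum(x * p for x, p in zip(values, probs))               # expected value
var = sum(x**2 * p for x, p in zip(values, probs)) - mu**2   # E[X^2] - mu^2
sigma = sqrt(var)                                            # standard deviation

print(round(var, 2), round(sigma, 2))   # 0.81 0.9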
2.2.4 Binomial Distribution - Introduction

We often conduct many trials which are independent and identical.
Suppose we perform n independent Bernoulli trials (each with two possible outcomes
and probability of success p) each of which results in a success with probability p and

probability of failure (1 – p). If random variable X represents the number of successes
that occur in n trials (order of successes not important), then X is said to be a Binomial
random variable with parameters (n, p).

Note that Bernoulli random variable is a Binomial random variable with parameter
(1, p) i.e. n = 1. The probability mass function of a binomial random variable with
parameters (n, p) is given by,

P(X = i) = nCi p^i (1 – p)^(n – i)   for i = 0, 1, 2, …, n

Expected value and variance for Binomial random variable are,

μ = E[X] = np

Var[X] = np(1 – p)
2.2.5 Binomial Distribution - Application
When to use binomial distribution is an important decision. Binomial distribution
can be used when following conditions are satisfied:
●● Trials are finite (and not very large), performed repeatedly for ‘n’ times.

●● Each trial (random experiment) should be a Bernoulli trial, the one that results
in either success or failure.
●● Probability of success in any trial is ‘p’ and is constant for each trial.

●● All the trials are independent.


These trials are usually experiments of selection ‘with replacement’. In cases where the population is very large, drawing a small sample from it does not change the probability of success significantly. Hence, even without replacement, each selection can still be treated as a Bernoulli trial and the number of successes follows the binomial distribution.

Following are some of the real life examples of applications of binomial distribution.

●● Number of defective bulbs in a lot of n items produced by a machine.



●● Number of female births out of n births in a hospital.


●● Number of correct answers in a multiple-choice test.
●● Number of seeds germinated in a row of n planted seeds.

●● Number of recaptured fish in a sample of n fish.


●● Number of missiles hitting the targets out of n fired.
Example: Suppose that the probability that a light in a classroom will be burnt
out is 1/3. The classroom has in all five lights and it is unusable if the number of lights

burning is less than two. What is the probability that the class room is unusable on a
random occasion?


Solution: This is a case of binomial distribution with n = 5 and p = 1/3.
The classroom is unusable if the number of burnouts is 4 or 5, that is, i = 4 or 5. Noting

that,

P(X = i) = nCi (p)^i (1 − p)^(n − i)

Thus, the probability that the classroom is unusable on a random occasion is,

P(X = 4) + P(X = 5) = 5C4 (1/3)^4 (2/3)^1 + 5C5 (1/3)^5 (2/3)^0 = 0.0412 + 0.00412 = 0.04532

Example: It is observed that 80% of T.V. viewers watch the Aap Ki Adalat programme.

What is the probability that at least 80% of the viewers in a random sample of 5 watch
this programme?

Solution: This is the case of binomial distribution with n = 5 and p = 0.8. Also i = 4
or 5.

The probability that at least 80% of the viewers in a random sample of 5 watch this programme is,

P(X ≥ 4) = P(X = 4) + P(X = 5) = 5C4 (0.8)^4 (0.2)^1 + 5C5 (0.8)^5 (0.2)^0 = 0.4096 + 0.3277 = 0.7373
v
We must remember that a cumulative binomial probability refers to the probability
that the binomial random variable falls within a specified range (e.g., is greater than or

equal to a stated lower limit and less than or equal to a stated upper limit).
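Both worked examples above can be checked from the binomial p.m.f. The following is an illustrative Python sketch (the text itself relies on MS Excel and printed tables); math.comb supplies the binomial coefficient nCi.

from math import comb

def binom_pmf(i, n, p):
    """P(X = i) for a Binomial(n, p) random variable."""
    return comb(n, i) * p**i * (1 - p)**(n - i)

# Classroom lights: n = 5, p = 1/3, unusable when 4 or 5 lights are burnt out
p_unusable = binom_pmf(4, 5, 1/3) + binom_pmf(5, 5, 1/3)
print(round(p_unusable, 5))      # 0.04527 (approx. 0.04532 with the text's rounding)

# TV viewers: n = 5, p = 0.8, at least 4 viewers watch the programme
p_at_least_4 = binom_pmf(4, 5, 0.8) + binom_pmf(5, 5, 0.8)
print(round(p_at_least_4, 4))    # 0.7373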

2.2.6 Poisson Distribution-Introduction



A random variable X, taking one of the values 0, 1, 2, …, is said to be a Poisson random variable with parameter λ, if for some λ > 0,

P(X = i) = e^(−λ) λ^i / i!   for i = 0, 1, 2, …

P(X = i) is a probability mass function (p.m.f.) of the Poisson random variable. Its
expected value and variance are,

μ = E[X] = λ

Var(X) = λ

Poisson random variable has wide range of applications. It can also be used as

an approximation for a binomial random variable with parameters (n, p) if n is large and p is


small enough to make the product np of moderate size. In this case, we call np = λ an
average rate. Some of the common examples where Poisson random variable can be
used to define the probability distribution are:

1. Number of accidents per day on expressway.



2. Number of earthquakes occurring over fixed time span.


3. Number of misprints on a page.


4. Number of arrivals of calls on telephone exchange per minute.
5. Number of interrupts per second on a server.

2.2.7 Poisson Distribution-Application

Procedure for Using Cumulative Poisson Probabilities Table
Poisson p.m.f. for given λ and i can be easily calculated using scientific calculators.
But while calculating cumulative probabilities i.e., ‘c.d.f.’, manual calculations become

too tedious. In such cases, we can use the Cumulative Poisson Probabilities table. The table is referred to as follows:

●● To find the cumulative Poisson probability for given λ and i:
●● Look at the given value of λ, i.e., the average rate, in the first column of the table.
●● In the first row, look for the value of i, the number of successes.
●● Locate the cell in the column of the i value and the row of the λ value. The value contained in this cell is the cumulative Poisson probability.
this cell is the value of cumulative Poisson probability.
Example: The probability that an item produced by a machine is defective is 0.1. Find the probability that a random sample of 10 items will contain at most one defective item, using (I) the binomial distribution and (II) the Poisson approximation.
Solution:

Method I

Using binomial distribution with parameters (n=10, p=0.1) we get,



P{X ≤ 1} = p(0) + p(1) = 10C0 (0.1)^0 (0.9)^10 + 10C1 (0.1)^1 (0.9)^9 = 0.7361

Or, Using Cumulative Binomial Probabilities Table

We can read for n=10, p=0.1 and i=1, the cumulative probability as 0.7361.

Method II

Using the Poisson distribution (as an approximation to the Binomial distribution) with parameter λ = 10 × 0.1 = 1, we get,

P{X ≤ 1} = p(0) + p(1) = [e^(−1) (1)^0] / 0! + [e^(−1) (1)^1] / 1! = e^(−1) + e^(−1) = 0.7358
Or, Using Cumulative Poisson Probabilities Table

We can read, for λ = 1 and i = 1, the cumulative probability as 0.7358.

Note that the Poisson distribution gives a reasonably good approximation.

Example: Average time for updating a passbook by a bank clerk is 15 seconds.

Someone arrives just ahead of you. Find the probability that you will have to wait

for your turn,

1. More than 1 minute.



2. Less than ½ minute.


Solution:

Now, λ = 60/15 = 4 passbooks per minute

P {X > 1} = 1 – F(1) = e^(−4) = 0.0183
P {X < 0.5} = F(0.5) = 1 − e^(−2)

= 1 - 0.1353
= 0.8647
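The figures above can be reproduced without tables. The following illustrative Python sketch (not part of the original text) checks the Poisson approximation with λ = 1 and the waiting-time probabilities with rate λ = 4 per minute.

from math import exp, factorial

def poisson_pmf(i, lam):
    """P(X = i) for a Poisson random variable with mean lam."""
    return exp(-lam) * lam**i / factorial(i)

# Poisson approximation to Binomial(10, 0.1): lambda = n*p = 1
p_at_most_1 = poisson_pmf(0, 1) + poisson_pmf(1, 1)
print(round(p_at_most_1, 4))               # 0.7358

# Waiting time at rate 4 per minute (exponential inter-arrival times)
print(round(exp(-4 * 1), 4))               # P(wait > 1 minute)   = 0.0183
print(round(1 - exp(-4 * 0.5), 4))         # P(wait < 0.5 minute) = 0.8647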

2.2.8 Normal Distribution- Introduction including empirical rule
The normal random variable and its distribution are commonly used in many business
and engineering problems. Many other distributions like binomial, Poisson, beta, chi-

square, students, exponential, etc., could also be approximated to normal distribution
under specific conditions. (Usually when sample size is large.)

If random variable is affected by many independent causes, and the effect of each
cause is not significantly large as compared to other effects, then the random variable

will closely follow the normal distribution, e.g., weights of coffee filled in packs, lengths
of nails manufactured on a machine, hardness of ball bearing surface, diameters of
shafts produced on lathe, effectiveness of training programme on the employees’
productivity, etc., are examples of normally distributed random variables.

Further, many sampling statistics, e.g., sample means X bar, are normally
distributed.

Empirical Rule
The empirical rule, also referred to as the three-sigma rule, is a statistical rule which states that for a normal distribution, almost all observed data will fall within three standard deviations (denoted by σ) of the mean or average (denoted by μ).

The empirical rule states that for a normal distribution, nearly all of the data will fall

within three standard deviations of the mean. The empirical rule can be broken down
into three parts:

●● 68% of data falls within the first standard deviation from the mean.
●● 95% fall within two standard deviations.

●● 99.7% fall within three standard deviations.

The Empirical Rule is often used in statistics for forecasting, especially when

obtaining the right data is difficult or impossible to get. The rule can give you a rough
estimate of what your data collection might look like if you were able to survey the entire
population.
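The three percentages can be recovered from the normal c.d.f. A minimal check, written here in Python for illustration (the text itself works from tables and MS Excel):

from statistics import NormalDist

z = NormalDist(mu=0, sigma=1)

for k in (1, 2, 3):
    share = z.cdf(k) - z.cdf(-k)          # P(-k sigma < X < +k sigma)
    print(f"within {k} sigma: {share:.3f}")

# within 1 sigma: 0.683
# within 2 sigma: 0.954
# within 3 sigma: 0.997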

A random variable X is a normal random variable with parameters μ and σ if the


probability density function (p.d.f.) of X is given by

f(x) = [1 / (σ √(2π))] e^(−(x − μ)² / (2σ²)),   where −∞ < x < ∞

This distribution is a bell-shaped curve that is symmetric about μ. It gives a


theoretical base to the observation that, in practice, many random phenomena obey
approximately, a normal probability distribution. Mean of normal random variable is E(X)

= μ and the variance of a normal random variable is Var(X) = σ². If X is normally distributed with parameters μ and σ, then the random variable aX + b is also normally distributed with parameters (aμ + b) and (aσ).

Properties of Normal Distribution
1. It is perfectly symmetric about the mean μ.

2. For a normal distribution mean = median = mode.
3. It is uni-modal (one mode), with skewness = 0 and kurtosis = 0.
4. Normal distribution is a limiting form of binomial distribution when the number of trials n is

large, and neither the probability p nor (1-p) is very small.
5. Normal distribution is a limiting case of Poisson distribution when mean μ = λ is very
large.
6. While working on probability of normal distribution we usually use normal distribution

(more often standard normal distribution) tables.
While reading these tables, the relevant properties are:
(a) The probability that a normally distributed random variable with mean μ and
variance σ² lies between two specified values a and b is P (a < X < b) = area
under the curve P(x) between the specified values X = a and X = b.
(b) Total area under the curve P (x) is equal to 1 in which 0.5 lies on either side of
the mean.

2.2.9 Standard Normal Distribution



Calculating cumulative density of normal distribution involves integration. Further,


tabulation also has a problem that we must have tables for every possible value of μ
and σ² (which is not feasible). Hence, we transform Normal Random Variable to another
random variable known as Standard Normal Random Variable. For this, we use a

transformation,

z = (X − μ) / σ

z is a normally distributed random variable with parameters μ = 0 and σ = 1. Any


normal random variable can be transformed to standard normal random variable z. We
can get cumulative distribution function as,

F(a) = ∫_{−∞}^{a} f(x) dx = (1/√(2π)) ∫_{−∞}^{a} e^(−z²/2) dz

This has been calculated for various values of ‘a’ and tabulated. Also, we know that,

F(–a) = 1 – F(a) and also P(a < Z < b) = F(b) – F(a)



Example: Tea is filled in the packs of 200 gm by a machine with variability of 0.25


gms. Packs weighing less than 200 gm would be rejected by customers and not legally
acceptable. Therefore, marketing and legal department requests production manager to
set the machine to fill slightly more quantity in each pack. However, finance department

objects to this since it would lead to financial loss due to overfilling the packs. The
general manager wants to know the 99% confidence interval, when the machine is set
at 200 gm, so that he can take a decision. Find the confidence interval. What is your advice to the production manager?

Solution:

Let the weight of tea in a pack be a random variable X.

We know that the mean μ = 200 gm and variance σ² = 0.25 gm², i.e. σ = 0.5 gm.

First, we find the value of z for 99% confidence. Standard Normal Distribution curve
is symmetric about mean.

Hence, corresponding to 99% confidence, half area under the curve

= 0.99/2

= 0.495.

The value of z corresponding to probability 0.495 is 2.575. Thus, the 99% confidence interval in terms of the variable z is ±2.575, which in terms of the variable x is 200 ± 1.2875, or (198.71 to 201.29).

Note that x = σz + μ = 0.5 × (±2.575) + 200 = 200 ± 1.2875


Hence, we can advise the production manager to set his machine to fill tea with
mean weight as 201.2875 or say 201.29. In that case we have 99% confidence of

meeting legal requirement and at the same time to keep the cost of excess filling of the
coffee to minimum.
U

Key Terms
●● Probability: Probability of a given event is an expression of likelihood or chance
of occurrence of an event. A probability is a number which ranges from zero to one.

●● Continuous Probability Distributions: Continuous random variables are those


that take on any value including fractions and decimals. Continuous random
variables give rise to continuous probability distributions. Continuous is the
opposite of discrete.

●● Random Experiment: In theory of probability, a process or activity that results


in outcomes under study is called experiment, for example, sampling from a
production lot.

●● Sample: A sample is that part of the universe which we select for the purpose
of investigation. A sample exhibits the characteristics of the universe. The word
sample literally means small universe.
●● Sampling: Sampling is defined as the selection of some part of an aggregate
or totality on the basis of which a judgment or inference about the aggregate or

totality is made. Sampling is the process of learning about the population on the
basis of a sample drawn from it.


●● Stratified random sampling: Stratified random sampling requires the separation


of defined target population into different groups called strata and the selection of
sample from each stratum.

●● Cluster sampling: Cluster sampling is a probability sampling method in which
the sampling units are divided into mutually exclusive and collectively exhaustive
subpopulations called clusters.

●● Hypothesis testing: Hypothesis testing refers to the formal procedures used by
statisticians to accept or reject statistical hypotheses. It is an assumption about a
population parameter. This assumption may or may not be true.

Check your progress
1. In probability theories, events which can never occur together are classified as

a. Collectively exclusive events
b. Mutually exhaustive events
c. Mutually exclusive events
d. Collectively exhaustive events

2. Value which is used to measure distance between mean and random variable x in
terms of standard deviation is called
a. Z-value
b. Variance
c. Probability of x
d. Density function of x

3. ________ test is applied when the sample size is less than 30.


a. T

b. Z
c. Rank
d. None of these

4. Under non-random sampling method, samples are selected on the basis of


a. Stages
b. Strategy

c. Originality
d. Convenience

5. The probability of a second event, given that the first event has occurred, is classified as
a. Series probability
b. Conditional probability

c. Joint probability
d. Dependent probability


Questions and Exercises


1. What is probability? What do you mean by probability distributions?
2. What is normal distribution? What are the merits of normal distribution?

3. What is Hypothesis Testing?
4. What do you mean by t-test and z-test?

5. Explain Poisson Distribution and its Application

Check your progress

1. c) Mutually exclusive events
2. a) Z value
3. a) T test

4. d) Convenience
5. b) Conditional probability

Further Readings

1. Richard I. Levin, David S. Rubin, Sanjay Rastogi Masood Husain Siddiqui,
Statistics for Management, Pearson Education, 7th Edition, 2016.
2. Prem.S.Mann, Introductory Statistics, 7th Edition, Wiley India, 2016.
3. Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani, An
Introduction to Statistical Learning with Applications in R, Springer, 2016.

Bibliography
1. Srivastava V. K. et al. – Quantitative Techniques for Managerial Decision Making,
Wiley Eastern Ltd
U

2. Richard, I.Levin and Charles A.Kirkpatrick – Quantitative Approaches to Management,


McGraw Hill, Kogakusha Ltd.
3. Prem.S.Mann, Introductory Statistics, 7th Edition, Wiley India, 2016.

4. Budnik, Frank S Dennis Mcleaavey, Richard Mojena – Principles of Operation


Research - AIT BS New Delhi.
5. Sharma J K – Operation Research- theory and applications-Mc Millan,New Delhi
6. Kalavathy S. – Operation Research – Vikas Pub Co

7. Gould F J – Introduction to Management Science – Englewood Cliffs N J Prentice


Hall.
8. Naray J K, Operation Research, theory and applications – Mc Millan, New Delhi.

9. Taha Hamdy, Operations Research, Prentice Hall of India


10. Tulasian: Quantitative Techniques: Pearson Ed.
11. Vohr.N.D. Quantitative Techniques in Management, TMH.
12. Stevenson W.D, Introduction to Management Science, TMH.


Module-3: Sampling, Sampling Distribution


and Estimation

Learning Objective:
●● To understand the basic concepts of sampling distribution and estimation techniques
●● To get familiarized with MS Excel for confidence interval construction

Learning Outcome:
At the end of the course, the learners will be able to –

●● Use sampling methods and estimation techniques in order to answer business queries
●● Understand the purpose and need of sampling.

3.1.1 Sampling - Introduction

Sampling is an important concept which is practiced in every activity. Sampling
involves selecting a relatively small number of elements from a large defined group
er
of elements and expecting that the information gathered from the small group will
allow judgments to be made about the large group. The basic idea of sampling is that
by selecting some of the elements in a population, the conclusion about the entire
v
population is drawn. Sampling is used when conducting census is impossible or
unreasonable.

Meaning of Sampling
Sampling is defined as the selection of some part of an aggregate or totality on
U

the basis of which a judgment or inference about the aggregate or totality is made.
Sampling is the process of learning about the population on the basis of a sample
drawn from it.
ity

Purpose of Sampling
There are several reasons for sampling. They are explained below:

1. Lower cost: The cost of conducting a study based on a sample is much lesser
than the cost of conducting the census study.

2. Greater accuracy of results: It is generally argued that the quality of a study


is often better with sampling data than with a census. Research findings also

substantiate this opinion.


3. Greater speed of data collection: Speed of execution of data collection is
higher with the sample. It also reduces the time between the recognition of a
need for information and the availability of that information.
4. Availability of population element: Some situations require sampling. When

the breaking strength of materials is to be tested, it has to be destroyed. A


census method cannot be resorted to as it would mean complete destruction of all


materials. Sampling is the only process possible if the population is infinite.

Features of Sampling Method

The sampling technique has the following good features of value and significance:

1. Economy: Sampling technique brings about cost control of a research project as it

requires much less physical resources as well as time than the census technique.
2. Reliability: In sampling technique, if due diligence is exercised in the choice of
sample unit and if the research topic is homogenous then the sample survey

can have almost the same reliability as that of census survey.
3. Detailed Study: An intensive and detailed study of sample units can be done
since their number is fairly small. Also multiple approaches can be applied to a

ity
sample for an intensive analysis.
4. Scientific Base: As mentioned earlier this technique is of scientific nature as
the underlying theory is based on principles of statistics.
5. Greater Suitability in most Situations: It has a wide applicability in most

s
6. Accuracy: The accuracy is determined by the extent to which bias is eliminated
er
from the sampling. When the sample elements are drawn properly some
sample elements underestimate the population values being studied and
others overestimate them.
Essentials of Sampling
In order to reach a clear conclusion, the sampling should possess the following

essentials:

1. It must be representative: The sample selected should possess the similar



characteristics of the original universe from which it has been drawn.


2. Homogeneity: Selected samples from the universe should have similar nature
and should not have any difference when compared with the universe.
ity

3. Adequate Samples: In order to have a more reliable and representative result,


a good number of items are to be included in the sample.
4. Optimization: All efforts should be made to get maximum results both in terms
of cost as well as efficiency. If the size of the sample is larger, there is better

efficiency and at the same time the cost is more. A proper size of sample is
maintained in order to have optimized results in terms of cost and efficiency.

3.1.2 Types of Sampling



The sampling design can be broadly grouped on two bases, viz., representation and element selection. Representation refers to the selection of members on a probability basis or by other means. Element selection refers to the manner in which the
elements are selected individually and directly from the population. If each element is

drawn individually from the population at large, it is an unrestricted sample. Restricted


sampling is where additional controls are imposed, in other words it covers all other
forms of sampling.

The classification of sampling design on the basis of representation and element


selection is discussed below.

in
Probability Sampling
Probability sampling is where each sampling unit in the defined target population
has a known non-zero probability of being selected in the sample. The actual probability

of selection for each sampling unit may or may not be equal depending on the type
of probability sampling design used. Specific rules for selecting members from the
operational population are made to ensure unbiased selection of the sampling units and
proper sample representation of the defined target population. The results obtained by

using probability sampling designs can be generalized to the target population within a
specified margin of error.

Probability samples are characterised by the fact that, the sampling units are

selected by chance. In such a case, each member of the population has a known,
non- zero probability of being selected. However, it may not be true that all samples
would have the same probability of selection, but it is possible to say the probability
of selecting any particular sample of a given size. It is possible that one can calculate

the probability that any given population element would be included in the sample. This
requires a precise definition of the target population as well as the sampling frame.
Probability sampling techniques differ in terms of sampling efficiency which is a
concept that refers to trade off between sampling cost and precision. Precision refers to
the level of uncertainty about the characteristics being measured. Precision is inversely
related to sampling errors but directly related to cost. The greater the precision, the
greater the cost and there should be a trade-off between sampling cost and precision.
The researcher is required to design the most efficient sampling design in order to

increase the efficiency of the sampling.

The different types of probability sampling designs are discussed below:



Simple Random Sampling


The following are the implications of random sampling:

●● It provides each element in the population an equal probability chance of


being chosen in the sample, with all choices being independent of one
another and
●● It offers each possible sample combination an equal probability opportunity of
m

being selected.
In the unrestricted probability sampling design every element in the population
has a known, equal non-zero chance of being selected as a subject. For example, if
)A

10 employees (n = 10) are to be selected from 30 employees (N = 30), the researcher


can write the name of each employee in a piece of paper and select them on a random
basis. Each employee will have an equal known probability of selection for a sample.
The same is expressed in terms of the following formula:

Probability of selection = Size of sample / Size of population


(c

Each employee would have a 10/30 or .333 chance of being randomly selected
in a drawn sample. When the defined target population consists of a larger number

Amity Directorate of Distance & Online Education


Statistics Management 51

of sampling units, a more sophisticated method can be used to randomly draw the
Notes

e
necessary sample. A table of random numbers can be used for this purpose. The table
of random numbers contains a list of randomly generated numbers. The numbers

in
can be randomly generated through the computer programs also. Using the random
numbers the sample can be selected.

Advantages and Disadvantages

nl
The simple random sampling technique can be easily understood and the survey
result can be generalized to the defined target population with a pre specified margin
of error. It also enables the researcher to gain unbiased estimates of the population’s

O
characteristics. The method guarantees that every sampling unit of the population
has a known and equal chance of being selected, irrespective of the actual size of the
sample resulting in a valid representation of the defined target population.

The major drawback of the simple random sampling is the difficulty of obtaining
complete, current and accurate listing of the target population elements. Simple
random sampling process requires all sampling units to be identified which would be
cumbersome and expensive in case of a large population. Hence, this method is most
suitable for a small population.

Systematic Random Sampling

s
er
The systematic random sampling design is similar to simple random sampling but
requires that the defined target population should be selected in some way. It involves
drawing every nth element in the population starting with a randomly chosen element
v
between 1 and n. In other words individual sampling units are selected according their
position using a skip interval. The skip interval is determined by dividing the sample size
ni

into population size. For example, if the researcher wants a sample of 100 to be drawn
from a defined target population of 1000, the skip interval would be 10(1000/100). Once
the skip interval is calculated, the researcher would randomly select a starting point and
U

take every 10th until the entire target population is proceeded through. The steps to be
followed in a systematic sampling method are enumerated below:

●● Total number of elements in the population should be identified


ity

●● The sampling ratio is to be calculated ( n = total population size divided by


size of the desired sample)
●● A sample can be drawn by choosing every nth entry
Two important considerations in using the systematic random sampling are:
m

1. It is important that the natural order of the defined target population list be
unrelated to the characteristic being studied.
)A

2. Skip interval should not correspond to the systematic change in the target
population.

Advantages and Disadvantages


The major advantage is its simplicity and flexibility. In case of systematic sampling
(c

there is no need to number the entries in a large personnel file before drawing a

Amity Directorate of Distance & Online Education


52 Statistics Management

sample. The availability of lists and shorter time required to draw a sample compared
Notes

e
to random sampling makes systematic sampling an attractive, economical method for
researchers.

in
The greatest weakness of systematic random sampling is the potential for the
hidden patterns in the data that are not found by the researcher. This could result in
a sample not truly representative of the target population. Another difficulty is that the

nl
researcher must know exactly how many sampling units make up the defined target
population. In situations where the target population is extremely large or unknown,
identifying the true number of units is difficult and the estimates may not be accurate.

O
Stratified Random Sampling
Stratified random sampling requires the separation of defined target population into
different groups called strata and the selection of sample from each stratum. Stratified

ity
random sampling is very useful when the divisions of target population are skewed
or when extremes are present in the probability distribution of the target population
elements of interest. The goal in stratification is to minimize the variability within each
stratum and maximize the difference between strata. The ideal stratification would be

s
based on the primary variable under study. Researchers often have several important
variables about which they want to draw conclusions.
er
A reasonable approach is to identify some basis for stratification that correlates
well with other major variables. It might be a single variable like age, income etc. or
a compound variable like on the basis of income and gender. Stratification leads to
segmenting the population into smaller, more homogeneous sets of elements. In order
v
to ensure that the sample maintains the required precision in terms of representing
the total population, representative samples must be drawn from each of the smaller
ni

population groups.

There are three reasons as to why a researcher chooses a stratified random


U

sample:

●● To increase the sample’s statistical efficiency


●● To provide adequate data for analyzing various sub populations
ity

●● To enable different research methods and procedures to be used in different


strata.

Cluster Sampling
m

Cluster sampling is a probability sampling method in which the sampling units


are divided into mutually exclusive and collectively exhaustive subpopulation called
clusters. Each cluster is assumed to be the representative of the heterogeneity of
the target population. Groups of elements that would have heterogeneity among the
)A

members within each group are chosen for study in cluster sampling. Several groups
with intragroup heterogeneity and intergroup homogeneity are found. A random
sampling of the clusters or groups is done and information is gathered from each of
the members in the randomly chosen clusters. Cluster sampling offers more of
heterogeneity within groups and more homogeneity among the groups.
(c

Amity Directorate of Distance & Online Education


Statistics Management 53

Single Stage and Multistage Cluster Sampling


Notes

e
In single stage cluster sampling, the population is divided into convenient clusters
and required number of clusters are randomly chosen as sample subjects. Each

in
element in each of the randomly chosen cluster is investigated in the study. Cluster
sampling can also be done in several stages which is known as multistage cluster
sampling. For example: To study the banking behaviour of customers in a national

nl
survey, cluster sampling can be used to select the urban, semi-urban and rural
geographical locations of the study. At the next stage, particular areas in each of the
location would be chosen. At the third stage, the banks within each area would be
chosen.

O
Thus multi-stage sampling involves a probability sampling of the primary sampling
units; from each of the primary units, a probability sampling of the secondary sampling
units is drawn; a third level of probability sampling is done from each of these

ity
secondary units, and so on until the final stage of breakdown for the sample units are
arrived at, where every member of the unit will be a sample.

Advantages and Disadvantages of Cluster Sampling

s
The cluster sampling method is widely used due to its overall cost-effectiveness
and feasibility of implementation. In many situations the only reliable sampling unit
er
frame available to researchers and representative of the defined target population,
is one that describes and lists clusters. The list of geographical regions, telephone
exchanges, or blocks of residential dwelling can normally be easily compiled than
the list of all the individual sampling units making up the target population. Clustering
v
method is a cost efficient way of sampling and collecting raw data from a defined target
population.
ni

One major drawback of clustering method is the tendency of the cluster to be


homogeneous. The greater the homogeneity of the cluster, the less precise will be the
sample estimate in representing the target population parameters. The conditions of
U

intra- cluster heterogeneity and inter-cluster homogeneity are often not met. For these
reasons this method is not practiced often.
ity

Area Sampling
Area sampling is a form of cluster sampling in which the clusters are formed by
geographic designations. For example, state, district, city, town etc., Area sampling is
a form of cluster sampling in which any geographic unit with identifiable boundaries
m

can be used. Area sampling is less expensive than most other probability designs and
is not dependent on population frame. A city map showing blocks of the city would be
adequate information to allow a researcher to take a sample of the blocks and obtain
data from the residents therein.
)A

Sequential/Multiphase Sampling
This is also called Double Sampling. Double sampling is opted when further
information is needed from a subset of groups from which some information has already
(c

been collected for the same study. It is called as double sampling because initially a
sample is used in the study to collect some preliminary information of interest and later
a sub-sample of this primary sample is used to examine the matter in more detail The

Amity Directorate of Distance & Online Education


54 Statistics Management

process includes collecting data from a sample using a previously defined technique.
Notes

e
Based on this information, a sub sample is selected for further study. It is more
convenient and economical to collect some information by sampling and then use this

in
information as the basis for selecting a sub sample for further study.

Sampling with Probability Proportional to Size

nl
When the case of cluster sampling units does not have exactly or approximately
the same number of elements, it is better for the researcher to adopt a random
selection process, where the probability of inclusion of each cluster in the sample
tends to be proportional to the size of the cluster. For this, the number of elements

O
in each cluster has to be listed, irrespective of the method used for ordering it. Then
the researcher should systematically pick the required number of elements from the
cumulative totals. The actual numbers thus chosen would not however reflect the
individual elements, but would indicate as to which cluster and how many from them are

ity
to be chosen by using simple random sampling or systematic sampling. The outcome
of such sampling is equivalent to that of simple random sample. This method is also
less cumbersome and is also relatively less expensive.

s
Non-probability Sampling
In non probability sampling method, the elements in the population do not have any
er
probabilities attached to being chosen as sample subjects. This means that the findings
of the study cannot be generalized to the population. However, at times the researcher
may be less concerned about generalizability and the purpose may be just to obtain
v
some preliminary information in a quick and inexpensive way. Sometimes when the
population size is unknown, then non probability sampling would be the only way to
ni

obtain data. Some non-probability sampling techniques may be more dependable than
others and could often lead to important information with regard to the population.

Convenience Sampling
U

Non-probability samples that are unrestricted are called convenient sampling.


Convenience sampling refers to the collection of information from members of
population who are conveniently available to provide it. Researchers or field workers
ity

have the freedom to choose as samples whomever they find, thus it is named as
convenience. It is mostly used during the exploratory phase of a research project
and it is the best way of getting some basic information quickly and efficiently. The
assumption is that the target population is homogeneous and the individuals selected
as samples are similar to the overall defined target population with regard to the
m

characteristics being studied. However, in reality there is no way to accurately assess


the representativeness of the sample. Due to the self selection and voluntary nature of
participation in data collection process the researcher should give due consideration to
)A

the non-response error.

Advantages and Disadvantages


Convenient sampling allows a large number of respondents to be interviewed
(c

in a relatively short time. This is one of the main reasons for using convenient
sampling in the early stages of research. However the major drawback is that the

Amity Directorate of Distance & Online Education


Statistics Management 55

use of convenience samples in the development phases of constructs and scale


Notes

e
measurements can have a serious negative impact on the overall reliability and validity
of those measures and instruments used to collect raw data. Another major drawback is

in
that the raw data and results are not generalizable to the defined target population with
any measure of precision. It is not possible to measure the representativeness of the
sample, because sampling error estimates cannot be accurately determined.

nl
Judgment Sampling
Judgment sampling is a non-probability sampling method in which participants
are selected according to an experienced individual’s belief that they will meet the

O
requirements of the study. The researcher selects sample members who conform to
some criterion. It is appropriate in the early stages of an exploratory study and involves
the choice of subjects who are most advantageously placed or in the best position to
provide the information required. This is used when a limited number or category of

ity
people have the information that are being sought. The underlying assumption is that
the researcher’s belief that the opinions of a group of perceived experts on the topic of
interest are representative of the entire target population.

Advantages and Disadvantages

s
If the judgment of the researcher or expert is correct then the sample generated
from the judgment sampling will be much better than one generated by convenience
er
sampling. However, as in the case of all non-probability sampling methods, the
representativeness of the sample cannot be measured. The raw data and information
collected through judgment sampling provides only a preliminary insight
v
Quota Sampling
ni

The quota sampling method involves the selection of prospective participants


according to pre specified quotas regarding either the demographic characteristics
(gender, age, education, income, occupation etc.,) specific attitudes (satisfied, neutral,
U

dissatisfied) or specific behaviours (regular, occasional, rare user of product). The


purpose of quota sampling is to provide an assurance that pre specified subgroups
of the defined target population are represented on pertinent sampling factors that
are determined by the researcher. It ensures that certain groups are adequately
ity

represented in the study through the assignment of the quota.

Advantages and Disadvantages


The greatest advantage of quota sampling is that the sample generated contains
m

specific subgroups in the proportion desired by researchers. In those research projects


that require interviews, the use of quotas ensures that the appropriate subgroups are
identified and included in the survey. The quota sampling method may eliminate or

reduce selection bias.

An inherent limitation of quota sampling is that the success of the study will be
dependent on subjective decisions made by the researchers. As a non-probability
method, it is incapable of measuring true representativeness of the sample or accuracy
of the estimate obtained. Therefore, attempts to generalize the data results beyond

those respondents who were sampled and interviewed become very questionable and
may misrepresent the given target population.




Snowball Sampling

e
Snowball sampling is a non-probability sampling method in which a set of
respondents are chosen who help the researcher to identify additional respondents to

in
be included in the study. This method of sampling is also called as referral sampling
because one respondent refers other potential respondents. This method involves
probability and non-probability methods. The initial respondents are chosen by a

random method and the subsequent respondents are chosen by non-probability
methods. Snowball sampling is typically used in research situations where the defined
target population is very small and unique and compiling a complete list of sampling
units is a nearly impossible task. This technique is widely used in academic research.

O
While the traditional probability and other non-probability sampling methods would
normally require an extreme search effort to qualify a sufficient number of prospective
respondents, the snowball method would yield better results at a much lower cost. The

researcher has to identify and interview one qualified respondent and then solicit his
help to identify other respondents with similar characteristics.

Advantages and Disadvantages


Snowball sampling enables the researcher to identify and select prospective respondents who are small in number, hard to reach and part of a uniquely defined target population. It is most
useful in qualitative research practices. Reduced sample size and costs are the primary
er
advantage of this sampling method. The major drawback is that the chance of bias is
higher. If there is a significant difference between people who are identified through
snowball sampling and others who are not then, it may give rise to problems. The
v
results cannot be generalized to members of larger defined target population.

3.1.3 Types of Sampling & Non Sampling Errors and Precautions


A sampling error represents a statistical error occurring when an analyst does not
select a sample that represents the entire population of data and the results found
U

in the sample do not represent the results that would be obtained from the entire
population.

●● A sampling error is the difference between a sampled value and the true population value, and it can arise even when the sample is not skewed or unrepresentative in any obvious way.
●● Even randomized samples carry some sampling error, since a sample statistic is only an estimate of the population from which it is derived.
m

●● Sampling errors can be reduced when the sample size is increased and also by ensuring that the sample adequately represents the entire population. For example, ABC Company provides a subscription-based service that allows consumers to pay a monthly fee to stream videos and other programming over the web.
A non-sampling error is a statistical term referring to an error resulting from data
collection, which causes the data to differ from the true values. A non-sampling error is
different from that of a sampling error.

●● A non-sampling error refers to either random or systematic errors, and these errors
can be challenging to spot in a survey, sample, or census.

●● Systematic non-sampling errors are worse than random non-sampling errors



e
because systematic errors may result in the study, survey or census having to be
scrapped.

in
●● The higher the number of errors, the less reliable the information.
●● When non-sampling errors occur, the rate of bias in a study or survey goes up.

3.1.4 Central Limit Theorem
In the study of probability theory, the central limit theorem (CLT) states that the distribution of sample means approximates a normal distribution, also known as a “bell curve”, as the sample size becomes larger, regardless of the shape of the population distribution, provided that all samples are identical in size.

It is a statistical theory stating that, given a sufficiently large sample size from a population with a finite degree of variance, the mean of all samples from the same population will be approximately equal to the mean of the population. Furthermore, the sample means will follow an approximate normal distribution pattern, with their variance being approximately equal to the variance of the population divided by each sample's size.

s
●● The central limit theorem (CLT) states that the distribution of sample means approximates a normal distribution as the sample size gets larger.
●● Sample sizes equal to or greater than 30 are considered sufficient for the theorem to hold.
●● A key aspect of the theorem is that the average of the sample means and standard deviations will equal the population mean and standard deviation.
●● A sufficiently large sample size allows the characteristics of a population to be predicted more accurately (see the simulation sketch below).
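As a check on these points, the following minimal simulation sketch draws repeated samples from a deliberately skewed (exponential) population. It is only an illustration, not part of the original text: the exponential population, the sample size of 40 and the 10,000 repetitions are assumed values chosen for the demonstration.

# Minimal simulation sketch of the central limit theorem (illustrative only)
import numpy as np

rng = np.random.default_rng(seed=1)
population_mean = 5.0                                # exponential population: skewed, mean 5, sd 5
samples = rng.exponential(scale=population_mean, size=(10_000, 40))
sample_means = samples.mean(axis=1)                  # one mean per sample of n = 40

print(round(sample_means.mean(), 2))                 # close to the population mean 5.0
print(round(sample_means.std(ddof=1), 2))            # close to sigma/sqrt(n) = 5/sqrt(40), about 0.79

Even though the individual observations come from a heavily skewed distribution, the sample means cluster around the population mean with the spread predicted by the theorem.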

3.1.5 Sampling Distribution of the Mean


U

A sample is that part of the universe which we select for the purpose of
investigation. A sample exhibits the characteristics of the universe. The word sample
literally means small universe. For example, suppose the microchips produced in

a factory are to be tested. The aggregate of all such items is universe, but it is not
possible to test every item. So in such a case, a part of the universe is taken and then
tested. Now this quantity extracted for testing is known as sample.

If we take certain number of samples and for each sample compute various
m

statistical measures such as the mean, standard deviation, etc., then we find that each sample may give its own value for the statistic under consideration. All such values of a particular statistic, say the mean, together with their relative frequencies will constitute the sampling distribution of the mean or standard deviation.

3.1.6 Sampling Distribution of Proportion


Sampling distribution of sample proportion refers to the concept that If repeated
(c

random samples of a given size n are taken from a population of values for a
categorical variable, where the proportion in the category of interest is p, then the mean
of all sample proportions (p-hat) is the population proportion (p).




The theory dictates the behavior much more precisely than saying that there is

e
less spread for larger samples as regards the spread of all sample proportions. The standard deviation of all sample proportions is inversely related to the square root of the sample size n, as shown below:

in
The standard deviation of all sample proportions (p̂) is exactly σp̂ = √( p(1 − p) / n ).

nl
Given that the sample size n appears in the square root denominator, the standard
deviation decreases as the sample size increases. Eventually, the p-hat distribution
form should be reasonably normal as long as the sample size n is sufficiently high. The
convention specifies that np and n(1 – p) should be at least 10

O
p̂ is normally distributed with a mean of μp̂ = p and a standard deviation σp̂ = √( p(1 − p) / n ), as long as np > 10 and n(1 − p) > 10.
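A small sketch of this calculation follows; the values p = 0.3 and n = 200 are assumed purely for illustration.

# Standard deviation of sample proportions (assumed p and n)
import math

p, n = 0.3, 200
print(n * p >= 10 and n * (1 - p) >= 10)   # normal-approximation conditions hold: True
sd_p_hat = math.sqrt(p * (1 - p) / n)      # sigma_p-hat = sqrt(p(1-p)/n)
print(round(sd_p_hat, 4))                  # about 0.0324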

3.1.7 Estimation – Introduction

s
Let x be a random variable with probability density function (or probability mass
function) er
f(X ; θ1 , θ2 , .... θk), where θ1 , θ2 , .... θk are the k parameters of the population.

Given a random sample x1 , x2 , ...... xn from this population, we may be interested


in estimating one or more of the k parameters θ1 , θ2 , ...... θk. In order to be specific,
v
let x be any normal variate so that its probability density function can be written as N(x :
μ, σ). We may be interested in estimating m or s or both on the basis of random sample
ni

obtained from this population.

It should be noted here that there can be several estimators of a parameter, e.g.,
we can have any of the sample mean, median, mode, geometric mean, harmonic
U

mean, etc., as an estimator of population mean μ. Similarly, s will be

s = √( (1/n) Σ(xᵢ − x̄)² )  or  s = √( (1/(n − 1)) Σ(xᵢ − x̄)² )

as an estimator of population standard deviation s. This method of estimation,


where a single statistic such as Mean, Median, Standard deviation, etc. is used as an
estimator of population parameter, is known as the Point Estimation.
m

3.1.8 Types of Estimation


Statisticians use sample statistics to estimate population parameters. For example,
sample means are used to estimate population means; sample proportions, to estimate
)A

population proportions.

An estimate of a population parameter may be expressed in two ways:

●● Point estimate. A point estimate of a population parameter is a single value


of a statistic. For example, the sample mean x is a point estimate of the
(c

population mean μ. Similarly, the sample proportion p is a point estimate of


the population proportion P.




A population parameter is denoted by θ, which is an unknown constant. The available information is in the form of a random sample x1, x2, ..., xn of size n drawn from the population. We formulate a function of the sample observations x1, x2, ..., xn. The estimator of θ is denoted by θ̂. Different random samples provide different values of the statistic θ̂. Thus θ̂ is a random variable with its own sampling probability distribution.

nl
●● Interval estimate. An interval estimate is defined by two numbers, between
which a population parameter is said to lie. For example, a < x < b is an
interval estimate of the population mean μ. It indicates that the population
mean is greater than a but less than b.

O
This range of values used to estimate a population parameter is known as an interval estimate, or estimate by a confidence interval, and is defined by two numbers between which the population parameter is expected to lie. For example, a < x̄ < b is an interval estimate of the population mean μ, indicating that the population mean is greater than a but less than b. The purpose of an interval estimate is to provide information about how close the point estimate is to the true parameter.

s
3.1.9 Using z Statistic for Estimating Population Mean
The estimation of a population mean given a random sample is a very common
er
task. If the population standard deviation (σσ) is known, the construction of a
confidence interval for the population mean (μ) is based on the normally distributed
sampling distribution of the sample means
v
The 100(1 − α)% confidence interval for μ is given by

CI: x̄ ± z*α/2 × σx̄,  where σx̄ = σ/√n
U

The value of z*α/2 corresponds to the critical value and is obtained from the
standard normal table or computed with the qnorm() function in R. The critical value is
a quantity that is related to the desired level of confidence. Typical values for z*α/2 are 1.64, 1.96, and 2.58, corresponding to confidence levels of 90%, 95% and 99%. This critical value is multiplied with the standard error, given by σx̄, in order to widen or narrow the margin of error.
This critical value is multiplied with the standard error, given by σx¯σx¯, in order to
widen or narrowing the margin of error.

The standard error (σx¯) is given by the ratio of the standard deviation of the
m

population (σ) and the square root of the sample size nn. It describes the degree to
which the computed sample statistic may be expected to differ from one sample to
another. The product of the critical value and the standard error is called the margin of
)A

error. It is the quantity that is subtracted from and added to the value of x¯ to obtain the
confidence interval for μ.
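The following sketch computes such an interval. The sample figures (x̄ = 52, σ = 8, n = 36) and the 95% level are assumed for illustration only, and scipy is used simply to obtain the critical value.

# z-based confidence interval: x-bar +/- z* . sigma/sqrt(n) (assumed sample figures)
import math
from scipy.stats import norm

x_bar, sigma, n, conf = 52.0, 8.0, 36, 0.95
z_crit = norm.ppf(1 - (1 - conf) / 2)        # critical value, about 1.96 for 95%
se = sigma / math.sqrt(n)                    # standard error of the mean
margin = z_crit * se                         # margin of error
print((round(x_bar - margin, 2), round(x_bar + margin, 2)))   # about (49.39, 54.61)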

3.1.10 Confidence Interval for Estimating Population Mean When


Population SD is Unknown
(c

A confidence interval gives an estimated range of values which is likely to include


an unknown population parameter, the estimated range being calculated from a given

Amity Directorate of Distance & Online Education


60 Statistics Management

set of sample data. The common notation for the parameter in question is θ. Often, this
Notes

e
parameter is the population mean μ, which is estimated through the sample mean X .

The level C of a confidence interval gives the probability that the interval produced

in
by the method employed includes the true value of the parameter θ.

In many situations, the value of σ is unknown, thus it is estimated with the sample
standard deviation, s; and/or the sample size is small (less than 30), and it is unsure

nl
as to where data came from a normal distribution. (In the latter case, the Central
Limit Theorem can’t be used.) In either situation, the z*-value can not be used from
the standard normal (Z-) distribution as a critical value anymore. It is essential to use a

O
larger critical value than that, because of not knowing the data quantity.

The formula for a confidence interval for one population mean in this case is

x̄ ± t*n−1 × s/√n

where t*n−1 is the critical t*-value from the t-distribution with n − 1 degrees of freedom (where n is the sample size).
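A comparable sketch for the t-based interval is given below; the sample values (x̄ = 52, s = 8, n = 20) are assumed for illustration only.

# t-based confidence interval when sigma is unknown (assumed sample figures)
import math
from scipy.stats import t

x_bar, s, n, conf = 52.0, 8.0, 20, 0.95
t_crit = t.ppf(1 - (1 - conf) / 2, df=n - 1)   # critical t* with 19 degrees of freedom, about 2.09
margin = t_crit * s / math.sqrt(n)
print((round(x_bar - margin, 2), round(x_bar + margin, 2)))

Because t* is larger than the corresponding z*, the interval comes out slightly wider, reflecting the extra uncertainty of estimating σ from the sample.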

s
Estimating population mean using t Statistic
er
A statistical examination of two population means. A two-sample t-test examines
whether two samples are different and is commonly used when the variances of two
normal distributions are unknown and when an experiment uses a small sample size.
v
Formula: t = (x̄ − μ) / (s / √n)
ni

where x̄ is the sample mean, μ is the specified value to be tested, s is the sample standard deviation and n is the size of the sample. Look up the significance level of the computed t-value in the t-distribution table.

When the standard deviation of the sample is substituted for the standard deviation
of the population, the statistic does not have a normal distribution; it has what is called
ity

the t-distribution. Because there is a different t-distribution for each sample size, it is
not practical to list a separate area of the curve table for each one. Instead, critical
t-values for common alpha levels (0.10, 0.05, 0.01, and so forth) are usually given in
a single table for a range of sample sizes. For very large samples, the t-distribution
approximates the standard normal (z) distribution. In practice, it is best to use
m

t-distributions any time the population standard deviation is not known.

Values in the t-table are not actually listed by sample size but by degrees
)A

of freedom (df). The number of degrees of freedom for a problem involving the
t-distribution for sample size n is simply n – 1 for a one-sample mean problem.

Uses of T Test
(c

Among the most frequently used t-tests are:


●● A one-sample location test of whether the mean of a normally distributed
population has a value specified in a null hypothesis.

●● A two sample location test of the null hypothesis that the means of two

e
normally distributed populations are equal.
All such tests are usually called Student’s t-tests, though strictly speaking that

in
name should only be used if the variances of the two populations are also assumed
to be equal; the form of the test used when this assumption is dropped is sometimes
called Welch’s t-test. These tests are often referred to as “unpaired” or “independent

nl
samples” t-tests, as they are typically applied when the statistical units underlying the
two samples being compared are non-overlapping.

A test of the null hypothesis that the difference between two responses measured

O
on the same statistical unit has a mean value of zero. For example, suppose we
measure the size of a cancer patient’s tumor before and after a treatment. If the
treatment is effective, we expect the tumor size for many of the patients to be smaller
following the treatment. This is often referred to as the “paired” or “repeated measures”

ity
t-test: A test of whether the slope of a regression line differs significantly from 0.

3.1.12 Confidence Interval Estimation for Population Proportion


The confidence interval (CI) for a population proportion can be used to show the

s
statistical probability that a characteristic is likely to occur within the population.
er
For example, if we wish to estimate the proportion of people with diabetes in
a population, we consider a diagnosis of diabetes as a “success” (i.e., and individual
who has the outcome of interest), and we consider lack of diagnosis of diabetes as a
“failure.” In this example, X represents the number of people with a diagnosis of diabetes
v
in the sample. The sample proportion is p̂ (called “p-hat”), and it is computed by taking the
ratio of the number of successes in the sample to the sample size, that is =
ni

P = x/n

Where x is the number of successes in the sample and n is the size of the sample
U

The formula for the confidence interval for a population proportion follows the same format as that for an estimate of a population mean. From the sampling distribution for the proportion, the standard deviation was found to be:

σp′ = √( p(1 − p) / n )

The confidence interval for a population proportion, therefore, becomes:

p = p′ ± [ z(α/2) √( p′(1 − p′) / n ) ]

z(α/2) is set according to our desired degree of confidence and √( p′(1 − p′) / n ) is the standard deviation of the sampling distribution.

The sample proportions p′ and q′ are estimates of the unknown population


)A

proportions p and q. The estimated proportions p′ and q′ are used because p and q are
not known.
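A minimal sketch of this interval is shown below; the counts (x = 90 successes out of n = 600) are assumed purely for illustration.

# Confidence interval for a population proportion: p' +/- z . sqrt(p'(1-p')/n)
import math
from scipy.stats import norm

x, n, conf = 90, 600, 0.95
p_prime = x / n                                 # sample proportion p'
z_crit = norm.ppf(1 - (1 - conf) / 2)           # about 1.96 for 95% confidence
margin = z_crit * math.sqrt(p_prime * (1 - p_prime) / n)
print(round(p_prime, 3), (round(p_prime - margin, 3), round(p_prime + margin, 3)))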

Key Terms
(c

●● Sample: A sample is that part of the universe which we select for the purpose
of investigation. A sample exhibits the characteristics of the universe. The word
sample literally means small universe.




●● Sampling: Sampling is defined as the selection of some part of an aggregate


Notes

e
or totality on the basis of which a judgment or inference about the aggregate or
totality is made. Sampling is the process of learning about the population on the

in
basis of a sample drawn from it.
●● Stratified random sampling: Stratified random sampling requires the separation
of defined target population into different groups called strata and the selection of

nl
sample from each stratum.
●● Cluster sampling: Cluster sampling is a probability sampling method in which
the sampling units are divided into mutually exclusive and collectively exhaustive

O
subpopulation called clusters.
●● Confidence interval: (CI) for a population proportion can be used to show the
statistical probability that a characteristic is likely to occur within the population.

ity
●● Point estimate. A point estimate of a population parameter is a single value of a
statistic
●● Interval estimate. An interval estimate is defined by two numbers, between which
a population parameter is said to lie.

s
Check your progress er
1. _____ states that the distribution of sample means approximates a normal
distribution as the sample size gets larger.
a) Probability
v
b) Central Limit Theorem
c) Z test
ni

d) Sampling Theorem
2. ____ error is a statistical term referring to an error resulting from data collection,
U

which causes the data to differ from the true values


a) Sampling
b) Non - sampling
ity

c) Probability
d) Central
3. Sampling method in which a set of respondents are chosen who help the
m

researcher to identify additional respondents to be included in the study is ?


a) Quota Sampling
b) Judgment Sampling
)A

c) Snowball Sampling
d) Convenience Sampling
4. Value used to measure distance between mean and random variable x in terms of
(c

standard deviation is -
a) Z-value




b) Variance
Notes

e
c) Probability of x
d) Density function of x

in
5. Test is applied when samples are less than 30.
a) T

nl
b) Z
c) Rank

O
d) None of these

Questions and Exercises


1. What is sampling? Explain the features of sampling

ity
2. Differentiate between sampling and non-sampling.
3. Explain any five types of sampling techniques
4. What do you mean by t-test and z test?

s
5. Explain Confidence interval estimation for population proportion
er
Check your progress:
1. b) Central Limit Theorem
2. b) Non - sampling
v
3. c) Snowball Sampling
ni

4. a) Z-value
5. a) T
U

Further Readings
4. Richard I. Levin, David S. Rubin, Sanjay Rastogi Masood Husain Siddiqui,
Statistics for Management, Pearson Education, 7th Edition, 2016.
ity

5. Prem.S.Mann, Introductory Statistics, 7th Edition, Wiley India, 2016.


6. Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani, An
Introduction to Statistical Learning with Applications in R, Springer, 2016.
m

Bibliography
13. Srivastava V. K. etal – Quantitative Techniques for Managerial Decision Making,
Wiley Eastern Ltd
)A

14. Richard, I.Levin and Charles A.Kirkpatrick – Quantitative Approaches to Management,


McGraw Hill, Kogakusha Ltd.
15. Prem.S.Mann, Introductory Statistics, 7th Edition, Wiley India, 2016.
16. Budnik, Frank S Dennis Mcleaavey, Richard Mojena – Principles of Operation
(c

Research - AIT BS New Delhi.




17. Sharma J K – Operation Research- theory and applications-Mc Millan,New Delhi


Notes

e
18. Kalavathy S. – Operation Research – Vikas Pub Co
19. Gould F J – Introduction to Management Science – Englewood Cliffs N J Prentice

in
Hall.
20. Naray J K, Operation Research, theory and applications – Mc Millan, New Dehi.

nl
21. Taha Hamdy, Operations Research, Prentice Hall of India
22. Tulasian: Quantitative Techniques: Pearson Ed.
23. Vohr.N.D. Quantitative Techniques in Management, TMH

O
24. Stevenson W.D, Introduction to Management Science, TMH




Module-4: Concepts of Hypothesis Testing


Notes

e
Learning Objective:

in
●● To get introduced with the concept of hypothesis testing and learn parametric and
non-parametric

nl
Learning Outcome:
At the end of the course, the learners will be able to –

O
●● Perform Test of Hypothesis as well as calculate confidence interval for a
population parameter for single sample and two sample cases.

4.1.1 Hypothesis Testing - Introduction

ity
Hypothesis test is a method of making decisions using data from a scientific study.
In statistics, a result is called statistically significant if it is unlikely
to have occurred by chance alone, according to a pre-determined threshold probability,
the significance level. The phrase “test of significance” was coined by statistician

s
Ronald Fisher. These tests are used in determining what outcomes of a study would
lead to a rejection of the null hypothesis for a pre-specified level of significance;
er
this can help to decide whether results contain enough information to cast doubt on
conventional wisdom, given that conventional wisdom has been used to establish
the null hypothesis. The critical region of a hypothesis test is the set of all outcomes
v
which cause the null hypothesis to be rejected in favor of the alternative hypothesis.
Statistical hypothesis testing is sometimes called confirmatory data analysis, in contrast
ni

to exploratory data analysis, which may not have pre-specified hypotheses. Statistical
hypothesis testing is a key technique of frequentist inference.

Characteristics of Hypothesis
U

The important characteristics of Hypothesis are as follows:

Hypothesis must be conceptually clear


ity

The concepts used in the hypothesis should be clearly defined, operationally


if possible. Such definitions should be commonly accepted and easily communicable
among the research scholars.
m

Hypothesis should have empirical referents


The variables contained in the hypothesis should be empirical realities. In case
these are not empirical realities then it will not be possible to make the observations.
)A

Being handicapped by the data collection, it may not be possible to test the hypothesis.
Watch for words like ought, should, bad.

Hypothesis must be specific


(c

The hypothesis should not only be specific to a place and situation but also
these should be narrowed down with respect to its operation. Let there be no global




use of concepts whereby the researcher is using such a broad concept which may
Notes

e
all inclusive and may not be able to tell anything. For example somebody may try to
propose the relationship between urbanization and family size. Yes urbanization

in
influences in declining the size of families. But urbanization is such comprehensive
variable which hide the operation of so many other factor which emerge as part of the
urbanization process. These factors could be the rise in education levels, women’s
levels of education, women empowerment, emergence of dual earner families, decline

nl
in patriarchy, accessibility to health services, role of mass media, and could be more.
Therefore the global use of the word `urbanization’ may not tell much. Hence it is
suggested to that the hypothesis should be specific.

O
Hypothesis should be related to available techniques of research
Hypothesis may have empirical reality; still we are looking for tools and techniques

ity
that could be used for the collection of data. If the techniques are not there then the
researcher is handicapped. Therefore, either the techniques are already available or the
researcher is in a position to develop suitable techniques for the study.

Hypothesis should be related to a body of theory

s
Hypothesis has to be supported by theoretical argumentation. For this purpose the
research may develop his/her theoretical framework which could help in the generation
er
of relevant hypothesis. For the development of a framework the researcher shall
depend on the existing body of knowledge. In such an effort a connection between
the study in hand and the existing body of knowledge can be established. That is how
v
the study could benefit from the existing knowledge and later on through testing the
hypothesis could contribute to the reservoir of knowledge.
ni

Hypothesis testing procedure


Hypothesis testing refers to the formal procedures used by statisticians to accept
U

or reject statistical hypotheses. It is an assumption about a population parameter. This


assumption may or may not be true.

The best way to determine whether a statistical hypothesis is true would be to


ity

examine the entire population. Since that is often impractical, researchers typically
examine a random sample from the population. If sample data are not consistent with
the statistical hypothesis, the hypothesis is rejected.

In doing so, one has to take the help of certain assumptions or hypothetical values
m

about the characteristics of the population if some such information is available. Such
hypothesis about the population is termed as statistical hypothesis and the hypothesis
is tested on the basis of sample values. The procedure enables one to decide on a
certain hypothesis and test its significance. “A claim or hypothesis about the population
)A

parameters is known as Null Hypothesis and is written as, H 0 .”

This hypothesis is then tested with available evidence and a decision is made
whether to accept this hypothesis or reject it. If this hypothesis is rejected, then we
accept the alternate hypothesis. This hypothesis is written as H1. For testing hypothesis
(c

or test of significance we use both parametric tests and nonparametric or distribution


free tests. Parametric tests assume within properties of the population, from which we
draw samples. Such assumptions may be about population parameters, sample size

etc. In case of non-parametric tests, we do not make such assumptions. Here we


Notes

e
assume only nominal or ordinal data.

in
4.1.2 Developing Null and Alternate Hypothesis

Null Hypothesis

nl
It is used for testing the hypothesis formulated by the researcher. Researchers
treat evidence that supports a hypothesis differently from the evidence that opposes
it. They give negative evidence more importance than to the positive one. It is because

O
the negative evidence tarnishes the hypothesis. It shows that the predictions made by
the hypothesis are wrong. The null hypothesis simply states that there is no relationship
between the variables or the relationship between the variables is “zero.” That is
how symbolically null hypothesis is denoted as “H0”. For example: H0 = There is no

ity
relationship between the level of job commitment and the level of efficiency.

Or H0 = The relationship between level of job commitment and the level of


efficiency is zero. Or The two variables are independent of each other. It does not take
into consideration the direction of association

s
(i.e. H0 is non directional), which may be a second step in testing the hypothesis.
First we look whether or not there is an association then we go for the direction of
er
association and the strength of association. Experts recommend that we test our
hypothesis indirectly by testing the null hypothesis. In case we have any credibility in
our hypothesis then the research data should reject the null hypothesis. Rejection of the
v
null hypothesis leads to the acceptance of the alternative hypothesis.
ni

Alternative Hypothesis
The alternative (to the null) hypothesis simply states that there is a relationship
between the variables under study. In our example it could be: there is a relationship
U

between the level of job commitment and the level of efficiency. Not only there is an
association between the two variables under study but also the relationship is perfect
which is indicated by the number “1”. Thereby the alternative hypothesis is symbolically
denoted as “ H1”. It can be written like this: H1: There is a relationship between the level
ity

of job commitment of the officers and their level of efficiency.

4.1.3 Type I Error and Type II Error


A statistically significant result cannot prove that a research hypothesis is correct
m

(as this implies 100% certainty). Because a p-value is based on probabilities, there is
always a chance of making an incorrect conclusion regarding accepting or rejecting the
null hypothesis (H0).
)A

Anytime we make a decision using statistics there are four possible outcomes, with
two representing correct decisions and two representing errors.

Type 1 error
(c

A type 1 error is also known as a false positive and occurs when a researcher
incorrectly rejects a true null hypothesis. This means that your report that your findings
are significant when in fact they have occurred by chance.

●● The probability of making a type I error is represented by your alpha level (α),
Notes

e
which is the p-value below which you reject the null hypothesis. A p-value of
0.05 indicates that you are willing to accept a 5% chance that you are wrong

in
when you reject the null hypothesis.
●● The risk of committing a type I error can be reduced by using a lower value
for p. For example, a p-value of 0.01 would mean there is a 1% chance of

nl
committing a Type I error.
●● However, using a lower value for alpha means that you will be less likely to
detect a true difference if one really exists (thus risking a type II error).

O
Type 2 error
A type II error is also known as a false negative and occurs when a researcher fails
to reject a null hypothesis which is really false. Here a researcher concludes there is not

ity
a significant effect, when actually there really is.

The probability of making a type II error is called Beta (β), and this is related to the
power of the statistical test (power = 1- β). The risk of committing a type II error can be
decreased by ensuring that the test has enough power.

s
4.1.4 Level of Significance and Critical Region
er
Level of Significance
●● The level of significance often referred to as alpha or α, is a measure of
v
the strength of the evidence to be present in your sample before the null
hypothesis is rejected and it is concluded that the effect is statistically
ni

significant. Before performing the experiment the researcher decides the


degree of significance.
●● The significance level is the probability of rejecting the null hypothesis when
U

it is true. For example, a significance level of 0.06 indicates a 6% risk of


concluding that a difference exists when there is no actual difference. Lower
significance levels indicate that stronger evidence is required before the null
hypothesis is rejected.
ity

●● The significance levels are used during hypothesis testing to help in the
determination of which hypothesis the data supports and are comparing the
p-value with significance level. If the p-value is less than the significance
level, then the null hypothesis can be rejected and concluded that the effect
m

is statistically significant. In other words, the evidence in the sample is strong


enough to be able to reject the null hypothesis at the population level.

Critical Region
)A

A critical region, also known as the Region of Rejection, is a set of test statistic
values for which the null hypothesis is rejected. That is to say, if the test statistics
observed are in the critical region then we reject the null hypothesis and accept the
alternative hypothesis. The critical region defines how far away our sample statistic
(c

must be from the null hypothesis value before we can say it is unusual enough to reject
the null hypothesis.




The “best” critical region is one where the likelihood of making a Type I or Type II
Notes

e
error is minimised. In other words, the uniformly most powerful rejection region is the
region where the smallest chance of making a Type I or II error is present. It is also the

in
region that provides the largest (or equally greatest) power function for a UMP test.

4.1.5 Standard Error

nl
A statistic’s standard error is the standard deviation from its sampling distribution,
or an estimate of that standard deviation. If the mean is the parameter or the statistic it
is called the mean standard error. It is defined as –

O
SE = σ / √n
Where,

ity
SE is Standard error of the sample
N is the number of samples and
σ Is the sample standard deviation.

Standard error increases when standard deviation, i.e. the variance of the

s
population, increases. Standard error decreases when sample size increases – as the
sample size gets closer to the true size of the population, the sample means cluster
more and more around the true population mean.
er
The standard error tells how accurate the mean is likely to be compared with the
true population of any given sample from that population. By increasing the standard
v
error, i.e. the means are more spread out; it becomes more likely that any given mean
is an inaccurate representation of the true mean population.
ni

4.1.6 Confidence Interval


A Confidence Interval is a range of values where the true value lies in. It is a type
U

of estimate computed from the statistics of the observed data. This proposes a range of
plausible values for an unknown parameter (for example, the mean). The interval has
an associated confidence level that the true parameter is in the proposed range.
ity

●● Given observations and a confidence level a valid confidence interval


has a probability of containing the true underlying parameter. The level of
confidence can be chosen by the investigator. In general terms, a confidence
interval for an unknown parameter is based on sampling the distribution of a
corresponding estimator. The confidence level here represents the frequency
m

(i.e. the proportion) of possible confidence intervals that contain the true value
of the unknown population parameter.
●● In other words, if confidence intervals are constructed using a given
)A

confidence level from an infinite number of independent sample statistics, the


proportion of those intervals that contain the true value of the parameter will
be equal to the confidence level.
For example, if the confidence level is 90% then in a hypothetical indefinite data
(c

collection, in 90% of the samples the interval estimate will contain the population
parameter. The confidence level is designated before examining the data. Most




commonly, a 95% confidence level is used. However, confidence levels of 90% and
Notes

e
99% are also often used in analysis.

Factors affecting the width of the confidence interval include the size of the sample,

in
the confidence level, and the variability in the sample. A larger sample will tend to
produce a better estimate of the population parameter, when all other factors are equal.
A higher confidence level will tend to produce a broader confidence interval.

nl
4.2.1 For Single Population Mean Using t-statistic
When s is not known, we use its estimate computed from the given sample. Here,

O
the nature of the sampling distribution of X would depend upon sample size n. There
are the following two possibilities:

If parent population is normal and n < 30 (popularly known as small sample case),

ity
use t - test. The
unbiased estimate of σ in this case is given by s = √( Σ(xᵢ − x̄)² / (n − 1) ).

If n ≥ 30 (large sample case), use standard normal test. The unbiased estimate of

s
s in this case can be taken as s = √( Σ(xᵢ − x̄)² / n ), since the difference between n and n − 1
is negligible for large values of n. Note that the parent population may or may not be

normal in this case.


v
Application
ni

Statisticians use tα to represent the t statistic that has a cumulative probability of


(1 - α). For example, suppose we were interested in the t statistic having a cumulative
probability of 0.95. In this example, α would be equal to (1 - 0.95) or 0.05. We would
U

refer to the t statistic as t0.05

Of course, the value of t0.05 depends on the number of degrees of freedom. For
example, with 2 degrees of freedom, t0.05 is equal to 2.92; but with 20 degrees of
ity

freedom, t0.05 is equal to 1.725.

Example:

ABC Corporation manufactures light bulbs. The CEO claims that an average
Acme light bulb lasts 300 days. A researcher randomly selects 15 bulbs for testing. The
m

sampled bulbs last an average of 290 days, with a standard deviation of 50 days. If the
CEO’s claim were true, what is the probability that 15 randomly selected bulbs would
have an average life of no more than 290 days?
)A

Note: Solution is the traditional approach and requires the computation of the t
statistic, based on data presented in the problem description. Then, the T distribution
calculator is to be used to find the probability.

Solution:
(c

Computing the t statistic, based on the following equation:




t = (x̄ − μ) / (s / √n)
t = (290 − 300) / (50 / √15)
t = −10 / 12.909945 = −0.7745966

in
where x is the sample mean, μ is the population mean, s is the standard deviation
of the sample, and n is the sample size.

●● The degrees of freedom are equal to 15 - 1 = 14.

nl
●● The t statistic is equal to - 0.7745966.
The calculator displays the cumulative probability: 0.226. Hence, if the true bulb

O
life were 300 days, there is a 22.6% chance that the average bulb life for 15 randomly
selected bulbs would be less than or equal to 290 days.
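The same figures can be checked with a short script; this is only a verification sketch of the calculation above, using scipy's t-distribution.

# Verification sketch for the light-bulb example
import math
from scipy.stats import t

x_bar, mu0, s, n = 290, 300, 50, 15
t_stat = (x_bar - mu0) / (s / math.sqrt(n))   # about -0.7746
p_lower = t.cdf(t_stat, df=n - 1)             # P(T <= t) with 14 degrees of freedom
print(round(t_stat, 4), round(p_lower, 3))    # about -0.7746 and 0.226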

4.2.2 For Single Population Mean Using z-statistic

ity
A z-test is a statistical test that is used to determine if means of population differ
when the variances are known and the sample size is large. It is assumed that the
test statistics have a normal distribution, and nuisance parameters such as standard
deviation should be known in order to perform an accurate z-test.

s
It is useful to standardized the values of a normal distribution by converting them
into z-scores as -
er
(a) It allows the researchers to calculate the probability of a score occurring within a
standard normal distribution;
v
(b) It enables the comparison of two scores that are from different samples (which may
have different means and standard deviations).
ni

●● A z-test is a statistical test to determine whether two population means are


different when the variances are known and the sample size is large.
●● It can be used to test hypotheses in which the z-test follows a normal distribution.
U

●● A z-statistic, or z-score, is a number representing the result from the z-test.


●● Z-tests are closely related to t-tests, but t-tests are best performed when an
experiment has a small sample size.
ity

●● Also, t-tests assume the standard deviation is unknown, while z-tests assume
it is known.

Application
m

The conditions for a z test are:

●● The distribution of the population is Normal


●● The sample size is large n>30.
)A

If at least one of conditions are satisfied, then.

Z = (x̄ – µ) / (σ / √n)
Where, x is the sample mean,
u is the population mean
(c

σ is the population standard deviation and


n is the sample size




Example:
Notes

e
The mean length of the lumber is supposed to be 8.5 feet. A builder wants to check
whether the shipment of lumber she receives has a mean length different from 8.5 feet.

in
If the builder observes that the sample mean of 61 pieces of lumber is 8.3 feet with a
sample standard deviation of 1.2 feet. What will she conclude? Is 8.3 very different from
8.5?

nl
Solution:

Whether the value is different or not depends on the standard deviation of x

O
Thus,
Z = x – µ / σ / √n
= 8.3 -8.5 / 1.2 √ 61

ity
= - 1.3

Thus, the question is whether −1.3 is very far from zero, since zero corresponds to the case when x̄ equals μ0. If it is far away, the null hypothesis is unlikely to be valid and is rejected; otherwise the null hypothesis cannot be discarded.

s
4.2.3 Hypothesis Testing for Population Proportion.
er
Using independent samples means that there is no relationship between the
groups. The values in one sample have no association with the values in the other
sample. These populations are not related, and the samples are independent. We look
v
at the difference of the independent means.
ni

As with comparing two population proportions, when we compare two population


means from independent populations, the interest is in the difference of the two means.
In other words, if μ1 is the population mean from population 1 and μ2 is the population
mean from population 2, then the difference is μ1−μ2.
U

It is important to be able to distinguish between an independent sample and a


dependent sample.
ity

Independent sample
The samples from two populations are independent if the samples selected from
one of the populations have no relationship with the samples selected from the other
population.
m

Dependent sample
The samples are dependent if each measurement in one sample is matched or
)A

paired with a particular measurement in the other sample. Another way to consider this
is how many measurements are taken off of each subject. If only one measurement,
then independent; if two measurements, then paired. Exceptions are in familial
situations such as in a study of spouses or twins. In such cases, the data is almost
always treated as paired data.
(c




Example - Compare the time that males and females spend watching TV.
Notes

e
a. We randomly select 15 men and 15 women and compare the average time they
spend watching TV. Is this an independent sample or paired sample?

in
b. We randomly select 15 couples and compare the time the husbands and wives
spend watching TV. Is this an independent sample or paired sample?

nl
a. Independent Sample
b. Paired sample

O
Application
The null hypothesis to be tested is H0: π = π0 against Ha: π ≠ π0 for a two tailed test
and π > or < π0 for a one tailed test. The test statistic is

ity
z_cal = (p − π0) / √( π0(1 − π0) / n ) = (p − π0) · √( n / (π0(1 − π0)) )

s
Example :

A wholesaler in apples claims that only 4% of the apples supplied by him are
er
defective. A random sample of 600 apples contained 36 defective apples. Test the claim
of the wholesaler.

Solution.
v
We have to test H0 : π ≤ 0.04 against Ha : π > 0.04.
ni

It is given that p = 36/ 600 = 0.06 and n = 600.

600
zcal =
(0.06−0.04) 2.5
=
U

0.04 x0.96

This value is highly significant in comparison to 1.645, therefore, H0 is rejected at


5% level of significance.
ity

Example:

470 tails were obtained in 1,000 throws of an unbiased coin. Can the difference
between the proportion of tails in sample and their proportion in population be regarded
m

as due to fluctuations of sampling?

Solution:

We have to test H0 : π = 0.5 against Ha : π ≠ 0.5.


)A

It is given that p = 470/1000 = 0.47 and n = 1000.

The computed value is z_cal = (0.47 − 0.50) / √( 0.5 × 0.5 / 1000 ) ≈ −1.90. Since |z_cal| is less than 1.96, the coin can be regarded as fair and thus the difference between the sample and population proportions of tails is only due to fluctuations of sampling.
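The coin example can be verified with the short sketch below; it simply repeats the calculation above.

# Verification sketch for the coin example (one-sample proportion z-test)
import math
from scipy.stats import norm

pi0, n, tails = 0.5, 1000, 470
p_hat = tails / n
z_cal = (p_hat - pi0) / math.sqrt(pi0 * (1 - pi0) / n)   # about -1.90
p_two_sided = 2 * norm.cdf(-abs(z_cal))                  # about 0.058
print(round(z_cal, 2), round(p_two_sided, 3))            # |z| < 1.96, so H0 is not rejected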
(c




4.3.1 Inference about the Difference Between two Population Means


Notes

e
a. When the population standard deviation is known

in
This test is applicable when the random sample X1 , X2 , ...... Xn is drawn from a
normal population.

We can write H0 : µ = µ0 (specified) against Ha : µ ≠ µ0 (two tailed test)

nl
The test statistic (X̄ − µ)/(σ/√n) ~ N(0, 1). Let the value of this statistic calculated from the sample be denoted as z_cal = (X̄ − µ)/(σ/√n). The decision rule would be:
σ/ n
Reject H0 at 5% (say) level of significance if zcal > 1.96. Otherwise, there is no

ity
evidence against H0 at 5% level of significance.

Example –

A company claims that the average mileage of bikes of his company is 40 km/l.

s
A random sample of 20 bikes of the company showed an average mileage of 42 km/l.
Test the claim of the manufacturer on the assumption that the mileage of scooter is
er
normally distributed with a standard deviation of 2 km/l.

Here, we have to test H0 : µ = 40 against Ha : µ ≠ 40.

z_cal = (X̄ − µ) / (σ/√n) = (42 − 40) / (2/√20) = 4.47

Since z_cal > 1.96, H0 is rejected at 5% level of significance.

b. When the population standard deviation is unknown


U

When s is not known, we use its estimate computed from the given sample. Here,
the nature of the sampling distribution of X would depend upon sample size n. There
are the following two possibilities:
ity

If parent population is normal and n < 30 (popularly known as small sample case),
use t – test. Also, like normal test, the hypothesis may be one or two tailed

If n ≥ 30 (large sample case), use standard normal test. Since the difference
between n and n - 1 is negligible for large values of n. Note that the parent population
m

may or may not be normal in this case.

Example:
)A

Daily sales figures of 40 shopkeepers showed that their average sales and
standard deviation were Rs 528 and Rs 600 respectively. Is the assertion that daily
sales on the average is Rs 400, contradicted at 5% level of significance by the sample?

Solution:
(c

Since n > 30, standard normal test is applicable. It is given that n = 40, X = 528 and
S = 600.




We have to test H0 : µ = 400 against Ha : µ ≠ 400.


Notes

e
z_cal = (528 − 400) / (600/√40) = 1.35

in
Since this value is less than 1.96, there is no evidence against H0 at 5% level of
significance. Hence, the given assertion is not contradicted by the sample.

nl
4.3.2 Inference about the Difference Between two Population
Proportions

O
A test of two population proportions is very similar to a test of two means, except
that the parameter of interest is now “p” instead of “µ”.

With a one-sample proportion test, p̂ = x/n is used as the point estimate of p.

ity
It is expected that p̂ will be close to p. With a test of two proportions, we will
have two p̂ ’s, and we expect that (p̂ 1 – p̂ 2) will be close to (p1 – p2). The test statistic
accounts for both samples.

●● With a one-sample proportion test, the test statistic is

s

p− p
z=
p (1 − p )
er
n
v
and it has an approximate standard normal distribution.

●● For a two-sample proportion test, we would expect the test statistic to be of the form z = (p̂1 − p̂2) / √( p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2 ).


ni

HOWEVER, the null hypothesis will be that p1 = p2. Because the H0 is assumed
to be true, the test assumes that p1 = p2. We can then assume that p1 = p2 equals p, a
common population proportion. We must compute a pooled estimate of p (its unknown)
U

using our sample data.

Application
ity

When we have a categorical variable of interest measured in two populations, it is


quite often that we are interested in comparing the proportions of a certain category for
the two populations.

Men and Women were asked about what they would do if they received a $100 bill
m

by mail, addressed to their neighbor, but wrongly delivered to them. Would they return
it to their neighbour? Of the 69 males sampled, 52 said “yes” and of the 131 females
sampled, 120 said “yes.”
)A

Does the data indicate that the proportions that said “yes” are different for male and
female?

If the proportion of males who said “yes, they would return it” is denoted as p1 and
the proportion of females who said “yes, they would return it” is denoted as p2, thus p1
(c

= p2

p1 – p2 = 0 or p1/p2 = 1




It is required to develop a confidence interval or perform a hypothesis test for one


Notes

e
of these expressions.

Thus,

in
Men: n1 = 69 p1 = 52/69

Women = n2 = 131 p2 = 120/131

nl
Using the formula –
   
  p1 (1 − p1 ) p2 (1 − p2 )
p1 − p2 ± zα /2 +
n1 n2

O
52  52  120  120 
1 −  1 − 
52 120 69  69  131  131
− ± 1.96 +
69 131 69 131

ity
−0.1624 ± 1.96(0.05725)
−0.1624 ± 0.1122, i.e., the interval (−0.2746, −0.0502)

We are 95% confident that the difference of population proportions of men who
said “yes” and women who said “yes” is between -0.2746 and -0.0502.

s
Based on both ends of the interval being negative, it seems like the proportion of
females who would return it is higher than the proportion of males who would return it.
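The interval can be reproduced with the sketch below; it simply repeats the calculation shown above.

# Verification sketch: 95% confidence interval for the difference of two proportions
import math
from scipy.stats import norm

x1, n1 = 52, 69        # men who said "yes"
x2, n2 = 120, 131      # women who said "yes"
p1, p2 = x1 / n1, x2 / n2
se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
margin = norm.ppf(0.975) * se                      # 1.96 x standard error, about 0.1122
diff = p1 - p2                                     # about -0.1624
print((round(diff - margin, 4), round(diff + margin, 4)))   # about (-0.2746, -0.0502)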
er
4.3.3 Independent Samples and Matched Samples
Matched samples also called as matched pairs, paired samples or dependent
v
samples are paired such that all characteristics except the one under review are shared
by the participants. A “participant” is a member of the sample, and can be a person,
ni

object or thing. Matched pairs are widely used to assign one person to a treatment
group and another to a control group. This method , called matching, is used in the
design of matched pairs. The “pairs” should not be different persons, at different times
U

they can be the same individuals.

●● The same study participants are measured before and after an intervention.
●● The same study participants are measured twice for two different
ity

interventions.
An independent sample is the opposite of a matched sample which deals with
unrelated classes.

Although matching pairs are intentionally selected, individual samples are typically
m

selected at random (through simple random sampling or a similar technique)

4.3.4 Inference about the Ratio of two Population Variances


)A

One of the essential steps of a test to compare two population variances is for
checking the equal variances assumption if you want to use the pooled variances. Many
people use this test as a guide to see if there are any clear violations, much like using
the rule of thumb.
(c

An F-test is used to test if the variances of two populations are equal. This test can
be a two-tailed test or a one-tailed test.




The two-tailed version tests that the variances are not equal against the alternative.
Notes

e
The one-tailed version tests only in one direction, that is, the variance from the
first population is either greater or less than (but not both) the second variance in

in
population. The problem determines the choice. If we are testing a new process , for
example, we might only be interested in knowing if the new process is less variable
than the old process.

nl
Application:
To compare the variances of two quantitative variables, the hypotheses of interest
are:

O
Null hypothesis: H0 : σ1²/σ2² = 1
Alternative (two-tailed): Ha : σ1²/σ2² ≠ 1
Alternative (one-tailed): Ha : σ1²/σ2² > 1  or  Ha : σ1²/σ2² < 1
s
σ 22
Example:
er
Suppose randomly 7 women are selected from a population of women, and 12
men from a population of men. The table below shows the standard deviation in each
sample and in each population. Compute the f statistic.
v
Population Population standard deviation Sample standard deviation
ni

Women 30 35

Men 50 45
U

Solution:

The f statistic can be computed from the population and sample standard
ity

deviations, using the following equation: f = [ s1 2/ σ1, 2 ] / [ s2 2/ σ2, 2 ]

where σ 1 is the standard deviation of population 1, s1 is the standard deviation of


the sample drawn from population 1, σ 2 is the standard deviation of population 2, and s
1 is the standard deviation of the sample drawn from population 2.
m

f = (35² / 30²) / (45² / 50²)

= (1225 / 900) / (2025 / 2500)


)A

= 1.361 / 0.81

= 1.68

For this calculation, the numerator degrees of freedom v1 are 7 - 1 or 6; and the
(c

denominator degrees of freedom v2 are 12 - 1 or 11. On the other hand, if the men’s
data appears in the numerator, we can calculate an f statistic as follows:




f = (45² / 50²) / (35² / 30²)


Notes

e
= (2025 / 2500) / (1225 / 900)

in
= 0.81 / 1.361

= 0.595

For this calculation, the numerator degrees of freedom v1 are 12 – 1 or 11; and

nl
the denominator degrees of freedom v2 are 7 – 1 or 6. When you are trying to find the
cumulative probability associated with an f statistic, you need to know v1 and v2.
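The two ratios above can be reproduced with the following sketch; scipy's f distribution is used only to look up the cumulative probability once v1 and v2 are known.

# F statistic for the women-versus-men example
from scipy.stats import f

s1, sigma1, n1 = 35, 30, 7     # women: sample sd, population sd, sample size
s2, sigma2, n2 = 45, 50, 12    # men: sample sd, population sd, sample size

f_stat = (s1**2 / sigma1**2) / (s2**2 / sigma2**2)   # about 1.68
v1, v2 = n1 - 1, n2 - 1                              # degrees of freedom 6 and 11
print(round(f_stat, 2), v1, v2, round(f.cdf(f_stat, v1, v2), 3))   # cumulative probability of f_stat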

O
Assumptions
Several assumptions are made for the test. Your population must be approximately
normally distributed (i.e. fit the shape of a bell curve) in order to use the test. Plus, the

ity
samples must be independent events. In addition, you’ll want to bear in mind a few
important points:

●● The larger variance should always go in the numerator (the top number) to
force the test into a right-tailed test. Right-tailed tests are easier to calculate.

s
●● For two-tailed tests, divide alpha by 2 before finding the right critical value.
●● If you are given standard deviations, they must be squared to get the
variances.
er
●● If your degrees of freedom aren’t listed in the F Table, use the larger critical
value. This helps to avoid the possibility of Type I errors.
v
4.4.1 Analysis of Variance
ni

Variance is defined as the average of squared deviation of data points from their
mean.

When the data constitute a sample, the variance is denoted byσ2x and averaging
U

is done by dividing the sum of the squared deviation from the mean by ‘n – 1’. When
observations constitute the population, the variance is denoted by σ2 and we divide by
N for the average
ity

Different formulas for calculating variance:


n
Sample Variance: Var(X) = σx² = Σⁿᵢ₌₁ (xᵢ − X̄)² / (n − 1)

Population Variance: Var(X) = Σ (xᵢ − µ)² / N

Where,
)A

Xi for i = 1, 2, ..., n are observations values.

X = Sample mean

n = Sample size.
(c

µ = Population mean




N = Population size
Notes

e
Population Variance is,
Var(x) = σ² = Σ (xᵢ − µ)² / N
= [ Σ xᵢ² − 2µ Σ xᵢ + µ² Σ 1 ] / N
= Σ xᵢ² / N − µ²

Var(x) = E(X²) − [E(X)]²
Var (x) = E(X 2 )-[E(X)]2

4.5.1 Chi Square Test

ity
It is the test that uses the chi-square statistic to test the fit between a theoretical frequency distribution and a frequency distribution of observed data for which each observation may fall into one of several classes.

Formula of the Chi-square test:

χ² = ∑ (O – E)² / E

where O is an observed frequency and E is the corresponding expected frequency. The calculated value is compared with the table value of χ² for the given degrees of freedom (d.f.) and significance level α:

If χ² calculated < χ² table, accept H0.


Conditions of Chi-square Test

A chi-square test can be used when the data satisfies four conditions:

●● There must be two observed sets of data, or one observed set of data and one expected set of data (generally, there are n rows and c columns of data).
●● The two sets of data must be based on the same sample size.
●● Each cell in the data contains an observed or expected count of five or larger.
●● The different cells in a row or column must represent categorical variables (e.g. male, female; less than 25 years of age, 25 to 40 years of age, older than 40 years of age, etc.).

Application areas of Chi-square Test



The chi-square distribution is a continuous distribution with only positive values. It is skewed to the right with a long right tail, approaching the shape of a normal distribution only for large degrees of freedom. It has the following applications:

●● To test whether the differences among various sample proportions are significant or can be attributed to chance.
●● To test the independence of two variables in a contingency table.
●● To use it as a test of goodness of fit.
Example 1:
The operations manager of a company that manufactures tires wants to determine
whether there are any differences in the quality of work among the three daily shifts.

She randomly selects 496 tires and carefully inspects them. Each tire is either classified
as perfect, satisfactory, or defective, and the shift that produced it is also recorded. The
two categorical variables of interest are shift and condition of the tire produced. The

data can be summarized by the accompanying two-way table. Does the data provide
sufficient evidence at the 5% significance level to infer that there are differences in
quality among the three shifts?

Perfect Satisfactory Defective Total

Shift 1 106 124 1 231

Shift 2 67 85 1 153

Shift 3 37 72 3 112

Total 210 281 5 496

Solution:

Observed counts with expected counts shown in parentheses:

Perfect Satisfactory Defective Total

Shift 1 106 (97.80) 124 (130.87) 1 (2.33) 231

Shift 2 67 (64.78) 85 (86.68) 1 (1.54) 153

Shift 3 37 (47.42) 72 (63.45) 3 (1.13) 112

Total 210 281 5 496

Chi-Sq = 8.647 DF = 4, P-Value = 0.071

There are 3 cells with expected counts less than 5.0.

In the above example, there are no significant results at a 5% significance level


m

since the p-value (0.071) is greater than 0.05. Even if we did have a significant result,
we still could not trust the result, because there are 3 (33.3% of) cells with expected
counts < 5.0
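The whole test can be reproduced in a few lines. The sketch below assumes Python with SciPy; chi2_contingency returns the test statistic, the p-value, the degrees of freedom and the expected counts for the observed two-way table:

    import numpy as np
    from scipy.stats import chi2_contingency

    observed = np.array([[106, 124, 1],
                         [ 67,  85, 1],
                         [ 37,  72, 3]])   # rows: shifts; columns: perfect, satisfactory, defective

    chi2, p_value, dof, expected = chi2_contingency(observed)
    print(round(chi2, 3), dof, round(p_value, 3))   # approximately 8.647, 4, 0.071
    print(expected.round(2))                        # matches the expected counts shown above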

Example 2

A food services manager for a baseball park wants to know if there is a relationship
between gender (male or female) and the preferred condiment on a hot dog. The
following table summarizes the results. Test the hypothesis with a significance level of
10%.




Ketchup Mustard Relish Total

Male 15 23 10 48

Female 25 19 8 52

Total 40 42 18 100

Solution:

The hypotheses are:

●● H0: Gender and condiments are independent
●● Ha: Gender and condiments are not independent

Ketchup Mustard Relish Total

Male 15 ( 19.2) 23 ( 20.16) 10 ( 8.64) 48

Female 25 ( 20.8) 19 ( 21.84) 8 ( 9.36) 52

Total 40 42 18 100
er
None of the expected counts in the table are less than 5. Therefore, we can
proceed with the Chi-square test. The test statistic is
χ²* = (15 − 19.2)²/19.2 + (23 − 20.16)²/20.16 + (10 − 8.64)²/8.64
    + (25 − 20.8)²/20.8 + (19 − 21.84)²/21.84 + (8 − 9.36)²/9.36 = 2.95

The p-value is found from P(χ² > χ²*) = P(χ² > 2.95) with (3 − 1)(2 − 1) = 2 degrees of freedom. Using a table or software, we find the p-value to be 0.2288. With a p-value greater than 10%, we conclude that there is not enough evidence in the data to suggest that gender and preferred condiment are related.

Assumptions of Chi-square Test



The chi-squared test, when used with the standard approximation that a chi-
squared distribution is applicable, has the following assumptions:

●● Simple random sample: The sample data is a random sampling from a fixed
distribution or population where each member of the population has an equal
probability of selection. Variants of the test have been developed for complex
samples, such as where the data is weighted.
●● Sample size (whole table): A sample with a sufficiently large size is assumed. If a chi-squared test is conducted on a sample with a smaller size, then the chi-squared test will yield an inaccurate inference. The researcher, by using the chi-squared test on small samples, might end up committing a Type II error.

●● Expected cell count: Adequate expected cell counts. Some require 5 or more,
and others require 10 or more. A common rule is 5 or more in all cells of a 2-by-
2 table, and 5 or more in 80% of cells in larger tables, but no cells with zero

expected count. When this assumption is not met, Yates’s correction is applied.
●● Independence: The observations are always assumed to be independent of
each other. This means chi-squared cannot be used to test correlated data
(like matched pairs or panel data). In those cases you might want to turn to McNemar’s test.

Degrees of Freedom (d.f)

The degrees of freedom, abbreviated as d.f., denote the extent of independence (freedom) enjoyed by a given set of observed frequencies. Degrees of freedom are usually denoted by the Greek letter ν (nu).

Suppose we are given a set of ‘n’ observed frequencies which are subjected to ‘k’ independent constraints (restrictions). Then

Degrees of Freedom = No. of frequencies – No. of independent constraints (ν = n – k)
Key Terms
v
●● Hypothesis Test: Hypothesis test is a method of making decisions using data
from a scientific study
ni

●● Type I error: A type 1 error is also known as a false positive and occurs when a
researcher incorrectly rejects a true null hypothesis.
●● Type II error: A type II error is a false negative and occurs when a researcher fails
U

to reject a null hypothesis which is really false


●● Confidence Interval: A Confidence Interval is a range of values where the true
value lies in. It is a type of estimate computed from the statistics of the observed
ity

data.
●● Z- Test: A z-test is a statistical test to determine whether two population means
are different when the variances are known and the sample size is large.
●● p Value: The p-value is the probability of receiving outcomes as extreme as the
m

outcomes of a statistical hypothesis test, assuming the null hypothesis is correct.


●● Sample random sample: The sample data is a random sampling from a fixed
distribution or population where each member of the population has an equal
)A

probability of selection
●● Degrees of Freedom: The degree of freedom, abbreviated as d.f, denotes
the extent of independence or the freedom enjoyed by a given set of observed
frequencies
(c

Amity Directorate of Distance & Online Education


Statistics Management 83

Check your progress :


Notes

e
1. A ____ is a range of values where the true value lies in.
a) Confidence Interval

in
b) Quartile range
c) Sample

nl
d) Mean
2. A ____ is a statistical test to determine whether two population means are different
when variances are known

O
a) T test
b) Quartile

ity
c) z test
d) Median
3. What denotes the extent of independence enjoyed by a given set of observed
frequencies

s
a) Standard deviation er
b) Median
c) Degree of freedom
d) Hypothesis
v
4. Which test is used as test of goodness of fit.
ni

a) Z test
b) T test
c) Chi square test
U

d) Fitness test
5. A _____ is also known as a false positive and occurs when researcher incorrectly
rejects a true null hypothesis.
ity

a) Type I error
b) Type II error
c) T test error
m

d) Probability error

Questions & Exercises


)A

1. What do you understand by hypothesis? Explain its characteristics.


2. Explain the type of hypothesis and how to develop them ?
3. What is the p value approach to hypothesis testing ?
(c

4. Explain the Chi square test and its assumptions


5. What do you Infer about the difference between two population means

Amity Directorate of Distance & Online Education


84 Statistics Management

Check your progress:


Notes

e
1. a) Confidence Interval
2. c) z test

in
3. c) Degree of freedom
4. c) Chi square test

nl
5. a) Type I error

Further Readings

O
1. Richard I. Levin, David S. Rubin, Sanjay Rastogi Masood Husain Siddiqui,
Statistics for Management, Pearson Education, 7th Edition, 2016.
2. Prem.S.Mann, Introductory Statistics, 7th Edition, Wiley India, 2016.

ity
3. Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani, An
Introduction to Statistical Learning with Applications in R, Springer, 2016.

Bibliography

s
1. Srivastava V. K. etal – Quantitative Techniques for Managerial Decision Making,
Wiley Eastern Ltd
2.
er
Richard, I.Levin and Charles A.Kirkpatrick – Quantitative Approaches to Management,
McGraw Hill, Kogakusha Ltd.
3. Prem.S.Mann, Introductory Statistics, 7th Edition, Wiley India, 2016.
v
4. Budnik, Frank S Dennis Mcleaavey, Richard Mojena – Principles of Operation
Research - AIT BS New Delhi.
ni

5. Sharma J K – Operation Research- theory and applications-Mc Millan,New Delhi


6. Kalavathy S. – Operation Research – Vikas Pub Co
U

7. Gould F J – Introduction to Management Science – Englewood Cliffs N J Prentice


Hall.
8. Naray J K, Operation Research, theory and applications – Mc Millan, New Dehi.
ity

9. Taha Hamdy, Operations Research, Prentice Hall of India


10. Tulasian: Quantitative Techniques: Pearson Ed.
11. Vohr.N.D. Quantitative Techniques in Management, TMH
m

12. Stevenson W.D, Introduction to Management Science, TMH


)A
(c




Module-5: Forecasting Techniques


Learning Objective:

●● To understand the measures of linear relationship between variables
●● To get familiarized with Time Series Analysis

Learning Outcome:
●● Understand and apply forecasting techniques for business decision making and to

uncover relationships between variables to produce forecasts of the future values
of strategic variables

“If two or more quantities vary in sympathy so that the movement in one tends to be accompanied by corresponding movements in the other, then they are said to be correlated.”
L.R. Conner says-

5.1.1 Measures of Linear Relationship: covariance & correlation – Intro
We often encounter situations where data appear as pairs of figures relating to two variables, for example, price and demand of a commodity, money supply and inflation, industrial growth and GDP, advertising expenditure and market share, etc.
Examples of correlation problems are found in the study of the relationship
between IQ and aggregate percentage marks obtained in mathematics examination

or blood pressure and metabolism. In these examples, both variables are observed
as they naturally occur, since neither variable can be fixed at predetermined levels.
Correlation and regression analysis show how to determine the nature and strength of

the relationship between the variables.

●● According to Croxton and Cowden “When the relationship is of a quantitative


nature, the appropriate statistical tool for discovering and measuring the
relationship and expressing it in a brief formula is known as correlation”

●● A.M. Tuttle says, “Correlation is an analysis of the co variation between two or


more variables.”
Correlation is a degree of linear association between two random variables. In
these two variables, we do not differentiate them as dependent and independent
m

variables. It may be the case that one is the cause and other is an effect i.e.
independent and dependent variables respectively. On the other hand, both may be
dependent variables on a third variable. In some cases there may not be any cause
)A

effect relationship at all. Therefore, if we do not consider and study the underlying
economic or physical relationship, correlation may sometimes give absurd results.

5.1.2 Covariance and Correlation - Application in Real Life


(c

For example, The case of global average temperature and Indian population. Both
are increasing over past 50 years but obviously not related. Correlation is an analysis of
the degree to which two or more variables fluctuate with reference to each other.

Correlation is expressed by a coefficient ranging between –1 and +1. Positive (+ve)


Notes

e
sign indicates movement of the variables in the same direction. E.g. Variation of the
fertilizers used on a farm and yield, observes a positive relationship within technological

in
limits. Whereas negative (–ve) coefficient indicates movement of the variables in the
opposite directions, i.e. when one variable decreases, other increases. E.g. Variation of
price and demand of a commodity have inverse relationship. Absence of correlation is
indicated if the coefficient is close to zero. Value of the coefficient close to ±1denotes a

nl
very strong linear relationship.

●● The study of correlation helps managers in following ways:

O
●● To identify relationship of various factors and decision variables.
●● To estimate value of one variable for a given value of other if both are
correlated.

ity
●● To understand economic behaviour and market forces.
●● To reduce uncertainty in decision-making to a large extent.
In business, correlation analysis often helps manager to take decisions by
estimating the effects of changing the values of the decision variables like promotion,
advertising, price, production processes, on the objective parameters like costs, sales,

s
market share, consumer satisfaction, competitive price. The decision becomes more
objective by removing subjectivity to certain extent. However, it must be understood that
er
the correlation analysis only tells us about the two or more variables in a data fluctuate
together or not. It does not necessarily be due cause and effect relationship. To know
if the fluctuations in one of the variables indeed affects other or not, one has to be
v
established with logical understanding of the business environment.
ni

5.1.3 Types of Correlation


The correlation can be studied as positive and negative, simple and multiple, partial
and total, linear and non linear. Further the method to study the correlation is plotting
U

graphs on x - y axis or by algebraic calculation of coefficient of correlation. Graphs


are usually scatter diagrams or line diagrams. The correlation coefficients have been
defined in different ways, of these Karl Pearson’s correlation coefficient; Spearman’s
Rank correlation coefficient and coefficient of determination.
ity

1. Positive or negative correlation: In positive correlation, both factors increase


or decrease together. Positive or direct Correlation refers to the movement of variables
in the same direction.
m

The correlation is said to be positive when the increase (decrease) in the value of
one variable is accompanied by an increase (decrease) in the value of other variable also.

Negative or inverse correlation refers to the movement of the variables in opposite


)A

direction. Correlation is said to be negative, if an increase (decrease) in the value of


one variable is accompanied by a decrease (increase) in the value of other.

When we say a perfect correlation, the scatter diagram will show a linear (straight
line) plot with all points falling on straight line. If we take appropriate scale, the straight
(c

line inclination can be adjusted to 45°, although it is not necessary as long as inclination
is not 0° or 90° where there is no correlation at all because value of one variable
changes without any change in the value of other variable.

In case of negative correlation, when one variable increases the other decreases and vice versa. If the scatter diagram shows the points distributed closely around an imaginary line, we say there is a high degree of correlation. On the other hand, if we can hardly see any unique imaginary line around which the observations are scattered, we say correlation does not exist. Even in the case of the imaginary line being parallel to one of the axes, we say no correlation exists between the variables. If the imaginary line is a straight line, we say the correlation is linear.
the axes we say no correlation exists between the variables. If the imaginary line is a
straight line we say the correlation is linear.

nl
2. Simple or multiple correlations: In simple correlation the variation is between
only two variables under study and the variation is hardly influenced by any external
factor. In other words, if one of the variables remains same, there won’t be any change

O
in other variable. For example, variation in sales against price change in case of a
price sensitive product under stable market conditions shows a negative correlation. In
multiple correlations, more than two variables affect one another. In such a case, we
need to study correlation between all the pairs that are affecting each other and study

ity
extent to which they have the influence.

3. Partial or total correlation


In case of multiple correlation analysis there are two approaches to study the

s
correlation. In case of partial correlation, we study variation of two variables and
excluding the effects of other variables by keeping them under controlled condition. In
er
case of ‘total correlation’ study we allow all relevant variables to vary with respect to
each other and find the combined effect. With few variables, it is feasible to study ‘total
correlation’. As number of variables increase, it becomes impractical to study the ‘total
v
correlation’. For example, coefficient of correlation between yield of wheat and chemical
fertilizers excluding the effects of pesticides and manures is called partial correlation.
ni

Total correlation is based upon all the variables.

4. Linear and nonlinear correlation:


U

When the amount of change in one variable tends to keep a constant ratio to the
amount of change in the other variable, then the correlation is said to be linear.

The distinction between linear and non-linear is based upon the consistency of the
ratio of change between the variables. The manager must be careful in analyzing the
ity

correlation using coefficients because most of the coefficients are based on assumption
of linearity. Hence plotting a scatter diagram is good practice. In case of linear
correlation, the differential (derivative) of relationship is constant with the graph of the
data being a straight line.
m

In case on nonlinear correlation the rate of variation changes as values increase or


decrease. The nonlinear relationship could be approximated to a polynomial (parabolic,
cubic etc.), exponential sinusoidal, etc. In such cases using the correlation coefficients
)A

based on linear assumption will be misleading unless used over a very short data
range. Using computers, we could analyze a nonlinear correlation to a certain extent,
with some simplified assumption

5.1.4 Correlation of Grouped Data


(c

Many times the observations are grouped into a ‘two way’ frequency distribution
table. These are called bivariate frequency distribution. It is a matrix where rows are




grouped for X variable and columns are grouped for Y variable. Each cell say (i, j)
Notes

e
represents the frequency or count that falls in both groups of a particular range of values of Xi and Yj. In this case the correlation coefficient is given by

r = [ ∑ f·mx·my − (1/n) ∑(f·mx) ∑(f·my) ] / √{ [ ∑ f·mx² − (∑ f·mx)²/n ] [ ∑ f·my² − (∑ f·my)²/n ] }

nl
Where mX and mY are class marks of frequency distributions of X and Y variables,
fX and fY are marginal frequencies of X and Y and fXY are joint frequencies of X and Y

O
respectively.

Example: Calculate coefficient of correlation for the following data.

X/Y 0-500 500-1000 1000-1500 1500-2000 2000-2500 Total

ity
0-200 12 6 - - - 18
200-400 2 18 4 2 1 27
400-600 - 4 7 3 - 14

s
600-800 - 1 - 2 1 4
800-1000 - - 1 2 3 6
Total 14 29
er 12 9 5 69

Solution: Let the assumed mean for X be a = 1250 and the scaling factor g = 500.
v
Therefore, we can calculate f x dy and f x dx2 from the marginal distribution of X as,

mx − a
ni

X Class Mark mx dx = Frequency f f x dx f x dx2


g
0-500 250 -2 14 -28 56
U

500-1000 750 -1 29 -29 29


1000-1500 1250 0 12 0 0
1500-2000 1750 1 9 9 9
2000-2500 2250 2 5 10 20
ity

Total -38 114

Definition: The correlation coefficient measures the degree of association between


two variables X and Y.
m

The coefficient is given as –

r = Cov(x, y) / (σx σy)

r = [ (1/n) ∑ (X − X̄)(Y − Ȳ) ] / (σx σy)

Where r is the ‘Correlation Coefficient’ or ‘Product Moment Correlation Coefficient’ between X and Y, σX and σY are the standard deviations of X and Y respectively, and ‘n’ is the number of pairs of values of X and Y in the given data.




The expression (1/n) ∑ (X − X̄)(Y − Ȳ) is known as the covariance between the variables X and Y. It is denoted as Cov(x, y). The Correlation Coefficient r is a dimensionless number whose value lies between +1 and –1. Positive values of r indicate positive (or direct) correlation between the two variables X and Y, i.e. both X and Y increase or decrease together.
+1 and –1. Positive values of r indicate positive (or direct) correlation between the two
variables X and Y i.e. both X and Y increase or decrease together.

Negative values of r indicate negative (or inverse) correlation, thereby meaning that

nl
an increase in one variable X or Y results in a decrease in the value of the other variable.
A zero correlation means that there is no association between the two variables.

O
The formula can be modified as,
1 1
∑ ( X − X )(Y − Y ) ∑ ( XY − XY − XY + XY )
=r = n
n
σ xσ y σ xσ y

ity
∑ XY ∑ X ∑ Y (2)
− ×
= n n n
2 2
∑X2 ∑X ∑Y 2  ∑Y 
− −
n  n  n  n 

s
(3)
E[ XY ] − E[ X ]E[Y ]
=
E[ X 2 ] − ( E[ X ]) 2 E[Y 2 ] − ( E[Y ]) 2
er
Equations (2) and (3) are alternate forms of equation (1). These have advantage
that each value from the mean may not be subtracted.
v
Example: The data of advertisement expenditure (X) and sales (Y) of a company for the past 10-year period is given below. Determine the correlation coefficient between these variables and comment on the correlation.

X 50 50 50 40 30 20 20 15 10 5

Y 700 650 600 500 450 400 300 250 210 200

Solution: We shall take U to be the deviation of X values from the assumed mean
of 30 divided by 5. Similarly, V represents the deviation of Y values from the assumed
ity

mean of 400 divided by 10.

Sl.No. X = xi Y = yi U = ui V = vi uivi u i2 vi2


1 50 700 4 30 120 16 900
2 50 650 4 25 100 16 625
m

3 50 600 4 20 80 16 400
4 40 500 2 10 20 4 100
5 30 450 0 5 0 0 25
)A

6 20 400 -2 0 0 4 0
7 20 300 -2 -10 20 4 100
8 15 250 -3 -15 45 9 225
9 10 210 -4 -19 76 16 361
(c

10 5 200 -5 -20 100 25 400


Total -2 26 561 110 3136




Short cut procedure for calculation of correlation coefficient

r = [ ∑ ui vi − (1/n) ∑ ui ∑ vi ] / √{ [ ∑ ui² − (1/n)(∑ ui)² ] [ ∑ vi² − (1/n)(∑ vi)² ] }

= [ 561 − (−2)(26)/10 ] / √{ [ 110 − (−2)²/10 ] [ 3136 − (26)²/10 ] }

= (561 + 5.2) / √(109.6 × 3068.4) = 566.2 / 579.9 = 0.976
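The hand computation can be cross-checked with a short sketch, assuming Python with NumPy is available:

    import numpy as np

    x = np.array([50, 50, 50, 40, 30, 20, 20, 15, 10, 5])
    y = np.array([700, 650, 600, 500, 450, 400, 300, 250, 210, 200])

    r = np.corrcoef(x, y)[0, 1]   # Pearson product-moment correlation coefficient
    print(round(r, 3))            # approximately 0.976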

Interpretation of r
The correlation coefficient, r ranges from −1 to 1. A value of 1 implies that a linear

equation describes the relationship between X and Y perfectly, with all data points lying
on a line for which Y increases as X increases. A value of −1 implies that all data points
lie on a line for which Y decreases as X increases. A value of 0 implies that there is no
linear correlation between the variables.

More generally, note that (Xi − X) (Yi − Y) is positive if and only if Xi and Yi lie
on the same side of their respective means. Thus the correlation coefficient is positive
if Xi and Yi tend to be simultaneously greater than, or simultaneously less than, their
respective means.

●● The correlation coefficient is negative if Xi and Yi tend to lie on opposite sides


of their respective means.
●● The coefficient of correlation r lies between –1 and +1 inclusive of those

values.
●● When r is positive, the variables x and y increase or decrease together.
●● r = +1 implies that there is a perfect positive correlation between variables x

and y.
●● When r is negative, the variables x and y move in the opposite direction.
●● When r = -1, there is a perfect negative correlation.

●● When r = 0, the two variables are uncorrelated.

5.1.5 Spearman Rank Correlation Method - Intro & Application


Quite often the data is available in the form of some ranking for different
m

variables. Also there are occasions where it is difficult to measure the cause-effect
variables. For example, while selecting a candidate, there are number of factors on
which the experts base their assessment. It is not possible to measure many of these
)A

parameters in physical units e.g. sincerity, loyalty, integrity, tactfulness, initiative, etc.
Similar is the case during dance contests. However, in these cases the experts may
rank the candidates. It is then necessary to find out whether the two sets of ranks
are in agreement with each other. This is measured by Rank Correlation Coefficient.
The purpose of computing a correlation coefficient in such situations is to determine
(c

the extent to which the two sets of ranking are in agreement. The coefficient that is
determined from these ranks is known as Spearman’s rank coefficient, rS

Amity Directorate of Distance & Online Education



This is defined by the following formula:

rs = 1 − [ 6 × ∑ di² ] / [ n(n² − 1) ]

Where, n = Number of observation pairs

di = Xi − Yi

Xi and Yi are the values (ranks) of variables X and Y respectively for the i-th pair.

O
Rank Correlation when Ranks are given
Example: Ranks obtained by a set of ten students in a mathematics test (variable
X) and a physics test (variable Y) are shown below:

ity
Rank for Variable X 1 2 3 4 5 6 7 8 9 10
Rank for Variable Y 3 1 4 2 6 9 8 10 5 7

To determine the coefficient of rank correlation, S r

s
Solution: Computations of Spearman’s Rank Correlation as shown below:

Individual Rank in Maths (X = xi) Rank in Physics (Y = yi) di = xi-yi d i2


er
1 1 3 +2 4
2 2 1 -1 1
3 3 4 +1 1
v
4 4 2 -2 1
ni

5 5 6 +1 1
6 6 9 +3 9
7 7 8 +1 1
U

8 8 10 +2 4
9 9 5 -4 16
10 10 7 -3 9
ity

Total 50

Now, n = 10, ∑ di² = 50

Using the formula

rs = 1 − [ 6 × ∑ di² ] / [ n(n² − 1) ] = 1 − (6 × 50) / [ 10(100 − 1) ] = 1 − 300/990 = 0.697

It can be said that there is a high degree of correlation between the performance in
mathematics and physics.
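The same rank correlation can be obtained directly from software. A minimal sketch, assuming Python with SciPy, applied to the two sets of ranks above:

    from scipy.stats import spearmanr

    maths_rank   = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    physics_rank = [3, 1, 4, 2, 6, 9, 8, 10, 5, 7]

    rho, p_value = spearmanr(maths_rank, physics_rank)
    print(round(rho, 3))   # approximately 0.697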

Rank Correlation when Ranks are not given


(c

Example: Find the rank correlation coefficient for the following data.

Amity Directorate of Distance & Online Education


92 Statistics Management

X 75 88 95 70 60 80 81 50
Notes

e
Y 120 134 115 110 140 142 100 150

in
Solution: Let R1 and R2 denotes the ranks in X and Y respectively.

X Y R1 R2 d=R1-R2 d2
75 120 5 5 0 0

nl
88 134 2 4 -2 4
95 150 1 1 0 0

O
70 115 6 6 0 0
60 110 7 7 0 0
80 140 4 3 1 1

ity
81 142 3 2 1 1
50 100 8 8 0 0
6

s
6∑d2 6×6
Coefficient of Correlation P =
1− 1−
= +.93
=
n(n 2 − 1) 8(64 − 1)
er
In this method the biggest item gets the first rank, the next biggest second rank and
so on.
v
5.1.6 Regression Model
ni

There is a need for a statistical model that will extract information from the given
data to establish the regression relationship between independent and dependent
relationship. The model should capture systematic behaviour of data. The non-
U

systematic behaviour cannot be captured and called as errors. The error is due to
random component that cannot be predicted as well as the component not adequately
considered in statistical model. Good statistical model captures the entire systematic
component leaving only random errors.
ity

In any model we attempt to capture everything which is systematic in data.


Random errors cannot be captured in any case. Assuming the random errors are
‘Normally distributed’ we can specify the confidence level and interval of random errors.
Thus, our estimates are more reliable.
m

If the variables in a bivariate distribution are correlated, the points in scatter


diagram approximately cluster around some curve. If the curve is straight line we call
)A

it as linear regression. Otherwise, it is curvilinear regression. The equation of the curve


which is closest to the observations is called the ‘best fit’.

The best fit is calculated as per Legender’s principle of least sum squares of
deviations of the observed data points from the corresponding values on the ‘best
fit’ curve. This is called as minimum squared error criteria. It may be noted that the
(c

deviation (error) can be measured in X direction or Y direction. Accordingly we will get


two ‘best fit’ curves. If we measure deviation in Y direction, i.e. for a given i x value of

Amity Directorate of Distance & Online Education


Statistics Management 93

data point ( x,y ) and then we measure corresponding y value on ‘best fit’ curve and
Notes

e
then take the value of deviation in y, we call it as regression of Y on X. In the other
case, if we measure deviations in X direction we call it as regression of X and Y.

in
Definition: According to Morris Myers Blair, regression is the measure of the average
relationship between two or more variables in terms of the original units of the data.

nl
Applicability of Regression Analysis
Regression analysis is one of the most popular and commonly used statistical tools
in business. With availability of computer packages, it has simplified the use. However,

O
one must be careful before using this tool as it gives only mathematical measure based
on available data. It does not check whether the cause effect relationship really exists
and if it exists which is dependent and which is dependent variable.

ity
Regression analysis is a branch of statistical theory which is widely used in
all the scientific disciplines. It is a basic technique for measuring or estimating the
relationship among economic variables that constitute the essence of economic theory
and economic life. The uses of regression analysis are not confined to economic and
business activities. Its applications are extended to almost all the natural, physical and

s
social sciences.

Regression analysis helps in the following ways -


er
●● It provides mathematical relationship between two or more variables. This
mathematical relationship can then be used for further analysis and treatment
of information using more complex techniques.
v
●● Since most of the business analysis and decisions are based on cause-
ni

effect relationships, regression analysis is highly valuable tool to provide


mathematical model for this relationship.
●● Most wide use of regression analysis is the analysis, estimation and forecast.
U

●● Regression analysis is also used in establishing the theories based on


relationships of various parameters.
●● Some of the common examples are demand and supply, money supply and
expenditure, inflation and interest rates, promotion expenditure and sales,
ity

productivity and profitability, health of workers and absenteeism, etc.

5.1.7 Estimating the Coefficient Using Least Square Method


Generally the method used to find the ‘best’ fit that a straight line of this kind can give is the least-square method. To use it efficiently, we first determine (writing xi = Xi − X̄ and yi = Yi − Ȳ for deviations from the means):

∑ xi² = ∑ Xi² − n X̄²

∑ yi² = ∑ Yi² − n Ȳ²

∑ xi yi = ∑ Xi Yi − n X̄ Ȳ

b = ∑ xi yi / ∑ xi²,   a = Ȳ − b X̄




These measures define a and b, which will give the best possible fit through the original X and Y points, and the value of r can then be worked out as under:

r = b √( ∑ xi² / ∑ yi² )

nl
Thus, the regression analysis is a statistical method to deal with the formulation
of mathematical model depicting relationship amongst variables which can be used for
the purpose of prediction of the values of dependent variable, given the values of the

O
independent variable.

Alternatively, for fitting a regression equation of the type Y = a + bX to the given


values of X and Y variables, we can find the values of the two constants viz., a and b by

ity
using the following two normal equations:

∑ yi = na + b ∑ X i
∑ X iYi = a ∑ X i + b ∑ X i 2

s
Solving these equations for finding a and b values. Once these values are obtained
er
and have been put in the equation Y = a + bX, we say that we have fitted the regression
equation of Y on X to the given data. In a similar fashion, we can develop the regression
equation of X on Y viz., X = a + bY, presuming Y as an independent variable and X as
dependent variable.
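These formulas translate directly into code. The sketch below, assuming Python with NumPy, fits Y = a + bX by least squares to the advertising/sales data used earlier (reused here purely for illustration):

    import numpy as np

    X = np.array([50, 50, 50, 40, 30, 20, 20, 15, 10, 5], dtype=float)
    Y = np.array([700, 650, 600, 500, 450, 400, 300, 250, 210, 200], dtype=float)

    x_dev = X - X.mean()              # deviations xi = Xi - X-bar
    y_dev = Y - Y.mean()              # deviations yi = Yi - Y-bar

    b = (x_dev * y_dev).sum() / (x_dev ** 2).sum()   # slope
    a = Y.mean() - b * X.mean()                      # intercept
    r = b * np.sqrt((x_dev ** 2).sum() / (y_dev ** 2).sum())

    print(round(a, 2), round(b, 2), round(r, 3))     # r is again about 0.976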
v
5.1.8 Assessing the Model
ni

Method of Least Square parabolic trend


The mathematical form of a parabolic trend is given by Yt = a + bt + ct2 or Y =
U

a + bt + ct2 (dropping the subscript for convenience). Here a, b and c are constants
to be determined from the given data. Using the method of least squares, the normal
equations for the simultaneous solution of a, b, and c are:
ity

∑ Y = na + b ∑ t + c ∑ t 2
∑ tY = a ∑ t + b ∑ t 2 + c ∑ t 3
∑ t 2Y = a ∑ t 2 + b ∑ t 3 + c ∑ t 4
m

By selecting a suitable year of origin, i.e., defining X = t − origin such that ∑X = 0, the computation work can be considerably simplified. Also note that if ∑X = 0, then ∑X³ will also be equal to zero. Thus, the above equations can be rewritten as:
)A

∑ Y = na + c ∑ X²                ...(i)

∑ XY = b ∑ X²                    ...(ii)

∑ X²Y = a ∑ X² + c ∑ X⁴          ...(iii)

From equation (ii), we get b = ∑ XY / ∑ X²                ...(iv)




Further, from equation (i), we get a = [ ∑ Y − c ∑ X² ] / n                ...(v)

And from equation (iii), we get c = [ n ∑ X²Y − (∑ X²)(∑ Y) ] / [ n ∑ X⁴ − (∑ X²)² ]                ...(vi)
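The constants of a parabolic trend can also be obtained with a polynomial fit. A small sketch assuming Python with NumPy, using the production figures from the earlier free-hand example purely as illustrative data:

    import numpy as np

    t = np.arange(1998, 2006)                        # years
    y = np.array([42, 44, 48, 42, 46, 50, 48, 52])   # production (in tonnes), reused for illustration

    X = t - t.mean()                  # choose the origin so that sum(X) = 0
    c, b, a = np.polyfit(X, y, 2)     # coefficients are returned from the highest power down
    print(round(a, 3), round(b, 3), round(c, 3))     # Y = a + bX + cX^2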

nl
Thus, equations (iv), (v) and (vi) can be used to determine the values of the
constants a, b and c.

O
5.1.9 Standard Error of Estimate
Standard Error of Estimate is the measure of variation around the computed
regression line.

ity
Standard error of estimate (SE) of Y measure the variability of the observed values
of Y around the regression line. Standard error of estimate gives a measure about the
line of regression. of the scatter of the observations about the line of regression.

Standard Error of Estimate of Y on X is: S.E. of Y on X (SExy) = √[ ∑ (Y – Ye)² / (n – 2) ]

s
Y = Observed value of y er
Ye = Estimated values from the estimated equation that correspond to each y value

e = The error term (Y – Y e )

n = Number of observation in sample.


v
The convenient formula: (SExy) = √[ ( ∑ Y² – a ∑ Y – b ∑ XY ) / (n – 2) ]
ni

X = Value of independent variable. Y = Value of dependent variable. a = Y


intercept.
U

b = Slope of estimating equation. n = Number of data points.
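A short sketch of the first formula, assuming Python with NumPy (the data points are invented for illustration):

    import numpy as np

    X = np.array([1, 2, 3, 4, 5], dtype=float)       # illustrative values of the independent variable
    Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])          # illustrative values of the dependent variable

    b = ((X - X.mean()) * (Y - Y.mean())).sum() / ((X - X.mean()) ** 2).sum()
    a = Y.mean() - b * X.mean()
    Y_est = a + b * X                                # estimated values Ye from the fitted line

    n = len(Y)
    se_yx = np.sqrt(((Y - Y_est) ** 2).sum() / (n - 2))   # standard error of estimate of Y on X
    print(round(se_yx, 4))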

Regression Coefficient of X on Y
The regression coefficient of X on Y is represented by the symbol bxy that
ity

measures the change in X for the unit change in Y. Symbolically, it can be represented
as: The bxy can be obtained by using the following formula when the deviations are
taken from the actual means of X and Y: When the deviations are obtained from the
assumed mean, the following formula is used:
m

Regression Coefficient of Y on X
The symbol byx is used that measures the change in Y corresponding to the unit
)A

change in X. Symbolically, it can be represented as:

●● In case, the deviations are taken from the actual means; the following formula
is used:
●● The byx can be calculated by using the following formula when the deviations
(c

are taken from the assumed means:

Amity Directorate of Distance & Online Education


96 Statistics Management

●● The Regression Coefficient is also called as a slope coefficient because it


Notes

e
determines the slope of the line i.e., the change in the independent variable
for the unit change in the independent variable.

in
5.1.10 Regression Coefficient
The coefficients of regression are YX b and XY b. They have following implications:

nl
●● Slopes of regression lines of Y on X and X on Y viz. YX b and XY b must have
same signs (because r² cannot be negative).
●● Correlation coefficient is geometric mean of YX b and XY b.

O
●● If both slopes YX b and XY b are positive correlation coefficient r is positive. If both
YX b and XY b are negative the correlation coefficient r is negative.
●● Both regression lines intersect at point (X,Y )

ity
As in the case of the calculation of the correlation coefficient, we can directly write the formulas for the two regression coefficients for a bivariate frequency distribution as given below –

b = [ N ∑∑ fij xi yj − (∑ fi xi)(∑ fj yj) ] / [ N ∑ fi xi² − (∑ fi xi)² ]

or, if we define ui = (Xi − A)/h and vj = (Yj − B)/k,

b = (k/h) [ N ∑∑ fij ui vj − (∑ fi ui)(∑ fj vj) ] / [ N ∑ fi ui² − (∑ fi ui)² ]

Similarly, d = [ N ∑∑ fij xi yj − (∑ fi xi)(∑ fj yj) ] / [ N ∑ fj yj² − (∑ fj yj)² ]

or d = (h/k) [ N ∑∑ fij ui vj − (∑ fi ui)(∑ fj vj) ] / [ N ∑ fj vj² − (∑ fj vj)² ]
ity

5.2.1 Time Series


Time series analysis systematically identifies and isolates different kinds of time-
related patterns in the data. Four common relationship patterns are horizontal, trend,
seasonal and cyclic. The random component is superimposed on these patterns. There
m

is a procedure for decomposing the time series in these patterns. These are used for
forecasting. However, more accurate and statistically sound procedure is to identify
the patterns in time series using auto-correlations that was explained in previous
)A

subsection. It is correlation between the values of same variable at different time lag.

When the time series represents completely random data, the auto correlation
for various time lags is close to zero with values fluctuating both on positive and
negative side. If auto correlation slowly drops to zero, and more than two or three
(c

differ significantly from zero, it indicates presence of trend in the data. The trend can
be removed by taking difference between consecutive values and constructing a new
series. This is called numerical differentiation.
Amity Directorate of Distance & Online Education

Definition
Notes

e
A time series is a collection of data obtained by observing a response variable
at periodic points in time. If repeated observations on a variable produce a time series,

in
the variable is called a time series variable. We use Yi to denote the value of the
variable at time i.

nl
Objectives of Time Series
The analysis of time series implies its decomposition into various factors that affect
the value of its variable in a given period. It is a quantitative and objective evaluation of

O
the effects of various factors on the activity under consideration.

There are two main objectives of the analysis of any time series data:

1. To study the past behaviour of data.

ity
2. To make forecasts for future. The study of past behaviour is essential because
it provides us the knowledge of the effects of various forces. This can facilitate
the process of anticipation of future course of events and, thus, forecasting the
value of the variable as well as planning for future.

s
5.2.2 Variation in Time Series er
Time Series analysis – Secular Component
Secular trend or simply trend is the general tendency of the data to increase or
decrease or stagnate over a long period of time. Most of the business and economic
v
time series would reveal a tendency to increase or to decrease over a number of years.
For example, data regarding industrial production, agricultural production, population,
ni

bank deposits, deficit financing, etc., show that, in general, these magnitudes have
been rising over a fairly long period. As opposed to this, a time series may also reveal
a declining trend, e.g., in the case of substitution of one commodity by another, the
U

demand of the substituted commodity would reveal a declining trend such as the
demand for cotton clothes, demand for coarse grains like bajra, jowar, etc. With
the improved medical facilities, the death rate is likely to show a declining trend, etc.
The change in trend, in either case, is attributable to the fundamental forces such as
ity

changes in population, technology, composition of production, etc.

Time Series Analysis - Seasonal Component


Cycles that occur over short periods of time, normally < 1 year. e.g. monthly,
m

weekly, daily. A time series, where the time interval between successive observations
is less than or equal to one year, may have the effects of both the seasonal and cyclical
variations. However, the seasonal variations are absent if the time interval between
successive observations is greater than one year.
)A

Causes of Seasonal variations:


The main causes of seasonal variations are:
●● Climatic Conditions
(c

●● Customs and Traditions

Amity Directorate of Distance & Online Education


98 Statistics Management

Climatic Conditions: The changes in climatic conditions affect the value of


Notes

e
time series variable and the resulting changes are known as seasonal variations. For
example, the sale of woolen garments is generally at its peak in the month of November

in
and December because of the beginning of winter season. Similarly, timely rainfall may
increase agricultural output, prices of agricultural commodities are lowest during their
harvesting season, etc., reflect the effect of climatic conditions on the value of time
series variable.

nl
Customs and Traditions: The customs and traditions of the people also give rise
to the seasonal variations in time series. For example, the purchase of clothing and
ornaments may be highest during the marriage season, sale of sweets during Diwali,

O
etc., are variations which are the results of customs and traditions of the people.

Time Series Analysis - Cyclical Component

ity
●● Cyclical variations are revealed by most of the economic and business
time series and, therefore, are also termed as trade or the business cycles.
Any trade cycle has four phases which are respectively known as boom,
recession, depression and recovery.

s
●● Various phases repeat themselves regularly one after another in the given
sequence. The time interval between two identical phases is known as the
er
period of cyclical variations. The period is always greater than one year.
Normally, the period of cyclical variations lies between 3 to 10 years.

Objectives of Measuring Cyclical Variations


v
The main objectives of measuring cyclical variations are:
ni

●● To analyse the behaviour of cyclical variations in the past.


●● To predict the effect of cyclical variations so as to provide guidelines for future
business policies.
U

Time Series Analysis - Random Component


As the name suggests, these variations do not reveal any regular pattern of
the movements. These variations are caused by random factors such as strikes,
ity

fire, floods, war, famines, etc. Random variations is that component of a time series
that cannot be explained in terms of any of the components discussed so far. This
component is obtained as a residue after the elimination of trend, seasonal and cyclical
components and hence is often termed as residual component. Random variations
m

are usually short-term variations but sometimes their effect may be so intense that the
value of trend may get permanently affected.

Numerical Application
)A

Using the method of Free hand determine the trend of the following data:

Year 1998 1999 2000 2001 2002 2003 2004 2005

Production 42 44 48 42 46 50 48 52
(c

(in tonnes)

Amity Directorate of Distance & Online Education


Statistics Management 99

Solution:
[Figure: free-hand trend line drawn through the production data]
Example 2 - Find trend values from the following data using three yearly moving
averages and show the trend line on the graph.
er
Year Price (`) Year Price (`)
1994 52 2000 75
v
1995 65 2001 70
ni

1996 58 2002 64
1997 63 2003 78
1998 66 2004 80
U

1999 72 2005 73

Solution:
ity

Computation of trend values

Year Price (`) 3 yearly moving total 3 yearly moving average


1994 52 –
1995 65 175 58.33
m

1996 58 186 62.00


1997 63 187 62.33
1998 66 201 67.00
)A

1999 72 213 71.00


2000 75 217 72.33
2001 70 209 69.67
2002 64 212 70.67
2003 78 222 74.00
(c

2004 80 231 77.00


2005 73 –

Amity Directorate of Distance & Online Education


100 Statistics Management

[Figure: three-yearly moving average trend line plotted against the original price series]
Key Terms
●● Correlation: Correlation is expressed by a coefficient ranging between –1 and +1.
v
Positive (+ve) sign indicates movement of the variables in the same direction.
●● Positive correlation: The correlation is said to be positive when the increase
ni

(decrease) in the value of one variable is accompanied by an increase (decrease)


in the value of other variable also.
U

●● Negative correlation: Negative or inverse correlation refers to the movement of


the variables in opposite direction
●● Linear correlation: When the amount of change in one variable tends to keep a
constant ratio to the amount of change in the other variable, then the correlation is
ity

said to be linear.
●● Regression: Regression is a basic technique for measuring or estimating the
relationship among economic variables that constitute the essence of economic
theory and economic life.
m

●● Time Series: A time series is a collection of data obtained by observing a


response variable at periodic points in time.
)A

●● Standard Error of Estimate: Standard Error of Estimate is the measure of


variation around the computed regression line.

Check your progress:


1. In ____ correlation, both factors increase or decrease together.
(c

a) Constant
b) Positive

Amity Directorate of Distance & Online Education


Statistics Management 101

c) Negative
Notes

e
d) Probability
2. The correlation that refers to the movement of the variables in opposite direction

in
a) Constant
b) Positive

nl
c) Negative
d) Probability

O
3. A ____ is a collection of data obtained by observing a response variable at
periodic points in time
a) Mean deviation

ity
b) Sample
c) Time Series
d) Hypothesis
4. Technique for estimating the relationship among economic variables that constitute

s
the essence of economic theory is ? er
a) Correlation
b) Time Series
c) Regression
v
d) Standard deviation
ni

5. In ____ the variation is between only two variables under study and the variation is
hardly influenced by any external factor.
a) Partial correlation
U

b) Total correlation
c) Standard correlation
d) Multiple correlation
ity

Questions and exercises


1. Explain the measures of linear relationship.
m

2. What is correlation? What are the various types of correlation?


3. Explain correlation in a grouped data
4. The data of advertisement expenditure (X) and sales (Y) of a company for past
)A

10 year period is given below. Determine the correlation coefficient between these
variables and comment the correlation.

X 50 50 50 40 30 20 20 15 10 5

Y 700 650 600 500 450 400 300 250 210 200
(c

5. What do you understand by time series analysis ? Explain its components.

Amity Directorate of Distance & Online Education


102 Statistics Management

Check your progress:


Notes

e
1. b) Positive
2. c) Negative

in
3. c) Time Series
4. c) Regression

nl
5. d) Multiple correlation

Further Readings

O
1. Richard I. Levin, David S. Rubin, Sanjay Rastogi Masood Husain Siddiqui,
Statistics for Management, Pearson Education, 7th Edition, 2016.
2. Prem.S.Mann, Introductory Statistics, 7th Edition, Wiley India, 2016.

ity
3. Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani, An
Introduction to Statistical Learning with Applications in R, Springer, 2016.

Bibliography

s
1. Srivastava V. K. etal – Quantitative Techniques for Managerial Decision Making,
Wiley Eastern Ltd
2.
er
Richard, I.Levin and Charles A.Kirkpatrick – Quantitative Approaches to Management,
McGraw Hill, Kogakusha Ltd.
3. Prem.S.Mann, Introductory Statistics, 7th Edition, Wiley India, 2016.
v
4. Budnik, Frank S Dennis Mcleaavey, Richard Mojena – Principles of Operation
ni

Research - AIT BS New Delhi.


5. Sharma J K – Operation Research- theory and applications-Mc Millan,New Delhi
6. Kalavathy S. – Operation Research – Vikas Pub Co
U

7. Gould F J – Introduction to Management Science – Englewood Cliffs N J Prentice


Hall.
8. Naray J K, Operation Research, theory and applications – Mc Millan, New Dehi.
ity

9. Taha Hamdy, Operations Research, Prentice Hall of India


10. Tulasian: Quantitative Techniques: Pearson Ed.
11. Vohr.N.D. Quantitative Techniques in Management, TMH
m

12. Stevenson W.D, Introduction to Management Science, TMH


)A

