Unit III
1. Efficiency measures
2. The CCR model: definition of target objectives
3. Peer groups
4. Identification of good operating practices
5. Cross-efficiency analysis
6. Virtual inputs and outputs
7. Other models
8. Pattern matching – cluster analysis, outlier analysis
1. Efficiency measures
compared. If the units produce a single output using a single input only, the efficiency of
the j-th decision-making unit DMUj, j ∈ N = {1, 2, ..., n}, is defined as

\[ \theta_j = \frac{y_j}{x_j}, \]

in which yj is the output value produced by DMUj and xj the input value used. If
the units produce multiple outputs using various input factors,
the efficiency of DMUj is defined as the ratio between a weighted sum of the
outputs and a weighted sum of the inputs. Denote by H = {1, 2, ..., s} the set of
production factors and by K = {1, 2, ..., m} the corresponding set of outputs. If xij, i ∈ H,
denotes the quantity of input i used by DMUj and yrj, r ∈ K, the quantity of output r
obtained, the efficiency of DMUj is defined as

\[ \theta_j = \frac{u_1 y_{1j} + u_2 y_{2j} + \cdots + u_m y_{mj}}{v_1 x_{1j} + v_2 x_{2j} + \cdots + v_s x_{sj}} = \frac{\sum_{r \in K} u_r y_{rj}}{\sum_{i \in H} v_i x_{ij}}, \]

for weights u1, u2, ..., um associated with the outputs and v1, v2, ..., vs
assigned to the inputs. In this second case, the efficiency of DMUj depends strongly on
the system of weights introduced.
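To see how strongly the efficiency value depends on the weights, the ratio of weighted outputs to weighted inputs can be sketched as follows. The unit's quantities and both weight systems are hypothetical values chosen only for illustration.

```python
def efficiency(outputs, inputs, u, v):
    """Ratio of the weighted sum of outputs to the weighted sum of inputs."""
    if len(outputs) != len(u) or len(inputs) != len(v):
        raise ValueError("each quantity needs exactly one weight")
    weighted_out = sum(ur * yr for ur, yr in zip(u, outputs))
    weighted_in = sum(vi * xi for vi, xi in zip(v, inputs))
    return weighted_out / weighted_in

# Hypothetical unit with two outputs and two inputs.
y = [40.0, 10.0]
x = [5.0, 20.0]

# Same unit, two different (assumed) weight systems.
theta_a = efficiency(y, x, u=[1.0, 1.0], v=[1.0, 1.0])
theta_b = efficiency(y, x, u=[0.5, 3.0], v=[2.0, 0.5])
print(theta_a, theta_b)
```

The same unit scores differently under the two weight systems, which is why a single preset system of weights is hard to agree on.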
With different weights, the efficiency value may vary considerably, and it
becomes difficult to fix a single structure of weights that might be shared and accepted by
all the evaluated units. In order to avoid possible objections raised by the units to a preset
system of weights, which may privilege certain DMUs rather than others, data
envelopment analysis evaluates the efficiency of each unit through the weights system
that is best for the DMU itself – that is, the system that allows its efficiency value to be
maximized. Subsequently, by means of additional analyses, the purpose of data
envelopment analysis is to identify the units that are efficient in absolute terms and those
whose efficiency value depends largely on the system of weights adopted.
1.3. What is the Efficient Frontier
The efficient frontier, also known as the production function, expresses the relationship
between the inputs utilized and the outputs produced. It indicates the maximum
quantity of outputs that can be obtained from a given combination of inputs.
At the same time, it also expresses the minimum quantity of inputs that must be
used to achieve a given output level. Hence, the efficient frontier corresponds to
technically efficient operating methods. The efficient frontier may be empirically
obtained based on a set of observations that express the output level obtained by applying
a specific combination of input production factors.
In the context of data envelopment analysis, the observations correspond to the
units being evaluated. Most statistical methods of parametric nature, which are based for
instance on the calculation of a regression curve, formulate some prior hypotheses on the
shape of the production function. Data envelopment analysis, on the other hand, forgoes
any assumptions on the functional form of the efficient frontier, and is therefore
nonparametric in character. It only requires that the units being compared are not placed
above the production function, depending on their efficiency value. To further clarify the
notion of efficient frontier, consider the following example.
Evaluation of the efficiency of bank branches.
A bank wishes to compare the operational efficiency of its nine branches, in terms
of staff size and total value of savings in active accounts. The accompanying table shows for each branch the
total value of accounts, expressed in hundreds of thousands of euros, and the number of
staff employed, with the corresponding efficiency values calculated as the ratio of the
value of accounts to the number of staff. The graph shows for each branch the number of employees on the horizontal axis and
the value of accounts on the vertical axis. The slope of the line connecting each point to
the origin represents the efficiency value associated with the corresponding branch. The
line with the maximum slope, represented by a solid line, is the efficient frontier for all
branches being analyzed. The branches that are on this line correspond to efficient units,
while the branches that are below the efficient frontier are inefficient units. The area
between the efficient frontier and the positive horizontal semi-axis is called the
production possibility set.
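The branch example can be sketched in a few lines. The staff and account figures below are hypothetical, not the values from the original table. Each branch's efficiency is the slope of the line joining its point to the origin, and the frontier is the line of maximum slope.

```python
# Hypothetical data: branch -> (staff, accounts in hundreds of thousands of euros)
branches = {
    "A": (2, 1.0), "B": (3, 2.4), "C": (4, 4.0),
    "D": (5, 3.5), "E": (6, 3.0), "F": (3, 1.5),
    "G": (8, 6.0), "H": (4, 2.0), "I": (5, 4.5),
}

# Efficiency of each branch: output divided by input (slope from the origin).
theta = {name: y / x for name, (x, y) in branches.items()}

# The efficient frontier is the line through the origin with maximum slope.
frontier_slope = max(theta.values())

# Branches on the frontier are efficient; the others lie below it.
efficient = sorted(n for n, t in theta.items() if abs(t - frontier_slope) < 1e-12)

# Efficiency relative to the best observed branch (1.0 means on the frontier).
relative = {n: t / frontier_slope for n, t in theta.items()}

for name in sorted(branches):
    print(name, round(theta[name], 3), round(relative[name], 3))
```

With these assumed figures only branch C lies on the frontier, and every other branch falls inside the production possibility set.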
A possible alternative to the efficient frontier is the regression line that can be
obtained based on the available observations.
In this case, the units that fall above the regression line may be deemed excellent,
and the degree of excellence of each unit could be expressed by its distance from the line.
However, it is appropriate to underline the difference that exists between the prediction
line obtained using a regression model and the efficient frontier obtained using data
envelopment analysis.
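The contrast between the two lines can be made concrete with a short sketch. It fits an ordinary least-squares line to hypothetical branch data (the figures are assumptions for illustration) and compares the units lying above that line with the units on the efficient frontier.

```python
def least_squares_line(xs, ys):
    """Slope and intercept of the ordinary least-squares regression line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical branches: (name, staff, value of accounts)
data = [("A", 2, 1.0), ("B", 3, 2.4), ("C", 4, 4.0), ("D", 5, 3.5), ("E", 6, 3.0),
        ("F", 3, 1.5), ("G", 8, 6.0), ("H", 4, 2.0), ("I", 5, 4.5)]
xs = [x for _, x, _ in data]
ys = [y for _, _, y in data]

slope, intercept = least_squares_line(xs, ys)

# Units above the regression line: better than the average behavior.
above_regression = [name for name, x, y in data if y > slope * x + intercept]

# Units on the efficient frontier: the best observed output/input ratio.
best_ratio = max(y / x for x, y in zip(xs, ys))
on_frontier = [name for name, x, y in data if abs(y / x - best_ratio) < 1e-12]

print(above_regression, on_frontier)
```

Several units exceed the regression line, but only one reaches the frontier: the regression describes average behavior, the frontier the best behavior.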
The regression line reflects the average behavior of the units being compared,
while the efficient frontier identifies the best behavior, and measures the inefficiency of a
unit based on the distance from the frontier itself. Notice also that the efficient frontier
provides some indications for improving the performance of inefficient units. Indeed, it
identifies for each input level the output level that can be achieved in conditions of
efficiency. By the same token, it identifies for each output level the minimum level of
input that should be used in conditions of efficiency. In particular, for each DMUj, j ∈ N,
the input-oriented efficiency θjI can be defined as the ratio between the ideal input
quantity x∗ that should be used by the unit if it were efficient and the actually used
quantity xj:

\[ \theta_j^I = \frac{x^*}{x_j}. \]
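For a single input and a single output, the ideal input x∗ can be read off the frontier: it is the input an efficient unit would need to produce the same output. A minimal sketch, assuming the frontier is the line through the origin with a known slope (the numbers are hypothetical):

```python
def input_oriented_efficiency(x_j, y_j, frontier_slope):
    """Return (theta_I, x_star) for a unit using input x_j to produce y_j.

    Assumes a single input, a single output and a frontier of the form
    y = frontier_slope * x, so the ideal input is x_star = y_j / frontier_slope.
    """
    x_star = y_j / frontier_slope
    return x_star / x_j, x_star

# Hypothetical unit: 5 staff, accounts worth 3.5; best observed ratio is 1.0.
theta_i, x_star = input_oriented_efficiency(x_j=5.0, y_j=3.5, frontier_slope=1.0)
print(theta_i, x_star)
```

Under these assumptions the unit could produce the same output with 3.5 units of input instead of 5, so its input-oriented efficiency is 0.7.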
The problem of making an inefficient unit efficient is then turned into one of
devising a way by which the inefficient unit can be brought close to the efficient frontier.
If the unit produces a single output only by using two inputs, the efficient frontier
assumes the shape shown in Figure 15.2. In this case, the inefficiency of a given unit is evaluated by the length
of the segment connecting the unit to the efficient frontier along the line passing through
the origin of the axes. For the example illustrated in Figure 15.2, the efficiency value of
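DMUA is given by the ratio between the distance from the origin to the point where this
line meets the efficient frontier and the distance from the origin to DMUA.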
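2. The CCR model
The CCR model computes the efficiency of DMUj by choosing the weights that solve the
fractional optimization problem

\[ \max_{u, v} \; \theta = \frac{\sum_{r \in K} u_r y_{rj}}{\sum_{i \in H} v_i x_{ij}} \]
\[ \text{subject to} \quad \frac{\sum_{r \in K} u_r y_{rk}}{\sum_{i \in H} v_i x_{ik}} \le 1, \quad k \in N, \]
\[ u_r, v_i \ge 0, \quad r \in K, \; i \in H. \]

The objective function involves the maximization of the efficiency measure for
DMUj. The constraints require that the efficiency values of all the units, calculated by
means of the weights system for the unit being examined, do not exceed one. Finally, the
last conditions guarantee that the weights associated with the inputs and the outputs are
nonnegative. In place of these conditions, sometimes the constraints ur, vi ≥ δ, r ∈ K, i ∈ H
may be applied, where δ > 0, preventing the unit from assigning a null weight to an input
or output. The model can be linearized by requiring the weighted sum of the inputs to take a
constant value, for example 1. This condition leads to an alternative optimization
problem, the input-oriented CCR model, where the objective function consists of the
maximization of the weighted sum of the outputs:

\[ \max_{u, v} \; \sum_{r \in K} u_r y_{rj} \]
\[ \text{subject to} \quad \sum_{i \in H} v_i x_{ij} = 1, \]
\[ \sum_{r \in K} u_r y_{rk} - \sum_{i \in H} v_i x_{ik} \le 0, \quad k \in N, \]
\[ u_r, v_i \ge 0, \quad r \in K, \; i \in H. \]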
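The linearized, input-oriented problem can be solved by hand in a small special case. The sketch below assumes one input and two outputs with strictly positive data: fixing the weighted input to 1 determines v, leaving a linear program in the two output weights, which is solved here by enumerating the vertices of the feasible region. This is an illustrative shortcut rather than a general LP solver, and the four units are hypothetical.

```python
from itertools import combinations

def ccr_efficiency(j, x, y1, y2, tol=1e-9):
    """Input-oriented CCR score of unit j (one input, two outputs).

    With v * x[j] = 1, each unit k gives u1*y1[k] + u2*y2[k] <= x[k] / x[j].
    The optimum of a bounded 2-variable LP lies at a vertex, so we test every
    intersection of two constraint lines and keep the best feasible one.
    """
    rows = [(y1[k], y2[k], x[k] / x[j]) for k in range(len(x))]
    rows += [(-1.0, 0.0, 0.0), (0.0, -1.0, 0.0)]  # u1 >= 0 and u2 >= 0
    best, best_u = 0.0, (0.0, 0.0)
    for (a1, a2, b), (c1, c2, d) in combinations(rows, 2):
        det = a1 * c2 - a2 * c1
        if abs(det) < tol:
            continue  # parallel lines do not define a vertex
        u1 = (b * c2 - a2 * d) / det  # Cramer's rule
        u2 = (a1 * d - b * c1) / det
        if all(r1 * u1 + r2 * u2 <= rb + tol for r1, r2, rb in rows):
            value = y1[j] * u1 + y2[j] * u2  # weighted sum of outputs
            if value > best:
                best, best_u = value, (u1, u2)
    return best, best_u

# Hypothetical units A, B, C, D.
x = [2.0, 2.0, 4.0, 1.0]
y1 = [4.0, 2.0, 4.0, 0.5]
y2 = [2.0, 4.0, 4.0, 0.5]
scores = [ccr_efficiency(j, x, y1, y2)[0] for j in range(len(x))]
print([round(s, 4) for s in scores])
```

In practice a linear programming library would be used instead; the enumeration only works because two weights remain free once the input weight is fixed.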
• preferences expressed by the decision makers with respect to a decrease in
some inputs or an increase in specific outputs.
3. Peer groups
Data envelopment analysis identifies for each inefficient unit a set of excellent
units, called a peer group, which includes those units that are efficient if evaluated with
the optimal system of weights of the inefficient unit. The peer group, made up of DMUs
which are characterized by operating methods similar to the inefficient unit being
examined, is a realistic term of comparison which the unit should aim to imitate in order
to improve its performance. The units included in the peer group of a given unit DMUj
may be identified from the solution to the CCR model for DMUj. Indeed, these correspond to the DMUs for
which the two sides of the constraints are equal, that is, the units DMUk, k ∈ N, such that

\[ \sum_{r \in K} u_r^* y_{rk} = \sum_{i \in H} v_i^* x_{ik}, \]

where u∗r and v∗i denote the optimal weights for DMUj.
Notice that within a peer group some excellent units may represent a more
reasonable term of comparison than others.
The relative importance of a unit belonging to a peer group depends on the value of
the corresponding variable λj in the optimal solution of the dual model.
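Once the optimal weights of an inefficient unit are known, its peer group can be read off by checking which units make the constraint hold with equality. A minimal sketch follows; the data and the optimal weights for unit C are assumed values, taken as given rather than computed here.

```python
def peer_group(u, v, X, Y, names, tol=1e-9):
    """Units whose weighted outputs equal their weighted inputs under (u, v)."""
    peers = []
    for name, x_k, y_k in zip(names, X, Y):
        weighted_out = sum(ur * yr for ur, yr in zip(u, y_k))
        weighted_in = sum(vi * xi for vi, xi in zip(v, x_k))
        if abs(weighted_out - weighted_in) <= tol:  # binding constraint
            peers.append(name)
    return peers

# Hypothetical units: one input, two outputs.
names = ["A", "B", "C", "D"]
X = [[2.0], [2.0], [4.0], [1.0]]
Y = [[4.0, 2.0], [2.0, 4.0], [4.0, 4.0], [0.5, 0.5]]

# Assumed optimal weights for the inefficient unit C (v * x_C = 1).
u_star = [1.0 / 12.0, 1.0 / 12.0]
v_star = [0.25]

peers_of_c = peer_group(u_star, v_star, X, Y, names)
print(peers_of_c)
```

With these weights units A and B reach an efficiency of one, so they form the peer group that C should try to imitate.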
The analysis of peer groups allows one to differentiate between really efficient
units and apparently efficient units for which the choice of an optimal system of weights
conceals some abnormal behavior. In order to draw this distinction, it is necessary to
consider the efficient units and to evaluate how often each belongs to a peer group. One
may reasonably expect that an efficient unit often included in the peer groups uses for the
evaluation of its own efficiency a robust weights structure. Conversely, if an efficient unit
rarely represents a term of comparison, its own system of optimal weights may appear
distorted, in the sense that it may implicitly reflect the specialization of the unit along a
particular dimension of analysis.
4. Identification of good operating practices
By identifying and sharing good operating practices, one may hope to achieve an
improvement in the performance of all units being compared. The units that appear
efficient according to data envelopment analysis certainly represent terms of comparison
and examples to be imitated for the other units. However, among efficient units some
more than others may represent a target to be reached in improving the efficiency.
The need to identify the efficient units, for the purpose of defining the best
operating practices, stems from the principle itself on which data envelopment analysis is
grounded, since it allows each unit to evaluate its own degree of efficiency by choosing
the most advantageous structure of weights for inputs and outputs. In this way, a unit
might appear efficient by purposely attributing a non-negligible weight only to a limited
subset of inputs and outputs. Furthermore, those inputs and outputs that receive greater
weights may be less critical than other factors more intimately connected to the primary
activity performed by the units being analyzed. In order to identify good operating
practices, it is therefore expedient to detect the units that are really efficient, that is, those
units whose efficiency score does not primarily depend on the system of weights selected.
To differentiate these units, we may resort to a combination of different methods:
1. cross-efficiency analysis,
2. evaluation of virtual inputs and virtual outputs,
3. weight restrictions.
4.1. Cross-efficiency analysis
The analysis of cross-efficiency is based on the definition of the efficiency matrix ,
which provides information on the nature of the weights system adopted by the units for
their own efficiency evaluation.
The square efficiency matrix contains as many rows and columns as there are units
being compared.
The generic element θij of the matrix represents the efficiency of DMUj evaluated
through the optimal weights structure for DMUi , while the element θjj provides the
efficiency of DMUj calculated using its own optimal weights. If DMUj is efficient (i.e.
if θjj = 1), although it exhibits a behavior specialized along a given dimension with
respect to the other units, the efficiency values in the column corresponding to DMUj
will be less than 1.
Two quantities of interest can be derived from the efficiency matrix. The first
represents the average efficiency of a unit with respect to the optimal weights systems for
the different units, obtained as the average of the values in the j th column.
The second is the average efficiency of a unit measured applying its optimal system
of weights to the other units. The latter is obtained by averaging the values in the row
associated with the unit being examined.
The difference between the efficiency score θjj of DMUj and the efficiency
obtained as the average of the values in the j th column provides an indication of how
much the unit relies on a system of weights conforming with the one used by the other
units in the evaluation process. If the difference between the two terms is significant,
DMUj may have chosen a structure of weights that is not shared by the other DMUs in
order to privilege the dimensions of analysis on which it appears particularly efficient.
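The efficiency matrix and the two averages derived from it can be sketched as follows. Element [i][j] is the efficiency of unit j under the optimal weights of unit i. The three units and their optimal weights are assumed values for illustration, not the output of a solver.

```python
def cross_efficiency(X, Y, U, V):
    """Matrix whose element [i][j] is the efficiency of unit j under unit i's weights."""
    n = len(X)

    def eff(weights_of, unit):
        out = sum(u * y for u, y in zip(U[weights_of], Y[unit]))
        inp = sum(v * x for v, x in zip(V[weights_of], X[unit]))
        return out / inp

    return [[eff(i, j) for j in range(n)] for i in range(n)]

# Hypothetical units A, B, C: one input, two outputs.
X = [[2.0], [2.0], [4.0]]
Y = [[4.0, 2.0], [2.0, 4.0], [4.0, 4.0]]
# Assumed optimal weights of each unit (output weights U, input weights V).
U = [[0.25, 0.0], [0.0, 0.25], [1.0 / 12.0, 1.0 / 12.0]]
V = [[0.5], [0.5], [0.25]]

matrix = cross_efficiency(X, Y, U, V)
n = len(matrix)

# Average efficiency of unit j under everybody's weights (j-th column).
column_avg = [sum(matrix[i][j] for i in range(n)) / n for j in range(n)]
# Average efficiency of all units under unit i's weights (i-th row).
row_avg = [sum(row) / n for row in matrix]
# Gap between a unit's own score and its column average.
gap = [matrix[j][j] - column_avg[j] for j in range(n)]
print(column_avg, row_avg, gap)
```

A large gap suggests that a unit's own score relies on weights the other units do not share.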
4.2. Virtual inputs and virtual outputs
Virtual inputs and virtual outputs provide information on the relative importance
that each unit attributes to each individual input and output, for the purpose of
maximizing its own efficiency score. Thus, they allow the specific competencies of each
unit to be identified, highlighting at the same time its weaknesses. The virtual inputs of a
DMU are defined as the product of the inputs used by the unit and the corresponding
optimal weights. Similarly, virtual outputs are given by the product of the outputs of the
unit and the associated optimal weights. Inputs and outputs for which the unit shows high
virtual scores provide an indication of the activities in which the unit being analyzed
appears particularly efficient. Notice that the CCR model admits in general multiple optimal
solutions, corresponding to which it is possible to obtain different combinations of virtual
inputs and virtual outputs. Two efficient units may yield high virtual values
corresponding to different combinations of inputs and outputs, showing good operating
practices in different contexts.
In this case, it might be convenient for each unit to follow the principles and
operating methods shown by the other, aiming at improving its own efficiency on a
specific dimension.
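Virtual inputs and outputs are simply the products of quantities and optimal weights. A minimal sketch, in which the unit's quantities and its optimal weights are hypothetical values taken as given:

```python
def virtual_values(quantities, weights):
    """Product of each input (or output) quantity and its optimal weight."""
    return [w * q for w, q in zip(weights, quantities)]

# Hypothetical unit with two inputs and two outputs, and assumed optimal weights.
inputs, v_star = [2.0, 10.0], [0.4, 0.02]
outputs, u_star = [4.0, 2.0], [0.2, 0.1]

virtual_inputs = virtual_values(inputs, v_star)
virtual_outputs = virtual_values(outputs, u_star)

# Share of each output in the unit's efficiency score.
score = sum(virtual_outputs)
output_shares = [vo / score for vo in virtual_outputs]
print(virtual_inputs, virtual_outputs, output_shares)
```

Here most of the score comes from the first output, which would indicate the activity in which this unit appears particularly efficient.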
4.3. Weight restrictions
To separate the units that are really efficient from those whose efficiency score
largely depends on the selected weights system, we may impose some restrictions on the
value of the weights to be associated with inputs and outputs. In general, these restrictions
translate into the definition of maximum thresholds for the weight of specific outputs or
minimum thresholds for the weight of specific inputs.
Notice that, despite possible restrictions on the weights, the units still enjoy a
certain flexibility in the choice of multiplicative factors for inputs and outputs. For this
reason it may be useful to resort to the evaluation of virtual inputs and virtual outputs in
order to identify the units with the most efficient operating practices with respect to the
usage of a specific input resource or to the production of a given output.
5. Other models
The CCR model is based on the hypothesis that the units being compared operate with
constant returns to scale.
Recall that the returns to scale express the variation in the quantity of outputs in
terms of variations in the quantity of inputs used.
• When the returns to scale are constant, if the inputs increase in a given
proportion then the outputs also increase in the same proportion.
• In particular, if X denotes the matrix of inputs used by the n units and Y denotes
the corresponding matrix of outputs, in the hypothesis of constant returns to scale
the production possibility set is

\[ P = \{ (x, y) : x \ge X\lambda, \; y \le Y\lambda, \; \lambda \ge 0 \}. \]

This means that if the point (x, y) belongs to P, then any other point of the form
(kx, ky), k > 0, will also belong to the production possibility set.
• If the hypothesis of constant returns to scale is not adequate, one may resort to
formulations other than the CCR model. For example, the Banker–Charnes–Cooper model is based
on the hypothesis of variable returns to scale.
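The scaling property of constant returns to scale can be checked numerically in the simplest setting. The sketch assumes a single input and a single output, where the production possibility set reduces to the points with y no larger than x times the best observed output/input ratio. The observed units are hypothetical.

```python
def max_output_crs(x, units):
    """Largest output attainable with input x under constant returns to scale."""
    best_ratio = max(y_k / x_k for x_k, y_k in units)
    return x * best_ratio

def in_production_set(x, y, units, tol=1e-12):
    """True if (x, y) belongs to the production possibility set P."""
    return 0 <= y <= max_output_crs(x, units) + tol

# Hypothetical observed units as (input, output) pairs.
units = [(2.0, 1.0), (4.0, 4.0), (5.0, 3.5)]

inside = in_production_set(3.0, 2.5, units)
# Scaling a feasible point by any k > 0 keeps it inside P.
scaled_inside = all(in_production_set(k * 3.0, k * 2.5, units) for k in (0.1, 2.0, 10.0))
# A point above the frontier is not feasible.
outside = in_production_set(3.0, 3.5, units)
print(inside, scaled_inside, outside)
```

Under variable returns to scale this scaling property no longer holds in general, which is what motivates alternative formulations.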
6.1. Clustering methods
• The aim of clustering models is to subdivide the records of a dataset into homogeneous
groups of observations, called clusters.
• Observations belonging to one group are similar to one another and dissimilar from
observations included in other groups.
• For example, grouping customers based on their purchase behaviors may reveal the
existence of a cluster corresponding to a market niche to which it might be appropriate to
address specific marketing actions for promotional purposes.
• In a retention analysis, a preliminary subdivision into clusters may be followed by
the development of distinct classification models, with the aim of identifying with greater
accuracy the customers characterized by a high probability of churning.
• Finally, grouping into clusters may prove useful in the course of exploratory data analysis
to highlight outliers and to identify an observation that might represent on its own an
entire cluster, in order to reduce the size of the dataset.
Clustering methods must fulfill a few general requirements, as indicated below.
– Flexibility
– Robustness
– Efficiency
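The idea of grouping similar observations can be sketched with k-means, one common clustering method (the text above does not prescribe a specific algorithm, so this choice is an assumption). The customer data are hypothetical, and the centroids are initialized naively with the first k points to keep the example deterministic.

```python
def kmeans(points, k, iterations=20):
    """Plain k-means on tuples of numbers, using squared Euclidean distance."""
    centroids = [points[i] for i in range(k)]  # naive deterministic start
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: each centroid moves to the mean of its cluster.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                new_centroids.append(tuple(sum(d) / len(cluster) for d in zip(*cluster)))
            else:
                new_centroids.append(old)  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return centroids, clusters

# Hypothetical customers: (purchases per month, average spend in euros)
customers = [(1, 20), (2, 22), (1, 25), (10, 200), (11, 210), (9, 190)]
centroids, clusters = kmeans(customers, k=2)
print(centroids)
print(clusters)
```

The features are left unscaled here for brevity, so the spend dimension dominates the distance; in practice the attributes would usually be standardized first.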
6.2. Outlier analysis
• An outlier, in mathematics, statistics and information technology, is a data
point that lies far outside the range in which most values of a data set fall.
• In other words, the outlier is distinct from other surrounding data points in a
particular way.
• Outlier analysis is extremely useful in various kinds of analytics and research,
some of it related to technologies and IT systems.
• The easiest way to detect outliers is to create a graph. Plots such as box plots,
scatterplots and histograms can help to detect outliers. Alternatively, we can use the
mean and standard deviation to list the outliers. The interquartile range and quartiles
can also be used to detect outliers.
• Outlier data points can represent either a) items that are so far outside the norm that
they need not be considered or b) the illustration of a very unique and singular
category or variable that is worth exploring either to capitalize on a niche or find an
area where an organization can offer a unique focus.
• When considering the use of Outlier analysis, a business should first think about
why they want to find the outliers and what they will do with that data. That focus
will help the business to select the right method of analysis, graphing or plotting to
reveal the results they need to see and understand.
• When considering the use of Outlier analysis, it is important to recognize that,
when the Outlier analysis is applied to certain datasets, the results will indicate that
outliers should be discounted, while in other cases, the outlier results will indicate
that the organization should focus solely on those outliers.
• For example, if an outlier indicates a risk or a mistake, that outlier should be
identified and the risk or mistake should be addressed. If an outlier indicates an
exceptional result, such as a person that recovered from a particular disease in spite
of the fact that most other patients did not survive, the organization will want to
perform further analysis on the outlier result to identify the unique aspects that may
be responsible for the patient’s recovery.
• When a business uses Outlier analysis, it is important to test the results and analyze
the overall dataset and environment to be sure that the presence of outliers does not
indicate that the dataset may be more complex than anticipated and may require a
different form of analysis.
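The two numerical rules mentioned in this section, one based on the mean and standard deviation and one based on the interquartile range, can be sketched with Python's standard `statistics` module. The thresholds (2 standard deviations, 1.5 times the interquartile range) are conventional choices assumed here, and the data are hypothetical.

```python
import statistics

def zscore_outliers(values, threshold=2.0):
    """Values more than `threshold` sample standard deviations from the mean."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)
    return [v for v in values if abs(v - mean) > threshold * std]

def iqr_outliers(values, k=1.5):
    """Values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical measurements with one extreme value.
data = [10, 12, 11, 13, 12, 11, 10, 12, 95]
by_zscore = zscore_outliers(data)
by_iqr = iqr_outliers(data)
print(by_zscore, by_iqr)
```

The extreme value inflates the standard deviation itself, so with a stricter threshold of 3 the first rule would miss it on this sample; the quartile-based rule is less sensitive to that effect.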
II. ILLUSTRATION OF CONCEPTS:
The company considered in this section operates in retail consumer electronics, and
wishes to segment its customer base in order to optimize marketing actions aimed at
promoting a specific product or group of products. The goal is therefore to develop a
predictive model able to assign to each customer a score that indicates her propensity to
respond positively to a cross-selling offer. Besides prediction purposes, the model should
be used also to interpret explanatory factors that have a greater effect on the purchase of
the product promoted. Finally, the model is to be used for assessing the existence of
causal and temporal correlations between the purchase of the product promoted and the
purchase of other items.
PART-A:
1. What is the purpose of DEA?
2. Define inputs and outputs in DEA.
3. Define decision-making units.
4. How is the efficiency of decision-making units measured?
5. Define the efficient frontier.
6. Define efficient and inefficient units.
7. Define the production possibility set.
8. Define input-oriented and output-oriented efficiency.
9. Define the CCR model.
10. Define peer group.
PART-B:
1. Explain data envelopment analysis.
2. What are the efficiency measures used in data envelopment analysis?
3. Explain the CCR model in detail.
4. How are good operating practices identified?
5. Explain peer groups in data envelopment analysis.
IV. ASSIGNMENT