Data Envelopment Analysis - Theory
DEA is a method of ranking various Decision Making Units (DMUs) on the basis of some overall measure of productivity or efficiency.
- It can accommodate multiple inputs and multiple outputs.
- It does not require specification or knowledge of a priori weights or prices for inputs or outputs.
- It does not require knowledge or specification of an implicit production function.
What is a production function? Suppose $X_1, X_2, \ldots, X_m$ are $m$ inputs and we have a single output $Y$. Then the following is a production function:

$$y = f(x_1, \ldots, x_m),$$

where $y$ is the maximum output possible when $x_1$ units of input 1, $x_2$ units of input 2, ..., and $x_m$ units of input $m$ are used. For example, suppose 1 stands for labor, 2 for capital, and $Y$ for the output in number of units produced. Then the Cobb-Douglas production function (assuming $X_1$ and $X_2$ are the only two relevant inputs) has the following form:
$$y = a x_1^{b} x_2^{1-b},$$

where $a, b > 0$ are constants and $b < 1$.
Some drawbacks of using the production-function approach are as follows.
1. It requires that a functional form first be specified (hypothesized), which is often very difficult to do.
2. In order to estimate the parameters of this functional relationship (e.g. the parameters $a$ and $b$ in the Cobb-Douglas production function), we need to use statistical methods such as multiple linear regression.
3. After estimating a production function for each output, there still remains the question of finding an overall mix of inputs that maximizes the DMU's welfare over all outputs.
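Drawback 2 can be made concrete with a minimal sketch. It assumes a two-input Cobb-Douglas form and uses synthetic, noise-free data generated inside the code itself (not data from any real study); the names `x1`, `x2`, `a_hat`, `b1_hat`, and `b2_hat` are illustrative. Taking logarithms turns the model into $\log y = \log a + b_1 \log x_1 + b_2 \log x_2$, which can be fitted by least squares:

```python
import numpy as np

# Synthetic inputs (illustrative values, not real data).
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # labor
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0])  # capital

# Outputs generated from a known Cobb-Douglas function y = a * x1^b * x2^(1-b).
a_true, b_true = 2.0, 0.3
y = a_true * x1 ** b_true * x2 ** (1.0 - b_true)

# Design matrix for log y = log a + b1 * log x1 + b2 * log x2.
design = np.column_stack([np.ones_like(x1), np.log(x1), np.log(x2)])
coef, *_ = np.linalg.lstsq(design, np.log(y), rcond=None)

a_hat = float(np.exp(coef[0]))
b1_hat, b2_hat = float(coef[1]), float(coef[2])
print(a_hat, b1_hat, b2_hat)
```

Because the data are noise-free, the fit recovers the generating parameters up to rounding. With real data the estimates depend on whether the hypothesized functional form is right, which is exactly drawback 1.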
In contrast, DEA:
- Identifies the efficiency of individual DMUs rather than population averages, as in regression.
- Focuses on revealed best-practice frontiers rather than on central-tendency-related properties of frontiers.
- Can identify the changes in inputs and/or outputs needed to project a DMU that lies below the efficient frontier onto the efficient frontier.
- Can accommodate variable returns to scale (explained later).
Notation
- $n$ DMUs, indexed $j = 1, \ldots, n$
- $m$ inputs, indexed $i = 1, \ldots, m$
- $s$ outputs, indexed $r = 1, \ldots, s$
- For the $j$th DMU, the (column) vector $X_j = \{x_{ij}\}$ denotes inputs and the (column) vector $Y_j = \{y_{rj}\}$ denotes outputs
- $X$ is an $m \times n$ matrix of inputs
- $Y$ is an $s \times n$ matrix of outputs
- $X_j$ ($Y_j$) is the $j$th column of $X$ (resp. $Y$)
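As a small illustration of this notation, the sketch below stores a hypothetical data set (three DMUs, two inputs, one output; the numbers are made up for illustration) as NumPy arrays with one column per DMU:

```python
import numpy as np

# Hypothetical data: n = 3 DMUs, m = 2 inputs, s = 1 output.
# Column j of X (resp. Y) holds the inputs (resp. outputs) of DMU j.
X = np.array([[2.0, 4.0, 4.0],
              [4.0, 2.0, 4.0]])   # shape (m, n)
Y = np.array([[1.0, 1.0, 1.0]])   # shape (s, n)

m, n = X.shape
s = Y.shape[0]

j = 2            # zero-based index of the third DMU
X_j = X[:, j]    # input vector of DMU j
Y_j = Y[:, j]    # output vector of DMU j
print(m, n, s, X_j, Y_j)
```

Note that Python indexes from zero, so column `j = 2` corresponds to the third DMU in the slide's one-based notation.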
The Fundamental Problem
Consider a particular DMU that we will denote by index $o$. Suppose there exist multipliers $u_r$ and $v_i$ that allow us to convert the inputs and outputs of DMU $o$ into a single "virtual" input and a single "virtual" output. Then its efficiency will be

$$h_o(u, v) = \frac{\sum_{r=1}^{s} u_r y_{ro}}{\sum_{i=1}^{m} v_i x_{io}}.$$
The fundamental problem is that no two DMUs will agree on the same set of multipliers to use. Why?
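One way to see the disagreement is the sketch below, which evaluates the efficiency ratio for two hypothetical DMUs under two different sets of input multipliers. The data and multiplier values are invented for illustration; the point is only that the ranking flips with the weights, so each DMU prefers the set that favors it.

```python
def efficiency(u, v, y, x):
    """Ratio of virtual output to virtual input for one DMU."""
    virtual_output = sum(ur * yr for ur, yr in zip(u, y))
    virtual_input = sum(vi * xi for vi, xi in zip(v, x))
    return virtual_output / virtual_input

# Two hypothetical DMUs with two inputs and one output each.
dmu_a = {"x": [2.0, 4.0], "y": [1.0]}
dmu_b = {"x": [4.0, 2.0], "y": [1.0]}

u = [1.0]               # single output multiplier
v_first = [1.0, 3.0]    # weights input 2 heavily
v_second = [3.0, 1.0]   # weights input 1 heavily

h_a_first = efficiency(u, v_first, dmu_a["y"], dmu_a["x"])
h_b_first = efficiency(u, v_first, dmu_b["y"], dmu_b["x"])
h_a_second = efficiency(u, v_second, dmu_a["y"], dmu_a["x"])
h_b_second = efficiency(u, v_second, dmu_b["y"], dmu_b["x"])

print(h_a_first, h_b_first)     # B looks better under the first weights
print(h_a_second, h_b_second)   # A looks better under the second weights
```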
DEA allows each DMU to choose its own weights, so long as no DMU's efficiency exceeds 1 when the same weights are applied to it. This amounts to comparing each DMU against an appropriate peer from the remaining DMUs. DMU $o$'s problem can then be written as follows:
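$$\max_{u, v} \; \frac{\sum_{r=1}^{s} u_r y_{ro}}{\sum_{i=1}^{m} v_i x_{io}}$$

Subject to:

$$\frac{\sum_{r=1}^{s} u_r y_{rj}}{\sum_{i=1}^{m} v_i x_{ij}} \le 1 \quad \text{for all } j = 1, \ldots, n,$$

$$\frac{u_r}{\sum_{i=1}^{m} v_i x_{io}} \ge \varepsilon \quad \text{for all } r = 1, \ldots, s,$$

$$\frac{v_i}{\sum_{i=1}^{m} v_i x_{io}} \ge \varepsilon \quad \text{for all } i = 1, \ldots, m.$$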
$\varepsilon$ is the infinitesimal constant, and the last two constraints bound the multiplier values away from zero in order to prevent division by zero. The above formulation is due to Charnes, Cooper and Rhodes and is called the CCR-IR model, where I indicates its input orientation and R denotes that it is a ratio form (there are several other forms of DEA; we will look at one more).

This formulation has an infinite number of optimal solutions: if $(u, v)$ is a pair of optimal multiplier vectors, then $(\alpha u, \alpha v)$ is also optimal for any $\alpha > 0$. This problem can be overcome by choosing a representative form, for example by choosing the $v_i$ such that $\sum_i v_i x_{io} = 1$. Now, we get
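$$\max_{\mu, v} \; \omega_o = \sum_{r=1}^{s} \mu_r y_{ro}$$

Subject to:

$$\sum_{i=1}^{m} v_i x_{io} = 1,$$

$$\sum_{r=1}^{s} \mu_r y_{rj} - \sum_{i=1}^{m} v_i x_{ij} \le 0 \quad \text{for all } j = 1, \ldots, n,$$

$$\mu_r \ge \varepsilon \quad \text{for all } r = 1, \ldots, s,$$

$$v_i \ge \varepsilon \quad \text{for all } i = 1, \ldots, m.$$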
We can also use the dual of the above linear program.
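The normalized (multiplier-form) linear program can be solved with any LP solver. The sketch below uses `scipy.optimize.linprog` on a hypothetical data set of three DMUs with two inputs and one output. The function name `ccr_input_efficiency`, the data, and the value `eps = 1e-6` (a numerical stand-in for the infinitesimal constant) are assumptions made for illustration only.

```python
import numpy as np
from scipy.optimize import linprog

def ccr_input_efficiency(X, Y, o, eps=1e-6):
    """Solve the normalized CCR multiplier LP for DMU o (zero-based index).

    X has shape (m, n) and Y has shape (s, n); columns are DMUs.
    Decision variables are ordered as (mu_1..mu_s, v_1..v_m).
    """
    m, n = X.shape
    s = Y.shape[0]
    # linprog minimizes, so negate the objective sum_r mu_r * y_ro.
    c = np.concatenate([-Y[:, o], np.zeros(m)])
    # One row per DMU j: sum_r mu_r y_rj - sum_i v_i x_ij <= 0.
    A_ub = np.hstack([Y.T, -X.T])
    b_ub = np.zeros(n)
    # Normalization: sum_i v_i x_io = 1.
    A_eq = np.concatenate([np.zeros(s), X[:, o]]).reshape(1, -1)
    b_eq = np.array([1.0])
    bounds = [(eps, None)] * (s + m)  # mu_r >= eps, v_i >= eps
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return -res.fun

# Hypothetical data: DMUs A, B, C with two inputs and one unit of output each.
X = np.array([[2.0, 4.0, 4.0],
              [4.0, 2.0, 4.0]])
Y = np.array([[1.0, 1.0, 1.0]])

scores = [ccr_input_efficiency(X, Y, o) for o in range(X.shape[1])]
print(scores)
```

In this toy data set, DMUs A and B lie on the frontier, while C uses (4, 4) where a mix of A and B would need only (3, 3), so its score should come out near 0.75.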
The Additive Model
For each DMU $o$, we associate a peer, denoted $(X_j, Y_j)$, whose inputs and outputs are a convex combination of the inputs and outputs of all $n$ DMUs. In particular, $y_{rj} = \sum_{p=1}^{n} \lambda_p y_{rp}$ and $x_{ij} = \sum_{p=1}^{n} \lambda_p x_{ip}$. Clearly, $0 \le \lambda_p \le 1$ is the weight assigned to the $p$th DMU, and $\sum_p \lambda_p = 1$. We are interested in finding a set of weights $\lambda_p$ for which the sum of rectilinear distances from $(Y_o, X_o)$ is maximized. If, upon doing so, we find the sum of distances to be zero, we know that DMU $o$ is efficient (lies on the envelopment surface).

Advantages: the formulation always has a feasible solution, and it accommodates variable returns to scale.
Noticing that $X_j$ and $Y_j$ above are obtained by multiplying the matrices $X$ and $Y$ by the column vector $\lambda$ (i.e. the weights $\lambda_p$), the vector of linear distances on the output side is

$$s^+ = Y\lambda - Y_o,$$

and on the input side is

$$s^- = X_o - X\lambda.$$

In order to identify inefficiencies, we wish to maximize the sum of linear distances in $s^+$ and $s^-$ or, equivalently, to minimize the negative of that sum.
Additive Model Formulation
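$$\min_{\lambda, s^+, s^-} \; -\mathbf{1} s^+ - \mathbf{1} s^-$$

Subject to:

$$Y\lambda - s^+ = Y_o,$$

$$-X\lambda - s^- = -X_o,$$

$$\mathbf{1}\lambda = 1,$$

$$\lambda, s^+, s^- \ge 0.$$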
When $s^+$ and $s^-$ are zero for a particular DMU, it lies on the envelopment surface (is efficient). Otherwise, the magnitudes of $s^+$ and $s^-$ indicate the rectilinear distances from the most efficient peer and provide guidance about which changes in inputs and outputs can move the chosen DMU towards the envelopment surface.
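The additive model can be sketched numerically in the same way. The code below assumes `scipy.optimize.linprog` as the solver and uses a hypothetical data set (three DMUs, two inputs, one output); the function name `additive_total_slack` and the variable ordering are illustrative choices.

```python
import numpy as np
from scipy.optimize import linprog

def additive_total_slack(X, Y, o):
    """Return the maximal sum of slacks (1 s+ + 1 s-) for DMU o.

    Decision variables are ordered as (lambda (n), s_plus (s), s_minus (m)).
    """
    m, n = X.shape
    s = Y.shape[0]
    # Minimize -1 s+ - 1 s-.
    c = np.concatenate([np.zeros(n), -np.ones(s), -np.ones(m)])
    A_eq = np.block([
        [Y, -np.eye(s), np.zeros((s, m))],                       # Y lam - s+ = Y_o
        [-X, np.zeros((m, s)), -np.eye(m)],                      # -X lam - s- = -X_o
        [np.ones((1, n)), np.zeros((1, s)), np.zeros((1, m))],   # 1 lam = 1
    ])
    b_eq = np.concatenate([Y[:, o], -X[:, o], [1.0]])
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return -res.fun

# Hypothetical data: DMUs A, B, C with two inputs and one unit of output each.
X = np.array([[2.0, 4.0, 4.0],
              [4.0, 2.0, 4.0]])
Y = np.array([[1.0, 1.0, 1.0]])

slacks = [additive_total_slack(X, Y, o) for o in range(X.shape[1])]
print(slacks)
```

For this toy data, A and B should have zero total slack (efficient), while C can reduce its two inputs by a combined 2 units by moving to a convex combination of A and B.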
The dual representation of the additive model above is
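$$\max_{\mu, \nu, u_o} \; \mu^T Y_o - \nu^T X_o + u_o$$

Subject to:

$$\mu^T Y - \nu^T X + u_o \mathbf{1} \le 0,$$

$$\mu^T \ge \mathbf{1}, \quad \nu^T \ge \mathbf{1}.$$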
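As a hedged numerical check of the dual, the sketch below solves it with `scipy.optimize.linprog` for a hypothetical data set (three DMUs, two inputs, one output). The function name `additive_dual_value` and the data are assumptions for illustration. By LP duality, the optimal dual value should equal the optimal value of the additive model's minimization, that is, the negative of the total slack.

```python
import numpy as np
from scipy.optimize import linprog

def additive_dual_value(X, Y, o):
    """Optimal value of the additive model's dual for DMU o.

    Decision variables are ordered as (mu (s), nu (m), u_o).
    """
    m, n = X.shape
    s = Y.shape[0]
    # Maximize mu^T Y_o - nu^T X_o + u_o, i.e. minimize its negative.
    c = -np.concatenate([Y[:, o], -X[:, o], [1.0]])
    # One row per DMU j: mu^T Y_j - nu^T X_j + u_o <= 0.
    A_ub = np.hstack([Y.T, -X.T, np.ones((n, 1))])
    b_ub = np.zeros(n)
    # mu >= 1 and nu >= 1 componentwise; u_o is unrestricted in sign.
    bounds = [(1.0, None)] * (s + m) + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return -res.fun

# Hypothetical data: DMUs A, B, C with two inputs and one unit of output each.
X = np.array([[2.0, 4.0, 4.0],
              [4.0, 2.0, 4.0]])
Y = np.array([[1.0, 1.0, 1.0]])

dual_values = [additive_dual_value(X, Y, o) for o in range(X.shape[1])]
print(dual_values)
```

A dual value of zero indicates an efficient DMU; a negative value has the same magnitude as the total slack found by the primal.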
Constant and Variable Returns to Scale
Returns to scale refers to how the level of output changes when all inputs are increased (varied) simultaneously and in the same proportion, i.e., when the production process is expanded exactly to scale. Under constant returns to scale, when all input factors are increased proportionally, the output increases in the same proportion. Alternatively, output could increase more or less than proportionally (when scale economies or diseconomies are present), leading to increasing or decreasing returns to scale, respectively. Production functions with constant returns to scale are homogeneous of degree one.
The Cobb-Douglas production function is homogeneous, as the following argument shows. Suppose the inputs are increased from levels $x_1$ and $x_2$ to $t x_1$ and $t x_2$. Then the output is given by

$$y = a t^{b} x_1^{b} \, t^{1-b} x_2^{1-b} = a x_1^{b} x_2^{1-b} \, t.$$

Thus, the output also increases by a factor of $t$.
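The same argument can be checked numerically. In the sketch below the parameter values ($a = 2$, $b = 0.5$), the input levels, and the scale factor $t = 3$ are arbitrary illustrative choices:

```python
def cobb_douglas(x1, x2, a=1.0, b=0.5):
    """Cobb-Douglas output y = a * x1**b * x2**(1 - b), with a > 0 and 0 < b < 1."""
    return a * x1 ** b * x2 ** (1.0 - b)

t = 3.0
base = cobb_douglas(4.0, 9.0, a=2.0, b=0.5)
scaled = cobb_douglas(t * 4.0, t * 9.0, a=2.0, b=0.5)

ratio = scaled / base
print(base, scaled, ratio)  # the ratio should equal t up to rounding
```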
Practical Considerations
- LP solvers are widely available and are getting cheaper and faster.
- Many applications of DEA have been documented.
- Several software products are available, some of them as free downloads.