Dea
ERWIN KALVELAGEN
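\[
\begin{aligned}
\max_{u,v}\quad & \theta_0 = \frac{\sum_k u_k\, y_{k,j_0}}{\sum_i v_i\, x_{i,j_0}} \\
\text{subject to}\quad & \frac{\sum_k u_k\, y_{k,j}}{\sum_i v_i\, x_{i,j}} \le 1 \quad \forall j \\
& u_k,\ v_i \ge 0
\end{aligned}
\tag{2}
\]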
weights vi (often weights are normalized to add up to one; this can be considered
as a slightly more complex normalization). This results in:
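\[
\begin{aligned}
\max_{u,v}\quad & \sum_k u_k\, y_{k,j_0} \\
\text{subject to}\quad & \sum_i v_i\, x_{i,j_0} = 1 \\
& \sum_k u_k\, y_{k,j} \le \sum_i v_i\, x_{i,j} \quad \forall j \\
& u_k,\ v_i \ge 0
\end{aligned}
\tag{3}
\]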
Note that x and y are not decision variables but data. The decision
variables are the weights u and v.
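The step from (2) to (3) can be spelled out as follows. This is a sketch that assumes the denominators are strictly positive; the scaling factor $t$ is notation introduced here. The ratio in (2) does not change when $(u,v)$ is replaced by $(tu,tv)$ with $t>0$, so the scale can be fixed such that the denominator of the objective equals one:

```latex
\[
\frac{\sum_k (t\,u_k)\, y_{k,j_0}}{\sum_i (t\,v_i)\, x_{i,j_0}}
  = \frac{\sum_k u_k\, y_{k,j_0}}{\sum_i v_i\, x_{i,j_0}},
\qquad
t = \frac{1}{\sum_i v_i\, x_{i,j_0}}
\;\Longrightarrow\;
\sum_i (t\,v_i)\, x_{i,j_0} = 1 .
\]
\[
\frac{\sum_k u_k\, y_{k,j}}{\sum_i v_i\, x_{i,j}} \le 1
\;\Longleftrightarrow\;
\sum_k u_k\, y_{k,j} \le \sum_i v_i\, x_{i,j}
\qquad \text{when } \sum_i v_i\, x_{i,j} > 0 .
\]
```

With this choice the objective becomes linear, and multiplying each ratio constraint by its denominator gives the linear constraints of (3).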
In some places [7] the dual has been mentioned as being preferable from a computational point of view (typical primal models have many more rows than columns).
The dual DEA model can be stated as:
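\[
\begin{aligned}
\min\quad & z_0 = \theta_{j_0} \\
\text{subject to}\quad & \sum_j \lambda_j\, y_{k,j} \ge y_{k,j_0} \quad \forall k \\
& \theta_{j_0}\, x_{i,j_0} \ge \sum_j \lambda_j\, x_{i,j} \quad \forall i \\
& \lambda_j \ge 0
\end{aligned}
\tag{4}
\]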
Other forms of the DEA model have been proposed. The model we discussed
above is called the CCR model after the authors of [3]. Some variants set a lower
bound on $u_k$ and $v_i$ to prevent zero weights: $u_k, v_i \ge \varepsilon$. Another basic model
is the BCC model [1]. This model is based on the dual, and adds a restriction on
the $\lambda$'s:
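\[
\begin{aligned}
\min\quad & z_0 = \theta_{j_0} \\
\text{subject to}\quad & \sum_j \lambda_j\, y_{k,j} \ge y_{k,j_0} \quad \forall k \\
& \theta_{j_0}\, x_{i,j_0} \ge \sum_j \lambda_j\, x_{i,j} \quad \forall i \\
& \sum_j \lambda_j = 1 \\
& \lambda_j \ge 0
\end{aligned}
\tag{5}
\]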
This transforms the model from being constant returns-to-scale to variable
returns-to-scale. The scores from this model are sometimes called pure technical
efficiency scores as they eliminate scale-efficiency from the analysis [2, 17].
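A brief illustration of the distinction, in the notation of (4) and (5). Because (5) is (4) with one extra constraint, the BCC score can never be smaller than the CCR score, and their ratio is commonly reported as scale efficiency. The superscripts, the symbol $SE$ and the numbers below are introduced here for illustration only.

```latex
\[
\theta^{CCR}_{j_0} \;\le\; \theta^{BCC}_{j_0} \;\le\; 1,
\qquad
SE_{j_0} = \frac{\theta^{CCR}_{j_0}}{\theta^{BCC}_{j_0}} .
\]
\[
\text{Example: } \theta^{CCR}_{j_0} = 0.60,\quad \theta^{BCC}_{j_0} = 0.80
\;\Longrightarrow\; SE_{j_0} = 0.75 .
\]
```

In this hypothetical case the DMU would be 80% efficient given its scale, with the remaining shortfall attributed to operating at an unfavourable scale.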
2. GAMS implementation
We have to repeat the solution of the DEA LP model for every DMU. In GAMS
this is coded quite easily using a loop:
Model dea.gms (http://amsterdamoptimization.com/models/dea/dea.gms).
$ontext
   Data Envelopment Analysis (DEA) example

   Erwin Kalvelagen, may 2002

   Data from:
      Emrouznejad, A (1995-2001),
      "Ali Emrouznejad's DEA HomePage",
      Warwick Business School, Coventry CV4 7AL, UK
$offtext

sets i       "DMUs"               /Depot1*Depot20/
     j       'inputs and outputs' /stock, wages, issues, receipts, reqs/
     inp(j)  'inputs'             /stock, wages/
     outp(j) 'outputs'            /issues, receipts, reqs/
;

Table data(i,j)
          stock  wages  issues  receipts  reqs
Depot1     3      5      40      55        30
Depot2     2.5    4.5    45      50        40
Depot3     4      6      55      45        30
Depot4     6      7      48      20        60
Depot5     2.3    3.5    28      50        25
Depot6     4      6.5    48      20        65
Depot7     7      10     80      65        57
Depot8     4.4    6.4    25      48        30
Depot9     3      5      45      64        42
Depot10    5      7      70      65        48
Depot11    5      7      45      65        40
Depot12    2      4      45      40        44
Depot13    5      7      65      25        35
Depot14    4      4      38      18        64
Depot15    2      3      20      50        15
Depot16    3      6      38      20        60
Depot17    7      11     68      64        54
Depot18    4      6      25      38        20
Depot19    3      4      45      67        32
Depot20    3      6      57      60        40
;

parameter
   x0(inp)    'inputs of DMU j0'
   y0(outp)   'outputs of DMU j0'
   x(inp,i)   'inputs of DMU i'
   y(outp,i)  'outputs of DMU i'
;

positive variables
   v(inp)   'input weights'
   u(outp)  'output weights'
;
variable
   eff      'efficiency'
;

equations
   objective
   normalize
   limit(i)
;

* model (3) for the DMU whose data is currently stored in x0 and y0
objective..  eff =e= sum(outp, u(outp)*y0(outp));
normalize..  sum(inp, v(inp)*x0(inp)) =e= 1;
limit(i)..   sum(outp, u(outp)*y(outp,i)) =l= sum(inp, v(inp)*x(inp,i));

model dea /objective, normalize, limit/;

alias (i,iter);
x(inp,i) = data(i,inp);
y(outp,i) = data(i,outp);

parameter efficiency(i) 'efficiency of each DMU';

* solve one small LP per DMU
loop(iter,
   x0(inp) = x(inp, iter);
   y0(outp) = y(outp, iter);
   solve dea using lp maximizing eff;
   abort$(dea.modelstat<>1) "LP was not optimal";
   efficiency(iter) = eff.l;
);
display efficiency;

*
* create sorted output
*
set r /rnk1*rnk1000/;
parameter rank(i);
alias (i,ii);
rank(i) = sum(ii$(efficiency(ii)>=efficiency(i)), 1);
parameter efficiency2(r,i);
efficiency2(r,i)=efficiency(i)$(rank(i)=ord(r));
option efficiency2:4:0:1;
display efficiency2;
rnk4 .Depot12 1.0000,    rnk4 .Depot14 1.0000,    rnk4 .Depot15 1.0000
rnk4 .Depot19 1.0000,    rnk5 .Depot9  0.9634,    rnk6 .Depot20 0.9517
rnk7 .Depot5  0.9466,    rnk8 .Depot2  0.9417,    rnk9 .Depot16 0.9091
rnk10.Depot10 0.8889,    rnk11.Depot13 0.8254,    rnk12.Depot6  0.8228
rnk13.Depot1  0.8204,    rnk14.Depot3  0.8148,    rnk15.Depot7  0.7111
rnk16.Depot4  0.6528,    rnk17.Depot11 0.6313,    rnk18.Depot17 0.5495
rnk19.Depot8  0.5169,    rnk20.Depot18 0.4201
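The rank labels in this output follow from the counting rule `rank(i) = sum(ii$(efficiency(ii)>=efficiency(i)), 1)`: the rank of a DMU is the number of DMUs that score at least as high, so tied DMUs share the largest rank of their group. That is why the four efficient depots all appear under rnk4. A minimal sketch on hypothetical scores (the set elements and numbers are made up for illustration):

```gams
* Hypothetical scores, used only to show how ties are ranked
set i /a, b, c, d/;
alias (i,ii);
parameter score(i) / a 1.0, b 0.8, c 1.0, d 0.5 /;

* rank = number of elements scoring at least as high
parameter rank(i);
rank(i) = sum(ii$(score(ii) >= score(i)), 1);
display rank;
```

Here a and c both receive rank 2, b receives rank 3 and d receives rank 4.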
Model dea2.gms (http://amsterdamoptimization.com/models/dea/dea2.gms).
$ontext
   Data Envelopment Analysis (DEA) example
   Indexed equations formulation.
$offtext

sets i       "DMUs"               /Depot1*Depot20/
     j       'inputs and outputs' /stock, wages, issues, receipts, reqs/
     inp(j)  'inputs'             /stock, wages/
     outp(j) 'outputs'            /issues, receipts, reqs/
;
set j0(i) 'current DMU';

Table data(i,j)
          stock  wages  issues  receipts  reqs
Depot1     3      5      40      55        30
Depot2     2.5    4.5    45      50        40
Depot3     4      6      55      45        30
Depot4     6      7      48      20        60
Depot5     2.3    3.5    28      50        25
Depot6     4      6.5    48      20        65
Depot7     7      10     80      65        57
Depot8     4.4    6.4    25      48        30
Depot9     3      5      45      64        42
Depot10    5      7      70      65        48
Depot11    5      7      45      65        40
Depot12    2      4      45      40        44
Depot13    5      7      65      25        35
Depot14    4      4      38      18        64
Depot15    2      3      20      50        15
Depot16    3      6      38      20        60
Depot17    7      11     68      64        54
Depot18    4      6      25      38        20
Depot19    3      4      45      67        32
Depot20    3      6      57      60        40
;

parameter
   x(inp,i)   'inputs of DMU i'
   y(outp,i)  'outputs of DMU i'
;

positive variables
   v(inp)   'input weights'
   u(outp)  'output weights'
;
variable
   eff      'efficiency'
;

equations
   objective(i)
   normalize(i)
   limit(i)
;

* declared over the static set i, defined over the dynamic set j0
objective(j0)..  eff =e= sum(outp, u(outp)*y(outp,j0));
normalize(j0)..  sum(inp, v(inp)*x(inp,j0)) =e= 1;
limit(i)..       sum(outp, u(outp)*y(outp,i)) =l= sum(inp, v(inp)*x(inp,i));

model dea /objective, normalize, limit/;

alias(i,iter);
x(inp,i) = data(i,inp);
y(outp,i) = data(i,outp);

parameter efficiency(i) 'efficiency of each DMU';

loop(iter,
*
* set j0 is the current DMU
*
   j0(i) = no;
   j0(iter) = yes;
   solve dea using lp maximizing eff;
   abort$(dea.modelstat<>1) "LP was not optimal";
   efficiency(iter) = eff.l;
);
display efficiency;

*
* create sorted output
*
set r /rnk1*rnk1000/;
parameter rank(i);
alias (i,ii);
rank(i) = sum(ii$(efficiency(ii)>=efficiency(i)), 1);
parameter efficiency2(r,i);
efficiency2(r,i)=efficiency(i)$(rank(i)=ord(r));
option efficiency2:4:0:1;
display efficiency2;
Note that the set j0 is a dynamic set. The equations are therefore declared over
the set i, which is a static set. We then define the equations over the set j0 which
will be calculated inside the loop.
GAMS protects the modeler by forbidding the loop set to be used in equations.
However that is exactly what we need here. To work around this, we use a different
loop set iter and calculate the set j0 inside the loop.
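The pattern can be isolated in a few lines. The sketch below is a hypothetical toy model (the set elements, the parameter `w` and the single equation are made up) that only demonstrates the mechanics: an equation declared over the static set, defined over a dynamic subset, and a loop over an alias that repopulates the subset before each solve.

```gams
* Toy illustration of the dynamic-set workaround; all data are hypothetical
set i /a, b, c/;
alias (i, iter);
set j0(i) 'current element (dynamic set)';
parameter w(i) / a 1, b 2, c 3 /;
parameter result(i);

variable z;
equation def(i);

* declared over i, but only generated for the members of j0
def(j0)..  z =e= w(j0);

model m /def/;

loop(iter,
   j0(i)    = no;
   j0(iter) = yes;
   solve m using lp maximizing z;
   result(iter) = z.l;
);
display result;
```

Each pass through the loop generates exactly one instance of `def`, namely the one for the element currently in `j0`.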
3. Performance issues
The LPs in the model are all very small: 22 equations and 6 variables. Nevertheless GAMS will get slow if the number of DMUs gets large. Part of it we can
easily fix: the large amount of data written to the listing file. This can be reduced
to a minimum by the following statements:
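A minimal sketch of such statements, assuming the standard GAMS listing controls `limrow`, `limcol` and `solprint` (the author's exact statements are not shown here):

```gams
* suppress the equation listing, the column listing and the solution report
option limrow = 0;
option limcol = 0;
option solprint = off;
```

These options are placed before the loop, so that they apply to every solve inside it.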
This will speed up GAMS but as the loop unfolds, GAMS may still become
unbearably slow. Basically, GAMS has too much overhead in solving very small
models in a loop. We can alleviate this by folding several small LPs into one. For
the model above, we can solve the whole thing in one swoop. Say a single model
for DMU i has the standard LP format:
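\[
\begin{aligned}
\max_{x_i}\quad & c_i^T x_i \\
& A_i x_i = b_i \\
& \ell_i \le x_i \le u_i
\end{aligned}
\tag{6}
\]
then the combined model can be written as
\[
\begin{aligned}
\max_{x}\quad & \sum_i c_i^T x_i \\
& A x = b \\
& \ell \le x \le u
\end{aligned}
\tag{7}
\]
where $x = \begin{pmatrix} x_1^T & x_2^T & \dots & x_n^T \end{pmatrix}^T$ and
\[
A = \begin{pmatrix}
A_1 &     &        &     \\
    & A_2 &        &     \\
    &     & \ddots &     \\
    &     &        & A_n
\end{pmatrix}
\tag{8}
\]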
Model dea3.gms (http://amsterdamoptimization.com/models/dea/dea3.gms).
$ontext
   Data Envelopment Analysis (DEA) example
   One LP formulation.

   Erwin Kalvelagen, may 2002

   Data from:
      Emrouznejad, A (1995-2001),
      "Ali Emrouznejad's DEA HomePage",
      Warwick Business School, Coventry CV4 7AL, UK
$offtext

sets i       "DMUs"               /Depot1*Depot20/
     j       'inputs and outputs' /stock, wages, issues, receipts, reqs/
     inp(j)  'inputs'             /stock, wages/
     outp(j) 'outputs'            /issues, receipts, reqs/
;
alias (i,j0);

Table data(i,j)
          stock  wages  issues  receipts  reqs
Depot1     3      5      40      55        30
Depot2     2.5    4.5    45      50        40
Depot3     4      6      55      45        30
Depot4     6      7      48      20        60
Depot5     2.3    3.5    28      50        25
Depot6     4      6.5    48      20        65
Depot7     7      10     80      65        57
Depot8     4.4    6.4    25      48        30
Depot9     3      5      45      64        42
Depot10    5      7      70      65        48
Depot11    5      7      45      65        40
Depot12    2      4      45      40        44
Depot13    5      7      65      25        35
Depot14    4      4      38      18        64
Depot15    2      3      20      50        15
Depot16    3      6      38      20        60
Depot17    7      11     68      64        54
Depot18    4      6      25      38        20
Depot19    3      4      45      67        32
Depot20    3      6      57      60        40
;

parameter
   x(inp,i)   'inputs of DMU i'
   y(outp,i)  'outputs of DMU i'
;

positive variables
   v(inp,j0)   'input weights'
   u(outp,j0)  'output weights'
;
variable
   eff(j0)
   totaleff
;

equations
   objective(j0)
   normalize(j0)
   limit(i,j0)
   totalobj
;

* one copy of model (3) per DMU j0, tied together by a summed objective
totalobj..       totaleff =e= sum(j0, eff(j0));
objective(j0)..  eff(j0) =e= sum(outp, u(outp,j0)*y(outp,j0));
normalize(j0)..  sum(inp, v(inp,j0)*x(inp,j0)) =e= 1;
limit(i,j0)..    sum(outp, u(outp,j0)*y(outp,i)) =l= sum(inp, v(inp,j0)*x(inp,i));

model dea /totalobj, objective, normalize, limit/;

x(inp,i) = data(i,inp);
y(outp,i) = data(i,outp);

solve dea using lp maximizing totaleff;

*
* create sorted output
*
set r /rnk1*rnk1000/;
parameter rank(i);
alias (i,ii);
rank(i) = sum(ii$(eff.l(ii)>=eff.l(i)), 1);
parameter efficiency2(r,i);
efficiency2(r,i)=eff.l(i)$(rank(i)=ord(r));
option efficiency2:4:0:1;
display efficiency2;
This model has just 441 equations and 121 variables, so it is still very small by
current standards. In this case, the one-LP formulation solves much faster than
twenty little models, one for each DMU. (We note that the actual
matrix being generated is not block-diagonal, but rather permuted block-diagonal:
after some simple row and column swaps the matrix can be made block-diagonal.)
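These counts can be checked against the declarations of the one-LP formulation with 20 DMUs, 2 inputs and 3 outputs (a quick tally, not GAMS output):

```latex
\[
\underbrace{1}_{\texttt{totalobj}}
+ \underbrace{20}_{\texttt{objective}}
+ \underbrace{20}_{\texttt{normalize}}
+ \underbrace{20 \cdot 20}_{\texttt{limit}}
= 441 \text{ equations},
\qquad
\underbrace{20 \cdot 2}_{\texttt{v}}
+ \underbrace{20 \cdot 3}_{\texttt{u}}
+ \underbrace{20}_{\texttt{eff}}
+ \underbrace{1}_{\texttt{totaleff}}
= 121 \text{ variables}.
\]
```

The quadratic term in the equation count comes from `limit(i,j0)`, which is why the combined model grows quickly with the number of DMUs.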
If you have many DMUs it is possible to find a balance between looping and
solving big LPs. For example, with 100 DMUs it may make sense to solve
5 batches of 20 combined problems.
In the example below we set up a set dist which determines the distribution of
DMUs over runs. In this case we have two runs. The first takes care of DMUs 1
through 10, while the second run does DMUs 11 through 20.
Model dea4.gms (http://amsterdamoptimization.com/models/dea/dea4.gms).
$ontext
   Data Envelopment Analysis (DEA) example
   Flexible batch formulation

   Erwin Kalvelagen, may 2002

   Data from:
      Emrouznejad, A (1995-2001),
      "Ali Emrouznejad's DEA HomePage",
      Warwick Business School, Coventry CV4 7AL, UK
$offtext

sets i       "DMUs"               /Depot1*Depot20/
     j       'inputs and outputs' /stock, wages, issues, receipts, reqs/
     inp(j)  'inputs'             /stock, wages/
     outp(j) 'outputs'            /issues, receipts, reqs/
;

Table data(i,j)
          stock  wages  issues  receipts  reqs
Depot1     3      5      40      55        30
Depot2     2.5    4.5    45      50        40
Depot3     4      6      55      45        30
Depot4     6      7      48      20        60
Depot5     2.3    3.5    28      50        25
Depot6     4      6.5    48      20        65
Depot7     7      10     80      65        57
Depot8     4.4    6.4    25      48        30
Depot9     3      5      45      64        42
Depot10    5      7      70      65        48
Depot11    5      7      45      65        40
Depot12    2      4      45      40        44
Depot13    5      7      65      25        35
Depot14    4      4      38      18        64
Depot15    2      3      20      50        15
Depot16    3      6      38      20        60
Depot17    7      11     68      64        54
Depot18    4      6      25      38        20
Depot19    3      4      45      67        32
Depot20    3      6      57      60        40
;

set run 'batch runs' /run1*run2/;
set dist(run,i) 'distribution of DMUs over runs' /
   run1.(depot1*depot10),
   run2.(depot11*depot20)
/;
set j0(i) 'DMUs in the current run';

parameter
   x(inp,i)   'inputs of DMU i'
   y(outp,i)  'outputs of DMU i'
;

positive variables
   v(inp,i)   'input weights'
   u(outp,i)  'output weights'
;
variable
   eff(i)
   totaleff
;

equations
   objective(i)
   normalize(i)
   limit(i,i)
   totalobj
;

* declared over i, generated only for the DMUs in the current batch j0
totalobj..       totaleff =e= sum(j0, eff(j0));
objective(j0)..  eff(j0) =e= sum(outp, u(outp,j0)*y(outp,j0));
normalize(j0)..  sum(inp, v(inp,j0)*x(inp,j0)) =e= 1;
limit(i,j0)..    sum(outp, u(outp,j0)*y(outp,i)) =l= sum(inp, v(inp,j0)*x(inp,i));

model dea /totalobj, objective, normalize, limit/;

x(inp,i) = data(i,inp);
y(outp,i) = data(i,outp);

parameter efficiency(i) 'efficiency of each DMU';

loop(run,
   j0(i) = no;
   j0(i)$dist(run,i) = yes;
   solve dea using lp maximizing totaleff;
   efficiency(j0) = eff.l(j0);
);

*
* create sorted output
*
set r /rnk1*rnk1000/;
parameter rank(i);
alias (i,ii);
rank(i) = sum(ii$(efficiency(ii)>=efficiency(i)), 1);
parameter efficiency2(r,i);
efficiency2(r,i)=efficiency(i)$(rank(i)=ord(r));
option efficiency2:4:0:1;
display efficiency2;
4. Numerical experiments
The best balance between the size of a batch and the number of batches needs to be
determined by experimenting. State-of-the-art LP solvers are now very
good at solving LP models quickly. This means that it is often advantageous
to make the batches rather large.
runs   user time   system time   total   distribution of DMUs
  1      0.083        0.052      0.135   all combined
  2      0.117        0.062      0.179   10 + 10
  3      0.136        0.097      0.233   7+7+6
  4      0.171        0.091      0.262   5+5+5+5
  5      0.181        0.125      0.306   4+4+4+4+4
  7      0.210        0.167      0.377   3+3+3+3+3+3+2
 10      0.320        0.224      0.544   2 each
 20      0.527        0.412      0.939   all individual
Table 1. Performance results for dea4.gms (times in seconds)
The above model is very small, so when we tried actual runs, the fastest strategy
was to combine all models in a single run. The timings are on a 1 GHz dual Pentium
machine running Linux and were obtained using the time utility of the C shell. For
this small example we see that combining the 20 models into one run gives us a
speed-up of about a factor 7 (0.939 versus 0.135 total).
        time (seconds)                     time (seconds)
runs    user    system   total     runs    user    system   total
  1    7.607    0.361    7.968      16    5.107    0.851    5.958
  2    6.037    0.408    6.445      17    5.142    0.814    5.956
  3    5.777    0.369    6.146      18    5.113    0.875    5.988
  4    5.523    0.396    5.919      19    5.146    0.894    6.04
  5    5.421    0.423    5.844      20    5.177    0.898    6.075
  6    5.392    0.482    5.874      21    5.218    0.955    6.173
  7    5.388    0.552    5.94       22    5.222    1.013    6.235
  8    5.443    0.523    5.966      23    5.275    0.966    6.241
  9    5.451    0.574    6.025      24    5.302    1.019    6.321
 10    5.464    0.636    6.1        25    5.271    1.066    6.337
 11    5.492    0.589    6.081      30    5.455    1.242    6.697
 12    5.341    0.66     6.001      40    6.005    1.503    7.508
 13    5.248    0.728    5.976      50    6.48     1.822    8.302
 14    5.175    0.769    5.944     100    9.136    3.501   12.637
 15    5.181    0.765    5.946     200   15.125    6.589   21.714
Table 2. Performance results for 200 DMU model
current('run1') = yes;
loop(i,
   dist(current,i) = yes;
   current(run++1) = current(run);
);
display dist;
Given a value for the environment variable n (the number of batch runs), this
fragment will distribute the subproblems i over the runs. We can set n to any
number. To perform the timing we used a model with 200 DMUs, and varied
n between 1 and 200. Running the model in one run resulted in an LP with
40401 equations and 1601 variables. Each individual model has 201 equations and
7 variables. The performance results are shown in Table 2. Here we see that there
is a wide range of relatively efficient combinations. Combining all models into one is
not the best approach here.
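The fragment assigns DMUs to runs in round-robin fashion. A self-contained sketch with the same effect is given below; the closed-form `mod` assignment, the default value `n=3` and the `batchsize` parameter are assumptions introduced for illustration, not the author's code.

```gams
* Round-robin distribution of DMUs over %n% runs (illustrative sketch)
$if not set n $set n 3

set i   'DMUs'       /Depot1*Depot20/;
set run 'batch runs' /run1*run%n%/;
set dist(run,i) 'assignment of DMUs to runs';

* DMU number k goes to run number mod(k-1, n) + 1
dist(run,i) = yes$(mod(ord(i)-1, card(run)) + 1 = ord(run));

parameter batchsize(run) 'number of DMUs in each run';
batchsize(run) = sum(i$dist(run,i), 1);
display dist, batchsize;
```

With n=3 and 20 DMUs this gives batches of 7, 7 and 6, the split listed for three runs in Table 1. A different value can be passed on the command line, e.g. `gams dist.gms --n=5` (the file name is hypothetical).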
5. Examples
5.1. Dual formulation. In this example we show how the dual formulations of
the Constant Returns to Scale CCR model (equation 4) and the Variable Returns
to Scale BCC model (equation 5) can be solved as one big LP model instead of a
series of small models.
We use the data set from [11].
Model bundesliga.gms (http://amsterdamoptimization.com/models/dea/bundesliga.gms).
$ontext
   DEA models:
      input and output oriented
      constant returns to scale (CCR) and variable returns to scale (BCC)

   Instead of a loop batch equations together to form a single large LP.

   Erwin Kalvelagen jan 2005

   Reference:
      Dieter Haas, Martin G. Kocher and Matthias Sutter,
      "Measuring Efficiency of German Football Teams by Data Envelopment Analysis",
      University of Innsbruck, 12 may 2003
$offtext

set i teams /
   'Bayern München'
   'Bayer Leverkusen'
   'Hamburger SV'
   '1860 München'
   '1. FC Kaiserslautern'
   'Hertha BSC'
   'Vfl Wolfsburg'
   'Vfb Stuttgart'
   'Werder Bremen'
   'SpVgg Unterhaching'
   'Borussia Dortmund'
   'SC Freiburg'
   'FC Schalke'
   'Eintracht Frankfurt'
   'Hansa Rostock'
   'SSV Ulm'
   'Arminia Bielefeld'
   'MSV Duisburg'
/;

set j 'data keys' /
   rank    'ranking at end of season 1999/2000'
   wagep   'avg wage for players (annual, million DM)'
   wagec   'wage for coach (monthly, 1000 DM)'
   points  'points determining ranking'
   spect   'spectators (1000)'
   fill    'stadium utilization (%)'
   rev     'total revenue (million DM)'
   CL      'participation in Champions League'
   UC      'participation in UEFA Cup'
/;

table data(i,j)
                          rank  wagep  wagec  points  spect  fill   rev
'Bayern München'          1     63.0   300    73      894    83.5   220
'Bayer Leverkusen'        2     30.5   180    73      382    89.7   85
'Hamburger SV'            3     31.0   125    59      703    76.6   61
'1860 München'            4     30.0   160    53      555    51.8   42
'1. FC Kaiserslautern'    5     31.0   200    50      684    96.9   75
'Hertha BSC'              6     32.5   100    50      809    62.8   42
'Vfl Wolfsburg'           7     19.0   80     49      292    83.5   40
'Vfb Stuttgart'           8     20.5   100    48      500    65.3   52
'Werder Bremen'           9     20.0   30     47      507    84.5   63
'SpVgg Unterhaching'      10    12.0   30     44      163    76.6   14
'Borussia Dortmund'       11    60.0   100    40      1099   93.7   150
'SC Freiburg'             12    9.5    50     40      420    98.8   31
'FC Schalke'              13    40.0   70     39      689    65.4   64
'Eintracht Frankfurt'     14    20.0   80     39      605    58.3   40
'Hansa Rostock'           15    14.0   35     38      275    66.0   32
'SSV Ulm'                 16    8.0    22     35      371    97.0   26
'Arminia Bielefeld'       17    16.0   50     30      335    74.4   32
'MSV Duisburg'            18    11.5   42     22      257    50.1   28
;
* The CL and UC columns (six entries equal to 1) cannot be aligned to teams
* and are left out here; neither is used as a model input or output.
display data;
set inp(j) inputs
/wagep,wagec/;
set outp(j) outputs /points,fill,rev/;
parameter x(inp,i); x(inp,i) = data(i,inp);
parameter y(outp,i); y(outp,i) = data(i,outp);
alias(i,i0);
positive variables lambda(i0,i);
equations
objective
input1(i0,outp)
input2(i0,inp)
convex(i0)
;
objective..
variables
theta(i0)
z
;
input2(i0,inp)..
convex(i0)..
output2(i0,outp)..
option results:4:1:2;
display results;
In the example both input oriented and output oriented efficiency scores are
calculated and presented in a results parameter:
----  PARAMETER results

                         input      input     output     output
                       CRS/CCR    VRS/BCC    CRS/CCR    VRS/BCC
Bayern München          1.0000     1.0000     1.0000     1.0000
Bayer Leverkusen        0.8288     1.0000     1.2065     1.0000
Hamburger SV            0.5897     0.7968     1.6956     1.0757
1860 München            0.4282     0.5918     2.3354     1.3119
1. FC Kaiserslautern    0.7098     1.0000     1.4089     1.0000
Hertha BSC              0.3934     0.5545     2.5419     1.1827
Vfl Wolfsburg           0.6423     0.8394     1.5568     1.0665
Vfb Stuttgart           0.7578     0.8107     1.3196     1.1527
Werder Bremen           1.0000     1.0000     1.0000     1.0000
SpVgg Unterhaching      0.9219     1.0000     1.0847     1.0000
Borussia Dortmund       0.7893     1.0000     1.2670     1.0000
SC Freiburg             1.0000     1.0000     1.0000     1.0000
FC Schalke              0.5037     0.5039     1.9851     1.3067
Eintracht Frankfurt     0.5997     0.6003     1.6674     1.3698
Hansa Rostock           0.7073     0.7443     1.4139     1.1465
SSV Ulm                 1.0000     1.0000     1.0000     1.0000
Arminia Bielefeld       0.6054     0.6069     1.6518     1.3136
MSV Duisburg            0.7242     0.7450     1.3809     1.3695
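The batching idea for the dual models (4) and (5) can be sketched in a few lines. The example below is not the bundesliga.gms code: the data are hypothetical (four DMUs, one input, one output), the equation names `outcon` and `incon` are made up, and only the input-oriented scores are computed. It shows one copy of the dual per DMU `i0`, tied together by a summed objective, with the convexity row added for the BCC variant.

```gams
* Sketch: batched input-oriented dual DEA (CCR and BCC) as one LP each.
* All data below are hypothetical and only make the sketch runnable.
set i    'DMUs'    /d1*d4/;
set inp  'inputs'  /x1/;
set outp 'outputs' /y1/;
alias (i,i0);

parameter x(inp,i)  / x1.d1 2, x1.d2 4, x1.d3 6, x1.d4 5 /;
parameter y(outp,i) / y1.d1 1, y1.d2 4, y1.d3 5, y1.d4 3 /;

positive variable lambda(i0,i) 'envelopment weights for DMU i0';
variable theta(i0) 'efficiency score of DMU i0';
variable z         'sum of all scores';

equations objective, outcon(i0,outp), incon(i0,inp), convex(i0);

objective..        z =e= sum(i0, theta(i0));
outcon(i0,outp)..  sum(i, lambda(i0,i)*y(outp,i)) =g= y(outp,i0);
incon(i0,inp)..    theta(i0)*x(inp,i0) =g= sum(i, lambda(i0,i)*x(inp,i));
convex(i0)..       sum(i, lambda(i0,i)) =e= 1;

model ccr /objective, outcon, incon/;
model bcc /objective, outcon, incon, convex/;

parameter results(i,*) 'input-oriented scores';
solve ccr using lp minimizing z;
results(i,'CRS/CCR') = theta.l(i);
solve bcc using lp minimizing z;
results(i,'VRS/BCC') = theta.l(i);
display results;
```

Because the blocks for different `i0` share no variables, minimizing the sum of the scores yields the same `theta` values as solving each small LP separately.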
5.2. Bootstrapping. Bootstrapping [6, 16] is used to provide additional information
for statistical inference. The following model from [19] implements a resampling
strategy from [15]. Two thousand bootstrap samples are formed, each resulting in
a DEA model of 100 small LPs. In this example we batch the DEA models together
in a single large LP, so that we only have to solve 2,000 LP models instead
of 200,000.
Model bootstrap.gms.
$ontext
DEA bootstrapping example
Erwin Kalvelagen, october 2004
References:
Mei Xue, Patrick T. Harker
"Overcoming the Inherent Dependency of DEA Efficiency Scores:
A Bootstrap Approach", Tech. Report, Department of Operations and
Information Management, The Wharton School, University of Pennsylvania,
April 1999
http://opim.wharton.upenn.edu/~harker/DEAboot.pdf
$offtext
sets
   i 'hospital (DMU)' /h1*h100/
   j 'inputs and outputs' /
      FTE      'The number of full time employees in the hospital in FY 1994-95'
      Costs    'The expenses of the hospital ($million) in FY 1994-95'
      PTDAYS   'The number of the patient days produced by the hospital in FY 1994-95'
      DISCH    'The number of patient discharges produced by the hospital in FY 1994-95'
      BEDS     'The number of patient beds in the hospital in FY 1994-95'
      FORPROF  'Dummy variable, one if it is for-profit hospital, zero otherwise'
      TEACH    'Dummy variable, one if it is teaching hospital, zero otherwise'
      RES      'The number of the residents in the hospital in FY 1994-95'
      CONST    'Constant term in regression model'
   /
   inp(j)  'inputs'  /FTE,Costs/
   outp(j) 'outputs' /PTDAYS,DISCH/
;
table data(i,j)
FTE
h1
h2
h3
h4
h5
h6
h7
h8
h9
h10
h11
h12
h13
h14
h15
h16
h17
h18
h19
6
1571.86
816.54
533.74
805.2
3908.1
727.72
2571.75
521
718
1504.85
1234.49
873
1067.17
668
452.35
1523
3152
871.96
2901.86
Costs
174
69.9
61.7
75.4
396
63.9
220
89.1
50
121
84.6
68.8
85.8
47.5
36.4
97.4
198
30.7
290
PTDAYS
DISCH
71986
53081
25030
34163
187462
31330
130077
43390
27896
75941
57080
48932
50436
67909
25200
59809
108631
17925
130004
12665
5861
4951
11877
42735
8402
26877
8598
6113
16427
14180
12060
11317
6235
6860
13180
22071
4605
24133
365
224
286
256
829
194
620
290
150
393
317
281
278
244
155
394
578
160
549
RES
1
1
136.8
42.81
23.21
13.31
195.67
126.89
http://amsterdamoptimization.com/models/dea/bootstrap.gms
h20
h21
h22
h23
h24
h25
h26
h27
h28
h29
h30
h31
h32
h33
h34
h35
h36
h37
h38
h39
h40
h41
h42
h43
h44
h45
h46
h47
h48
h49
h50
h51
h52
h53
h54
h55
h56
h57
h58
h59
h60
h61
h62
h63
h64
h65
h66
h67
h68
h69
h70
h71
h72
h73
h74
h75
h76
h77
h78
h79
h80
h81
h82
h83
h84
h85
h86
h87
h88
h89
h90
h91
902.4
194.69
713.51
557.36
2259.2
462.22
1212.1
2391.94
1637
501
412.1
738.56
414.1
1097
742
1010
440.6
1203.3
2558.01
215.45
599.3
480.55
634.51
1211.9
285.5
1030.36
1374.81
953.56
561.11
644
376.55
404.79
397.9
374.2
1702
148.09
253.48
1445.68
414.1
642.58
203.75
421.8
320.62
679.79
2382
559.29
568.15
2408.04
632.34
917.22
554.34
780
663.82
1424
313
778
863.37
3509.12
1593.82
466
666.38
998.8
1018
3238.28
1431.1
1735.99
1769
484.56
204.7
1706.58
1029.11
1167.2
78.2
10.9
62.6
23.8
120
32.4
97.3
192
162
37.9
40.2
27
35.7
105
62.8
97.1
34.2
85.4
195
8.409936
30.4
29.5
29.9
91.4
23.9
67.1
95.5
49.8
41.7
57.1
19.6
32.8
29.4
3.944649
100
5.013379
16.9
99.3
26.5
48.5
13
18.3
17.3
25.6
226
58.1
35
155
54.6
55.2
56.9
75.9
56.9
146
20.7
78.4
62
290
152
40.1
48.2
121
98.2
326
107
273
190
36.2
13.9
287
71.9
111
35743
15555
32558
12728
74061
28886
74194
89843
80468
26813
23217
11514
55611
59443
42542
47246
30773
50710
128450
65743
23299
34279
27157
90008
16473
43486
74279
47934
24800
39663
22003
27566
26072
4179
114603
51660
17599
81041
20432
42733
16923
16179
18882
27561
166559
40534
37120
70392
37228
42135
32352
39213
34180
107457
20110
51496
50957
109673
82400
30647
28048
45513
61176
122118
48900
84118
105741
24070
28137
75153
49993
75004
8664
1530
8966
2291
12942
6101
12681
18396
21345
4594
6044
3052
4354
13101
8739
12073
4305
11470
20441
578
5338
6560
5198
17666
2873
9467
11862
10553
5498
8604
4759
7871
4248
819
17235
771
4044
12912
4068
5983
3467
2840
3370
4447
26019
8806
7242
9538
6359
7294
3320
7154
5284
18198
5967
12302
10557
19213
17707
7265
5182
6855
11386
19068
10623
16458
19256
6464
1615
13465
6690
21334
236
132
138
276
348
134
342
336
415
166
160
144
200
242
172
269
201
247
571
238
173
169
141
320
135
235
284
207
132
260
143
190
170
156
438
172
178
475
129
181
146
160
160
308
787
342
158
266
175
215
205
172
200
432
165
390
228
469
474
164
153
238
350
514
208
278
478
125
135
367
252
350
12.08
14.52
229.19
26.32
1.1
1
1
13.82
5.42
6.25
6.44
11.81
17.53
1
1
11.33
7.08
111.33
2.75
1
1
290.53
11.64
88.86
146.33
1
1
158.4
0.93
1
1
91.56
4
1
1
1
1
1
h92
h93
h94
h95
h96
h97
h98
h99
h100
1657.58
1017.16
1532.7
1462
1133.8
609
301.31
1930.08
1573.3
116
88.5
153
113
109
48.2
20.2
201
177
77753
64147
99998
119107
55540
71817
43214
87197
88124
17528
11135
17391
16053
15566
5639
2153
19315
19661
413
316
395
484
355
376
141
418
458
1
1
1
1
4.8
0.5
8.51
1
69.71
;
data(i,'CONST') = 1;

*
* this is the standard DEA model
* instead of 100 small models we solve one big model, see
* http://www.gams.com/~erwin/dea/dea.pdf
*
parameter
   x(inp,i)   'inputs of DMU i'
   y(outp,i)  'outputs of DMU i'
;
alias(i,j0);
positive variables
   v(inp,j0)   'input weights'
   u(outp,j0)  'output weights'
;
variable
   eff(j0)  'efficiency'
   z        'objective variable'
;
equations
   objective(j0)
   normalize(j0)
   limit(i,j0)
   totalobj
;
totalobj..       z =e= sum(j0, eff(j0));
objective(j0)..  eff(j0) =e= sum(outp, u(outp,j0)*y(outp,j0));
normalize(j0)..  sum(inp, v(inp,j0)*x(inp,j0)) =e= 1;
limit(i,j0)..    sum(outp, u(outp,j0)*y(outp,i)) =l= sum(inp, v(inp,j0)*x(inp,i));
*
* calculate standard errors
*
scalar df degrees of freedom;
df = card(i)-card(e);
scalar sigma_squared variance of estimate;
sigma_squared = rss/df;
parameter variance(e,ee);
variance(e,ee) = sigma_squared*invXX.l(e,ee);
parameter se(e) standard error;
se(e) = sqrt(variance(e,e));
parameter tval(e) "t statistic";
tval(e) = b(e)/se(e);
parameter pval(e) "p-values";
*
* pvalue = 2 * pt( -abs(tvalue), df)
*        = 2 * 0.5 * pbeta( df / (df + sqr(abs(tvalue))), df/2, 0.5)
*        = betareg( df / (df+sqr(tvalue)), df/2, 0.5)
*
pval(e) = betareg( df / (df+sqr(tval(e))), df/2, 0.5);
parameter ols(e,*);
ols(e,'estimates') = b(e);
ols(e,'std.error') = se(e);
ols(e,'t value') = tval(e);
ols(e,'p value') = pval(e);
display
"------------------------------------ OLS MODEL ------------------------",
ols;
);
*
* get statistics
*
parameter bbar(e) "Averaged estimates";
bbar(e) = sum(s, sb(s,e)) / card(s);
parameter sehat(e) "Standard errors of bootstrap algorithm";
sehat(e) = sqrt(sum(s, sqr(sb(s,e)-bbar(e)))/(card(s)-1));
parameter tbootstrap(e) "t statistic for bootstrap";
tbootstrap(e) = b(e)/sehat(e);
scalar dfbootstrap degrees of freedom;
dfbootstrap = card(i) - (card(e) - 1) - 1;
parameter pbootstrap(e) "p-values for bootstrap";
*
* pvalue = 2 * pt( -abs(tvalue), df)
*        = 2 * 0.5 * pbeta( df / (df + sqr(abs(tvalue))), df/2, 0.5)
*        = betareg( df / (df+sqr(tvalue)), df/2, 0.5)
*
pbootstrap(e) = betareg( dfbootstrap / (dfbootstrap+sqr(tbootstrap(e))), dfbootstrap/2, 0.5);
parameter bootstrap(e,*);
bootstrap(e,'estimates') = b(e);
bootstrap(e,'std.error') = sehat(e);
bootstrap(e,'t value') = tbootstrap(e);
bootstrap(e,'p value') = pbootstrap(e);
display
"------------------------------------ BOOTSTRAP MODEL ------------------------",
bootstrap;
----  PARAMETER ols

           estimates    std.error    t value        p value
BEDS     1.040019E-4  1.244050E-4      0.836          0.405
FORPROF        0.099        0.042      2.390          0.019
TEACH         -0.057        0.039     -1.451          0.150
RES           -0.001  3.303407E-4     -3.133          0.002
CONST          0.607        0.035     17.330    3.59753E-31

----  PARAMETER bootstrap

           estimates    std.error    t value        p value
BEDS     1.040019E-4  1.107967E-4      0.939          0.350
FORPROF        0.099        0.060      1.651          0.102
TEACH         -0.057        0.036     -1.584          0.116
RES           -0.001  2.442416E-4     -4.237    5.234667E-5
CONST          0.607        0.042     14.417    1.18732E-25

        default       solvelink=2
real    27m12.745s    14m29.518s
user    20m58.595s    12m58.734s
sys      5m30.054s     1m3.559s
Table 3. Solvelink results
In the bootstrap results the p-value for FORPROF (0.102) indicates that this parameter
is not significant at the 0.05 level, whereas the OLS p-value (0.019) would suggest
significance. The p-values are calculated using the incomplete beta function, which is
available as BetaReg() in GAMS [12].
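As a small worked example of this calculation, the two-sided p-value of a t statistic with df degrees of freedom equals the regularized incomplete beta function evaluated at df/(df+t²) with parameters df/2 and 1/2. The sketch below plugs in the OLS t value reported for FORPROF; the value df = 95 assumes 100 hospitals and 5 regression coefficients, which matches the rows of the tables but is not stated explicitly.

```gams
* Two-sided p-value of a t statistic via the regularized incomplete beta function
scalar df   'degrees of freedom (assumed: 100 hospitals - 5 coefficients)' /95/;
scalar tval 't statistic (OLS value reported for FORPROF)'                /2.390/;
scalar pval 'two-sided p-value';

pval = betareg( df / (df + sqr(tval)), df/2, 0.5 );
display pval;
```

The displayed value should be close to the 0.019 shown for FORPROF in the OLS table, up to rounding of the t value.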
It is noted that the option m.solvelink=2; is quite effective for this model.
Timings that illustrate this are reported in table 3.
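As a usage sketch: `solvelink` is a model attribute that is set before the solve statement, so that it applies to every solve in the loop. The tiny LP below is hypothetical and only serves to make the fragment runnable; `tiny` plays the role of the model called m above.

```gams
* Hypothetical tiny LP solved repeatedly with solvelink=2
set s 'repetitions' /s1*s5/;
scalar rhs 'right-hand side, changed in every pass';
parameter result(s);

positive variable xx;
variable obj;
equations defobj, cap;
defobj..  obj =e= xx;
cap..     xx =l= rhs;

model tiny /defobj, cap/;

* keep GAMS in memory while the solver runs, and keep the listing short
tiny.solvelink = 2;
tiny.solprint  = 2;
option limrow = 0, limcol = 0;

loop(s,
   rhs = ord(s);
   solve tiny using lp maximizing obj;
   result(s) = obj.l;
);
display result;
```

The benefit grows with the number of solves, since the saving is per solve statement rather than per model size.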
A further small performance improvement can be achieved by augmenting the DEA model
equations with the equations that calculate $(X^T X)^{-1}$. This
combines the DEA and OLS models into one model, so that only one solve is needed
for each bootstrap sample.
6. Other DEA sources
We want to mention the work of [8] and [9] on large DEA models in conjunction
with GAMS. The software is described on the web page
http://www.gams.com/contrib/gamsdea/dea.htm [10].
Some earlier DEA modeling work with GAMS is documented in [14, 18].
References
1. R. D. Banker, A. Charnes, and W. W. Cooper, Some models for estimating technical and scale
efficiencies in data envelopment analysis, Management Science 30 (1984), no. 9, 1078-1092.
2. William F. Bowlin, Measuring performance: An introduction to data envelopment analysis
(DEA), Tech. report, Department of Accounting, University of Northern Iowa, Cedar Falls, IA,
1998.
3. A. Charnes, W. W. Cooper, and E. Rhodes, Measuring the efficiency of decision making
units, European Journal of Operational Research 2 (1978), 429-444.
4. A. Charnes, W. W. Cooper, and E. Rhodes, Evaluating program and managerial efficiency:
An application of data envelopment analysis to program follow through, Management Science
27 (1981), 668-697.
5. Laurens Cherchye, Timo Kuosmanen, and Thierry Post, New tools for dealing with
errors-in-variables in DEA, Tech. report, Catholic University of Leuven, 2000.
6. Bradley Efron and Robert J. Tibshirani, An Introduction to the Bootstrap, Chapman & Hall,
1993.
7. Ali Emrouznejad, DEA homepage, http://www.deazone.com/, 2001.
8. Michael C. Ferris and Meta M. Voelker, Slice models in general purpose modeling systems,
Tech. report, Computer Sciences Department, University of Wisconsin, 2000.
9. Michael C. Ferris and Meta M. Voelker, Cross-validation, support vector machines and slice
models, Tech. report, Computer Sciences Department, University of Wisconsin, 2001.
10. GAMS Development Corporation, GAMS/DEA, http://www.gams.com/contrib/gamsdea/dea.htm,
2001.
11. Dieter Haas, Martin G. Kocher, and Matthias Sutter, Measuring Efficiency of German
Football Teams by Data Envelopment Analysis, Tech. report, University of Innsbruck, May 2003.
12. Erwin Kalvelagen, New special functions in GAMS,
http://amsterdamoptimization.com/pdf/specfun.pdf.
13. Erwin Kalvelagen, Model building with GAMS, to appear.
14. O.B. Olesen and N.C. Petersen, A presentation of GAMS for DEA, Computers & Operations
Research 23 (1996), no. 4, 323-339.
15. Leopold Simar and Paul W. Wilson, Sensitivity Analysis of Efficiency Scores: How to
Bootstrap in Nonparametric Frontier Models, Management Science 44 (1998), no. 1, 49-61.
16. Leopold Simar and Paul W. Wilson, A general methodology for bootstrapping in
nonparametric frontier models, Journal of Applied Statistics 27 (2000), 779-802.
17. Boris Vujčić and Igor Jemrić, Efficiency of banks in transition: A DEA approach, Tech.
report, Croatian National Bank, 2001.
18. John B. Walden and James E. Kirkley, Measuring Technical Efficiency and Capacity in
Fisheries by Data Envelopment Analysis Using the General Algebraic Modeling System (GAMS):
A Workbook, NOAA Technical Memorandum NMFS-NE-160, National Oceanic and Atmospheric
Administration, National Marine Fisheries Service, Woods Hole Lab., 166 Water St.,
Woods Hole, MA 02543, 2001.
19. Mei Xue and Patrick T. Harker, Overcoming the Inherent Dependency of DEA Efficiency
Scores: A Bootstrap Approach, Tech. report, Department of Operations and Information
Management, The Wharton School, University of Pennsylvania, April 1999.
Amsterdam Optimization Modeling Group, Washington D.C./The Hague
E-mail address: [email protected]