An Overview of Heuristic Solution Methods: Journal of The Operational Research Society May 2004

This document provides an overview of heuristic solution methods for solving mathematical optimization problems. It defines heuristics as methods that aim to find reasonable solutions without guaranteeing optimality. Reasons for using heuristics include dealing with problems that are too complex to solve optimally due to combinatorial explosions, stochastic elements, or changing conditions over time. The document outlines different categories of heuristic methods and metaheuristics, and discusses evaluating heuristic performance and interactive approaches. Heuristics allow the use of more realistic models compared to optimization methods that require restrictive assumptions.


Article in Journal of the Operational Research Society · May 2004

DOI: 10.1057/palgrave.jors.2601758



AN OVERVIEW OF HEURISTIC SOLUTION METHODS

Edward A. Silver
Faculty Professor of Operations Management

Haskayne School of Business

University of Calgary

Working Paper – 2002-15

October, 2002

An Overview of Heuristic Solution Methods

Edward A. Silver

Abstract

This paper is particularly directed to analysts and managers with some limited familiarity with

the use of mathematical modelling as an aid to decision making. It is concerned with obtaining

usable solutions to well-defined mathematical representations of real world problem situations.

Heuristic procedures are defined and reasons for their importance are listed. A wide variety of

heuristic methods, including several metaheuristics, are described. In each case, references for

further details, including applications, are provided. There is also considerable discussion

related to performance evaluation.

Keywords: Heuristics, OR education, Methodology

Suggested Running Head: Heuristic Solution Methods



Introduction

The purpose of this paper is to provide an overview of heuristic methods for operational research

analysts and those managers who are somewhat familiar with the basic concepts of the modelling

of decision situations. It is not the intention to provide a prescription for the construction of

heuristics, nor is any attempt made to judge the relative merits of different heuristic methods.

For each heuristic procedure described two items will be provided: i) at least one reference for

further details, and ii) at least one reference that includes an illustrative OR application.

As Ackoff1 and Müller-Mehrbach2 , among others, have pointed out, the use of mathematical

models to aid in decision making regarding real world situations really involves four different

stages as shown in Figure 1. In this paper we are concerned only with the last step, i.e. we begin

with a well-defined mathematical representation of a problem; specifically, there is an objective

or evaluation function that provides the value of any specific solution (values of the set of

decision variables) and there are specified constraints that define the region of feasible solutions.

Ideally one would like to select the so-called optimal solution that achieves the maximum (or

minimum) value.

The third step, the selection of the mathematical model, also involves the use of heuristic

reasoning. However, this paper will not address this phase, other than to point out that one of the

major benefits of heuristic solution methods, unlike mathematical optimization, is a reduced need

for rather restrictive assumptions in the model formulation. An interesting reference on

modelling is Clayson3 . Heuristics are also used in everyday decisions without the associated

construction of a mathematical model. This more general perspective is used in Lenat4 .



What is meant by a heuristic (solution) method? There are many possible definitions. We adopt

the following, modified slightly, from Foulds5 . The term heuristic means a method which, on the

basis of experience or judgement, seems likely to yield a reasonable solution to a problem but

which cannot be guaranteed to produce the mathematically optimal solution.

Heuristic solution methods are widely used in practice. For example, for the period 1975-1986,

Zanakis et al6 classified, by type of heuristic and area of application, some 442 articles from

a selection of surveyed journals. There is a vast literature related to heuristics. Books,

specifically focussing on the topic, include Michalewicz and Fogel7 , Morton and Pentico8 , and

Reeves9 . Articles of a tutorial nature, some with extensive reference lists, include Foulds5 ,

Ignizio10 , Muller-Malek, Matthys and Nelis11 , Müller-Mehrbach2 , Pinedo and Simchi- Levi12 ,

Silver, Vidal and de Werra13, White14, Zanakis and Evans15, and Zanakis et al6.

The following is an outline of the remainder of the paper. The next section deals with the

question of why heuristic methods are used. This is followed by a framework of a variety of

basic types of heuristic methods. The subsequent section is devoted to so-called metaheuristics,

very general approaches to obtaining solutions of complicated combinatorial problems. Then we

comment on the role of interactive methods involving humans and the computer. Next, there is a

focus on evaluating the performance of a heuristic. The paper concludes with summary remarks

and an extensive reference list.



Why use a heuristic solution method?

Referring back to Figure 1, the intention in step 3 should be to construct a mathematical model,

truly representative of the perceived problem, that is as simple (parsimonious) as possible.

Although simplicity is desired, to be representative of the perceived situation the resulting model

may be of such a form as to make it difficult, if not impossible, to find its optimal solution.

There are at least three circumstances that each can lead to this situation, namely i) a

combinatorial explosion of the possible values of the decision variables, ii) difficulty in

evaluating the objective function (or in having probabilistic constraints) due to the presence of

stochastic variables, and iii) conditions that change markedly with time, the latter requiring a

whole time series of solutions, rather than a single solution at a point in time.

To achieve optimality one is forced to simplify the model by introducing questionable

assumptions. Moreover, there is the related issue of inaccuracy of the data the model requires.

Recognizing these factors, it is indeed likely to be better to achieve a reasonable (non-optimal)

solution to a more accurate model than to seek the optimal solution to an incorrect or

oversimplified model of the real-world problem. Ackoff1 , Churchman16 , and Eilon17 were early

proponents of this perspective that is also a central theme in each of Michalewicz and Fogel7 and

Reeves9 . To summarize, heuristics, because they do not require the often restrictive assumptions

of optimization routines, permit the use of models that are more representative of the real-world

problems.

Mathematically, a very general representation of an optimization problem is as follows:

minimize or maximize f(x)   …(1)

subject to gj(x) ≤ 0, j = 1, 2, …, m   …(2)

where x is the vector of decision variables (xi, i = 1, 2, …, n),

f(·) is the objective function,

and (2) represents the set of m constraints, including the possible discrete nature of

some of the xi’s.

As mentioned earlier the presence of stochastic elements may make it very difficult to evaluate

f(x). Practical examples of such situations include:

i) Commonality inventory problems (Jönsson and Silver18) – There are a number of finished

items each subject to stochastic demand. Finished items are made up of components, with

some components being common to more than one end item. The service provided on end

items is a complicated function of the numbers of components available and the end-item

demands.

ii) Variable yield production problems (Yano and Lee19 ) – In its simplest form, there is a single

type of item that must be processed through various stages each with stochastic yield. The

problem is to select the production quantities to initiate through time at the first stage so as

to satisfy a known demand pattern at lowest expected cost. An additional complexity is

where there is more than one type of end quality and items of a given quality can be used to

satisfy demand for that quality or lower.

iii) Multiechelon inventory problems (Silver et al20 ) – Consider a supply chain distribution

network with stochastic demands for items at end-points and stochastic transportation times

between locations. Also it is possible to transship material from one end point to another.

The inventory costs and customer service levels are very complicated functions of the

replenishment policies and the various stochastic variables.

Let us expand on the first of the above-mentioned circumstances that lead to complexity. Many

important real-world problems are combinatorial in nature. There are often a large number (n) of

decision variables, many of which can only take on discrete values (e.g. 0-1 or integer values).

This includes the grouping, ordering (permutation), or selection of discrete objects. Even when

the objective function is linear (which may not be an appropriate representation) there can be

many local optima. Such problems are labelled as nondeterministic-polynomial-time-complete

(NP-complete) and all algorithms currently available for finding optimal solutions to them

require a number of computational steps that grows exponentially with the size of the problem

(two illustrations of what is meant by size are i) the number of jobs to be scheduled in a machine

scheduling problem and ii) the number of facilities to be located with materials to be moved

between them). For early work on computational complexity see Garey and Johnson21 , and

Lawler22 . A more recent reference is Hall23 . In addition, for some problems it may be extremely

difficult to find any feasible solution, let alone the optimal one. We now elaborate on five

specific illustrations of combinatorial problems of practical interest, with an associated reference

for more details:

i) The travelling salesman problem (Hoffman and Padberg24 ) – An individual has to carry out

a tour among n cities, visiting each of them precisely once. Knowing the distances of each

direct link between pairs of cities, the objective is to seek the tour (sequence of cities) so as

to minimize the total distance travelled.

ii) The quadratic assignment problem (Kaku25 ) – For a set of n facilities we know the volume

of material (aik ) to be moved per unit time between each pair of facilities (i and k). There is

a grid of points on which the non-overlapping facilities can be located. The objective is to

position the facilities so as to minimize the total volume × distance of the material

movement. The usual modelling approach is to have 0-1 variables

xij = 1 if facility i is located at point j, and 0 otherwise,

and the objective function is given by

Σi Σj Σk Σl xij xkl aik djl,

where djl represents the distance between points j and l. The adjective “quadratic” is used

because of the cross-products of decision variables in the objective function.

iii) The resource-constrained project-scheduling problem (Kolisch26 ) – A project is made up of

n distinct activities. Each has a given duration and requires the use of one or more resources

where there is an upper limit on the availability of each resource. There are also precedence

constraints among the activities. The objective is to minimize the completion time of the

project.

iv) The fixed-charge capacitated multicommodity network design problem (Magnanti27 ) – A set

of nodes are prescribed with given transportation needs per unit time of different

commodities between each pair of nodes. Direct links are possible between each pair of

nodes and there is a capacity, a fixed cost and variable costs specified for each such link.

The objective is to select the set of links, satisfying the flow requirements, with minimum

overall costs. There are applications in the fields of transportation and telecommunications.

v) The vehicle routing problem (Bodin28 ) – The following description is adapted from Pinedo

and Simchi-Levi12. Consider a distribution or collection system with a single depot (e.g., a

warehouse or school) and n geographically dispersed demand points (e.g., retailers or bus

stops). The demand points are numbered arbitrarily from 1 to n. At each demand point

there are a number of items (e.g., products or students), which are referred to as the demand

and which must be brought to the depot using a fleet of vehicles. There are three types of

constraints.

a) Capacity constraints: an upper bound on the number of units that can be carried by a

vehicle.

b) Distance (or travel time) constraints: a limit on the total distance (or time) travelled by

each vehicle and/or a limit on the amount of time an item can be in transit.

c) Time window constraints: a prespecified earliest and latest pickup or delivery time for

each demand point and/or a prespecified time window in which vehicles must reach

their final destination.

The problem is to design a set of routes for the vehicles such that each route starts and ends at the

depot, no constraint is violated, and total distance travelled is as small as possible. A recent

review of vehicle routing heuristics was provided by Cordeau et al29 .
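To make the flavour of such problems concrete, here is a minimal sketch (not from the paper itself) of the classic nearest-neighbour construction heuristic for the first problem above, the travelling salesman problem: start anywhere and repeatedly visit the closest unvisited city. It is fast but carries no optimality guarantee, exactly in the sense of the definition of a heuristic adopted earlier.

```python
import math

def nearest_neighbour_tour(coords, start=0):
    """Build a TSP tour by repeatedly visiting the closest unvisited city."""
    unvisited = set(range(len(coords))) - {start}
    tour = [start]
    while unvisited:
        last = tour[-1]
        # pick the unvisited city nearest to the last city added
        nxt = min(unvisited, key=lambda c: math.dist(coords[last], coords[c]))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

def tour_length(coords, tour):
    """Total length of the closed tour (returning to the start city)."""
    return sum(math.dist(coords[tour[i]], coords[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))
```

The quality of the resulting tour depends on the start city and tie-breaking; in practice such a constructive solution is usually followed by an improvement step (e.g. 2-opt exchanges).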

To this point in this section the discussion has essentially been devoted to a single, major reason

for using heuristics, namely that it is difficult, if not impossible, to obtain the optimal solution of

the mathematical model representation of the problem under consideration.

However, there are other reasons for utilizing heuristic solution methods. They include:

i) Facilitation of implementation – “People would rather live with a problem they cannot solve

than accept a solution they cannot understand”. (Woolsey and Swanson30 ). The acceptance

and use by decision makers of decision rules are likely to be facilitated by an, at least

intuitive, understanding of how the rules operate, in particular how key parameters influence

the chosen actions. This type of understanding is more likely with heuristic rules than with a

complex optimization routine. However, as pointed out by Haessler31 , this does not

necessarily mean that heuristics must always be simple in nature; for some complex

problems simple heuristics may not produce acceptable solutions.

ii) Show improvement over current practices – related to the previous point, managers may be

quite satisfied with a heuristic solution that produces better results than those currently

achieved.

iii) Fast results – Sometimes fast, reasonable, results are needed and heuristics can be more

quickly developed and used than optimization routines.

iv) Robustness – Heuristics can be less sensitive to variations in problem characteristics and

data quality. In the words of Bartholdi and Platzman32, “Optimal solutions are fragile in the

sense that they can be exquisitely sensitive to changes in the data. If the problem description

changes slightly, to recover an optimal solution generally requires resolving the entire

problem (which typically was computationally expensive to solve in the first place). On the

other hand, heuristics frequently partition the problem and so ignore interrelationships

between partitions. This allows updates to be confined to just the partition affected.

Recomputation can be local and therefore faster.” Moreover, as pointed out by Fisher33 ,

some constraints are actually flexible in practice and a heuristic method can more easily

accommodate this flexibility.

v) Use within optimization routines – Heuristics can be profitably used within optimization

routines in three ways. First, they can provide good initial solutions in an iterative scheme.

Second, they can furnish bounds to facilitate elimination of portions of the solution space in

partial enumeration optimization methods. Third, heuristics can be used to guide the search

process as illustrated by Mason et al34 in a staff scheduling problem.



Basic types of heuristic methods

This section is devoted to a categorization of basic heuristic methods. It should be pointed out

that the categories are not necessarily mutually exclusive. Indeed, it often makes sense to blend

more than one type of heuristic in the solution of a specific class of problems. Moreover, it can

be fruitful to use two or more distinct methods in parallel to solve the same problem, choosing

the best of the solutions. Müller-Mehrbach2 provides a more general discussion of using

combinations of heuristics.

For a given problem, the development of a new heuristic or the choice among existing options is

a creative undertaking; hence the following quotation is relevant: “Creativity involves a

willingness to break away from established patterns and try new directions, but it does not mean

being different for the sake of being different or an exercise in self-indulgence. It is as much a

mistake to ignore the accumulated knowledge of the past as it is to be limited by it. Being

creative means combining knowledge and imagination.” (Ruggiero35 ). In other words, it makes

good sense to be familiar with as much as possible of the existing theory related to the specific or

similar mathematical models, as well as with the range of available heuristic approaches.

The choice of which heuristic (or metaheuristic) approach to use depends upon a number of

factors including:

i) whether the decision area is strategic, tactical or operational,

ii) the frequency with which the decision is made,

iii) the development time available,

iv) the analytical qualifications of the decision maker(s) involved,



v) the size of the problem (including the number of decision variables),

vi) the absence or presence of significant stochastic elements.

Randomly Generated Solutions

One relatively straightforward concept is to randomly generate feasible solutions to the problem,

evaluate each and choose the best. Baum and Carlson36 argued that one could decide on the

number of trials so as to achieve a desired probability that the best solution obtained is better

than a prescribed percentage of all solutions. Mabert and Whybark37, for a type of facility

location problem, also looked at biasing the sampling, including adapting the biasing as results

are observed. This is closely linked to one of the metaheuristics, the adaptive reasoning

technique, to be discussed later.
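The sample-size argument can be made concrete. If a fraction alpha of all feasible solutions counts as "good enough", the probability that the best of n independent random solutions is good is 1 − (1 − alpha)^n; solving for n gives the number of trials. A minimal sketch (function names are illustrative, not taken from Baum and Carlson36):

```python
import math

def trials_needed(alpha, p):
    """Number of independent random solutions n needed so that, with
    probability at least p, the best of them lies in the top fraction
    alpha of all feasible solutions: solve 1 - (1 - alpha)**n >= p."""
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - alpha))

def random_search(sample, evaluate, n):
    """Generate n feasible solutions with sample() and keep the best
    under evaluate() (minimization)."""
    return min((sample() for _ in range(n)), key=evaluate)
```

For example, to be 95% sure of landing in the top 1% of solutions, roughly 300 trials suffice regardless of problem size, which is what makes pure random sampling attractive as a baseline.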

Problem Decomposition / Partitioning

Here one takes a complex problem and decomposes or partitions it into a number of, presumably

simpler to solve, subproblems. The partitioning can be by natural hierarchy of decisions (e.g.,

system design versus system operation), by major resources (e.g., different machines in a

production scheduling context), or by chronological time of decisions. Use of myopic decision

rules is an example of the latter, specifically the current decision is made looking only at its

consequences in the very near future. See Bollapragada and Morton38 for an application related

to production scheduling involving random yield.

Once the subproblems are defined there are three general solution approaches:

i) Solve the subproblems independently and somehow coalesce the independent solutions into

a feasible solution of the overall problem. An example of production lot-sizing, involving

partitioning of time, is provided by Federgruen and Tzur39. The travelling salesman problem

(TSP) has been solved in this fashion by Karp40 who partitions the overall geographic region

into small regions, solves a TSP for each, then merges the separate tours into a single overall

tour.

ii) Solve the subproblems sequentially, using the results of the first as input to the second, etc.

This is commonly done in system design followed by system operation. So-called

hierarchical planning (Hax and Meal41 and Bitran and Tirupati42 ) is done in this fashion –

aggregate production planning, followed by family scheduling, followed by individual run

sizes. Bodin28 describes a sequential solution procedure for the vehicle routing problem

where the number of vehicles (K) is selected, then the customers are separated into K

clusters, then a route is chosen for each cluster. The OPT software (Fry et al43 and Morton

and Pentico8 ), based on the theory of constraints, focuses on the bottleneck resource in a

multistage scheduling problem, then makes decisions elsewhere to support the smooth

functioning of the bottleneck. Finally, in inventory management (Silver et al20 ) one often

sequentially chooses the order quantity of an item, then its safety stock or reorder point.

iii) Solve the subproblems iteratively, i.e. not just in a sequential fashion. The (shifting)

bottleneck dynamics of Morton and Pentico8 encompasses this approach. They consider the

situation of multiple resources shared by multiple activities (e.g. machines shared by jobs in

a job-shop scheduling context) and solve single resource (machine) problems iteratively. A

key idea in each single resource problem is the estimated marginal benefit of the resource

use for each activity. Another illustration in production lot-sizing is provided by Dixon and

Silver44 . Morton and Pentico8 also point out the close analogy with the use of transfer

pricing to decompose large systems.

Inductive Methods

There are two aspects here. First is the generalization from smaller (or somewhat simpler)

versions of the same problem or a closely related (from a mathematical perspective) problem.

The latter embraces the concept of analogy which, again, is an important ingredient of creative

problem solving (Polya45). As an example of generalization, Bilde and Vidal46 considered the

problem of locating a number of plants and warehouses. Properties of the solutions for the cases

of only a few facilities were used to develop a heuristic for the more general case of several

facilities. At the opposite extreme, sometimes it is easy to analyze the case where one or more

parameters take on very large values, again providing insight for the more difficult case of

intermediate values of the parameters.

Methods that Reduce the Solution Space

The basic idea is to reduce the solution space, i.e. cut back drastically on the number of solutions

that are even considered while, hopefully, not seriously affecting the quality of the solution

obtained. This can be done by tightening existing constraints or by introducing extra constraints.

In some cases there may even be an efficient algorithm for the restricted segment of the solution

space (e.g. using an optimization routine that is only valid in the restricted region). A specific

method of restriction, called beam search, is so general that we have chosen to discuss it later as

one of the metaheuristics.

One type of approach is to first obtain the optimal solutions to several numerical instances of the

problem under consideration (which, of course, may be very difficult to do!). An extreme

version (eliminating all subsequent search) is to then develop regression relationships that give

values of the decision variables as functions of key parameters of the problem. An example is

the so-called power approximation of Ehrhardt47 used in inventory management. A very flexible

form of non-linear regression is achieved through the use of feed-forward neural networks. Such

networks consist of a set of nodes connected by directional arcs (without any feedback) as

illustrated in Figure 2. The number of layers and the number of nodes in each of the

intermediary layers are parameters that can be adjusted. The input signal at a node is a linearly

weighted mix of the output signals from other nodes directly linked to it. (The weights are

adjustable parameters). The output signal at a node is typically a highly non-linear function of

the input signal, e.g. 0 or 1 depending upon whether or not the input exceeds a threshold.

Considerable further detail is available in Ignizio and Burke48 and Michalewicz and Fogel7.
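The forward pass just described can be sketched in a few lines. The weight layout and the logistic squashing function below are illustrative choices of the author of this sketch, not from the cited references; the 0/1 threshold mentioned in the text is another common option:

```python
import math

def forward(x, layers):
    """Forward pass through a feed-forward network (no feedback).
    `layers` is a list of (weights, biases) pairs; weights[j] holds the
    input weights of node j in that layer.  Each node forms a linearly
    weighted mix of the outputs of the nodes feeding it and applies a
    non-linear squashing function (here a logistic sigmoid, a smooth
    stand-in for a hard 0/1 threshold)."""
    out = x
    for weights, biases in layers:
        out = [1.0 / (1.0 + math.exp(-(sum(w * o for w, o in zip(row, out)) + b)))
               for row, b in zip(weights, biases)]
    return out
```

Fitting the weights to the optimal solutions of sample problem instances then turns the network into the flexible non-linear regression described above.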

A more limited restriction of the search space through observing the optimal solutions of a

number of problem instances is so-called feature extraction. Conditions that are observed in all

(or a great majority) of the optimal solutions are assumed to hold for any future cases to be

investigated, i.e. the solution is partially specified. Possibilities include:

i) A 0-1 variable that is always 0 or always 1.

ii) Two variables that are highly correlated (e.g. in a facility layout problem, two facilities are

located beside each other).



iii) A constraint that has lots of slack in all the observed solutions (it is then ignored).

iv) A constraint that is always binding.

Examples of this type of approach are provided by Cunningham and Smyth49 and Rosing et al50 ,

the latter including an application to the travelling salesman problem.
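A sketch of possibility i) above, assuming the optimal 0-1 solution vectors of previously solved instances are at hand (the helper name is hypothetical, not from the cited references):

```python
def extract_features(solutions):
    """Given optimal 0-1 solutions of several problem instances (each a
    list of 0/1 values over the same variables), return the indices of
    variables that take the same value in every observed solution.
    Fixing these variables partially specifies, and so shrinks, the
    search space for future instances."""
    n = len(solutions[0])
    fixed = {}
    for i in range(n):
        values = {sol[i] for sol in solutions}
        if len(values) == 1:          # variable is always 0 or always 1
            fixed[i] = values.pop()
    return fixed
```

The same pattern extends to the other possibilities listed, e.g. recording which constraints were slack or binding in every observed optimum.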

Approximation Methods

Here we are specifically concerned with manipulating the mathematical model in some way (or

using a solution from a related simpler model). The possibilities have been split into four

categories. Before elaborating on them we wish, by way of an illustration, to make a point which

some analysts and managers may find rather illuminating. Consider a relatively simple problem

of minimizing a non-linear function of the single variable x, as portrayed in Figure 3. The true

upper bound on x is P. The minimizing solution is at x=Q with a true objective function value of

f(Q). Suppose that the piecewise linear objective function (A-B-C) is used as an approximation

of the true, non-linear function. Then, still maintaining the correct upper bound of P, the value

of x that minimizes the linear approximation is x=R with a true objective function value of f(R).

If, in addition to the approximation of the objective function, we also tighten the constraint,

saying that x can’t take on a value larger than S, then the combination of the two approximations

says to use x=S with a true objective function value f(S), which is lower (better) than f(R). The

message is that two approximations, taken together, can lead to a better solution than just one

used alone.

i) Aggregation of parameters: The typical approach here is to replace several variables by a

single aggregate variable, solve the much smaller, aggregate model, then somehow

disaggregate the solution back into a solution of the original problem. Two illustrative

applications in logistics decision making are provided by Evans51 and Geoffrion52. Another

possibility is to replace a multistage process by an “equivalent” single stage process.

Pentico53 does this for a production problem involving variable yields. A third form of

aggregation is to scale the units of each decision variable (for example, to work in units of

100 instead of 1). Finally, aggregation is possible in dimensions, as illustrated by Bartholdi

and Platzman32 , who transform a two-dimensional combinatorial problem (such as the

travelling salesman problem) into a related single dimensional problem.

ii) Modification of the objective function – One possibility is to approximate a non-linear

function by a piecewise linear one, which may facilitate the use of a linear programming

solution algorithm. Rajagopalan54 describes a related approach, in the context of a make-to-

order versus make-to-stock decision, namely using a tractable lower bound on the objective

function. An alternative is to simply assume a simpler objective function (i.e. use an

evaluation function different from the objective function). This is the basic idea in the

Silver-Meal55 heuristic in selecting replenishment lot sizes under a known, but time-varying,

demand pattern. In any of these approaches one must ultimately evaluate the performance

using the most accurate representation of the true objective function.
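The Silver-Meal rule itself is easy to state: extend the current lot one period at a time as long as the average cost per period covered (setup plus carrying cost) keeps decreasing, then place the order and repeat. A minimal sketch (0-indexed periods; charging the holding cost per unit per period held is one common convention):

```python
def silver_meal(demand, setup_cost, holding_cost):
    """Silver-Meal lot-sizing heuristic for known, time-varying demand.
    Returns a list of (period, lot size) pairs."""
    orders = []
    t = 0
    n = len(demand)
    while t < n:
        T = 1                       # number of periods the lot covers
        carry = 0.0                 # accumulated carrying cost of the lot
        best_avg = setup_cost       # cost per period of covering period t only
        while t + T < n:
            # demand of period t+T would be held for T periods
            carry += holding_cost * T * demand[t + T]
            avg = (setup_cost + carry) / (T + 1)
            if avg > best_avg:      # average cost per period started rising
                break
            best_avg = avg
            T += 1
        orders.append((t, sum(demand[t:t + T])))
        t += T
    return orders
```

As the text stresses, the per-period average cost here is an evaluation function standing in for the true objective; the resulting schedule should still be costed with the most accurate representation available.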

iii) Approximating probability distributions or stochastic processes – One option is to assume

that random variables are constants at their mean values. Bitran and Yanasse56 illustrate this

idea in the production scheduling of several items, subject to random demands, on a limited

capacity machine. One can also use an analytically convenient distribution, such as the

normal, having the same mean and variance as the random variable under consideration.

This is widely done for the distribution of demand during the replenishment lead time in

inventory control models (Silver et al20). Other illustrations are provided by Tijms57 and van

Houtum and Zijm58 (mix of Erlang distributions, used in a multiechelon inventory context)

and Shore59,60 (maintenance and inventory applications of a four parameter representation of

an inverse transformation of fractile values).

Sometimes continuous variables are not attractive in that they imply an infinite number of

possibilities. In such circumstances discrete approximations can be useful (see Zaino and

D’Errico61 for details on the approach and Grossman et al62 for an illustrative application

with regard to evaluating contracts with suppliers when demand and internal production

capability are random quantities). A related approach is to randomly generate or simply

select a relatively small set of representative scenarios. Jönsson and Silver63 have used

random generation and Consigli and Dempster64 have selected a set of scenarios in the

contexts of commonality inventory problems and multi-period, portfolio investments,

respectively.

Stochastic processes can also be conveniently approximated. For example, the aggregate

effect of a large number of renewal processes has a Poisson behaviour (Feller65 ). This

property has been used by Silver66 in an iterative scheme for developing the values of the

control parameters in a coordinated inventory control situation. Whitt67 makes use of a

convenient two-parameter renewal process to model the arrivals at any station (parallel

servers) in a queueing network. This permits a decoupling and the repeated use of a single

station model. The rapid modelling technique (Suri and de Treville68 ) is also based on a

simplified representation of a queueing network.

iv) Changing the nature of constraints, including relaxation methods – First, one can approximate a

non-linear constraint by a linear one. One can also choose to completely ignore some

constraints, solve the problem and hopefully find that the solution satisfies the constraints.

Alternatively constraints can be weakened, e.g. by using surrogate constraints (Glover69 )

where several constraints are replaced by a single, linear combination of them. In general,

the relaxing of constraints can make it easier to solve the resulting model. Some constraints

may be flexible in any event (e.g. a budget constraint need not necessarily be rigid). A

common relaxation is to permit continuous values of a discrete variable. This may permit

the use of calculus to find approximate extreme points (Silver70 did this in the context of

developing a coordinated inventory control procedure for items having a joint replenishment

cost). Such a relaxation is also frequently used in linear programming relaxations of

(mixed) integer programming problems. Applications include to vehicle routing (Fisher and

Jaikumar71 ) and to the broad class of multidimensional knapsack problems, where a subset

of items, each with a given unit value and unit use of one or more resources, such as weight

and volume, is to be placed in a container so as to maximize the overall value of the

contents, subject to not violating the resource constraints (Bertsimas and Demir72 ).

Relaxation produces a solution which, for a minimization problem, gives a lower bound on

the objective function value of the optimal solution. Patterson and Rolland 73 propose

discarding the “tough” constraints to produce a lower bound, then progressively adding

heuristically generated constraints that raise the lower bound. For a constrained version of

the travelling salesman problem, they are able to produce solutions with objective function

values within a prescribed percent deviation of the unknown optimal solution.

A widely applicable approach is Lagrangian relaxation whereby one or more of the

constraints, multiplied by Lagrange multipliers, are incorporated (relaxed) into the objective

function. The multiplier, associated with a constraint, represents the penalty per unit

violation of the constraint. When constraints are relaxed (whether by Lagrangian relaxation

or some other method), if the resulting solution is not feasible, it must be appropriately

adjusted to achieve feasibility, e.g. by some form of local search procedure (the topic of a

later section). Beasley74 and Fisher75 both provide an overview of the topic, including

approaches for obtaining values of the multipliers, also supplying references to a number of

practical applications. In a later section we’ll return to the use of relaxation methods as a

means of bounding the performance of the unknown optimal solution.
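To make the idea concrete, here is a minimal sketch (in Python, with hypothetical item data in the usage) of Lagrangian relaxation for a 0-1 knapsack maximization problem: the capacity constraint is moved into the objective with a multiplier, and the relaxed problem then separates by item, yielding an upper bound on the optimal value.

```python
def lagrangian_bound(values, weights, capacity, lam):
    """Upper bound for: max sum(v_i * x_i) s.t. sum(w_i * x_i) <= capacity,
    x_i in {0, 1}.  The capacity constraint is relaxed into the objective
    with multiplier lam >= 0:
        max  sum(v_i * x_i) + lam * (capacity - sum(w_i * x_i)).
    The relaxed problem separates by item: set x_i = 1 exactly when
    v_i - lam * w_i > 0."""
    x = [1 if v - lam * w > 0 else 0 for v, w in zip(values, weights)]
    bound = lam * capacity + sum(
        (v - lam * w) * xi for (v, w), xi in zip(zip(values, weights), x))
    return bound, x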

Constructive Methods

Constructive methods, as the name implies, use the data of the problem to construct a solution,

step by step. Typically, no solution is obtained until the procedure is complete (in contrast with

improvement methods to be discussed in the next subsection). A special constructive approach

is the so-called greedy method, where, at each step, the next element of the solution is chosen so

as to give the best immediate benefit (highest profit contribution or lowest cost). The greedy

approach is very similar to a sequential myopic perspective, the latter discussed in the earlier

section on decomposition/partitioning methods.

Ignizio10 , for the case of a problem involving only zero-one variables, describes two greedy

approaches. In the add heuristic one starts with all variables set to zero and then considers each

variable, one at a time. If adding it improves the value of the objective function, then it is set to

one. In the mirror-image drop heuristic, one starts with all variables set to one (which almost

certainly is an infeasible solution). Each variable is considered for deletion and the one doing

the least damage to the value of the objective function is set to zero. This is continued until a

feasible solution is obtained. Probably the best known greedy application is the nearest

neighbour method for the travelling salesman problem (Golden et al76 ). Specifically, one starts

at any city (node) and chooses the closest city as the next one to visit, etc. Unfortunately,

although extremely easy to use, the greedy approach can lead to a very poor solution, in that the

attractive initial choices may result in a very poor selection near the end. As a result,

constructive methods sometimes include some form of look-ahead feature (Atkinson77 ) where

one estimates the future consequences of the current choice. A more sophisticated

metaheuristic, involving multiple constructive solutions, will be covered in a later section.
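The nearest neighbour method described above can be sketched in a few lines of Python (a minimal illustration on planar coordinates; the instance in the usage below is hypothetical):

```python
import math

def nearest_neighbour_tour(coords, start=0):
    """Greedy TSP construction: from the current city, always visit the
    closest unvisited city next; the return to the start is implicit."""
    unvisited = set(range(len(coords))) - {start}
    tour = [start]
    while unvisited:
        last = tour[-1]
        nearest = min(unvisited,
                      key=lambda c: math.dist(coords[last], coords[c]))
        tour.append(nearest)
        unvisited.remove(nearest)
    return tour
```

The final edge back to the start is where the greedy weakness typically shows up: all of the cheap edges have been used, and the closing edge can be very long.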

Local Improvement (Neighbourhood Search) Methods

The basic concept of local improvement methods is quite simple. One starts with a feasible

solution to the problem, often the result of a constructive method. Feasible solutions in the

neighbourhood [N(xc)] of the current solution (xc) are evaluated. If one of these is better than the

current solution, it becomes the new xc , its neighbourhood is investigated, etc. until no

improvement can be found and the current solution, at that stage, is a local optimum. Not

surprisingly, for maximization problems, neighbourhood search is sometimes called hill climbing.

One obvious question is how to define the neighbourhood of a point (or solution).

Müller-Mehrbach2 has considerable discussion related to neighbourhoods. The neighbourhood

N(x,t) is the set of solutions that can be obtained from x by some simple transformation t, ie.

different transformations produce different neighbourhoods. Examples include:

i) In a problem of sequencing a set of jobs on a machine a solution is given by a specific

sequence of the jobs and a transformation might be to exchange the order of any two

consecutive jobs.

ii) In a model involving 0-1 variables a simple exchange heuristic involves changing one

variable’s value from 0 to 1 and another’s from 1 to 0.

iii) In the travelling salesman problem (TSP) a current solution is a single tour through all

cities finishing back in the starting city. A common transformation is to interchange two

non-adjacent edges of the tour. For example, if a current tour through five cities is

BDAFC, then switching the non-adjacent edges DA and FC would lead to the

neighbourhood solution BDFAC. More sophisticated transformations are certainly

possible. For example, Lin and Kernighan78 permit up to k edges to be selected for

replacement in the TSP.
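The two-edge interchange of example iii) can be sketched as follows, assuming the tour is held as a list of city labels; removing the edges (tour[i], tour[i+1]) and (tour[j], tour[j+1]) and reconnecting amounts to reversing the intermediate segment (the classical 2-opt move):

```python
def two_opt_neighbours(tour):
    """Generate all tours reachable from `tour` by removing two
    non-adjacent edges and reconnecting (segment reversal)."""
    n = len(tour)
    for i in range(n - 1):
        for j in range(i + 2, n):
            if i == 0 and j == n - 1:
                continue  # these two edges are adjacent in the closed tour
            yield tour[:i + 1] + tour[i + 1:j + 1][::-1] + tour[j + 1:]
```

For instance, from the tour BDAFC, exchanging edges DA and FC yields the neighbour BDFAC; a tour through n cities has n(n-3)/2 such neighbours.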

Another issue is whether to choose a move to the first point in the neighbourhood exhibiting an

improvement or to exhaustively evaluate all points in the neighbourhood and choose the one

giving the largest improvement. The latter is often referred to as steepest ascent (or descent).

Neighbourhood search can also be carried out with continuous variables. The neighbourhood of

x might be defined as all points within a certain Euclidean distance of x. If the gradient (partial

derivatives of the objective function with respect to each decision variable) can be computed or

estimated (which is unlikely) then the concept of steepest ascent (descent) indicates the direction

of the move. A control parameter of the heuristic is how far to move in that direction. When the

gradient can not be easily estimated, an alternative is to deterministically specify or randomly

generate a subset of the points in the neighbourhood. Random generation could be by adding a

normal variable with mean 0 and standard deviation σi to the current value of each xi of the

vector solution x. The σi's are controllable parameters of the heuristic search (see Michalewicz and Fogel7, p. 124).
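A minimal sketch of this random-perturbation search for a maximization problem (the one-dimensional objective in the usage below is a hypothetical example):

```python
import random

def perturbation_search(f, x0, sigmas, n_iter=2000, seed=1):
    """Local improvement with continuous variables: perturb each component
    of the current solution by a normally distributed variable with mean 0
    and standard deviation sigma_i, and accept the candidate only if it
    improves the (maximization) objective."""
    rng = random.Random(seed)
    x, fx = list(x0), f(x0)
    for _ in range(n_iter):
        candidate = [xi + rng.gauss(0.0, s) for xi, s in zip(x, sigmas)]
        f_cand = f(candidate)
        if f_cand > fx:
            x, fx = candidate, f_cand
    return x, fx
```

Being improvement-only, this sketch shares the weakness discussed next: it will stall at whatever local optimum the starting point leads to.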

Unfortunately there is a fundamental weakness in local search methods. Only a local optimum is

guaranteed, hence, as illustrated in Figure 4, the solution obtained (B) is very much dependent on

the starting point (A) and may be quite inferior to the global optimum (C). Not only do we not

know if the solution obtained is the global optimum, but even worse we have no idea of how

much better the global optimum might be. Local (or neighbourhood) search is very focussed. It

has been referred to as exploitation (Box and Voule79 ) or intensification (Glover80 ). To break out

of the clutches of a local optimum we need a broader search, ie. exploration (Box and Voule 79 )

or diversification (Glover80 ), in other parts of the search space. Of course, one approach would

be to significantly increase the size of N(xc). Unfortunately, the required computational effort

quickly explodes as the size increases. Another possibility is to restart the search from a number

of points, randomly chosen from the search space. For example, a starting point to the right of D

in Figure 4 would lead to the global optimum. Exploration or diversification is a major

ingredient of most metaheuristics, the topic of the next section.

Metaheuristics

A metaheuristic is a higher-level heuristic procedure designed to guide other methods or

processes towards achieving reasonable solutions to difficult combinatorial mathematical

optimization problems. Metaheuristics are particularly concerned with not getting trapped at a

local optimum (for problems that have multiple local optima) and/or judiciously reducing the

search space. Each metaheuristic has one or more adjustable parameters. This permits

flexibility, but for any application (to a specific class of problems) requires careful calibration on

a set of numerical instances of the problem as well as testing on an independent set of instances.

Several metaheuristics are amenable to parallel processing, ie. investigation of different solution

sequences can be done in parallel.



Taillard et al81 present a unifying perspective on metaheuristics. In addition, combinations (or

hybrids) of metaheuristics can be used (see, for example, Osman82 related to the vehicle routing

problem). A total of five different metaheuristics will now be discussed. These are not meant to

be exhaustive (see, for example, the ant colony metaheuristic as discussed by Dorigo et al83 and

Michalewicz and Fogel7).

Beam Search

The solution space of many combinatorial problems can, in principle, be represented in a tree

structure. To be specific, consider two types of problems: i) sequencing problems (e.g. the

travelling salesman problem), ii) problems where each of a number of variables can take on

several discrete values. The tree representation is illustrated in Figure 5. Suppose that there are

n items to be sequenced or n variables involved. Each path from the start node down through n

levels represents a solution. As mentioned earlier in the paper, for realistic size problems it is

not computationally possible to evaluate all of the paths (solutions). Computation time can be

substantially reduced by a branch-and-bound procedure (see, for example, Mitten84 or Morton

and Pentico8 ). Suppose that we are dealing with a minimization problem and somehow (e.g. by

use of a heuristic!) have obtained a current best complete solution x* with objective function

value f(x*). Suppose that we are moving from top to bottom of the tree in Figure 5 and are now

at node A. The contribution to the objective function of the upper part of the path is easily

computable. Let us denote it by CA. If we can obtain a lower bound LBA on any path from A to

the bottom of the tree and CA + LBA > f(x*), then there is no need to consider any of the

solutions that include node A. Despite this pruning feature, branch-and-bound still cannot

guarantee finding the optimal solution to a realistically sized problem in reasonable time.



Beam search (Morton and Pentico8 ) is a form of partial branch-and-bound. The basic idea is to

discard portions of the tree that are likely, as opposed to guaranteed, to not include the optimal

solution. A parameter, called the beam width (w), is the number of nodes that are retained at

each level as we proceed down the tree. To rank the nodes for culling purposes at any level we

need a way of estimating the contribution of the best path from each node to the bottom of the

tree. Getting an accurate estimate may still be very time-consuming. A variation, called filtered

beam search (Ow and Morton85 ), is to very quickly develop a crude estimate for each node at the

level under consideration, then for just the best f (filter width) do a more careful evaluation, and

subsequently pick the w (beam width) best of these. Ow and Morton85 showed that this worked

very well on machine scheduling problems. In principle, the beam width need not be the same at

all levels of the tree. In particular, if a nominal width of 3 was being used and at a particular

level the 3rd and 4th best nodes were very close in value, one might profitably change the width to

4 at that level.
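The filtered beam search just described can be sketched as follows (a minimal illustration for sequencing problems under minimization; partial_cost, crude_estimate and careful_estimate are assumed, problem-specific helper functions supplied by the user):

```python
import heapq

def filtered_beam_search(items, partial_cost, crude_estimate, careful_estimate,
                         beam_width=3, filter_width=6):
    """Partial branch-and-bound for sequencing (minimization).  At each level
    of the tree, the children of the surviving nodes are ranked by a quick,
    crude estimate; only the best `filter_width` receive the careful
    evaluation, and the best `beam_width` of those are retained."""
    beam = [()]  # partial sequences, starting from the empty sequence
    for _ in range(len(items)):
        children = [seq + (i,) for seq in beam for i in items if i not in seq]
        survivors = heapq.nsmallest(
            filter_width, children,
            key=lambda s: partial_cost(s) + crude_estimate(s))
        beam = heapq.nsmallest(
            beam_width, survivors,
            key=lambda s: partial_cost(s) + careful_estimate(s))
    return min(beam, key=partial_cost)
```

With beam_width equal to the number of children at every level this reduces to full enumeration; a width of 1 is a pure greedy construction.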

Tabu Search

Tabu search is one of the most widely used metaheuristics. Glover86 cites some 70 areas of

application including vehicle routing, electrical power distribution, transport network design and

classroom scheduling. Overview articles on the topic include Glover80,86 and Glover et al87; far

more detail is available in Glover and Laguna 88 .

We first describe the basic concepts of tabu search. The method begins with a complete, feasible

solution (obtained, e.g., by a constructive heuristic) and, just like local improvement, it continues

developing additional complete solutions from a sequence of neighbourhoods. However, to



escape from a local optimum (such as point B in Figure 4), moves to neighbours with inferior

solutions are permitted. Moreover, a mechanism is used that prevents cycling back to recently

visited solutions (in particular, the local optimum). Specifically, recent solutions (or attributes of

solutions) are maintained on a so-called tabu list preventing certain solutions from recurring

for a certain number of iterations, called the size (or length) of the list. This size is a key

controllable parameter of the metaheuristic. A record is maintained of the best solution to date,

x*, and its objective function value f(x*). The tabu status of a move can be overridden through

the use of a so-called aspiration criterion. The simplest version is the following (shown for a

maximization problem): if the move leads to a solution x having f(x) > f(x*), then the move is, of

course, permitted. Typically the procedure is terminated after either a prescribed total number of

iterations or if no improvement is achieved in some other specified number of consecutive

iterations. Just as in basic neighbourhood search a key issue is defining the neighbourhood so

that the computational effort is not prohibitive, yet very good solutions are still achieved. The

concept of a coarse filter, discussed under beam search, is relevant here. In its basic form, given

the initial solution, tabu search does not include any random elements in contrast with two other

metaheuristics to be discussed later, namely simulated annealing and evolutionary algorithms.
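A minimal sketch of this basic version for a 0-1 problem (maximization), where the neighbourhood is all single-bit flips, the tabu attribute is the flipped position, and the simple aspiration criterion above is applied:

```python
from collections import deque

def tabu_search(f, x0, n_iter=100, list_size=3):
    """Basic tabu search over 0-1 vectors (maximization).  The best
    admissible neighbour is always taken, even when it is worse than the
    current solution (allowing escape from local optima); recently flipped
    positions are tabu for `list_size` iterations unless the move beats
    the best solution found so far (aspiration)."""
    x = list(x0)
    best, best_f = list(x), f(x)
    tabu = deque(maxlen=list_size)  # recently flipped positions
    for _ in range(n_iter):
        moves = []
        for i in range(len(x)):
            y = list(x)
            y[i] = 1 - y[i]
            fy = f(y)
            if i not in tabu or fy > best_f:  # aspiration criterion
                moves.append((fy, i, y))
        if not moves:
            continue
        fy, i, y = max(moves)  # best admissible move, even if inferior
        x = y
        tabu.append(i)
        if fy > best_f:
            best, best_f = list(y), fy
    return best, best_f
```

Note that the procedure is deterministic given the starting solution, in line with the remark above, and that the record of the best solution to date is what is finally returned.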

There are a wide variety of enhancements of the basic version of tabu search described above.

These include:

i) Dealing with an objective function that is difficult to evaluate – Sometimes it is easier to

evaluate the change in the objective function in moving from xc to a neighbouring solution x. In

choosing among neighbours the changes, rather than the absolute values, are sufficient. As in the

earlier section on approximate methods one may choose to use an approximate evaluation

function, at least for screening purposes. A specific case where it is difficult to evaluate the

objective function is when there are random elements present. Costa and Silver89 have

demonstrated that a form of statistical sampling works quite well in choosing neighbourhood

moves.

ii) Dealing with constraints – The most common approach is to use an evaluation function that

is the original objective function plus a penalty function for violations of each of the constraints.

This is similar in spirit to Lagrangian relaxation, discussed earlier. Costa90 illustrates this

approach in the context of scheduling professional sports matches where constraints include: no

more than a certain number of consecutive home games, no more than 3 matches in 4

consecutive days, an even distribution of games throughout the season, etc.

iii) Probabilistic selection of candidate solutions – The usual tabu approach considers the

solutions in the neighbourhood N (xc) of the current solution xc in a deterministic fashion,

possibly according to a priority scheme. Probabilistic tabu search (Glover86 ) permits using a

random mechanism to choose from a set of candidate solutions where the probabilities can be

based on attributes of the solutions.

iv) Variations of the tabu mechanism/list – One possibility is to systematically and/or randomly

change the length of the list (Taillard91 ). Another option (Hasegawa et al92 ) is to have a

continuous (between 0 and 1) tabu status that decays with time. Suppose that at a particular

iteration the tabu status of a solution x is t. Then, if x is considered, with probability 1-t its tabu

status is ignored. Nonobe and Ibaraki93 present a method for adaptively adjusting

the length of the list based upon performance characteristics in recent iterations.

v) More sophisticated versions of aspiration criteria - The override of tabu status can be based

on specific attributes of the candidate solution, not just its objective function value. One

possibility is the degree of “change” in the solution compared with the current one. A large

change is desirable if one is attempting to move away from a local optimum.

vi) Frequency-based memory – the usual tabu memory is short term, ie. what has happened

recently. In addition, one can maintain a longer-term memory that records frequencies of various

solutions (or attributes of solutions). This can be used for two purposes. First, one can intensify,

ie. focus, the search in the areas of previously observed, high quality (elite) solutions. Second, in

the longer term one can diversify by seeking to generate solutions that are significantly different

(as measured by one or more attributes) from those previously encountered. A closely related

concept is path relinking (Glover and Laguna 88 ) where new solutions are generated by starting at

a high quality solution and generating paths that are forced to end up at other high quality

solutions.

vii) Dealing with continuous variables – Chelouah and Siarry94 describe the use of tabu search

for global optimization when the decision variables are continuous. The neighbourhood of a

current solution, xc , is defined by the region (sphere or hyperrectangle) that is within a certain

distance of xc. Solutions are selected at random within a series of different sized

neighbourhoods. Diversification aspects are also discussed.

There are a number of controllable parameters in a tabu search including the size of the tabu list

(and possibly how to dynamically adjust it), how to decide when it is time to diversify, the

weights to use in bringing constraints into the evaluation function, etc.



Simulated Annealing

Simulated annealing is another, commonly used, metaheuristic designed to permit escaping from

local optima. References include Anandalingam95 , Dowsland 96 , and Vidal97 , where the latter is a

monograph devoted to applications including the travelling salesman problem,

telecommunications network design, and facility location decisions. The name “simulated

annealing” is due to the fact that conceptually it is similar to a physical process, known as

annealing, where a material is heated into a liquid state then cooled back into a recrystallized

solid state. It has some similarities to tabu search. Both start with an initial complete feasible

solution and iteratively generate additional solutions, both can exactly or approximately evaluate

candidate solutions, both maintain a record of the best solution obtained so far, and both must

have a mechanism for termination (a certain total number of iterations or a prescribed number of

consecutive iterations without any improvement). However, there are two important differences

between the methods. Tabu search permits moving away from a local optimum (ie. diversifying)

by an essentially deterministic mechanism, whereas, as we’ll see, a probabilistic device is used in

simulated annealing. Second, tabu search tends to temporarily permit moving to poorer solutions

only when in the vicinity of a local optimum, whereas this can happen at any time in simulated

annealing.

At any iteration k we have a current solution xc and a candidate solution x selected (at random or

in a systematic fashion) from the neighbourhood N(xc). Suppose we are dealing with a

maximization problem. As a result, if f(x) > f(xc), then x becomes the new xc on the next

iteration. If f(x) < f(xc), then there is still a chance that x replaces xc; specifically, the associated

probability is

P(xc → x) = exp{−[f(xc) − f(x)]/Tk}   … (3)

where Tk is a parameter called the temperature. The probability of accepting the inferior

solution x is seen to decrease as the performance gap between xc and x increases or as the

temperature becomes smaller. The sequence of temperatures usually satisfies T1 > T2 > …, ie.

the temperature is gradually decreased. There are various mechanisms of achieving this; a

common one being to maintain a fixed temperature (T) for a specified number (n) of iterations,

then use φT for the next n iterations, then φ²T for the next n iterations, etc., where φ is a

controllable parameter satisfying 0 < φ < 1. Decreasing temperatures mean that in the early

iterations diversification is more likely, whereas in the later stages the method resembles simple

local improvement. Of course, one can use a more sophisticated dynamic control of T where it

can, for example, be temporarily increased any time it appears that the procedure has stalled at a

local optimum (see Osman82 ). When the search is terminated it makes sense to do a subsequent

local search to ensure that the final solution is at a local optimum.
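A sketch of the basic procedure (maximization) with the geometric cooling schedule described above; the neighbour function and the objective in the usage below are hypothetical:

```python
import math
import random

def simulated_annealing(f, x0, neighbour, t0=10.0, phi=0.9,
                        n_per_temp=50, n_temps=40, seed=7):
    """Simulated annealing for maximization.  An improving candidate is
    always accepted; an inferior one is accepted with probability
    exp((f(x) - f(xc)) / T).  T is held fixed for n_per_temp iterations
    and then multiplied by phi (0 < phi < 1)."""
    rng = random.Random(seed)
    xc, fc = x0, f(x0)
    best, best_f = xc, fc
    T = t0
    for _ in range(n_temps):
        for _ in range(n_per_temp):
            x = neighbour(xc, rng)
            fx = f(x)
            if fx >= fc or rng.random() < math.exp((fx - fc) / T):
                xc, fc = x, fx
                if fc > best_f:
                    best, best_f = xc, fc
        T *= phi
    return best, best_f
```

A final local search starting from the returned solution would guarantee that it is at least a local optimum.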

As with any metaheuristic, there are a number of controllable parameters in simulated annealing,

in particular the sequence of temperatures (e.g. the initial temperature and the φ value) and the

termination criterion.

There are other possibilities, besides those of tabu search and simulated annealing, for

occasionally accepting inferior solutions so as to escape from local optima. For example,

Dueck98 presents three simple, deterministic mechanisms.



Multi-Start Constructive Approaches – the Adaptive Reasoning Technique

In an earlier section we discussed basic constructive heuristics. Specifically a solution is built up

incrementally so that a complete solution is not obtained until the end of the construction. Also

mentioned was the so-called greedy approach where at each step the element is added which,

considered alone, has the most beneficial impact on the objective function. Multi-start

constructive procedures redo the construction many times from different starting points and/or

introduce random choice elements. The simplest form is to repeat a constructive heuristic from

different starting points. Tsubakitani and Evans 99 have done this for the travelling salesman

problem. After generating a number of constructive solutions, they do a regular local search

starting from the best of these, then from the second best, etc. The greedy randomized

adaptive search procedure (Feo and Resende100 ) starts from a number of random points and at

each step of the construction of a solution a random choice is made from a short list of the most

greedy elements, ie. the greediest choice is not necessarily made. Again, local improvement of

each complete constructive solution is carried out.
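A compact sketch of the greedy randomized construction for the TSP, restart loop included (the local-improvement phase that the full procedure applies to each constructed tour is omitted for brevity, and the coordinates in the usage below are hypothetical):

```python
import math
import random

def grasp_tsp_construct(coords, n_starts=30, rcl_size=3, seed=3):
    """Repeat a randomized greedy TSP construction n_starts times and keep
    the best tour.  At each step the next city is chosen at random from
    the rcl_size nearest unvisited cities (the restricted candidate list),
    so the greediest choice is not necessarily made."""
    rng = random.Random(seed)

    def length(tour):
        return sum(math.dist(coords[tour[i]], coords[tour[(i + 1) % len(tour)]])
                   for i in range(len(tour)))

    best, best_len = None, float('inf')
    for _ in range(n_starts):
        start = rng.randrange(len(coords))
        tour, unvisited = [start], set(range(len(coords))) - {start}
        while unvisited:
            ranked = sorted(unvisited,
                            key=lambda c: math.dist(coords[tour[-1]], coords[c]))
            choice = rng.choice(ranked[:rcl_size])
            tour.append(choice)
            unvisited.remove(choice)
        tour_len = length(tour)
        if tour_len < best_len:
            best, best_len = tour, tour_len
    return best, best_len
```

Setting rcl_size to 1 recovers the deterministic nearest neighbour heuristic; larger values trade greediness for diversity across the restarts.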

The above-discussed multistart procedures do not use information from earlier complete

solutions to modify the choice mechanism. Incorporation of this type of feature is the main idea

of the adaptive reasoning technique, ART (Patterson, Rolland and Pirkul101 ). Memory is used to

aid in learning about the behaviour of a greedy heuristic on a specific problem instance.

Constraints are imposed to prevent the greedy heuristic from making, what have been observed

to be, poor choices (of specific decision variables). The constraints are dropped after a certain

number of iterations (similar to entries on a tabu list). A closely related use of adaptive memory

in constructive methods is described by Fleurent and Glover102 . In a sense ART searches for the

appropriate set of variables to not include rather than the usual goal of seeking which variables to

include. The method has been applied to several problem areas including workforce assignment

and the design of telecommunication networks.

Evolutionary Algorithms

Evolutionary algorithms, as the name implies, are a class of metaheuristics that emulate natural

evolutionary processes. Sometimes the adjective “genetic” is used in lieu of “evolutionary”. A

major portion of the Michalewicz and Fogel7 book is devoted to the subject. Another general

reference is Reeves103 in which several applications (with associated references) are discussed,

including the travelling salesman problem. Other illustrative applications include production

scheduling (Moon et al104 , and Rochat105 ) and telecommunications systems design (Armony et

al106 ).

Evolutionary algorithms work with a group or population of solutions (in marked contrast with

earlier discussed metaheuristics). At each iteration each solution in the current population is

evaluated. The evaluations serve to select a subset of the solutions to be either used directly in

the next population or indirectly through some form of transformation or variation (adjustment of

the single solution or generation of a new solution by combining part of the solution with part of

another one). The parts of a solution can be thought of as genes, adjustments of single solutions

as mutations, and combination as mating to produce offspring. As with the earlier discussed

metaheuristics, a record is maintained of the best solution to date.



Other issues are representation (how to represent a solution in the form of a vector of genes),

choosing the initial solutions, and termination. We next comment briefly on each of these as

well as provide some further detail on each of evaluation, selection, and variation.

In a problem with n 0-1 variables a natural representation is simply a vector of n 0-1 genes. In a

travelling salesman problem with n cities numbered 1 to n an appropriate representation is an

ordering of the n cities (e.g. 5,3,4,1,2 would represent a tour going from city 5 to 3 to…to 2 to 5).

Other types of representations are discussed by Michalewicz and Fogel7 .

The initial population can be simply randomly generated, but it makes sense to include at least

one solution that is obtained by a good (constructive) heuristic. Another possibility is to ensure

that the initial population is well distributed throughout the solution space. As with the previous

metaheuristics, termination results from completing a prespecified number of iterations or

through seeing a prespecified (different) number of iterations without improvement.

One might think that the evaluation of a solution, sometimes called its fitness, should simply be

based on the associated value of the objective function. However, to avoid possible convergence

to a population of very similar solutions it may be more appropriate to base the subsequent

selection step on a linear transformation of the objective function values or to simply use a

ranking, as opposed to absolute continuous values (Reeves103 ).

There are a variety of possible options for the selection step. Some individuals (solutions) can be

eliminated from consideration in a deterministic fashion based on their fitness levels.

Alternatively, individuals can be randomly selected (including more than once) for use in the

variation phase, where the probability of selection is based on the fitness values (or their

rankings).

Selected individuals (solutions) are subjected to forms of variation to produce individuals for the

next generation. The most common way of combining two solutions to form two new ones is

called simple crossover. The genes of the two parents to the left of the crossover point are

interchanged. This is illustrated in Figure 6 where the crossover point is after the 2nd gene. More

elaborate crossovers are possible using more than one crossover point. Yagiura and Ibaraki107 ,

instead of crossover, determine the common characteristics of the two parents and then

efficiently search for the best solution that satisfies these characteristics.

The second common type of variation is mutation where one or more genes of a solution are

individually changed with a small probability. There are other forms of variation, partly

depending upon the method of representation of a solution (Reeves103 ).
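Simple crossover and mutation, and a minimal generational loop built on them, can be sketched as follows (maximization over 0-1 strings; the one-max fitness in the usage below is a standard toy example, and the parameter values are illustrative assumptions):

```python
import random

def simple_crossover(parent1, parent2, point):
    """Exchange the genes of the two parents to the left of the crossover
    point (cf. Figure 6), producing two offspring."""
    return (parent2[:point] + parent1[point:],
            parent1[:point] + parent2[point:])

def evolutionary_algorithm(fitness, n_genes, pop_size=20, n_gens=60,
                           p_mutation=0.05, seed=11):
    """A minimal generational evolutionary algorithm on 0-1 strings.
    Selection: the fitter half of the population become parents.
    Variation: simple crossover at a random point, then each gene of an
    offspring is flipped (mutated) with a small probability."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_genes)]
           for _ in range(pop_size)]
    best = max(pop, key=fitness)  # record of the best solution to date
    for _ in range(n_gens):
        parents = sorted(pop, key=fitness, reverse=True)[:pop_size // 2]
        pop = []
        while len(pop) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            child1, child2 = simple_crossover(p1, p2,
                                              rng.randrange(1, n_genes))
            for child in (child1, child2):
                pop.append([1 - g if rng.random() < p_mutation else g
                            for g in child])
        best = max(pop + [best], key=fitness)
    return best
```

Replacing the deterministic truncation selection with fitness-proportional or rank-based random selection, as discussed above, is a one-line change.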

There are a considerable number of controllable parameters and other choices in the use of an

evolutionary algorithm to solve a given problem. The parameters include the size of the

population, the probability of mutation of an individual gene (which may be best varied during

the evolutionary process), the mechanism for generating the size of the mutation change when

genes are not just 0-1, the number of crossover points, etc.

There are a variety of enhancements of evolutionary algorithms. Most parallel the earlier

discussion related to tabu search. Possible enhancements include the handling of constraints

(Michalewicz and Fogel7 ) and continuous variables (Chelouah and Siarry108 ), approximate

(rather than exact) evaluation of a solution, coping with stochastic elements, etc. Finally

Glover109 describes a more general process, called scatter search, for using combinations of

several solutions to produce new solutions (see also Laguna110).

Interactive Methods

As was shown in Figure 1 the last two phases in using a mathematical model to aid in resolving a

real world situation of concern are (development of a) mathematical model of the perceived

problem, and solution of the model. This paper has been deliberately focussed on the latter.

Interactive (computer/human) procedures have been found to be particularly useful in the

development of the model, particularly where the objective function and/or some of the

constraints are difficult to explicitly specify. Graphical portrayal of the results of a tentatively

specified model leads to adjustments, etc. (see Bell111 and Bright and Johnston112 for illustrative

discussions of so-called visual interactive modelling). In addition, interactive methods can

afford advantages in the development and use of heuristic solution procedures, viz.

i) The graphical representation of a problem, together with a user-friendly interface, can permit

the analyst or decision maker to suggest promising solutions, which can then be evaluated by the

computer. Fisher113 describes applications in vehicle scheduling, location decisions and

production scheduling. Also see Segal and Weinberger114 for an application concerned with

dividing a geographic area into a number of separate customer service areas.

ii) The graphical representation of results (as a function of the number of iterations and the

values of other controllable parameters) can facilitate the fine-tuning of metaheuristics

(Dowsland96 ).

Evaluating the Performance of a Heuristic

There are two broad measures of performance, namely i) how the objective function value obtained compares to that achievable by the optimal solution or some other benchmark procedure, and ii) the computational requirements of the heuristic. With respect to the latter, the

heuristic should require reasonable computational effort and memory to obtain the solution for

realistic sized problems.

The subsequent discussion here will focus on the objective function value achieved.

Specifically, it is desirable to have very good average performance, ie. the solution value

obtained is close to the optimum value, on average. In addition, robustness is desired in two

senses. First, there should be a very low chance of achieving a poor solution. Second, the

performance should not be sensitive to the actual or estimated values of the parameters of the

problem. If the results are found to be sensitive, then it is helpful to specify under which

conditions the heuristic should be used/not used.

If possible, a histogram of penalties in the objective function value should be obtained. In

particular, for what percentage of the problem instances does the heuristic obtain the optimal

solution? Also, normally one expresses penalties in a percentage form, viz. (for a minimization

problem)

Percent penalty = 100 × [f(x_h) − f(x_o)] / f(x_o)   …. (4)

where x_h is the heuristic solution and x_o is the optimal solution.



However, if f(x_o) is very close to or even equal to zero, extremely high percentage penalties will result even if f(x_h) is only slightly above f(x_o). Thus, Cornuejols et al115 recommended the use of

Modified percent penalty = 100 × [f(x_h) − f(x_o)] / [f(x_r) − f(x_o)]   …. (5)

where x_r is some reference solution, such as that obtained using the current decision rule.

Also, in some instances, the decision maker might be more interested in the

Absolute penalty = f(x_h) − f(x_o)   …. (6)
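For concreteness, a small sketch (not from the paper) computing the three penalty measures (4)-(6), with illustrative solution values:

```python
def percent_penalty(f_h, f_o):
    """Equation (4): percentage gap of the heuristic value f_h
    relative to the optimal value f_o (minimization)."""
    return (f_h - f_o) / f_o * 100

def modified_percent_penalty(f_h, f_o, f_r):
    """Equation (5): gap scaled by a reference solution value f_r,
    which stays meaningful even when f_o is close to zero."""
    return (f_h - f_o) / (f_r - f_o) * 100

def absolute_penalty(f_h, f_o):
    """Equation (6): raw difference between heuristic and optimal values."""
    return f_h - f_o

# illustrative values: heuristic 102, optimum 100, current decision rule 110
p4 = percent_penalty(102.0, 100.0)
p5 = modified_percent_penalty(102.0, 100.0, 110.0)
p6 = absolute_penalty(102.0, 100.0)
```

Note how the modified penalty (20% of the gap between the current rule and the optimum) tells a different story than the raw 2% penalty.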

Experimental Set of Problem Instances

References related to this topic include Barr et al116 , Rardin and Uzsoy117 , and Hooker118 . In

general, the performance of a heuristic depends upon many parameters (or factors) of the

problem. As a result it makes sense to carry out a carefully designed experimental set of test

problems, whose results are then statistically analyzed. Typically there are too many factors to

permit a complete factorial design of experiments. Insight (partly from an understanding of the related theory), as well as an investigation of the results of preliminary experiments, can suggest which factors are likely to be important. The tested values of the factors should be

representative of those observed in the real world problem context being studied. Illustrative

references on the design of experiments are Montgomery119 and Kuehl120 .
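As a sketch of such a designed experiment (the factor names, levels, and the placeholder "heuristic" below are purely illustrative assumptions, not from the paper), one can run the heuristic over every cell of a small factorial design, with several replications per cell, and record a summary statistic:

```python
import random
from itertools import product
from statistics import mean

# illustrative factors of the problem generator and their tested levels
factor_levels = {
    "n_items":        [20, 50],
    "capacity_ratio": [0.25, 0.50, 0.75],
    "cost_spread":    ["low", "high"],
}

def penalty_of_one_run(n_items, capacity_ratio, cost_spread, seed):
    """Placeholder for: generate a random instance with these factor
    settings, run the heuristic, and return its percentage penalty."""
    rng = random.Random(f"{n_items}-{capacity_ratio}-{cost_spread}-{seed}")
    return rng.uniform(0.0, 5.0)

results = {}
for levels in product(*factor_levels.values()):
    cell = dict(zip(factor_levels, levels))
    # ten replications per design cell, then a cell summary
    penalties = [penalty_of_one_run(**cell, seed=s) for s in range(10)]
    results[levels] = mean(penalties)
```

The cell summaries would then be analyzed statistically (e.g., analysis of variance) to see which factors drive the heuristic's penalty.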

There are two other approaches that have sometimes been advocated in lieu of a designed set of experiments, but both are of limited practical value. One, so-called probabilistic analysis (see, for example, Evans and Hall121 ), assumes that each of the parameters follows a prescribed, independent probability distribution and then, analytically (typically only possible under very questionable assumptions) or by random sampling, develops a probability distribution of the performance of the heuristic. The other, worst case analysis (see, for example, Fisher33 )

determines the worst possible performance of the heuristic (and the associated values of the

problem parameters). However, even when worst case performance is very poor, typical or

average performance can often be very good.

Comparison with Optimal Solution

Consider a minimization problem. For any instance of the problem we can plot values of the

objective function vertically as in Figure 7. (Incidentally, the modified percentage penalty of (5)

is seen to represent the ratio of two vertical distances in Figure 7). Ideally we’d like to compare

the heuristic solution value with that of the optimal solution. However, as discussed earlier, one

of the primary reasons for using heuristics is that it is not practical to find the optimal solution

for realistic sized problems. Thus, comparison may only be possible with smaller scale problems

or special cases of larger problems. As a result, it is necessary to do other types of comparisons

as discussed in the next two subsections. However, there is another approach for estimating the optimal value, namely so-called extreme value estimation (see Marin and Salmeron122 , and Zanakis and Evans15 ). Suppose that f(x_1), f(x_2), ..., f(x_m) represent the values of the solutions of m independent trials of a heuristic (involving probabilistic elements) on the same problem instance. Then, based on the assumption that these are independent draws from an extreme value distribution (three-parameter Weibull distribution), one of whose parameters is the (unknown) minimum value, either point or interval estimates of the minimum can be constructed for comparison with the lowest of the m values found by the heuristic.
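The best-of-m idea can be sketched as follows; the three-order-statistic point estimator used here is one simple closed-form choice that has appeared in the extreme-value estimation literature, adopted as an illustrative assumption rather than taken from the paper (a full maximum-likelihood Weibull fit would be used in practice).

```python
def estimate_minimum(values):
    """Point estimate of the unknown minimum objective value from the
    results of m independent heuristic trials on one problem instance,
    using the two smallest and the largest order statistics."""
    y = sorted(values)
    y1, y2, ym = y[0], y[1], y[-1]
    denom = y1 + ym - 2.0 * y2
    if denom <= 0:
        # estimator undefined (second-best value too close to the worst);
        # fall back to the best value observed
        return y1
    return (y1 * ym - y2 * y2) / denom

# six trial values of a randomized heuristic on the same instance
trials = [104.0, 101.5, 108.0, 100.8, 103.2, 106.1]
est = estimate_minimum(trials)   # lands a little below the best value, 100.8
```

The estimate never exceeds the best value found, so the gap between the two gives a rough indication of how much further improvement additional trials might yield.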

Comparison with Bounds

We continue focussing on a minimization problem. As discussed in the earlier subsection

“Approximative Methods” one can obtain a lower bound on the optimal solution value by

solving (a typically simpler) problem resulting from relaxing one or more constraints. (Typical

relaxations were discussed in that subsection). The lower bound is portrayed by the lowest line

in Figure 7. A different approach to bounding is demonstrated by Klein and Scholl123 . For a minimization problem they select a value V of the objective function and, using this as a constraint, attempt to find at least one feasible solution satisfying it and all the original constraints of the problem. If it can be proved that there is no such feasible solution, then V is a valid lower bound for the original problem; establishing this proof may be a very difficult undertaking, as the modified formulation is the so-called constraint satisfaction problem (see Nanobe and Ibaraki93 or Tsang124 ), which is known to be NP-complete. One keeps increasing V incrementally until a feasible solution is first found.
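This idea can be sketched on a toy two-machine makespan-minimization instance (all problem data below are illustrative): feasibility at level V is a constraint satisfaction question, and every V proved infeasible is a valid lower bound.

```python
from itertools import combinations

def feasible(jobs, V):
    """Can the jobs be split over two machines so that neither load
    exceeds V? (Brute-force check; adequate only for tiny instances.)"""
    total = sum(jobs)
    for r in range(len(jobs) + 1):
        for subset in combinations(jobs, r):
            if sum(subset) <= V and total - sum(subset) <= V:
                return True
    return False

def lower_bound_by_destruction(jobs):
    V = 0
    while not feasible(jobs, V):
        V += 1        # V was proved infeasible, so it is a valid lower bound
    return V          # first feasible V (here, in fact, the optimal makespan)

lb = lower_bound_by_destruction([3, 5, 4, 6])   # splits as {3, 6} / {5, 4}
```

In realistic problems each infeasibility proof is itself hard, so one stops after a limited effort and reports the largest V proved infeasible, plus one, as the bound.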

If the gap between f(x_h) and LB is small, then we know that the distance between f(x_h) and the unknown f(x_o) must be small. However, a large gap between f(x_h) and LB leaves us uncertain as to the performance of the heuristic, in that it can be due to one or both of i) LB is a very weak bound (ie. there is a large gap between f(x_o) and LB) and/or ii) the heuristic has performed poorly (ie. there is a large difference between f(x_h) and f(x_o)).



Other Comparisons

Other comparisons are necessary when neither the optimal solution nor a good bound can be

obtained. However, there is an even more compelling argument for comparing the performance

of the heuristic with that of another method, namely that which has been previously used by the

organization under study to make the associated decision(s). Management is more likely to be

convinced of the utility of a heuristic method if it is shown to significantly outperform current

practice than by arguments that it comes close to (the relatively vague concept of) optimality.

The heuristic can also be compared with an earlier proposed heuristic for the same problem, but

the danger here is that the latter’s performance could be quite poor.

Summary

This paper has provided an overview of heuristic solution methods. Several reasons for their use

were provided and a wide variety of methods were presented, including several metaheuristics.

Guidelines were also given with respect to evaluation of performance. It is hoped that the reader

will be motivated to actively develop and/or use heuristic methods to deal with the increasingly

challenging mathematical representations of a wide variety of important decision problems.

Acknowledgements - The research leading to this paper was supported by the Natural Sciences and Engineering Research Council of Canada under Grant No. A1485 and by the Carma Chair at the

University of Calgary. I wish to thank a number of people who either directly provided inputs

that were used to prepare this paper or whose writings have had a major impact on my thinking

related to heuristic methods. These individuals include Sven Axsäter, Peter Bell, Robert G.

Brown, Ton de Kok, Dominique de Werra, Sam Eilon, Jim Evans, Marshall Fisher, Fred Glover,

Ron Howard, Tom Morton, Heiner Müller-Mehrbach, Eliezer Naddor, Ray Patterson, Haim

Shore, David Simchi-Levi, Henk Tijms, Gene Woolsey and Stelios Zanakis.

References:

1 Ackoff RL (1977). Optimization + objectivity = opt out. European Journal of


Operational Research 1: 1-7.

2 Müller-Mehrbach H (1981). Heuristics and their design: a survey. European Journal of


Operational Research 8 :1-23.

3 Clayson J (1984). Micro-Operational Research: A Simple Modeling Tool for Managers.


In: J. Richardson (ed). Models of Reality, Lomond: Mt.Airy, Maryland, Chapter 16.

4 Lenat DB (1982). The nature of heuristics. Artificial Intelligence 19: 189-249.

5 Foulds L R (1983). The heuristic problem-solving approach. Journal of the Operational


Research Society 34: 927-934.

6 Zanakis SH, Evans JR, and Vazacopoulos AA (1989). Heuristic methods and
applications: a categorized survey. European Journal of Operational Research 43: 88-
110.

7 Michalewicz Z and Fogel DB (2000). How to Solve It: Modern Heuristics Springer-
Verlag: Berlin.

8 Morton TE and Pentico DW (1993). Heuristic Scheduling Systems. Wiley – Interscience:


New York.

9 Reeves CR (ed) (1993). Modern Heuristic Techniques for Combinatorial Problems.


Halsted Press: New York.

10 Ignizio JP (1980). Solving large-scale problems: a venture into a new dimension. Journal
of the Operational Research Society 31: 217-225.

11 Muller-Malek H, Matthys D, and Nelis E (1997). Heuristics and expert-like systems.
Belgian Journal of Operations Research, Statistics, and Computer Science 27: 25-63.

12 Pinedo M and Simchi-Levi D (1996). Heuristic methods. In: M. Avriel and B. Golany
(eds.). Mathematical Programming for Industrial Engineers. Marcel Dekker: New York,
pp. 575-617.

13 Silver EA, Vidal RVV and de Werra D (1980). A tutorial on heuristic methods.
European Journal of Operational Research 5: 153-162.

14 White DJ (1990). Heuristic programming. IMA Journal of Mathematics Applied in


Business & Industry 2: 173-188.

15 Zanakis SH and Evans JR (1981). Heuristic “optimization”: why, when, and how to use
it”. Interfaces 11: 84-91.

16 Churchman CW (1970). Operations research as a profession. Management Science 17:


B37-B53.

17 Eilon S (1977). More against optimization. Omega 5: 627-633.

18 Jönsson H and Silver EA, (1989). Common component inventory problems with a budget
constraint: heuristics and upper bounds. Engineering Costs and Production Economics
18: 71-81.

19 Yano CA and Lee HL (1995). Lot sizing with random yields: a review. Operations
Research 43: 311-334.

20 Silver EA, Pyke DF, and Peterson R (1998). Inventory Management and Production
Planning and Scheduling. 3rd edition. Wiley: New York.

21 Garey MR and Johnson DS (1979) Computers and Intractability: A Guide to the Theory
of NP-Completeness. Freeman: New York.

22 Lawler E (1976). Combinatorial Optimization. Holt Rinehart and Winston: New York.

23 Hall L (2000) Computational complexity. In: Gass SI and Harris CM (eds).


Encyclopedia of Operations Research and Management Science, 2nd edition. Kluwer:
Boston, pp 119-122.

24 Hoffman KL and Padberg M (2000). Traveling salesman problem. In: Gass SI and Harris
CM (eds). Encyclopedia of Operations Research and Management Science, 2nd edition.
Kluwer: Boston, pp 849-853.

25 Kaku B (2000). Facilities layout. In: Gass SI and Harris CM (eds). Encyclopedia of
Operations Research and Management Science, 2nd edition. Kluwer: Boston, pp 279-
282.

26 Kolisch R (1999). Resource allocation capabilities of commercial project management


software packages. Interfaces 29: 19-31.

27 Magnanti TL (2000). Network optimization. In: Gass SI and Harris CM (eds).


Encyclopedia of Operations Research and Management Science, 2nd edition. Kluwer:
Boston/Dordrecht/London, pp 555-561

28 Bodin L (2000). Vehicle routing. In: Gass SI and Harris CM (eds). Encyclopedia of
Operations Research and Management Science, 2nd edition. Kluwer: Boston, pp 865-870.

29 Cordeau JK, Gendreau M, Laporte G, Potvin JY and Semet F (2002) A guide to vehicle
routing heuristics. Journal of the Operational Research Society 53: 512-522.

30 Woolsey RED and Swanson HS (1975) Operations Research for Immediate


Applications. Harper and Row: New York, p. 68.

31 Haessler RW (1983) Developing an industrial-grade heuristic problem-solving


procedure. Interfaces 13: 62-71.

32 Bartholdi III JJ and Platzman LK (1988) Heuristics based on spacefilling curves for
combinatorial problems in Euclidean space. Management Science 34: 291-305.

33 Fisher ML (1980). Worst-case analysis of heuristic algorithms. Management Science


26: 1-17.

34 Mason A, Ryan D, and Panton D (1998). Integrated simulation, heuristic and


optimization approaches to staff scheduling. Operations Research 46: 161-175.

35 Ruggiero VR (1995). The Art of Thinking. 4th edition. HarperCollins College Publishers:
New York, p.75.

36 Baum S and Carlson R (1979). On solutions that are better than most. Omega 7: 249-
255.

37 Mabert VA and Whybark DC (1977). Sampling as a solution methodology. Decision


Sciences 8: 167-180.

38 Bollapragada S and Morton TE (1999). Myopic heuristics for the random yield problem.
Operations Research 47: 713-722.

39 Federgruen A and Tzur M (1999). Time-partitioning heuristics: application to one


warehouse, multi-item, multi-retailer lot-sizing problems. Naval Research Logistics 46:
463-486.

40 Karp RM (1977). Probabilistic analysis of partitioning algorithms for the traveling-


salesman problem in the plane. Mathematics of Operations Research 2: 209-224.

41 Hax AC and Meal HC (1975). Hierarchical integration of production planning and


scheduling. In: M.A.Geisler (ed.). Logistics 1. North-Holland: Amsterdam, pp. 53-69.

42 Bitran GR and Tirupati D (1993). Hierarchical production planning. In: S.C. Graves,
A.H.G. Rinnooy Kan, and P.H. Zipkin (eds.). Logistics of Production and Inventory 4.
North Holland: Amsterdam, Chapter 10.

43 Fry TD, Cox JF, and Blackstone Jr. JH (1992). An analysis and discussion of the
optimized production technology software and its use. Production and Operations
Management 1: 229-242.

44 Dixon PS and Silver EA (1981). A heuristic solution procedure for the multi-item,
single-level, limited capacity, lot-sizing problem. Journal of Operations Management
2: 23-40.

45 Polya G (1957). How to Solve It: A New Aspect of Mathematical Method. Doubleday
Anchor: New York.

46 Bilde O and Vidal RVV (1973). On the connections between locations and scheduling
problems. In: Johnson M and Ashour S (eds). Simulation Councils Proc. Ser. Vol. 3:
No. 2.

47 Ehrhardt R (1979). The power approximation for computing (s,S) inventory policies.
Management Science 25: 777-786.

48 Ignizio JP and Burke LI (2000). Neural networks. In: Gass SI and Harris CM (eds).
Encyclopedia of Operations Research and Management Science, 2nd edition. Kluwer:
Boston, pp 569-571.

49 Cunningham P and Smyth B (1997). Case-based reasoning and scheduling: reusing


solution components. International Journal of Production Research 35: 2947-2961.

50 Rosing KE, ReVelle CS, Rolland E, Schilling DA, and Current JR (1998). Heuristic
concentration and tabu search: A head to head comparison. European Journal of
Operational Research 104: 93-99.

51 Evans JR (1983). A network decomposition/aggregation procedure for a class of


multicommodity transportation problems. Networks 13: 197-205.

52 Geoffrion AM (1977). A priori error bounds for procurement commodity aggregation in


logistics planning models. Naval Research Logistics Quarterly 24: 201-212.

53 Pentico, DW (1994). Multistage production systems with random yield: heuristics and
optimality. International Journal of Production Research 32: 2455-2462.

54 Rajagopalan S (2002) Make to order or make to stock: model and application.


Management Science 48: 241-256.

55 Silver EA and Meal HC (1973). A heuristic for selecting lot size quantities for the case
of a deterministic time-varying demand rate and discrete opportunities for replenishment.
Production and Inventory Management Journal 14: 64-74.

56 Bitran GR and Yanasse HH (1984) Deterministic approximations to stochastic


production problems. Operations Research 32: 999-1018.

57 Tijms HC (1986) Stochastic Modelling and Analysis: a Computational Approach.


Wiley:Chichester.

58 van Houtum GJ and Zijm WHM (1991) Computational procedures for stochastic multi-
echelon production systems. International Journal of Production Economics 23: 223-
237.

59 Shore H (1999). A general solution of the preventive maintenance problem when data
are right-censored. Annals of Operations Research 91: 251-261.

60 Shore H (1999). Optimal solutions for stochastic inventory models when the lead-time
demand distribution is partially specified. International Journal of Production
Economics 59: 477-485.

61 Zaino Jr. N and D’Errico J (1989). Optimal discrete approximations for continuous
outcomes with applications in decision and risk analysis. Journal of the Operational
Research Society 40: 379-388.

62 Grossman TA, Rohleder TR and Silver EA, (2000). A negotiation aid for fixed quantity
contracts with stochastic demand and production. International Journal of Production
Economics 66: 67-76.

63 Jönsson H and Silver EA, (1996). ‘Some insights regarding selecting sets of scenarios in
combinatorial stochastic problems’. International Journal of Production Economics 45:
463-472.

64 Consigli G and Dempster MAH (1998). The CALM stochastic programming model for
dynamic asset- liability management. Annals of OR 81: 131-161.

65 Feller W (1966). An Introduction to Probability Theory and Its Applications. Vol. II


Wiley: New York, p.355.

66 Silver EA, (1974). A control system for coordinated inventory replenishment.


International Journal of Production Research 12: 647-671.

67 Whitt W (1993). Approximations for the GI/G/m queue. Journal of Production and
Operations Management 2: 114-161.

68 Suri R and de Treville S (1991). Full speed ahead. OR/MS Today 18:3 34-42.

69 Glover F (1977). Heuristics for integer programming using surrogate constraints.


Decision Sciences 8: 156-166.

70 Silver EA (1976). A simple method of determining order quantities in joint


replenishments under deterministic demand. Management Science 22: 1351-1361.

71 Fisher ML and Jaikumar R (1981). A generalized assignment heuristic for vehicle


routing. Networks 11: 109-124.

72 Bertsimas D and Demir R (2002). An approximate dynamic programming approach to


multidimensional knapsack problems. Management Science 48: 550-565.

73 Patterson R and Rolland E , The cardinality constrained covering traveling salesman


problem, forthcoming in Computers and OR.

74 Beasley JE (1993). Lagrangian relaxation. In: Reeves CR (ed). Modern Heuristic


Techniques for Combinatorial Problems. Halsted Press: New York, Chapter 9.

75 Fisher ML (1985). An applications oriented guide to Lagrangian relaxation. Interfaces


15: 10-21.

76 Golden B, Bodin L, Doyle T and Stewart Jr. W (1980). Approximate traveling salesman
algorithms. Operations Research 28: 694-711.

77 Atkinson JB (1994). A greedy look-ahead heuristic for combinatorial optimization: an


application to vehicle scheduling with time windows. Journal of the Operational
Research Society 45: 673-684.

78 Lin S and Kernighan BW (1973). An effective heuristic algorithm for the traveling-
salesman problem. Operations Research 21: 498-516.

79 Box GEP and Youle PV (1955). The exploration and exploitation of response surfaces:
an example of the link between the fitted surface and the basic mechanism of the system.
Biometrics 11: 287-323.

80 Glover F (1990). Tabu search: a tutorial. Interfaces 20: 74-94.

81 Taillard ED, Gambardella LM, Gendreau M, and Potvin JY (2001). Adaptive memory
programming: a unified view of metaheuristics. European Journal of Operational
Research 135: 1-16.

82 Osman IH (1993). Metastrategy simulated annealing and tabu search algorithms for the
vehicle routing problem. Annals of Operations Research 41: 421-451.

83 Dorigo M, Maniezzo V and Colorni A (1996). The ant system: optimization by a colony
of cooperating agents. IEEE Transactions on Systems, Man, and Cybernetics – Part B
26: 29-41.

84 Mitten LG (1970). Branch and bound methods: general formulations and properties.
Operations Research 18: 24-34.

85 Ow PS and Morton TE (1988). Filtered beam search in scheduling. International


Journal of Production Research 26: 35-62.

86 Glover F (2000). Tabu search. In: Gass SI and Harris CM (eds). Encyclopedia of
Operations Research and Management Science, 2nd edition. Kluwer: Boston, pp 821-
827.

87 Glover F, Taillard E, and de Werra D (1993). A user’s guide to tabu search. Annals of
Operations Research 41: 3-28.

88 Glover F and Laguna M (1997). Tabu Search. Kluwer: London.

89 Costa D and Silver EA, (1998). Tabu search when noise is present: an illustration in the
context of cause and effect analysis. Journal of Heuristics 4: 5-23.

90 Costa D (1995). An evolutionary tabu search algorithm and the NHL scheduling
problem. INFOR 33: 161-177.

91 Taillard ED (1991). Robust tabu search for the quadratic assignment problem. Parallel
Computing 17: 443-455.

92 Hasegawa M, Ikeguchi T, Aihara K, Itoh K (2002) A novel chaotic search for quadratic
assignment problems. European Journal of Operational Research 139: 543-556.

93 Nanobe K and Ibaraki T (1998). A tabu search approach to the constraint satisfaction
problem as a general problem solver. European Journal of Operational Research 106:
599-623.

94 Chelouah R and Siarry P (2000). Tabu search applied to global optimization. European
Journal of Operational Research 123: 256-270.

95 Anandalingam G (2000). Simulated annealing. In: Gass SI and Harris CM (eds).


Encyclopedia of Operations Research and Management Science, 2nd edition. Kluwer:
Boston, pp 748 – 751.

96 Dowsland KA (1993). Simulated annealing. In: Reeves CR (ed). Modern Heuristic


Techniques for Combinatorial Problems. Halsted Press: New York, Chapter 2.

97 Vidal RVV (1993). Applied Simulated Annealing. Springer-Verlag: Berlin.



98 Dueck G (1993). New optimization heuristics: the great deluge algorithm and the record-
to-record travel. Journal of Computational Physics 104: 86-92.

99 Tsubakitani S and Evans JR (1998). An empirical study of a new metaheuristic for the
traveling salesman problem. European Journal of Operational Research 104: 113-128.

100 Feo TA and Resende MGC (1995). Greedy randomized adaptive search procedures.
Journal of Global Optimization 6: 109-133.

101 Patterson R, Rolland E, and Pirkul H (1999). A memory adaptive reasoning technique
for solving the capacitated minimum spanning tree problem. Journal of Heuristics 5:
159-180.

102 Fleurent C and Glover F (1999). Improved constructive multi-start strategies for the
quadratic assignment problem using adaptive memory. INFORMS Journal on
Computing 11: 198-204.

103 Reeves CR (1993). Genetic algorithms. In: Reeves CR (ed). Modern Heuristic
Techniques for Combinatorial Problems Halsted Press: New York, Chapter 4.

104 Moon I, Silver EA, and Choi S (2002). A hybrid genetic algorithm for the economic lot
scheduling problem. International Journal of Production Research 40: 809-824.

105 Rochat Y (1998). A genetic approach for solving a scheduling problem in a robotized
analytical system. Journal of Heuristics 4: 245-261.

106 Armony M, Klincewicz JG, Luss H, and Rosenwein MB (2000). Design of stacked
self-healing rings using a genetic algorithm. Journal of Heuristics 6: 85-105.

107 Yagiura M and Ibaraki T (1996). The use of dynamic programming in genetic algorithms
for permutation problems. European Journal of Operational Research 92: 387-401

108 Chelouah R and Siarry P (2000). A continuous genetic algorithm designed for the global
optimization of multimodal functions. Journal of Heuristics 6: 191-213.

109 Glover F (1995). Scatter search and star path: beyond the genetic metaphor.
OR Spektrum 17: 125-137.

110 Laguna M. Scatter search. To appear in Pardalos M and Resende MGC (eds.) Handbook
of Applied Optimization. Oxford University Press: New York, N.Y.

111 Bell PC (1991). Visual interactive modelling: the past, the present, and the prospects.
European Journal of Operational Research 54: 274-286.

112 Bright JG and Johnston K J (1991). Whither VIM? - a developer's view. European
Journal of Operational Research 54: 357-362.

113 Fisher ML (1985/86). Interactive optimization. Annals of Operations Research 5: 541-


556.

114 Segal M and Weinberger D (1977). Turfing. Operations Research 25: 367-386.

115 Cornuejols G, Fisher ML, and Nemhauser GL (1977). Location of bank accounts to
optimize float: an analytic study of exact and approximate algorithms. Management
Science 23: 789-810.

116 Barr R, Golden B, Kelly J, Resende M, and Stewart Jr. W (1995). Designing and
reporting on computational experiments with heuristic methods. Journal of Heuristics 1:
9-32.

117 Rardin RL and Uzsoy R (2001). Experimental evaluation of heuristic optimization


algorithms: a tutorial. Journal of Heuristics 7: 261-304.

118 Hooker JN (1995). Testing heuristics: we have it all wrong. Journal of Heuristics 1: 33-
42.

119 Montgomery DC (1991). Design and Analysis of Experiments. 3rd edition. Wiley: New
York.

120 Kuehl RO (2000). Design of Experiments: Statistical Principles of Design and Analysis.
2nd edition. Duxbury Press: Pacific Grove, Ca.

121 Evans JR and Hall RA (1984) Probabilistic analysis of assignment ranking: the traveling
salesman problem. American Journal of Mathematical and Management Sciences 4:
71-88.

122 Marin A and Salmeron S (1996). Tactical design of rail freight networks. Part II: Local
search methods with statistical analysis. European Journal of Operational Research 94:
43-53.

123 Klein R and Scholl A (1999). Computing lower bounds by destructive improvement: An
application to resource-constrained project scheduling. European Journal of Operational
Research 112: 322-346.

124 Tsang E (1993). Foundations of Constraint Satisfaction. Academic Press: London.

