THE ALGORITHM SELECTION PROBLEM
John R. Rice
Computer Science Department
Purdue University
West Lafayette, Indiana 47907
July 1975
CSD-TR 152
CONTENTS
Section 1: Introduction
*This work was partially supported by the National Science Foundation through
Grant GP-32940X. This paper was presented as the George E. Forsythe
Memorial Lecture at the Computer Science Conference, February 19, 1975,
Washington, DC.
1. INTRODUCTION
obscures the common features of this selection problem and the primary
It should be made clear that we do not believe that these models will
This will always require exploitation of the specific nature of the situa-
tion at hand. Even so, we do believe that these models will clarify the
Three concrete examples are given below which the reader can use to
   ∫_a^b f(x) dx
duce (a) high batch throughput, (b) good response to interactive jobs,
(c) good service to semi-interactive jobs and (d) high priority fidelity.
effective, i.e., never loses and wins whenever an opponent's mistake allows
it.
and these parameters are then chosen so as to satisfy (as well as they can)
are decision trees (with size, shape and individual decision elements as
Problem Space: The set of problems involved is very large and quite
diverse. This set is of high dimension in the sense that there are a number
is large and diverse. Ideally there may be millions of algorithms and prac-
distinguish between two which are identical except for the value of some
particular algorithm for a particular problem are complex and hard to com-
pare (e.g. one wants fast execution, high accuracy and simplicity). Again
measures.
2. ABSTRACT MODELS
2.1 The Basic Model and Associated Problems. We describe the basic abstract
model by the diagram in Figure 1. The items in this model are defined below
[Figure 1. The basic abstract model: problem space, selection mapping,
algorithm space, performance mapping and norm mapping, with
‖p‖ = norm on ℝⁿ providing one number to evaluate an algorithm's performance.]
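To fix ideas, here is a minimal sketch (in modern Fortran; the performance
numbers are invented, and the vector norm ‖p‖ is reduced to a single number
per algorithm-problem pair) of a selection mapping that simply picks, for
each problem, the algorithm with the best performance:

    ! Minimal sketch of the basic model: for each problem x the selection
    ! mapping S picks the algorithm A maximizing the performance ||p(A,x)||.
    ! The performance table is illustrative only.
    program basic_model
      implicit none
      integer, parameter :: nalg = 3, nprob = 2
      real    :: p(nalg, nprob)   ! ||p(A,x)|| for each algorithm and problem
      integer :: x, s
      p = reshape([0.9, 0.4, 0.7,  0.2, 0.8, 0.5], [nalg, nprob])
      do x = 1, nprob
         s = maxloc(p(:, x), dim=1)   ! the selection mapping S(x)
         print *, 'problem', x, ': select algorithm', s
      end do
    end program basic_model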
Algorithm Selection Problem: Given all the other items in the above
There must be, of course, some criteria for this selection and we present
choose just one algorithm from a subclass 𝒜₀ to apply to every member
These four criteria do not exhaust the meaningful criteria but they do
illustrate the principal ideas. There are five main steps to the analysis
classical theory within this framework. The space 𝒫 is a function space
and the algorithm space 𝒜 may be identified with a subspace of 𝒫. The
algorithm enters as the means of evaluating elements of 𝒜. The performance
mapping is
Two concrete examples of the model are discussed in detail in
Sections 3 and 4 of this paper. We present a third, simpler one from the
area of artificial intelligence.
for playing Tic-Tac-Toe. The problem space is the set of partial games of
Tic-Tac-Toe. While this number is large, there are in fact only 28 distinct
involves only the existence of immediate winning positions and vacant position
There are 16 parameters aᵢ which take on one of the following five values.
[Figure 2. The form of the selection mapping for the Tic-Tac-Toe
example: a decision tree with tests such as "Is a corner free?" at the
nodes. Each aᵢ is one of five moves.]
This example is so simple that one can make immediate assignments of certain
of the values of the aᵢ. Experiments have shown that a variety of crude
schemes for computing values of the aᵢ (selecting the best algorithm) work
one would compute this if one had no a priori information about the game.
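A minimal sketch of such a scheme (modern Fortran; the position features
tested and the parameter values are invented, and only four of the 16
parameters aᵢ are represented):

    ! Sketch of the Figure 2 selection mapping: binary tests on the position
    ! select one leaf, and the parameter a(leaf) names one of five move types.
    program ttt_select
      implicit none
      integer :: a(4), leaf
      logical :: can_win, must_block, corner_free
      a = [1, 2, 3, 4]                 ! each entry encodes one of five moves
      can_win = .false.; must_block = .true.; corner_free = .true.
      if (can_win) then                ! walk the decision tree
         leaf = 1
      else if (must_block) then
         leaf = 2
      else if (corner_free) then
         leaf = 3
      else
         leaf = 4
      end if
      print *, 'selected move type:', a(leaf)
    end program ttt_select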
not and we call this selection based on features of the problem. This
[Figure 3. The model with selection based on features of the problem:
x ∈ 𝒫 (problem space) → feature extraction F → f(x) ∈ ℱ (feature space)
→ selection mapping S(f(x)) → A ∈ 𝒜 (algorithm space) → performance
mapping p(A,x) → p ∈ ℝⁿ (performance measure space), with
‖p‖ = algorithm performance.]
Note that the selection mapping now depends only on the features f(x), and
yet the performance mapping still depends on the problem x. The introduc-
criteria for selection are still valid for this new model as well as
the five steps in the analysis and solution of the problem. The deter-
process, often one of the most important parts. One may view the features
those problems with the same features would have the same performance for
the dimension m of ℱ, what m features are the best for the prediction
of algorithm performance? That is, determine ℱ* so that (with 𝒫(f)
denoting the set of problems whose feature vector is f)

   d_m^*(A) = max_{f∈ℱ*} max_{x,y∈𝒫*(f)} ‖p(A,x) − p(A,y)‖
            ≤ max_{f∈ℱ} max_{x,y∈𝒫(f)} ‖p(A,x) − p(A,y)‖
then the effective dimension of 𝒫 (for the problem at hand) is probably much
larger than m; if not, the effective dimension of 𝒫 is close to m.
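A sketch of how this spread might be estimated from a finite sample
(modern Fortran; the feature values and performances are invented):

    ! Estimate d(A): problems with equal features should perform alike for
    ! a fixed algorithm A, so the largest performance gap within a group of
    ! problems sharing a feature value measures what the features miss.
    program feature_spread
      implicit none
      integer, parameter :: n = 6
      integer :: feat(n), i, j
      real    :: perf(n), d
      feat = [1, 1, 2, 2, 2, 3]                    ! feature value f(x)
      perf = [0.90, 0.85, 0.40, 0.55, 0.50, 0.75]  ! ||p(A,x)||
      d = 0.0
      do i = 1, n
         do j = i + 1, n
            if (feat(i) == feat(j)) d = max(d, abs(perf(i) - perf(j)))
         end do
      end do
      print *, 'estimated d(A) =', d
    end program feature_spread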
and the dimension m of ℱ, what m features are the best for prediction
The determination of the best (or even good) features is one of the
Many problem spaces 𝒫 are known only in vague terms and hence an experi-
over 𝒫. That is, one chooses a sample from 𝒫 and restricts consideration
and if one has a good set of features for 𝒫, then one can at least force
mation most relevant to the performance of algorithms for the problem at hand.
upon (if not explicitly stated) set of features. For example, consider the
this problem with considerable confidence. The selection problem for quad-
situation exists for problems that have been studied for one or two
uncertainties for problems that have just appeared in the past one or two
decades.
2.3 Alternate Definitions of Best for the Models. In the preceding sections
It is reasonable to ignore the performance for the worst case and, instead,
Minimax Approach
The corresponding mathematical statements for the least squares and least
The use of integrals in these formulations implies that a topology has been
introduced in the problem space 𝒫. Many common examples for 𝒫 are dis-
crete in nature and in these cases the topology introduced reduces the
Note that the only difference between the two new formulations is the
   ∫_𝒫 | ‖p(B(x),x)‖ − ‖p(S₀(x),x)‖ |^r dx  ≤  ∫_𝒫 | ‖p(B(x),x)‖ − ‖p(S(x),x)‖ |^r dx

   for all S ∈ 𝒮₀
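For reference, the minimax, least squares and least deviation criteria are
all instances of a single family (a sketch in the notation above, with B(x)
the best selection):

    \min_{S \in \mathcal{S}_0} \left[ \int_{\mathcal{P}}
      \bigl|\, \|p(B(x),x)\| - \|p(S(x),x)\| \,\bigr|^{r}\, dx \right]^{1/r},
    \qquad r = 1 \text{ (least deviation)},\quad r = 2 \text{ (least squares)},
    \quad r \to \infty \text{ (minimax)}.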
sistent approach for the reformulations. That is, if we use least squares
on the problem space we also use it on the feature space ℱ and the algorithm
space 𝒜. In this case the quantity d_m(A) of Problem E becomes
   d_m^r(A,ℱ) = [ ∫_{f∈ℱ} ∫_{x,y∈𝒫(f)} ‖p(A,x) − p(A,y)‖^r ]^{1/r}
then for Problem E: Best Feature for a Particular Algorithm, the objective
   d_m^{r*}(A) = d_m^r(A,ℱ*) = min_ℱ [ ∫_{f∈ℱ} ∫_{x,y∈𝒫(f)} ‖p(A,x) − p(A,y)‖^r ]^{1/r}
significant in the larger context, but they are very significant in determining
context. This lesson is, roughly, that the crucial ingredients for success
are proper choices of the subclasses 𝒫₀, 𝒜₀ and 𝒮₀. Once these are made
properly then the mathematical optimization should be made for that value of
r that gives the least difficulty. If the problem is completely linear then
culty. The situation is more variable for nonlinear problems. Note that there
no doubt there are similar cases for the algorithm selection problem. We
of instances.
2.4 The Model with Variable Performance Criteria. We have assumed so far
weight given to each of these might vary from almost zero to almost 100%.
A model for this version of the selection problem is shown in the diagram
of Figure 4.
[Figure 4. The model with variable performance criteria: x ∈ 𝒫 (problem
space) → feature extraction F → f(x) ∈ ℱ (feature space), and so on as in
Figure 3, except that a norm mapping g(p,w) now produces ‖p‖ from the
performance p and the criteria weights w.]
Some of the mappings now have changed domains, but their nature is the same.
The choice of ℝⁿ for the criteria space is clearly arbitrary (and perhaps
are:
Problem subclasses 𝒫₀
Norm mapping g
from formulating all of them. Some of the more important problems are:
performance:
for all S ∈ 𝒮₀. Note that g(p(B(x,w),x),w) is the best possible per-
formance and the other g terms are the performances of the algorithms
actually selected.
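A minimal sketch of one such norm mapping g(p,w) (here simply a weighted
sum of the performance measures; all values are invented):

    ! g(p,w) = sum_i w_i * p_i: the criteria weights w, chosen by the user,
    ! collapse the performance vector p to the single number ||p||.
    program criteria_norm
      implicit none
      integer, parameter :: n = 3
      real :: p(n), w(n)
      p = [0.9, 0.3, 0.6]     ! performance vector p(A,x)
      w = [0.5, 0.3, 0.2]     ! criteria weights, summing to one
      print *, 'g(p,w) =', dot_product(w, p)
    end program criteria_norm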
We note that the abstract model presented in this section could be elaborated
from simple models can visualize how this would be done. However, the
crucial point of a model is not its theoretical structure but its relevance
to underlying real world problems. In other words, does this model allow us
3.1 The Components in the Abstract Model. The next two sections are
involves spaces for the problems, the features, the criteria, the algorithms
and the performance measures. These spaces are described as follows:
is from a very large population of many millions (see Rice (1975)). The actual
algorithms one might consider are mentioned in the various references
and many of them are named later.
There have been ten substantial testing efforts reported which are listed
below in chronological order. We indicate the test functions used (by A,
B or C), the requested accuracies (by E values) and the algorithms involved.
The algorithms are named and described, but detailed references are not
given here; one must refer to the test reports.
6. Piessens (1973).
   Complete details not reported
   Test set A with ε = 10⁻², 10⁻³, ..., 10⁻¹³
   Algorithms: CCQUAD, SQUANK
Also see Lyness and Kaganove (1975) for further discussion on the nature
of this problem. This testing has provided much useful information and
served to identify some poor algorithms. However, it has not been well
enough organized to allow definitive conclusions and there is still consid-
erable doubt about the relative merits of the better algorithms. We note
that a much better experiment can be performed.
are oscillatory with a singularity). This process gives a test set which
produces a grid over the entire feature space. This test set can be combined
with accuracy values of ε = 10⁻², 10⁻⁴, 10⁻⁸, 10⁻¹² to permit a much more
precise measurement of algorithm performance.
There are about a dozen existing algorithms that merit inclusion in this
experiment and a little estimation shows that a rather substantial compu-
tation is required for this experiment. An important result of the syste-
matic nature of this approach is that one can consider probability distribu-
tion in the problem space which induce a probability distribution on the
feature space and algorithm performances can be compared (over this problem
subdomain) without repeating the experiment.
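A sketch of how the crossed design might be laid out (the feature levels
are stand-in grid indices; the ε values are those quoted above):

    ! Cross a grid of feature levels with the requested accuracies to get
    ! one test case per (feature level, accuracy) pair.  In the real
    ! experiment each feature level would be a family of integrands.
    program test_grid
      implicit none
      integer, parameter :: nfeat = 5, nacc = 4
      real :: eps(nacc)
      integer :: i, j, ncase
      eps = [1.0e-2, 1.0e-4, 1.0e-8, 1.0e-12]
      ncase = 0
      do i = 1, nfeat
         do j = 1, nacc
            ncase = ncase + 1   ! here one would run each candidate algorithm
         end do
      end do
      print *, 'total test cases:', ncase
    end program test_grid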
This suggested experiment is far from the most general of interest and is
clearly biased against certain well known algorithms. For example, SQUANK
takes considerable care in handling round-off effects (a feature omitted
here) and explicitly ignores oscillations (a feature included here) and thus
one would not expect SQUANK to compare favorably with some other algorithms
on the basis of this experiment.
   p₂(A,x) = 1/(1 + .1(log …))

This places a severe penalty on failing to achieve εₓ and a mild penalty
on achieving much more accuracy than εₓ. These conventions allow us to
find the performance vector (p₁(A,x), p₂(A,x)) and we introduce a criteria
unit vector (w₁,w₂); the norm of p(A,x) is then
4.1 The Components in the Abstract Model. The general case of this
problem may be expressed as follows:
The abstract model involves spaces for the problems, the features, the
criteria, the algorithms and the performance measures. These spaces are
described as follows:
Batch - small job response: median and maximum turnaround for jobs
with small resource requirements
Batch - large job response: median and maximum turnaround for all
batch jobs other than small ones (or special runs)
On line response - median and maximum response time for common service
functions (e.g. fetching a file, editing a line, submitting a
batch job)
Interactive response - median and maximum response times for standard
short requests
Throughput - total number of jobs processed per unit time, number of CPU
hours billed per day, etc.
where

   R₁(r₁) = 2·r₁
   R₂(r₂) = |r₂ − 150100|/128
   R₅(r₅) = { …          if r₅ = 0
            { 300 + …    if r₅ ≥ 1
   R₆(r₆) = r₆

and R₃ and R₄ are shown in Figure 5(a) and (b).
[Figure 5. (a) priority contribution for CPU time; (b) priority
contribution for I/O units.]
   a₉·max(a₁₀ − r₄, 0) + a₁₁·|a₁₂ − r₄| + a₁₃·|a₁₄ − r₄|
where w is from the three-dimensional criteria space with wᵢ > 0 and
w₁ + w₂ + w₃ = 1.
Thus we see that the 19 coefficients are in fact functions of six other
independent variables. One could, for example, attempt to determine coeffi-
cients αᵢⱼ so that
We now consider how to find the best scheduler of this form. To set
the context, let us outline how the computation might go in an ideal world.
The basic building block would be the computation of the best aᵢ for given
wⱼ and fⱼ. This block is designated by the function OPT, i.e. OPT(w,f) is
the set of 19 best coefficients. Note that this does not involve any assump-
tion about the form of the relationship between the aᵢ and the variables
wⱼ and fⱼ, i.e. the αᵢⱼ are not involved. We would then select an appro-
priate set of values for the variables wⱼ and fⱼ, say wⱼₗ, l = 1 to m_w,
and fⱼₖ, k = 1 to m_f, and execute the algorithm
It seems plausible that one can obtain values of ‖p‖ which are fairly
tightly associated with values of a, w and f. This means that it is, in
principle, feasible to carry out the optimization problem. A simplified
example of the situation is shown in Figure 6 where we assume there is 1
variable for a, and 1 variable for w and f.
[Figure 6. Function values of ‖p‖ plotted against a, for scattered
values of w and f, obtained when there is no direct control over some of
the arguments (f in this case).]
In order to compensate for the irregular nature of the values obtained, one
should use an integral form of the minimization problem and then introduce
quadrature rules to accommodate the irregularity. Standard quadrature rules
for this situation are not available. Reasonable accuracy can be achieved
by using ordinary Riemann sums with areas determined from a space-filling
curve map. That is, one maps the high dimensional domain onto [0,1], then
one assigns weights to the points according to the length their images span
in [0,1].
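A sketch of the weight-assignment step (assuming the points have already
been mapped to sorted images t(i) in [0,1]; the values are invented):

    ! Riemann-sum weights for irregular points: each point receives the
    ! length of the part of [0,1] nearer to its image t(i) than to any
    ! other image, i.e. the gap between the midpoints to its neighbors.
    program sfc_weights
      implicit none
      integer, parameter :: n = 5
      real :: t(n), wgt(n)
      integer :: i
      t = [0.05, 0.20, 0.45, 0.70, 0.95]  ! sorted space-filling-curve images
      do i = 1, n
         if (i == 1) then
            wgt(i) = (t(1) + t(2))/2.0
         else if (i == n) then
            wgt(i) = 1.0 - (t(n-1) + t(n))/2.0
         else
            wgt(i) = (t(i+1) - t(i-1))/2.0
         end if
      end do
      print *, 'weights:', wgt, '  sum =', sum(wgt)   ! weights sum to 1
    end program sfc_weights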
Note that certain values of f might be very uncommon and hence the
optimization obtained there might be unreliable. Fortunately, the rarity of
As a final note, we consider the time that might be required for a complete
determination of the "best" scheduling algorithm. Given a fairly constant
job configuration, we assume that we can obtain values for ‖p‖ and all
other quantities within a 10 minute time interval. This corresponds to 1
function evaluation. Thus we are led to assume that one evaluation of OPT
takes from 1/2 to 1 day of system time. The inefficiency due to the lack
of control over setting parameters will probably double this time, say to
1 1/2 days. The number of evaluations of OPT needed to obtain semi-reasonable
reliability in the αᵢⱼ computations is probably the order of 50 or 100.
This implies (at 1 1/2 days per evaluation) about 3 to 6 months to select
the best scheduling algorithm.
Note how this approach differs from the common theoretical approach.
There one assumes some model for the computer operation and then analytically
obtains a good (or optimum) scheduling algorithm for that model. Here there
is no explicit model of the computer operation; one tries to obtain a good
scheduling algorithm by observing the system's behavior directly rather than
through the intermediary of a mathematical model. It is, of course, yet to
be seen just how feasible or effective this direct approach will be.
examine these questions, to indicate what light can be shed on them from
the existing theory of approximation and to point out the new problems in
4. Computation
6.2 Norms and approximation forms. The question of norms enters in the
formance space ℝⁿ to the single number which represents the algorithm per-
the possibilities are well-known. The most common are of the form
   ‖p‖ = [ Σ_{i=1}^n wᵢ pᵢ^r ]^{1/r}
minimax norm). However, the nature of the selection problem is such that
we can anticipate using non-standard norms. The reason is that the perfor-
where
   a(p₂) = {  0          for p₂ ≤ 10,000
            {  10⁻⁵       for 10,000 < p₂ ≤ 20,000
            {  2·10⁻⁵     for 20,000 < p₂ ≤ 30,000
            {  p₂·10⁻⁹    for p₂ > 30,000

and the corresponding term for p₃ is 0 for p₃ < .5
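A sketch of such a norm (one plausible reading of the definitions above;
the weight on p₁ and the p₃ penalty beyond its first branch are placeholders):

    ! A non-standard norm ||p|| = w1*p1 + a(p2) + (p3 term), with the
    ! piecewise penalty a(p2) taken from the thresholds in the text.
    program penalty_norm
      implicit none
      real :: p1, p2, p3
      p1 = 0.8; p2 = 25000.0; p3 = 0.3
      print *, '||p|| =', 0.5*p1 + apen(p2) + bpen(p3)
    contains
      real function apen(q)          ! the penalty a(p2)
        real, intent(in) :: q
        if (q <= 10000.0) then
           apen = 0.0
        else if (q <= 20000.0) then
           apen = 1.0e-5
        else if (q <= 30000.0) then
           apen = 2.0e-5
        else
           apen = q*1.0e-9
        end if
      end function apen
      real function bpen(q)          ! p3 term: only the first branch is known
        real, intent(in) :: q
        bpen = 0.0
        if (q >= 0.5) bpen = q       ! placeholder beyond the known branch
      end function bpen
    end program penalty_norm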
There are two observations, one positive and one negative, about such construc-
tions. The negative one is that they do complicate the theory sometimes and,
more often, make the computations substantially more difficult. The positive
the choice of approximation form. That is, if one has a good choice of approx-
imation form, one obtains a good approximation for any reasonable norm. This
implies that one can, within reason, modify the norm used so as to simplify
tion is that one cannot compensate for a poor choice of approximation form
a. discrete
b. linear
c. piecewise
d. general non-linear:
standard mathematical
separable
abstract
theory. The form is to be used for the selection mapping S(f(x)): ℱ → 𝒜
and we visualize a parameter (or coefficient) space 𝒞 plus a particular
Rational: S(f(x),c) =
Exponential: S(f(x), c)
Non-linear, separable:
[Tree form example: a decision tree whose nodes test conditions such as
c₃f₁f₂ + c₄ > c₅(f₁+f₂)², with NO and YES branches leading to further
tests or to the selected value S.]

Algorithm:
      FUNCTION S(F,C)
C     An algorithm form: the selection mapping is itself a small program
C     whose coefficient array C contains loop limits, multipliers and
C     branch thresholds.
      REAL F(*), C(*)
      N1 = C(1)
      SUM = 0.
      DO 20 K = 1, N1
   20 SUM = SUM + C(K+1)*F(K)
      IF ( F(1) .GT. C(1) ) SUM = SUM/( C(N1+1) + 1. )
      PROD = 1.
      IF ( F(N1+2) .LT. ( C(N1+1) + F(2) )/F(3) ) PROD = F(1)*F(2)
      N3 = C(N1+3)
      DO 40 K = 1, N3
   40 PROD = ( F(K) + C(K) )*PROD + C(N1+K+3)
      S = C(1)*SUM + C(2)*PROD + C( N1+N3+1 )*F(1)
      RETURN
      END
The main thrust of approximation theory is for the case where the co-
this manifold.
One thus may conclude that there are three distinct situations as
most favorable situation is for the linear, piecewise linear and nonlinear
it currently exists. This does not mean that all of these cases are already
solved and all one has to do is to "copy" the solutions from somewhere.
Rather, it means that these are the kinds of problems the machinery is supposed
The second situation is for the tree and algorithm forms. Here it seems
that a major change in emphasis is required. The exact nature of the new
machinery is certainly unclear and no doubt there are hidden difficulties which
are not apparent from a casual inspection. However, it seems plausible that
the general spirit of the approach and techniques may well be similar to that
already existing. For example, the piecewise linear forms may be visualized
as one of the simplest of the tree forms. The development and analysis for
the piecewise forms (even for variable pieces) has progressed fairly smoothly
over the past 10 years and the resulting body of results is very much of the
There were (and still are), of course, some difficult questions for the piece-
wise linear, but the prospects do not appear to be too bad for developing a
useful body of approximation theory machinery for the tree and algorithm forms.
The third and least favorable situation is for the discrete forms. The
in this case. One ascertains the best selection mapping by a finite enumer-
ation. Unfortunately, the enumeration may well be over very large sets. Even
sidered (at least in some abstract sense). It is not at all clear how
algorithm selection procedures are to evolve in this situation and the develop-
ment of such procedures is one of the foremost open questions in this entire
area of study.
machinery comes into play after this choice is made. Thus it is essential to
have insight into both the problem and algorithm spaces and into the possible
This section has two distinct parts. First, we introduce the concept
ting the overall value of various approximation forms for the algorithm
selection problem.
and 1 disk
c. Scene analysis.
fewer lines
to analyze.
algorithms are developed for a particular class of problems even though the
goes from easy to hard. Thus one visualizes a nested set of problems where
the innermost set consists of very easy problems and the largest set consists
classification (at least in a reasonable way) for complex problem spaces. One
is lacking the insight to know in all circumstances just what makes a problem
hard or easy.
6.3.2 Degree of Convergence. The idea of degree of convergence comes from con-
these forms do as one goes further out in the sequence? A standard example
We assume that for each approximation from the sequence we have the best
coefficients possible.
algorithm for every problem. If we let A*(x) be the best algorithm for
problem x and let A_N(x) be the algorithm chosen by the best coefficients for
the N-th approximation form, then the question is: How does
   E_N = max_{x∈𝒫} ‖p(A*(x),x) − p(A_N(x),x)‖

behave as N increases? The rate at which E_N goes to zero is
called the degree of convergence for the problem space ~ and the sequence of
approximation forms.
degree of convergence is known for many cases. In the standard case the
problem is to evaluate a function f(x) and the best algorithm A*(x) is taken
   E_N ~ K·N^{−N}  for some constant K. In this case E_N goes to zero extremely
fast. If one replaces sin(x) by ABS(X−1), then E_N ~ K·N^{−1} which is not
matical context where one knows a variety of properties. We can say, however,
that really fast convergence using simple forms (i.e. polynomials and similar
everywhere). A large proportion (at least 50%) of the "functions" that arise
operation is allowed.
in Figure 7.
[Figure 7. A simple machine with input, memory (X, NTERMS, COEF(K)),
a special multiply unit, an add unit and a test unit, executing the
program:

      READ X
      F = COEF(0)
      DO 10 K = 1, NTERMS
      F = F + X*COEF(K)
   10 CONTINUE
      PRINT F
]
machine. For example, a machine which can only add and multiply but which
into the same framework as the piecewise forms and the tree or algorithm
These problems are orders of magnitude simpler than the typical situation
that arises in the algorithm selection problem. Thus there is little hope
for the near future that we will obtain optimal algorithms for most of these
complexity in the algorithm selection problem, there are three good reasons
to consider the idea. First, it provides the proper framework within which
to contemplate the problem. Second, the results for simple problems show
that the standard ways of doing things are often not optimal or even anywhere
is likely that further theoretical developments in the area will indicate that
problems.
situation becomes more and more extreme. We do not attempt to define this
moves away from these easy problems. A robust algorithm then is one whose
performance degrades slowly as one moves away from the problems for which
it was designed. Since the problem space is so large and so poorly under-
There is a reasonable probability that one will face a problem with a com-
student in a classroom. One has three candidate algorithms for the estimate:
the average wealth, the median wealth and the mid-range wealth. In a
mid-range now produces ridiculous estimates like $200 or $300 million and
the average is not much better with estimates like $20 or $30 million. The
a wealthy person and thus is a very robust algorithm for this problem.
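A sketch of the three estimates (the dollar figures are invented):

    ! Mean, median and mid-range wealth for ten "students", before and
    ! after one very wealthy person joins; only the median stays sensible.
    program robust_wealth
      implicit none
      real :: wealth(10)
      wealth = [5.0e3, 8.0e3, 1.0e4, 1.2e4, 1.5e4, &
                1.8e4, 2.0e4, 2.5e4, 3.0e4, 4.0e4]
      call report(wealth, 'ordinary class:     ')
      wealth(10) = 5.0e8                  ! a wealthy person joins
      call report(wealth, 'with wealthy person:')
    contains
      subroutine report(w, label)
        real, intent(in) :: w(:)
        character(*), intent(in) :: label
        integer :: n
        n = size(w)   ! w is kept sorted, so the estimates read off directly
        print *, label, ' mean =', sum(w)/n, &
                 '  median =', (w(n/2) + w(n/2+1))/2.0, &
                 '  mid-range =', (w(1) + w(n))/2.0
      end subroutine report
    end program robust_wealth

With the wealthy person present the mid-range jumps to about $250 million
and the mean to about $50 million, while the median barely moves.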
While the average is more robust than the mid-range, it is not very satis-
measure. In some situations one can achieve robustness with very simple
specific situations exist which exhibit behaviors exactly opposite the usual
one. We have already noted that the most crucial decision in the algorithm
and disadvantages of the forms as they interact with the special features of
a choice of form for the algorithm selection mapping is made which achieves
situations.
6.4.1 Discrete Forms. One might tend to dismiss this case as "degenerate". After
all, if one is merely to select the best one of three or eleven algorithms,
there seems to be little need for any elaborate machinery about approximation
forms. We do.not imply that how to:identify the best will be easy. rather
there are some very interesting and challenging features of these forms.
in fact or in concept a very large set. Even though we may have selected
samples from a very much larger set. Recall from the discussion of the
numerical quadrature problem that there may well be tens of millions of algo-
rithms of even a rather restricted nature. Thus in the mind's eye there is
is in its ability to handle problems involving very large finite sets. The
emphasis has been on developing tools to handle problems with infinite sets
(e.g. the continuum) and one frequently draws a complete blank when faced
We are really saying that the proper way to consider discrete forms is
about continuous forms (such as presented later in this section) and hope-
these lines because we have no knowledge of the possible continuum behind the
discrete set.
We conclude by recalling that robustness is a property of individual
evaluated for each algorithm in the discrete set. However, if the set is
large, then this is impractical. In this latter case, one probably must
continuum.
6.4.2 Linear Forms. There are so many obviously nice things about linear forms
that we might tend to concentrate too much on what is bad about them; or
we might tend to ignore anything bad about them. Some of these nice things
are:
These observations imply that we should give these forms first consideration
and that we should try other things only after we are fairly sure that some
The bad thing about these forms comes from the following experimentally
observed fact: Many real world processes are not linear or anywhere close to
cannot prove them here. Indeed, certain theoretical results (e.g. the
Weierstrass Theorem) are frequently used to support just the opposite con-
space 𝒫 has just one attribute of consequence and we call it x (which
identifies the problem with a real number that measures this attribute).
Our algorithm space 𝒜 is likewise simple with one attribute which we call
A. Suppose that x and A range between 0 and 1 and suppose the best algorithm
is A = .27 if x < .41, A = .82 if .41 < x < .8 and A = .73 for
x > .8. The best or optimal algorithm selection mapping is then as shown
in Figure 8 (left).
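Written out, this optimal mapping is the step function below (a direct
transcription of the values just given):

    ! The optimal selection mapping of the illustration:
    ! A = .27 for x < .41,  A = .82 for .41 <= x <= .8,  A = .73 for x > .8
    real function sopt(x)
      implicit none
      real, intent(in) :: x
      if (x < 0.41) then
         sopt = 0.27
      else if (x <= 0.8) then
         sopt = 0.82
      else
         sopt = 0.73
      end if
    end function sopt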
[Figure 8. The optimal algorithm selection mapping (left) and a linear
approximation to it (right).]
polynomials, e.g.
expect results such as shown in Figure 9 (provided one has been careful
in the computations). It is hard to argue that either one of these selec-
tion mappings is a good approximation to the optimal one. Note that in both
cases the polynomials are truncated at either A=0 or at A=1 in order
[Figure 9. Two polynomial approximations to the optimal selection
mapping of Figure 8, oscillating about the three levels and truncated at
A = 0 and A = 1.]
nomials? One frequently sees Fourier Series (sines and cosines), exponen-
tials, Bessel functions, etc., etc. None of these give noticeably better
selection mappings — unless, of course, one includes the optimal mapping
itself, and then we find c₁* = 0 and c₂* = 1 gives a perfect approxima-
tion.
This last observation shows the impossibility of making universal
right things, then the linear forms can do very well indeed. In practice,
though, one is usually limited to just a few possibilities and one has
very little information about the optimal mapping. Note that a typical
not likely to hit upon the optimal mapping as one of the things to include
tions there are numerous results about how the error of polynomial and
there will always be a large error at that jump. We also see that the large
then it is known that the degree of convergence for N-terms is like liN.
That means that if 1 term gives a unit error, then 10 terms give a .1 error,
100 terms give .01 error, etc. This is a very bad situation even for the
again the number of terms. For K=5, if 1 term gives a unit error then we
would expect to need about 32 terms for 1/2 unit error, 1000 terms for 1/4
unit error and 100,000 for .1 error. For K=lO, the corresponding numbers
are 1,000, 1,000,000 and 10¹⁰, respectively, for errors of 1/2, 1/4 and .1.
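The arithmetic behind these counts, assuming the stated rate:

    E_N \approx N^{-1/K} \quad\Longrightarrow\quad N \approx E_N^{-K},
    \qquad\text{e.g. } K = 5:\ N = 2^5 = 32 \text{ for } E_N = 1/2,
    \quad N = 10^5 \text{ for } E_N = 0.1 .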
How often can one expect the problem space to produce selection
phenomena from physics and engineering problems indicates more than 50% of
these functions are unsuitable for polynomials and other standard linear
mathematical forms. This includes Fourier Series which are currently widely
results. There is an intuitive reason why one should expect this. Many
dominate the behavior. As one goes from one domain to another there is a
being required. Recall that polynomials, Fourier Series, etc. have the
many real world situations and is another intuitive reason for doubting the
One must admit that the above arguments are taken from simplified and
tion problems is very tenuous indeed. Yet, we conjecture that things get
worse rather than better as one gets away from these situations into a
6.4.3 Piecewise Linear Forms. In simple terms, we break up the problem domain into
pieces and use separate linear forms on each piece. The motivation is to
cases the most crucial step is to determine the appropriate pieces and yet
these forms assume that they are fixed and given by some a priori process.
In these cases we in fact have a two stage process: the first is an intuitive-
coefficients for each of the linear pieces. Note that there are often some
interconnections between the pieces (for example, broken lines are piecewise
linear functions of one variable which join up continuously) which give rise
to mathematical problems which are non-standard but still linear (and hence
usually tractable).
because of the vagueness of the process for determining the pieces. Indeed,
if the pieces are poorly chosen or too big, then one can have all the
difficulties mentioned with the traditional linear forms. On the other hand,
improvement to happen.
(ii) Sometimes the problem domain is small enough that one can
something like 1/N to 1/N², where N is the number of coefficients
computation.
6.4.4 General Nonlinear Forms. It is not very profitable to discuss such forms
in the abstract. These forms include everything, including the best possible
selection mapping, and thus one can do perfectly with them. Thus we must
a variety of such classes. A partial list of these with simple examples is:
   Polynomials:                  c₁ + c₂x + c₃x²

   Rational Functions:           (c₁ + c₂x + c₃x²)/(c₄ + c₅x)

   Exponential/Trigonometric Functions:   c₁e^{c₂x} + c₃e^{c₄x} + ⋯

   Piecewise Polynomials:        c₁ + c₂x + c₃x²            for −∞ < x ≤ c₄
                                 c₅ + c₆x + c₇x² + c₈x³     for c₄ < x ≤ c₉
                                 c₁₀ + c₁₁x + c₁₂x²         for c₉ < x < ∞
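For instance, the piecewise polynomial above can be evaluated directly (a
sketch; the break points c₄ and c₉ are stored with the other coefficients):

    ! The three-piece polynomial example: pieces joined at the (variable)
    ! break points c(4) and c(9).
    real function pwpoly(x, c)
      implicit none
      real, intent(in) :: x, c(12)
      if (x <= c(4)) then
         pwpoly = c(1) + c(2)*x + c(3)*x**2
      else if (x <= c(9)) then
         pwpoly = c(5) + c(6)*x + c(7)*x**2 + c(8)*x**3
      else
         pwpoly = c(10) + c(11)*x + c(12)*x**2
      end if
    end function pwpoly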
There are several general statements that one can make about these
forms:
(i) A considerable (or even very extensive) amount of analysis has been
(iii) The computational effort required to obtain best (or even very
a variety of cases.
selection mapping. One then chooses that nonlinear form which possesses this
behavior and for which one can handle the analytical and computational
difficulties.
somewhat of an art and there is no algorithm for making the choice. On the
other hand, the degree of convergence and complexity results for rational
functions and piecewise polynomials show that they have great flexibility
and are likely to do well in most situations. Doing well might not be
good enough. In real problems the dimensionalities are high and needing
for an n-dimensional feature (or problem) space. With n=2 this is a modest
coefficients of another approach, but in either case one cannot use the
forms.
6.4.5 Tree and Algorithm Forms. These forms are most intriguing because they
promise so much and have the mystery of the unknown. Perhaps it is a case
of the grass being greener on the other side of the fence. These forms may
have difficulties and disadvantages which are not apparent now but which
The primary basis for their promise is their flexibility and potential
traditional forms have taken many years to develop and even now can be
will severely restrict the usefulness of these forms for many years.
The piecewise linear forms are an example of a simple tree form and
their success bodes well for other cases. Computational techniques and
theoretical analysis for these forms is progressing steadily and we can look
for them to enter into the "standard and routine" category before long. This
development should serve as a useful guide for other simple tree and algo-
rithmic forms. Still, we are very far removed from the time when we can
select as our approximation form a 72 line Fortran program and then compute
the best "coefficient values" (Fortran statements) for a particular appli-
cation.
In summary, we have very little hard information about these forms,
but they appear to hold great promise and to provide a great challenge for
6.5 An error to avoid. Occasionally one observes the following situation develop:
(ii) A crude model is made of it. This model perhaps has some
In the specific instance at hand, the real world problem is the algorithm
selection mapping, the model is the approximation form selected and the
effort is in determining the coefficients of this form. The error that one
can make is in believing that finding the best coefficients of the selection
to believe that the best coefficients will give good selections. One is
particularly susceptible to making this error when using simple linear forms
for the selection mapping. One may refer to Figure 8 for an illustration
of this situation.
6.6 The Mathematical Theory Questions. This section presents an intuitive
6.6.1 The Existence Question. In concrete situations one rarely worries about
worries about the existence of good ones). Yet, from time to time this
are just sets of real numbers and the question is readily reduced to a
problem about sets of real numbers. One then attempts to show that:
This line of reasoning may fail at various points for nonlinear approximations
S(f,c).
easy to see such silliness in more complex examples. The second example
to rewrite this form so that the difficulty disappears. One can, however,
   S(f,c) = 1/(1 + c₂f²),   f ∈ {−1, 0, 1}

Thus the feature f can take on only one of three possible values and we
Suppose now that the best selection (of all possible forms and problems) is
   S(0,c) = 1/(1 + 0·c₂) = 1,    S(−1,c) = S(+1,c) = 1/(1 + c₂)
We can make S(±1,c) as close to zero as we want by making c₂ large; however,
if we set c₂ = ∞, then S(0,c) is ruined. The difficulty in this example is
the approximation form chosen must be extended in some way. A simple mathe-
matical example of this occurs for the two-exponential form c₁e^{c₂f} + c₃e^{c₄f}.
Now let c₃ = −c₁, c₂ = c₄ + ε, c₁ = a/ε and then let ε go to zero. The result is

   a·f·e^{c₄f}
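The degeneration can be checked directly:

    \lim_{\varepsilon \to 0} \frac{a}{\varepsilon}
      \left( e^{(c_4+\varepsilon)f} - e^{c_4 f} \right)
      = a f e^{c_4 f},
    \qquad\text{since } \frac{e^{\varepsilon f} - 1}{\varepsilon}
      \longrightarrow f .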
and we see that this form with two exponentials also contains a function of
completely different mathematical form. However, the plot of f·e^f and
is a singularity at the north and south poles for the geographic coordinates
[Figure 10. The curve f·e^f and nearby curves of the form c₁e^{c₂f} + c₃e^{c₄f}
with various values of c₁, c₂ and c₃.]
Consider piecewise linear forms (broken lines) with variable break points.
Figure 11 shows two things that can happen when the break points come together.
On the left we see that two of them can converge so that the result is a
step function with a jump discontinuity. On the right we see that four
Figure 11. Two ways that non-linear break points in a broken line form
can introduce new forms: a jump discontinuity (left) and
a "delta" function (right).
the definition are discovered, then one can expect computational difficulties.
For example, if one is using the two-exponential form c₁e^{c₂f} + c₃e^{c₄f}
and the best approximation is f·e^f (or nearly so), then the computations
expect the same phenomena to occur for the tree and algorithm forms. A very
6.6.2 The Uniqueness Question. One is usually not interested in this question
per se; any best (or good) approximation will do. However, its study, like
may arise.
[Figure 12. A curve representing the forms S obtained by varying the
coefficients c, with sample points x and y and their closest points on the
curve.]
Figure 12. First is that almost all points have a unique best approximation
even if a few do not. Second, we see that when there is more than one
The point x, for example, has best approximations x₁ and x₂. Finally, the
point y illustrates the most difficult situation where even though the
in Figure 12. First, and somewhat less important, one can expect trouble
at those points where two or more closest points are close together. This
occurs near the three ends of the "lines of non-uniqueness" in Figure 12.
More important is the fact that computational schemes are almost always local
in nature and thus might well locate y₂ as the closest point to y. Further,
such schemes usually give no inkling that there might be a point much closer
approximation (y₂ is far from y) and our limited experience in these matters
does support the hope that "good" locally best approximations are likely to
best coefficients C*, then we have minimized something, namely our measure
for example, the derivation of the normal equations for least squares
linear programming problems is obtained this way modulo the changes necess-
Now, the maximum only occurs at the extrema of |f−S| and if we denote them
by tᵢ*, then

   (∂/∂cⱼ) |f(t) − S(c,t)|  at t = tᵢ*  is 0,    j = 1,2,…,n;  i = 1,2,3,…

or

   sign[f − S] · (∂/∂cⱼ) S(c,t)  at t = tᵢ*  is 0,    j = 1,2,…,n;  i = 1,2,3,…
If S(c,t) is linear, i.e., S(c,t) = Σ_{j=1}^n cⱼφⱼ(t), then we have

   (1)   sign[f − S] · φⱼ(t)  at t = tᵢ*  is 0,    j = 1,2,…,n;  i = 1,2,3,…
extrema tᵢ* must occur with a combination of signs so that it is impossible
to satisfy (1); the signs must alternate, so that

   sign[f − S]  at t = tᵢ*  is (−1)^i or (−1)^{i+1},    i = 1,2,…,n+1
approximations.
The main point made is that almost all characterization conditions come
from setting derivatives equal to zero even though in some cases it may look
The implication for computation is that they also are based on finding
in nature (unless one is lucky) and share many of the computational properties
difficult to initialize for convergence. Some methods may have all three of
6.7 Conclusions, open questions and problems. One objective of this section
We also conclude that approximation theory currently lacks much of the necessary
new results for and apply known techniques to these new circumstances. The
problem and general optimization theory. This is not surprising since the
have not attempted to detail this relationship here, but one may refer to Rice (1970)
moderate to high dimensionality and thus one should expect them to be quite
the best selection. Indeed, the results of Rabin (1974) suggest that this com-
form for the selection mapping. It is here that theories give the least
1. What is the relationship between tree forms and piecewise linear forms?
Can all tree forms be made equivalent to some piecewise form, linear or
non-linear?
2. What are the algorithm forms for the standard mathematical forms? Do
they suggest useful simple classes of algorithm forms? See Hart et al.
(1968, Chapter 4) for algorithm forms for some polynomial and rational forms.
8. Obtain more precise information about the nature of real world functions?
For simplicity, one may use one evaluation of f(x) as the unit of
mathematical forms.
12. Develop techniques to partition high dimensional problem sets into subsets
in one dimension.
13. Develop existence theorems for various classes of tree form approximations.
14. What are the relationships between best algorithm selection and the
REFERENCES
Hart, John F., et al., Computer Approximations, John Wiley, New York, 1968.