

JSS 16 (4) (1961) 261-274

AN INTRODUCTION TO
DYNAMIC PROGRAMMING

by

BRIAN GLUSS

(Armour Research Foundation of Illinois Institute of Technology)

DYNAMIC programming, a mathematical field that has grown up
in the past few years, is recognized in the U.S.A. as an important
new research tool. However, in other countries, little interest has
as yet been taken in the subject, nor has much research been
performed. The objective of this paper is to give an expository
introduction to the field, and give an indication of the variety of
actual and possible areas of application, including actuarial theory.
In the last decade a large amount of research has been performed
by a small body of mathematicians, most of them members of the
staff of the RAND Corporation, in the field of multi-stage decision
processes, and during this time the theory and practice of the art
have experienced great advances. The leading force in these
advances has been Richard Bellman, whose contributions to the
subject, which he has entitled Dynamic Programming [1], have had
effects not only in immediate fields of application but also in
general mathematical theory; for example, the calculus of variations
(see chapter IX of [1]), and linear programming (chapter VI).
The work done so far has paved the way for innumerable further
research and applications, and hence the field, besides being a very
stimulating one, is one which new blood will find potentially
highly fruitful.
As will be seen when the concept of dynamic programming is
defined and illustrated more fully below, a further development in
the last decade that has made research in the subject more feasible
and useful has been the evolution of the digital computer, which
has made it possible to simulate complicated processes and hence
deduce their properties by Monte Carlo methods; sometimes,
also, simulation has provided hints as to the solution of thorny
mathematical problems that have arisen out of the study of the
processes.

THE TERM 'DYNAMIC PROGRAMMING'


Decision processes exist in which it is required to make just one
decision which will completely determine the outcome of the
process. For example, we may bet on the throw of a die and, of
course, if we have observed previous tosses from which we have
deduced the die is loaded, our decision may not necessarily be
arbitrary. However, a more interesting type of decision process
is a multi-stage one; that is, one in which a sequence of decisions
must be made, each of which will affect the final outcome of the
process. Also, in such a process the order of the decisions may be
crucial. An example, of a stochastic nature, is an allocation process
in which funds on hand at one point in time must be distributed
among various investments, and then the returns re-distributed
if necessary at each point of time so as to maximize the expected
final return; information, such as seasonality and trends, will
naturally affect our decisions. Another example, of a deterministic
nature, is the writing of a computer program, in which the order of
instructions—or decisions—is of course of paramount importance.
The fact that in most of these multi-stage decision processes
time plays a significant role accounts for the term 'dynamic'. Of
course, this does not preclude the possibility of embedding other
processes in a dynamic programming formulation, and in fact
many procedures evolved in the subject provide new approaches
to processes in which time plays no part.
The word 'programming' is used not in the restricted computer
sense but in that now encompassed by the phrase mathematical
programming. That is, we wish to describe, in formal mathematical
terms, a system or process and then consider the mathematical
model thus obtained. The word programming implies not only the
translation of the system to a mathematical model but also the kind
of model, the kind of mathematical functions we choose to manipu-
late, and how we manipulate them.

BASIC PROPERTIES
The basic characteristic of problems in dynamic programming is
that they are formulated as models involving N-stage decision
processes, where the desired solution is the determination of
'policies' or 'strategies' which satisfy criteria, which, for example,
could be minimization of cost, or, in stochastic models, of the
statistical expectation of the cost or maximization of profit. To
avoid any confusion with insurance terminology, the term
'strategy' will be used throughout this paper. One of the basic
techniques used is to reduce the N-dimensional optimization
problem to N sequential one-dimensional problems using Bellman's
so-called 'Principle of Optimality':
'An optimal strategy has the property that whatever the initial
state and initial decision are, the remaining decisions must constitute
an optimal strategy with regard to the state resulting from the first
decision'
These one-dimensional problems lead to sequential equations
which are often amenable to mathematical iterative solution, and
even when this is not so, the iterative equations often prove much
simpler to handle on a digital computer for specific solutions than
the original N-dimensional optimization problem, reducing con-
siderably the amount of storage required.
An example of this type of reduction, applied to a problem
independent of time, is as follows.
Suppose it is desired to find the maximum of

$$F(x_1, x_2, \dots, x_N) = g_1(x_1) + g_2(x_2) + \dots + g_N(x_N) \qquad (1)$$

over the segment of the plane

$$x_1 + x_2 + \dots + x_N = c, \qquad x_i \ge 0, \qquad (2)$$

where the $g_i$ are continuous. Since the maximum depends only upon $c$ and $N$, define it to be $f_N(c)$. Then, using the principle of optimality,

$$f_N(c) = \max_{0 \le x_N \le c} \left[ g_N(x_N) + f_{N-1}(c - x_N) \right]. \qquad (3)$$
That is, the maximum equals the maximum of the sum of (i) the last term of equation (1) and (ii) the maximum of the first $N-1$ terms of $F$ subject to the constraint $x_1 + \dots + x_{N-1} = c - x_N$, $x_i \ge 0$, in place of equation (2).

Also, $f_1(c) = g_1(c)$, and hence one may solve successively for $f_2, f_3, \dots, f_N$. For example,

$$f_2(c) = \max_{0 \le x \le c} \left( g_2(x) + g_1(c - x) \right),$$

where, since $g_1$ and $g_2$ are mathematically defined, the value of $x$ that maximizes the function in parentheses is easily determined by calculus. Whether this value of $x$ is within the interval $(0, c)$ or at one of its boundaries $0, c$, this value $x$, which is a function of $c$ alone, is inserted in the function in parentheses, and $f_2(c)$ is given by

$$f_2(c) = g_2(\bar{x}(c)) + g_1(c - \bar{x}(c)),$$

where $\bar{x}(c)$ denotes this maximizing value.

$f_3(c)$ is then obtained in exactly the same way, using equation (3)
with N = 3; and so on.
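
To make the reduction concrete, the following Python sketch (not part of the original paper) carries out the recursion of equation (3) on a discrete grid of values of $c$; the reward functions $g_i$ are arbitrary stand-ins chosen only for illustration:

import math

def solve_allocation(g, c_max, steps):
    # Discretized recursion f_N(c) = max over x in [0, c] of g_N(x) + f_{N-1}(c - x),
    # with f_1(c) = g_1(c).  g is the list of one-argument functions g_1, ..., g_N.
    grid = [c_max * j / steps for j in range(steps + 1)]
    f = [[g[0](c) for c in grid]]                     # table for f_1 on the grid
    for n in range(1, len(g)):
        prev, curr = f[-1], []
        for j in range(steps + 1):
            # maximize g_{n+1}(x) + f_n(c - x) over grid points x <= c
            curr.append(max(g[n](grid[i]) + prev[j - i] for i in range(j + 1)))
        f.append(curr)
    return grid, f

# Invented example: three concave return functions, total resource c = 10.
g = [lambda x: math.sqrt(x), lambda x: 2 * math.log(1 + x), lambda x: 0.8 * x]
grid, f = solve_allocation(g, 10.0, 100)
print("approximate maximum f_3(10):", round(f[-1][-1], 4))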
Note that this kind of technique of maximization eliminates the
trouble that plagues such well-known techniques as partial differen-
tiation and Lagrangian multipliers; that is, that if the solution
happens to be on a boundary, as so often happens in maximization
problems (cf. for example, linear programming), then the equations
break down. This trouble is never encountered in Bellman's
maximization procedures.
The crux of dynamic programming formulation, then, is to define functions $f_N(x)$, or in multi-dimensional cases $f_N(x_1, \dots, x_r)$, so that they may be expressed in terms of $f_i(x)$, $i < N$. The most
important part of the procedure is to choose the 'right' function
so that such equations may be developed and solved, and this to
a great extent is a matter of experience and intuition, which may be
partially acquired by delving through the literature. Reference [1]
is an obvious starting point towards this end, and the bibliography
in it gives ample further references.
The following example comes from reference [1], (example 45,
chapter I) and is chosen because of its similarity to valuation
problems. The notation has been slightly amended.
Example (i)
Suppose that we have a machine whose output in time periods 1, 2, 3, ... is $r_1, r_2, r_3, \dots$, and its upkeep cost $u_1, u_2, u_3, \dots$. The purchase price of a new machine is $p$, and its trade-in value at the end of the $t$-th period is $s_t$, with $p > s_0$, where $s_0$ is defined as the trade-in value at the beginning of the first period. The discounting factor is $v$.

Let the present value of future returns using an optimal strategy from the end of the $t$-th period be $f_t$, where $f_0$ is defined as at the beginning of the first period but immediately after purchase.

Assume $r_t$ and $u_t$ are to be discounted from the end of the period.
The recurrence relationships are

$$f_0 = \max \left\{ v(r_1 - u_1 + f_1),\; s_0 - p + f_0 \right\};$$

for practical purposes, since $p > s_0$,

$$f_0 = v(r_1 - u_1 + f_1);$$

and

$$f_t = \max \left\{ v(r_{t+1} - u_{t+1} + f_{t+1}),\; s_t - p + f_0 \right\}, \qquad t = 1, 2, \dots$$

The recurrence relationship gives

$$f_t = v(r_{t+1} - u_{t+1} + f_{t+1}), \qquad t = 0, 1, \dots, n - 1,$$

if $n$ is the value of $t$ for which

$$f_t = s_t - p + f_0$$

first holds, i.e. the end of the period at which it first becomes optimal to trade the machine in. These equations give

$$f_0 = \sum_{t=1}^{n} v^t (r_t - u_t) + v^n f_n$$

and

$$f_n = s_n - p + f_0.$$

Eliminating $f_n$,

$$f_0 = \sum_{t=1}^{n} v^t (r_t - u_t) + v^n (s_n - p + f_0),$$

i.e.

$$f_0 = \frac{\sum_{t=1}^{n} v^t (r_t - u_t) + v^n (s_n - p)}{1 - v^n}.$$
Hence to maximize the present value of future returns, n must be
chosen to maximize this expression.
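
The choice of $n$ can be made by direct computation. The Python sketch below (the output, upkeep, and trade-in figures are invented for illustration) evaluates the expression just derived for each candidate $n$ and selects the best:

def f0(n, r, u, s, p, v):
    # Present value of returns if the machine is always replaced after n periods.
    total = sum(v ** t * (r[t] - u[t]) for t in range(1, n + 1))
    return (total + v ** n * (s[n] - p)) / (1 - v ** n)

T = 15
r = [0] + [100 - 3 * t for t in range(1, T + 1)]          # declining output (assumed)
u = [0] + [20 + 4 * t for t in range(1, T + 1)]           # rising upkeep (assumed)
s = [0] + [max(60 - 8 * t, 5) for t in range(1, T + 1)]   # trade-in values (assumed)
p, v = 150, 0.95
best_n = max(range(1, T + 1), key=lambda n: f0(n, r, u, s, p, v))
print("replace after period", best_n, "; f0 =", round(f0(best_n, r, u, s, p, v), 2))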
Example (ii)
Reference [1], chapter I, example 47, asks further if it is uniformly
true that the optimal policy, if given an over-age machine, is to
turn it in immediately for a new one.
Consider the value of the process again:
If we are given a machine aged T and sell it immediately, then
the value of the process is

If we save it one time-period the value is

Hence, if

then we should not trade it in immediately. This of course will only be the case when the return per time-period on the total profit $f_0 - p$ at the rate of interest inherent in $v$ is smaller than the difference between the return $r_{T+1}$ and the upkeep $u_{T+1}$. This is seen more easily if we put $v = 1/(1+i)$ and rewrite the inequality in terms of $i$.
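
As a hedged illustration of how this comparison works (the displayed formulae are not reproduced above, so the following additionally assumes that the trade-in value of an over-age machine is negligible): selling at once is then worth approximately $f_0 - p$, while keeping the machine one further period is worth approximately $v(r_{T+1} - u_{T+1}) + v(f_0 - p)$, so that keeping is preferable when

$$v(r_{T+1} - u_{T+1}) > (1 - v)(f_0 - p), \quad \text{i.e.} \quad r_{T+1} - u_{T+1} > i(f_0 - p) \quad \text{with } v = \frac{1}{1 + i},$$

which is the condition described verbally in the text.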

Example (iii)
Reference [1], chapter I, example 48, asks how one would
formulate the problem to take into account technological improve-
ment in machines and operating procedure.
Here the inclusion in the recurrence relationship of a penalty for
outdating would be reasonable, as a loss is incurred due to the
inefficiency of the old machine relative to a new type. Hence the
equation would read something like

$$f_t = \max \left\{ v(r_{t+1} - u_{t+1} - d_{t+1} + f_{t+1}),\; s_t - p + f_0 \right\},$$

where $d_{t+1}$ denotes the penalty for outdating in period $t+1$.
Outdating would also affect $s_t$, but it is assumed this is incorporated in $s_t$ itself.
The expression arrived at in example (i) could be obtained
directly by standard actuarial techniques:
If sold after the nth period every time, the value

Define commutation functions

and
all of which can be tabulated.
Then value

which can be tabulated to obtain the maximum value by inspection.


The solution to example (ii) might also suggest to the actuarial
student that he would more naturally pose the original problem as
one of maximizing the inherent rate of interest i.
Thus for an initial outlay of p the future returns are

etc. in successive periods.


Hence the inherent rate of interest i is given by

where
It would be required to find the value of n which maximized i.
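
As a sketch of this alternative criterion (the equation of value is not reproduced above, so the cash-flow convention used here is an assumption): for a replacement cycle of length n one may solve, e.g. by bisection, for the rate i equating the outlay p to the discounted net returns r_t - u_t, t = 1, ..., n, together with the discounted trade-in value s_n, and then choose the n giving the largest i. A Python illustration with invented data:

def cycle_npv(i, n, r, u, s, p):
    # Net present value of one replacement cycle at rate i (assumed convention).
    v = 1.0 / (1.0 + i)
    return sum(v ** t * (r[t] - u[t]) for t in range(1, n + 1)) + v ** n * s[n] - p

def inherent_rate(n, r, u, s, p, lo=0.0, hi=1.0, tol=1e-8):
    # Bisection for the root in i; returns None if the outlay is never recovered.
    if cycle_npv(lo, n, r, u, s, p) < 0:
        return None
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if cycle_npv(mid, n, r, u, s, p) > 0 else (lo, mid)
    return 0.5 * (lo + hi)

T = 15
r = [0] + [100 - 3 * t for t in range(1, T + 1)]          # invented data, as before
u = [0] + [20 + 4 * t for t in range(1, T + 1)]
s = [0] + [max(60 - 8 * t, 5) for t in range(1, T + 1)]
p = 150
rates = {n: inherent_rate(n, r, u, s, p) for n in range(1, T + 1)}
best_n = max((n for n in rates if rates[n] is not None), key=lambda n: rates[n])
print("best n:", best_n, "inherent rate of interest:", round(rates[best_n], 4))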

SUCCESSIVE APPROXIMATION IN STRATEGY SPACE


Another important concept in dynamic programming involves the
use of successive approximation in solving complicated functional
equations of the type
(4)
An equation of this kind is equation (3) when gN(x) = g(x), all N.
Now suppose that the process is unbounded in time. We then
wish to solve the asymptotic equation
(5)
Solution of this equation comprises finding the values of X, i.e.
the strategies, and of f(x), the outcomes of the process, for all x.
Sometimes exact solutions are easily obtainable; however, often
they are not. In such cases, a powerful procedure for obtaining
successive approximations to the solution is to make successive
approximations in strategy space rather than function space; in
other words, to make successive approximations $X^{(1)}, X^{(2)}, \dots$ to $X$, where these $X^{(i)}$ are of course all functions of $x$. First, we assume a solution $X^{(1)}$, and then solve the equation
(6)
to obtain a first approximation $f^{(1)}(x)$ to $f(x)$. Then $X^{(2)}$ is obtained by solving

(7)

and so on, finding alternately successive values $f^{(2)}, X^{(3)}, f^{(3)}, X^{(4)}, \dots$ of the function and the strategy.
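
As a minimal illustration of this alternation (approximation in strategy space, often called approximation in policy space), the Python sketch below treats a small finite problem; the returns, transitions, and discount factor are invented, and discounting is introduced solely to keep the illustrative equations well-posed:

import numpy as np

n_states, n_actions, v = 5, 3, 0.9
rng = np.random.default_rng(0)
g = rng.uniform(0, 1, size=(n_states, n_actions))          # one-step return g(x, X)
T = rng.integers(0, n_states, size=(n_states, n_actions))  # next state T(x, X)

X = np.zeros(n_states, dtype=int)          # initial strategy X^(1)
for _ in range(50):
    # Solve the linear equations f(x) = g(x, X(x)) + v f(T(x, X(x))) for this strategy.
    A = np.eye(n_states)
    b = np.empty(n_states)
    for x in range(n_states):
        A[x, T[x, X[x]]] -= v
        b[x] = g[x, X[x]]
    f = np.linalg.solve(A, b)
    # Improve the strategy: the next X(x) maximizes g(x, X) + v f(T(x, X)).
    X_new = np.argmax(g + v * f[T], axis=1)
    if np.array_equal(X_new, X):
        break
    X = X_new

print("strategy:", X, "values:", np.round(f, 3))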
The powerful nature of this concept is obvious, in that it may
not only be used as a purely mathematical technique but also as
one partially using computer simulation of the process. This will
be discussed further after consideration of an example of the
application of dynamic programming and successive approximation
in strategy space to an actuarial problem, which might clarify
concepts slightly, even though the model might be considered
somewhat oversimplified.

Example (iv)
Suppose an insurance company has a certain number of life
insurance policies and must decide how large its liquid reserves
should be in order to handle all claims. If the reserves are too
large, the company is losing money by failing to invest the surfeit;
if they are too small, additional costs are incurred in meeting the
excess claims, for example, by being forced to sell securities under
unfavourable circumstances. What, then, is the optimal reserve
level to adopt?
Let it be assumed that if there are reserves $x$ on hand at time $t$, we wish to determine what additional reserves $y - x$ should be put aside in the next time-period $(t, t + 1)$ for payment of claims, given that the probability distribution for the amounts of claims for the time-period is $\varphi(s)\,ds$, that it costs $k(y - x)$ to make these additional reserves available (for example, in administrative costs and lost investment dividends), that a cost $p(z)$ is incurred if there is an excess $z$ of claims over reserves, and that there is a discount ratio $v$, $0 < v < 1$, per unit time.

Then, if $f(x)$ = expected total discounted cost of the process starting with initial reserves $x$ and using an optimal policy,

$$f(x) = \min_{y \ge x} \left[ k(y - x) + \int_y^{\infty} p(s - y)\,\varphi(s)\,ds + v f(0) \int_y^{\infty} \varphi(s)\,ds + v \int_0^{y} f(y - s)\,\varphi(s)\,ds \right]. \qquad (8)$$
The first term is the cost of adding reserves y — x; the second is
the integral of the probability that s > y and a cost p(s —y) is
incurred; the third term is the cost of the process from the next
time-period given that reserves have been completely exhausted
(i.e. s > y); and the fourth term is the integral of the probability
that s < y and we are left at time t+1 with reserves y − s, from when
the cost of the process is f(y — s).
This is precisely equation (7), p. 156, of [1], and the reader is
referred to it for a discussion of the solution. An exact solution
is given for k(z) = kz and p(z) = pz; further cases are considered
there and also in references [2] and [3]. The optimization criterion
used was that of minimizing f(x). It turns out, incidentally, that
the exact solutions in some of the cases just referred to are constant
reserve level solutions. That is, if at time t our reserves are x, we
order reserves y — x, where y is independent of x and depends only
upon the parameters of the system. This would appear to be
actuarially convenient.
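
Where exact solutions are not available, equation (8) may also be attacked numerically. The following Python sketch (the claim distribution and cost constants are invented, and it does not reproduce the exact solutions referred to above) applies successive approximation in function space on a discrete grid of reserve levels:

import numpy as np

k, pen, v = 1.0, 5.0, 0.9          # ordering cost, penalty rate, discount ratio (assumed)
levels = np.arange(0, 21)          # possible reserve levels 0..20
claims = np.arange(0, 31)
phi = np.exp(-claims / 6.0)        # assumed discrete claim distribution
phi /= phi.sum()

f = np.zeros(len(levels))
for _ in range(500):
    f_new = np.empty_like(f)
    for xi, x in enumerate(levels):
        best = np.inf
        for y in levels[xi:]:                    # choose reserves y >= x
            excess = np.maximum(claims - y, 0)   # claims beyond reserves
            left = np.maximum(y - claims, 0)     # reserves left over (0 when exhausted)
            cost = (k * (y - x)
                    + pen * np.dot(phi, excess)  # expected penalty cost
                    + v * np.dot(phi, f[left]))  # expected discounted continuation
            best = min(best, cost)
        f_new[xi] = best
    if np.max(np.abs(f_new - f)) < 1e-9:
        f = f_new
        break
    f = f_new

print("f(0) approximately", round(f[0], 3))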
For example, if $\varphi(s)\,ds$, the distribution for the amount of claims, is of a given specific form, the cost $p(z)$ of an excess $z$ of claims over reserves is $p(z) = pz$, and $k$ is also taken as a constant, then equation (8) becomes

$$f(x) = \min_{y \ge x} \left[ k(y - x) + p \int_y^{\infty} (s - y)\,\varphi(s)\,ds + v f(0) \int_y^{\infty} \varphi(s)\,ds + v \int_0^{y} f(y - s)\,\varphi(s)\,ds \right].$$
On p. 160 it is given that if vp > k, and some other simple practical
conditions are also satisfied, then the solution is as follows:
Let $\bar{x}$ be the unique root of the equation derived there.
Then the optimal policy has the form

(i) for $0 \le x < \bar{x}$: $y = \bar{x}$;
(ii) for $x \ge \bar{x}$: $y = x$.

In other words, the optimal reserve level is $\bar{x}$.
If vp < k, the solution is given by y = x for all x ≥ 0, i.e. never order.
This is intuitively obvious since vp < k implies that the cost of
a penalty is less than the cost of ordering its equivalent in advance.
Of course, we could alternatively consider 'suboptimal'
strategies. For example, we could confine ourselves to constant
reserve level solutions and deduce from equation (8) what the
constant level should be. That is, solutions would be confined to

a subset of possible solutions, and the optimal solution of this


subset determined.
Also, for functions k(z) and p(z) that make equation (8) difficult
to solve exactly, successive approximations in strategy space may
be found in the manner described in the previous section.

SIMULATION OF MULTI-STAGE DECISION PROCESSES


A powerful new tool for investigating complicated multi-stage
decision processes is the digital computer: successive approxi-
mations to optimal strategies may be derived by means of simulation
of the processes on the computer. Using a specific strategy, and
random numbers if the process is stochastic, the process may be
performed again and again, after feeding the parameters and
constraints of the system into the computer, and the outcomes
observed. When this has been done a sufficiently large number of
times, a clear picture of the results obtained from using the strategy
will be evident: we will have a close approximation to the probability distribution of the outcomes. This method is generally
referred to as a Monte Carlo method.
Once the outcomes have been observed, the strategy is changed
slightly and a new set of outcomes is observed. If these are better
with respect to the optimization criteria—for example, the average
cost of a search process may have been reduced—then we now have
a closer approximation to the optimal strategy. Hence a systematic
technique for successive approximation in strategy space may be
defined for any multi-stage decision process.
As an example, suppose that it is required to find an object which
is in one of N cells with probabilities $p_1, \dots, p_N$, where $\sum_r p_r = 1$.

Also, let there be corresponding cost parameters $t_1, \dots, t_N$. Let it
be assumed further that if a cell is searched and the object is in
a neighbouring cell it will move one cell further away, unless it is
in either of the cells 1, N, when it will remain where it is. Now,
in the process in which no movement is permitted, it has been
determined that the strategy that minimizes the statistical ex-
pectation of the total cost of the search is to examine in order of
increasing values of $t_r/p_r$. This is an intuitively reasonable solution,
since the procedure involves examining cells with low cost and high
probability first.
Hence let us use this strategy as a first approximation to our
more complicated process, and simulate the process a large number
of times, noting what the costs of the search are. For successive
approximations, consider the subset of possible strategies defined
by minimizing $t_r/p_r^{k_r}$, where the $k_r$ are initially set equal to
unity, and then varied incrementally (and independently). For
example, searching near the boundaries will have a different effect
from searching elsewhere, and we may therefore decide to see
what happens when we vary the $k_r$ for $r$ near 1 and N first. Hence
the whole problem of determining the strategy in this subset may
be (computer) programmed from beginning to end to simulate
the process, change the kr incrementally, observe when these
changes produce positive and negative effects (i.e. decrease and
increase the cost), and increment and decrement accordingly until
the suboptimal strategy is determined. Another subset of strategies
could comprise criteria of the form $k_r t_r / p_r$, for example. An
interesting research problem for the reader, incidentally, would be to
attempt to obtain the mathematically exact optimal strategy, which
has not as yet been found. More sophisticated methods exist of
applying the concept of successive approximation using Monte
Carlo simulation on a computer; the example above and the
strategy subsets considered were chosen to attempt to explain the
concept as simply as possible.
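
The following Python sketch illustrates the kind of Monte Carlo comparison described; the cell data, the strategy family $t_r/p_r^{k_r}$, and the particular perturbation of the $k_r$ are invented for illustration only:

import random

N = 6
p = [0.30, 0.25, 0.20, 0.10, 0.10, 0.05]    # assumed location probabilities
t = [2.0, 1.0, 1.5, 1.0, 3.0, 2.0]          # assumed search costs

def average_cost(k, runs=20000, seed=1):
    rng = random.Random(seed)
    # Strategy: cycle through the cells in order of increasing t_r / p_r**k_r.
    order = sorted(range(N), key=lambda r: t[r] / (p[r] ** k[r]))
    total = 0.0
    for _ in range(runs):
        obj = rng.choices(range(N), weights=p)[0]
        cost = 0.0
        for step in range(10_000):              # cap guards against pathological evasion
            cell = order[step % N]
            cost += t[cell]
            if cell == obj:
                break
            if abs(cell - obj) == 1 and obj not in (0, N - 1):
                obj += 1 if obj > cell else -1  # object moves one cell further away
        total += cost
    return total / runs

base = [1.0] * N
tweak = list(base)
tweak[0] += 0.2
tweak[N - 1] += 0.2                             # vary k_r near the boundaries first
print("average cost, all k_r = 1:", round(average_cost(base), 3))
print("average cost, boundary k_r increased:", round(average_cost(tweak), 3))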

FIELDS OF APPLICATION
The number of areas of application, actual and potential, of dynamic
programming is extremely large, one might say virtually inexhaust-
ible; and much research of considerable importance is being
attacked successfully from this new angle. Probably one of the
most important and promising of these areas is the field of learning
processes [4], feed-back processes, and adaptive control pro-
cesses [5], [6]. Much thought has been given in the last decade to
the construction of machines that can 'learn'. The definition of
this phrase as far as machines are concerned is to some extent a
philosophical problem; simply, what we usually mean is to
program the machine, or give it sufficient capacity, instructions
and logic, to enable it to learn from its own experiences, and adapt
its decisions, or strategies, according to the criteria for success that
we feed into it. Hence, as it observes its successes and failures
using certain strategies, it will modify them to increase the success
rate, hoping eventually to achieve complete success. This is
obviously possible with a computer of infinite storage space,
because the totality of experiences may be stored to cite in future
experiences; however, since the machine cannot have infinite
storage space, other concepts, of a probabilistic nature, have to be
developed.
Another important area is that concerning search processes, in
which we may be considering a system in which we wish to find
an object such as the cause of a breakdown in the system, for ex-
ample [7], [8], or even find it and automatically repair it [9]. There
naturally are also applications to war games in which it is desired
to derive optimal strategies for detecting enemy objects, such as
submarines. A further application is the retrieval of information
from large libraries of data or documents. A simple example of
an information retrieval problem is given on pp. 50-51 of [1].
Also, inventory control, and scheduling problems, as previously
cited, have been considered with success from the dynamic pro-
gramming viewpoint.
Dynamic programming has also been applied to such diverse
fields as communication theory (see pp. 140-143 of [1], [10]),
allocation processes (chapter I of [1]), and communication network
theory [11], [12].
It should of course be realized that some of the fields and
references mentioned are to a certain extent overlapping, depending
upon one's definitions of these respective fields. However, the variety
of applications should nevertheless be obvious, and it is to be hoped
that this introduction to dynamic programming may attract
further research workers into the field.

REFERENCES
[1] BELLMAN, R. (1957). Dynamic Programming. Princeton University Press. (The reader should perhaps be cautioned that, as is to be expected for any important mathematical first edition, there are a certain number of slight errors.)
[2] GLUSS, B. (1960). 'An optimal inventory solution for some specific demand distributions.' Naval Research Logistics Quarterly, 7, no. 1, 45-48.
[3] LEVY, J. (1959). 'Further notes on the loss resulting from the use of incorrect data in computing an optimal policy.' Naval Research Logistics Quarterly, 6, no. 1, 25-31.
[4] BELLMAN, R. & KALABA, R. (1958). 'On communication processes involving learning and random duration.' RAND Research Memorandum P-1194.
[5] BELLMAN, R. (1961). Adaptive Control Processes: A Guided Tour. Princeton University Press.
[6] KALABA, R. (1959). 'Some aspects of adaptive control processes.' RAND Research Memorandum P-1809.
[7] JOHNSON, S. M. (1956). 'Optimal sequential testing.' RAND Research Memorandum RM-1652.
[8] GLUSS, B. (1959). 'An optimum policy for detecting a fault in a complex system.' Operations Research, 7, no. 4, 467-477.
[9] FIRSTMAN, S. I. & GLUSS, B. (1960). 'Optimum search routines for automatic fault location.' Operations Research, 8, no. 4, 512-523.
[10] BELLMAN, R. & KALABA, R. (1957). 'On the role of dynamic programming in statistical communication theory.' IRE Transactions of the Professional Group on Information Theory, vol. IT-3, no. 3.
[11] KALABA, R. & JUNCOSA, M. L. (1956). 'Optimal design and utilization of communication networks.' RAND Research Memorandum P-782.
[12] KALABA, R. (1959). 'On some communication network problems.' RAND Research Memorandum P-1325.
