0% found this document useful (0 votes)
49 views

Lecture 8 Dynamic Programming

Dynamic programming is a technique for solving sequential decision problems that breaks them down into smaller subproblems. It transforms a complex multi-variable problem into a series of single-variable problems. The technique divides a problem into stages where optimal decisions are made sequentially. It uses the principle of optimality - that the remaining decisions should be optimal given the current state, regardless of how it was reached. Solutions are found recursively by solving subproblems and using their solutions to solve larger subproblems until the overall optimal solution is determined.

Uploaded by

chalie molla
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
49 views

Lecture 8 Dynamic Programming

Dynamic programming is a technique for solving sequential decision problems that breaks them down into smaller subproblems. It transforms a complex multi-variable problem into a series of single-variable problems. The technique divides a problem into stages where optimal decisions are made sequentially. It uses the principle of optimality - that the remaining decisions should be optimal given the current state, regardless of how it was reached. Solutions are found recursively by solving subproblems and using their solutions to solve larger subproblems until the overall optimal solution is determined.

Uploaded by

chalie molla
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 32

Dynamic programming

1 8/16/2017
Introduction
In most practical problems, decisions have to be made
sequentially at different points in time, at different
points in space, and at different levels, say, for a system
and/or a subsystem.

The problems in which the decisions are to be made


sequentially are called sequential decision problems.

Since these decisions are to be made at a number of


stages, they are also referred to as multistage decision
problems.

2 8/16/2017
As applied to dynamic programming, a multistage
decision process is one in which a number of single-stage
processes are connected in series so that the output of one
stage is the input of the succeeding stage.

Strictly speaking, this type of process should be called a


serial multistage decision process since the individual
stages are connected head to tail with no recycle.

3 8/16/2017
Example of A serial Multistage decision problem

The stage numbers may be assigned in increasing order


either in the forward direction or in the backward direction.
4 8/16/2017
Multi-stage decision making
5

 The objective of a multistage decision problem


is to find x1, x2, x3, ,…., xn to maximize a
function of return ( the sum of returns over all
stages) while satisfying the stage
transformation equation.

 Note: A multistage problem may consist of more


than one state variables.

8/16/2017
Dynamic programming (DP) is a useful mathematical
technique ideally suited for the optimization of sequential
decision problems (multistage decision problems).

Dynamic programming (DP) provides a systematic


procedure for determining the optimal combination of
sequential (multi-stage) decisions.

Example: Reservoir operation problem: release decisions have to be


made sequentially across time periods based on the inflow
and the available storage in each period.

This technique was developed by Richard Bellman in the


early 1950s.
6 8/16/2017
 Dynamic programming is an approach that divides
the original optimization problem into a set of
smaller optimization problems, each of which needs
to be solved before the overall optimum solution to
the original problem can be identified.
 DP transforms a complicated n-variable decision
process into a series of n-stages with a single
decision at each stage i.e. the dynamic
programming technique decomposes a multistage
decision problem as a sequence of single-stage
decision problems.

 In contrast to linear programming, DP problems


are not amenable to a single, standard algebraic
7 formulation 8/16/2017
The optimization problem is transformed as a sequence
of n decision processes or stages

8 8/16/2017
Example of A single stage decision problem

9 8/16/2017
Example
For instance, take the case of maximizing a monthly
(variable) yield for a reservoir of known capacity.
Assume further that there are n-months of historic (or
synthetic) flow records.

 Typically this is an optimization problem in which


the n-months’ releases are to be optimized.
 In a LP formulation, the optimization algorithm
attempts to maximize the releases by formulating
an appropriate objective function that involves the
releases of all the months.

10 8/16/2017
 DP, on the other hand, approaches the problem from a
different angle.

 In DP the optimization problem is transformed as a


sequence of n decision processes, or stages, at which the
single monthly release, say at month i, is “optimally”
selected to maximize the total release over the entire
length of record.

11 8/16/2017
 Thus, decision is made at each stage; in the present
example, at each month.
 This sequential decision process can be shown as in the
following figure:

r1 r2 ri rN-1 rN
S3 Si+1
S1 S2 Si
1 2 i N-1 N

d1 d2 di dN-1 dN

12 8/16/2017
Terminologies

 Stages: are the points, in time or space, where decision


are made.

 Decision Variables (X) : are courses of action to be


taken for each stage. e.g. the release of a reservoir at
each stage (i.e. month).

 State variables (Si): are variables that describe the


state of the system at any stage i e.g., the amount of
water available in the reservoir at the end of each stage.
13 8/16/2017
Terminologies
 Stage return (R): is a scalar measure of the
effectiveness of the decision making at each stage. It is a
function of (S, X).

 Stage transformation or state transition (T): is a


single-valued transformation which relates the input, the
output state, and the decision.

14 8/16/2017
Types of Multistage Decision Problems

1. Initial value problem. If the value of the initial state


variable, sn+1, is prescribed, the problem is called an
initial value problem.
2. Final value problem. If the value of the final state
variable, s1 is prescribed, the problem is called a final
value problem. Notice that a final value problem can be
transformed into an initial value problem by reversing
the directions of si ,i = 1, 2, . . . , n + 1.
3. Boundary value problem. If the values of both the input
and output variables are specified, the problem is
called a boundary value problem.
8/16/2017 15
Types of multistage problems: (a) initial value problem;
(b) final value problem; (c) boundary value problem.

16 8/16/2017
DP solution is based on Bellman’s principle
of optimality
Bellman’s principle of optimality state that
 Given the current state of a system, the optimal policy
(Sequence of decisions) for the remaining stages is independent
of the policy adopted in the previous stages.

 The principle implies that, given the state Si of the system at a


stage i, one must proceed optimally till the last stage,
irrespective of how one arrived at the state Si.

The solution begins by finding the optimal decision for each


possible state using the backward recursion or the Forward
recursion.

17 8/16/2017
Bellman’s principle of optimality

8/16/2017 18
Recursive nature of computations in DP
 Computations in DP are done recursively, so that the optimum
solution of one sub problem is used as an input to the next
sub problem.

 By the time the last sub problem is solved, the optimum


solution for the entire problem is at hand.

 The manner in which the recursive computations are


carried out depends on how we decompose the original
problem i.e. on the type of DP problem.

 The recursive computations can be done both in forward and


backward directions. Both recursions yield same solutions.

8/16/2017 19
Characteristics of a DP problem
 A single n-variable problem is divided into n number
of single variable problems.
 A problem is divided into stages, with a policy
decision required at each stage.
 Each stage has a number of possible states associated
with it.
 The policy decision transforms the current state into a
state associated with the next stage.
 A recursive relationship identifies the optimal
decision at a given stage for a specific state, given the
optimal decision for each state at the previous stage.
 A solution moves backward or forward, stage by stage
20 till optimal decision for the last stage is found. 8/16/2017
21 8/16/2017
22 8/16/2017
Example 1. Shortest route problem
Suppose that you want to select the shortest highway route
between two cities. The network in the Figure below shows
the possible routes between the starting city at node 1 and
the destination city at node 7. The routes pass through
intermediate cities designated by nodes 2 to 6.

8/16/2017 23
Solution
1. We can solve this problem by exhaustively
enumerating all the routes between node 1 and 7 (there
are five such routes). But in a large network exhaustive
enumeration may be difficult computationally. Thus,
another method of solution is required. DP is the most
suitable technique for such type of problems.

2. To solve the problem by DP, we first decompose it into


stages as delineated by the vertical dashed lines in the
next Figure. Next, we carry out the computations for
each stage separately.
8/16/2017 24
Decomposition of shortest route problem into stages

8/16/2017 25
Solution using DP : Points that should be noted
1. The problem is decomposed into three stages
2. At each stage a decision is made.
3. The solution to the problem is found recursively. there
are two ways – beginning at the left most node and
moving from left to right, called the forward-moving
(but backward-looking) algorithm, or beginning at the
most right column of nodes or states and moving from
right to left, called the backward-moving (but
forward-looking) algorithm.
First we will see the forward recursion and then we will
see the backward recursion.
Go to solution
8/16/2017 26
Example 2. Water Allocation problem

A total of 6 unit of water is to be allocated optimally to


three users. The allocation is made in discrete steps of one
unit ranging from 0 to 6. With the three users denoted as
user 1, User 2, and User 3 respectively, the returns obtained
from the users for a given allocation are given in the table
shown in the next slide.

The problem is to find allocations to the three users such


that the total return is maximized.

8/16/2017 27
28 8/16/2017
Go to solution
Exercises
1.

8/16/2017 29
2. A minimum-cost pipeline is to be laid
between points (towns) A and E. The
pipeline is required to pass through one
node out of B1, B2, and B3, one out of
C1, C2, and C3, and one out of D1, D2,
and D3 (see Fig. below). The costs
associated with the various segments of
the pipeline are given below:

8/16/2017 30
8/16/2017 31
8/16/2017 32

You might also like