
Dynamic programming

The basic features that characterize dynamic programming problems are the following:

Stages: The problem can be divided into stages, with a policy decision required at each stage. If
an investment planning problem has an n-period horizon, then the problem can be divided into n
stages, where each stage represents one period of the planning horizon. At each stage a decision
has to be made about what to do: how much to invest in that period, for instance. Generally,
dynamic programming problems require making a sequence of interrelated decisions, where each
decision corresponds to one stage of the problem.

States: Each stage has a number of states associated with the beginning of that stage. For
instance, if the problem is to find the shortest route from a certain initial node to a destination
node, the states at stage n are the nodes where the traveler might be located and from which he
must decide the route to the next node. In general, the states are the various possible conditions
in which the system might be at that stage of the problem. The number of states may be either
finite or infinite. For instance, in a shortest-route problem there is a small, discrete number of
nodes at a given stage representing the possible starting states of the system at that stage. On the
other hand, the amount carried over from previous periods' investments, which is the state
entering the investment decision at period k, may be continuous, with an infinite number of
possible values.

State transition: The effect of the policy decision at each stage is to transform the current state
to a state associated with the beginning of the next stage (possibly according to a probability
distribution). The fortune seeker’s decision as to his next destination led him from his current
state to the next state on his journey. This procedure suggests that dynamic programming
problems can be interpreted in terms of networks. Each node would correspond to a state.
The network would consist of columns of nodes, with each column corresponding to a stage, so
that the flow from a node can go only to a node in the next column to the right. The links from a
node to nodes in the next column correspond to the possible policy decisions on which state to
go to next. The value assigned to each link usually can be interpreted as the immediate
contribution to the objective function from making that policy decision. In most cases, the
objective corresponds to finding either the shortest or the longest path through the network.
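
As a concrete illustration of this network view, the following sketch represents a small staged
network in Python as a dictionary. The node labels and link values are hypothetical, chosen only
to show the structure; each link value stands for the immediate contribution of the corresponding
policy decision.

# Hypothetical staged network: each node maps to the nodes in the next
# column (stage) that can be reached from it, together with the link value,
# i.e. the immediate contribution of choosing that policy decision.
network = {
    "A": {"B1": 2, "B2": 4},   # stage 1 state
    "B1": {"C1": 7, "C2": 3},  # stage 2 states
    "B2": {"C1": 4, "C2": 1},
    "C1": {"D": 5},            # stage 3 states
    "C2": {"D": 8},
    "D": {},                   # stage 4: the destination, no further decisions
}
# Finding the shortest (or longest) path from "A" to "D" through these columns
# then corresponds to finding the optimal sequence of policy decisions.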

Optimal policy decision: The solution procedure is designed to find an optimal policy for the
overall problem, i.e., a prescription of the optimal policy decision at each stage for each of the
possible states. For the stagecoach problem, the solution procedure constructed a table for each
stage (n) that prescribed the optimal decision (xn*) for each possible state (s). Thus, in addition
to identifying three optimal solutions (optimal routes) for the overall problem, the results show
the fortune seeker how he should proceed if he gets detoured to a state that is not on an optimal
route. For any problem, dynamic programming provides this kind of policy prescription of what
to do under every possible circumstance (which is why the actual decision made upon reaching a
particular state at a given stage is referred to as a policy decision). Providing this additional
information beyond simply specifying an optimal solution (optimal sequence of decisions) can
be helpful in a variety of ways, including sensitivity analysis.

Optimality Principle: Given the current state, an optimal policy for the remaining stages is
independent of the policy decisions adopted in previous stages. Therefore, the optimal immediate
decision depends on only the current state and not on how you got there. This is the principle of
optimality for dynamic programming.

Given the state in which the fortune seeker is currently located, the optimal life insurance policy
(and its associated route) from this point onward is independent of how he got there. For
dynamic programming problems in general, knowledge of the current state of the system
conveys all the information about its previous behavior necessary for determining the optimal
policy henceforth. (This property is the Markovian property). Any problem lacking this property
cannot be formulated as a dynamic programming problem.

Optimal policy for last stage: The solution procedure begins by finding the optimal policy for
the last stage. The optimal policy for the last stage prescribes the optimal policy decision for
each of the possible states at that stage. The solution of this one-stage problem is usually trivial,
as it was for the stagecoach problem.

A recursive relationship that identifies the optimal policy for stage n, given the optimal policy
for stage n + 1, is available.

For the stagecoach problem, this recursive relationship was

f*n(s) = min_xn { c(s, xn) + f*n+1(xn) }.

Therefore, finding the optimal policy decision when you start in state s at stage n requires finding
the minimizing value of xn. For this particular problem, the corresponding minimum cost is
achieved by using this value of xn and then following the optimal policy when you start in state
xn at stage n +1.
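
As a minimal illustration of this recursion, the following Python sketch applies backward
recursion to a small staged network. The states and link costs are hypothetical rather than the
stagecoach data, but the calculation has exactly the form f*n(s) = min_xn { c(s, xn) + f*n+1(xn) }.

# Hypothetical staged network: costs[s][x] is the immediate cost of going
# from state s to state x in the next stage.
costs = {
    "A": {"B1": 2, "B2": 4},
    "B1": {"C": 7},
    "B2": {"C": 1},
}
stages = [["A"], ["B1", "B2"], ["C"]]   # states grouped by stage, first to last

f_star = {"C": 0}                       # last stage: no remaining cost at the destination
policy = {}
for states in reversed(stages[:-1]):    # work backward, stage by stage
    for s in states:
        # f*_n(s) = min over xn of { c(s, xn) + f*_{n+1}(xn) }
        x_best = min(costs[s], key=lambda x: costs[s][x] + f_star[x])
        policy[s] = x_best
        f_star[s] = costs[s][x_best] + f_star[x_best]

print(f_star["A"], policy)              # minimum total cost from "A" and the optimal policy
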
The precise form of the recursive relationship differs somewhat among dynamic programming
problems. However, notation analogous to that introduced in the preceding section will continue
to be used here, as summarized below.

N = number of stages.

n = label for the current stage (n = 1, 2, . . . , N).

sn = current state for stage n.

xn = decision variable for stage n.

xn* = optimal value of xn (given sn).

fn(sn, xn) = contribution of stages n, n + 1, . . . , N to the objective function if the system starts in
state sn at stage n, the immediate decision is xn, and optimal decisions are made thereafter.

f*n(sn) = fn(sn, xn*).

The recursive relationship will always be of the form

f*n(sn) = max_xn { fn(sn, xn) }   or   f*n(sn) = min_xn { fn(sn, xn) },

where fn(sn, xn) would be written in terms of sn, xn, f*n+1(sn+1), and probably some measure of
the immediate contribution of xn to the objective function. It is the inclusion of f*n+1(sn+1) on
the right-hand side, so that f*n(sn) is defined in terms of f*n+1(sn+1), that makes the expression
for f*n (sn) a recursive relationship.

The recursive relationship keeps recurring as we move backward stage by stage. When the
current stage number n is decreased by 1, the new f*n(sn) function is derived by using the
f*n+1(sn+1) function that was just derived during the preceding iteration, and then this process
keeps repeating. This property is emphasized in the next (and final) characteristic of dynamic
programming.
DP solution process

DP problems can be solved through either backward or forward recursion. In forward recursion,
the solution starts from stage 1 and proceeds to stage n; in backward recursion, the reverse order
is taken, from stage n to stage 1. Both approaches give the same result, but depending on the
problem one or the other may be more efficient computationally (Taha 2007).

Backward Recursion: When we use this recursive relationship, the solution procedure starts at
the end and moves backward stage by stage—each time finding the optimal policy for that stage
— until it finds the optimal policy starting at the initial stage. This optimal policy immediately
yields an optimal solution for the entire problem, namely, x1* for the initial state s1, then x2* for
the resulting state s2, then x3* for the resulting state s3, and so forth to xN* for the resulting
state sN.

This backward movement was demonstrated by the stagecoach problem, where the optimal
policy was found successively beginning in each state at stages 4, 3, 2, and 1, respectively. For
all dynamic programming problems, a table such as the following would be obtained for each
stage (n = N, N - 1, . . . , 1).

              xn
  sn     fn(sn, xn)       f*n(sn)      xn*

When this table is finally obtained for the initial stage (n = 1), the problem of interest is solved.
Because the initial state is known, the initial decision is specified by x1* in this table. The
optimal values of the other decision variables are then specified by the other tables in turn,
according to the state of the system that results from the preceding decisions.
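
The forward trace can be sketched as follows, assuming (hypothetically) that the backward pass
has produced, for each stage, a table mapping each state to its optimal decision xn*, and that, as
in the stagecoach problem, the decision made is itself the state entered at the next stage.

# Hypothetical per-stage tables from a completed backward pass:
# x_star holds one table per stage, in order; each table maps a state to its optimal decision.
x_star = [
    {"A": "B2"},                  # stage 1
    {"B1": "C1", "B2": "C2"},     # stage 2
    {"C1": "D", "C2": "D"},       # stage 3
]

state = "A"                       # the known initial state s1
solution = []
for table in x_star:              # read the tables in stage order
    decision = table[state]       # optimal decision for the current state
    solution.append(decision)
    state = decision              # here the decision itself is the next state
print(solution)                   # ['B2', 'C2', 'D']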

State of the system

The state of the system at stage n is the subtlest component of DP problems. Though the context
of the problem determines what a state means, the following two questions help guide the
identification of the relevant states:
• What relationship binds the stages together?
• What information is needed to make feasible decisions at the current stage without
reexamining the decisions made at previous stages?

Deterministic Dynamic Programming

• The state at the next stage is completely determined by the state and policy decision at the
current stage.
• The structure of deterministic dynamic programming is shown in the diagram below.

[Diagram: at stage n the system is in state sn, with value fn(sn, xn); the policy decision xn, with
its contribution to the objective function, moves the system to stage n + 1, state sn+1, with value
f*n+1(sn+1).]

Thus, at stage n the process will be in some state sn. Making policy decision xn then moves the
process to some state sn+1 at stage n + 1. The contribution thereafter to the objective function
under an optimal policy has been previously calculated to be f*n+1(sn+1). The policy decision
xn also makes some contribution to the objective function. Combining these two quantities in an
appropriate way provides fn(sn, xn), the contribution of stages n onward to the objective
function. Optimizing with respect to xn then gives f*n(sn) = fn(sn, xn*). After xn* and f*n(sn)
are found for each possible value of sn, the solution procedure is ready to move back one stage.
Types of deterministic DP

Objective function: minimization or maximization of a sum or product of the contributions of the
individual stages.

Nature of the set of states at each stage: discrete, continuous, or a state vector.

A Prevalent Problem Type—The Distribution of Effort Problem

For this type of problem, there is just one kind of resource that is to be allocated to a number of
activities. The objective is to determine how to distribute the effort (the resource) among the
activities most effectively.

Assumptions

Though, like LP, this is a resource allocation problem, there is still a fundamental difference
between the two. One key difference is that the distribution of effort problem involves only one
resource (one functional constraint), whereas linear programming can deal with thousands of
resources. Another is that the assumptions of proportionality, divisibility and certainty may be
violated in this model. The only assumption retained here, as in LP, is additivity, which is needed
because of the principle of optimality in DP.

Formulation

Because they always involve allocating one kind of resource to a number of activities,
distribution of effort problems always have the following dynamic programming formulation
(where the ordering of the activities is arbitrary):

Stage n = activity n (n = 1, 2, . . . , N).

xn = amount of resource allocated to activity n.


State sn = amount of resource still available for allocation to remaining activities (n, . . . , N).

The reason for defining state sn in this way is that the amount of the resource still available for
allocation is precisely the information about the current state of affairs (entering stage n) that is
needed for making the allocation decisions for the remaining activities.

When the system starts at stage n in state sn, the choice of xn results in the next state at stage
n + 1 being sn+1 = sn - xn, as depicted below:

[Diagram: at stage n the state is sn; the decision xn moves the system to stage n + 1, where the
state is sn+1 = sn - xn.]
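
A minimal Python sketch of this formulation is given below, using a small hypothetical return
table g[n][x] for the contribution of allocating x units of the resource to activity n. It applies
backward recursion with the state transition sn+1 = sn - xn and then traces the optimal allocation
forward from the full amount of the resource.

# Distribution of effort: allocate R units of one resource to N activities.
# g[n][x] = contribution of allocating x units to activity n (hypothetical data).
R = 4
g = [
    [0, 3, 5, 6, 6],    # activity 1
    [0, 2, 5, 7, 8],    # activity 2
    [0, 4, 6, 7, 7],    # activity 3
]
N = len(g)

# f_star[n][s] = best total contribution obtainable from the remaining activities
# (those with index >= n) when s units of the resource are left.
f_star = [[0] * (R + 1) for _ in range(N + 1)]
x_star = [[0] * (R + 1) for _ in range(N)]

for n in reversed(range(N)):                 # backward recursion over the activities
    for s in range(R + 1):                   # state: amount of resource still available
        # decision xn: units allocated to this activity; next state is s - xn
        best_x = max(range(s + 1), key=lambda x: g[n][x] + f_star[n + 1][s - x])
        x_star[n][s] = best_x
        f_star[n][s] = g[n][best_x] + f_star[n + 1][s - best_x]

# Trace the optimal allocation forward, starting with all R units available.
s, allocation = R, []
for n in range(N):
    allocation.append(x_star[n][s])
    s -= x_star[n][s]
print(f_star[0][R], allocation)              # maximum total contribution and allocation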

Probabilistic dynamic programming model

Probabilistic dynamic programming differs from deterministic dynamic programming in that the
state at the next stage is not completely determined by the state and policy decision at the current
stage. Rather, there is a probability distribution for what the next state will be. However, this
probability distribution still is completely determined by the state and policy decision at the
current stage. The structure of the probabilistic DP model is shown in the diagram below.

For the purposes of this diagram, we let S denote the number of possible states at stage n + 1 and
label these states on the right side as 1, 2, . . . , S. The system goes to state i with probability pi (i
=1, 2, . . . , S) given state sn and decision xn at stage n. If the system goes to state i, Ci is the
contribution of stage n to the objective function.

[Diagram: at stage n, with state sn and decision xn, the system moves to state i (i = 1, 2, . . . , S)
at stage n + 1 with probability pi, receiving contribution Ci from stage n and f*n+1(i) thereafter.]
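
When, for example, the objective is to minimize the expected sum of the stage contributions,
fn(sn, xn) becomes the sum over i of pi [Ci + f*n+1(i)]. The Python sketch below works through
one such stage calculation for a single state; the states, probabilities and contributions are
hypothetical.

# One stage of probabilistic DP: choose the decision that minimizes expected cost.
# For each decision xn, the system moves to state i with probability p,
# contributing c now and f*_{n+1}(i) thereafter (hypothetical numbers).
f_next = {1: 10.0, 2: 6.0, 3: 0.0}           # f*_{n+1}(i) from the previous iteration

# transitions[xn] = list of (next state i, probability pi, contribution Ci)
transitions = {
    "x_a": [(1, 0.5, 2.0), (2, 0.5, 4.0)],
    "x_b": [(2, 0.2, 1.0), (3, 0.8, 7.0)],
}

def expected_cost(xn):
    # fn(sn, xn) = sum over i of pi * (Ci + f*_{n+1}(i))
    return sum(p * (c + f_next[i]) for i, p, c in transitions[xn])

xn_best = min(transitions, key=expected_cost)   # optimal decision for this state
f_star_n = expected_cost(xn_best)               # f*_n(sn)
print(xn_best, f_star_n)                        # "x_b" with expected cost 7.0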
