Management Science Lecture Notes
• Decision Trees
Malcolm Gladwell
Blink: The Power of Thinking Without Thinking
Newsvendor Problem
• Newsvendor Phyllis orders newspapers for sale
ordered newspapers cost 20¢ each
she sells newspapers for 25¢ each
unsold newspapers are worthless
Prob(demand = j) = 1/5 for j = 6 to 10.
rij = profit (in cents) if i papers are ordered and demand is j.
rij = 25 i – 20 i = 5 i if i ≤ j
rij = 25 j – 20 i if i > j
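The two cases for rij can be written as one expression, since the number of papers sold is the smaller of i and j. The following is a minimal sketch; the function name `profit` and its keyword parameters are illustrative choices, not part of the original notes.

```python
def profit(i, j, price=25, cost=20):
    """Profit in cents when i papers are ordered and demand is j."""
    sold = min(i, j)  # cannot sell more than were ordered or demanded
    return price * sold - cost * i

orders = range(6, 11)
demands = range(6, 11)

# One row per order quantity, one entry per demand level 6..10.
payoff = {i: [profit(i, j) for j in demands] for i in orders}

for i in orders:
    print(i, payoff[i])
```

Each printed row should correspond to one row of the payoff table used with the decision criteria.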
                 Papers Demanded
Papers Ordered     6     7     8     9    10
       5          25    25    25    25    25
       6          30    30    30    30    30
       7          10    35    35    35    35
       8         -10    15    40    40    40
       9         -30    -5    20    45    45
      10         -50   -25     0    25    50
      11         -70   -45   -20     5    30
We now consider various metrics for measuring the quality of a solution. The
maximin criterion is very conservative, and arguably very pessimistic. For each
number of papers that you might order, we consider the worst possible profit. This
is stored in the column min (rij). We then choose the order that maximizes this
worst-case profit. In the newsvendor problem, this will always lead to ordering the
minimum possible demand for newspapers.
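As a rough illustration of the maximin rule, the sketch below computes the worst-case profit for each order quantity and then picks the order with the best worst case. The helper `profit` and the variable names are assumptions made for this example.

```python
def profit(i, j, price=25, cost=20):
    """Profit in cents when i papers are ordered and demand is j."""
    return price * min(i, j) - cost * i

demands = range(6, 11)
orders = range(5, 12)  # 5 through 11 papers

# Worst possible profit for each order quantity: the min (rij) column.
worst = {i: min(profit(i, j) for j in demands) for i in orders}

# Maximin: choose the order whose worst case is largest.
maximin_order = max(worst, key=worst.get)
print(worst)
print("maximin order:", maximin_order)
```

Ordering more than the minimum demand only adds papers that go unsold in the worst case, which is why the rule settles on the minimum demand.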
The Maximax Criterion
                 Papers Demanded
Papers Ordered     6     7     8     9    10    max (rij)
       6          30    30    30    30    30       30
       7          10    35    35    35    35       35
       8         -10    15    40    40    40       40
       9         -30    -5    20    45    45       45
      10         -50   -25     0    25    50       50
The maximax criterion is really dumb. For each number that you order, we consider
the best possible profit. This is stored in the column max (rij). Then you would
order the number that maximizes the max. For the newsvendor problem, you would
always order the maximum number of newspapers that can be sold.
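The maximax rule can be sketched the same way, replacing the worst case with the best case. The names `profit`, `best`, and `maximax_order` are illustrative and not from the notes.

```python
def profit(i, j, price=25, cost=20):
    """Profit in cents when i papers are ordered and demand is j."""
    return price * min(i, j) - cost * i

orders = range(6, 11)
demands = range(6, 11)

# Best possible profit for each order quantity: the max (rij) column.
best = {i: max(profit(i, j) for j in demands) for i in orders}

# Maximax: choose the order whose best case is largest.
maximax_order = max(best, key=best.get)
print(best)
print("maximax order:", maximax_order)
```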
The Minimax Regret Criterion
                 Papers Demanded
Papers Ordered     6     7     8     9    10
       6          30    30    30    30    30
       7          10    35    35    35    35
       8         -10    15    40    40    40
       9         -30    -5    20    45    45
      10         -50   -25     0    25    50
max_i rij         30    35    40    45    50
The minimax regret criterion is a somewhat weird criterion. It assumes that you
will always look back with regret on what you do. So, if you order 6 papers and the
number of customers is 8, you have a regret of $.10 because had you ordered
exactly 8 you would have made $.10 more. If you order 10 papers, and the number
of customers is 8, then you have a regret of $.40 since had you ordered only 8
newspapers, you would have made $.40 more.
max_i rij (best profit for each number demanded): 30 35 40 45 50
Here we list all of the regrets. Then we create a column of maximum regrets. The
minimax regret criterion is to order the number that minimizes the max regret. In
this case, we could order 6 or 7 papers. In both cases, the maximum possible regret
is $.20.
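The regret calculation can be sketched as follows: for each demand level, subtract the profit actually earned from the best profit available for that demand, then take the maximum regret in each row. This is a minimal sketch with illustrative names (`profit`, `regret`, `max_regret`), working in cents.

```python
def profit(i, j, price=25, cost=20):
    """Profit in cents when i papers are ordered and demand is j."""
    return price * min(i, j) - cost * i

orders = range(6, 11)
demands = range(6, 11)

# Best achievable profit for each demand level (the max_i rij row).
best_for_demand = {j: max(profit(i, j) for i in orders) for j in demands}

# Regret = best profit for that demand minus the profit actually earned.
regret = {i: [best_for_demand[j] - profit(i, j) for j in demands] for i in orders}

# Maximum regret for each order quantity, and the orders that minimize it.
max_regret = {i: max(row) for i, row in regret.items()}
smallest = min(max_regret.values())
minimax_orders = [i for i in orders if max_regret[i] == smallest]

for i in orders:
    print(i, regret[i], "max regret:", max_regret[i])
print("minimax regret orders:", minimax_orders)
```

With these numbers the regret for ordering 6 when demand is 8 is 10 cents, and for ordering 10 when demand is 8 it is 40 cents, matching the examples in the notes.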
The amount of water is just right. But the glass is twice as big as it should be.
The expected value (a horribly named term) is the average value one would obtain
if one could repeat this event an arbitrarily large number of times. It is a
standard term in probability, and is also called the “mean”.
It is also a very useful way of assessing the return of an investment, and is arguably
the most common measure of performance used.
Expected Value
                 Papers Demanded                  Expected
Papers Ordered     6     7     8     9    10      Value
       6          30    30    30    30    30        30
       7          10    35    35    35    35        30
       8         -10    15    40    40    40        25
       9         -30    -5    20    45    45        15
      10         -50   -25     0    25    50         0
If we changed the selling price from $.25 to $.30, the values rij would all change as
would the expected value of each number ordered.
Note that the expected value increases with the number of papers ordered and then
decreases. This always happens with newsvendor problems, and it is the key to the
solution approach.
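To see how the expected values respond to the selling price, one can recompute them for 25¢ and 30¢. The sketch below assumes the uniform demand distribution from the example; `expected_profit` is a hypothetical helper name, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def profit(i, j, price, cost=20):
    """Profit in cents when i papers are ordered and demand is j."""
    return price * min(i, j) - cost * i

def expected_profit(i, price, cost=20):
    """Expected profit when demand is 6..10, each with probability 1/5."""
    p = Fraction(1, 5)
    return sum(p * profit(i, j, price, cost) for j in range(6, 11))

orders = range(6, 11)
ev_25 = {i: expected_profit(i, 25) for i in orders}
ev_30 = {i: expected_profit(i, 30) for i in orders}

for i in orders:
    print(i, ev_25[i], ev_30[i])
```

At 30¢ the expected value rises from 6 to 7 papers and falls afterwards, which is the increase-then-decrease pattern described above.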
Back to the newsvendor problem
• Assumptions:
D = demand (random variable);
known probability p(d) that demand is d;
q is the amount ordered (decision variable);
c(q, d) = “profit” if q items are ordered and d items are demanded.
co = overstocking cost = ordering cost.
It is the unit cost of ordering too many.
In the previous example this was 20¢.
cu = understocking cost = price - ordering cost.
It is the unit “cost” of ordering too few.
In the previous example this was 25¢ - 20¢ = 5¢.
It is an opportunity cost.
The standard assumptions for a newsvendor problem are that you know the demand
distribution, you pay a cost for each item ordered, and you receive an amount for
each item sold.
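In order to determine the optimal solution, we focus on two values. The first value
is the overstocking cost co. This is the cost incurred for every newspaper purchased
that is not sold. In this case, it is 20¢. The second value is the understocking cost
cu. This is the profit forgone for every newspaper that could have been sold but was
not ordered. In this case, it is 25¢ - 20¢ = 5¢.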
Papers Ordered (q)    E(q)    MV(q)
        6              60       10
        7              64        4
        8              62       -2
        9              54       -8
       10              40      -14
Find the largest value of q such that MV(q) is positive.
Here are the expected values when the selling price of the newspaper is 30¢. We
then list the marginal values. The marginal value of purchasing q newspapers,
MV(q) = E(q) – E(q-1), is the contribution of the q-th newspaper to the expected
return. We note that MV(q) is a decreasing function of q. So, if we choose the
largest value of q for which MV(q) is positive, we will also be maximizing E(q). In
subsequent slides we focus on MV(q).
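The marginal values can be checked directly from the definition MV(q) = E(q) – E(q-1). The following is a small sketch for the 30¢ selling price, assuming the uniform demand on 6 to 10; the function and variable names are illustrative.

```python
def expected_profit(q, price=30, cost=20, demands=range(6, 11)):
    """Expected profit in cents when q papers are ordered (uniform demand)."""
    total = sum(price * min(q, d) - cost * q for d in demands)
    return total / len(demands)

# Marginal value of the q-th paper: MV(q) = E(q) - E(q-1).
mv = {q: expected_profit(q) - expected_profit(q - 1) for q in range(6, 11)}

# Since MV(q) is decreasing, the largest q with MV(q) > 0 maximizes E(q).
best_q = max(q for q in mv if mv[q] > 0)

for q in mv:
    print(q, expected_profit(q), mv[q])
print("optimal order:", best_q)
```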
Marginal Analysis
We will find the largest value of q such that MV(q) is positive.
MV(q) = E(q) – E(q-1)
Case 1: D ≤ q-1, with probability Prob(D ≤ q-1). Value decreases by co (MV(q) = -co).
Case 2: D ≥ q, with probability 1 - Prob(D ≤ q-1). Value increases by cu (MV(q) = cu).
To compute MV(q), we consider two cases. In the first case, the number demanded
is at most q-1, in which case the q-th newspaper is unsold. This leads to MV(q)
being –co. It occurs with probability Prob(D ≤ q-1).
In the second case, the number demanded is at least q. In this case, the q-th
newspaper is sold, and the profit increases by cu. This occurs with probability
Prob(D ≥ q) = 1 - Prob(D ≤ q-1).
Solving using marginal analysis
We can now compute the expected value of MV(q) using the probabilities.
MV(q) has positive expected value when Prob(D ≤ q - 1) < cu / (co + cu).
So, to maximize E(q), we find the largest value of q such that MV(q) is positive,
which is also the largest value of q such that Prob(D ≤ q - 1) < cu / (co + cu). This is
the solution to the newsvendor problem.
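The step from the two cases to the threshold condition can be spelled out as a short derivation. The notation below writes co and cu as c_o and c_u, and uses Pr for Prob; otherwise it follows the definitions in the notes.

```latex
\begin{align*}
E[MV(q)] &= -c_o \Pr(D \le q-1) + c_u \bigl(1 - \Pr(D \le q-1)\bigr) \\
         &= c_u - (c_o + c_u)\,\Pr(D \le q-1), \\[4pt]
E[MV(q)] > 0 &\iff \Pr(D \le q-1) < \frac{c_u}{c_o + c_u}.
\end{align*}
```

Because Prob(D ≤ q - 1) never decreases as q grows, the condition holds for all q up to some largest value and fails afterwards.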
Solution for Newsvendor Problem
Example 1.
ordering cost: 20¢ each
selling price: 25¢ each
co = 20¢
cu = 5¢
cu / (co + cu) = 1/5
Prob(demand = j) = 1/5 for j = 6 to 10.
Prob(D ≤ 6 - 1) = 0
Prob(D ≤ 7 - 1) = 1/5
In our original instance, 6 is the largest value of q such that Prob(D ≤ q - 1) < cu /
(co + cu) = 1/5. Note that Prob(D ≤ 6 - 1) = 0, and Prob(D ≤ 7 - 1) = 1/5. (It turns
out that 7 is an alternative optimum solution.)
Example 2 for Newsvendor Problem
Papers Ordered    Expected Value
       6                60
       7                64
       8                62
       9                54
      10                40
Prob(D ≤ 7 - 1) = 1/5
Prob(D ≤ 8 - 1) = 2/5
Optimum order is 7 newspapers.
If each newspaper is sold for $.30, 7 is the largest value of q such that
Prob(D ≤ q - 1) < cu / (co + cu) = 10/30 = 1/3. Note that Prob(D ≤ 7 - 1) = 1/5, and
Prob(D ≤ 8 - 1) = 2/5.
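Both examples can be reproduced with a short routine that applies the rule "largest q such that Prob(D ≤ q - 1) < cu / (co + cu)". This is a sketch under the assumptions of the notes (co equals the ordering cost, cu equals price minus ordering cost); `optimal_order` and the dictionary `pmf` are hypothetical names introduced here.

```python
from fractions import Fraction

def optimal_order(price, cost, pmf):
    """Largest q with Prob(D <= q - 1) < cu / (co + cu).

    pmf maps each possible demand value to its probability.
    """
    co = cost           # overstocking cost
    cu = price - cost   # understocking cost
    ratio = Fraction(cu, co + cu)

    best = None
    cumulative = Fraction(0)  # Prob(D <= q - 1) for the current q
    for q in sorted(pmf):
        if cumulative < ratio:
            best = q
        cumulative += pmf[q]
    return best

# Demand is 6..10, each with probability 1/5, as in the examples.
pmf = {d: Fraction(1, 5) for d in range(6, 11)}

print(optimal_order(25, 20, pmf))  # Example 1: selling price 25 cents
print(optimal_order(30, 20, pmf))  # Example 2: selling price 30 cents
```

Only demand values in the distribution are tested, because Prob(D ≤ q - 1) does not change between consecutive possible demand values.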
Examples of Optimal Ordering
Choose the greatest value of q such that Prob(D ≤ q - 1) < cu / (co + cu).
Juan Lee’s Decision Problem
It is January 10th, and Juan Lee is currently a fourth year
undergraduate in Management Science at Sloan. He has
decided to seek out a job as a consultant. He already has
received an offer from ABC consulting for $72,000 per year.
He has until February 1st to decide whether to accept the
offer. An old classmate of his, Mary Kumar, has told him
that she has recommended him highly to her consulting
firm, and feels that there is an excellent chance that they
would give him an offer for $80,000. However, they are not
prepared to make any decision until February 15th. If they
made him an offer, he would need to decide by March 1. He
also has the option of taking part in the consulting job fair in
the middle of March. He is fairly certain that he could get a
consulting job at that time, but is uncertain as to what he
would be paid.
Juan’s probabilities
• Juan does not know how to evaluate his
“excellent chances” at Mary’s firm, nor does he
know what to expect from the consulting fair.
A comment on probabilities
• Juan’s best guesses are not probabilities in the
sense of frequencies. They are subjective
probabilities.
A natural question is: can one use subjective probabilities in making calculations on
probabilities? What this question means is “will the answers be useful and
meaningful?”
In our class we will use the subjective probabilities as though they are the
probabilities we are used to. In practice, using subjective probabilities is far better
than ignoring the uncertainties altogether. And the value of using the subjective
probabilities depends a lot on whether Juan is reasonably good at expressing his
lack of certainty, and whether he is reasonably consistent in assigning probabilities
to events.
Decision Trees
• Method of organizing decisions over time in the
face of uncertainties
Decision nodes:
• Represented as boxes
• Lines coming from the nodes represent different choices.
Example: node A, with branches “Accept ABC” and “Reject ABC”.
Event nodes:
• Represented as circles
• Lines coming from the nodes represent different outcomes.
Example: node B, with branches “Mary’s Firm makes offer” (.6) and
“Mary’s Firm makes no offer” (.4).
There are two types of nodes of the decision tree: decision nodes and event nodes.
Decision nodes are represented as rectangles, and event nodes are represented as
ovals.
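[Figure: structure of the decision tree. Decision node A: accept or reject ABC.
Rejecting ABC leads to event node B: offer or no offer from Mary. An offer leads to
decision node C: accept or reject Mary's offer. Rejecting Mary's offer leads to event
node E, and no offer from Mary leads to event node D; each has the outcomes High
Salary, Medium Salary, and Low Salary.]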
First Juan has to make a decision on whether or not to accept ABC. If he accepts
ABC, the decision process ends for Juan. Otherwise, he next finds out whether
there is an offer or not from Mary. If there is an offer, he can accept it or not.
If Juan does not receive an offer from Mary, or if he receives one and rejects it, he
then moves to the job fair and learns which of three salaries he will get.
Step 2. Assign Probabilities to Events
[Figure: the decision tree with probabilities on the event branches. Node B: offer
from Mary .6, no offer from Mary .4. Nodes D and E: High Salary .1, Medium
Salary .5, Low Salary .4.]
The probabilities of getting the high, medium and low salaries are taken from a
previous slide.
[Figure: the decision tree with values (in thousands of dollars) at the endnodes.
Accept ABC: 72. Accept Mary's offer: 80. Job fair outcomes at nodes D and E: High
Salary 90, Medium Salary 70, Low Salary 60.]
The essence of calculating the values of nodes is to start at the end, and work backwards to the beginning. This
is because each end node reflects a possible decision path for Juan, including outcomes of events. We refer
to all of the nodes on the far right as “endnodes.”
For example, the second endnode from the top on the right has a value of 80. This node reflects the following
history:
1. Juan rejects ABC’s offer
2. Juan gets an offer from Mary
3. Juan accepts Mary’s offer.
As such, this is worth $80,000 to Juan, which is the value of the offer.
As another example, consider the fourth endnode with a value of 70. This reflects the following history:
1. Juan rejects ABC’s offer
2. Juan gets an offer from Mary
3. Juan rejects Mary’s offer
4. Juan goes to the job fair and is given a medium salary.
As such, this is worth $70,000, which is the value of the medium salary.
For some problems, one needs to look carefully at the full history to figure what the endnode is worth. For this
problem, it is more straightforward to compute the value of an endnode.
Step 4. Work Backwards and Evaluate
[Figure: the decision tree after working backwards. Endnode values: Accept ABC 72;
Accept Mary's Offer 80; job fair outcomes High Salary 90 (.1), Medium Salary 70 (.5),
Low Salary 60 (.4). Computed node values: E = 68, D = 68, C = 80.]
Once the endnodes are given values, we can work backwards (from right to left) and compute the
value of all other nodes.
There are two types of calculations, one for event nodes and one for decision nodes.
The value of an event node is an expected value calculation. For example, the value of node E is
.1 x 90 + .5 x 70 + .4 x 60, which is the expected value at node E. This value is $68,000 or 68 on the
decision tree. What this means is the following: If Juan rejects the ABC offer, obtains an offer from
Mary, and then rejects the offer from Mary, he has an expected annual salary of $68,000. (Recall
once again that “expected value” is a technical term from probability. Juan has no chance at all of
getting exactly $68,000 in this example.)
A natural question is whether we need to calculate the value of an event node by calculating its
expected value. The answer is that this is the most useful way of approaching decision trees, but it
can run into difficulties if the decision maker is either risk preferring (very unusual) or risk averse
(very common). We will show how one can deal with risk aversion in the next lecture. For now, we
will work with expected values.
The other type of calculation is for a decision node. For example, consider node C, which is a
decision node. At a decision node, Juan will choose the best decision, that is, the one with the
highest expected return. For node C, he has a choice of accepting Mary’s offer for $80,000, or going
to the job fair and getting an expected income of $68,000. In this case, Juan will take the $80,000.
So, at a decision node, the value is the max of the values of the different decisions.
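The two kinds of calculation (expected value at event nodes, maximum at decision nodes) can be sketched as a small recursive routine. The tuple encoding of the tree and the names `rollback`, `job_fair`, `node_c`, `node_b`, and `node_a` are assumptions made for this illustration; the numbers are those of Juan's tree, in thousands of dollars.

```python
from fractions import Fraction

def rollback(node):
    """Value of a node, working backwards from the endnodes."""
    kind = node[0]
    if kind == "end":
        return node[1]
    if kind == "event":
        # Expected value over the outcomes: sum of probability * value.
        return sum(p * rollback(child) for p, child in node[1])
    if kind == "decision":
        # Best choice: the maximum over the available decisions.
        return max(rollback(child) for _label, child in node[1])
    raise ValueError(f"unknown node type: {kind}")

# Job fair outcomes (event nodes D and E share these numbers).
job_fair = ("event", [
    (Fraction("0.1"), ("end", 90)),   # high salary
    (Fraction("0.5"), ("end", 70)),   # medium salary
    (Fraction("0.4"), ("end", 60)),   # low salary
])
node_c = ("decision", [
    ("Accept Mary's offer", ("end", 80)),
    ("Reject Mary's offer", job_fair),
])
node_b = ("event", [
    (Fraction("0.6"), node_c),        # Mary's firm makes an offer
    (Fraction("0.4"), job_fair),      # no offer from Mary's firm
])
node_a = ("decision", [
    ("Accept ABC", ("end", 72)),
    ("Reject ABC", node_b),
])

print("E and D:", rollback(job_fair))
print("C:", rollback(node_c))
```

Calling `rollback(node_b)` and `rollback(node_a)` in the same way evaluates the two remaining nodes.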
Evaluate Node E
Take the expected value.
.1 × 90 (High Salary)
+ .5 × 70 (Medium Salary)
+ .4 × 60 (Low Salary)
= 9 + 35 + 24 = 68
The next few slides treat the evaluations of different nodes.
Evaluate Node C
Evaluate Nodes A and B
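Exercise: determine the values for nodes B and A.
[Figure: partial decision tree. Accept ABC: 72. From node B, an offer from Mary (.6)
leads to node C with value 80, and no offer from Mary (.4) leads to node D with
value 68.]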
Key Aspects of a Decision Tree
Time flows from left to right
Branches from a decision node represent
decisions and take into account all decisions or
events leading to that node
This slide reflects an interesting use of subjective probability. People will often
change their probability assessments when they get new information. Here we
suppose that Juan will change his probability assessments if he gets the offer from
Mary’s firm. (If he is rejected by Mary’s firm, for some reason he does not change
his probability assessments. It may be Juan’s ego, or it may reflect reality.)
So, we assume here that the probability of a high offer increases to 60%, and the
probabilities of medium and low offers also adjust.
Illustration of new probabilities
[Figure: the decision tree with revised probabilities at node E (Juan rejects an
offer from Mary): High Salary .6, Medium Salary .3, Low Salary .1. Node D (no offer
from Mary) keeps High Salary .1, Medium Salary .5, Low Salary .4.]
Note that the probability of .6 for a high salary occurs at event node E. The event node E reflects the
following history:
1. Juan has rejected the ABC offer
2. Juan received an offer from Mary
3. Juan rejected the offer from Mary
Under these circumstances, Juan believes the probability of a high salary is .6.
If you look at node D, you will note that the probability there of a high salary is .1, which is Juan’s
probability if he gets no offer from Mary.
The expected value at node E is now .6 x 90 + .3 x 70 + .1 x 60 = 54 + 21 + 6 = 81.
So, at node C, the optimal decision is to reject the offer from Mary’s firm. It’s ironic that without the
offer, Juan’s expected salary would be $68,000 and the offer from Mary’s firm would be very
attractive. But after getting the offer, it now is optimal to reject it.
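The effect of the revised probabilities can be checked with a few lines. This is a minimal sketch; the dictionaries `prior` and `after_offer` and the helper `expected_salary` are illustrative names, with salaries in thousands of dollars.

```python
from fractions import Fraction as F

salaries = {"high": 90, "medium": 70, "low": 60}

# Juan's probabilities with no offer from Mary (node D) ...
prior = {"high": F(1, 10), "medium": F(5, 10), "low": F(4, 10)}
# ... and his revised probabilities after receiving an offer (node E).
after_offer = {"high": F(6, 10), "medium": F(3, 10), "low": F(1, 10)}

def expected_salary(probs):
    """Expected job fair salary under the given probabilities."""
    assert sum(probs.values()) == 1  # outcomes must be exhaustive
    return sum(probs[k] * salaries[k] for k in salaries)

node_d = expected_salary(prior)
node_e = expected_salary(after_offer)

# Decision node C: accept Mary's offer (80) or go to the job fair (node E).
node_c = max(80, node_e)
best_at_c = "reject Mary's offer" if node_e > 80 else "accept Mary's offer"

print(node_d, node_e, node_c, best_at_c)
```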
Key Aspects of a Decision Tree
• Branches from an event node represent a set of
mutually exclusive and collectively exhaustive
outcomes
But the tree is only the beginning
• Typically in decision trees, there is a great deal of
uncertainty surrounding the numbers.
Decision trees work well in such conditions
One of the great things about decision trees is that it is easy to change the numbers
and get new results. This type of sensitivity analysis can do a variety of things:
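As one illustration of such a sensitivity analysis, the sketch below varies the probability that Mary's firm makes an offer and recomputes the value of rejecting ABC. It assumes Juan's original job fair probabilities (expected value 68) and the offers of 72 and 80 from the example; the function name and parameters are hypothetical.

```python
from fractions import Fraction

def value_of_rejecting_abc(p_offer, mary=80, job_fair=68):
    """Expected value (in $1000s) at node B when the offer probability is p_offer."""
    node_c = max(mary, job_fair)  # with an offer, Juan takes the better option
    return p_offer * node_c + (1 - p_offer) * job_fair

abc = 72
for tenths in range(0, 11):
    p = Fraction(tenths, 10)
    value = value_of_rejecting_abc(p)
    choice = "reject ABC" if value > abc else "accept ABC"
    print(float(p), float(value), choice)

# Probability at which Juan is indifferent between the two choices.
break_even = Fraction(abc - 68, 80 - 68)
print("break-even probability:", break_even)
```

Under these assumptions, rejecting ABC is preferred only when the offer probability exceeds the break-even value, so the recommendation is not very sensitive to Juan's estimate of .6.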
Summary Conclusions
• Decision trees are a very useful technique for
mapping out sequential decisions under
uncertainty.