Topic 5 Lecture Notes
• Consider an extensive form game (possibly with simultaneous moves) with one extra player: nature.
• Nature is not strategic: it follows a predetermined, and often mixed, strategy.
• Nature is fictional: it is a way to model uncertainty about the structure of the game (the same as in Bayesian games).
Example: [Game tree: nature (N) draws player 1’s type, strong with probability p or weak with probability 1 - p; player 1 then chooses B or Q; player 2, without observing the type, chooses D or N after B and d or n after Q.]
Introduction
Extensive form games with asymmetric information combine asymmetric information and time dynamics within a single model.
Example
Consider two players, player 1 and player 2. Player 1 can be one of two types, strong or weak. Player 1 knows his own type, but his opponent, player 2, does not. The game proceeds as follows:
1. The fictitious player Nature chooses player 1’s type according to some predetermined probability distribution;
2. Player 1 makes a choice between B and Q (or randomizes between the two actions) after observing his type;
3. Player 2 observes player 1’s decision and draws some inference about the type of player 1;
4. Player 2 decides whether or not to challenge player 1 to a duel (D/d or N/n).
Asymmetric information arises because player 1’s type is not known by player 2. This asymmetric information is represented by the
information sets in the tree diagram. In other words, Player 2 cannot distinguish between the histories (strong, B) and (weak, B) and
between the histories (strong, Q) and (weak, Q).
Suppose that player 2 forms the belief that he is in the history (strong, B) with probability μ and in the history (weak, B) with probability (1 - μ). Player 2’s optimal action in the information set [(strong, B), (weak, B)] is
• D if μ < 1/2
• N if μ > 1/2
• Indifferent if μ = 1/2
We can apply the same idea to the information set [(strong, Q), (weak, Q)]. Suppose that player 2 forms the belief that he is in the history (strong, Q) with probability γ and in the history (weak, Q) with probability (1 - γ). Then, his optimal action based on his expected payoffs is
• d if γ < 1/2
• n if γ > 1/2
• Indifferent if γ = 1/2
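As a quick sanity check, here is a minimal Python sketch that computes player 2’s best reply in the Q information set from the payoffs in the tree (d gives player 2 a payoff of 0 against the strong type and 2 against the weak type; n gives 1 against either type); the probe values of γ are arbitrary illustrations:

```python
def best_reply_after_Q(gamma):
    """Player 2's best reply in the information set [(strong,Q),(weak,Q)].

    Payoffs from the tree: d gives 0 vs strong and 2 vs weak; n gives 1 vs either.
    gamma is the belief that the history is (strong, Q).
    """
    payoff_d = gamma * 0 + (1 - gamma) * 2   # expected payoff from d
    payoff_n = gamma * 1 + (1 - gamma) * 1   # expected payoff from n
    if payoff_d > payoff_n:
        return "d"
    if payoff_d < payoff_n:
        return "n"
    return "indifferent"

for g in (0.25, 0.5, 0.75):
    print(g, best_reply_after_Q(g))   # d, indifferent, n
```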
We refer to μ and γ as player 2’s beliefs; they are formed conditional on observing player 1’s action. In other words, player 2’s prior beliefs are p and (1 - p), and his posterior beliefs after observing player 1’s action become μ and γ. Using this notion of beliefs, we can use backward induction to solve the game.
The ingredients that pin down these beliefs are nature’s strategy (the prior) and player 1’s strategy after observing his type.
To summarize, player 2’s beliefs allow us to compute his optimal choices in the information sets. If we assume that player 2 updates his beliefs using Bayes rule, then we can discipline his beliefs using his opponent’s strategies. In other words, player 2 realizes that his opponent is rational and therefore plays some candidate equilibrium strategies, and his beliefs must come out as a result of his opponent playing those strategies.
Note that both μ and γ are endogenous variables, meaning that they are part of the equilibrium. Given this parametrization, we can solve the game using backward induction. We begin by finding player 2’s optimal action in each of the information sets and then consider player 1’s optimal choice.
Cases to Consider
There are three possible situations (the belief below, above, or equal to 1/2) in each of the information sets [(strong, B), (weak, B)] and [(strong, Q), (weak, Q)]. Since there is no direct connection between μ and γ, there is a total of 9 possible combinations. This means that we need to consider all 9 cases.
After finding player 2’s optimal action, we need to look for player 1’s optimal action.
Case One: μ < 1/2 and γ < 1/2. Here player 2 plays D after B and d after Q. If player 1 is strong, then playing Q would give him a payoff of 1 and playing B would give him a payoff of 2. Thus, it is optimal for type strong of player 1 to play B. If player 1 is weak, then playing Q would give him a payoff of 1 and playing B would give him a payoff of 0. Thus, it is optimal for type weak of player 1 to play Q.
After pinning down the optimal actions for the players, we need to verify that μ and γ satisfy the presumed conditions, i.e. μ < 1/2 and γ < 1/2.
In fact, we do not need to use Bayes rule explicitly in this case. A simpler approach requires realizing that type weak of player 1 will never choose B, thus (1 - μ) must be equal to zero and therefore μ = 1. The same idea applies to γ: since type strong never chooses Q, we can conclude directly that γ = 0.
We have arrived at a contradiction because we started by assuming that μ < 1/2, yet the computed value of μ is 1, which violates the condition. Therefore, we conclude that there is no equilibrium in this case.
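The same shortcut can be phrased as a generic Bayes-rule computation. A minimal Python sketch, with illustrative prior values (the function name is mine); `None` flags an off-path information set where the belief is unrestricted:

```python
def posterior_strong_given_B(p, b_strong, b_weak):
    """Bayes rule: Pr(strong | B) from the prior p and each type's probability
    of playing B. Returns None when B is off path (belief unrestricted)."""
    den = p * b_strong + (1 - p) * b_weak
    return p * b_strong / den if den > 0 else None

print(posterior_strong_given_B(0.4, 1.0, 0.0))  # case one: mu = 1.0 (contradiction)
print(posterior_strong_given_B(0.6, 1.0, 1.0))  # case two: mu = p = 0.6
print(posterior_strong_given_B(0.4, 0.0, 0.0))  # B never played: None, belief free
```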
Case Two: μ > 1/2 and γ < 1/2. Here player 2 plays N after B and d after Q, so both types of player 1 prefer B (the strong type gets 4 rather than 1, the weak type 2 rather than 1). By Bayes rule, μ is equal to
μ = Pr(strong | B) = p·Pr(B | strong) / [p·Pr(B | strong) + (1 - p)·Pr(B | weak)] = p
A simpler approach requires realizing that since both types of player 1 play B, player
2 receives no new information from observing player 1’s action. Thus, his posterior
belief μ is the same as the prior belief, i.e. player 1 is strong with probability p. An
equilibrium may exist if μ = p > 1/2. Note that the probability p is an exogenous
variable so it is given. Thus, as long as p > 1/2, the condition on μ is satisfied.
Now, consider the condition on γ. Since player 1 will never choose the action Q, the
information set [(strong, Q), (weak, Q)] will never be visited and is entirely off the
equilibrium path. This means that we do not need to place any discipline on the belief
γ and thus γ can be any value within the interval [0, 1]. However, we do need γ to be
less than 1/2 to satisfy the condition.
It is important to keep track of the conditions μ > 1/2 and γ < 1/2 because they determine player 2’s optimal actions in the information sets, which in turn affect player 1’s optimal actions. Thus, we must make sure that the beliefs are appropriate for the optimal actions that we have conjectured in the information sets.
In conclusion, an equilibrium exists in this case if μ = p > 1/2 and the belief γ is less than 1/2. The equilibrium is
• Both types of player 1 choose B
• Player 2 plays N in the set [(strong, B), (weak, B)] and d in the set [(strong, Q), (weak, Q)].
Case Three: μ < 1/2 and γ > 1/2
Here player 2 plays D after B and n after Q, so both types of player 1 prefer Q (payoff 3 rather than 2 or 0). We can conclude that γ is equal to p and no discipline is imposed on μ (the information set after B is off path). An equilibrium exists as long as γ = p > 1/2 and μ < 1/2.
Case Four: μ > 1/2 and γ > 1/2. Here player 2 plays N after B and n after Q, so type strong prefers B (4 over 3) while type weak prefers Q (3 over 2). This suggests that μ = 1 and γ = 0. We have arrived at a contradiction: the only belief consistent with these optimal actions is γ = 0, yet these actions are optimal only when γ > 1/2. Thus, there is no equilibrium such that μ > 1/2 and γ > 1/2.
Case Five: μ = 1/2 and γ = 1/2. We add the assumption that p > 1/2 because without this assumption there would be more sub-cases to consider.
In this case, player 2 is indifferent between D and N in the information set [(strong, B), (weak, B)] and indifferent between d and n in the set [(strong, Q), (weak, Q)]. This means that player 2 can play pure strategies or any mixtures of the actions.
We cannot proceed with the method used previously. Instead, we ask: when is it possible for μ and γ to be equal to 1/2? To answer this, we need to look at Bayes rule and the consistency requirement that makes μ and γ depend on the actions taken by the players in the game.
We begin by considering each of the following cases and see whether the condition μ = γ = 1/2 holds or not.
• Case 1: player 1 of different types choose different actions
In this case, player 2 would be able to distinguish player 1’s type by observing his action (e.g. if type strong of player 1 chooses B with probability 1 and type weak chooses Q with probability 1, then player 2 would assign μ = 1 and γ = 0).
In short, this case cannot happen because player 2’s beliefs will not be consistent with the condition μ = γ = 1/2.
• Case 2: player 1 of both types play the same action
In this case, player 2’s posterior beliefs will be equal to his prior beliefs. In other words, player 2 will assign either μ = p (if player 1
of both types play B) or γ = p (if player 1 of both types play Q).
Since we have assumed that p > 1/2, the beliefs μ = p > 1/2 or γ = p > 1/2 will not be consistent with the condition μ = γ = 1/2
and this case cannot happen.
• Case 3: type strong of player 1 randomizes between B and Q and type weak of player 1 chooses B with probability 1
In this case, player 2 would be able to distinguish player 1’s type when he finds himself in the information set [(strong, Q), (weak,
Q)] because the only route to this information set is that player 1 is of type strong and the realization of player 1’s randomization
is the action Q. It follows that Bayes rule is going to tell player 2 that γ = 1, which is not consistent with the condition that γ = 1/2.
• Case 4: both types of player 1 randomize between B and Q
This is the only case in which the condition can potentially hold. We now check whether or not an equilibrium exists under this case.
Suppose that player 2 plays D with probability θ and plays d with probability δ. Note that this is without loss of generality because the probabilities θ and δ can be zero or one (in which case player 2 plays a pure strategy) or strictly between zero and one (in which case player 2 randomizes). All these possibilities fall under the category of player 2’s optimal actions given beliefs that make him indifferent.
For player 1 to be willing to randomize, it must be the case that player 1 is indifferent between playing B and Q. In other words, each type of player 1 must receive the same payoff from playing B and playing Q. Then, we can write down the following indifference conditions:
2θ + 4(1 - θ) = δ + 3(1 - δ)   (type strong)
0·θ + 2(1 - θ) = δ + 3(1 - δ)   (type weak)
Since the RHS of the two equations are the same, the LHS of the two equations must also be the same. However, this is impossible: the first LHS is 4 - 2θ while the second is 2 - 2θ, so they differ by 2 for every θ.
In conclusion, we have arrived at a contradiction. This means that given the beliefs μ = γ = 1/2, we cannot find actions that are optimal for both players and at the same time consistent with the beliefs. Therefore, we conclude that under the additional assumption that p > 1/2, there is no equilibrium in this case.
Case Six: μ = 1/2 and γ < 1/2. The condition implies that player 2’s optimal action is d in the information set [(strong, Q), (weak, Q)], while he is indifferent in the set [(strong, B), (weak, B)]. Suppose that player 2 plays D with probability θ and N with probability (1 - θ).
In order to arrive at the belief μ = 1/2, it must be the case that both types of player 1
play B with positive probability. If only one type of player 1 plays B, then player 2
would be able to distinguish player 1’s type and generate the belief that μ = 0 or 1.
First, consider the optimal action for type strong of player 1. The player would
receive 1 from playing Q and some convex combination of 2 and 4 from playing B.
Generally, we would need to know the weights that player 2 places on actions D and
N. However, in this case, the lowest possible value of a convex combination of 2 and
4 is 2. Thus, we conclude that player 1 will always choose B when he is strong. It
follows that γ must be equal to zero. This is in agreement with the condition that γ <
1/2.
Type weak of player 1 would receive 1 from playing Q and some convex combination of 0 and 2 from playing B. We cannot conclude which choice is optimal because the payoff from playing B can be smaller or greater than 1 depending on the probability θ.
Assume that the weak type of player 1 plays B with probability β and Q with probability (1 - β), where β is an endogenous variable. We can pin down the probability β using Bayes rule (based on the condition μ = 1/2):
μ = p/(p + (1 - p)β) = 1/2, which gives β = p/(1 - p)
We also need to make sure that β, which is a probability, is within the interval [0, 1]:
β = p/(1 - p) ≤ 1, i.e. p ≤ 1/2
In short, an equilibrium is possible only when this inequality holds.
Then, we need to pin down the probability of player 2 choosing D, which is denoted by θ. To do this, we look at the incentives of player 1. Note that type weak of player 1 must randomize between Q and B. For this to happen, it must be the case that he gets the same payoff from playing B and Q. Then, we can write down the following expression and solve for θ:
0·θ + 2(1 - θ) = 1, which gives θ = 1/2
In conclusion, an equilibrium under the case μ = 1/2 and γ < 1/2 is described as follows:
• Type strong of player 1 plays B with probability 1
• Type weak of player 1 plays B with probability β = p/(1 - p), where p must be weakly smaller than 1/2 for the equilibrium to exist
• In the information set [(strong, Q), (weak, Q)], player 2 believes that player 1 is weak (i.e. γ = 0) and thus best replies with d
• In the information set [(strong, B), (weak, B)], player 2 believes that the likelihood of facing a strong player is the same as the likelihood of facing a weak player (i.e. μ = 1/2), and his optimal strategy is to play D with probability 1/2 and N with probability 1/2.
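A small Python check of this construction (the function name and the illustrative prior p = 0.3 are mine, not from the notes): it recovers β from Bayes rule and θ from the weak type’s indifference, and asserts that μ = 1/2 and the weak type is indeed indifferent:

```python
def case_six_equilibrium(p):
    """Semi-separating equilibrium of case six (requires p <= 1/2).

    Returns (beta, theta): the weak type's prob. of B and player 2's prob. of D.
    """
    beta = p / (1 - p)                     # from Bayes rule: mu = 1/2
    theta = 0.5                            # from weak type's indifference: 2(1-theta) = 1
    mu = p * 1 / (p * 1 + (1 - p) * beta)  # posterior Pr(strong | B)
    weak_B = 0 * theta + 2 * (1 - theta)   # weak type's payoff from B
    weak_Q = 1                             # weak type's payoff from Q (player 2 plays d)
    assert abs(mu - 0.5) < 1e-12 and abs(weak_B - weak_Q) < 1e-12
    return beta, theta

print(case_six_equilibrium(0.3))   # (0.4285..., 0.5)
```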
Case Seven: μ = 1/2 and γ > 1/2. Type weak of player 1 will receive 3 from playing Q and some convex combination of 0 and 2 from playing B. Since 3 is greater than any convex combination of 0 and 2, type weak of player 1 will always choose Q. If type strong played B with positive probability, player 2 would then be able to identify player 1’s type in the information set [(strong, B), (weak, B)]: consistency of beliefs would tell him that μ = 1, which contradicts the condition μ = 1/2. Thus, the only possibility for an equilibrium to exist in this case is that both types of player 1 always choose Q, so that no discipline is placed on the belief μ. If this happens, we can allow player 2 to form any belief μ; in particular, he can form the belief μ = 1/2 to meet the condition.
Now, we need to make sure that both types of player 1 will indeed play Q as the optimal action. We have already checked that a weak player 1 prefers playing Q; we now check this for a strong player 1. When player 1 is strong, he receives 3 from playing Q and some convex combination of 2 and 4 from playing B, where the combination depends on the probability θ of player 2 playing D. Thus, we need the following to hold:
3 ≥ 2θ + 4(1 - θ), i.e. θ ≥ 1/2
In conclusion, an equilibrium exists in this case if γ = p > 1/2. The equilibrium is described as follows:
• Both types of player 1 will choose Q
• In the information set [(strong, Q), (weak, Q)], player 2 will always play n
• In the information set [(strong, B), (weak, B)], player 2 will play D with probability θ ≥ 1/2
Note that in this case, there is a continuum of equilibria because any θ equal to or greater than 1/2 generates an equilibrium. Player 2 can also play θ = 1, in which case he plays a pure strategy.
Case Eight: μ > 1/2 and γ = 1/2. A strong player 1 receives 4 from playing B and some convex combination of 1 and 3 from playing Q. Thus, B is always the optimal action for a strong player 1. If the weak type played Q with positive probability, the information set [(strong, Q), (weak, Q)] would reveal his type, and consistency would force γ = 0, which violates the condition γ = 1/2. As in the previous case, the only possibility for an equilibrium is that both types of player 1 play B, so that no discipline is imposed on the belief γ. We have already checked that a strong player 1 plays B; we now need the condition for a weak player 1 to play B.
A weak player 1 receives 2 from playing B and some convex combination of 1 and 3 from playing Q. For B to be the optimal action for the weak type, 2 must be at least the expected payoff of playing Q. Suppose that player 2 plays d with probability δ and n with probability (1 - δ); then the following inequality must hold:
2 ≥ δ + 3(1 - δ), i.e. δ ≥ 1/2
By consistency of beliefs, μ is equal to p. Thus, an equilibrium exists if μ = p > 1/2. There is a continuum of equilibria in this case because there is an interval of weights that player 2 can put on playing d: any δ equal to or greater than 1/2 works.
Case Nine: μ < 1/2 and γ = 1/2. A weak player 1 receives 0 from playing B and some convex combination of 1 and 3 from playing Q. Thus, it is optimal for a weak player 1 to play Q. A strong player 1 receives 2 from playing B and some convex combination of 1 and 3 from playing Q; his optimal action depends on the probability of player 2 playing d.
If type strong of player 1 plays B with any positive probability, player 2 would be able to identify player 1’s type in the [(strong, B), (weak, B)] information set, implying μ = 1, which is inconsistent with the condition μ < 1/2. Thus, the only possibility for an equilibrium is for both types of player 1 to play Q, meaning that γ must be equal to p. It follows that p must be equal to 1/2 in order to meet the condition γ = 1/2.
Suppose that p = 1/2 so that the condition γ = p = 1/2 holds. We now need to make sure that playing Q is the optimal action for player 1 of type strong. Letting δ denote the probability of d, the following inequality must hold in equilibrium:
δ + 3(1 - δ) ≥ 2, i.e. δ ≤ 1/2
In conclusion, an equilibrium exists in this case if the prior belief is p = 1/2. There is a continuum of equilibria because any value of δ equal to or smaller than 1/2 generates an equilibrium.
• Belief
∆(I_i) is the collection of all possible probability distributions over the elements of the set I_i, and a belief is one of these probability distributions. An agent’s beliefs help him make decisions by telling him the probability of being at each of the histories within the information set. We can then find an agent’s optimal action by computing his expected payoff from different actions using his beliefs.
• Analogy between continuation values and beliefs
‣ We used the idea of continuation values in extensive form games without asymmetric information. A continuation value is a sufficient statistic for what is going to happen after an agent takes a certain action: specifically, the agent’s equilibrium payoff once he takes that action.
‣ A belief is a sufficient statistic for what has happened in the past once an agent finds himself in a particular information set.
‣ Continuation values and beliefs are similar in the sense that both are sufficient statistics, and we need to impose conditions on both. With continuation values, the condition is that they must be computed using the equilibrium strategies of everyone playing the game. With beliefs, the condition is that they must be computed using the equilibrium strategies of the players who acted before the point at which the agent makes his decision.
• System of beliefs
A system of beliefs is a collection of beliefs, one for each information set in the game.
There is a whole set of possible systems of beliefs because we can assign different beliefs to a single information set, and this generates multiple systems of beliefs.
• Assessment
An assessment consists of a strategy profile and a system of beliefs; it is a description of equilibrium in extensive form games with asymmetric information.
The assessment is a full description of play in the game: it shows what the players will do and what they believe.
Practical Advice
When solving extensive form games with asymmetric information, we should check all the conditions in the definition of weak sequential equilibrium (WSE).
Example
Sequential rationality requires checking that the actions at the four highlighted nodes are optimal. This gives four conditions. Consistency of beliefs requires checking that the beliefs are consistent at the two nontrivial information sets, which gives two more conditions. We do not need to check consistency at trivial (singleton) information sets because they have no degrees of freedom. In total, we have six conditions.
1. Sequential rationality at #1: γ < 1/2, player 2’s optimal choice is d.
2. Sequential rationality at #2: μ < 1/2, player 2’s optimal choice is D.
3. Sequential rationality at #3: 2 > 1, player 1’s optimal choice is B.
4. Sequential rationality at #4: 1 > 0, player 1’s optimal choice is Q.
5. Consistency of beliefs at #5: the belief is γ = 0, which satisfies γ < 1/2.
6. Consistency of beliefs at #6: the belief is μ = 1, which violates the condition μ < 1/2.
We arrive at a contradiction.
5.1 Mixed and behavioral strategies
[Game tree: player 1 chooses A or B; A ends the game with payoffs (3,1); after B, player 2 chooses a or b; a ends the game with payoffs (6,4); after b, player 1 chooses C, with payoffs (5,-3), or D, with payoffs (0,14).]
A mixed strategy is a single probability vector that shows how a player randomizes across all of his pure strategies in the game.
A behavioral strategy is a set of probability vectors, one for each decision node, that shows how a player randomizes every time he has to make a decision. In other words, a behavioral strategy concerns only the actions that are currently available to the agent.
In comparison, behavioral strategies are more convenient, and in games with perfect recall they are outcome equivalent to mixed strategies. We will be using behavioral strategies when solving extensive form games, but we may still use the term mixed strategy to name them.
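To illustrate the outcome equivalence for the tree above, here is a minimal Python sketch (the particular probabilities are arbitrary illustrations): it converts a behavioral strategy of player 1 into the product-weight mixed strategy and checks that both induce the same outcome distribution, holding player 2’s strategy fixed.

```python
from itertools import product

# Tree above: player 1 chooses A (payoffs (3,1)) or B; after B, player 2 chooses
# a (payoffs (6,4)) or b; after b, player 1 chooses C (5,-3) or D (0,14).
p_A, p_C = 0.6, 0.3   # player 1's behavioral strategy: Pr(A) first, Pr(C) later
p_a = 0.5             # player 2's behavioral strategy: Pr(a)

def outcomes_behavioral():
    return {"A": p_A,
            "Ba": (1 - p_A) * p_a,
            "BbC": (1 - p_A) * (1 - p_a) * p_C,
            "BbD": (1 - p_A) * (1 - p_a) * (1 - p_C)}

def outcomes_mixed():
    # Equivalent mixed strategy: mix over the four pure strategies (first move,
    # later move) with product weights.
    dist = {"A": 0.0, "Ba": 0.0, "BbC": 0.0, "BbD": 0.0}
    for m1, m2 in product("AB", "CD"):
        w = (p_A if m1 == "A" else 1 - p_A) * (p_C if m2 == "C" else 1 - p_C)
        if m1 == "A":
            dist["A"] += w                   # the later move is never reached
        else:
            dist["Ba"] += w * p_a
            dist["Bb" + m2] += w * (1 - p_a)
    return dist

assert all(abs(outcomes_behavioral()[k] - outcomes_mixed()[k]) < 1e-12
           for k in outcomes_behavioral())
```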
5.2 Beliefs
[Beer-quiche game tree; payoffs after Q are (1,0) and (3,1) against d and n for the strong type, and (1,2) and (3,1) against d and n for the weak type.]
Consider any nonsingleton information set I_i. Player i, who makes a decision at I_i, forms a belief about his exact location within the information set I_i.
Interpretation: if player i finds himself inside the information set I_i, he believes that history x ∈ I_i was played with probability μ(x).
5.3 Assessments
In extensive games with incomplete information we are interested both in players’ strategies and in their beliefs.
Sequential rationality: Each player’s strategy is optimal given her beliefs and the strategies of her opponents.
Consistency of beliefs: Beliefs are determined by Bayes rule whenever possible (i.e. in all information sets that are reached with positive probability according to the strategies of the players).
Pr(A | B) = Pr(A ∩ B) / Pr(B)
[Beer-quiche game tree with belief α placed on the history (strong, B) and 1 - α on (weak, B).]
α = Pr(str. | B) = Pr(str. ∩ B) / Pr(B) = Pr(str. ∩ B) / [Pr(str. ∩ B) + Pr(weak ∩ B)]
...whenever possible: if the information set is not reached (reached with probability zero),
any belief on this information set is consistent.
5.6 Beer-quiche game: Looking for equilibria
[Beer-quiche game tree with beliefs γ and 1 - γ on the Q information set and μ and 1 - μ on the B information set.]
5.7 Case 1
[Game tree for case 1: μ < 1/2 and γ < 1/2; strong plays B, weak plays Q.]
μ = 1 – not possible!
In case 4, similarly, you get that γ = 0 – not possible.
5.8 Case 2
[Game tree for case 2: μ > 1/2 and γ < 1/2; both types play B.]
s(str.) = B, s(weak) = B.
Consistency requires that μ = p. As long as p > 0.5 and γ < 0.5, this assessment is a sequential equilibrium.
Note that γ is a free parameter because there is no consistency requirement for it: no one plays Q in this equilibrium.
Properties: Both types select the same action, hence the action is not informative about the type: player 2’s posterior belief is the same as his prior belief (p, 1 - p).
Equilibria in which no additional information about the types is revealed to the uninformed player are called pooling equilibria.
5.9 Case 3
[Game tree for case 3: μ < 1/2 and γ > 1/2.]
5.10 Case 5
[Game tree for case 5: μ = 1/2 and γ = 1/2.]
Since μ = 0.5 and γ = 0.5, consistency implies that both types of player 1 are mixing. Also, case 5 implies that player 2 is mixing after observing any history.
Suppose player 2 plays D with probability θ and d with probability δ.
Then player 1’s indifference conditions are
2θ + 4(1 - θ) = δ + 3(1 - δ)
0·θ + 2(1 - θ) = δ + 3(1 - δ)
These cannot hold simultaneously (the left-hand sides differ by 2), so there is no equilibrium in case 5.
5.11 Case 6
[Game tree for case 6: μ = 1/2 and γ < 1/2.]
Player 2 is indifferent between playing D and N. Let him play D with probability α and N with the remaining probability.
Player 1 strictly prefers to play B if he is strong.
Suppose he plays B with probability β if he is weak.
Consistency:
0.5 = μ = Pr(B ∩ str.) / Pr(B) = Pr(B ∩ str.) / [Pr(B ∩ str.) + Pr(B ∩ weak)] = Pr(B | str.) Pr(str.) / [Pr(B | str.) Pr(str.) + Pr(B | weak) Pr(weak)]
or
β = p / (1 - p)
Indifference of the weak type: 2(1 - α) = 1, or α = 0.5.
Note that consistency also implies γ = 0, which is compatible with case 6 (γ < 0.5).
Properties:
In this equilibrium player 2 obtains some extra information which allows him to make a more precise inference of player 1’s type. In particular, he can say for sure that player 1 is weak if player 1 plays Q. He still cannot distinguish perfectly between the weak and strong types when he sees B played.
Such equilibria are called partially separating equilibria.
An equilibrium is called separating if the uninformed agent can make a precise inference (can guess without making a mistake) of his opponent’s type. There are no separating equilibria in this particular game.
Job market signaling
• ability θ is drawn from a two-point distribution: θ_L with prob. p and θ_H with prob. (1 - p)
• the labor market is competitive: firms pay their employees their expected productivity (and collect zero profits)
Benchmark
In equilibrium firms cannot distinguish between high ability and low ability employees, hence the wage is uniform:
w = E[θ] = pθ_L + (1 - p)θ_H
5.14 Education
• Before joining the labor force, agents can obtain some education e
• Education is costly: c(e, θ) = e/θ (easily generalizable)
• Note that the marginal cost of education is lower for high productivity agents
Things to note
1. The wage is a convex combination of θ_H and θ_L: w(e) = Pr(θ_H | e)·θ_H + (1 - Pr(θ_H | e))·θ_L. The implication is that the wage always lies between θ_L and θ_H.
2. This equation gives a one-to-one correspondence between the wage and the belief. In particular, we can solve for the belief from this equation. The implication is that if the firm has the belief, it can compute the wage; the opposite is also true, that is, from the wage we can back out the belief (i.e. what the firm believes about a person receiving that particular wage).
This has important implications because in this game it is more convenient to work with wages rather than beliefs. Instead of finding the equilibrium choices of the job candidates and the firms’ beliefs, we can solve for the equilibrium wage and the equilibrium choices of the job candidates. Once we know the wage for every education level and the choices of each type of agent, we can back out the beliefs and formulate the assessment (which includes the strategy profile and the system of beliefs) and thus the equilibrium.
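A minimal sketch of this wage-belief correspondence in Python, with hypothetical productivity values θ_L = 1 and θ_H = 2 chosen purely for illustration:

```python
THETA_L, THETA_H = 1.0, 2.0   # hypothetical productivity levels for illustration

def wage_from_belief(q):
    """Wage given the firm's belief q = Pr(theta_H | e)."""
    return q * THETA_H + (1 - q) * THETA_L

def belief_from_wage(w):
    """Invert the wage equation: q = (w - theta_L) / (theta_H - theta_L)."""
    return (w - THETA_L) / (THETA_H - THETA_L)

assert abs(belief_from_wage(wage_from_belief(0.37)) - 0.37) < 1e-12
```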
We are going to work with strategies and wage schedules (i.e. the function w(e)). Note that the wage schedule has to be defined for every education level, regardless of whether or not a particular education level is chosen in equilibrium. This is because beliefs have to be defined for every information set, no matter whether that information set is reached in equilibrium or not.
Example of Assessment (not necessarily an equilibrium)
There is a continuum of wages between θ_L and θ_H (the wage must be between these two values).
Suppose that the wage schedule is some arbitrary function w(e) (as drawn in the graph), and suppose that the low type chooses an education level e_L and the high type an education level e_H. With this information, we can construct an assessment.
• The strategy profile is: an agent of type θ_L picks e_L and an agent of type θ_H picks e_H.
• The firm offers this wage schedule w(e), which pins down the beliefs of the firm at each information set.
E.g. if the education level is e*, then the firm would offer the wage w*, and from this wage we can back out the belief at that information set. Since the function is defined everywhere, we can obtain the firm’s beliefs at each information set.
Note that the assessment above is not a weak sequential equilibrium (the function only gives a demonstration of what an assessment would look like in this game). It is easy to see why. Notice that the low types and the high types choose different levels of education, meaning that by consistency of beliefs the firm is able to identify the type of an agent. Specifically, when it observes e_L, the firm knows that the candidate is of the low type. As a result, the wage that the firm offers at e_L must be equal to θ_L, so the corresponding wage on the drawn wage schedule is inconsistent with the firm’s beliefs. Therefore, the assessment above cannot be an equilibrium assessment.
This game has many equilibria, so we are going to bundle them into classes. Equilibria within a class look similar, with some minor differences.
Pooling equilibrium
• WSE in which all types of job candidates choose the same action in equilibrium.
• In equilibrium, the education level of the low type equals the education level of the high type, i.e. e(θ_L) = e(θ_H).
Start with the case in which the education level is zero, i.e. e(θ_L) = e(θ_H) = 0. In this case, zero is the only information set that is visited on path, i.e. the only information set that occurs with positive probability. This means that at this information set we are bound by the consistency of beliefs: the posterior beliefs must be the same as the prior beliefs, because when everyone does the same thing the firm learns nothing from the observed education level. The prior beliefs are Pr(θ_L) = p and Pr(θ_H) = 1 - p. Thus, the wage when e = 0 is
w(0) = pθ_L + (1 - p)θ_H = E[θ] ≡ θ̄
Both conditions (sequential rationality and consistency of beliefs) are satisfied at the point e = 0. We now need to extend the wage schedule to education levels greater than zero. In the language of beliefs, we need to define the firm’s beliefs at all information sets (both on-path and off-path). We can draw an arbitrary function w(e) (i.e. the orange curve) and check whether it is an equilibrium. To do that, we need to examine the sequential rationality condition of the job candidates. The sequential rationality condition needed for the high type candidates is
θ̄ - 0/θ_H ≥ w(e) - e/θ_H for every e
The condition says that for e = 0 to be a rational choice, the payoff from choosing e = 0 must be at least the payoff from any other choice. We can solve for w(e) as follows:
w(e) ≤ θ̄ + e/θ_H
That is, w(e) needs to stay below the curve given by the function on the RHS, which we can draw on the diagram above. This is θ_H’s indifference curve. The interpretation is that type θ_H is indifferent between the point (e = 0, wage = θ̄) and any other point (i.e. any other pair of education and wage) on this curve (e.g. choosing some positive education and receiving a higher wage).
Sequential rationality means that the wage schedule (i.e. the orange line) has to stay below the indifference curve (i.e. the yellow line) for every education level e. If this is violated because the orange curve lies above the yellow line for some values of e, we need to adjust the firm’s beliefs or, equivalently, the wage offered (i.e. make the firm more pessimistic about the job candidates’ types). The adjusted curve (i.e. the green curve) satisfies sequential rationality for the high type.
The sequential rationality condition needed for the low type candidates is
θ̄ ≥ w(e) - e/θ_L, i.e. w(e) ≤ θ̄ + e/θ_L for every e
θ_L’s indifference curve is steeper than θ_H’s indifference curve (the yellow line), so for positive education levels it lies above the high type’s curve, and a wage schedule that satisfies the high type’s condition also satisfies the low type’s. The interpretation is that the firm has to be sufficiently pessimistic about the ability of a candidate who gets some positive education. The intuition is that if the firm were optimistic, then one of the types, the high type in this case, would be eager to deviate from zero to some positive e: although deviating means bearing more cost, the candidate would obtain the benefit of a higher wage.
The pooling class contains many equilibria. For example, ŵ(e) is also a pooling equilibrium in which, on the equilibrium path, both the high and the low type select e = 0. In this equilibrium, the firm is more pessimistic about the ability of candidates who get e > 0 than in the equilibrium above.
From the point of view of the beliefs, we know that all information sets for all e > 0 are not reached on path, therefore we are not bound
by the condition of consistency of beliefs. Thus, we can select any beliefs without violating any equilibrium conditions. This leads to a
huge set of equilibria.
This type of equilibrium is called pooling equilibrium because different types of agents select the same action (pooled together in one
action) and the firm learns nothing by observing their education level in equilibrium. In other words, the posterior belief is the same as
the prior belief.
Can we have a pooling equilibrium where the education level that is chosen by both types of candidates is e, where e is strictly positive?
In other words, can we have e(θ_L) = e(θ_H) = e* > 0?
By consistency of beliefs on path, the wage offered at e = e* has to be equal to the average ability, i.e. θ̄. We need to complete the wage schedule for all education levels and check that sequential rationality for the candidates is satisfied.
We can draw two indifference curves through (e*, θ̄), one for each type of candidate. The indifference curve for the high type is flatter than that for the low type because of the denominator in the cost function. The intuition is as follows. The two curves intersect at (e*, θ̄). Ask the high type how much of a wage premium he would need to increase his education level by one unit. The wage premium that keeps the high type indifferent between choosing e* and (e* + 1) is (1/θ_H), a small number since θ_H is large. The low type would need to be compensated by (1/θ_L), a larger number. Thus, the indifference curve for θ_L is steeper than that for θ_H because education is more costly for the low type.
Anything that lies below the indifference curve for θ_H is worse than the point (e*, θ̄) for the high type.
Anything that lies below the indifference curve for θ_L is worse than the point (e*, θ̄) for the low type.
Together, everything below the lower envelope of the two indifference curves (the θ_L curve to the left of e*, the θ_H curve to the right of e*) is worse for both candidates than the point (e*, θ̄). This means that any wage schedule below this envelope (e.g. the blue line) is an equilibrium: no one has an incentive to deviate, everyone’s action is sequentially rational, and the only information set that is reached on path satisfies the consistency of beliefs.
Can we always find an equilibrium using this procedure? How large can e* be such that we can use the steps above?
Clearly, we cannot make e* arbitrarily large. The construction works as long as there is space to draw the wage schedule between the lower bound of the wage corridor (w ≥ θ_L) and the envelope of indifference curves. If for every level of education there is some space to draw the wage schedule, then we can construct an equilibrium. However, if there is no space, the wage schedule would have to wander above the indifference curve of some type, and that type would then have a profitable deviation.
We need to calculate the height of the intersection of the indifference curve of θ_L with the y-axis. To do this, we use the observation that the y-intercept lies on the same indifference curve for θ_L as the point (e*, θ̄): the intercept is θ̄ - e*/θ_L, and it stays above θ_L if and only if
e* ≤ θ_L(θ̄ - θ_L) = θ_L(θ_H - θ_L)(1 - p)
Any education level below this threshold can be sustained as a pooling equilibrium using this construction.
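A one-line computation of this threshold in Python (the function name and the parameter values are illustrative):

```python
def max_pooling_education(p, theta_L, theta_H):
    """Largest e* sustainable as a pooling equilibrium in this construction:
    the low type's indifference curve through (e*, w_bar) must not dip below
    theta_L at e = 0, i.e. w_bar - e*/theta_L >= theta_L."""
    w_bar = p * theta_L + (1 - p) * theta_H
    return theta_L * (w_bar - theta_L)   # = theta_L*(theta_H - theta_L)*(1 - p)

print(max_pooling_education(0.5, 1.0, 2.0))   # 0.5
```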
Separating Equilibrium
• In equilibrium, different types of agents will choose different actions, i.e. e(θ_L) ≠ e(θ_H).
• By looking at the action chosen, the firm is able to infer the type of the agent exactly, without any noise.
From the perspective of the formalism for sequential games with asymmetric information, having a separating equilibrium means that consistency of beliefs at the information set e(θ_L) implies that the firm believes the agent is of the low type with probability one, and consistency of beliefs at the information set e(θ_H) implies that the firm believes the agent is of the high type with probability one. This means that the wage offered when education level e(θ_L) is observed is equal to θ_L, and the wage offered when education level e(θ_H) is observed is equal to θ_H. This gives part of the description of the equilibrium.
What kind of education level can we sustain in equilibrium for the low types? Can the education level be strictly positive?
The answer is no. In a separating equilibrium, the low type is paid a wage of θ_L. If the low type selected a positive level of education, he could make a profitable deviation by choosing e = 0, which would reduce the cost of education and potentially increase the wage received, but would definitely not reduce the wage below θ_L (the impact on the wage depends on the firm’s beliefs). In short, in a separating equilibrium, the only option for the low type is e_L = 0, where e_L is the education level chosen by the low type. The wage paid to the low type is θ_L.
We then need to figure out the education level chosen by the high type. In equilibrium, the high type should have no profitable deviation. Can the high type profit by switching from e(θ_H) to e(θ_L) = 0? To answer this, we draw the indifference curve for the high type through the point (0, θ_L). What education levels can the high type accept such that deviating to the point (0, θ_L) is not tempting? Anything above this indifference curve is better than (0, θ_L). But consistency of beliefs tells us that we can only be on the line w(e) = θ_H: the only wage the high type can get in a separating equilibrium is θ_H. The range of education levels at which the wage θ_H lies above this indifference curve is given by the orange line. For example, if we pick e = e_H within this interval, the high type does not want to deviate from e_H to e_L = 0. This means that we have ruled out at least one profitable deviation.
Does the low type want to deviate from e_L to e_H? To answer this, we draw the indifference curve for the low type. For the point e_H not to be attractive to the low type, we need this point to lie below the indifference curve for θ_L. Thus, the options available for the high type shrink from the entire interval (i.e. the orange dashed line) to the green interval.
If we focus on the two education levels e_L = 0 and e_H within the green interval, we achieve self-enforced separation between the high and the low types.
We know that (e_L = 0, θ_L) and (e_H, θ_H) are on the wage schedule. We now need to extend the wage schedule to fill in the gaps, making sure that there is no profitable deviation. One way to do this is to pay the wage θ_L to everyone except the candidates who get exactly the education level e_H (i.e. the purple line). The wage schedule is then discontinuous at e = e_H.
The purple line (in the LHS panel) is an equilibrium because the low type is not willing to deviate anywhere else: deviating incurs a cost of education but does not increase the wage, and we already know that the low type does not want to deviate to e_H even though the wage is higher there. The high type does not want to deviate either, because deviating reduces his wage. His most tempting alternative is (0, θ_L), but by construction (e_H, θ_H) lies weakly above the high type’s indifference curve through (0, θ_L), so the deviation is not profitable.
There are other equilibria. In order to characterize all separating equilibria, we draw another indifference curve for the high type, through the point (e_H, θ_H) (i.e. the red line). In our particular setup, this curve is parallel to the one through (0, θ_L). The wage schedule needs to stay below the blue dashed line, and this is sufficient for an equilibrium. This works because the low type chooses (0, θ_L) on path and everything else on offer lies below the indifference curve for θ_L, so the low type does not deviate. The high type chooses e_H on path and everything else lies below the red indifference curve for θ_H, so anything else on the wage schedule is worse than e = e_H and the high type does not deviate either.
The only thing left to check is whether consistency of beliefs holds. It does: at e = 0 the firm believes that the candidate is of the low type, and at e = e_H the firm believes that the candidate is of the high type.
The intuition is that by deviating to e_H, the low type can make the firm believe that he is the high type (since the firm believes that whoever chooses e_H is the high type). This would give the low type the higher wage θ_H. However, pretending to be the high type incurs the cost of obtaining the education level e_H. In order for this not to be a profitable deviation, we need
θ_L ≥ θ_H - e_H/θ_L, i.e. e_H ≥ θ_L(θ_H - θ_L)
This inequality is called the incentive compatibility constraint for the low type. For separation to be achieved in equilibrium, the low type must have no incentive to pretend to be the high type.
For the high type not to want to pretend to be the low type, we need
θ_H - e_H/θ_H ≥ θ_L, i.e. e_H ≤ θ_H(θ_H - θ_L)
This inequality is the incentive compatibility constraint for the high type.
The idea is that once we find the education levels that satisfy these constraints, we can always construct the rest of the wage schedule to fill in the gaps and fully characterize the equilibrium. Because we are not bound by consistency of beliefs at e ∉ {e_L, e_H}, we have a lot of freedom in extending the wage schedule to off-path points. All we need to do is make the firm sufficiently pessimistic off path, so that deviating to any other point is not profitable.
As mentioned above, the two incentive compatibility constraints give the interval of e_H (i.e. the green line) that supports a separating equilibrium:
e_H ∈ [θ_L(θ_H - θ_L), θ_H(θ_H - θ_L)]
Note that both inequalities are expressed in terms of e_H because e_L = 0. Any point in this interval gives rise to a separating equilibrium.
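The interval can be computed directly; a minimal Python sketch with illustrative parameter values:

```python
def separating_interval(theta_L, theta_H):
    """Education levels e_H sustaining separation (with e_L = 0):
    low type's IC:  theta_L >= theta_H - e_H/theta_L
    high type's IC: theta_H - e_H/theta_H >= theta_L"""
    return theta_L * (theta_H - theta_L), theta_H * (theta_H - theta_L)

print(separating_interval(1.0, 2.0))   # (1.0, 2.0)
```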
5.16 Pooling equilibria
• beliefs on path (e = e^P): the posterior coincides with the prior
• beliefs off path (e ≠ e^P): assume the firm believes the candidate is of the low type, so w(e) = θ_L
Type θ does not deviate from e^P if
pθ_L + (1 - p)θ_H - e^P/θ ≥ θ_L - e/θ for all e
This set of inequalities (binding at e = 0 for θ = θ_L) implies that
e^P ≤ θ_L(θ_H - θ_L)(1 - p)
5.17 Separating equilibria
Agents signal their ability to firms using education:
• Let low productivity agents choose e_L and high productivity agents choose e_H, with e_H ≠ e_L
• Consistency implies that firms can distinguish between high and low ability by looking at education:
w(e_L) = θ_L and w(e_H) = θ_H
• History e ∉ {e_L, e_H} is reached with prob. 0, therefore beliefs there are free: for now assume that Pr(θ_H | e ∉ {e_L, e_H}) = 0, hence for all e ∉ {e_L, e_H}: w(e) = θ_L
• Low productivity type:
w(0) ≥ w(e ≠ e_H) - e/θ_L
hence e_L = 0.
• Low type does not want to pretend to be a high type and vice versa:
θ_L ≥ θ_H - e_H/θ_L
θ_H - e_H/θ_H ≥ θ_L
or
e_H ∈ [(θ_H - θ_L)θ_L, (θ_H - θ_L)θ_H]
5.18 Cheap talk
• The advisor knows the state of the world and the policymaker does not.
• Policymaker’s payoff:
u_p(x, y) = -(x - y)^2
• Advisor’s payoff:
u_a(x, y) = -(x - y - b)^2
where b is his bias (the larger the b, the larger the conflict of interest between A and P).
Cheap Talk
Nature chooses the state of the world out of a continuum U[0, 1]. There is an advisor (A) and a policy maker (P), and there is an unknown
state of the world that P is trying to match with the policy. For instance, suppose that there is an ideal tax rate that is unknown by the
policy maker. The policy maker would like to set the tax rate equal to the ideal rate. A is the informed party and knows the ideal tax rate.
But A has a slightly different preference than P. We want to answer the following question: is it possible for A to pass the information to
P? Will there be loss of information in the communication between A and P?
Nature selects the state of the world, denoted by x, according to a uniform distribution over the interval [0, 1].
Player 1 is the policy maker (P)
• P needs to choose a policy y, where y ∈ [0, 1].
Player 2 is the advisor (A)
• A is informed about the state of the world x.
• A can send a message m to P, where m ∈ [0, 1].
We can generalize the message m to any space of messages later.
Payoffs
• P’s payoff is U(y, x) = -(x - y)^2
The utility function is a quadratic loss function. If the state of the world is x, then ideally P would want to set y = x, which gives a payoff of zero. In any other case, he receives a negative payoff where the loss is the squared difference between the true state of the world and the policy that is implemented.
The policy maker wants to get as close as possible to the true state of the world.
• A’s payoff is V(y, x) = -(x - b - y)^2, where b > 0
The number b is commonly known by the players and represents the bias of A. The ideal point for A is (x - b); i.e. he wants a policy that is biased relative to the ideal policy for the policy maker, the true state of the world.
The bias introduces a conflict of interest between the two players.
• Note that the payoffs of both players do not depend directly on the message (i.e. the action taken by the advisor): the message m does not enter the utility functions. The model is called cheap talk because sending different messages has no direct consequences for the advisor’s payoff.
Role of Message
P tries to improve his belief about x by looking at the message. How much of the message sent by A is credible (i.e. such that P can believe that information and make his decision accordingly)?
We start with the incentives of P in some abstract history where he receives some message that contains some information. Given the
information that P has, what is the best policy that he can implement (i.e. optimal choice of y)?
In the information set m (i.e. after the message m has been observed), P needs to solve
max_y E[-(x - y)^2 | m]
This is sequential rationality for P: agents have to act optimally given their beliefs. So we take the beliefs of P at this information set as given and find the optimal action.
To solve the optimization problem, we find the FOC. Since expectation is a linear operator, we can take the derivative inside the expectation:
d/dy E[-(x - y)^2 | m] = E[2(x - y) | m] = 0
Since y is the action of P, it is not random, and we can take it outside the expectation (note that x is random):
2(E[x | m] - y) = 0, so y*(m) = E[x | m]
Let y* denote the optimal policy. The policy y*(m) is a function of the message m that P receives and equals the expected value of x conditional on m. This means that P updates his beliefs about x given the message m and then sets the policy equal to the expectation of x with respect to those beliefs.
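A quick numeric illustration that the conditional mean is the quadratic-loss minimizer; the Beta-distributed posterior below is an arbitrary stand-in for whatever beliefs P holds after m:

```python
import random

# Samples from an arbitrary (illustrative) posterior of x given m.
posterior = [random.betavariate(2, 5) for _ in range(10_000)]
mean = sum(posterior) / len(posterior)
loss = lambda y: sum((x - y) ** 2 for x in posterior) / len(posterior)
# The conditional mean beats nearby alternatives under quadratic loss.
assert loss(mean) <= min(loss(mean - 0.1), loss(mean + 0.1))
```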
With this result, we can investigate the incentives of A. Then, we can connect A’s problem with the consistency of beliefs for P, because once we know the strategy of A, we can use Bayes rule to calculate the beliefs needed to compute the expectation above.
The model has multiple equilibria, which we bundle into classes.
Babbling Equilibrium
Consider the assessment in which A sends a message that is independent of the state, e.g. m | x ~ U[0, 1], and P plays y = 1/2 after every message. The equilibrium is called babbling because the message m contains no information about x.
Observations
• Every information set is reached with positive density (we do not say positive probability because x has a continuous distribution). As a result, we can use Bayes rule in each information set, meaning that P’s beliefs are pinned down by the consistency condition.
• Independence implies that P believes that x is distributed uniformly on [0, 1] after every message: m contains no information, so the posterior belief is the same as the prior belief.
We can now walk through all the conditions that must be satisfied for the assessment proposed to be a WSE.
• Consistency of beliefs
The belief x | m ~ U[0, 1] is consistent because m contains no information about x: applying Bayes rule at every information set returns the same distribution of x.
This condition is satisfied.
• Sequential rationality for P
This condition is satisfied because y*(m) = E[x | m] = 1/2 follows from the general rule for P derived above using sequential rationality.
• Sequential rationality for A
Given that P plays y = 1/2 after every message, A’s payoff is
V = -(x - b - 1/2)^2
regardless of the message that A sends under this strategy profile. Therefore, any action is a best reply, including the mixed strategy of sending a random message that is independent of x.
Therefore, the assessment proposed is a babbling equilibrium. The intuition is that A says something meaningless and P pays no attention to m. The best that P can do is to rely on the prior belief and choose y = 1/2 (i.e. the expected value of x). And because P pays no attention to A, A may as well say something meaningless. This is a WSE.
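A quick Monte Carlo check of P’s babbling payoff (the value -1/12 is just -Var(x) for x ~ U[0, 1]):

```python
import random

# Under babbling, P plays y = 1/2; his expected payoff is -E[(x - 1/2)^2]
# = -Var(x) = -1/12 for x ~ U[0, 1].
xs = [random.random() for _ in range(100_000)]
print(-sum((x - 0.5) ** 2 for x in xs) / len(xs))   # approx -0.0833
```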
Alternative Equilibrium
We can make one modification of A’s strategy: m(x) = m*.
A’s strategy is now a mapping from every x to some given message m*. This strategy also contains no information about x; the difference is that it is deterministic while the one above is random. Since m contains no information, P’s strategy is the same as above, and A’s payoff is independent of his message, so this strategy is also a best reply.
However, when we formulate an equilibrium like this, we need to be careful about the beliefs. In the equilibrium above, every belief was pinned down by consistency because every information set was visited with positive density. Now, the only information set visited on the equilibrium path is m = m*. For this information set, we know that x | m* ~ U[0, 1]. Every other information set is not visited, so consistency of beliefs does not apply; nevertheless, we need to specify some beliefs there. We can supplement with the same beliefs everywhere else, i.e. x | m ~ U[0, 1]. We do this because we want to make sure that, at every other information set, the optimal action of P remains the same. If P’s optimal action changed at some information set, then A might have a profitable deviation from m* to some other m: if P listens to A in some cases, A may be tempted to change his message under certain circumstances. This is not allowed in a babbling equilibrium.
There are many other alternative strategies that A can follow. For example, we can write down the same strategy for A but supplement the off-path information sets with different beliefs (any belief whose mean is 1/2, e.g. a point mass at x = 1/2).
This also works because under these beliefs the expectation of x is still 1/2, so the optimal choice y* does not change. We can do this because consistency of beliefs does not apply to off-path information sets. Thus, we have a high degree of freedom in modifying the beliefs.
Separating Equilibrium
Can we have a separating equilibrium in this set up?
The definition of a separating equilibrium is that each type of the informed agent takes an action that is unique to that type, so that when the uninformed agent observes the action, he can correctly guess the type of the agent that has taken it. In this model, a separating equilibrium would require A to send a message m that enables P to guess the correct state of the world. This means that all the information that A has would be passed to P without any loss.
A separating equilibrium does not exist in this model: with the bias b, it is impossible for A to pass all his information to P. Specifically, because A’s preference differs from P’s preference, A cannot credibly convey all the information that he has.
Proof
We first consider a simple example. Suppose that A’s strategy is to report m(x) = x. In this case, A has a profitable deviation at some point. Suppose that the realized state is x̂. Under this strategy, P receives m̂ = m(x̂) = x̂.
Given A’s strategy and by consistency of beliefs, P believes that x is equal to m̂ with probability 1 after receiving the message m̂. Then he chooses y*(m̂) = E[x | m̂] = x̂. The payoff that A receives is
V = -(x̂ - b - x̂)^2 = -b^2
Instead of sending m̂, A can send m̃ = x̂ - b. By doing this, A makes P believe that the true state is x̂ - b and choose y = x̂ - b, which is A’s ideal policy and gives him a payoff of 0 > -b^2.
What if we adjust A’s strategy for the bias, i.e. m(x) = x - b? The problem is that if A adjusts for the bias, then P can roll back the adjustment: observing m, he infers x = m + b, and A again profits by reporting as if the state were x̂ - b. More generally, suppose the strategy is an arbitrary invertible function m(x).
Consider an arbitrary function where, say, some high messages correspond to some medium states of the world and some low messages correspond to some high states of the world. The function need not be monotone; the only condition required is that m(x) is invertible. This must be the case because, in order to have a separating equilibrium, A’s strategy must have the following property: for every message m, P can unambiguously recover the state of the world that m corresponds to. Thus, the function m(x) must be invertible.
Then the function m^{-1}(·) decodes A’s strategy: m^{-1}(m(x)) = x. The argument from the simple example then goes through unchanged. Let the generic message be m̂ = m(x̂). Since the strategy is invertible, P knows that m̂ corresponds to x̂. To gain a higher payoff, A wants to make P believe that the state of the world is (x̂ - b) instead of x̂, and A achieves this by sending the message m̃ = m(x̂ - b) instead of m̂.
In conclusion, in order to have a separating equilibrium, the function m(x) must be invertible. However, for any invertible strategy, there is
always a profitable deviation for at least one type of the advisor. Thus, we cannot achieve full separation in equilibrium. The intuition is
that A would not be able to convey all the information that he has. Instead of trying to achieve full separation, we can try to find an
equilibrium in which only some information is transmitted.
Partially Separating Equilibrium
Consider a threshold strategy for A: he sends the message m_1 (‘high’) if x is above some threshold x* and the message m_0 (‘low’) if x is below it. The names ‘high’ and ‘low’ are for exposition purposes only; they have no implications for the solution.
As long as the message space is large enough, the names of the messages carry no intrinsic meaning. The values of these messages are purely endogenous, so we can denote the messages ‘high’ and ‘low’ or anything else instead of ‘m_1’ and ‘m_0’. What happens is that A codes the state of the world into messages, and P then decodes them back into information about x.
Given these messages, we first check the consistency of beliefs for P. Then we use sequential rationality for P to figure out what policies he implements following the messages ‘high’ and ‘low’.
Consistency of beliefs:
x | m_1 ~ U[x*, 1]
This condition says that P believes that the true state of the world is distributed uniformly on [x*, 1] conditional on the message ‘high’. This is where P decodes the message received and translates it into information about x.
This is the case because every x above x* is translated into the message ‘high’ and none of the states below x* is. Starting from the uniform prior, P’s updated belief following A’s strategy is a truncation of the uniform distribution.
Similarly,
x | m_0 ~ U[0, x*]
We can include x* in the intervals for both messages because x is a continuous variable, so the probability that x = x* is zero; to define the (truncated uniform) random variable we take the support to be a closed set. P’s beliefs are consistent with A’s strategy, and they give rise to the optimal policies
y*(m_0) = x*/2 and y*(m_1) = (1 + x*)/2
We have checked the consistency of beliefs and the sequential rationality conditions for P. We now need to check that the strategy m(x) is sequentially rational for A. To do that, we ignore all other available messages for now and assume that the only messages available to the advisor are ‘high’ and ‘low’. If A is rational, how would he choose between ‘high’ and ‘low’?
From the point of view of A, the payoff he can attain is the upper envelope of the two payoff parabolas -(x - b - x*/2)^2 and -(x - b - (1 + x*)/2)^2. This is because A knows the true state of the world, so he sends ‘low’ whenever it is better to do so and ‘high’ whenever it is better to do so.
The intersection of the two parabolas is where A is indifferent between sending ‘high’ and sending ‘low’. To the right of this value of x he prefers to send ‘high’, and to the left he prefers to send ‘low’. This means that the intersection point gives the threshold value x*.
Given this idea, we can write down the sequential rationality condition for A. The condition says that at x*, A must be indifferent between sending ‘high’ and sending ‘low’; he prefers to send ‘high’ at any point to the right of x* and ‘low’ at any point to the left of x*. Mathematically:
-(x* - b - x*/2)^2 = -(x* - b - (1 + x*)/2)^2
Solving the equation gives two solutions (since this is a quadratic equation). However, only one solution makes sense:
x* = 1/2 + 2b
This is an equilibrium as long as the threshold is meaningful, that is, the threshold lies inside the interval [0, 1]. If the threshold were outside the interval, then one of the messages would never be sent and the whole construction would not make sense. Therefore, we need to make sure that the threshold is between 0 and 1. Clearly, it is above zero because we have assumed that b is positive. Thus, we only need it to be less than 1:
x* = 1/2 + 2b < 1, i.e. b < 1/4
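Putting the pieces together, a minimal Python sketch of the two-message equilibrium (the function name is mine; it returns None when the threshold leaves [0, 1]):

```python
def two_message_equilibrium(b):
    """Threshold equilibrium of the cheap-talk game; exists iff 0 < b < 1/4."""
    x_star = 0.5 + 2 * b
    if not 0 < x_star < 1:
        return None                      # one message would never be sent
    y_low = x_star / 2                   # policy after 'low':  E[x | x < x*]
    y_high = (1 + x_star) / 2            # policy after 'high': E[x | x >= x*]
    # At x = x*, the advisor's ideal point x* - b is equidistant from both policies.
    assert abs((x_star - b - y_low) + (x_star - b - y_high)) < 1e-12
    return x_star, y_low, y_high

print(two_message_equilibrium(0.1))   # (0.7, 0.35, 0.85)
print(two_message_equilibrium(0.3))   # None: bias too large
```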
Given this construction, we may ask whether we can transmit more than one bit of information. The answer is yes: under certain conditions, we can transmit more. For example, instead of using this binary strategy for A, we can use three signals: send ‘high’ if x is above an upper threshold, ‘low’ if x is below a lower threshold, and ‘medium’ if x is between the two thresholds. We then write the beliefs by normalizing the uniform distribution within the three intervals and repeat the steps above with one additional condition: there are three parabolas and two intersection points (i.e. two indifference conditions) to pin down the two thresholds. There is again a condition on b, and it is stricter than the one obtained in the binary case: in order to transmit more information, we need the conflict of interest to be even smaller.
The more messages we have, the smaller the bias (i.e. the conflict of interest) needs to be. In principle, we can construct such an equilibrium for as many messages as we want; the catch is that the equilibrium only exists when b is very small. This is intuitive: if we decrease the conflict between the sender and the receiver, then the sender can credibly send more information to the receiver without being tempted to deceive him.
5.19 Babbling equilibria
The advisor will send a meaningful message only if he thinks that the policymaker pays attention to his messages. There is always an equilibrium in which the policymaker does not pay attention to messages and the advisor sends arbitrary messages:
• The advisor sends a random message that does not depend on the state of the world: for example m | x ~ U[0, 1]
• The policymaker’s posterior beliefs coincide with his prior: Pr{x < z | m} = z.
max_y { - ∫_0^1 (x - y)^2 dx }
or
y*(m) = 1/2
• By contradiction, assume that a separating equilibrium exists. Let µ(x) be an equilibrium strategy of the advisor.
• Since the equilibrium is separating, µ(x) is invertible. WLOG assume µ(x) is strictly increasing.
• If message m_1 is received, the policymaker thinks that x < x* and implements the policy that maximizes
max_y { - ∫_0^{x*} (x - y)^2 dx },
i.e.
y(m_1) = x*/2
• If message m_2 is received, the policymaker thinks that x ≥ x* and implements the policy that maximizes
max_y { - ∫_{x*}^1 (x - y)^2 dx },
i.e.
y(m_2) = (1 + x*)/2
The advisor’s indifference condition at x* then gives
x* = 1/2 + 2b