Revisiting Secretary Problem

The document discusses the secretary problem as a Markov decision process (MDP). An MDP is defined by states, actions, transition probabilities between states based on actions, costs of transitions, and a discount factor. The secretary problem can be modeled as an MDP with two states (hire or reject a candidate), actions at each state, transition costs depending on if the hired candidate is best or not, and the objective of minimizing costs/interviews to hire the best candidate. Dynamic programming approaches can find optimal policies for MDPs by considering costs of current and future states.


1 Acknowledgement

Everything that comes down on paper is the tip of the iceberg in terms of the effort put into the
making of it, not just by the author but by everyone with him. I extend my heartfelt gratitude to
Prof. K S Mallikarjuna Rao for introducing me to a new and intriguing topic and suggesting
valuable books to strengthen my knowledge in this field.
I would also like to thank our Professor's PhD scholars Ravikant, Anirban and Aanchal, who
helped me understand concepts where I got stuck. Also, thank you would be a small word to
acknowledge the moral support my mother lent me.
I hope the report fulfills the purpose it has been designed for and, with the illustrations described,
serves as a good guide to understanding this very popular secretary problem. Hope you have a fine
time reading ahead!

Contents

1 Acknowledgement
2 Introduction
    2.1 Proceeding with the problem
3 Markov Decision Processes
4 Optimal Stopping Theory
5 Dynamic Programming
6 The Secretary Problem
    6.1 Objective
    6.2 Solution Strategy
        6.2.1 Strategy of rejecting an optimal number of candidates and then selecting the next best
        6.2.2 Solution Strategy via Dynamic Programming
7 Variants of the Secretary Problem
    7.1 Kepler's Suitable Wife Problem
    7.2 Cayley's ticket riddle
    7.3 Variants classified on the basis of information about candidates
        7.3.1 No information model
        7.3.2 Full information model
        7.3.3 Partial information model
    7.4 Variants classified on the basis of time horizon
        7.4.1 Finite time horizon model
        7.4.2 Infinite time horizon model
        7.4.3 The Secretary Problem with unknown number of options
    7.5 The Secretary Problem involving a discount factor
8 Significant Contributions
9 Multiple hiring
        9.0.1 Hiring above a threshold
        9.0.2 Maximal hiring
        9.0.3 Hiring above the mean
10 Multiple employers
        10.0.1 Hiring only the best
        10.0.2 Hiring at or above Employer's Rank
11 Conclusion
12 Appendices
13 References

Revisiting the Secretary Problem
Supriya Murdia
July 18, 2017

2 Introduction

The secretary problem has been widely and intensively studied for several decades. It is still
being worked on, and new results continue to be proposed. It is studied under Markov
decision processes and probability theory.

The secretary problem asks for the maximum probability of successfully hiring the single best
candidate among a given number of candidates, who can only be ranked relative to one another,
when each hiring decision is irrevocable and as few interviews as possible are conducted.

As simple as that sounds, the statement is not very crisp, because quite a few parameters
are not fixed but take uncertain values. It is easy to find this probability if we fix the
number of interviews we are taking. It is child's play if the candidates are absolutely ranked, or if
the candidates are judged on some quality score and we know the maximum quality attained
by them. But that is not the case here.

In order to deal with this problem we need a more analytical approach rather than the basic
concepts of probability.

2.1 Proceeding with the problem


When we think of a solution to the secretary problem, we can first consider the ith interview, i.e.
the ith candidate appearing for the interview. At this stage the candidate can be hired or rejected.
This gives the notion of two possible actions. With this decision, the employer will incur a cost or
reward, depending on whether the candidate is the overall best or not.

So, eventually, a logical approach to solving the secretary problem boils down to having two
possible actions, a decision being taken at each stage, and a cost being incurred as a result of this
decision for a transition into the consequent probabilistic state. The objective is to minimize
the number of interviews, i.e. to minimize the cost incurred per interview, and still end up with the
one best candidate of the lot.

Mathematically, such a situation is called a Markov decision process if it follows the Markov
property.

3 Markov Decision Processes

Following the formal definition, Markov decision processes (MDPs) provide a mathematical
framework for modeling decision making in situations where outcomes are partly random and partly
under the control of a decision maker.

Typically, the process is observed at time points $t = 1, 2, \dots, n$ to be in one of a
possible set of states. A Markov decision process is defined by a five-tuple: the set of all
possible states $S$; the set of all possible actions $A$ (where $A_s$ is the set of all possible
actions in a given state $s$); the transition probabilities $P_{ij}(a)$ for a transition from
state $i$ to state $j$ when the action $a \in A_i$ is performed; the cost $C_i(a)$ incurred in
that transition; and the discount factor $\lambda \in (0, 1)$.

Such a five-tuple process needs to follow the Markov property in order to be classified as a
Markov decision process. The transition probability $P_{ij}(a)$ for a transition from state $i$ to
state $j$, given that action $a$ is performed at state $i$, is defined by:
$$P_{ij}(a) = P(X_{t+1} = j \mid X_0, a_0; X_1, a_1; \dots; X_t, a_t)$$
where $X_t$ denotes the state of the process at time $t$, with $X_t = i$ and $a_t = a$.
If a process follows the Markov property, the above definition can be rewritten as:
$$P_{ij}(a) = P(X_{t+1} = j \mid X_t, a_t)$$
i.e. the next state of the process depends only on the last state and the action chosen in
that state. The transition is independent of all the previous states. This is beneficial
because it allows us to avoid storing a trail of all states and to concentrate only on the
present state and the corresponding action.

We hence observe that, besides the above variables, another important factor in defining these
functions is the method of choosing the action to be performed at each state. A policy determines
the rules for choosing an action at a particular state. It may depend on the history of the process
up to that point, or it may be randomized in the sense that it chooses action $a$ with some
probability $P_a$, $a \in A$.

A stationary policy is a non-randomized policy that chooses an action at time t depending only
on the state of the process at time t. It is basically a function mapping the state space into the
action space. Such a policy yields a sequence of states that form a Markov chain.

In basic finance, we learn how the value of money falls with time. Along the same principles,
the value of the cost incurred at each transition, when a certain action is performed, falls over
time. Taking $\lambda \in (0, 1)$ as the discount factor, the discounted value (present value at
time $t = 0$) of the costs incurred at each transition in the entire process when we follow a
policy $\pi$ is given by a function $Q$ as:
$$Q_\pi(i) = C(X_0, a_0) + \lambda C(X_1, a_1) + \lambda^2 C(X_2, a_2) + \dots + \lambda^t C(X_t, a_t)$$
where $i$ is the initial state, $X_0$. Mostly in such problems, the objective is to find the minimum
value of $Q$ as above in order to minimize the cost incurred in the process. So, for a policy $\pi$,
$$V_\pi(i) = E_\pi\left[\sum_{t=0}^{\infty} \lambda^t C(X_t, a_t) \,\middle|\, X_0 = i\right], \quad i \ge 0$$
The above function gives the expected total discounted cost incurred when we employ a policy
$\pi$ and the initial state is $i$.
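
As a small illustration of the discounted sum above, the following sketch adds up a given sequence of per-step costs with discount factor λ. The function name `discounted_cost` and the example cost sequence are assumptions made for illustration only; they do not come from the report.

```python
def discounted_cost(costs, discount):
    """Present value at time 0 of per-step costs C(X_0, a_0), C(X_1, a_1), ...

    `costs` is the realised cost at each step and `discount` is lambda in (0, 1).
    """
    total = 0.0
    for t, cost in enumerate(costs):
        total += (discount ** t) * cost  # the cost at step t is weighted by lambda^t
    return total


# Hypothetical example: a unit cost at each of three steps, lambda = 0.5.
# The sum is 1 + 0.5 + 0.25.
example_value = discounted_cost([1.0, 1.0, 1.0], 0.5)
print(example_value)
```

The expected cost V_π(i) would be the average of such sums over the random trajectories generated by the policy π starting from state i.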

4 Optimal Stopping Theory

Let
$$V_\lambda(i) = \inf_\pi V_\pi(i), \quad i \ge 0$$
A policy $\pi^*$ is said to be $\lambda$-optimal if
$$V_{\pi^*}(i) = V_\lambda(i), \quad \forall\, i \ge 0$$

That is, a policy is $\lambda$-optimal if its expected $\lambda$-discounted cost is minimal for
every initial state.
We can also express the above equation in a recursive form as follows:
$$V_\lambda(i) = \min_a \left\{ C(i, a) + \lambda \sum_{j=0}^{\infty} P_{ij}(a) V_\lambda(j) \right\}, \quad i \ge 0$$
Therefore, a policy is said to be optimal if it is at least as good as all other policies: for
every initial state, its expected discounted cost is no greater than that of any other policy.
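
The recursive form can be solved numerically by repeatedly applying its right-hand side until the values stop changing (value iteration). The sketch below is a minimal illustration on a hypothetical two-state MDP; the function `value_iteration`, the toy costs and the toy transition probabilities are all assumptions chosen for the example, not part of the report.

```python
def value_iteration(costs, transitions, discount, tolerance=1e-10):
    """Iterate V(i) <- min_a [ C(i, a) + discount * sum_j P_ij(a) * V(j) ].

    costs[i][a] is the cost of action a in state i;
    transitions[i][a][j] is the probability of moving from i to j under a.
    """
    n_states = len(costs)
    values = [0.0] * n_states
    while True:
        new_values = []
        for i in range(n_states):
            candidates = []
            for a in range(len(costs[i])):
                expected_future = sum(
                    p * values[j] for j, p in enumerate(transitions[i][a])
                )
                candidates.append(costs[i][a] + discount * expected_future)
            new_values.append(min(candidates))
        if max(abs(x - y) for x, y in zip(new_values, values)) < tolerance:
            return new_values
        values = new_values


# Hypothetical toy MDP:
#   state 0: action 0 ("stay") costs 1 and stays in state 0,
#            action 1 ("leave") costs 2 and moves to state 1;
#   state 1: absorbing, with a single zero-cost action.
toy_costs = [[1.0, 2.0], [0.0]]
toy_transitions = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0]]]
optimal_values = value_iteration(toy_costs, toy_transitions, discount=0.9)
print(optimal_values)
```

In this toy case, staying forever would cost 1/(1 − 0.9) = 10 in present value, so paying 2 once to leave is the cheaper choice from state 0.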

Coming back to our problem


However, formulating this equation for our secretary problem does not completely solve it,
because in the secretary problem we do not have a pre-determined, fixed time at which the
process should terminate. In fact, in the secretary problem, we are supposed to optimize the time
taken, i.e. the number of interviews conducted, so that we eventually end up with the best
candidate. This classifies it as an optimal stopping problem instead.

Optimal stopping problems are associated with:

1. A sequence of states $X_0, X_1, \dots, X_n$. It is assumed that the joint distribution of these
random variables is known beforehand.
2. Rewards $y_i(a)$, $i \ge 0$, associated with each of those states, given that the action $a$ is
performed at the $i$th stage.

Using these we construct the expected discounted cost function, and the objective is to find
the stopping time $t$ at which the cost function attains its minimum possible value (or the
reward function its maximum, depending upon the requirement of the problem). For a finite
horizon problem, the equation is popularly referred to as the Bellman equation or the dynamic
programming equation. Finite horizon problems are those for which the stopping time cannot
exceed a finite time, say $T$, i.e. $T < \infty$. For infinite horizon problems, the equation for
the cost function takes the following form, as discussed before:
$$V_\pi(i) = E_\pi\left[\sum_{t=0}^{\infty} \lambda^t C(X_t, a_t) \,\middle|\, X_0 = i\right]$$

It can be developed into free-boundary problems and solved using Markovian methods, jump
diffusion processes or Brownian motion. For the finite horizon problem, the Bellman equation takes
the following form:
$$V(i) = \min_{a_0} \left\{ C(X_0, a_0) + \lambda \min_{a} \sum_{t=1}^{T} \lambda^{t-1} C(X_t, a_t) \right\}, \quad X_0 = i$$

Optimal stopping problems find a variety of applications in exercising American options,
selling an asset and in our secretary problem as well.

5 Dynamic Programming

So far we have seen that the secretary problem can be classified as an optimal stopping problem.
In the secretary problem that we have considered so far, and in what Ferguson [2] defines the
classic secretary problem to be, we have a fixed number n of candidates. Hence, the worst
possible scenario is that we follow some strategy (policy) under which we reject all the n
candidates and therefore fail to obtain the best candidate of the lot. So we can have a maximum
of n time steps (n interviews). Here the ith time interval and the interview of the ith candidate
can be used interchangeably, because we are observing a candidate at their interview and giving
a judgment based on that, and not at any other point in time.

Hence, with what we learned from the optimal stopping theory, we can derive an optimal
solution of the secretary problem using dynamic programming.

Dynamic programming is basically backward recursion. It is widely used in mathematical
optimization, and in bioinformatics and computer programming as well. It differs from linear
programming in the sense that the algorithm for the next state depends on the action performed
in the present state. Employing the basic concepts of programming, dynamic programming involves
breaking down a process spread over a finite number of time intervals into simpler blocks that
are solved recursively.

Bellman observed that decisions that are spread through several points in time can mostly be
broken apart recursively. Known as the Principle of Optimality, it can be stated more precisely
as follows:
An optimal policy has the property that whatever the initial state and initial decision are, the
remaining decisions must constitute an optimal policy with regard to the state resulting from the
first decision.

6 The Secretary Problem

All this while we have been approaching the secretary problem on the grounds of a rather
ambiguous, loosely stated problem statement. The secretary problem satisfies the following criteria:

1. There are a finite number of candidates, say n, who await their interview calls.
2. There is one employer who has to hire one candidate among them all as his secretary.
3. The candidates arrive for the interview sequentially in some random order which cannot be
defined beforehand.
4. They can all be relatively ranked on the basis of some quality as desired by the employer.
The candidate with rank 1 is supposed to be the best.
5. Whenever an interview happens, the decision must be taken right after it and is
irrevocable, i.e. a candidate once rejected cannot be considered again later. Also, no ties are
allowed and one and only one candidate can be hired.
6. The employer is assumed to be satisfied only if the hired candidate is the best of all, and is
said to have failed if he hires anyone but the best, i.e. the payoff function is 1 if he hires the
best candidate and 0 otherwise.

Given the above restrictions, the secretary problem asks for the maximum probability of
succeeding in hiring the best candidate when following some strategy that respects the above
conditions.

This problem first appeared in Martin Gardner's column in the February 1960 issue of
Scientific American. It was supposed to be a riddle for entertainment, but its
ambiguity and negotiable constraints made it tremendously popular among mathematicians and
statisticians. Since then, and specifically for about three decades following its release, it was
intensively worked on. Many theories were developed to obtain an $\epsilon$-optimal solution to it,
and nearly every one of its constraints was varied to produce new variants.

6.1 Objective
This report aims at discussing the strategies employed to solve the secretary problem and a few
of its variants, and at simulating the results for them. It also attempts to interpret some more
variants without simulations, to discuss applications of this problem in the real world, and to
compare strategies on the basis of the risk (or success) factor of ending up with the best
candidate (or candidates, for some variants) of the lot.

6.2 Solution Strategy

6.2.1 Strategy of rejecting an optimal number of candidates and then selecting the next best

Until now, we have very carefully and systematically developed how dynamic programming might
be the one-stop solution to this problem, but there’s a remarkably simple solution to this
seemingly difficult problem.

Let $\phi_n(r)$ be the probability of hiring the best candidate, given that we reject the first
$r - 1$ candidates of a total of $n$ candidates and then hire the next candidate who is relatively
the best of all those we have seen so far.

For $r = 1$,
$\phi_n(r) = \phi_n(1)$, i.e. we reject the first $1 - 1 = 0$ candidates and select the next best
so far, which is necessarily the first candidate. Since we do not have any further information,
each candidate is equally likely to be the best, so the probability that the $i$th candidate is
the best is $1/n$:
$$\phi_n(1) = \frac{1}{n}$$

For $r > 1$, we have
$$\phi_n(r) = \sum_{j=1}^{n} P(j\text{th candidate is the best and we select it})$$
However, since we have rejected the first $r - 1$ candidates, the probability of selecting them is
0. So, the above equation reduces to the one that follows:
$$\phi_n(r) = \sum_{j=1}^{r-1} 0 + \sum_{j=r}^{n} P(j\text{th candidate is the best and we select it})$$
$$\implies \phi_n(r) = \sum_{j=r}^{n} P(j\text{th candidate is the best and we select it}) \quad \dots \text{(i)}$$
The probability that the $j$th candidate is the best is the same as the probability of
any one of the $n$ candidates being the best, i.e.
$$P(j\text{th candidate is the best}) = \frac{1}{n}$$
Given that the $j$th candidate is the best, we select them exactly when the best of the first
$j - 1$ candidates is among the $r - 1$ candidates rejected outright. This probability is the ratio
of the number of candidates rejected to the number of candidates seen before the $j$th, i.e.
$j - 1$ candidates:
$$P(\text{we select the } j\text{th candidate}) = \frac{r-1}{j-1}$$
This follows from the following.

Figure 1: Case 1. When j < r

Figure 2: Case 2. When j ≥ r

Case 1: the jth candidate is among the first r − 1 candidates, who are rejected outright, so the
probability of selecting them is 0, as explained above.
Case 2: Success will only be achieved if every candidate from position r up to position j − 1
is worse than some candidate among the r − 1 people we let go. Otherwise, i.e. if we
encountered a candidate between positions r and j − 1 who is better than everyone seen before
them, we would already have hired that candidate and the hiring process would have
terminated before reaching the jth candidate. The strategy demands hiring the first candidate
who is relatively the best so far, so we never get to choose the jth candidate if such a
candidate has been interviewed before them among candidates r to j − 1.

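$$\therefore\ P(j\text{th candidate is the best and we select it}) = P(j\text{th candidate is the best overall}) \times P(\text{we select the } j\text{th candidate})$$
$$P(j\text{th candidate is the best and we select it}) = \frac{1}{n} \times \frac{r-1}{j-1}$$

$$\phi_n(r) = \sum_{j=r}^{n} \frac{1}{n} \times \frac{r-1}{j-1}$$

$$\implies \phi_n(r) = \frac{r-1}{n} \sum_{j=r}^{n} \frac{1}{j-1} \quad \dots \text{(ii)}$$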

In order to maximize φn (r), we need to find an optimal r so that when we reject the first r − 1
candidates, we have the maximum chances of the best candidate being among one of the remaining
and that the employer actually hires this overall best candidate.

For small $n$, we can compute $\phi_n(r)$ directly.

Table 1: Strategy of rejecting the first r − 1 applicants and then choosing the next best, n = 10

Threshold r     φn(r)
1               -
2               0.2828968
3               0.3657937
4               0.3986905
5               0.3982540
6               0.3728175
7               0.3273810
8               0.2652778
9               0.1888889
10              0.1000000

The code for obtaining this table is attached in Appendix A
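
Independently of the appendix code, which is not reproduced here, equation (ii) can be evaluated with a few lines of Python. The sketch below is only an illustration; the function name `success_probability` is an assumption, and the case r = 1 uses the value 1/n stated in the text, because equation (ii) is undefined there.

```python
def success_probability(n, r):
    """phi_n(r): reject the first r - 1 of n candidates, then hire the next best-so-far."""
    if r == 1:
        # No one is rejected, so the first candidate is hired; they are best with probability 1/n.
        return 1.0 / n
    # Equation (ii): (r - 1)/n * sum_{j=r}^{n} 1/(j - 1)
    return (r - 1) / n * sum(1.0 / (j - 1) for j in range(r, n + 1))


n = 10
table = {r: success_probability(n, r) for r in range(1, n + 1)}
best_r = max(table, key=table.get)

for r, value in table.items():
    print(f"{r:2d}  {value:.7f}")
print("best threshold r:", best_r)
```

For n = 10 the largest value is at r = 4, i.e. rejecting the first three candidates, which agrees with Table 1.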

For large $n$, i.e. as $n \to \infty$, we denote the limits of the terms in (ii) as follows:
$$x = \lim \frac{r}{n}, \qquad dt = \lim \frac{1}{n}, \qquad t = \lim \frac{j}{n}$$
Hence,
$$\lim_{n \to \infty} \phi_n(r) = (x - dt) \sum_{t=x}^{1} \frac{dt}{t - dt}$$

Using the Riemann approximation, the summation transforms into an integral as below:
$$\lim_{n \to \infty} \phi_n(r) = x \int_x^1 \frac{1}{t}\, dt = -x \ln x$$
In order to find the maximum of the above expression, we use the second-derivative test.
Writing $\phi(x) = -x \ln x$ for the limit, we obtain
$$\phi'(x) = -(1 + \ln x)$$
$$\phi''(x) = -\frac{1}{x}$$
Setting $\phi'(x)$ to 0, we get $x = \frac{1}{e}$. At $x = \frac{1}{e}$, $\phi''(x) = -e < 0$, so
$\phi$ attains a maximum at $x = \frac{1}{e}$.

This maximum value is the maximized probability we require:
$$\phi\left(\frac{1}{e}\right) = -\frac{1}{e} \ln\left(\frac{1}{e}\right) = \frac{1}{e}$$
which is roughly equal to 0.367879.
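
The cutoff rule can also be checked empirically. The following Monte Carlo sketch shuffles the ranks 1 to n, rejects the first r − 1 candidates, hires the first later candidate who beats everyone seen so far, and counts how often that candidate has rank 1. The function name, the choice n = 50, r = 19 (so that r − 1 is close to n/e) and the number of trials are assumptions made for this illustration.

```python
import random


def simulate_cutoff_rule(n, r, trials, seed=0):
    """Estimate the success probability of the 'reject r - 1, then take next best' rule."""
    rng = random.Random(seed)
    successes = 0
    for _ in range(trials):
        ranks = list(range(1, n + 1))  # rank 1 is the best candidate
        rng.shuffle(ranks)
        # Best (smallest) rank among the rejected candidates; n + 1 if nobody is rejected.
        best_rejected = min(ranks[: r - 1], default=n + 1)
        for rank in ranks[r - 1:]:
            if rank < best_rejected:
                # First candidate better than everyone seen so far: hire and stop.
                if rank == 1:
                    successes += 1
                break
    return successes / trials


estimate = simulate_cutoff_rule(n=50, r=19, trials=20000, seed=1)
print(estimate)
```

With these settings the estimate should lie close to the value given by equation (ii) and to the limiting value 1/e.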

6.2.2 Solution Strategy via Dynamic Programming

We have seen how dynamic programming works and how the appropriate constraints make the
secretary problem one of its kind, so here we try solving the secretary problem through the same.
Let
$W_t$ be the history of observations up to time $t$, i.e. after we have interviewed the $t$th candidate,
$x_t$ be an indicator variable:
$x_t = 1$, if the $t$th candidate is better than all its predecessors,
$x_t = 0$, otherwise.
We also need to compute the probability that a candidate is the best of all $h$ candidates, given
that he is the best among the $t$ candidates seen so far.
$$P(\text{best among } h \mid \text{best among } t) = \frac{P(\text{best of } h \text{ and best of } t)}{P(\text{best of } t)}$$
$$= \frac{P(\text{best of } h)}{P(\text{best of } t)} = \frac{1/h}{1/t} = \frac{t}{h}$$

Also, the probability that the $t$th candidate is better than all her predecessors, given the
history of the first $t - 1$ candidates, is given as follows:
$$P(x_t = 1 \mid W_{t-1}) = \frac{P(W_{t-1} \mid x_t = 1)\, P(x_t = 1)}{P(W_{t-1})} \quad \text{(by Bayes' theorem for conditional probabilities)}$$
$$= \frac{P(W_{t-1})\, P(x_t = 1)}{P(W_{t-1})} \quad \text{(since } W_{t-1} \text{ and } x_t \text{ are statistically independent of each other)}$$
$$= P(x_t = 1) = \frac{1}{t}$$

We now frame the optimality equation and then obtain a solution to it using dynamic programming.
Let us define a function $F(t)$ as the maximum probability of successfully finding the best
candidate given that we have passed over the first $t$ candidates, so that $F(t-1)$ describes the
situation just before the $t$th interview.
$$F(t-1) = \frac{1}{t} \max\left(\frac{t}{h}, F(t)\right) + \left(1 - \frac{1}{t}\right) F(t) \quad \dots (1)$$
$$= \max\left(\frac{1}{h}, \frac{F(t)}{t}\right) + \left(1 - \frac{1}{t}\right) F(t)$$
$$= \max\left(\frac{1}{h} + \left(1 - \frac{1}{t}\right) F(t),\ F(t)\right)$$
The first term of equation (1) covers the case where the $t$th candidate is the best so far
($x_t = 1$), the probability of which is $\frac{1}{t}$. At this stage, we can either hire that
candidate, in which case we succeed if the candidate is also the best of all $h$ candidates, the
probability of which is $\frac{t}{h}$ as computed above; or we could pass over him, in which case
the probability of success becomes $F(t)$. The second term of equation (1) covers the case where
the $t$th candidate is not the best so far. In this case, we simply move on to the next candidate,
with success probability $F(t)$. This explains equation (1), which simplifies to the final form
given above.
For $t \le h$, we have $\frac{1}{h} \le \frac{1}{t}$. The two arguments of the maximum in the final
form of equation (1) differ by
$$\left(\frac{1}{h} + \left(1 - \frac{1}{t}\right) F(t)\right) - F(t) = \frac{1}{h} - \frac{F(t)}{t}$$
so the first argument is the larger one exactly when $F(t) \le \frac{t}{h}$. In either case the
maximum is at least as large as its second argument, and therefore
$$F(t-1) = \max\left(\frac{1}{h} + \left(1 - \frac{1}{t}\right) F(t),\ F(t)\right) \ge F(t)$$
Hence, $F(t-1) \ge F(t)$.
Since $\frac{t}{h}$ is increasing in $t$ and $F(t)$ is non-increasing in $t$, for small $t$ we have
$F(t) > \frac{t}{h}$ and for large $t$ we have $F(t) \le \frac{t}{h}$. Therefore, for large $t$,
following backward induction and keeping $F(h) = 0$, we obtain the following:
$$\frac{F(h-1)}{h-1} = \frac{1}{h(h-1)}$$
$$\frac{F(h-2)}{h-2} = \frac{F(h-1)}{h-1} + \frac{1}{h(h-2)}$$
$$\frac{F(h-2)}{h-2} = \frac{1}{h(h-1)} + \frac{1}{h(h-2)}$$
$$\cdots$$
$$\frac{F(t-1)}{t-1} = \frac{1}{h(t-1)} + \frac{1}{ht} + \dots + \frac{1}{h(h-1)}$$

$$F(t-1) = \frac{t-1}{h} \sum_{\tau=t-1}^{h-1} \frac{1}{\tau}, \quad t \ge t_0$$
where $t_0$ is the crossover point at which hiring the best candidate so far becomes optimal.
We require $F(t_0) \le \frac{t_0}{h}$, so $t_0$ is the smallest integer satisfying
$$\sum_{\tau=t_0}^{h-1} \frac{1}{\tau} \le 1$$

Solving this, the L.H.S. is approximately $\log\left(\frac{h}{t_0}\right)$.

So, $t_0 \approx \frac{h}{e}$.
The probability of success is, therefore, $F(t_0) \approx \frac{1}{e} \approx 0.3679$, which is
quite large for such an ambiguously stated problem.
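
Equation (1) can be run directly as a backward recursion. The sketch below fills in F(h), F(h−1), ..., F(0) for a given number of candidates h and reads off the crossover point as the smallest t with F(t) ≤ t/h. The function name `backward_induction` and the choice h = 10 are assumptions made for illustration.

```python
def backward_induction(h):
    """Solve F(t-1) = (1/t) * max(t/h, F(t)) + (1 - 1/t) * F(t) with F(h) = 0.

    F[t] is the success probability under optimal play once the first t
    candidates have been passed over; F[0] is the overall optimum.
    """
    F = [0.0] * (h + 1)
    for t in range(h, 0, -1):
        hire_value = t / h  # probability that a best-so-far t-th candidate is the overall best
        F[t - 1] = (1 / t) * max(hire_value, F[t]) + (1 - 1 / t) * F[t]
    # Crossover point: the first t at which hiring a best-so-far candidate is at least
    # as good as continuing. F[h] = 0 guarantees that such a t exists.
    crossover = next(t for t in range(1, h + 1) if F[t] <= t / h)
    return F, crossover


F, crossover = backward_induction(10)
print(round(F[0], 7), crossover)
```

For h = 10 the recursion gives a crossover at t = 4 and an optimal success probability equal to the r = 4 entry of Table 1, so the two solution strategies agree.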

7 Variants of the Secretary Problem

As stated before, the secretary problem has been meddled with a lot over the decades, giving rise
to innumerable variants of this problem.

7.1 Kepler’s Suitable Wife Problem

After the death of his first wife, Kepler was very cautious about how to choose a second wife
who would be nurturing and supportive in helping his mathematical genius to flourish.
Kepler did a lot of mathematics and consulted quite a few advisors he knew in order to not
go wrong this time. He eventually shortlisted a total of 11 ladies who he believed possessed the
qualities to be his potential spouse. He eventually chose the fifth one and lived a satisfying life
henceforth.
Following the strategy stated above, the highest probability of a satisfying life would have
been about 0.367879, attained at x = 1/e ≈ 0.367879. Since x is an approximation for r/n, for
n = 11 this gives r ≈ 4. However, Kepler chose the 5th lady. He was not very far off from the
predicted solution. Had he known this strategy, it would have saved him a lot of time and effort.
Although Kepler's problem was not the first problem of this kind to bother mathematicians, it
is very similar to the secretary problem, which has therefore also come to be called the
dating problem. (Other names include the Sultan's dowry problem, the best choice problem, etc.)

7.2 Cayley’s ticket riddle


The earliest problem similar to the secretary problem was Cayley’s ticket riddle.
n cards are labeled with numbers and placed face down in a random order; the numbering
of the cards is not known beforehand. The player gets a reward equal to the number on the
card he draws, so he would want to draw the card with the highest value in order to maximize
his profit. The problem is to devise a strategy to eventually find the highest valued card, and to
find the probability of this strategy succeeding.

This problem is quite similar to the secretary problem, but not exactly the same, since
the payoff function is not 1 or 0 but the value of the card drawn and shown.
The same is the case with the Googol problem.

7.3 Variants classified on the basis of information about candidates

7.3.1 No information model

In this, each alternative can be given a ranking with respect to all previous alternatives, but no
form of individual valuation can be made.

7.3.2 Full information model

Each alternative has an observable value, such that the sequence of alternatives may be regarded
as a sequence of random variables, independently identically distributed according to a known
distribution.

7.3.3 Partial information model

The partial information model is an intermediate case between the full-information and
no-information models. In this, values may be attached to each alternative but arise from a
distribution whose parameters are not known beforehand. The most significant contributions towards
the partial information model were made by Stewart in 1978 and Petruccelli in 1980.

7.4 Variants classified on the basis of time horizon

7.4.1 Finite time horizon model

The classic secretary problem is a finite horizon problem, as we know beforehand the number of
candidates available for interview. This has, obviously, been worked on a lot, but it is not a very
likely real-life scenario unless we are talking of situations like exercising an American option.
In the case of an American option, there is a final date before which the option has to be
exercised or it will expire. However, the person holding the option can choose to exercise it on
any day before the expiry date, so this becomes purely a problem of optimization as to when the
option should be exercised in order to maximize the profit. Quite a few mathematicians have tried
solving this and obtaining an $\epsilon$-optimal solution.

7.4.2 Infinite time horizon model

This is the case when there are infinitely many candidates available or, more generally, when
there are infinitely many options to choose from.

7.4.3 The Secretary Problem with unknown number of options


Samuels and Bruss (1987) worked on this aspect of the secretary problem, where the number of
options is finite but not known a priori. This considers the more dynamic real-life scenario
in which the options, although finite, keep changing with time. The optimal rule for infinitely
many options is shown to be minimax with respect to all possible distributions of N, nearly optimal
whenever N is likely to be large, and formal Bayes against a non-informative prior. These results
hold whatever the loss function.

7.5 The Secretary Problem involving a discount factor


So far, we have been free to conduct interviews for as long as we wish. We are even
allowed to interview all candidates and then choose the last of the lot. The probability of this
being a successful attempt is very low, but it is still a valid possibility. A number of authors,
including Samuels (1985), Rasmussen and Pliska (1975) and Ferenstein and Enns (1988), studied
what happens if every interview incurs a cost. This adds another optimization to the many that the
secretary problem already calls for. In this problem, we also have to calculate an optimal
time, i.e. an optimal number of interviews to be conducted, so that we obtain the best candidate in
the minimum required time.

8 Significant Contributions

There are certain extensions of the secretary problem where ties are also dealt with. Some
alter the arrival order of the candidates to obtain a higher success probability
while still keeping the problem realistic.

The contributions by Lindley, Gilbert and Mosteller, Samuels, Bruss, Stewart, DeGroot,
Sakaguchi, Chow and Robbins in showing the relevance and significance of the secretary problem
and its advancements are noteworthy.

We see how optimal stopping problems constitute a separate field in themselves. It is no trivial
task to list every variant proposed along with its solution. In what follows, we explore two
particular variants in depth.

9 Multiple hiring

In the secretary problem we saw there was only one secretarial position available.

We now approach the same problem allowing multiple hiring to take place. The objective is
to maximize the quality of employees with time after successful hiring.

9.0.1 Hiring above a threshold

The rest of the restrictions are retained. The payoff function has a slightly different context:
the payoff is 1 if the candidate hired is above the threshold $t$, and 0 otherwise.

Set a threshold value, $t$.
Let the quality score of the $i$th candidate, for $i$ from 1 to $n$, be denoted by $i_q$, where we
assume that the scores are uniformly distributed in $(0, 1)$.
Note: Logically, the scores of the candidates should be normally distributed over $(0, 1)$ with
the mean somewhere around the center of this interval, since most of the candidates we
come across will more likely have quality scores clustered around some central value and be
more thinly scattered towards the extremities of the interval.
However, assuming their quality scores to have a uniform distribution greatly simplifies the
calculations, while dealing with the normal distribution would not have led to significant
differences anyway.
Observe $i_q$ in the interview.
Hire if $i_q \ge t$.
It is easy to notice that the quality scores of all the hires are distributed as $\mathrm{Unif}(t, 1)$.
Every score between $t$ and 1 is equally likely for a hired candidate, with probability density
$$f(q) = \frac{1}{1-t}, \quad t \le q \le 1$$

Mean Quality
Mean quality after multiple hiring can be obtained by the standard formula for the expectation of
a continuous probability distribution.
$$E(t) = \int_t^1 \frac{1}{1-t}\, s\, ds = \frac{1}{1-t} \left[\frac{s^2}{2}\right]_t^1 = \frac{1+t}{2}$$

Thus this strategy yields a constant hiring rate (each candidate is hired with probability 1 − t)
and a constant mean quality, which makes it not the most progressive strategy possible. It is fair
in that every candidate, whenever they arrive, has the same opportunity to be hired, i.e. by scoring
above a constant threshold t.
But it ignores competition, and the quality of the hires stagnates over time.
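
The stagnation can be checked numerically. The following is a minimal simulation sketch, not part of the original report: the function name `simulate_threshold_hiring`, the threshold 0.7 and the sample size are illustrative assumptions. It draws Unif(0, 1) scores, hires every candidate at or above t, and compares the observed hiring rate and mean quality with 1 − t and (1 + t)/2.

```python
import random

def simulate_threshold_hiring(t, n_candidates, seed=0):
    """Interview n_candidates with Unif(0, 1) scores and hire every score >= t.

    Returns (observed hiring rate, observed mean quality of the hires).
    """
    rng = random.Random(seed)
    hires = [q for q in (rng.random() for _ in range(n_candidates)) if q >= t]
    hiring_rate = len(hires) / n_candidates
    mean_quality = sum(hires) / len(hires) if hires else float("nan")
    return hiring_rate, mean_quality

t = 0.7  # illustrative threshold (assumption)
rate, mean_q = simulate_threshold_hiring(t, n_candidates=200_000)
print(f"hiring rate  ~ {rate:.3f}  (theory {1 - t:.3f})")
print(f"mean quality ~ {mean_q:.3f}  (theory {(1 + t) / 2:.3f})")
```

Neither quantity depends on how many candidates have already been seen, which is the sense in which the quality of the hires does not improve over time.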

9.0.2 Maximal hiring

To improve on the previous strategy, we resort to maximal hiring.

This involves hiring a candidate only if they are better than all the employees so far. This
addresses the competition that was ignored until now. However, it is somewhat unfair, as the
outcome depends upon the time of arrival of a particular candidate.
A candidate with a lower quality score is more likely to be hired if they come up for the
interview before the candidates with high quality scores.

Say that, initially, the highest quality score among all the employees is $q \in (0, 1)$, and set $h_0 = q$.
Let $h_i$ be the quality score of the ith candidate hired.
We focus on the gap, $g_i = 1 - h_i$.
A candidate is hired only if their score is at least the current best score $h_{i-1}$.
Here, the quality of the next candidate to be hired is again a uniform random variable, but
not on a constant interval: given $h_{n-1}$, it is uniformly distributed between $h_{n-1}$ and 1, i.e.
$$h_n \sim \mathrm{Unif}(h_{n-1}, 1), \qquad \therefore\; g_n \sim \mathrm{Unif}(0, g_{n-1}).$$
Conditioning on $g_{n-1}$, the probability density function of $g_n$ then becomes
$$f_{g_n}(t \mid g_{n-1}) = \frac{1}{g_{n-1}}, \qquad 0 \le t \le g_{n-1},$$
since the candidates' scores are iid uniform random variables.


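$$E[g_n \mid g_{n-1}] = \int_0^{g_{n-1}} \frac{t}{g_{n-1}}\,dt = \frac{1}{g_{n-1}} \cdot \frac{g_{n-1}^2}{2} = \frac{g_{n-1}}{2}.$$
Taking expectations on both sides and iterating down to the initial gap $g_0 = 1 - q$,
$$E[g_n] = \frac{E[g_{n-1}]}{2} = \frac{E[g_{n-2}]}{2^2} = \cdots = \frac{g_0}{2^n} = \frac{1-q}{2^n}.$$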
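
The halving of the expected gap can be illustrated with a short Monte Carlo sketch. This is not the author's code: the function name `simulate_maximal_hiring_gap` and the values q = 0.2 and n = 3 are assumptions chosen for illustration. Each hire is drawn directly from Unif(h_{n−1}, 1), as in the derivation above.

```python
import random

def simulate_maximal_hiring_gap(q, n_hires, trials=100_000, seed=0):
    """Estimate E[g_n] = E[1 - h_n] under maximal hiring by simulation."""
    rng = random.Random(seed)
    total_gap = 0.0
    for _ in range(trials):
        best = q
        for _ in range(n_hires):
            # The next hire must beat the current best: h_n ~ Unif(h_{n-1}, 1).
            best = rng.uniform(best, 1.0)
        total_gap += 1.0 - best
    return total_gap / trials

q, n = 0.2, 3  # illustrative starting quality and number of hires (assumptions)
estimate = simulate_maximal_hiring_gap(q, n)
theory = (1 - q) / 2 ** n
print(f"simulated E[g_{n}] ~ {estimate:.4f}, theory (1 - q) / 2^n = {theory:.4f}")
```

The sketch samples only the hired candidates; it does not model the rejected interviews between hires, which is where the slow hiring rate of this strategy comes from.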

So we see that the expectation of the gap decreases, and as n → ∞ the expected value of the gap
approaches 0. Thus, we eventually obtain a very high quality of hires. This might not be a very
practical strategy for a budding company with many competitors already well established in the
market, because it also causes hiring to become very slow over time. Google is believed to follow
this strategy for hiring new employees.
The major drawback of this strategy is that it is unfair to the candidates, since the hiring
decision depends on their arrival order.
To cope with this, we turn to the next two strategies, also called the Lake Wobegon
strategies, i.e. hiring above the mean and hiring above the median.

9.0.3 Hiring above the mean
In this strategy, the next employee we hire must have a quality score greater than or equal to the
mean quality of all the present employees.
Let $A_i$ be the average quality after i hires, with $A_0 = q$ the quality of the initial employee or,
more correctly, the average quality of all the present employees before the recruiting begins.
Note. $\lim_{i \to \infty} A_i = 1$.
Consider the gap sequence $G_i = 1 - A_i$. This converges to 0 as $i \to \infty$.
The initial gap is $G_0 = 1 - A_0 = 1 - q$.
The quality scores $q_i$ have a common uniform distribution on (0, 1), as all candidates are
equally likely to have any quality in the given interval. This also simplifies the computation
considerably. For $t \ge 0$, the distribution of $G_{i+t}$ given $G_i$ is as follows:
Let $U_1, U_2, \ldots$ be iid Unif(0, 1) random variables, independent of $G_i$.
Conditioning on $G_i$, $G_{i+t}$ is distributed as
$$G_{i+t} \mid G_i \;\sim\; G_i \prod_{j=1}^{t}\left(1 - \frac{U_j}{i+j+1}\right).$$
Hence, starting from the initial gap $G_0 = g$, we obtain:
$$G_n \sim g \prod_{j=1}^{n}\left(1 - \frac{U_j}{j+1}\right).$$

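This follows from the definitions above:

$$G_{i+t} = 1 - A_{i+t}.$$
$A_{i+t}$ is the average quality score of the $i + t + 1$ employees present after $i + t$ hires.
Writing $Q_{i+t}$ for the quality score of the $(i+t)$th hire, it is given by:
$$A_{i+t} = \frac{(i+t)A_{i+t-1} + Q_{i+t}}{i+t+1}$$

Now $Q_{i+t}$ is uniformly distributed on $(A_{i+t-1}, 1)$ or, in terms of the gap sequence, on
$(1 - G_{i+t-1}, 1)$, which can be rewritten as follows:
$$Q_{i+t} = 1 - G_{i+t-1}U_t, \qquad U_t \sim \mathrm{Unif}(0, 1).$$

So,
$$A_{i+t} = \frac{(i+t)A_{i+t-1} + 1 - G_{i+t-1}U_t}{i+t+1}$$
$$G_{i+t} = 1 - \frac{(i+t)A_{i+t-1} + 1 - G_{i+t-1}U_t}{i+t+1}$$
$$= \frac{(i+t)(1 - A_{i+t-1}) + G_{i+t-1}U_t}{i+t+1}$$
$$= \frac{(i+t)G_{i+t-1} + G_{i+t-1}U_t}{i+t+1}$$
$$= \frac{(i+t+U_t)\,G_{i+t-1}}{i+t+1}$$
$$= \left(1 - \frac{1-U_t}{i+t+1}\right)G_{i+t-1}.$$
Since $1 - U_t$ is again Unif(0, 1), each step multiplies the gap by a factor of the form
$1 - U/(i+t+1)$ with U uniform on (0, 1).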

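Each factor depends only on a fresh uniform random variable, which establishes the statistical
independence of the multiplicative factors of the gap sequence.
Under this strategy the expected gap after n hires is $O(g / n^{1/2})$, i.e. the average quality
approaches 1 at rate $n^{-1/2}$, which makes it considerably more practical to implement than the
previous two strategies.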
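
As a sanity check on the rate, the sketch below simulates hiring above the mean directly and compares the result with the expected gap obtained from the independent factors, $E[G_n] = g \prod_{j=1}^{n} \left(1 - \frac{1}{2(j+1)}\right)$, using $E[U_j] = 1/2$. It is an illustrative sketch, not the author's code: the function names, the starting quality 0.5 and the trial counts are assumptions.

```python
import random

def expected_gap(g, n):
    """E[G_n] = g * prod_{j=1..n} (1 - 1 / (2 (j + 1))), since E[U_j] = 1/2."""
    value = g
    for j in range(1, n + 1):
        value *= 1.0 - 0.5 / (j + 1)
    return value

def simulate_gap(q, n_hires, trials=5_000, seed=0):
    """Simulate hiring above the mean, starting from one employee of quality q."""
    rng = random.Random(seed)
    total_gap = 0.0
    for _ in range(trials):
        average, staff = q, 1
        for _ in range(n_hires):
            hire = rng.uniform(average, 1.0)  # next hire ~ Unif(current mean, 1)
            average = (staff * average + hire) / (staff + 1)
            staff += 1
        total_gap += 1.0 - average
    return total_gap / trials

q = 0.5  # illustrative initial quality (assumption)
print(f"n = 50: simulated {simulate_gap(q, 50):.4f}, exact mean {expected_gap(1 - q, 50):.4f}")
# Quadrupling n should roughly halve the expected gap if it scales like n^(-1/2).
print(f"E[G_1600] / E[G_400] = {expected_gap(1 - q, 1600) / expected_gap(1 - q, 400):.4f}")
```

The last line is a quick check of the $n^{-1/2}$ scaling: the ratio should be close to 1/2.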

10 Multiple employers

This variant, proposed by Karlin and Lei, poses the secretary problem in a slightly twisted manner.
Until now we had one employer and a pool of candidates from which to choose the best applicant.
Now we have a more realistic situation, with multiple employers trying to hire from
the same pool of candidates. The interviewing and decision process remains the same: the
candidates are interviewed in a random order, and an irrevocable decision to hire or reject
must be taken right after each interview, with no ties allowed.
We can follow two strategies in this process.

10.0.1 Hiring only the best

Hiring only the best involves a constrained payoff function in which employer j is satisfied
only by the best: his payoff is 1 iff he hires the best of all candidates and is 0 otherwise. So we
need to devise an optimal strategy that gives each employer the best chance of hiring the best of
all the candidates he can choose from. An employer who has hired is considered out of the hiring
process, along with the candidate he has hired.
We make a few assumptions here:
The rank of the employers is publicly known to each candidate who appears for the interview.
The candidates can be ranked relative to one another based on their quality scores at the time of
their interview, but their overall rank is not known beforehand. Also, the candidates
arrive in a random order not known to the employers beforehand.
As we will observe, in this case the optimal threshold for the highest-ranked employer is to
reject a fraction $e^{-1}$ of the candidates outright and then choose the next candidate who is the
best so far; the second-highest-ranked employer rejects a fraction $e^{-3/2}$ of the candidates, and
so on. So the lower the rank of the employer, the fewer candidates he rejects outright and the
more candidates he gets to choose from.

To solve this mathematically, we need to introduce a few variables.
$R_j(i)$ is the optimal risk of the jth employer, i.e. the probability of failing to find and hire
the best candidate, given that the first i applicants have been rejected. For this we assume that
the $j - 1$ higher-ranked employers follow their optimal strategies and that none of them has
already hired from the first i candidates. This risk function implicitly depends on n.
Let T be an integer between 0 and n. The threshold strategy T means that an employer rejects
the first T candidates and then chooses the next candidate who is the best so far.
It is possible that two or more employers send an offer letter to the same candidate.
In this case, the candidate's strategy is trivial: he accepts the offer of the highest-ranked
employer and rejects all other offer letters. The rejected employers have to continue the hiring
process.
We assume that there are k employers and n candidates. In the introduction to this strategy,
we hypothesized that the threshold of the highest-ranked employer is the highest, i.e.
$$T_1 \ge T_2 \ge \cdots \ge T_{k-1}$$
For employer k, there are two cases.
When $i \le T_{k-1}$:
$$R_k(i-1) = \frac{1}{i}\min\left(R_k(i),\; 1 - \frac{i}{n}\right) + \left(1 - \frac{1}{i}\right)R_k(i)$$
The above equation can be understood as follows. When the ith candidate is the best of all
those seen so far, which happens with probability $1/i$, we can either accept her, in which case
the risk of her not being the best overall is $1 - i/n$, or skip her and move on to the next
candidate, for which the risk becomes $R_k(i)$. If the ith candidate is not the best among those
seen so far, we simply move on to the next one, the risk for this case being $R_k(i)$.
When $i > T_{k-1}$:
$$R_k(i-1) = \frac{1}{i}R_{k-1}(i) + \left(1 - \frac{1}{i}\right)R_k(i)$$
Here, the chance that the ith candidate is the best so far is $1/i$, and since $i > T_{k-1}$ a
higher-ranked employer will want to hire that same candidate. That employer then leaves the
process, so only $k - 1$ of the employers $1, \ldots, k$ remain and the optimal risk
becomes $R_{k-1}(i)$. When the ith candidate is not the best so far, we move on to the next
candidate and the risk simply becomes $R_k(i)$.
As defined before, $T_j$ is the threshold of employer j. Employer k should start accepting a
best-so-far candidate as soon as the risk of continuing after rejecting the first i candidates,
$R_k(i)$, is at least the risk $1 - i/n$ of accepting the ith candidate. Hence, we can define
the threshold as follows:
$$T_k = \min\left\{\, i : R_k(i) \ge 1 - \frac{i}{n} \,\right\} - 1$$
$T_k$ exists because $R_k(n) = 1$. It can be checked that this threshold strategy is the optimal strategy.
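
The backward recursion above can be sketched in a few lines of Python. This is an illustrative reimplementation under the definitions in this section, not the code referred to in the appendix; the function name `best_only_thresholds` is an assumption. Employer 1 has no higher-ranked competitor, which the sketch encodes by treating its "previous threshold" as n, so that the first case always applies.

```python
def best_only_thresholds(n, k):
    """Thresholds [T_1, ..., T_k] and risks [R_1(0), ..., R_k(0)] for n candidates."""
    thresholds, start_risks = [], []
    prev_risk = None      # R_{j-1}(.), the table of the employer ranked just above
    prev_threshold = n    # employer 1: no competitor, so i <= T_0 always holds
    for _ in range(k):
        risk = [0.0] * (n + 1)
        risk[n] = 1.0     # rejecting all n candidates means certain failure
        for i in range(n, 0, -1):
            if i <= prev_threshold:
                # No higher-ranked employer hires yet: accept or skip, whichever is safer.
                if_best_so_far = min(risk[i], 1.0 - i / n)
            else:
                # A higher-ranked employer takes the best-so-far candidate.
                if_best_so_far = prev_risk[i]
            risk[i - 1] = if_best_so_far / i + (1.0 - 1.0 / i) * risk[i]
        # T = min{ i : R(i) >= 1 - i/n } - 1; the minimum exists because R(n) = 1.
        threshold = next(i for i in range(n + 1) if risk[i] >= 1.0 - i / n) - 1
        thresholds.append(threshold)
        start_risks.append(risk[0])
        prev_risk, prev_threshold = risk, threshold
    return thresholds, start_risks

thresholds, risks = best_only_thresholds(n=10, k=3)
print("thresholds:", thresholds)
print("success probability of employer 1:", round(1 - risks[0], 4))
```

For n = 10 the first three thresholds come out as 3, 2 and 1, and employer 1 recovers the classical single-employer success probability of about 0.399.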

Table 2: Hiring only the best - Multiple employers (n = 10)

Employer    Threshold T (n = 10)    Ratio (T/n)
1           3                       0.3
2           2                       0.2
3           1                       0.1
4           1                       0.1
5           1                       0.1
6           1                       0.1
7           1                       0.1
8           1                       0.1
9           0                       0.0
10          0                       0.0

Table 3: Hiring only the best - Multiple employers (n = 3000)

Employer    Threshold T (n = 3000)    Ratio (T/n)
1           1103                      0.368
2           669                       0.223
3           423                       0.141
4           273                       0.091
5           178                       0.059
6           117                       0.039
7           77                        0.026
8           51                        0.017
9           34                        0.011
10          23                        0.008

10.0.2 Hiring at or above Employer’s Rank

This is the second strategy for the multiple-employer variation of the secretary problem, in
which each employer again has a single secretarial position. The payoff
is 1 when the employer of rank j ends up hiring one of the
candidates ranked from 1 to j, i.e. at worst a candidate of his own rank ("worst own rank" hiring).
Again we introduce some variables to proceed with solving this.
Employers, 1 through k.
n candidates.
Hiring status, $x = (x_1, x_2, \ldots, x_k)$.
Rank of the ith applicant, $r_i$.
Relative rank of the ith applicant, $rr_i$.
$R_j(i, x)$, the optimal risk of the jth employer, i.e. the probability of failing to hire a candidate
of rank j or better, under the hiring status x, given that the first i candidates have been rejected.

Auxiliary functions
HRR(i, j, x): the highest relative rank at which the ith applicant would be hired by one of
employers 1 through j, assuming that none of them has already hired. If no such relative rank
exists, HRR is 0.
AE(i, t, j, x), the Accepted Employer: the ith applicant, with relative rank t, has been given
an offer letter by employer j, who has not yet hired under the hiring status x, and the
applicant accepts this offer.
NX(i, t, j, x) is the new hiring status that results from AE(i, t, j, x): a copy of the hiring
status x in which we set $x_j = 1$, denoting that the jth employer has hired a candidate at worst
of his own rank and is now out of the hiring process.
Another thing we will need in order to compute $R_j(i, x)$ is the probability that the ith
candidate, with a given relative rank t, has an overall rank worse than, say, c. This probability
follows a hypergeometric distribution and can be written as follows:
$$P(r_i > c \mid rr_i = t) = \sum_{m=0}^{t-1} \frac{\binom{c}{m}\binom{n-c}{i-m}}{\binom{n}{i}}$$

Here we sum the probabilities of all the ways in which the first i applicants contain
$m < t$ of the top c applicants: the ith applicant, being the tth best of the first i, ranks worse
than c exactly when fewer than t of the top c are among them. This value is non-increasing in i.
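
This probability is simple to evaluate with binomial coefficients. The sketch below is illustrative rather than the author's code, and the function name `prob_rank_worse_than` is an assumption. It sums over m = 0, ..., t − 1 with the weight $\binom{c}{m}\binom{n-c}{i-m}/\binom{n}{i}$, on the reasoning that an applicant who is the tth best of the first i ranks worse than c exactly when fewer than t of the top c applicants are among those first i.

```python
from math import comb

def prob_rank_worse_than(n, i, t, c):
    """P(r_i > c | rr_i = t) for n applicants, with 1 <= t <= i <= n and 0 <= c <= n.

    Hypergeometric sum over m, the number of top-c applicants among the first i.
    """
    if not (1 <= t <= i <= n and 0 <= c <= n):
        raise ValueError("need 1 <= t <= i <= n and 0 <= c <= n")
    ways = sum(comb(c, m) * comb(n - c, i - m) for m in range(t))
    return ways / comb(n, i)

# With t = 1 and c = 1 this reduces to the classical risk 1 - i/n that the
# best applicant so far is not the best overall.
print(prob_rank_worse_than(n=10, i=4, t=1, c=1))
```

For t = 1 and c = 1 the sum has a single term, $\binom{n-1}{i}/\binom{n}{i} = 1 - i/n$, which is the accept-risk used in the "hiring only the best" recursion.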
Let s = HRR(i, j, x).
The optimal risk of the jth employer, given that the first i − 1 applicants have been rejected
under the hiring status x, is given as follows:
$$R_j(i-1, x) = \frac{1}{i}\left(\sum_{t=1}^{s} R_j\big(i, NX(i, t, j-1, x)\big) + \sum_{t=s+1}^{i} \min\big(R_j(i, x),\; P(r_i > j \mid rr_i = t)\big)\right)$$
We take the initial condition as $R_j(n, x) = 1$, i.e. the risk of failing to hire a candidate of
rank j or better is 1 given that all n candidates have been rejected under a hiring status x. We
have to find $R_j(0, 0)$, and we use backward recursion on the above equation to obtain it.
The above equation can be understood as follows. When the relative rank of the ith candidate is
at most s, one of the higher-ranked employers 1 through j − 1 who have not hired yet will hire
that candidate, and thus we sum the risks under the resulting new hiring status over those
relative ranks. Otherwise, employer j can choose to hire the candidate, in which case the risk
becomes the probability that the ith candidate's overall rank is worse than j, or he can reject
the ith candidate, in which case the risk becomes $R_j(i, x)$.

11 Conclusion

This report studies the very popular Secretary Problem and investigates the various strategies
used to approach a solution to it. Although no single definitive solution has been proposed for it,
several optimal strategies have been devised. As studied above, quite a few variants based on
each condition of the classic secretary problem have cropped up over the years. The report has
discussed some of these variants. Many more have been studied, and many remain
open for further work.

Future Work

A lot of work is yet to be done in this direction. The upcoming variants of this problem have
great practical scope and may prove beneficial in quite a few real-life everyday situations.
Future work involves studying, interpreting and improving them to the best of the knowledge
and resources available.
Best-choice problems arise each time a choice has to be made. The natural instinct is to
choose whichever option looks best among the given options. This is
the strategy that most recruiters follow, and no complicated mathematics is really used when
actually choosing the presumably best candidate. This is mainly because the same option is not
necessarily considered the best by all employers: each employer has different requirements,
including timing, and the flexibility and convenience of the candidate matter as well.
Also, if employers actually did follow the strategy of rejecting an optimal number of candidates
and then choosing the best candidate, no one would want to be interviewed in the first few turns.
This strategy is also rather unfair to the candidates rejected initially; it is
more employer-friendly. We need to propose a strategy which is both employer- and
candidate-friendly.
The best choice problem can also be extended to the pricing of American options. A lot of work
has already been done in this direction but there's always more scope. The flexibility of American
options makes them all the more alluring: an American option may be exercised at any time up to
expiry, whereas a European option can be exercised only at expiry and so lacks this flexibility.
Deciding when to exercise is tricky, and it is itself a best-choice (optimal stopping) decision.
Future work includes working on the pricing of such American options.

12 Appendices

Figure 3: Code : Hiring only the best - Multiple employers

13 References

1. https://en.wikipedia.org/wiki/Secretary_Problem
2. FERGUSON, T. (1989). Who solved the secretary problem? Statistical Science 4(3) 282-289.
3. KARLIN, E. and LEI, E. On a competitive secretary problem.
4. GILBERT, J. and MOSTELLER, F. (1966). Recognizing the maximum of a sequence. J.
Amer. Statist. Assoc. 61 35-73.
5. LINDLEY, D. V. (1961). Dynamic programming and decision theory. Appl. Statist. 10
39-51.
6. MOSER, L. (1956). On a problem of Cayley. Scripta Math. 22 289-292.
7. PETRUCCELLI, J. D. (1980). On a best choice problem with partial information. Ann.
Statist. 8 1171-1174.
8. SAMUELS, S. M. (1985). A best-choice problem with linear travel cost. J. Amer. Statist.
Assoc. 80 461-464.
9. RASMUSSEN, W. T. and PLISKA, S. R. (1976). Choosing the maximum from a sequence
with a discount function. Appl. Math. Optim. 2 279-289.
10. SAKAGUCHI, M. (1976). Optimal stopping problems for randomly arriving offers. Math.
Japon. 21 201-217.
11. STEWART, T. J. (1978). Optimal selection from a random sequence with learning of the
underlying distribution. J. Amer. Statist. Assoc. 73 775-780.
