Stochastic Scheduling With Abandonments Via Greedy Strategies
Abstract
Motivated by applications where impatience is pervasive and service times are uncertain,
we study a scheduling model where jobs may depart at an unknown point in time and service
times are stochastic. Initially, we have access to a single server and n jobs with known non-
negative values: these jobs have unknown stochastic service and departure times with known
distributional information, which we assume to be independent. When the server is free, we
can run an available job which occupies the server for an unknown amount of time, and collect
its value. The objective is to maximize the expected total value obtained from jobs run on
the server. Natural formulations of this problem suffer from the curse of dimensionality. In
fact, this problem is NP-hard even in the deterministic case. Hence, we focus on efficiently
computable approximation algorithms that can provide high expected reward compared to the
optimal expected value. Towards this end, we first provide a compact linear programming (LP)
relaxation that gives an upper bound on the expected value obtained by the optimal policy.
Then we design a polynomial-time algorithm that is nearly a (1/2) · (1 − 1/e)-approximation to
the optimal LP value (so also to the optimal expected value). We next shift our focus to the
case of independent and identically distributed (i.i.d.) service times. In this case, we show that
the greedy policy that always runs the highest-valued job whenever the server is free obtains a
1/2-approximation to the optimal expected value. Our approaches extend effortlessly and we
demonstrate their flexibility by providing approximations to natural extensions of our problem.
Finally, we evaluate our LP-based policies and the greedy policy empirically on synthetic and
real datasets.
1 Introduction
Stochastic scheduling, a fundamental problem in Operations Research and Computer Science, deals
with the allocation of resources to jobs under uncertainty. In this paper, we consider the following
single-server scheduling problem: a service system receives a collection of n jobs to be run sequen-
tially. Each job j has a service time sj and provides a value vj upon completion. The goal is to
select a subset of jobs in order to maximize the sum of values obtained. Such scheduling decisions
are often subject to constraints; for example, capacity constraints where the server can only run a
fixed number of jobs due to resource limitations, or certain jobs may have time-based constraints
such as specific operating hours or time windows during which service can be provided. The setting
where each job must be scheduled by a fixed deadline is captured by the classic NP-hard knapsack
problem [47]. In practice, the exact service time of a job may not be known; however, we may
assume (for example, using historical data) that the probability distribution of the service time is
known. This stochastic variant, termed stochastic knapsack, was introduced in [19], and has been
extensively studied in several communities (see, for example, [9, 21]).
In many customer-facing systems, an additional challenge arises due to impatient customers.
Extended wait times can lead to customers leaving the system without receiving service, driven by
factors such as dissatisfaction, or due to having access to alternative options. This can result in a
loss of revenue in several settings such as healthcare, call centers, and online service platforms (see,
for example, [1, 2, 4, 14], and the references therein).
Motivated by this, we study sequential stochastic scheduling problems comprising two sources
of uncertainty: jobs have a stochastic service time, and jobs may leave the system at a random
time. We model impatience in a Bayesian worst-case manner: we assume that, for each job j, we
are given a distribution that represents the probability that job j stays in the system for at least t
units of time. In particular, we assume:
• the service time of each job j is an independent random variable which is distributed according
to a known probability distribution, and
• each job j independently stays in the system for an unknown random amount of time, following
a known probability distribution.
Informally speaking, there is a single server and n jobs queue up for service. Whenever the server is
free, an available job, say j ∈ [n], is selected to be run, and the corresponding value vj is obtained
as reward. The server is then rendered busy for a random amount of time that corresponds to the
service time of job j. In the meantime, each of the remaining jobs may depart from the system.
This process repeats until there are no jobs remaining in the system. We refer to this problem as
stochastic scheduling with abandonments, denoted as StocSchedX. We believe that StocSchedX
is a widely applicable model. Here are two concrete applications:
Call Centers. A typical occurrence in call centers is customers waiting to be served. If the wait
becomes too long, an impatient customer may choose to hang up. The duration a customer spends
interacting with the call center is contingent on the nature of the call; for instance, opening a new
account in a bank call center might require less time than a loan payment. Existing models often
assume service and abandonment times are independent and identically distributed draws from
exponential distributions [8, 24]. However, empirical studies [12, 41], suggest that service times and
abandonment times are not necessarily exponentially distributed. Moreover, the utility of a caller
can be modeled as a function of waiting cost and value of the service associated with the call [2],
or by assigning a reward under some conditions; for example, if customers are served within some
fixed time windows [51].
On-Demand Service Platforms. On-demand service platforms have experienced tremendous
growth in the last few years [54]. In online platforms offering services like freelancing or task
assignments (for example, TaskRabbit or UpWork), ride-hailing (Uber or Lyft), or food delivery
(such as DoorDash or Meituan), service completion times can be uncertain due to factors like
task complexity, provider availability, or traffic conditions. Customers may abandon requests if
response is slow. Current models often assume that customer arrivals follow a Poisson process, and
that service and abandonment times are exponentially distributed [55, 57]. However, this may not
always be the case; for example, a recent empirical study on Meituan, a food delivery service in
China, shows that longer waiting times for service production generally increase the likelihood
of abandonment, while longer delivery times do not [59]. The utility associated with a customer
can be modeled in several ways; for example, we may assign a fixed price/wage rate per service
unit [6, 7] or use a dynamic (system-state-contingent) rate [13].
Our model makes mild assumptions about service and departure distributions (see §1.1) and
assigns a value to each job, making it a practical tool for modeling a diverse array of applications,
including the ones mentioned above. In fact, StocSchedX generalizes the deterministic knapsack
problem: suppose that for all jobs j, the probability that job j stays in the system for t > T −sj units
of time is 0, where T > 0 and sj is the deterministic service time for job j. This is exactly a knapsack
instance with deadline T , proving that StocSchedX is NP-hard. As a consequence of this, we aim
to design polynomial-time algorithms with provable guarantees for the problem. A natural approach
would be to study StocSchedX under the competitive analysis framework [11], where we compare
ourselves to a “clairvoyant” adversary who knows the random outcomes of departure and service
times in advance. Unfortunately, the following example shows that an optimal algorithm (even
with unbounded computational power) for StocSchedX gets at most O(1/n) times the expected
value obtained by a clairvoyant adversary.
Example 1.1. Consider n identical jobs such that each job j ∈ [n] has vj = 1, departs at time n
with probability 1, and has service time distributed uniformly on {1, n}. Observe that, in expectation, n/2 jobs
have a service time of 1, and a clairvoyant algorithm obtains expected total reward of n/2. On the
other hand, any algorithm that does not know the realizations of the service times obtains expected
total reward of at most 2 (in expectation, the second job that is selected has service time equal to
n).
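To see these numbers concretely, the following sketch (our own illustration; the helper names are hypothetical) estimates both benchmarks for the instance of Example 1.1 by simulation.

```python
import random

def clairvoyant_reward(n, services):
    # Knows all realizations: runs only the fast (service time 1) jobs.
    # Each takes one time unit and all jobs stay until time n, so every
    # fast job fits before the common departure time.
    return sum(1 for s in services if s == 1)

def nonclairvoyant_reward(n, services):
    # Without knowing realizations, all jobs look identical, so any fixed
    # order is optimal: keep running jobs while the server frees up by time n.
    t, reward = 1, 0
    for s in services:
        if t > n:
            break
        reward += 1
        t += s
    return reward

random.seed(0)
n, trials = 50, 20000
cv = ncv = 0.0
for _ in range(trials):
    services = [random.choice([1, n]) for _ in range(n)]
    cv += clairvoyant_reward(n, services)
    ncv += nonclairvoyant_reward(n, services)
print(cv / trials)   # close to n/2 = 25
print(ncv / trials)  # close to 2, independently of n
```

The simulated averages match the analysis: the clairvoyant benchmark grows linearly in n, while the non-clairvoyant reward stays bounded.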
This motivates us to utilize a more refined benchmark, and thus, we compare our algorithm
against an optimal online (non-clairvoyant) algorithm. We note that an optimal online algorithm
for StocSchedX can be formulated as a stochastic Markov decision process (MDP) [50], which
can be computed in exponential time. So, the main question of our work is:
Can we design polynomial time algorithms that obtain provably good expected total
value compared to an optimal online algorithm for StocSchedX while making mild
assumptions on the service time and impatience distributions?
Our main result lies in designing polynomial-time algorithms for StocSchedX that achieve
a constant-factor approximation to the online optimum. We summarize our main contributions
below.
We denote by Rt the set of jobs that are still in the system at time t. We assume that all random variables (service times and
departure times) are independent. We also assume that time starts at t = 1, with the server being
free. We say that the server is busy when it is running a job, and say it is free otherwise. When the
server is free, it can run a job from the set of available jobs; that is, those jobs that have not yet
been run and are still in the system. Upon starting job j, the service time Sj of job j is revealed,
and we obtain a reward of vj . The server is then rendered busy for exactly Sj units of time. We
assume that jobs cannot leave the system once service has begun, and preemptions are not allowed
(see Figure 1 for an illustration). We use $\bar F_{S_j}(t) := \Pr(S_j > t)$ to denote the tail distribution of
$S_j$. We assume that there is $T = T(I) < +\infty$ such that $\bar F_{S_j}(T) = 0$ for all $j \in [n]$. For $D_j$, we
use $p_j(t) := \Pr(D_j \ge t)$ to denote the probability that job $j$ remains in the system for at least $t$ units
of time. We also assume that $p_j(t) > 0$ for all $t \ge 1$. We note that the assumptions on $S_j$ and $D_j$
can be made at the expense of an arbitrarily small loss in the objective value (see Appendices A.1
and A.2 for details).
Time (t):              1      1 + Sj           1 + Sj + Sj′            · · ·
Available jobs:        [n]    R1+Sj            R1+Sj+Sj′               · · ·
Job run:               j      j′               j′′                     · · ·
Reward up to time t:   vj     vj + vj′         vj + vj′ + vj′′         · · ·

Figure 1: A sample execution. We select job j at time t = 1, which yields value vj and renders the system
busy for Sj time units. Once the server is free, we select job j′ from the set of available jobs and obtain
(additional) value vj′, and so on.
Given an instance I, a policy Π(I) is a function that dictates, at each time the server is free,
which available job (if any) to run on the server. Note that Π takes as input the time t, the available jobs
Rt, and the current state of the server (free or busy). We denote by v(Π(I)) the expected value
obtained by following policy Π(I). We seek a policy that solves

$$\max_{\Pi}\; v(\Pi(I)), \tag{1}$$

i.e., a policy that maximizes the expected value collected. We denote by OPT(I) an optimal policy
on the instance I, i.e., one that solves (1). If the instance is clear from the context, we simply write Π and
OPT instead of Π(I) and OPT(I), respectively.
Even though, in theory, we can solve (1) and compute OPT (see, e.g., [50]), we seek algorithms that
run efficiently and can give provably good approximations to (1).
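The dynamics just described can be rendered as a small discrete-time simulator. This is a minimal sketch under our own encoding choices (explicit samplers and a `policy` callback; none of these names come from the paper), useful for evaluating any policy empirically.

```python
import random

def simulate(values, sample_service, sample_departure, policy, horizon, rng):
    """One sample run of the single-server system.

    policy(t, available) returns the index of the job to run, or None.
    """
    n = len(values)
    departures = [sample_departure(j, rng) for j in range(n)]
    available = set(range(n))
    t, busy_until, reward = 1, 0, 0.0
    while t <= horizon and available:
        # Jobs abandon the system once their departure time has passed.
        available = {j for j in available if departures[j] >= t}
        if t > busy_until and available:
            j = policy(t, available)
            if j is not None:
                reward += values[j]                     # value collected on start
                busy_until = t + sample_service(j, rng) - 1
                available.discard(j)                    # running jobs cannot leave
        t += 1
    return reward

# Greedy-by-value policy on a toy instance.
rng = random.Random(1)
values = [3.0, 2.0, 1.0]
greedy = lambda t, avail: max(avail, key=lambda j: values[j])
r = simulate(values,
             sample_service=lambda j, rng: rng.choice([1, 2]),
             sample_departure=lambda j, rng: rng.randint(1, 4),
             policy=greedy, horizon=10, rng=rng)
print(r)
```

Averaging `simulate` over many runs estimates v(Π(I)) for the given policy.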
We also consider several extensions of StocSchedX that are motivated by classic scheduling
problems as well as real-world applications. One extension to StocSchedX adds a job-dependent
strict deadline Bj ≥ 1 for each individual job j ∈ [n], allowing us to differentiate between departure
and deadline. Under this extension, we only collect revenue from jobs that finish running before
their deadline. In another extension, we introduce an operational knapsack constraint on the
server that is independent of service and departure times. In this context, each job j
is assigned a weight wj ≥ 0, and the total weight of the jobs run on the server must not surpass a
specified capacity W . See §3 for details.
Theorem 1.1 (Upper bound on the online optimum). Let I be an instance of StocSchedX and
let Sj be bounded by T with probability 1 for all j ∈ [n]. Then, the optimal value of the following
linear program (denoted as LP-Sched) with n² · T variables and n · (1 + T ) constraints gives an
upper bound on the online optimum.
$$\begin{aligned}
\text{maximize}\quad & \sum_{t=1}^{nT} \sum_{j=1}^{n} v_j\, x_{j,t} && \text{(LP-Sched)}\\[2pt]
\text{subject to}\quad & \sum_{t=1}^{nT} \frac{x_{j,t}}{p_j(t)} \le 1 && \forall j \in [n] \qquad (a)\\[2pt]
& \sum_{\tau \le t} \sum_{j=1}^{n} x_{j,\tau}\, \bar F_{S_j}(t-\tau) \le 1 && \forall t \in [nT] \qquad (b)\\[2pt]
& x_{j,t} \ge 0 && \forall j \in [n],\; t \in [nT]
\end{aligned}$$
For each instance I, our (LP-Sched) has one variable xj,t per job j and time t, which can be
interpreted as the probability that the optimal policy runs job j at time t. Hence, the objective
value is $\sum_{j=1}^{n}\sum_{t\ge 1} v_j x_{j,t}$. (LP-Sched) has two types of constraints that are deduced from natural
constraints that any policy for StocSchedX must satisfy: (1) no job can be run more than once,
and (2) at any given time, the server is occupied by at most one job. Our LP satisfies these
constraints in expectation; hence, it provides a valid upper bound on the online optimum, v(OPT).
We defer the details to §2.1.
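To make the relaxation concrete, the sketch below assembles the objective and Constraints (a) and (b) as explicit coefficient arrays and checks feasibility of a candidate solution. The helper names (`build_lp`, `feasible`) and the toy instance are our own; in practice a standard LP solver would be applied to these arrays.

```python
def build_lp(values, p, F_bar, horizon):
    """Construct (LP-Sched) over variables x[j][t-1], j in [n], t in 1..horizon.

    p[j](t) plays the role of Pr(D_j >= t); F_bar[j](d) of Pr(S_j > d).
    Returns the objective coefficients c and a list of (coeffs, bound) rows.
    """
    n = len(values)
    c = [[values[j]] * horizon for j in range(n)]
    rows = []
    # (a): each job is run at most once, in expectation.
    for j in range(n):
        coeffs = [[0.0] * horizon for _ in range(n)]
        for t in range(1, horizon + 1):
            coeffs[j][t - 1] = 1.0 / p[j](t)
        rows.append((coeffs, 1.0))
    # (b): at each time, the server is occupied by at most one job.
    for t in range(1, horizon + 1):
        coeffs = [[0.0] * horizon for _ in range(n)]
        for tau in range(1, t + 1):
            for j in range(n):
                coeffs[j][tau - 1] = F_bar[j](t - tau)
        rows.append((coeffs, 1.0))
    return c, rows

def feasible(x, rows, tol=1e-9):
    return all(sum(co[j][t] * x[j][t] for j in range(len(co))
                   for t in range(len(co[0]))) <= b + tol
               for co, b in rows)

# Toy check: n unit-value jobs, S_j = 1, D_j = 2; x[j][0] = 1/n is feasible.
n, horizon = 4, 3
values = [1.0] * n
p = [lambda t: 1.0 if t <= 2 else 1e-9 for _ in range(n)]   # p_j(t) > 0 always
F_bar = [lambda d: 1.0 if d < 1 else 0.0 for _ in range(n)]
c, rows = build_lp(values, p, F_bar, horizon)
x = [[1.0 / n] + [0.0] * (horizon - 1) for _ in range(n)]
print(feasible(x, rows))  # True
```

Note that this candidate solution attains objective value 1, saturating Constraint (b) at t = 1.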
Our next result demonstrates that the objective value of (LP-Sched) is at most a constant
factor from v(OPT). We provide an algorithm that uses optimal solutions from the LP.
Theorem 1.2 (Approximation guarantee). There is an algorithm ALG that, for any instance of
StocSchedX, guarantees v(ALG) ≥ (1/2) · (1 − 1/e) · vLP .
A naïve approach uses the optimal solution x∗j,t to (LP-Sched) as the probability of running job
j at time t, hoping to obtain expected value equal to the value of the optimal solution. However,
note that Constraints (a) and (b) only hold in expectation. So, this naïve implementation runs into
two critical issues: (1) running a job at time t can decrease the chance of running a job at time
τ > t, leading to dependencies in the decisions across times, and (2) the mechanism to select one
job at each time is unclear, since sampling according to the x∗j,t values may lead to multiple jobs
being selected, while our server can run only one job at a time. We get around these challenges as
follows. First, we scale down the probability that a job is available to be run using a scaling akin to
simulation-based attenuation [22, 40]. The scaling guarantees that the probability that the server
is free at any given time is at least 1/2. To get around the other issue, we create a “consideration
set” using an optimal solution to the (LP-Sched) and the set of available jobs. Then, from this
consideration set, we greedily run the highest-valued job. This greedy step ensures that we get at
least a (1 − 1/e) fraction of the value obtained by the LP at the corresponding time. Furthermore,
we show that our analysis is almost tight via an instance where v(ALG) is at most $(1 - 1/\sqrt{e}) \approx 0.39$
times vLP . Finally, in §2.4, we present an instance such that v(OPT) ≤ (1 − 1/e + ε)vLP ; hence,
we cannot expect to devise an algorithm with an approximation guarantee better than 1 − 1/e, at
least using our LP.
The scaling factors used by ALG at time t require the simulation of events that happen up to
time t − 1. This simulation could require exponential time to be computed accurately. However,
if we assume that the service times have support in [0, nk ], for a fixed k, our algorithm can be
run in polynomial time. In short, we can use Monte-Carlo approximations [56] to estimate the
scaling factors at the expense of a small multiplicative loss in the objective. The following corollary
formalizes this result; the details are presented in §2.3.
Corollary 1.3 (Polynomial-time approximation ratio). Assume that the support of the service
times is contained in $[0, n^k]$ for a fixed k, and denote vmax = maxj∈[n] vj . Fix ε > 0; then there
is an algorithm ALGε that obtains an expected total reward of at least (1/2) · (1 − 1/e − ε) · v(OPT)
and runs in time $6 n^{2+k} v_{\max} \ln(2 n^{2+k}/\varepsilon)/\varepsilon^3$.
We obtain a better approximation guarantee when considering the special case where the service
times of jobs are independent and identically distributed (i.i.d.). We show that the greedy solution
that runs the highest-valued job whenever the server is free obtains a 1/2-approximation to the
optimal policy.
Theorem 1.4 (Greedy policy for i.i.d. service times). The greedy policy that, from the set of
available jobs, runs the one with the largest value, obtains an expected total value at least 1/2 times
the expected total value obtained by an optimal policy for StocSchedX when the service times of
the jobs are i.i.d.
To show this result, we use coupling and charging arguments. Using a coupling argument,
we ensure that the server is free at the same time steps under both the greedy and the optimal
policies. Furthermore, via a charging argument, we show that, whenever the greedy policy runs a
job, twice the value collected by the greedy policy suffices to account for the value collected by the
optimal policy at the same time and one step in the future. We present the proof in Appendix C.
Finally, we note that our analysis for the greedy algorithm is tight, as demonstrated by the following
example.
Example 1.2. Consider the following instance with n = 2 jobs. We set the parameters of the jobs
as follows: v1 = 1 + ε, D1 = 2 with probability 1; v2 = 1 and D2 = 1 with probability 1. Let S1
and S2 be 1 with probability 1. Observe that the greedy algorithm (serving the highest valued job)
obtains (1 + ε)/(2 + ε) times the value obtained by an optimal policy.
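A few lines confirm the arithmetic of Example 1.2. The `run` helper, which replays a fixed priority order under the deterministic dynamics, is our own illustrative device:

```python
def run(order, values, departures, services):
    # Deterministic dynamics of Example 1.2: attempt jobs in the given order,
    # skipping any job that has already departed when the server frees up.
    t, reward = 1, 0.0
    for j in order:
        if departures[j] >= t:
            reward += values[j]
            t += services[j]
    return reward

eps = 0.01
values = [1 + eps, 1.0]   # job 1, job 2
departures = [2, 1]       # D_1 = 2, D_2 = 1
services = [1, 1]

greedy = run([0, 1], values, departures, services)   # highest value first
opt = run([1, 0], values, departures, services)      # job 2, then job 1
print(greedy, opt, greedy / opt)                     # ratio (1+eps)/(2+eps)
```

As ε → 0 the ratio tends to 1/2, matching the tightness claim for Theorem 1.4.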
Due to the inherent simplicity and practicality of greedy solutions, it might be tempting to
formulate greedy algorithms for StocSchedX. However, as the subsequent examples illustrate,
these intuitive, natural greedy approaches — such as prioritizing jobs based on their value or on
the ratio of value to expected service time — do not yield favorable results.
Example 1.3. Consider the greedy algorithm that runs the job with the largest value. Consider
the following instance of n jobs. We set the parameters of job 1 as follows: v1 = 1 + ε, D1 = n + 1
and S1 = n with probability 1. For each job j ∈ {2, . . . , n}, we have vj = 1, Dj = n and Sj = 1
with probability 1. Observe that the greedy algorithm runs job 1 at time t = 1 and obtains a value
of 1 + ε, while the optimal solution obtains a total value of n + ε by first serving jobs 2, . . . , n, and
then job 1.
Example 1.4. Consider the greedy algorithm that runs the job with the largest ratio of value to
expected service time. Consider the behavior of this algorithm on the following instance with n = 2.
We have v1 = 1 + ε, D1 = K + 2 and S1 = 1 with probability 1, and v2 = K, D2 = 1 and S2 = K
with probability 1. The greedy algorithm runs job 1 at time t = 1, and loses out on the value of job
2 (which leaves the system at t = 2). Thus, the greedy algorithm obtains a value of 1 + ε, while the
optimal solution obtains a total value of K + 1 + ε.
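Both counterexamples can be replayed with a deterministic helper (our own; the `run` function simply attempts jobs in a fixed order, skipping departed ones):

```python
def run(order, values, departures, services):
    t, reward = 1, 0.0
    for j in order:
        if departures[j] >= t:   # job j still present when the server is free
            reward += values[j]
            t += services[j]
    return reward

eps = 0.01

# Example 1.3 (n = 5): greedy-by-value runs the big slow job first.
n = 5
vals = [1 + eps] + [1.0] * (n - 1)
deps = [n + 1] + [n] * (n - 1)
svcs = [n] + [1] * (n - 1)
greedy = run([0] + list(range(1, n)), vals, deps, svcs)
opt = run(list(range(1, n)) + [0], vals, deps, svcs)
print(greedy, opt)        # 1 + eps versus n + eps

# Example 1.4: greedy-by-(value / expected service) picks job 1, losing job 2.
K = 10
vals2, deps2, svcs2 = [1 + eps, K], [K + 2, 1], [1, K]
greedy2 = run([0, 1], vals2, deps2, svcs2)
opt2 = run([1, 0], vals2, deps2, svcs2)
print(greedy2, opt2)      # 1 + eps versus K + 1 + eps
```

Both gaps grow without bound (in n and in K, respectively), so neither greedy rule admits a constant-factor guarantee in general.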
Our approaches can be easily extended to tackle natural extensions of StocSchedX. Our first
extension includes job-dependent deadlines where no value is obtained if the job finishes after its
deadline. This model generalizes the classic stochastic knapsack problem [19] by allowing jobs to
leave the system. Our next extension involves introducing a knapsack constraint on the jobs that
the server can run. In this case, each job j has a weight wj ≥ 0, and the cumulative weight of
jobs run on the server cannot exceed a predetermined capacity W . This extension also captures a
cardinality constraint when wj = 1 for all j ∈ [n] and W = k ∈ Z. For each of these extensions,
we propose modifications to LP-Sched that give valid upper bounds as in Theorem 1.1, as well as
results analogous to Theorem 1.2 and Corollary 1.3. We also demonstrate that our greedy approach
yields approximations for special cases of the proposed extensions. We summarize the final
approximation guarantees in Table 1 and provide a detailed description of these extensions and the
modifications to our techniques in §3.
Table 1: Extensions considered and their approximation guarantees. We ignore the error introduced
to make the running time polynomial in n and T . We use wmax to denote maxj∈[n] wj in the
knapsack sub-extension.
they become too impatient. In particular, the famous cµ rule (which runs the job with the largest
product of holding cost per unit time (c) and service rate (µ)) was extended to the cµ/θ rule for
exponential patience-time distributions in [5], where θ denotes the abandonment rate,
and was further exploited in a dynamic fluid scheduling setting [39]. There is a large body of work
at the intersection of queueing theory and job scheduling considering settings including service-level
differentiation [29], the shortest-remaining-processing-time (SRPT) scheduling policy [23], and setup
costs [31]. The objective in these works is typically regret minimization; in our case, we consider
approximation guarantees in a nonstationary environment.
Central to our analysis for general service times is the use of LP relaxations. LP relaxations
have been used extensively in job scheduling [32, 45], stochastic knapsack [10, 19, 28, 40] and other
related problems [3, 17, 38]. Closest to our work is [18], which studies instances of StocSchedX
where jobs depart at a geometrically distributed random time and each job's service time is 1. Our model
extends theirs by incorporating stochastic service times and a general stochastic departure process.
The authors of [18] present an LP-based algorithm that guarantees a 0.7-approximation; however,
this approach applies only when all service times are 1.
Our model exhibits parallels with online bipartite matching as well, where the single server
can be interpreted as an offline vertex and jobs as online vertices. In contrast to the majority of
scenarios, where jobs arrive online (for example, [42]), we consider stochastic departures for jobs
in the system. Moreover, our model resembles the setting where the offline vertex can be treated
as a reusable resource [22, 27, 52]. Our model also shares similarities with adaptive combinatorial
optimization, in particular the stochastic knapsack problem introduced in [20]. The study of
adaptivity starts with the work of Dean et al. [19], which adopts a competitive analysis perspective;
however, their work focuses on the adaptivity gap as opposed to the design of approximations. Note
that fully adaptive solutions can be obtained using Markov decision processes [50].
1.4 Organization
The remainder of the paper is organized as follows. Section 2 introduces the LP relaxation and
an LP-induced efficient algorithm with provable approximation guarantees for any StocSchedX
instance. Section 3 demonstrates the flexibility of our model, which can readily accommodate additional
constraints with minor modifications. Section 4 offers empirical validation of our policies via nu-
merical experiments. Section 5 summarizes the paper and discusses potential future work.
Lemma 2.1. Let I be any instance of StocSchedX, and let Π be any policy that determines which
job to run next (if any) when the server is free. Then, vLP (I) ≥ v(Π(I)).
Proof. Given an instance I of StocSchedX and a feasible policy Π, let xj,t denote the probability
that the policy Π runs job j at time t. Then, $v(\Pi) = \sum_{t=1}^{T}\sum_{j=1}^{n} x_{j,t} v_j$. To finish the proof, we
need to show that $x = \{x_{j,t}\}_{j\in[n],\,t\in[T]}$ is a feasible solution to LP (LP-Sched). Fix j ∈ [n], and
observe that for any t ≥ 1, we have

$$\begin{aligned}
\frac{x_{j,t}}{p_j(t)} &= \Pr(\Pi \text{ runs } j \text{ at } t \mid D_j \ge t) = 1 - \Pr(\Pi \text{ does not run } j \text{ at } t \mid D_j \ge t)\\
&\le 1 - \Pr(\Pi \text{ runs } j \text{ at some } \tau < t \mid D_j \ge t) = 1 - \sum_{\tau < t} \Pr(\Pi \text{ runs } j \text{ at } \tau \mid D_j \ge t)\\
&= 1 - \sum_{\tau < t} \Pr(\Pi \text{ runs } j \text{ at } \tau \mid D_j \ge \tau) = 1 - \sum_{\tau < t} \frac{x_{j,\tau}}{p_j(\tau)},
\end{aligned}$$

where the first inequality holds since Π can run job j only once, and the penultimate equality
follows from the fact that Π can only make the decision at time τ based on information up to
time τ . This implies that x satisfies (a). Next, we show that x also satisfies (b). Let $I^{\mathrm{start}}_{j,t}$ be an
indicator random variable that is 1 when Π selects job j at time t and job j is available at time t.
At any given time t ≥ 1, the server can either (i) start serving a new job, or (ii) continue serving a
job from some prior round. In the second case, job j "blocks" time t if the server started serving
it at time τ < t and its service time was greater than (t − τ ); let $S_{j,t-\tau}$ be an indicator denoting this
event. Formally, for any t ≥ 1, we have

$$\sum_{j \in [n]} I^{\mathrm{start}}_{j,t} + \sum_{\tau < t} \sum_{j \in [n]} I^{\mathrm{start}}_{j,\tau} \cdot S_{j,t-\tau} \le 1.$$

Taking expectations, noting that the service times are independent of the decisions made by Π, and using
$\mathbb{E}[S_{j,t-\tau}] = \Pr(S_j > t-\tau) = \bar F_{S_j}(t-\tau)$ gives (b). Thus, x is a feasible solution for LP (LP-Sched),
which completes the proof.
2.2.1 Algorithm
At a high level, our algorithm ALG utilizes the optimal solution x∗ to (LP-Sched) to guide the
execution of jobs when the server is available. Iteratively, for each time t when the server is free,
the algorithm constructs a consideration set C ∗ = C ∗ (t) of jobs that have not been considered in
previous times and have not departed. From this consideration set, the algorithm will choose the
highest-valued job in the set C ∗ , if any. Jobs are added to the consideration set as follows.
At time t, the algorithm computes fj,t , the probability that ALG has not considered job j
before time t and the server is free at time t, conditioned on j being available at time t. Hence, fj,1 = 1
for all j, and for t > 1, fj,t can be computed using only information observed up to time t − 1. Then,
when the server is free at time t, ALG adds job j to the consideration set for time t with probability
x∗j,t /(2 · pj (t) · fj,t ) if j has not departed and has not been considered before. This normalization
factor allows us to avoid contentions in the analysis. We formally describe ALG in Algorithm 1.
Algorithm 1 ALG
1: Input: jobs j = 1, . . . , n with values vj ; attenuation values fj,t for each time t ≥ 1; departure
distributions pj (·) and service-time tail distributions F̄Sj (·).
2: x∗ ← optimal solution to (LP-Sched).
3: C ← [n].
4: for t = 1, 2, . . . do
5:   if the server is free then
6:     C ∗ ← ∅.
7:     for j ∈ C: add j to C ∗ with probability x∗j,t /(2 · pj (t) · fj,t ).
8:     jt ← arg maxj∈C ∗ vj ; run jt (if C ∗ ̸= ∅).
9:   update C by removing the jobs that were considered (if any) or jobs that are no longer
available.
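For concreteness, Algorithm 1 can be sketched in Python as follows. This is our own illustrative rendering: distributions are passed as tail functions, departure and service times are sampled by inversion, and, as in the paper, the LP solution x∗ and the attenuation values f are supplied as input. The toy instance at the bottom is hypothetical.

```python
import random

def alg(values, p, F_bar, x_star, f, horizon, rng):
    """One sample run of Algorithm 1. p[j](t) plays the role of Pr(D_j >= t)
    and F_bar[j](s) of Pr(S_j > s); x_star and f are indexed as [j][t-1]."""
    n = len(values)
    u = [rng.random() for _ in range(n)]   # one uniform per departure time
    C = set(range(n))                      # neither considered nor departed
    t, busy_until, reward = 1, 0, 0.0
    while t <= horizon and C:
        # Job j is still present at time t iff u[j] < p_j(t) (inversion).
        C = {j for j in C if u[j] < p[j](t)}
        if t > busy_until and C:
            cons = {j for j in C
                    if rng.random() < x_star[j][t - 1] / (2 * p[j](t) * f[j][t - 1])}
            if cons:
                jt = max(cons, key=lambda j: values[j])   # greedy choice
                reward += values[jt]
                v, s = rng.random(), 1                    # sample S_jt by inversion
                while v < F_bar[jt](s):
                    s += 1
                busy_until = t + s - 1
            C -= cons        # considered jobs are never reconsidered
        t += 1
    return reward

# Toy instance: 3 jobs, unit service times, departures beyond the horizon,
# x*_{j,1} = 1/3, and f_{j,t} = 1 (a valid user-supplied choice here).
rng = random.Random(0)
values, horizon = [3.0, 2.0, 1.0], 3
p = [lambda t: 1.0 for _ in range(3)]
F_bar = [lambda s: 0.0 for _ in range(3)]
x_star = [[1 / 3, 0.0, 0.0] for _ in range(3)]
f = [[1.0] * horizon for _ in range(3)]
avg = sum(alg(values, p, F_bar, x_star, f, horizon, rng) for _ in range(4000)) / 4000
print(avg)   # positive and at most max(values)
```

On this toy instance each job enters the consideration set with probability 1/6 at time 1, so the average reward is bounded away from 0 but well below the LP value, reflecting the 1/2 attenuation.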
Note that we need to ensure that the ratio in line 7 is a valid probability. Before we move to the
analysis of the correctness and approximation guarantee of the algorithm, we provide some additional discussion
of ALG. First, note that {fj,t }j∈[n],t≥1 is part of the algorithm's input. We do this to avoid
notational clutter; this can be easily modified by adding a subroutine that computes fj,t at time t
for each j. Our presentation is slightly more general as it allows users to choose alternative values
for fj,t . Let Aj,t be the event that job j is available at time t; then Pr(Aj,t ) = Pr(Dj ≥ t) = pj (t).
We also denote by SFt the event that the server is free at time t when running ALG, and by NCj,t the event that
j has not been considered before time t by ALG. Then fj,t , as defined above, can be concisely written as
fj,t = Pr(NCj,t , SFt | Aj,t ). Note that computing fj,t may involve a sum over exponentially many
terms, which precludes a polynomial-time implementation of ALG. Instead of using the exact values
of fj,t , we can approximate them using Monte Carlo simulation in polynomial time at a minimal
loss in the objective value; we discuss the details in §2.3. In the remainder of this section, we assume
that fj,t = Pr(NCj,t , SFt | Aj,t ).
We are now ready to complete the proof of the lemma. We consider the value of 1 − fj,t , which
can be upper bounded by the probability of the events in which either (i) the server is busy at time t
because of another job, or (ii) job j was considered at an earlier time τ < t. Formally, we have

$$\begin{aligned}
1 - f_{j,t} &= \Pr\big(\{\exists\, i \ne j,\ \tau < t : \mathrm{Run}_{i,\tau} \text{ and } S_i > t - \tau\} \text{ or } \{\exists\, \tau < t : C_{j,\tau}\} \mid A_{j,t}\big)\\
&\le \sum_{\tau < t} \sum_{i \ne j} \Pr(\mathrm{Run}_{i,\tau} \mid A_{j,t}) \cdot \Pr(S_i > t - \tau) + \sum_{\tau < t} \Pr(C_{j,\tau} \mid A_{j,t}) && \text{(union bound)}\\
&\le \sum_{\tau < t} \sum_{i \ne j} \frac{x^*_{i,\tau}}{2} \cdot \Pr(S_i > t - \tau) + \sum_{\tau < t} \frac{x^*_{j,\tau}}{2\, p_j(\tau)} && \text{(using Claim 2.3)}\\
&\le \frac{1 - \sum_{i=1}^{n} x^*_{i,t}}{2} + \frac{1}{2} - \frac{x^*_{j,t}}{2\, p_j(t)}. && \text{(Constraints (a) and (b))}
\end{aligned}$$

From here, we can deduce that $x^*_{j,t}/(2 \cdot p_j(t) \cdot f_{j,t}) \le 1$, which finishes the proof.
Proof of Theorem 1.2. For t ≥ 1, let Xt be the value of the job run at time t by ALG; we set Xt = 0
if no job is run at time t. Then, $v(\mathrm{ALG}) = \sum_{t \ge 1} \mathbb{E}[X_t]$. Hence, it is enough to show that for every t ≥ 1,
we have $\mathbb{E}[X_t] \ge (1/2)\cdot(1 - 1/e) \sum_{j=1}^{n} v_j\, x^*_{j,t}$.
For the rest of the analysis, we assume without loss of generality that v1 ≥ v2 ≥ · · · ≥ vn . Then,
Xt = vj if (i) the server is free at t, (ii) job j is considered at t, and (iii) no job i < j is considered
at t. Recall that Cj,t refers to the event that job j is considered by ALG at time t, and denote by C̄j,t
its complement. Then,
For each job j, let $\alpha_{j,t} = \Pr(C_{j,t} \mid \mathrm{SF}_t)$. Then, $\Pr(X_t = v_j) = \Pr(\mathrm{SF}_t)\, \alpha_{j,t} \prod_{i<j} (1 - \alpha_{i,t})$ and

$$\begin{aligned}
\alpha_{j,t} &= \Pr(C_{j,t} \mid \mathrm{SF}_t, A_{j,t}, \mathrm{NC}_{j,t}) \cdot \Pr(\mathrm{NC}_{j,t} \mid \mathrm{SF}_t, A_{j,t}) \cdot \Pr(A_{j,t}) && \text{(conditioning on } A_{j,t} \text{ and } \mathrm{NC}_{j,t})\\
&= \frac{x^*_{j,t}}{2 f_{j,t} \cdot \Pr(A_{j,t})} \cdot \frac{\Pr(\mathrm{NC}_{j,t}, \mathrm{SF}_t, A_{j,t})}{\Pr(\mathrm{SF}_t, A_{j,t})} \cdot \Pr(A_{j,t}) && \text{(using ALG)}\\
&= \frac{x^*_{j,t}}{2 f_{j,t}} \cdot \frac{\Pr(\mathrm{NC}_{j,t}, \mathrm{SF}_t, A_{j,t})}{\Pr(A_{j,t}) \cdot \Pr(\mathrm{SF}_t)} && \text{(independence of } A_{j,t} \text{ and } \mathrm{SF}_t)\\
&= \frac{x^*_{j,t}}{2 f_{j,t}} \cdot \frac{\Pr(\mathrm{NC}_{j,t}, \mathrm{SF}_t \mid A_{j,t})}{\Pr(\mathrm{SF}_t)}\\
&= \frac{x^*_{j,t}}{2 \Pr(\mathrm{SF}_t)},
\end{aligned}$$

where in the last equality we use that $f_{j,t} = \Pr(\mathrm{NC}_{j,t}, \mathrm{SF}_t \mid A_{j,t})$. Using Proposition 2.4, we can show
that $\sum_{j=1}^{n} \alpha_{j,t} \le 1$. Furthermore, we have the following claim. We defer its proof to Appendix E.

Claim 2.5. It holds that $\sum_{j=1}^{n} v_j\, \alpha_{j,t} \prod_{i<j} (1 - \alpha_{i,t}) \ge (1 - 1/e) \sum_{j=1}^{n} v_j\, x^*_{j,t}\big/\big(2 \Pr(\mathrm{SF}_t)\big)$.
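A bound of this shape is standard: if v1 ≥ · · · ≥ vn and the αj sum to at most 1, the prefix products cost at most a (1 − 1/e) factor against the linear sum. The following randomized sanity check (our own; not a proof) exercises the inequality:

```python
import math
import random

def lhs(values, alphas):
    # sum_j v_j * alpha_j * prod_{i<j} (1 - alpha_i), values sorted decreasing
    total, free = 0.0, 1.0
    for v, a in zip(values, alphas):
        total += v * a * free
        free *= (1 - a)
    return total

rng = random.Random(7)
for _ in range(1000):
    n = rng.randint(1, 8)
    values = sorted((rng.random() for _ in range(n)), reverse=True)
    raw = [rng.random() for _ in range(n)]
    scale = rng.random() / sum(raw)        # enforce sum(alphas) <= 1
    alphas = [a * scale for a in raw]
    rhs = (1 - 1 / math.e) * sum(v * a for v, a in zip(values, alphas))
    assert lhs(values, alphas) >= rhs - 1e-12
print("ok")
```

The factor is tight in the limit of n equal values with αj = 1/n each.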
Proof of Proposition 2.6. We consider the following instance of StocSchedX with n identical jobs:
vj = 1, Dj = 2 with probability 1, and Sj = 1 with probability 1 for all j ∈ [n]. Note that x∗j,1 = 1/n
for all j ∈ [n] is an optimal solution to (LP-Sched) with objective value vLP = 1.¹
Observe that since all jobs depart at time 2, ALG can only collect value at time 1. At
time 1, the server is free with probability 1, and both pj (1) and fj,1 equal 1 for every job j. Thus,
the probability that each job j is added to the consideration set is 1/(2n), and the probability that
at least one job is added to the consideration set is $1 - (1 - 1/(2n))^n \to 1 - 1/\sqrt{e}$ as n grows large.
Since ALG only runs a job and collects value 1 when the consideration set is nonempty,
$v(\mathrm{ALG}) \to 1 - 1/\sqrt{e}$. Thus, $v(\mathrm{ALG})/v_{\mathrm{LP}} \to 1 - 1/\sqrt{e}$.
¹Note that there might be different solution assignments that achieve the same objective value.
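A quick numeric check of the limit used above (our own sanity check):

```python
import math

limit = 1 - 1 / math.sqrt(math.e)   # approximately 0.3935
for n in (10, 100, 10000):
    # probability that the consideration set is nonempty for n jobs
    print(n, 1 - (1 - 1 / (2 * n)) ** n, limit)
```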
2.3 A Polynomial Time Implementation of Algorithm 1
In this section, we prove Corollary 1.3; that is, we show that ALG admits a polynomial-time
implementation with provable performance guarantees under mild conditions on the service
times.
Proof of Corollary 1.3. Note that (LP-Sched) can be solved in polynomial time (in n) when the
service times are polynomially bounded. It is clear that, when the values of fj,t are provided as input,
Algorithm 1 can be implemented in polynomial time. While our analysis assumes access
to the exact values {fj,t }j∈[n],t≥1 , computing fj,t exactly is challenging and involves a sum over
exponentially many terms. Thus, in the remainder of this section, we focus on getting around this
issue by providing an efficient and accurate approximation of the fj,t values, as summarized in Lemma 2.7.
Lemma 2.7. Fix ε > 0. The values {fj,t }j∈[n],t≥1 can be approximated with $6 n^{2+k} v_{\max} \ln(2 n^{2+k}/\varepsilon)/\varepsilon^3$
simulations via ALGε , where vmax = maxj∈[n] vj , and this approximation incurs a multiplicative loss
of ε in the original approximation ratio.
We defer the proof of Lemma 2.7 to Appendix E. At a high level, we utilize simulation and
attenuation techniques as in [22, 40]. We iteratively simulate a modified version of ALG and
compute empirical estimates f̂j,t . For each job j and fixed time t, we use the simulated f̂j,τ information
up to τ = t − 1 to obtain the empirical estimate f̂j,t . Note that small x∗j,t values can lead to small
fj,t values, and these tiny values do not fit the Monte-Carlo approximation scheme; we address
this concern by setting sufficiently small x∗j,t to 0 and showing that at most ε is lost from the objective
(see §E.2.1). Then, to show that the simulated f̂j,t recover an ε multiplicative loss from the
original approximation ratio, we bound the probability of the failure event that the empirical
estimate f̂j,t is more than ε away from fj,t . By treating a failed simulation as obtaining value 0 and
otherwise following a similar line of proof as Theorem 1.2, we conclude the proof of Lemma 2.7
(see §E.2.2).
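As an illustration of the Monte-Carlo step, the following sketch estimates a single fj,t from repeated runs; `simulate_alg_upto` is a hypothetical hook standing in for one run of the modified ALG up to time t − 1, not part of the paper's notation.

```python
import random

def estimate_f(simulate_alg_upto, j, t, trials=1000, rng=None):
    # Empirical estimate of f_{j,t}: the fraction of simulated runs
    # in which job j is counted at time t.
    rng = rng or random.Random(0)
    hits = sum(bool(simulate_alg_upto(j, t, rng)) for _ in range(trials))
    return hits / trials

# Deterministic stub: "job counted" in exactly 30% of the runs.
calls = {"n": 0}
def stub(j, t, rng):
    calls["n"] += 1
    return calls["n"] % 10 < 3

print(estimate_f(stub, j=1, t=5))  # 0.3
```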
Proposition 2.8 (Integrality gap). For any ε > 0, there is an instance I such that v(OPT(I)) ≤
(1 − 1/e + ε)vLP (I).
Proof. Let n be large enough and m < n large but constant. To keep the analysis simple, assume
that m divides n. Consider the following instance: there are n jobs divided into m different types,
with each group being of size n/m. All the jobs have the same value vj = 1. For each job j of
type k, we have

    Sj,k = 2^{k−1} with probability 1;    Dj,k = 1 with probability 1 − m/n, and 2^{k−1} + 1 with probability m/n.
We can find a feasible solution to (LP-Sched) as follows. Set x1,1 = 1; for all remaining jobs j
of type k, where k ∈ [m], we set xj,2^{k−1}+1 = m/n. Thus the optimal LP value is at least
m + 1.
We now compute v(OPT) explicitly. Note that by construction, a job of type k departs at time
t = 1 or at time t = 2^{k−1} + 1 = Sj,k + 1. Hence, after time 1, we know exactly which jobs departed
at time 1 and which ones remain until time t = Sj,k + 1. Furthermore, from each job type, we can
run at most two jobs. Therefore, the optimal way to run the jobs is in increasing order of their
type. After running a job of type 1, at time 2, there is a job of type 1 remaining with probability
1 − (1 − m/n)^{n/m−1}, and for each type k = 2, . . . , m there is at least one job remaining with
probability 1 − (1 − m/n)^{n/m}. Hence,

    v(OPT) = 1 + (1 − (1 − m/n)^{n/m−1}) + (m − 1)(1 − (1 − m/n)^{n/m}).

Letting n and then m grow large, v(OPT)/vLP approaches (1 + m(1 − 1/e))/(m + 1) → 1 − 1/e,
which yields the claimed bound for any ε > 0.
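Plugging in large n and m confirms the gap numerically; a quick sketch (function name is ours, using the LP lower bound m + 1 from above):

```python
import math

def gap_ratio(n: int, m: int) -> float:
    # v(OPT) for the gap instance, divided by the LP lower bound m + 1.
    p = m / n
    v_opt = 1 + (1 - (1 - p) ** (n // m - 1)) + (m - 1) * (1 - (1 - p) ** (n // m))
    return v_opt / (m + 1)

print(gap_ratio(10**6, 1000))  # close to 1 - 1/e ~ 0.632
```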
Remark 2.9. This proposition shows that for any algorithm ALG′ such that v(ALG′ ) ≥ γvLP (I),
we must have γ ≤ 1 − 1/e. In other words, it poses a limit on the approximations possible us-
ing (LP-Sched).
3 Further Extensions
In this section, we demonstrate the flexibility of our approaches through extensions to StocSchedX
that accommodate additional constraints that are of interest in practical applications. We refer to
the vanilla StocSchedX as the StocSchedX described in § 1. First, we consider jobs with hard
deadlines. Then, we consider a server with a knapsack constraint. We show how our LP and greedy
approaches can be modified in each case to obtain algorithms with provable guarantees comparable
to the ones obtained in the vanilla StocSchedX.
14
First, for the deadline extension, where each job j must be completed by a hard deadline Bj ,
consider the following formulation (LP-ddl).

    maximize    Σ_{j=1}^{n} Σ_{t=1}^{Bj} Pr(Sj ≤ Bj − t) xj,t vj                            (LP-ddl)

    subject to  Σ_{t=1}^{Bj} xj,t / pj (t) ≤ 1                      ∀j ∈ [n]                (ddl(a))

                Σ_{j=1}^{n} Σ_{τ=1}^{min{t,Bj}} xj,τ F̄Sj (t − τ) ≤ 1   ∀t ∈ [1, max_{j∈[n]} Bj]   (ddl(b))
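To make the busy-time constraint (ddl(b)) concrete, the sketch below builds its coefficient rows from a survival function; this is our own illustrative scaffolding (an LP solver such as Gurobi would consume these rows), under the assumption that F̄Sj denotes the survival function Pr(Sj > ·).

```python
def ddl_busy_rows(n, B, surv):
    # One row per time t in [1, max_j B_j]; the coefficient of x_{j,tau}
    # is Pr(S_j > t - tau) = surv[j][t - tau], for tau <= min(t, B[j]).
    rows = []
    for t in range(1, max(B) + 1):
        row = {}
        for j in range(n):
            for tau in range(1, min(t, B[j]) + 1):
                row[(j, tau)] = surv[j][t - tau]
        rows.append(row)  # constraint: sum of coeff * x over the row <= 1
    return rows

# Single job with deadline 2; service exceeds 0 steps w.p. 1 and 1 step w.p. 0.5.
rows = ddl_busy_rows(1, [2], {0: {0: 1.0, 1: 0.5}})
print(rows)  # [{(0, 1): 1.0}, {(0, 1): 0.5, (0, 2): 1.0}]
```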
Theorem 3.1. Let I be an instance of StocSchedX with job deadlines. Then vLP-ddl (I) ≥ v(OPT),
where vLP-ddl is the objective value of (LP-ddl).
Theorem 3.2 (Polynomial-time approximation with deadlines). For any instance of StocSchedX
with job-dependent deadlines and any fixed ε > 0, we can compute the value of the algorithm ALG with
6nBmax vmax ln(2(nBmax /ε))/ε³ simulations, where Bmax = maxj∈[n] Bj , and guarantee
an expected reward of at least 1/2 · (1 − 1/e − ε) · v(OPT), under mild conditions on the service times.
Proof. We apply the same ALG as in Algorithm 1. Recall that Theorem 1.2 shows the expected
value is at least (1/2) · (1 − 1/e) times the value collected by the LP at each time t. Thus, by
summing over the time horizon up to time maxj∈[n] Bj , the same approximation ratio guarantee is
attained. Moreover,
(LP-ddl) can be solved in polynomial time (in n) when the service times are polynomially bounded.
By applying the reasoning detailed in the proof of Lemma 2.7 with the modified time horizon Bmax ,
we can provide well-approximated fj,t values as input, thereby recovering the theorem.
For the subinstances where the service times are i.i.d. and all the jobs share the same deadline
Bj = B, we can utilize the coupling argument used in the proof for the vanilla StocSchedX
to obtain an approximation of 1/2 of the optimal value (see §C for the discussion of the vanilla
StocSchedX).
Theorem 3.3 (Approximation ratio with deadlines and i.i.d. service times). The greedy policy
that runs the job j with the largest vj obtains an expected total value at least 1/2 times the expected
total value obtained by an optimal policy for StocSchedX with deadlines when the service times
of the jobs are i.i.d. and Bj = B for all j ∈ [n].
Proof. To see this, we first note that the greedy policy that runs the largest-valued available
job does not work when jobs have different hard deadlines.
Example 3.1. Consider the greedy algorithm that runs the job with the largest value, on the
following instance with 2 jobs. We set the parameters of job 1 as follows: v1 = 1 + ε, B1 = 2,
D1 = 2 and S1 = 2 with probability 1. For job 2, we have v2 = 1, B2 = K for a large K, D2 = 2
and S2 = 2 with probability 1. Observe that the greedy algorithm attempts to run job 1 at time
t = 1 yet fails to obtain the value of 1 + ε because the deadline for this job falls at t = 2, and by
t = 2, the second job has already departed. The optimal solution obtains a total value of 1 by first
serving job 2.
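The example can be checked with a few lines of arithmetic. In this sketch (our own encoding, with K = 10 and ε = 0.01 as illustrative values), a job started at time t yields its value only if it has not departed and t + Sj ≤ Bj.

```python
EPS, K = 0.01, 10
JOBS = {1: {"v": 1 + EPS, "B": 2, "D": 2}, 2: {"v": 1.0, "B": K, "D": 2}}
S = 2  # deterministic service time of both jobs

def total_value(order):
    t, total = 1, 0.0
    for j in order:
        job = JOBS[j]
        if t >= job["D"]:       # job already departed
            continue
        if t + S <= job["B"]:   # completes by its deadline
            total += job["v"]
        t += S                  # server is occupied either way
    return total

print(total_value([1, 2]), total_value([2, 1]))  # 0.0 1.0
```

The greedy order [1, 2] collects nothing, while the optimal order [2, 1] collects 1, matching the example.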
As the previous example shows, when deadlines are heterogeneous, we cannot retrieve an
approximation guarantee for the greedy policy. However, we can recover a 1/2 approximation ratio
when all job-dependent deadlines are identical (i.e., Bj = B for all jobs j). To see this, we use
the same coupling argument from Theorem 1.4, which ensures that the server can start a new job
at the same time steps under both the greedy and the optimal policy. If, on each sample path, we
denote by t′ the last time period at which we can start a new job before time B, then by the analysis
from Theorem 1.4, the expected value gained by the greedy policy is at least half of the value of the
optimal policy up to time t′. We conclude the analysis by recognizing that, under the assumption
of i.i.d. service times, the probability of the final job contributing to the overall collected value
remains the same regardless of the specific job chosen. Moreover, the value of the job selected by
the greedy algorithm is guaranteed to be at least as high as that of the job selected by the optimal
policy.
Next, consider the knapsack extension: each job j has a non-negative weight wj , and the total
weight of the jobs run must not exceed a capacity W. We add to (LP-Sched) a constraint (c)
requiring the expected total weight Σ_{j∈[n]} Σ_t wj xj,t to be at most W. Then, if vLP−kn is the
optimal value of (LP-Sched) with the additional constraint (c), we have vLP−kn ≥ v(OPTkn ).
Indeed, if we consider xj,t as the probability that OPTkn runs j at t, then Σ_{j,t} wj xj,t is the
expected weight of the jobs run by OPTkn , which is never larger than W; hence (c) is satisfied.
Constraints (a) and (b) are satisfied by x by following the same proof as in Lemma 2.2; we skip
the details for brevity.
Theorem 3.4 (Polynomial-time approximation with knapsack constraints). For any instance of
StocSchedX with non-negative weights and a knapsack capacity W, and any fixed ε > 0, we can
compute the value of the algorithm ALG with 6n^{2+k} vmax ln(2(n^{2+k}/ε))/ε³ simulations and
guarantee an expected reward of at least 1/2 · (1 − 1/e − ε) · (1 − e^{−W²/(2n wmax²)}) · v(OPT), where
wmax = maxj∈[n] wj , with mild conditions on the service times.
Proof. To prove Theorem 3.4, we apply ALG from Algorithm 1 using the optimal solution
x∗ of (LP-Sched) with the additional Constraint (c). Note that the analysis from Theorem 1.2
remains applicable: because the approximation ratio is valid for all times t, we retain the flexibility
to terminate the process at any point once the capacity W is reached. However, it is possible
that the weights of the jobs that ALG runs exceed the capacity. Thus, in the following
Lemma 3.5 (whose proof is deferred to Appendix D), we bound the probability of the event that
the cumulative weight of the jobs ALG runs exceeds the knapsack capacity.
Lemma 3.5. Let E be the event that the cumulative weight of the jobs that ALG runs exceeds the
knapsack capacity W. Then, we have Pr(E) ≤ exp(−W²/(2n wmax²)), where wmax = maxj∈[n] wj .
Let us show how this lemma implies the result stated in Theorem 3.4. When event E happens,
we treat the run as a failure and thus retrieve a value of 0; otherwise, we retrieve a reward that is
at least 1/2 · (1 − 1/e) of vLP , as shown in Theorem 1.2. Therefore, we incur a multiplicative
loss of 1 − e^{−W²/(2n wmax²)}. Finally, we conclude the proof of the theorem by pointing out that
(LP-Sched) with the additional Constraint (c) can be solved in polynomial time, and simply
following the remaining reasoning from Corollary 1.3 gives a polynomial-time implementation with
an ε-loss in the approximation ratio guarantee.
Remark 3.6. We provide more insight into the multiplicative loss term 1 − e^{−W²/(2n wmax²)}
above: the loss diminishes as W increases and wmax decreases. Given the Chernoff-bound
technique [30] we apply, this approximation more accurately reflects the problem characteristics
when job weights are similar to each other and small relative to the knapsack size. For instance, in
applications like call centers, each call has a specific bounded length (see §4 for details), and agents
have a fixed and comparatively large number of working hours per shift. This scenario can be
formulated as a temporal knapsack problem, and our approximation scheme allows management to
effectively schedule the workforce within the limited timeframe.
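The behavior of the loss term is easy to tabulate; a quick numerical sketch (function name is ours):

```python
import math

def knapsack_factor(W: float, n: int, w_max: float) -> float:
    # The multiplicative factor 1 - exp(-W^2 / (2 n w_max^2)) from Theorem 3.4.
    return 1.0 - math.exp(-W ** 2 / (2 * n * w_max ** 2))

# The factor approaches 1 as the capacity W grows relative to the weights.
for W in (10, 50, 100):
    print(W, round(knapsack_factor(W, n=50, w_max=1.0), 4))
```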
Note that the cardinality constraint is a special case of the knapsack constraint, where at most
k of the n jobs can be run. This constraint can be represented by assigning weight wj = 1
to every job j and setting the capacity W equal to k. By applying a Chernoff-bound technique [43]
akin to the one previously employed, we arrive at the following Corollary 3.7 (proof deferred to
Appendix D).
Corollary 3.7 (Polynomial-time approximation with cardinality constraints). For any instance of
StocSchedX with a cardinality capacity k and any fixed ε > 0, we can compute the value of the
algorithm ALG with 6n^{2+k} vmax ln(2(n^{2+k}/ε))/ε³ simulations and guarantee an expected reward
of at least 1/2 · (1 − 1/e − ε) · (1 − e^{−k/6}) · v(OPT), with mild conditions on the service times.
Next, we consider the scenario where the job service times are i.i.d., for the knapsack and
cardinality constraints, respectively.
Remark 3.8. Since we do not allow fractional solutions within the scope of job scheduling, when
dealing with an additional knapsack constraint, greedy policies, for example one that simply prior-
itizes the value, or the value per unit weight, among jobs that fit in the remaining knapsack capacity,
do not yield advantageous outcomes. We illustrate this with counterexamples D.1 and D.2.
One might ask whether more complex greedy policies could yield better approximations. How-
ever, we present a counterexample D.3 demonstrating that a modified hybrid greedy-based
policy that is effective in general knapsack problems still fails to provide satisfactory
approximations.
Nevertheless, when all weights are uniform, the analysis presented in Theorem 1.4 remains valid,
as we show that at each time step, the expected value received by the optimal policy is
bounded above by twice the value obtained by the greedy policy. Consequently, even after reaching
the cardinality capacity, this bound of 1/2 persists. The summarized result is provided
below.
Theorem 3.9 (Approximation ratio with cardinality constraints and i.i.d. service times). The
greedy policy that runs the job with the largest value obtains an expected total value at least 1/2
times the expected total value obtained by an optimal policy for StocSchedX with cardinality
constraints when the service times of the jobs are i.i.d.
4 Computational Results
In this section, we provide a summary of computational results of our algorithms for StocSchedX.
We test our algorithms on two sets of instances: a synthetic dataset, and a public domain dataset
from the call center of ‘Anonymous Bank’ in Israel.² We conducted all of our computational
experiments using Python 3.7.6 and Gurobi 10.0.3 with a 2.4 GHz Quad-Core Intel Core i5 processor
and 8.0 GB 2.4 MHz LPDDR3 memory.
Algorithms. We test our simulation-based algorithm, denoted SimAlg, against a number of
algorithms. To begin, recall that if the server is free at time t, SimAlg constructs a consideration
set of jobs that have not been previously considered (before time t) and are still available at time
t. In particular, the algorithm adds each such job j to the consideration set with probability
x∗j,t /(2 · pj (t) · fj,t ), and then greedily selects a job from the consideration set (see §2 for
definitions of x∗ and fj,t and §2.3 for details of the algorithm). Leveraging insights gained from our
algorithm design, we develop a heuristic procedure, denoted ConSet, that only differs from SimAlg
in the construction of the consideration set. Here, we construct a consideration set as follows: add
job j to the consideration set with probability x∗j,t / (pj (t) · (1 − Σ_{τ<t} x∗j,τ /pj (τ))). We note
that this term represents the probability that job j is available but has not been run / selected at
any time τ < t. Additionally, we also test the greedy policy that, from the set of available jobs,
serves the job with the largest value. We denote this algorithm as Greedy.
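A compact sketch of one SimAlg decision step follows (data layout and names are ours; `x_star`, `p`, and `f` hold the quantities x∗j,t, pj(t), and fj,t from §2).

```python
import random

def simalg_step(t, available, considered, x_star, p, f, values, rng):
    # Build the consideration set: each available, not-yet-considered job j
    # enters independently with probability x*_{j,t} / (2 p_j(t) f_{j,t}).
    cset = []
    for j in available:
        if j in considered:
            continue
        prob = x_star[(j, t)] / (2 * p[(j, t)] * f[(j, t)])
        if rng.random() < min(1.0, prob):
            cset.append(j)
            considered.add(j)
    # Greedily run the highest-valued job in the consideration set, if any.
    return max(cset, key=values.get) if cset else None

# Toy call in which both jobs enter with probability 1, so the pick is the
# higher-valued job 1.
x_star = {(1, 1): 1.0, (2, 1): 1.0}
p = {(1, 1): 1.0, (2, 1): 1.0}
f = {(1, 1): 0.5, (2, 1): 0.5}
print(simalg_step(1, [1, 2], set(), x_star, p, f, {1: 5.0, 2: 3.0}, random.Random(0)))
```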
We also use the following algorithms to evaluate performance. Ideally, we would like to
compare to the value of the MDP solution for a given instance; however, the state space of this MDP
blows up exponentially, making it more convenient to use the optimal LP value (obtained by
solving (LP-Sched)) as a benchmark. As a benchmark we also compare to an algorithm, denoted
Safe, that runs an available job j with probability x∗j,t / Σ_{i∈Rt} x∗i,t , where Rt denotes the set
of jobs that are available at time t (this heuristic was used in [22] in the context of matching
problems). Our last algorithm is a naive randomized algorithm, denoted UR, that runs an available
job uniformly at random whenever the server is free.
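For completeness, a sketch of the Safe rule (our own encoding; the LP solution x∗ is passed in as a dictionary):

```python
import random

def safe_pick(t, available, x_star, rng):
    # Run available job j with probability x*_{j,t} / sum_{i in R_t} x*_{i,t}.
    weights = [x_star[(j, t)] for j in available]
    total = sum(weights)
    if not available or total == 0.0:
        return None
    return rng.choices(available, weights=weights, k=1)[0]

# With all LP mass on job 'a', Safe always picks 'a'.
print(safe_pick(1, ["a", "b"], {("a", 1): 1.0, ("b", 1): 0.0}, random.Random(0)))
```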
4.1 Instances
We produce two datasets to test our algorithms. First, we produce a synthetic dataset that is
aimed to stress our algorithms. The second dataset is constructed from real data obtained from a
small Israeli call center.
Synthetic Data. We test our algorithms on synthetic data generated as follows. We first set
n ∈ {5, 10, 15, . . . , 50} to denote the number of jobs. Given n, we generate n probabilities, p1 , . . . , pn ,
uniformly at random in (0.2, 1). We assume that after every timestep t, job j continues to remain
in the system independently with probability pj ; that is, the departure probabilities are geometric.
We set the total time to 50 and the maximum service time per job, smax , as n/5 with a minimum
value of 3 to ensure that it is not redundant. Each job is independently and uniformly categorized
as either a short job or a long job. For short jobs, we set the service time to be 1 with probability
0.9 and 2 with the remaining probability. For long jobs, we set the service time to be smax with
probability 0.9 and smax − 1 with the remaining probability. Furthermore, each job is assigned a
value-type in {low, medium, high} with probabilities {0.2, 0.6, 0.2}, respectively. A low-value (resp.
medium-value and high-value) job is assigned a value uniformly in (1, 2) (resp. (2, 4) and (4, 8)).
The instances are labeled as Syn-n. For each n, we generate 10 instances.

²Data can be accessed from https://seelab.net.technion.ac.il/data/.
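The generator described above can be sketched as follows (parameter names are ours; service distributions are (duration, probability) pairs):

```python
import random

def synthetic_instance(n: int, seed: int = 0):
    rng = random.Random(seed)
    s_max = max(3, n // 5)  # maximum service time, with a floor of 3
    jobs = []
    for _ in range(n):
        p_stay = rng.uniform(0.2, 1.0)       # geometric departures
        if rng.random() < 0.5:               # short job
            service = [(1, 0.9), (2, 0.1)]
        else:                                # long job
            service = [(s_max, 0.9), (s_max - 1, 0.1)]
        r = rng.random()                     # value type: low/medium/high
        lo, hi = (1, 2) if r < 0.2 else (2, 4) if r < 0.8 else (4, 8)
        jobs.append({"p_stay": p_stay, "service": service,
                     "value": rng.uniform(lo, hi)})
    return jobs

inst = synthetic_instance(10)
print(len(inst), all(1 <= job["value"] <= 8 for job in inst))
```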
Call Center Data. Moreover, we use a public domain dataset from the call center of ‘Anonymous
Bank’ in Israel. For our experimental setup, we utilize data originating from January 1999, which
comprises 30,140 call entries, as the training dataset. The call records are primarily split into
three distinct categories: ‘new’, ‘regular’, and ‘priority’, consisting of 8,723, 4,440, and 16,977
entries, respectively. To account for the different valuations that may be attached to a call type,
we assume that priority calls have value 8, regular calls have value 2, and new calls have
value 1. Furthermore, using 20 seconds as a reasonable unit time-step, we extract distributional
information for both the (stochastic) service and the departure times. We present these as empirical
point mass function distributions in Figures 5 and 6 in Appendix F, which are then used to sample
instances for our computational setup. As a sanity check, we compute the root mean
squared error (RMSE) between the learned departure and service point mass functions for January
and February, and observe errors of 0.013 and 0.006, respectively. Similar to the experimental
setup on synthetic data, we label the instances as Real-n. For every value of n ∈ {5, 10, 15, . . . , 50},
we generate 10 instances based on the specified category frequency distribution and associate each
customer with the respective service and departure time distributions.
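This train/test sanity check can be reproduced with a small helper; the pmfs below are toy stand-ins (the real empirical pmfs appear in Appendix F).

```python
import math

def pmf_rmse(p, q):
    # Root mean squared error between two pmfs over their joint support.
    support = sorted(set(p) | set(q))
    sq = sum((p.get(s, 0.0) - q.get(s, 0.0)) ** 2 for s in support)
    return math.sqrt(sq / len(support))

jan = {1: 0.5, 2: 0.3, 3: 0.2}    # toy "January" pmf (illustrative only)
feb = {1: 0.45, 2: 0.35, 3: 0.2}  # toy "February" pmf
print(round(pmf_rmse(jan, feb), 4))
```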
4.2 Results
For both synthetic and real datasets, we give our results for average value along with the 95%
confidence interval in Figure 2 and 3 respectively (numerical values can be found in Table 3 and 4
in Appendix F). Except for the LP benchmark method which have set values, all other algorithms
are simulated 100 times in order to record an averaged performance. In particular, in SimAlg, we
use 100 trials to reach a good estimation for each fj,t , for all j ∈ [n], and all t ≥ 1. A sample path
of simulating f1,3 in dataset Syn-10 is shown in Figure 7 in Appendix F. Additionally, we compare
the runtime for all algorithms across different instances through Table 2 in synthetic datasets.
Figure 2: Algorithm values across different numbers of jobs in the synthetic datasets. There are no
overlapping lines (lines representing SimAlg and ConSet may look overlapping; one can refer to
Table 3 for a more precise comparison). The shaded area represents the 95% confidence interval.
Table 2: Average running time (in milliseconds). The running time of SimAlg is divided into two
components: the first column (tagged with s) is the simulation part that retrieves the fj,t values,
and the second column (tagged with e) is the execution part that collects values.
Figure 3: Algorithm values across different numbers of jobs in the real bank datasets. There are no
overlapping lines. The shaded area represents the 95% confidence interval.
4.3 Discussions
We end this section by presenting a comprehensive discussion of our results. First note that com-
pared to the LP optimal solution, SimAlg attains a considerably high approximation ratio with
consistent performances across all datasets. This empirical outcome serves to reinforce the theoret-
ical guarantees, and solidifies the application of the simulation-based algorithm. However, SimAlg
takes considerably longer running time compared to both LP-based algorithm and those not reliant
on LP solutions. This elongation arises primarily due to the extra iterations needed for simulating
fj,t values. In situations necessitating efficient implementations, ConSet approach stands out as
a notable option. This method not only leverages the structure of forming a consideration set to
achieve commendable performance but also significantly diminishes runtime. Nonetheless, despite
we receive good (sometimes better than SimAlg) empirical performances for methods like Con-
Set, Safe or Greedy, there are no formal theoretical guarantees to support their performance
across all possible scenarios when the service times are not i.i.d.
We also note that while the Greedy algorithm performs poorly on the synthetic datasets, it
delivers more convincing performance on the real bank datasets, even compared to LP-based
methods. We believe this is due to the nature of the algorithm and the data. Priority calls are
far more rewarding candidates to run than their regular and new counterparts, while all three
share comparable service and departure distributions; at the same time, the LP solution may not
continue to assign much weight to priority calls in later periods of the time horizon. Thus, even
when the server is free and a high-valued job is available, LP-based algorithms do not necessarily
run it, since other low-valued jobs may also be run under the probabilistic approach we adopt.
5 Final Remarks
In this work, we introduced the StocSchedX problem that models a revenue-maximizing single
server job scheduling problem with jobs with unknown stochastic service and departure times
that model user impatience. Since the StocSchedX is provably NP-hard, we examine heuristic
approximations. We provided an extensive analysis on a time-indexed LP relaxation. We find an
LP-based approximation algorithm that guarantees an expected total reward of at least (1/2) ·
(1 − 1/e) of the expected reward of the optimal policy and provide an implementation that can
be computed in polynomial time under mild conditions on the service times. This policy can
flexibly extend to various settings, including jobs with deadlines and knapsack constraints. We
also note, that with our LP methodology, we can only hope to get at most an approximation of
1 − 1/e (see § 2.4). Breaking this factor would require a different LP relaxation or other heuristic
approach. We further the theoretical results by examining the subfamily of i.i.d. service times.
In this case, we show that the simple greedy solution that runs the highest valued available job
guarantees an expected reward of 1/2 of the expected reward of the optimal policy. We empirically
validate various policies derived from our LP relaxation using synthetic and real data.
Several extensions of our model to multiple servers and other combinatorial constraints over
the server are captivating from both theoretical and applied perspectives. A natural extension of
our model is the multiple-server case. In this case, we can still use an extended version of
(LP-Sched) to provide an upper bound on the optimal value. For example, if we use i to index the
servers, we can introduce a decision variable xj,t,i to represent the probability that the optimal
policy runs job j at time t on server i. Using similar constraints as in (LP-Sched), we can find a
valid upper bound on the expected optimal value for the multiple-server setting. However, it is
unclear how to extend our algorithmic framework to multiple servers. Another natural extension
is to consider job arrivals. The challenge in this case is defining a suitable arrival model. A
possible arrival model is job-dependent release times, typically used in the scheduling literature.
In short, if we consider release times, our LP-based analysis does not extend to this case, as
technical parts of Lemma 2.1 fail to hold when there is a release time. However, we note that if
the service times are i.i.d., then our greedy policy still provides an approximation of 1/2 in the
release-time setting.
References
[1] Zeynep Aksin, Mor Armony, and Vijay Mehrotra. The modern call center: A multi-disciplinary
perspective on operations management research. Production and operations management,
16(6):665–688, 2007.
[2] Zeynep Akşin, Barış Ata, Seyed Morteza Emadi, and Che-Lin Su. Structural estimation of
callers’ delay sensitivity in call centers. Management Science, 59(12):2727–2746, 2013.
[3] Ali Aouad and Ömer Saritaç. Dynamic stochastic matching under limited time. In Proceedings
of the 21st ACM Conference on Economics and Computation, pages 789–790, 2020.
[4] Mor Armony, Shlomo Israelit, Avishai Mandelbaum, Yariv N Marmor, Yulia Tseytlin, and
Galit B Yom-Tov. On patient flow in hospitals: A data-based queueing-science perspective.
Stochastic systems, 5(1):146–194, 2015.
[5] Rami Atar, Chanit Giat, and Nahum Shimkin. The cµ/θ rule for many-server queues with
abandonment. Operations Research, 58(5):1427–1439, 2010.
[6] Jiaru Bai, Kut C. So, Christopher S. Tang, Xiqun (Michael) Chen, and Hai Wang. Coor-
dinating supply and demand on an on-demand service platform with impatient customers.
Manufacturing & Service Operations Management, 21(3):556–570, 2019.
[7] Siddhartha Banerjee, Carlos Riquelme, and Ramesh Johari. Pricing in ride-share platforms:
A queueing-theoretic approach. Available at SSRN 2568258, 2015.
[8] Achal Bassamboo, J Michael Harrison, and Assaf Zeevi. Design and control of a large call
center: Asymptotic analysis of an lp-based method. Operations Research, 54(3):419–435, 2006.
[9] Anand Bhalgat, Ashish Goel, and Sanjeev Khanna. Improved approximation results for
stochastic knapsack problems. In Proceedings of the twenty-second annual ACM-SIAM sym-
posium on Discrete Algorithms, pages 1647–1665. SIAM, 2011.
[10] Daniel Blado and Alejandro Toriello. Relaxation analysis for the dynamic knapsack problem
with stochastic item sizes. SIAM Journal on Optimization, 29(1):1–30, 2019.
[11] Allan Borodin and Ran El-Yaniv. Online computation and competitive analysis. Cambridge
University Press, 2005.
[12] Lawrence Brown, Noah Gans, Avishai Mandelbaum, Anat Sakov, Haipeng Shen, Sergey Zeltyn,
and Linda Zhao. Statistical analysis of a telephone call center: A queueing-science perspective.
Journal of the American statistical association, 100(469):36–50, 2005.
[13] Gerard P Cachon, Kaitlin M Daniels, and Ruben Lobel. The role of surge pricing on a service
platform with self-scheduling capacity. Manufacturing & Service Operations Management,
19(3):368–384, 2017.
[14] Francisco Castro, Hamid Nazerzadeh, and Chiwei Yan. Matching queues with reneging: a
product form solution. Queueing Systems, 96(3-4):359–385, 2020.
[15] Ceren Cebi, Enes Atac, and Ozgur Koray Sahingoz. Job shop scheduling problem and solution
algorithms: a review. In 2020 11th International Conference on Computing, Communication
and Networking Technologies (ICCCNT), pages 1–7. IEEE, 2020.
[16] Gang Chen, Jean-Philippe Gayon, and Pierre Lemaire. Stochastic scheduling with aban-
donment: necessary and sufficient conditions for the optimality of a strict priority policy.
Operations Research, 71(5):1789–1793, 2023.
[17] Natalie Collina, Nicole Immorlica, Kevin Leyton-Brown, Brendan Lucier, and Neil Newman.
Dynamic weighted matching with heterogeneous arrival and departure rates. In Web and
Internet Economics: 16th International Conference, WINE 2020, Beijing, China, December
7–11, 2020, Proceedings 16, pages 17–30. Springer, 2020.
[18] Marek Cygan, Matthias Englert, Anupam Gupta, Marcin Mucha, and Piotr Sankowski. Catch
them if you can: how to serve impatient users. In Proceedings of the 4th conference on
Innovations in Theoretical Computer Science, pages 485–494, 2013.
[19] Brian C Dean, Michel X Goemans, and Jan Vondrák. Approximating the stochastic knapsack
problem: The benefit of adaptivity. Mathematics of Operations Research, 33(4):945–964, 2008.
[20] Cyrus Derman, Gerald J Lieberman, and Sheldon M Ross. A renewal decision problem.
Management Science, 24(5):554–561, 1978.
[21] Amol Deshpande, Lisa Hellerstein, and Devorah Kletenik. Approximation algorithms for
stochastic submodular set cover with applications to boolean function evaluation and min-
knapsack. ACM Transactions on Algorithms (TALG), 12(3):1–28, 2016.
[22] John P Dickerson, Karthik A Sankararaman, Aravind Srinivasan, and Pan Xu. Allocation
problems in ride-sharing platforms: Online matching with offline reusable resources. ACM
Transactions on Economics and Computation (TEAC), 9(3):1–17, 2021.
[23] Jing Dong and Rouba Ibrahim. Srpt scheduling discipline in many-server queues with impatient
customers. Management Science, 67(12):7708–7718, 2021.
[24] Ofer Garnett, Avishai Mandelbaum, and Martin Reiman. Designing a call center with impa-
tient customers. Manufacturing & Service Operations Management, 4(3):208–227, 2002.
[25] John C Gittins. Bandit processes and dynamic allocation indices. Journal of the Royal Sta-
tistical Society Series B: Statistical Methodology, 41(2):148–164, 1979.
[26] Martin Charles Golumbic. Algorithmic graph theory and perfect graphs. Elsevier, 2004.
[27] Vineet Goyal, Garud Iyengar, and Rajan Udwani. Asymptotically optimal competitive ratio
for online allocation of reusable resources, 2022.
[28] Anupam Gupta, Ravishankar Krishnaswamy, Marco Molinaro, and Ramamoorthi Ravi. Ap-
proximation algorithms for correlated knapsacks and non-martingale bandits. In 2011 IEEE
52nd Annual Symposium on Foundations of Computer Science, pages 827–836. IEEE, 2011.
[29] Itai Gurvich and Ward Whitt. Service-level differentiation in many-server service systems via
queue-ratio routing. Operations research, 58(2):316–328, 2010.
[30] Wassily Hoeffding. Probability inequalities for sums of bounded random variables. The collected
works of Wassily Hoeffding, pages 409–426, 1994.
[31] Emmanuel Hyon and Alain Jean-Marie. Scheduling services in a queuing system with impa-
tience and setup costs. The Computer Journal, 55(5):553–563, 2012.
[32] Sungjin Im, Benjamin Moseley, and Kirk Pruhs. Stochastic scheduling of heavy-tailed jobs.
In 32nd International Symposium on Theoretical Aspects of Computer Science (STACS 2015).
Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2015.
[33] Berit Johannes. Scheduling parallel jobs to minimize the makespan. Journal of Scheduling,
9:433–452, 2006.
[34] Eun-Seok Kim and Daniel Oron. Minimizing total completion time on a single machine with
step improving jobs. Journal of the Operational Research Society, 66:1481–1490, 2015.
[35] Jeunghyun Kim, Ramandeep S Randhawa, and Amy R Ward. Dynamic scheduling in a many-
server, multiclass system: The role of customer impatience in large systems. Manufacturing &
Service Operations Management, 20(2):285–301, 2018.
[36] Antoon WJ Kolen, Jan Karel Lenstra, Christos H Papadimitriou, and Frits CR Spieksma.
Interval scheduling: A survey. Naval Research Logistics (NRL), 54(5):530–543, 2007.
[37] Joseph YT Leung, Haibing Li, and Michael Pinedo. Order scheduling models: an overview.
In Multidisciplinary Scheduling: Theory and Applications: 1 st International Conference,
MISTA’03 Nottingham, UK, 13–15 August 2003 Selected Papers, pages 37–53. Springer, 2005.
[38] Zihao Li, Hao Wang, and Zhenzhen Yan. Fully online matching with stochastic arrivals and
departures. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 37, pages
12014–12021, 2023.
[39] Zhenghua Long, Nahum Shimkin, Hailun Zhang, and Jiheng Zhang. Dynamic scheduling of
multiclass many-server queues with abandonment: The generalized cµ/h rule. Operations
Research, 68(4):1218–1230, 2020.
[40] Will Ma. Improvements and generalizations of stochastic knapsack and markovian bandits
approximation algorithms. Mathematics of Operations Research, 43(3):789–812, 2018.
[41] Avi Mandelbaum and Sergey Zeltyn. The impact of customers’ patience on delay and aban-
donment: some empirically-driven experiments with the M/M/n+G queue. OR Spectrum,
26(3):377–411, 2004.
[42] Aranyak Mehta. Online matching and ad allocation. Foundations and Trends in Theoretical
Computer Science, 8(4):265–368, 2013.
[43] Michael Mitzenmacher and Eli Upfal. Probability and computing: Randomization and proba-
bilistic techniques in algorithms and data analysis. Cambridge university press, 2017.
[44] Jatoth Mohan, Krishnanand Lanka, and A Neelakanteswara Rao. A review of dynamic job
shop scheduling techniques. Procedia Manufacturing, 30:34–39, 2019.
[45] Rolf H Möhring, Andreas S Schulz, and Marc Uetz. Approximation in stochastic scheduling:
the power of lp-based priority policies. Journal of the ACM (JACM), 46(6):924–942, 1999.
[46] Michael L Pinedo. Scheduling, volume 29. Springer, 2012.
[47] David Pisinger and Paolo Toth. Knapsack problems. Handbook of Combinatorial Optimization: Volumes 1–3, pages 299–428, 1998.
[48] Chris N Potts and Luk N Van Wassenhove. An algorithm for single machine sequencing
with deadlines to minimize total weighted completion time. European Journal of Operational
Research, 12(4):379–387, 1983.
[49] Amber L Puha and Amy R Ward. Scheduling an overloaded multiclass many-server queue with
impatient customers. In Operations research & management science in the age of analytics,
pages 189–217. INFORMS, 2019.
[50] Martin L Puterman. Markov decision processes: discrete stochastic dynamic programming.
John Wiley & Sons, 2014.
[51] Thomas R Robbins and Terry P Harrison. A stochastic programming model for scheduling
call centers with global service level agreements. European Journal of Operational Research,
207(3):1608–1619, 2010.
[52] Paat Rusmevichientong, Mika Sumida, and Huseyin Topaloglu. Dynamic assortment optimiza-
tion for reusable products with random usage durations. Management Science, 66(7):2820–
2844, 2020.
[53] Shraga Shoval and Mahmoud Efatmaneshnik. A probabilistic approach to the stochastic job-
shop scheduling problem. Procedia Manufacturing, 21:533–540, 2018.
[54] Natasha Singh. Growth trends for on-demand service platforms. Retrieved from https://www.bluelabellabs.com/blog/2020-growth-trends-for-on-demand-service-platforms, 2020.
[55] Terry A. Taylor. On-demand service platforms. Manufacturing & Service Operations Manage-
ment, 20(4):704–720, 2018.
[56] Nick T Thomopoulos. Essentials of Monte Carlo simulation: Statistical methods for building
simulation models. Springer Science & Business Media, 2012.
[57] Guangju Wang, Hailun Zhang, and Jiheng Zhang. On-demand ride-matching in a spatial
model with abandonment and cancellation. Operations Research, 2022.
[58] Hegen Xiong, Shuangyuan Shi, Danni Ren, and Jinjin Hu. A survey of job shop scheduling
problem: The types and models. Computers & Operations Research, 142:105731, 2022.
[59] Xun Xu, Nina Yan, and Tingting Tong. Longer waiting, more cancellation? empirical evidence
from an on-demand service platform. Journal of Business Research, 126:162–169, 2021.
[60] Yueyang Zhong, John R Birge, and Amy Ward. Learning the scheduling policy in time-varying
multiclass many server queues with abandonment. Available at SSRN 4090021, 2022.
where we used that vj ≤ v(OPT) for all j ∈ [n]. From here, the result follows.
Thus, p′_j(t) := Pr(D′_j ≥ t) ≥ ε^{t−1} for all t and j. We denote by v′(OPT′) the optimal value of the CapStocSchedX problem when run with departure times D′_1, ..., D′_n.
Proposition A.2. For any ε > 0, v(OPT) ≤ v′(OPT′) ≤ (1 + n²ε)·v(OPT).
Proof. The first inequality is direct, as items remain longer in the system under the new departure times D′_1, ..., D′_n. For the second inequality, we argue as follows. Let B = {∃j ∈ [n] : D_j ≠ D′_j} be the event that some departure time differs between the two problems. Then, Pr(B) ≤ Pr(∃j ∈ [n] : G_j ≥ 2) ≤ nε. Let Π′ be a policy for the problem with departure times D′_1, ..., D′_n. Consider the policy Π that runs Π′ on the problem with the departure times D_1, ..., D_n. Note that over the complement event B^c, Π and Π′ collect the same reward. We denote by v_{S,D}(Π) the expected value collected by the policy Π when run on departure times D_1, ..., D_n, and by v_{S,D′}(Π′) the expected value of the policy Π′ when run on departure times D′_1, ..., D′_n. Then,

v_{S,D}(Π) = E_{S,D}[ Σ_{j=1}^n v_j I{Π runs j} ]
           ≥ E_{S,D′}[ Σ_{j=1}^n v_j I{Π′ runs j} | B^c ] · Pr(B^c)
           = v_{S,D′}(Π′) − E_{S,D′}[ Σ_{j=1}^n v_j I{Π′ runs j} | B ] · Pr(B)
           ≥ v_{S,D′}(Π′) − n²ε·v(OPT),

where we used that Pr(B) ≤ nε and that v_j ≤ v(OPT) for all j ∈ [n]. From here, the result follows.
Proof of Proposition 2.4. The result is clearly true for t = 1 since the server is always free at time 1. For t > 1, we can analyze the probability that the server is busy at time t:

1 − Pr(SF_t) = Pr(∃τ < t, ∃j : Run_{j,τ} ∩ {S_j > t − τ})
             ≤ Σ_{τ<t} Σ_{j=1}^n Pr(Run_{j,τ}) · F̄_{S_j}(t − τ)
             ≤ Σ_{τ<t} Σ_{j=1}^n (x*_{j,τ}/2) · F̄_{S_j}(t − τ) ≤ 1/2.

The first inequality follows from the union bound. The second inequality follows from Inequality (2). The last inequality follows from Constraint (b). Reordering terms gives us the desired result.
Proof of Claim 2.5. By the assumption that v_1 ≥ v_2 ≥ ··· ≥ v_n ≥ 0, there exist values u_1, ..., u_n ≥ 0 such that for all j ∈ [n], v_j = Σ_{ℓ=j}^n u_ℓ. Let X_t denote the value of the highest-valued available job at time t. Then, we have:
E[X_t] / Σ_{j=1}^n α_{j,t} v_j
  = ( Σ_{j=1}^n α_{j,t} Π_{i<j} (1 − α_{i,t}) · v_j ) / ( Σ_{j=1}^n α_{j,t} v_j )
  = ( Σ_{j=1}^n α_{j,t} Π_{i<j} (1 − α_{i,t}) Σ_{ℓ=j}^n u_ℓ ) / ( Σ_{j=1}^n α_{j,t} Σ_{ℓ=j}^n u_ℓ )
  = ( Σ_{ℓ=1}^n u_ℓ Σ_{j=1}^ℓ α_{j,t} Π_{i<j} (1 − α_{i,t}) ) / ( Σ_{ℓ=1}^n u_ℓ Σ_{j=1}^ℓ α_{j,t} )
  ≥ min_ℓ ( Σ_{j=1}^ℓ α_{j,t} Π_{i<j} (1 − α_{i,t}) ) / ( Σ_{j=1}^ℓ α_{j,t} )
  = min_ℓ ( 1 − Π_{i=1}^ℓ (1 − α_{i,t}) ) / ( Σ_{j=1}^ℓ α_{j,t} )
  ≥ min_ℓ ( 1 − e^{−Σ_{i=1}^ℓ α_{i,t}} ) / ( Σ_{j=1}^ℓ α_{j,t} )
  = ( 1 − e^{−α_t} ) / α_t ≥ 1 − 1/e,

where the first inequality follows from the mediant inequality, the second inequality follows from 1 + x ≤ e^x for all x, the penultimate equality follows from the fact that f(x) = (1 − e^{−x})/x is decreasing on [0, 1] (so the minimum is attained at ℓ = n, where α_t := Σ_{i=1}^n α_{i,t}), and the final inequality comes from α_t ≤ 1. Thus, we can conclude that

E[X_t] = Σ_{j=1}^n v_j α_{j,t} Π_{i<j} (1 − α_{i,t}) ≥ (1 − 1/e) · Σ_{j=1}^n α_{j,t} v_j = (1 − 1/e) · Σ_{j=1}^n x*_{j,t} · v_j / (2 Pr(SF_t)).
C.1 Preliminaries
In this subsection, we recast the StocSchedX problem by first defining states that keep track of the history
of actions and jobs remaining in the system. Then, we define policies that make decisions based on these
states. Finally, we introduce the value (or reward) collected by the policy.
The problem input is n jobs, where job j ∈ [n] is described by the triple (v_j, D_j, S_j). A state σ for time t is an ordered sequence of length t − 1 of the form σ_1 σ_2 ··· σ_{t−1}, where:

1. Each σ_τ ∈ {(R_τ, null, null) : R_τ ⊆ [n]} ∪ {(R_τ, s, τ → j) : j ∈ [n], R_τ ⊆ [n], s ≥ 1}, and
2. If σ_τ = (R_τ, s, τ → j), then σ_{τ+1}, ..., σ_{min{τ+s−1, t−1}} must be of the form (R, null, null), and
3. R_{t−1} ⊆ R_{t−2} ⊆ ··· ⊆ R_1 ⊆ [n]. We define R_0 = [n].

The triplet (R_τ, s, τ → j) represents that at time τ the decision-maker runs (or simulates) job j, the service time of j is s, and the jobs that remain in the system (i.e., have not departed) are R_τ. The triplet (R_τ, null, null) indicates that at time τ a previous job is still running or the decision-maker decided not to run any job.
States have a natural recursive structure: σ = σ′(R_{t−1}, s_{t−1}, t − 1 → j_{t−1}) or σ = σ′(R_{t−1}, null, null), depending on the last triplet of σ. The initial state σ_0 is the empty sequence. Given a state σ = σ_1 ··· σ_{t−1}, we say that the state σ′ = σ_1 ··· σ_τ is a prefix of σ for any τ ≤ t − 1. Let Σ_t be the set of all states for time t. Then, Σ_1 = {σ_0}, and Σ = ∪_{t≥1} Σ_t is the set of all possible states for any time t ≥ 1.
A policy is a function Π : Σ → [n] ∪ {null}. Given a state σ = σ_1 ··· σ_{t−1} for time t, let τ_σ = max{τ ≤ t − 1 : σ_τ = (R_τ, s, τ → j) for some s, j} be the index of the last triplet σ_τ in σ that is of the form (R_τ, s, τ → j); we define τ_σ = −∞ if σ has no such triplet. Then, we define s_σ to be the service time in the triplet σ_{τ_σ} if τ_σ ≥ 1; otherwise, s_σ = −∞. The policy Π is feasible if, for any state σ for time t, the sequence of t triplets defined by

σ′ = σ(R_t, s, t → j)      if Π(σ) = j ∈ [n] and τ_σ + s_σ ≤ t,
σ′ = σ(R_t, null, null)    if Π(σ) = null,

is a state for time t + 1 (i.e., it satisfies 1–3) for any subset R_t ⊆ R_{t−1} and any s ≥ 1. In other words, a valid policy must respect the service time of the last job run before t, if any, while also respecting the departures, if any. A state σ for time t is generated by a valid policy Π if σ = σ_0 or σ can be obtained from its prefix σ′ for time t − 1 by applying the policy Π.
Given a state σ = σ_1 ··· σ_{t−1} for time t, we say that job j has been run before time t (in σ) if there exists a triplet in σ of the form σ_τ = (R_τ, s, τ → j) with j ∈ R_{τ−1}; the first such time τ′ (equivalently, the largest τ′ ≥ 1 such that j has not been run before τ′) represents the time when j is run in σ. Suppose that job j is run in σ at time τ′; then, we say that any time τ > τ′ for which σ_τ = (R_τ, s, τ → j) holds represents a time when job j is simulated, regardless of whether j belongs to R_{τ−1}. For a valid policy Π and a state σ for time t, we say that Π runs job j at time t if Π(σ) = j, j ∈ R_{t−1}, and j has not been run before t in σ. If Π(σ) = j and j ∉ R_{t−1}, or j has been run before t in σ, then we say that Π simulates job j at time t.
Given a valid policy Π and a state σ for time t, we define the expected value of Π starting from the state σ as

v(σ | Π) = v_j · I{j ∈ R_{t−1}, j has not been run in σ} + E_{S_j, R_t}[ v(σ(R_t, S_j, t → j) | Π) ]   if Π(σ) = j ∈ [n],
v(σ | Π) = E_{R_t}[ v(σ(R_t, null, null) | Π) ]   if Π(σ) = null.
Note that a policy can collect the value of a job j only once, and only for the first k jobs that it runs. After that, if the policy decides to select j again, it can only simulate running the job. Given the initial state σ_0, we are interested in solving

sup_{Π feasible policy} v(σ_0 | Π).

This value is well-defined, as the policy that maps everything to null is feasible (with an expected value of 0). Additionally, the value of any policy is upper bounded by Σ_{j∈[n]} v_j. We denote by OPT the feasible policy that maximizes v(σ_0 | Π). Note that an optimal policy will never simulate a job if it can run a job. However,
allowing this possibility will be useful in the next subsection, where we analyze the greedy algorithm. When it is clear from the context, and to agree with the notation introduced in previous subsections, we write v(Π) to refer to v(σ_0 | Π).
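To make the recursion concrete, the following is a minimal sketch (function names are ours, not from the paper) that evaluates the optimum in a simplified deterministic special case: unit service times and known departure times, so a state reduces to the current time and the set of jobs not yet run.

```python
from functools import lru_cache

def optimal_value(values, departures, T):
    """Optimal total value in a simplified deterministic special case:
    every service time is 1 and departure times are known, so a state
    reduces to (current time, tuple of jobs not yet run)."""
    n = len(values)

    @lru_cache(maxsize=None)
    def best_from(t, remaining):
        if t > T:
            return 0.0
        # Option 1: leave the server idle at time t (the "null" action).
        best = best_from(t + 1, remaining)
        # Option 2: run any job that is still in the system at time t.
        for j in remaining:
            if t <= departures[j]:
                rest = tuple(i for i in remaining if i != j)
                best = max(best, values[j] + best_from(t + 1, rest))
        return best

    return best_from(1, tuple(range(n)))
```

For instance, with values (3, 2), departures (1, 2), and horizon T = 2, the optimum runs job 1 at time 1 and job 2 at time 2, collecting 5.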
Proof. For the sake of simplicity, we run the argument over a policy OPT′ that makes no decision after time T, so the construction above generates only a finite sequence of algorithms ALG0, ..., ALGT, with ALG′ = ALGT. Up to an arbitrarily small error, the result will hold. We denote OPT′ simply by OPT.
In the proof, we compare the value collected by ALGt against ALGt−1 for each t ≥ 1. To do this, we take
a state σ for time T + 1 that is generated by OPT. From this state, we generate a state σ T that is generated
by ALGT . This process shows that the probability that OPT generates σ is equal to the probability that ALGT
generates σ^T. Furthermore, if v_σ denotes the value collected by jobs run in σ and v_{σ^T} the value collected by jobs run in σ^T, then we show that v_σ ≤ 2v_{σ^T}. With these two ingredients, we conclude that twice the
expected value collected by ALGT = ALG′ upper bounds the expected value of OPT. The rest of the proof is
a formalization of this idea.
Let σ = σ_1 σ_2 ··· σ_T be a state for time T + 1 generated by OPT. Let i_1, ..., i_k be the jobs run by OPT in this state and τ_1 < ··· < τ_k be the times they are run. We first generate a sequence of states for time T + 1 that ALG_t can generate for each t = 1, ..., T. Let σ^0 = σ. For t ≥ 1, we generate σ^t from σ^{t−1} as follows. Let σ_τ^{t−1} be the τ-th triplet in σ^{t−1}. Then,

• If σ_t^{t−1} = (R_t, null, null), then σ^t = σ^{t−1}, and we go to t + 1.
• Otherwise, if σ_t^{t−1} = (R_t, s, t → j), then let ALG_t(σ_1^{t−1} ··· σ_{t−1}^{t−1}) = j′ and define σ^t = σ_1^{t−1} ··· σ_{t−1}^{t−1} (R_t, s, t → j′) σ_{t+1}^{t−1} ··· σ_T^{t−1}. That is, σ^t is the same as σ^{t−1} with the t-th triplet changed to (R_t, s, t → j′).
30
With this, we have generated a sequence σ^t that ALG_t can generate. An inductive argument shows that the probability that the sequence σ is generated by OPT is the same as the probability that ALG_t generates σ^t. Moreover, notice that ALG_t and ALG_{t−1} make decisions different from null at the same times τ_1, ..., τ_k.
Now we are ready to compare the values of ALG_{t−1} and ALG_t. We do this by recording the values as in the table in Figure 4. The columns correspond to the times where OPT, ALG_1, ..., ALG_T make decisions, in this case τ_1, ..., τ_k. The rows are the values that each algorithm ALG_t collects, with a minor twist: for each decision time τ_i ≤ t, we boost the value collected by ALG_t at τ_i by a factor of 2, as Figure 4 shows. We claim that the
time   τ_1         τ_2         τ_3       τ_4         ···   τ_k
OPT    v_{i_1}     v_{i_2}     v_{i_3}   v_{i_4}     ···   v_{i_k}
ALG_1  2v_{i*_1}   v_{i_2}     0         v_{i_4}     ···   v_{i_k}
ALG_2  2v_{i*_1}   v_{i_2}     0         v_{i_4}     ···   v_{i_k}
ALG_3  2v_{i*_1}   2v_{i*_2}   0         v_{i_4}     ···   v_{i_k}
⋮
ALG_T  2v_{i*_1}   2v_{i*_2}   0         2v_{i*_4}   ···   0

Figure 4: Comparison of coupled values. In this example, 1 = τ_1 < 2 < τ_2, and so ALG_2 does not change the action at τ_2.
sum of values in the t-th row in Figure 4 upper bounds the sum of values in the (t − 1)-th row for all t ≥ 1. Indeed, for t ≥ 1, if t ≠ τ_1, ..., τ_k, then the sums of values in rows t and t − 1 are the same. If t = τ_i for some i, then there are two cases:
• If ALG_t gets 0 value at time t, then ALG_{t−1} does as well. Since every other entry in the two rows coincides, the result follows in this case.
• If ALG_t gets a positive value v_{i*} for some i*, then this value upper bounds the value obtained in the same column for row t − 1. Furthermore, the additional copy induced by the factor of 2 in v_{i*} helps pay for any future occurrence of job i* in row t − 1. Hence, the result follows in this case as well.
From these two results, we obtain

Σ_{j run in σ} v_j Pr(OPT generates σ) ≤ 2 Σ_{j run in σ^T} v_j Pr(ALG_T generates σ^T),

and summing over all σ (each of which has a unique σ^T associated) concludes that v(OPT) ≤ 2v(ALG′).
Proof. Observe that whenever ALG′ runs a job, it runs the highest-valued available one. Greedy does the same, but it never misses an opportunity to run a job, as ALG′ might. Hence, the expected value collected by greedy can only be larger than the expected value collected by ALG′. This finishes the proof.
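A minimal simulation sketch (function and variable names are ours) of the greedy policy on a single realization of the service and departure times:

```python
def greedy_value(values, departures, services, T):
    """Value collected by the greedy policy on one realization:
    whenever the server is free, run the highest-valued job that is
    still in the system (job j is available through time departures[j])."""
    n = len(values)
    busy_until = 0          # server is occupied through time busy_until
    run, total = set(), 0.0
    for t in range(1, T + 1):
        if t <= busy_until:
            continue        # a previously started job is still in service
        available = [j for j in range(n)
                     if j not in run and t <= departures[j]]
        if available:
            j = max(available, key=lambda i: values[i])
            run.add(j)
            total += values[j]
            busy_until = t + services[j] - 1
    return total
```

For example, with values (5, 4, 1), departures (1, 3, 3), unit-or-longer services (2, 1, 1), and horizon T = 3, greedy runs job 1 at time 1 and job 2 at time 3, collecting 9.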
We also make the necessary modifications to the constraints accordingly. Since each job j has a deadline of B_j, the summation in Constraint (ddl(a)) now has a temporal upper bound of B_j. As for Constraint (ddl(b)), note that we are only allowed to run job j up to time B_j. Therefore, at any given time t, only jobs with deadlines later than t are permitted to occupy the server.
Proof of Lemma 3.5. We use Con_{j,t} to denote the indicator of the event that job j is considered by ALG at time t; by Equation (2), it equals 1 with probability x*_{j,t}/2, and 0 otherwise. Let Con_j be the indicator that ALG ever considers job j at any time during the process; Con_j is a Bernoulli variable that equals 1 with probability at most Σ_{t≥1} x*_{j,t}/2. Let Con = Σ_{j∈[n]} w_j · Con_j be the total weight of the considered jobs. Thus,

E[Con] = E[ Σ_{j∈[n]} w_j · Con_j ] ≤ Σ_{t≥1} Σ_{j=1}^n w_j · x*_{j,t}/2 ≤ W/2,
where the final inequality uses Constraint (c). Therefore, we apply the upper tail Chernoff bound for bounded variables [30] and obtain:

Pr(Con > W) ≤ exp(−W² / (2n·w²_max)).

Since ALG only runs jobs that it considers, the probability that the jobs ALG runs have a cumulative weight of at least W can also be bounded by exp(−W² / (2n·w²_max)). From here, the result follows.
Proof of Corollary 3.7. We use the same notation as in the proof of Lemma 3.5 above. Note that now the expected number of jobs considered during the whole process has a different bound:

E[Con] ≤ Σ_{t≥1} Σ_{j=1}^n x*_{j,t}/2 ≤ k/2,

by plugging in w_j = 1 and W = k. Therefore, we apply the upper tail Chernoff bound for Bernoulli variables [43] and obtain:

Pr(Con > k) ≤ exp(−k/6).

From here, we use the same reasoning as in Theorem 3.4 and reach Corollary 3.7.
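The Bernoulli-case bound above is an instance of the standard multiplicative Chernoff bound (see [43]): for a sum X of independent Bernoulli random variables, μ ≥ E[X], and 0 < δ ≤ 1,

```latex
\Pr\left( X \ge (1+\delta)\mu \right) \le \exp\left( -\frac{\delta^2 \mu}{3} \right).
```

Taking μ = k/2 (an upper bound on E[Con]) and δ = 1 recovers Pr(Con > k) ≤ exp(−k/6).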
Example D.1. Consider the greedy algorithm that runs the available job with the largest value per unit-weight (v_j/w_j) that won't exceed the remaining size of the knapsack. Consider the following instance of 2 jobs and knapsack size W. We set the parameters of job 1 as follows: v_1 = w_1 = 1, D_1 = 1, and S_1 = 1 with probability 1. For job 2, we have v_2 = W − 1, w_2 = W, D_2 = 1, and S_2 = 1 with probability 1. Observe that any algorithm can run at most one job, and the greedy algorithm runs job 1 (whose value per unit-weight is 1 > (W − 1)/W), collecting value 1. The optimal solution obtains a total value of W − 1 by running job 2 first.
Example D.2. Consider the greedy algorithm that runs the available job with the largest value v_j that won't exceed the remaining size of the knapsack. Consider the following instance of n jobs and knapsack size W > n. We set the parameters of job 1 as follows: v_1 = 1 + ε, w_1 = W, D_1 = n, and S_1 = 1 with probability 1. For each job j ∈ {2, ..., n}, we have v_j = 1, w_j = 1, D_j = n, and S_j = 1 with probability 1. Observe that the greedy algorithm runs job 1 first (it has the largest value), which exhausts the knapsack, so it collects only 1 + ε. The optimal solution obtains a total value of n − 1 by serving all jobs but job 1.
Example D.3. Consider the hybrid greedy algorithm that, from the set of available jobs that won't exceed the remaining size of the knapsack, with probability 1/2 runs the job j with the largest v_j and then stops, and with probability 1/2 runs jobs in decreasing order of value per unit-weight (v_j/w_j). Note that the above policy gives a 1/2-approximation for the standard knapsack problem. Consider the following instance of n + 1 jobs and knapsack size n (assume for simplicity that n is even). For each job j ∈ {1, ..., n}, we have v_j = 1 + (j − (n+1)/2)·ε, w_j = 1, D_j = j + 1, and S_j = 1 with probability 1. We set the parameters of job n + 1 as follows: v_{n+1} = 1 + nε/2, w_{n+1} = n, D_{n+1} = n, and S_{n+1} = 1 with probability 1. With probability 1/2, the hybrid greedy policy retrieves a value of v_{n+1} = 1 + nε/2 and stops. With the remaining probability 1/2, it runs jobs n, n − 1, ..., n/2 + 1 sequentially and then stops, since all other jobs have already departed, retrieving a total value of n/2 + n²ε/8. The optimal policy runs jobs 1, ..., n sequentially and collects a total value of n. Thus, for n large and ε → 0, v(ALG_{Hybrid Greedy})/v(OPT) → 1/4.
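The separation between the two greedy rules in these examples can be checked numerically. The sketch below (helper names are ours) evaluates what each rule collects on a two-job instance with v_1 = w_1 = 1, v_2 = W − 1, w_2 = W, where both jobs depart after time 1, so only one job ever runs:

```python
def first_pick_value(jobs, W, score):
    """Value collected by a greedy rule on an instance where every job
    departs after time 1: pick the fitting job maximizing `score`."""
    fitting = [(v, w) for (v, w) in jobs if w <= W]
    if not fitting:
        return 0.0
    v, _ = max(fitting, key=score)
    return v

W = 10.0
jobs = [(1.0, 1.0), (W - 1.0, W)]
by_value = first_pick_value(jobs, W, score=lambda vw: vw[0])            # picks job 2
by_density = first_pick_value(jobs, W, score=lambda vw: vw[0] / vw[1])  # picks job 1
```

Here greedy-by-value collects W − 1, while greedy-by-density collects only 1, since job 1's ratio 1 exceeds job 2's ratio (W − 1)/W.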
E The Sampling-Based Algorithm
In this section, we complete the proof of Lemma 2.7. We illustrate the simulation process through Algorithm 2, which is a slight modification of ALG (see §E.1). We then analyze how the simulation impacts the objective value, together with a failure analysis, in the subsequent subsections (see §E.2).
As discussed in Lemma 2.2, we know that f_{j,1} = 1 for all j ∈ [n]; thus, we perform the following simulation process starting from time t = 2:

• We start the simulation by iterating ALG_ε with t = 2, along with the other standard inputs (distributional information for service and departure times, as well as job values), for a total of M times. For each iteration, when job j satisfies the event {NC_{j,2}, SF_2 | A_{j,2}}, we increase Count_{j,2} by 1. In the end, we retrieve the value f̂_{j,2} = Count_{j,2}/M for all j.

• We continue the same process, with the time t increasing by 1 each step, up until time B, and use {f̂_{j,t′}}_{j∈[n], t′∈[t−1]} as known information to simulate {f̂_{j,t}}_{j∈[n]}.
Algorithm 2 ALG_ε
1: Input: jobs j = 1, ..., n, with values v_j, departure probabilities p_j(·), parameter c, time t, simulated {f̂_{j,t′}}_{j∈[n], t′∈[t]}
2: x* ← optimal solution of LP (LP-Sched).
3: C ← [n].
4: for t′ = 1, ..., t do
5:     if the server is free then
6:         C* ← ∅.
7:         for j ∈ C do:
8:             if x*_{j,t′} ≥ ε/(n²T·v_j) then: add j to C* w.p. x*_{j,t′} · (1 − ε)²/(2 · p_j(t) · f̂_{j,t′}).
9:         j_{t′} ← arg max_{j∈C*} {v_j}; run j_{t′}.
10:        update C by removing the job that was considered (if any) or the jobs that are no longer available.
E.2 Analysis
E.2.1 Impact on Objective Value
Claim E.1. Given x*, the optimal solution of LP (LP-Sched), we can disregard the tiny values x*_{j,t}, for all j and t, that are no larger than ε/(n²T·v_j), as illustrated in ALG_ε, at the expense of an ε loss in the overall objective.
Proof. Without loss of generality, we can normalize the objective value to 1 by scaling the respective v_j values. Then, the loss in the LP value can be bounded:

loss of v_LP = Σ_{t≥1} Σ_{j=1}^n I{x*_{j,t} ≤ ε/(v_j · n²T)} · x*_{j,t} v_j ≤ ε.
The advantage of doing so lies in the fact that very small f_{j,t} values would not fit the Monte Carlo simulation scheme in general. Consequently, we avoid simulating these numerical values. Indeed,

E[f̂_{j,t}] = f_{j,t} ≥ x*_{j,t} / (2 · p_j(t)) ≥ ε / (2n²T · p_j(t) · v_j).
This establishes a lower bound for the f_{j,t} values that we simulate, and thus we assume that Count_{j,t} ≥ µ in the following analysis.
Notice that the following scenario may happen during the simulation: f̂_{j,t} can be more than ε away from f_{j,t}. By the Chernoff bound [43], the probability of this event can be bounded; we call this event a failure, and by a union bound, failures occur with probability at most ε over the whole time frame t ∈ [nT] and across all jobs j ∈ [n]. Assuming no failures happen, we inductively prove the following claim.
Claim E.2. Given zero failures up to time t, denoted ZF_t, we have x*_{j,t} · (1 − ε)²/(2 · p_j(t) · f̂_{j,t}) ≤ 1 for all t ∈ [nT] and j ∈ [n].

Proof. Note that, by the set-up of ALG_ε, the probability of C′_{j,t} is:

Pr[C′_{j,t}] = (x*_{j,t} / (2 f̂_{j,t})) · f_{j,t} · (1 − ε)².

Assuming no failures happen up to time t (i.e., 1 − ε ≤ f̂_{j,t}/f_{j,t} ≤ 1 + ε), and using that f_{j,t} ≥ x*_{j,t}/(2 · p_j(t)), reordering the above gives the claimed bound.
Finally, we connect the above facts with what we want to show in Corollary 1.3.

Proof. Notice that, following ALG_ε, the probability that job j is in the consideration set C* at time t is at least ((1 − ε)²/(1 + ε)) · x*_{j,t}/2, by Inequality (3). Using a technique similar to the proof of Theorem 1.2, we get that the expected value obtained is at least ((1 − ε)²/(1 + ε)) · (1/2) · (1 − 1/e) · v(OPT). Therefore, by treating a failure (which happens with probability at most ε) as a run obtaining 0 value, we recover Corollary 1.3.
Table 3: Algorithm Values and 95% Confidence Intervals (Synthetic Datasets). Results are rounded to two decimals for consistency.
Table 4: Algorithm Values and 95% Confidence Intervals (Real Datasets). Results are rounded to two decimals for consistency.
Figure 5: Point mass functions of the service time distributions for the three categories of customers.
Figure 6: Point mass functions of the departure time distributions for the three categories of customers.
Figure 7: Sample path of approximating f_{1,3}.