Machine Scheduling
Contents
Problem description
Empirical Scheduling
Optimal Scheduling
# install dependencies and select solver
%pip install -q amplpy matplotlib
SOLVER = "highs"
from amplpy import AMPL, ampl_notebook
ampl = ampl_notebook(
modules=["highs"], # modules to install
license_uuid="default", # license to use
) # instantiate AMPL object and register magics
Problem description
“Which job should be done next?” is a question one face in modern life, whether for a busy
student working on course assignments, a courier delivering packages, a server waiting on
tables in a busy restaurant, a computer processing threads, or a machine on a complex
assembly line. There are empirical answers to this question, among them “first in, first out”,
“last in, first out”, or “shortest job first”.
What we consider in this notebook is the modeling finding solutions to this class of problem
using optimization techniques. This notebook demonstrates the formulation of a model for
scheduling a single machine scheduling using disjunctive programming in AMPL. The problem
is to schedule a set of jobs on a single machine given the release time, duration, and due time
for each job. Date for the example problem is from Christelle Gueret, Christian Prins, Marc
Sevaux, “Applications of Optimization with Xpress-MP,” Chapter 5, Dash Optimization, 2000.
Skip to main content
Job data
The problem is to schedule a sequence of jobs for a single machine. The problem data
presented as a Pandas dataframe where each row corresponds to a job. The dataframe in
indexed the names of the jobs. The data for each job consists of the time when it is released
for machine processing, the duration of the job, and when the it is due. The objective is, if
possible, to schedule the jobs on the machine to satisfy the due dates. If that is not possible,
then the objective is to schedule the jobs to minimizing some measure of schedule “badness”.
import pandas as pd
jobs = pd.DataFrame(
{
"A": {"release": 2, "duration": 5, "due": 10},
"B": {"release": 5, "duration": 6, "due": 21},
"C": {"release": 4, "duration": 8, "due": 15},
"D": {"release": 0, "duration": 4, "due": 10},
"E": {"release": 0, "duration": 2, "due": 5},
"F": {"release": 8, "duration": 3, "due": 15},
"G": {"release": 9, "duration": 2, "due": 22},
}
).T
jobs
release duration due
A 2 5 10
B 5 6 21
C 4 8 15
D 0 4 10
E 0 2 5
F 8 3 15
G 9 2 22
Schedule data
A schedule is also represented here by a Pandas dataframe indexed by job names. The
columns indicate the start, finish, and the
Skip toamount by each job is past due. The following cell
main content
creates a schedule when the jobs are executed in the order given by the jobs dataframe.
def schedule_jobs(jobs, seq):
"""
Schedule a sequence of jobs based on their release times, durations, and du
Args:
jobs (DataFrame): Job information with columns "release", "duration", a
seq (DataFrame): A list of job indices representing the sequence in whi
Return:
A DataFrame containing schedule information, including "start", "finish
"""
schedule = pd.DataFrame(index=jobs.index)
t = 0
for job in seq:
t = max(t, jobs.loc[job]["release"])
schedule.loc[job, "start"] = t
t += jobs.loc[job, "duration"]
schedule.loc[job, "finish"] = t
schedule.loc[job, "past"] = max(0, t - jobs.loc[job, "due"])
return schedule
schedule = schedule_jobs(jobs, jobs.index)
schedule
start finish past
A 2.0 7.0 0.0
B 7.0 13.0 0.0
C 13.0 21.0 6.0
D 21.0 25.0 15.0
E 25.0 27.0 22.0
F 27.0 30.0 15.0
G 30.0 32.0 10.0
Gantt chart
A traditional means of visualizing scheduling data in the form of a Gantt chart. The next cell
presents a function gantt that plots a Gantt
Skip chart
to main given jobs and schedule information.
content
ort matplotlib.pyplot as plt
m matplotlib.lines import Line2D
gantt(jobs, schedule, title="Job Schedule"):
"""
Plot a Gantt chart for a given job schedule.
Args:
jobs (DataFrame): Contains job release times and due times.
schedule (DataFrame): Contains job start times, finish times, and past due
title (str, optional): Title of the Gantt chart. Defaults to "Job Schedule".
"""
w = 0.25 # bar width
fig, ax = plt.subplots(1, 1, figsize=(10, 0.7 * len(jobs.index)))
for k, job in enumerate(jobs.index):
r = jobs.loc[job, "release"]
d = jobs.loc[job, "due"]
s = schedule.loc[job, "start"]
f = schedule.loc[job, "finish"]
# Show job release-due window
ax.fill_between(
[r, d], [-k - w, -k - w], [-k + w, -k + w], lw=1, color="k", alpha=0.2
)
ax.plot(
[r, r, d, d, r], [-k - w, -k + w, -k + w, -k - w, -k - w], lw=0.5, color
)
# Show job start-finish window
color = "g" if f <= d else "r"
ax.fill_between(
[s, f], [-k - w, -k - w], [-k + w, -k + w], color=color, alpha=0.6
)
ax.text(
(s + f) / 2.0,
-k,
"Job " + job,
color="white",
weight="bold",
ha="center",
va="center",
)
# If past due
if f > d:
ax.plot([d] * 2, [-k + w, -k + 2 * w], lw=0.5, color="k")
ax.plot([f] * 2, [-k + w, -k + 2 * w], lw=0.5, color="k")
ax.plot([d, f], [-k + 1.5 * w] * 2, solid_capstyle="butt", lw=1, color="
ax.text(
f + 0.5,
-k + 1.5 * w,
f"{schedule.loc[job, 'past']}
Skip to past due",
main content
va="center",
)
total_past_due = schedule["past"].sum()
ax.set_ylim(-len(jobs.index), 1)
ax.set_title(f"{title}: Total past due = {total_past_due}")
ax.set_xlabel("Time")
ax.set_ylabel("Jobs")
ax.set_yticks([]) # range(len(jobs.index)), jobs.index)
ax.spines["right"].set_visible(False)
ax.spines["left"].set_visible(False)
ax.spines["top"].set_visible(False)
custom_lines = [
Line2D([0], [0], c="k", lw=10, alpha=0.2),
Line2D([0], [0], c="g", lw=10, alpha=0.6),
Line2D([0], [0], c="r", lw=10, alpha=0.6),
]
ax.legend(
custom_lines,
["Job release/due window", "Completed on time", "Completed past due"],
)
tt(jobs, schedule, "Execute Jobs in Order")
Empirical Scheduling
To provide a comparison to scheduling with optimization models, the following cells implement
two well known and accepted empirical
Skip rules forcontent
to main scheduling jobs on a single machine:
First in, first out (FIFO)
Earliest due data (EDD)
First-in, first-out (FIFO)
One of the most common scheduling rules is to execute jobs in the order they are released for
processing, in other words “first-in, first-out” (FIFO). The following cell creates a Pandas
dataframe is indexed by job names. The start time, finish time, and, if past due, the amount by
which the job is past due.
fifo = schedule_jobs(jobs, jobs.sort_values(by="release").index)
gantt(jobs, fifo, "First in, First out")
Earliest due date (EDD)
When due dates are known, a common scheduling rule is to prioritize jobs by due date. This
strategy will be familiar to any student deciding which homework assignment should to work
on next.
edd = schedule_jobs(jobs, jobs.sort_values(by="due").index)
gantt(jobs, edd, "First in, First out")
Skip to main content
Optimal Scheduling
The modeling starts by defining the problem data.
Symbol Description
releasej when job j is available
durationj how long job j
duej when job j is due
The essential decision variable is the time at which the job starts processing.
Print to PDF
Symbol Description
startj when job j starts
finishj when job j finishes
pastj how long job j is past due
Depending on application and circumstances, various objectives can be considered. Suitable
objectives include the total numberSkip to main
of late content
jobs, the longest past due interval, or the sum of all
past due intervals. The following AMPL model minimizes the sum of past due intervals, that is
min ∑ pastj
j
Constraints describe the required logical relationships among the decision variables. For
example, a job cannot start until it is released for processing
startj ≥ releasej
Once started, machine processing continues until the job is finished. The finish time for each
job is compared to the due time, and any past due interval is stored the pastj decision variable.
finishj = startj + durationj
pastj ≥ finishj − duej
pastj ≥ 0
The final set of constraints require that no pair of jobs be operating on the same machine at the
same time. For this purpose, we consider each unique pair (i, j) where the constraint i < j to
imposed to avoid considering the same pair twice. Then for any unique pair i and j, either i
finishes before j starts, or j finishes before i starts. This is expressed as the family of
disjunctions
[ finishi ≤ startj] ⊻ [finishj ≤ starti ] ∀i < j
This model and constraints can be directly translated to AMPL.
%%writefile job_machine_scheduling.mod
set JOBS;
set PAIRS within{JOBS, JOBS};
param release{JOBS};
param duration{JOBS};
param due{JOBS};
var start{JOBS} >= 0, <= 300;
var finish{JOBS} >= 0, <= 300;
var past{JOBS} >= 0, <= 300;
Skip to
s.t. job_release {job in JOBS}: main content
start[job] >= release[job];
s.t. job_duration {job in JOBS}: finish[job] == start[job] + duration[job];
s.t. past_due_constraint {job in JOBS}: past[job] >= finish[job] - due[job];
s.t. machine_deconflict {(job_a, job_b) in PAIRS}:
(finish[job_a] <= start[job_b] or finish[job_b] <= start[job_a])
and not
(finish[job_a] <= start[job_b] and finish[job_b] <= start[job_a])
;
minimize minimize_past: sum{job in JOBS} past[job];
Overwriting job_machine_scheduling.mod
def build_model(jobs):
m = AMPL()
m.read("job_machine_scheduling.mod")
m.set_data(jobs, "JOBS")
pairs = [(i, j) for i in jobs.index for j in jobs.index if i < j]
m.set["PAIRS"] = pairs
return m
def solve_model(m, solver_name=SOLVER):
m.option["solver"] = solver_name
m.solve()
schedule = pd.DataFrame(
{
"start": m.var["start"].to_dict(),
"finish": m.var["finish"].to_dict(),
"past": m.var["past"].to_dict(),
}
)
schedule["past"] = schedule["past"].round()
return schedule
model = build_model(jobs)
schedule = solve_model(model)
display(schedule)
HiGHS 1.5.1: HiGHS 1.5.1: optimal solution; objective 16
2447 simplex iterations
85 branching nodes
Skip to main content
start finish past
A 6.0 11.0 1.0
B 14.0 20.0 0.0
C 22.0 30.0 15.0
D 2.0 6.0 0.0
E 0.0 2.0 0.0
F 11.0 14.0 0.0
G 20.0 22.0 -0.0
gantt(jobs, schedule, "Minimize total past due")