
Journal of Systems Architecture 153 (2024) 103198


Gradient descent algorithm for the optimization of fixed priorities in real-time systems

Juan M. Rivas a,∗, J. Javier Gutiérrez a, Ana Guasque b, Patricia Balbastre b

a Software Engineering and Real-Time Group, Universidad de Cantabria, Avda. de los Castros 48, Santander, 39005, Spain
b Instituto de Automática e Informática Industrial (ai2), Universitat Politècnica de València, Camino de Vera, s/n, Valencia, 46022, Spain

ARTICLE INFO

Keywords: Real-time; Fixed-priorities; Optimization; Gradient descent

ABSTRACT

This paper considers the offline assignment of fixed priorities in partitioned preemptive real-time systems where tasks have precedence constraints. This problem is crucial in this type of systems, as having a good fixed priority assignment allows for an efficient use of the processing resources while meeting all the deadlines. In the literature, we can find several proposals to solve this problem, which offer varying trade-offs between the quality of their results and their computational complexities. In this paper, we propose a new approach, leveraging existing algorithms that are widely exploited in the field of Machine Learning: Gradient Descent, the Adam Optimizer, and Gradient Noise. We show how to adapt these algorithms to the problem of fixed priority assignment in conjunction with existing worst-case response time analyses. We demonstrate the performance of our proposal on synthetic task-sets with different sizes. This evaluation shows that our proposal is able to find more schedulable solutions than previous heuristics, approximating optimal but intractable algorithms such as MILP or brute-force, while requiring reasonable execution times.

1. Introduction

Real-time systems, which impose both functional and timing constraints, can be found in many mission-critical applications in domains such as automotive, aerospace and healthcare. These systems are usually composed of a set of tasks that are concurrently scheduled by a scheduler provided by a Real-Time Operating System (RTOS). Although already proposed more than half a century ago [1] in the form of Rate Monotonic Scheduling, Fixed Priority Scheduling (FPS) nowadays remains the most common scheduling policy used in real-time systems [2], and is extensively supported in current RTOSs and programming languages [3]. With FPS, each task is assigned a static fixed priority at design time. At runtime, the scheduler selects for execution the active task with the highest priority.

The assignment of fixed priorities is a vital step in the design of FPS real-time systems. A bad priority assignment may result in an under-utilization of the resources in order to meet the timing constraints. On the contrary, a good priority assignment allows for higher utilization of the resources while complying with the timing constraints, and thus reduced costs.

We consider real-time systems characterized by precedence relationships, akin to those encountered in distributed systems. The challenge of finding a fixed priority assignment that meets the timing constraints in this type of systems is known to be NP-hard [4]. Several heuristics have been proposed to work around this problem offering sub-optimal solutions [3], ranging from the application of general-purpose techniques such as Genetic Algorithms [5,6] or Simulated Annealing [4], to tailor-made algorithms such as HOPA [7]. An interesting technique that has been proposed is Mixed Integer Linear Programming (MILP) [8]. While MILP is in theory able to provide optimal solutions, its main drawback lies in its scalability issues, which become apparent when the complexity of the system increases.

Nowadays, the field of Artificial Intelligence, and more specifically Machine Learning, is experiencing the highest rates of research interest and production in the area of Computer Science. This push is especially felt in the advancements reported in areas such as Natural Language Processing, Image Generation or Autonomous Systems. In their most basic building blocks, these systems are usually composed of vast neural networks that must be subject to a computing-intensive training process in order to produce useful results. This training is essentially an optimization process, in which the parameters of the neural networks (e.g. weights and biases) are iteratively tuned to minimize a cost function. Currently, Gradient Descent (GD) is the de facto algorithm for training such neural networks [9]. GD is a general-purpose optimization algorithm that is used to minimize differentiable mathematical functions. It achieves this by repeatedly making small adjustments to the parameters of the cost function in the opposite direction of the gradient.

∗ Corresponding author.
E-mail address: [email protected] (J.M. Rivas).

https://doi.org/10.1016/j.sysarc.2024.103198
Received 31 January 2024; Received in revised form 5 April 2024; Accepted 31 May 2024
Available online 4 June 2024
1383-7621/© 2024 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

The field of Machine Learning has produced further variants and optimizations of the original GD algorithm that have demonstrated their effectiveness in locating minima of the cost function in large search spaces, such as deep neural networks [9]. In this paper, we propose exploiting these advancements in the efficient training of neural networks by adapting the Gradient Descent optimization algorithm to the problem of assigning fixed priorities in real-time systems. Additionally, from the Machine Learning literature we pick two techniques that enhance the behavior of Gradient Descent: the Adam optimizer and Gradient Noise. By employing Gradient Descent and these subsequent optimizations, we aim to define a priority assignment algorithm that approximates an optimal assignment, while avoiding the scalability issues of optimal techniques such as MILP.

This paper is organized as follows. In Section 2, we describe the system model for real-time systems that we assume in this paper. In Section 3, we describe how a generic gradient descent algorithm operates, and we also list state-of-the-art algorithms to assign fixed priorities in real-time systems that conform to our model. Section 4 describes the main contribution of this paper, a Gradient Descent-based algorithm to assign fixed priorities. Section 5 proposes an optimization of the previous algorithm to accelerate its execution. In Section 6, we present the results of an exhaustive evaluation of our proposal. Finally, in Section 7, we present the main conclusions of this work.

2. System model

To describe the system model we follow the terminology of the OMG MARTE specification for Schedulability Analysis Modeling (SAM) [10]. An implementation of this specification can be found in the MAST modeling framework [11,12].

We consider real-time systems composed of N steps statically allocated to processing resources. A step can model a task that executes on a CPU processing resource. The steps (i.e. tasks) are grouped into end-to-end (e2e) flows that establish precedence relationships among the steps. For simplicity, in this paper we consider linear e2e flows in which each step may have at most one successor and one predecessor step. Therefore, each e2e flow Γi is composed of a linear sequence of Ni steps. The j-th step of e2e flow Γi is denoted as τij. A step τij has a worst-case execution time (WCET) denoted as Cij, and is statically allocated to processing resource PRk.

Each end-to-end flow Γi is released by a periodic sequence of external events with period Ti. Sporadic events are also supported, in which case the period is considered as the minimum inter-arrival time of the events. We assume that all event sequences that arrive at the system and their worst-case rates are known in advance. The relative phasing of the activations of different end-to-end flows is assumed to be arbitrary and unknown.

Deadlines can be set for individual steps or for the whole end-to-end flow. In this paper, to simplify the notation and without loss of generality, we only consider e2e deadlines. We define Di as the e2e deadline that flow Γi must meet, counting from the release of the e2e flow until its last step finishes its execution. End-to-end deadlines have no restrictions in relation to the periods. Specifically, deadlines can be longer than the periods. The e2e flow Γi is released by the arrival of an external event ei with period Ti and e2e deadline Di. Assuming that the event ei arrives at time t, this deadline imposes that the execution of the whole end-to-end flow must finish before t + Di. Fig. 1 shows two simple e2e flows composed of 3 steps each, that traverse 3 different processing resources (PR1, PR2 and PR3).

Fig. 1. Two end-to-end flows (Γ1 and Γ2), composed of 3 steps, that traverse 3 processing resources (PR1, PR2 and PR3).

We assume that each processing resource is governed by a fixed priority preemptive scheduler, and that each step is scheduled by its statically assigned fixed priority. The fixed priority of step τij is denoted as Pij. Although the values of the fixed priorities are usually restricted to integers, in this paper we relax this restriction to allow any real number: Pij ∈ R. Moreover, step τij is said to have a higher priority than τkm if Pij > Pkm, where Pkm is the priority of step τkm.

We define the worst-case response time (WCRT) of a step as the longest possible (or an upper bound on the) interval counting from the release of its end-to-end flow until the step's completion. For a step τij, we denote its WCRT as Rij. We assume that a step τij can have a real response time between 0 and Rij. We define the WCRT of a flow Γi as the WCRT of its last step, and denote it as Ri. When every end-to-end flow meets its deadline, that is Ri ≤ Di ∀i, we say that the system is schedulable.

We allow steps to have input jitter, which models the maximum amount of time the release of a step may be delayed. The jitter of step τij is denoted as Jij. Jitters can have any arbitrary positive value. There is an inter-dependency between the jitters and the WCRTs because of the precedence constraints: the start time of a step depends on the finish time of its predecessor, which is not constant, as its response time can range from 0 to its WCRT.

The WCRTs of the steps, and by extension of the end-to-end flows, are obtained by applying a WCRT analysis. Since the problem of obtaining exact WCRTs is NP-hard [13], these analyses generally obtain safe upper bounds. In the literature, we can find several such analyses that support this system model, with varying degrees of pessimism and time complexity. For instance, the Holistic analysis [14,15] makes the simplifying but safe assumption that the dependency among tasks in the same e2e flow is only indirectly taken into account by the propagation of their jitters. Offset-Based techniques [16,17] model the in-flow step inter-dependencies more exactly through the use of offsets, resulting in generally less pessimistic WCRTs. Implementations of these techniques can be found in the open source MAST Analysis tool [11,18].

In this paper only tasks executed on CPUs and linear e2e flows are considered, but the MAST model and available WCRT analyses support a wider range of systems. For instance, distributed systems can be modeled by viewing messages as steps transmitted through network processing resources. The analysis of this message traffic on the networks can be carried out using similar techniques to those used in the CPUs, by adding a blocking term that accounts for the non-preemptability of the message packets [15,16]. Non-linear end-to-end flows with fork and join events are also supported [19]. More specific and industry-relevant standards, such as Logical Execution Time (LET) in the automotive sector, or ARINC-653 in aerospace applications, can also be supported [20–22].
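The system model above can be captured with a few plain data structures. The following sketch is illustrative only, not the MAST model nor the authors' implementation; every name in it (Step, Flow, System and their fields) is hypothetical.

```python
# Illustrative sketch of the system model described above; all names are hypothetical.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Step:
    name: str               # e.g. "t11"
    wcet: float             # C_ij, worst-case execution time
    processor: int          # processing resource the step is statically allocated to
    priority: float = 0.0   # P_ij, real-valued fixed priority
    jitter: float = 0.0     # J_ij, input jitter

@dataclass
class Flow:
    period: float           # T_i, period of the external event releasing the flow
    deadline: float         # D_i, end-to-end deadline
    steps: List[Step] = field(default_factory=list)  # linear sequence of steps

@dataclass
class System:
    flows: List[Flow] = field(default_factory=list)

    @property
    def steps(self) -> List[Step]:
        # flat view of every step, following the flow/step ordering
        return [s for f in self.flows for s in f.steps]
```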
3. Related work

In this section we provide the context of this work. First, in Section 3.1 we present a review of the state of the art on the assignment of fixed priorities. Then, in Section 3.2, we describe how a generic Gradient Descent algorithm operates.


3.1. Fixed priority assignment

The problem of finding a schedulable fixed priority assignment for a real-time system is known to be NP-hard [4]. The works we can find in the literature [3] offer different balances between their degree of optimality and their computational complexities. Here, we consider an algorithm as sub-optimal if it may be unable to find a schedulable priority assignment when one exists.

One of the first solutions proposed was to leverage the general-purpose Simulated Annealing (SA) algorithm [4]. SA is a global optimization technique that attempts to find the lowest point in an energy landscape, by emulating the physical process of heating and controlled cooling of a material to alter its physical properties. This algorithm was proposed both to find schedulable priority assignments and to map steps to processing resources.

Another algorithm found in the literature is HOPA [7]. This iterative technique was created ad hoc, and it is based on the distribution of the end-to-end deadlines among the steps, in the form of virtual deadlines, taking into account the steps' worst-case response times. These virtual deadlines are then transformed into fixed priorities by following a Deadline Monotonic criterion. HOPA has demonstrated to be capable of finding more schedulable priority assignments than SA, in significantly less time [7].

Proportional Deadlines (PD) [23] is another technique that can be used to assign fixed priorities. PD is a non-iterative algorithm that distributes the end-to-end deadlines among the steps proportionally to the WCETs. As in HOPA, these virtual deadlines are then transformed into fixed priorities by applying a Deadline Monotonic principle. As it is non-iterative, PD lacks the capability to improve on the initial priority assignment, and is generally outperformed by iterative algorithms. Nonetheless, it is a very fast algorithm that can be useful as a seed for iterative algorithms, such as HOPA.

Genetic Algorithms are part of the so-called evolutionary algorithms, which imitate biological mechanisms that guide the evolution process in species, and which are used to look for solutions to diverse problems in wide search spaces. In the context of optimizing real-time systems, they have been used as part of a multi-objective strategy, for example: (1) to allocate tasks in identical processors and to determine cyclic scheduling [24,25]; (2) to allocate independent tasks in heterogeneous distributed real-time systems, to which fixed priorities are assigned following the Rate Monotonic scheme [26]; or (3) to assign priorities to tasks, as well as to determine the timing slots for the messages transmitted through a TDMA network [6]. The work in [5] develops a permutational genetic algorithm for the assignment of priorities to tasks and messages in a distributed real-time system, where the results are compared to those obtained by HOPA [5], showing a slight improvement (up to 4% higher schedulable utilization), but at a much higher cost in computation time.

Similarly, the authors of [27] propose a multi-objective competitive co-evolution algorithm. This work considers tasks that may have precedence constraints, but presents two main incompatibilities with our model: (1) the exact activation instants of the flows are known beforehand (i.e. the relative phasings of the flows are known), and (2) the tasks are dynamically assigned a processor at runtime according to their global priority. The algorithm is evaluated using simulations on a set of industrial examples, and compares positively to relatively simple algorithms (manual expert assignment, random search and sequential search). The execution times range from less than 2 min to 16 h in the more complex example.

Mixed Integer Linear Programming (MILP) is a promising technique that has been applied to both step-to-processor mapping and priority assignment problems [8]. In MILP, the problem is described as a set of linear constraints and a linear objective function, which gets optimized within those constraints. The main benefit of MILP is that, in theory, it is an optimal algorithm, that is, it will find a schedulable solution if one exists. Existing commercial libraries such as Gurobi [28] provide efficient environments to define MILP problems. Nevertheless, we identify that MILP has two main challenges in its applicability to our system model.

First, it is known to be NP-hard itself. This translates into becoming intractable as the search space becomes bigger, which is confirmed in empirical evaluations [8]. Second, MILP requires the definition of linear constraints, which may not be available. For the problem we are tackling in this paper, that is, to find a schedulable priority assignment, we would require linear equations that model a schedulability test compatible with our system model. As far as we know, no such equations have yet been defined that could be feasibly applied to MILP. Generally, simplifying assumptions must be included in the model to obtain feasible equations. For instance, in [8] the distributed model assumes a strictly periodic activation of the steps, with constrained deadlines. This greatly simplifies the problem, and makes that work incompatible with our system model. Recent efforts [29] aim to relax those previous restrictions imposed on the system model, but still include several simplifications that make it incompatible with our model. Namely, it assumes that every step in an e2e flow has the same priority, that the end-to-end deadlines are constrained, and that the jitters are known beforehand and kept constant. In our model, there is an inter-dependency between jitters and worst-case response times that cannot be solved exactly if jitters are assumed constant.

The scalability issues of MILP were tackled in [30], which proposes a more efficient and near-optimal algorithm that exploits the idea of finding the maximum virtual deadlines that would render the system not schedulable. These values are iteratively computed with an in-loop standard Integer Linear Programming (ILP) optimization with relaxed constraints. Similarly to [8], this paper considers a simpler system model, with constrained deadlines and independent tasks without jitter.

In this paper, we aim to improve upon the performance of sub-optimal algorithms such as HOPA, while avoiding the scalability issues that an optimal technique such as MILP suffers.

3.2. Generic gradient descent algorithm

Gradient Descent (GD) is an optimization algorithm for minimizing differentiable mathematical functions. GD is extensively used in the field of Machine Learning to optimize parameters such as coefficients in linear regression problems or weights in neural networks.

The GD algorithm starts with an initial guess for the function input parameters, and then iteratively adjusts them in the direction that reduces the value of the function the most, until a minimum (or some other criterion) is reached. To achieve this behavior, GD employs the gradient of the function. The gradient of a function at a given input is a vector that points in the direction of the steepest increase of the function at said input. GD leverages this observation by updating the current input parameters in the opposite direction of the gradient, as this represents the direction of steepest descent. The size of this update is usually modulated by a factor called the learning rate.

In formal terms, given an n-dimensional function f(x1, …, xn), its input can be represented as an n-dimensional point p = (p1, …, pn). Function f is usually called the cost function. Therefore, GD is said to minimize the cost function. The gradient of function f at point p is depicted as ∇f(p), which can be expressed as a vector of the partial derivatives of f at point p as follows:

\nabla f(p) = \left[ \frac{\partial f}{\partial x_1}(p), \frac{\partial f}{\partial x_2}(p), \ldots, \frac{\partial f}{\partial x_n}(p) \right]    (1)

At a given iteration number t, the next point p_{t+1} is calculated by moving the current point p_t in the opposite direction of the gradient, which is scaled by a learning rate η:

p_{t+1} = p_t - \eta \nabla f(p_t)    (2)

Starting from an initial point p_0, Eq. (2) is iteratively applied. If an appropriate η factor is applied, and function f is differentiable around p_t, the inequality f(p_{t+1}) ≤ f(p_t) is respected. Therefore, by repeatedly applying Eq. (2), GD will traverse function f along a path that keeps minimizing f, until a point that produces a minimum of the function is reached. Further stopping criteria could be added, for instance establishing a maximum number of iterations.
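As a concrete illustration of Eqs. (1) and (2), the following is a minimal sketch of generic gradient descent with a numerically estimated gradient. It is not the GDPA algorithm of this paper; the cost function and all names are hypothetical examples.

```python
# Minimal gradient descent sketch for Eqs. (1)-(2); illustrative only.
import numpy as np

def numerical_gradient(f, p, h=1e-6):
    """Estimate the gradient of f at point p with central differences."""
    grad = np.zeros_like(p)
    for i in range(len(p)):
        e = np.zeros_like(p)
        e[i] = h
        grad[i] = (f(p + e) - f(p - e)) / (2 * h)
    return grad

def gradient_descent(f, p0, lr=0.01, max_iters=1000):
    """Iteratively apply Eq. (2) starting from p0."""
    p = np.asarray(p0, dtype=float)
    for _ in range(max_iters):
        p = p - lr * numerical_gradient(f, p)
    return p

# Example: minimize a simple convex cost function.
cost = lambda p: (p[0] - 3.0) ** 2 + (p[1] + 1.0) ** 2
print(gradient_descent(cost, [0.0, 0.0]))   # approaches [3, -1]
```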


Parameters such as the learning rate (η) are usually called hyper-parameters. A hyper-parameter is defined as a configuration variable that can tweak the behavior of the algorithm, but it is not a parameter that is being optimized. Typically, known good values for the hyper-parameters are selected and kept unchanged. For instance, a very low positive value for the learning rate (≈ 0.01) is usually considered a good candidate in the context of training neural networks.

The main challenge for GD is that there is no guarantee that a global minimum will be found: depending on the starting point and the shape of f, following the gradient may lead to different local minima. Furthermore, GD may get stuck in flat areas of the function, as the gradient there evaluates to 0.

The chances of finding a global minimum can be increased by enhancing GD with the idea of momentum, which can be intuitively explained if we visualize the Gradient Descent algorithm in the physical world. If we imagine the function f as a 3D shape, and the starting point as the location at which we release a ball, the ball will follow a downward path along the slopes of the shape, i.e. a path opposite to the gradients of the shape. The ball will continue its descent until reaching a resting state at a local minimum. If we add mass to the ball and consider gravity, the ball will accumulate momentum as it accelerates down steep slopes. This momentum may be sufficient for the ball to surpass the first local minimum it encounters, thus increasing the possibility of finding further lower minima.

Several techniques have been proposed that leverage this idea of momentum [9], which have proved to be effective when optimizing vast neural networks comprised of cost functions with millions of parameters. It is worth stating that it is not the objective of this paper to propose new solutions in the field of Machine Learning, but to select and exploit existing and successful techniques, adapting them to the problem of fixed priority assignment, which will be carried out in the next section.

4. Gradient descent priority assignment

The problem we aim to solve can be defined as follows: given a real-time system composed of end-to-end flows as described in Section 2, we want to find a fixed priority assignment for every step in the system in such a way that the system becomes schedulable. That is, we want to find a fixed priority value Pij for every step τij, so that the worst-case response times of every end-to-end flow are less than or equal to their end-to-end deadlines, Ri ≤ Di, ∀i. We assume that every step is already mapped to a processing resource. Any new step-to-processor mapping would require a re-computation of the fixed priority values.

We define Π as a priority assignment, which is a vector containing a particular mapping of a fixed priority value to each step. Therefore, a priority assignment Π is a flat view of the priority values assigned to a system. The ordering inside a priority assignment vector Π follows the ordering of the e2e flows and their steps, as shown in Eq. (3).

\Pi = \left[ P_{11}, P_{12}, \ldots, P_{21}, P_{22}, \ldots \right]    (3)

We propose adapting the Gradient Descent (GD) algorithm to assign fixed priorities. The resulting algorithm is called Gradient Descent Priority Assignment (GDPA). As previously described in Section 3.2, the basic idea of the generic Gradient Descent algorithm involves iteratively adjusting the input parameters of a cost function in the direction that makes the function decrease the most. GDPA mirrors this behavior by iteratively adjusting the fixed priority values of every step in the direction that reduces the worst-case response times in relation to the imposed deadlines.

GDPA is an iterative algorithm that will compute and evaluate one priority assignment per iteration. Accordingly, we denote as Π^t the priority assignment evaluated at iteration t. By extension, we define Pij^t as the fixed priority value assigned to step τij at iteration t.

Fig. 2 shows a high-level overview of the GDPA algorithm. GDPA is composed of 6 main phases: (1) initial priority assignment, (2) initial priority compression, (3) stop condition, (4) gradient computation, (5) gradient optimization, and (6) new priority assignment. These phases will be described in detail in the following sub-sections.

Fig. 2. Gradient Descent Priority Assignment flowchart.

4.1. Initial priority assignment

Any Gradient Descent algorithm requires an initial set of input values from which to start the optimization process. In the case of GDPA, these initial values represent an initial priority assignment, denoted as Π^ini. GDPA does not impose any restriction on this initial priority assignment. A technique such as PD [23] is a good candidate, as it is a fast algorithm. A better starting point can be provided by employing a more advanced but slower technique such as HOPA [7]. A completely random priority assignment can also be used.

4.2. Initial priority compression

In any Gradient Descent algorithm, the next candidate solution is always calculated by adding some values to the previous candidate. Consequently, in the case of GDPA, after several iterations there is a risk of inducing runaway priority values that could diverge as the algorithm progresses. To avoid this problem, we add a priority normalization stage by defining a compression function c, which constrains the priority values into the range [−1, 1]. The compression function c is defined as follows:

c(\Pi) = \frac{\Pi}{\max(|\Pi|)}    (4)

where Π is a priority assignment, and max(|Π|) represents the maximum of the absolute values of every priority value in Π.

Before feeding the initial priority assignment into GDPA, the compression function c is applied to make sure the priority values get constrained within the expected range [−1, 1]. The resulting priority assignment is labeled Π^0, indicating that this is the first priority assignment evaluated by GDPA, that is, the priority assignment at iteration t=0. Formally:

\Pi^0 = c(\Pi^{ini})    (5)

It is worth noting that the compression function c does not modify the actual priority ordering of the steps. Therefore, it has no impact on the results of the schedulability analysis.
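A sketch of the compression step of Eqs. (4)–(5), assuming priorities are held in a NumPy array. The function name is illustrative and not taken from the authors' implementation.

```python
# Illustrative sketch of the priority compression of Eqs. (4)-(5).
import numpy as np

def compress(priorities: np.ndarray) -> np.ndarray:
    """Scale a priority assignment into [-1, 1] without changing its ordering."""
    return priorities / np.max(np.abs(priorities))

pi_ini = np.array([1.0, 2.0, 3.0, 1.0, 2.0, 1.0])   # the initial assignment of Section 4.6
pi_0 = compress(pi_ini)                              # -> [0.33, 0.67, 1.0, 0.33, 0.67, 0.33]
```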


4.3. Stop condition

The priority assignment computed at each iteration is evaluated to determine whether the GDPA algorithm should terminate. In the case of the first iteration t=0, the priority assignment Π^0 is evaluated. In GDPA, the algorithm stops if any of the following criteria is met:

1. The current priority assignment Π^t is schedulable.
2. A maximum number of iterations has been reached.

To determine if a priority assignment is schedulable, any schedulability test compatible with the system model described in Section 2 can be employed. Typically, the schedulability of the priority assignment can be determined by applying a worst-case response time analysis, and then comparing the resulting WCRTs with the deadlines. Techniques such as the Holistic analysis [15] or Offset-Based analyses [16,17] can be applied. The selection of which analysis to employ must balance the trade-offs between computing time and the pessimism in the obtained WCRTs.

Regardless of which stopping criterion was met to terminate the algorithm, GDPA will always return the priority assignment that produces the lowest value of the cost function, among all the priority assignments evaluated. The cost function in GDPA is detailed in Section 4.4.1.

The stop condition could be extended by allowing further iterations after a schedulable priority assignment has already been found. This may enable finding solutions with lower cost values. We leave the study of such a capability outside the scope of this work, in order to show more clearly the ability of GDPA to find a schedulable solution.

4.4. Gradient computation

The objective of this phase is to obtain the gradient of the cost function at the current priority assignment. To achieve this we must first define a suitable cost function. This is described in the following Section 4.4.1. Later, Section 4.4.2 will outline a method to calculate the gradient of the selected cost function.

4.4.1. Cost function

In this section, we define a suitable cost function that can be employed in the GDPA algorithm. In general, the cost function should have as input the parameters we aim to optimize, i.e., we want to find the input at which the cost function is minimized. Moreover, the cost function should represent a metric that we aim to optimize. Accordingly, for the problem of fixed priority assignment, we identify that a cost function f suitable for GDPA should comply with the following 2 requirements:

1. The input of the cost function should be a priority assignment: the cost function maps each priority assignment to a cost value.
2. The cost function should inversely reflect the schedulability of the system: lower values of the cost should indicate a better schedulability situation. Therefore, by minimizing the cost function we are effectively maximizing the schedulability. Although the schedulability status of a system is binary (it is either schedulable or not), here schedulability refers to a hypothetical continuous value that quantifies how close (or far) the system is to becoming schedulable, in terms of the distance between its WCRTs and its deadlines.

By applying Gradient Descent with a hypothetical cost function f with the characteristics described above, we would iteratively find new priority assignments that could potentially converge towards a minimum of the cost function, that is, a maximum of the schedulability.

To define the cost function, we will leverage the worst-case response times (WCRT) of the end-to-end flows, as these provide the clearest indication of the schedulability of the system. Therefore, we assume the availability of a response-time analysis that is able to calculate the WCRT of every e2e flow.

Given a set of worst-case response times for each e2e flow, R = [R1, …, RN], a straightforward cost function we could consider is the average WCRT of the system, as this function seems to comply with the 2 requirements set above: (1) for any given priority assignment we get one cost value (i.e. the average WCRT), and (2) lowering the cost function (i.e. lowering the average WCRT) seems to indicate a better schedulability. The problem of using the average WCRT as the cost function is that it does a poor job of reflecting the overall schedulability of the system, as it does not take into account the deadlines. Each iteration of the Gradient Descent algorithm would tend to lower the WCRTs of every e2e flow, regardless of the schedulability status of each particular flow.

Instead, in this paper we use as cost function a metric we call the inverse slack, or invslack, which we define as follows:

invslack(\Pi) = \max_{\forall i} \left( \frac{R_i - D_i}{D_i} \right)    (6)

where Π is a priority assignment, Ri is the WCRT of flow Γi computed for the given priority assignment Π, and Di is the end-to-end deadline of the same e2e flow.

The main property of invslack is that it focuses on the worst flow, that is, the flow with the largest (Ri − Di) value. Hence, a positive value of invslack indicates that at least one end-to-end flow is not meeting its deadline. On the other hand, a negative value of invslack signals that every flow is meeting its deadline. Additionally, aiming to minimize invslack will tend to increase the schedulability of the worst flow, possibly by trading off some of the schedulability of other flows that were in a better situation. With this approach, at each iteration GDPA will try to improve the worst flow (which may be different in each iteration), until the worst flow becomes schedulable, at which point the system as a whole is by extension also schedulable.
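A minimal sketch of the invslack cost function of Eq. (6), assuming a response-time analysis is available as a callable. The analyze argument is a hypothetical stand-in for any WCRT analysis compatible with the model (e.g. a Holistic analysis implementation), not a specific library function.

```python
# Illustrative sketch of the invslack cost function of Eq. (6).
from typing import Callable, Sequence
import numpy as np

def invslack(priorities: np.ndarray,
             deadlines: Sequence[float],
             analyze: Callable[[np.ndarray], Sequence[float]]) -> float:
    """Return max_i (R_i - D_i) / D_i for the flow WCRTs produced by `analyze`.

    `analyze` takes a priority assignment and returns one WCRT per e2e flow.
    A negative result means every flow meets its deadline (schedulable system).
    """
    wcrts = analyze(priorities)
    return max((r - d) / d for r, d in zip(wcrts, deadlines))
```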
4.4.2. Calculating the gradient of the cost function

Once a suitable cost function has been selected, the objective now is to specify a method to calculate its gradients. The cost function has as its input parameter a priority assignment Π, which assigns a priority value to each step in the system. Although this section will focus on calculating the gradient of the invslack cost function, to simplify the notation, in the following we will denote the cost function as f.

As described previously, the gradient of cost function f can be computed as a vector of the partial derivatives of the cost function with respect to each of its parameters. In the case of invslack, its parameters are the priorities of each step in the system (Pij). Therefore, the gradient of the cost function f at a given priority assignment Π can be represented as the following vector:

\nabla f(\Pi) = \left[ \frac{\partial f}{\partial P_{11}}(\Pi), \frac{\partial f}{\partial P_{12}}(\Pi), \ldots \right]    (7)

To calculate these partial derivatives we start by studying the classical definition:

\frac{\partial f}{\partial P_{ij}}(\Pi) = \lim_{h \to 0} \frac{f(P_{11}, \ldots, P_{ij} + h, \ldots) - f(P_{11}, \ldots, P_{ij}, \ldots)}{h}    (8)

Eq. (8) implies that to calculate one partial derivative, the cost function must be computed twice: one time with the current priority assignment Π, and another with a different priority assignment in which Pij is increased by an infinitesimal value h. For this equation to be useful, function f must be differentiable around input value Π. Intuitively, this property requires that infinitesimal changes in a priority assignment should induce a change in the output of the cost function. It is trivial to confirm that this property does not hold for invslack, as its inputs (i.e. priorities) have effectively discrete values.

To illustrate the problem, consider a simple system composed of two steps τ11 and τ21, located in the same processor. Let us assume a priority assignment Π = [1, 2], that is, τ11 has a lower priority than τ21. Let us also assume a cost value for Π equal to X, that is f(Π) = X. Let us now make an infinitesimal change on the priority assignment, obtaining Π′ = [1.001, 2]. Although the priority values have changed, the actual priority ordering remains the same, therefore the cost value also remains unchanged: f(Π′) = X. Consequently, if we were to use Eq. (8) to calculate the partial derivatives, the gradients would probably always be 0, and GDPA would get stuck at the first priority assignment indefinitely.
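The following snippet mirrors the example above: an infinitesimal perturbation of one priority leaves the priority ordering, and hence the value of invslack, unchanged. It is an illustrative check only; the variable names are arbitrary.

```python
# The ordering is all the schedulability analysis sees, so f(pi) == f(pi_prime).
import numpy as np

pi       = np.array([1.0, 2.0])     # tau_11 below tau_21
pi_prime = np.array([1.001, 2.0])   # infinitesimal change of P_11

print(np.array_equal(np.argsort(pi), np.argsort(pi_prime)))   # True: the limit of Eq. (8) is 0
```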


To circumvent the problem of the non-differentiability of invslack, we will approximate its partial derivatives using a non-infinitesimal and constant value for h, which we rename H. We define H as the average priority value separation between consecutive steps inside the flat priority vector Π. Formally, H is calculated as follows:

H = \lambda \, \frac{\sum_{i=0}^{N-1} |\Pi_{i+1} - \Pi_i|}{N - 1}    (9)

where N is the number of steps in the system, Πi is the priority value of the step located at position i in the priority assignment vector Π, and λ is a hyper-parameter to control the size of H.

We modify Eq. (8) to take into account the non-infinitesimal value H. The resulting equation to approximate the partial derivatives is depicted in Eq. (10). By applying a larger non-infinitesimal step size 2H (+H to −H), this equation will have a greater chance of changing the priority ordering of the steps, and thus providing a non-zero value for the partial derivatives. It is worth noting that, although Eq. (10) is not a formal partial derivative, to simplify the notation we still denote it as such, as it approximately quantifies the slope between two different priority assignments.

\frac{\partial f}{\partial P_{ij}}(\Pi) = \frac{f(P_{11}, \ldots, P_{ij} + H, \ldots) - f(P_{11}, \ldots, P_{ij} - H, \ldots)}{2H}    (10)

According to Eq. (10), to calculate the partial derivative of f with respect to the priority of step τij, the cost function must be computed for two different priority assignments: one in which the priority of step τij is increased by H, and another in which its priority is decreased by H. Consequently, to calculate the gradient of a system composed of N steps, the cost function must be computed for 2N different priority assignments.

It is worth noting that each computation of the cost function requires invoking a response time analysis. Therefore, calculating the gradient of said system composed of N steps requires executing 2N analyses. As an example, let us consider a realistic scenario in which 10 iterations of GDPA are executed on a system composed of 100 steps. Under this scenario, considering that in each iteration the gradient will be computed once, in total the response time analysis will be invoked 2000 times.

Any response time analysis compatible with the model presented in Section 2 can be used to compute the cost function, but taking into account that it may potentially be invoked on numerous occasions, it is preferable to select an analysis that tends to be fast, such as the Holistic analysis [15].

It is important to highlight that the response time analysis used to compute the gradients, and the response time analysis used to determine the stop condition (Section 4.3), do not need to be the same. This property can be exploited by employing a fast analysis for the computation of the gradients, and a slower but more precise analysis to determine the stopping conditions. This can be useful if the results of the fast analysis are correlated with those of the slower but more precise one.

This potential for a high number of invocations of the response time analysis represents the main bottleneck of the GDPA algorithm. To manage this, Section 5 presents a method to accelerate the computation of the gradients by vectorizing a response time analysis technique.

4.5. New priority assignment

In GDPA, instead of utilizing Eq. (2) directly to calculate the next priority assignment Π^{t+1}, we employ the more flexible concept of an update vector, which abstracts away the learning rate and gradient terms. Accordingly, the new priority assignment is calculated by adding the update vector U^t to the current priority assignment Π^t, as shown in Eq. (11).

\Pi^{t+1} = c\left( \Pi^t + U^t \right)    (11)

where t is the current iteration number, Π^t is the current priority assignment, U^t is the update vector in the current iteration, and c is the compression function (described in Eq. (4)).

The inclusion of the update vector into the formulation facilitates the incorporation of gradient optimization techniques that will increase the chances of finding the global minimum of the cost function. The field of machine learning, in which the Gradient Descent algorithm is extensively used, has proposed several such optimizations [9]. For this paper we have selected two: Gradient Noise [31], and the Adam optimizer [32]. The update vector U^t is constructed by sequentially applying both techniques. In the following, we provide a more detailed explanation of each technique, and how its notation is adapted to GDPA.

4.5.1. Gradient noise

The Gradient Noise technique adds a Gaussian noise with mean 0 and variance σt² to the gradient. In a given GDPA iteration t, we denote as G^t the gradient vector with the added noise as follows:

G^t = \nabla f(\Pi^t) + \mathcal{N}(0, \sigma_t^2)    (12)

where \mathcal{N} denotes a normal or Gaussian distribution.

The variance of the Gaussian noise decays with the iterations of the optimization process, as given in Eq. (13), in which η is the learning rate, N is the number of steps in the system, and γ is an additional hyper-parameter to control the noise decay:

\sigma_t^2 = \frac{\eta}{(1 + N + t)^{\gamma}}    (13)

Parameter N in Eq. (13), which was not included in the original formulation of Gradient Noise, is added to modulate the effect of the noise in systems with many steps. In such systems, less noise is required to induce slight variations in the priority ordering of the steps, considering that the priority values of all the steps always get compressed into the range [−1, 1].

4.5.2. Adam optimizer

Adam is a momentum-based gradient optimizer that effectively computes specific learning rates for each optimization parameter. It defines two vectors, m and v, which are the first and second moments of the gradient respectively. In a given iteration t, these vectors are defined as follows:

m^t = \beta_1 m^{t-1} + (1 - \beta_1) G^t
v^t = \beta_2 v^{t-1} + (1 - \beta_2) (G^t)^2    (14)

where β1 and β2 are hyper-parameters to control the effect of the momentum, and (G^t)² is the element-wise square of the noisy gradient. It is worth noting that in this formulation we are directly optimizing the noisy gradient G^t. The vectors m and v are bias-corrected as follows:

\hat{m}^t = \frac{m^t}{1 - (\beta_1)^t}
\hat{v}^t = \frac{v^t}{1 - (\beta_2)^t}    (15)

4.5.3. Update vector

The bias-corrected vectors described in Eq. (15) are combined as in the original formulation of the Adam optimizer [32] to construct the update vector as follows:

U^t = - \frac{\eta}{\sqrt{\hat{v}^t} + \epsilon} \, \hat{m}^t    (16)

where η is the learning rate, and ε is a hyper-parameter to control the update vector.

Finally, given Eqs. (11) and (16), the next priority assignment Π^{t+1} is calculated as follows:

\Pi^{t+1} = c\left( \Pi^t - \frac{\eta}{\sqrt{\hat{v}^t} + \epsilon} \, \hat{m}^t \right)    (17)

It is important to note that, as in the original Adam formulation, in this equation the vector operations are meant to be performed element-wise.

To summarize, Algorithm 1 shows the pseudo-code of GDPA.
Algorithm 1 Gradient Descent Priority Assignment algorithm
Input: Input system S, initial priority assignment Π^ini, maximum iterations t_max
Output: Best priority assignment
    t ← 0                                  ▷ iteration index
    Π ← compress(Π^ini)                    ▷ priority assignment
    Π_b ← Π                                ▷ best priority assignment
    best ← ∞                               ▷ best cost value found
    while true do
        value ← cost(Π)                    ▷ current cost value
        if value < best then
            best ← value                   ▷ record best cost value
            Π_b ← Π                        ▷ save best priority assignment
        end if
        schedulable ← analysis(S, Π)       ▷ schedulability test
        if schedulable or t ≥ t_max then
            break                          ▷ stop when schedulable or max. iterations
        end if
        ∇(Π) ← gradient(S, Π)              ▷ gradient
        U ← optimize(∇(Π))                 ▷ update vector
        Π ← compress(Π + U)                ▷ new priority assignment
        t ← t + 1
    end while
    return Π_b                             ▷ return priority assignment with lowest cost

4.6. Illustrative example

We present here a simple example to illustrate how GDPA can be applied to a particular system. The example is adapted from [15], and is composed of 2 e2e flows with 3 steps each, for a total of 6 steps, which traverse 3 processing resources. The basic parameters of the system are shown in Table 1, including the initial priority assignment. For this example we use the Holistic analysis [15] as the WCRT analysis both to determine the schedulability of the system and to compute the gradients. For illustration purposes, in this example we use the following GDPA hyper-parameter values: λ = 1.5, η = 3, β1 = 0.9, β2 = 0.999, ε = 0.1. Section 6.3.1 provides a justification for these values, which were obtained empirically.

Table 1
Parameters of the illustrative example.

        Cij    Pij    proc.   Ti     Di
τ11     5      1      1       30
τ12     2      2      2
τ13     20     3      3              35
τ21     5      1      3       40
τ22     10     2      2
τ23     10     1      1              45

GDPA starts by compressing the initial priority assignment. The result of this compression is shown in the second column of Table 2, labeled Π^0. The first column, labeled Π^ini, shows the uncompressed initial priority assignment.

The next phase of GDPA requires evaluating the Stop Condition, which involves determining whether assignment Π^0 is already schedulable. By applying the Holistic analysis, it is determined that Π^0 is not schedulable, with an initial cost value invslack = 9.33. Therefore, GDPA continues to the next phase, the Gradient Computation.

As shown in Eq. (7), the gradient is a vector of the partial derivatives with respect to the priorities of each step (i.e. task). In GDPA, each partial derivative is calculated using Eq. (10), which involves computing the cost function twice, using a non-infinitesimal value H. Eq. (9) is used to calculate H, which is the average separation of the priority values of Π^0, and which in this case is H = 0.6.

In this example composed of 6 steps, the gradient requires a total of 12 computations of the cost function, each with a different priority assignment. Each of these priority assignments is obtained by adding (or subtracting) the value H to the priority of one task. The resulting 12 priority assignments are depicted in Table 2, from column 3 onwards. The columns labeled Pij + H contain the priority assignment in which the priority of step τij is increased by H. Similarly for the columns labeled Pij − H.

Table 2
Example of the priority assignments involved in one iteration of GDPA.

      Π^ini  Π^0   P11+H  P11−H  P12+H  P12−H  P13+H  P13−H  P21+H  P21−H  P22+H  P22−H  P23+H  P23−H
τ11   1      0.33  0.93   −0.27  0.33   0.33   0.33   0.33   0.33   0.33   0.33   0.33   0.33   0.33
τ12   2      0.67  0.67   0.67   1.27   0.07   0.67   0.67   0.67   0.67   0.67   0.67   0.67   0.67
τ13   3      1.00  1.00   1.00   1.00   1.00   1.60   0.40   1.00   1.00   1.00   1.00   1.00   1.00
τ21   1      0.33  0.33   0.33   0.33   0.33   0.33   0.33   0.93   −0.27  0.33   0.33   0.33   0.33
τ22   2      0.67  0.67   0.67   0.67   0.67   0.67   0.67   0.67   0.67   1.27   0.07   0.67   0.67
τ23   1      0.33  0.33   0.33   0.33   0.33   0.33   0.33   0.33   0.33   0.33   0.33   0.93   −0.27

Next, the cost function is evaluated for each one of those 12 priority assignments. The resulting cost values are depicted in the first 2 columns of Table 3. A cell located at row τij and column f(+H) contains the value of the cost function invslack for a priority assignment in which the priority of τij was increased by H. Similarly for the column labeled f(−H). The third column of Table 3 (labeled ∂f/∂Pij) shows the final partial derivative values for each step, obtained by applying Eq. (10). The whole column represents the elements of the gradient ∇f(Π^0).

Table 3
Cost values, and coefficients of the resulting gradient.

        f(+H)   f(−H)   ∂f/∂Pij
τ11     2.82    9.33    −5.43
τ12     3.84    9.33    −4.57
τ13     9.33    9.33    0.00
τ21     9.33    9.33    0.00
τ22     9.33    3.84    4.57
τ23     9.33    2.82    5.43

In a simple Gradient Descent algorithm, the next priority assignment Π^1 would be calculated by adding the gradient (last column of Table 3) scaled by a learning rate to the current priority assignment Π^0. This is represented in Eq. (2). In GDPA, we optimize the gradient by adding a decaying noise and applying the Adam optimizer, as explained in Section 4.5. The results of this process are shown in Table 4, which is decomposed as follows: (1) column Π^0 is the current priority assignment at iteration t=0, (2) column U^0 is the resulting update vector after applying Gradient Noise and Adam, (3) column Π^{1*} is the non-compressed new priority assignment, which is computed as the summation of Π^0 and U^0, and (4) column Π^1 is the new priority assignment, which results from compressing column Π^{1*}.

Table 4
New priority assignment.

        Π^0     U^0     Π^{1*}   Π^1
τ11     0.33    1.27    1.61     0.83
τ12     0.67    1.27    1.94     1.00
τ13     1.00    −1.26   −0.26    −0.13
τ21     0.33    1.27    1.60     0.83
τ22     0.67    −1.27   −0.61    −0.31
τ23     0.33    −1.27   −0.94    −0.48

GDPA continues by evaluating the Stop Condition on Π^1. The Holistic analysis now deems this new priority assignment schedulable, with a cost value invslack = −0.09. Therefore, GDPA now stops and returns Π^1 as the best priority assignment it has found.
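To make the shape of one such iteration concrete, the following sketch combines the finite-difference gradient of Eqs. (9)–(10), the Gradient Noise of Eqs. (12)–(13) and the Adam update of Eqs. (14)–(17) in NumPy form. It is a simplified illustration, not the authors' released implementation: the cost callable is assumed to be an invslack-style function, and the γ default is only a placeholder (its value is not given in this example).

```python
# Illustrative sketch of one GDPA update (Eqs. (9)-(17)); not the released implementation.
import numpy as np

def approx_gradient(cost, pi, lam=1.5):
    """Central-difference gradient of Eq. (10), using the step H of Eq. (9)."""
    H = lam * np.mean(np.abs(np.diff(pi)))           # average separation of consecutive values
    grad = np.zeros_like(pi)
    for k in range(len(pi)):
        up, down = pi.copy(), pi.copy()
        up[k] += H
        down[k] -= H
        grad[k] = (cost(up) - cost(down)) / (2 * H)
    return grad

def gdpa_step(cost, pi, t, m, v,
              eta=3.0, gamma=0.55, beta1=0.9, beta2=0.999, eps=0.1):
    """One update: noisy gradient, Adam moments, and the compressed new assignment."""
    n = len(pi)
    sigma2 = eta / (1 + n + t) ** gamma              # Eq. (13): decaying noise variance
    g = approx_gradient(cost, pi) + np.random.normal(0.0, np.sqrt(sigma2), n)  # Eq. (12)
    m = beta1 * m + (1 - beta1) * g                  # Eq. (14)
    v = beta2 * v + (1 - beta2) * g ** 2
    m_hat = m / (1 - beta1 ** (t + 1))               # Eq. (15); t starts at 0 here
    v_hat = v / (1 - beta2 ** (t + 1))
    new_pi = pi - eta / (np.sqrt(v_hat) + eps) * m_hat   # Eqs. (16)-(17)
    return new_pi / np.max(np.abs(new_pi)), m, v     # compression of Eq. (4)
```

Because the added noise is random, a run of this sketch will not reproduce the exact values of Table 4, but it follows the same sequence of operations.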


5. Accelerating the gradient computation

In this section, we aim to improve the computation times of the GDPA algorithm. In Section 4 we identified that the Gradient Computation phase of GDPA is its main computational bottleneck, as it requires computing the cost function twice per step in the system, which by extension requires invoking the response time analysis. In total, for a system with N steps, each iteration of GDPA requires 2N invocations of a response time analysis.

Also in Section 4.4.2 we observed that each of the invocations of the response time analysis to compute a gradient differs only in the priority assignment it is analyzing. That is, for a system with N steps, 2N known priority assignments are evaluated to calculate a gradient. Here we exploit this observation by proposing a method to effectively calculate the WCRTs of all the required priority assignments at the same time.

We propose a strategy that can be summarized in the following two points:

1. Selection of an efficient response time analysis to calculate the gradients. To our knowledge, the Holistic analysis [15] is the fastest analysis available that is compatible with the system model laid out in Section 2.
2. Vectorization of the Holistic analysis for a faster execution. Moreover, 3D matrices will be employed to effectively analyze several priority assignments at the same time.

A vectorization is a process that involves minimizing the use of loop and conditional operations in a particular piece of code, replacing them with vector and matrix operations. This generally enables a faster execution, as vector operations are typically optimized in modern CPUs, especially when an appropriate library is used. Examples of such libraries implement the BLAS specification [33], such as OpenBLAS [34]. Higher-level libraries such as Numpy [35] for Python rely on efficient BLAS libraries.

In this section we simplify the step notation to include just one index, that is, we denote the i-th step as τi. Following this notation, Eq. (18) depicts the main equation of the Holistic analysis, for a step τa under analysis. Here we follow the formulation presented in Palencia et al. [15].

w_a^{n+1}(p) = p \, C_a + \sum_{\forall b \in hp(a)} \left\lceil \frac{J_b + w_a^n(p)}{T_b} \right\rceil C_b    (18)

where:
hp(a) is the set of steps that can preempt step τa. A step τb can preempt τa if both are located in the same processor, and Pb > Pa.
Jb is the jitter of step τb, which in the Holistic analysis is simplified as the worst-case response time of the previous step in its e2e flow.
p is the index of the current instance of the step under analysis, as more than one instance of the same step must be taken into account when deadlines are higher than the periods. The first instance is given the value p=1.
Tb is the period of step τb, which is equal to the period of its end-to-end flow.

A typical implementation of the Holistic analysis embeds Eq. (18) inside 3 loops: (1) an inner loop to solve the recursive value wa of the equation for a given p value, (2) a middle loop that iterates the value p and registers the activation number that incurred the longest response time, and (3) an outer loop that updates the jitters of every step according to the currently found WCRTs, and stops when two consecutive outer loops reach the same WCRTs. Ideally all these loops should be replaced by pure vector operations; however, due to the complexity of this endeavour, in this paper we will focus on vectorizing just Eq. (18), keeping the resulting vectorization inside the same 3 aforementioned loops. In the evaluation section (Section 6), we will show how this limited vectorization still produces sizeable speed-ups.

To vectorize Eq. (18) we employ a two-pronged approach. First, we replace every step attribute in the equation (e.g. Ca, etc.) by an equivalent vector that contains the attribute of every step. Accordingly, we define the following three vectors:

C = [C1, C2, …]   (WCETs vector)
J = [J1, J2, …]   (Jitters vector)
T = [T1, T2, …]   (Periods vector)

Second, we implement the summation in Eq. (18) with a pure matrix multiplication. To achieve this we leverage the concept of a priority matrix. We define each element pm_ab of a priority matrix PM as follows:

pm_{ab} = \begin{cases} 1 & \text{if step } \tau_b \text{ can preempt step } \tau_a \\ 0 & \text{otherwise} \end{cases}    (19)

The idea of the priority matrix is also used in the linearization of schedulability tests, to codify them as linear constraints for MILP [8]. Essentially, a priority matrix contains all the necessary priority ordering information of one priority assignment Π. We propose extending this priority matrix to include the priority information of several priority assignments at the same time.

We define Ψ as a set of M priority assignments as follows:

\Psi = \left[ \Pi^1, \Pi^2, \ldots, \Pi^M \right]    (20)

where Π^m is the m-th priority assignment.

It is important to note that the set Ψ does not assume or impose any type of relation among the priority assignments it contains. Specifically, the index m is unrelated to the iteration index used in Section 4 to label the priority assignment in a given GDPA iteration.
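As an illustration of Eq. (19), the priority matrix of one assignment can be built with two NumPy comparisons. The code below is a sketch, not taken from the paper's repository; the array names and the processor vector are assumptions, and the example reuses the system of Section 4.6.

```python
# Illustrative construction of the priority matrix PM of Eq. (19).
import numpy as np

def priority_matrix(priorities: np.ndarray, processors: np.ndarray) -> np.ndarray:
    """PM[a, b] = 1 if step b can preempt step a (same processor and P_b > P_a)."""
    same_proc = processors[:, None] == processors[None, :]
    higher_prio = priorities[None, :] > priorities[:, None]
    return (same_proc & higher_prio).astype(float)

# Example: the 6 steps of Section 4.6, with priorities Pi^0 and their processor mapping.
P = np.array([0.33, 0.67, 1.00, 0.33, 0.67, 0.33])
proc = np.array([1, 2, 3, 3, 2, 1])
PM = priority_matrix(P, proc)   # 6x6 0/1 matrix; stacking several of these along a new
                                # axis yields the 3D matrix HP described next
```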

In this paper, we propose constructing a 3D priority matrix by stacking together the priority matrices of several priority assignments. We denote this extended priority matrix as HP. For a given set Ψ, we define HP as a binary 3D matrix, in which each of its elements hp_mab is defined as follows:

hp_mab = { 1 if step τ_b can preempt step τ_a in the priority assignment Π^m; 0 otherwise }    (21)

Intuitively, we can visualize each slice of the 3D matrix HP as containing one 2D priority matrix PM.

We can now vectorize Eq. (18) employing the priority matrix HP as follows:

W^{n+1}(p) = p C + HP × ⌈(J^T + W^n(p)) / T^T⌉ C    (22)

where p is the step instance index, J^T and T^T are the transposed Jitter and Period vectors respectively, and W is a matrix in which each of its elements w_ma contains the value w for step τ_a (i.e. w_a) associated with priority assignment Π^m.

The recursive Eq. (22) is solved iteratively, with an initial W matrix initialized to zero, stopping when two consecutive iterations yield the same results, that is, when W^{n+1}(p) = W^n(p). As in the original formulation, the p value is initialized to 1.

The original Holistic analysis provides the equations to derive the WCRT of a step from its w value. These equations can be employed directly here too, obtaining the WCRT of every step, and for every priority assignment, at the same time. It is worth noting that Eq. (22) assumes that any matrix shape incompatibilities are automatically solved, for example by employing broadcasting [35]. If such capabilities are not available, then the vectors should be manually padded to make their dimensions compatible.
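A possible Numpy realization of this iteration is sketched below, purely as an illustration of Eq. (22) and of the broadcasting it relies on. The shapes chosen here (HP of shape M × n × n, W of shape M × n) and the function name are assumptions; the public GDPA implementation may organize the data differently.

```python
import numpy as np

def vectorized_w(HP, C, T, J, p=1, max_iter=1000):
    """Iterate Eq. (22) for every priority assignment in HP at once.

    HP has shape (M, n, n), with HP[m, a, b] = 1 iff step b can preempt
    step a under assignment m (Eq. (21)). C, T and J have shape (n,).
    Returns W of shape (M, n), where W[m, a] is w_a under assignment m.
    """
    M, n, _ = HP.shape
    W = np.zeros((M, n))
    for _ in range(max_iter):
        # ceil((J_b + w_ma) / T_b) for every (m, a, b), via broadcasting
        ceil_term = np.ceil((J[None, None, :] + W[:, :, None]) / T[None, None, :])
        interference = np.sum(HP * ceil_term * C[None, None, :], axis=2)
        W_next = p * C[None, :] + interference
        if np.array_equal(W_next, W):      # two consecutive iterations coincide
            return W_next
        W = W_next
    raise RuntimeError("the vectorized recurrence did not converge")

# Example: 2 priority assignments (M = 2) of 3 steps mapped to one processor
C = np.array([2.0, 3.0, 5.0]); T = np.array([10.0, 20.0, 50.0]); J = np.zeros(3)
HP = np.zeros((2, 3, 3))
HP[0, 2, [0, 1]] = 1                           # assignment 0: steps 0 and 1 preempt step 2
HP[1, 1, 0] = HP[1, 2, 0] = HP[1, 2, 1] = 1    # assignment 1
print(vectorized_w(HP, C, T, J))
```

Note that the element-wise product with HP followed by the sum over the last axis plays the role of the matrix multiplication in Eq. (22), selecting only the interference of the steps that can actually preempt each step under each assignment.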
Eq. (22) can be exploited in GDPA to compute a gradient of the cost function with just one invocation of the Holistic analysis. This can be achieved by constructing a Ψ set containing all the priority assignments needed to compute a gradient (Eqs. (7) and (8)). As a result of this analysis, the WCRT of every flow in every priority assignment is obtained, which can be used to calculate all the partial derivatives that compose the gradient (Eq. (8)).

To better contextualize the effect of this acceleration method, we can study the computational complexity of GDPA before and after the vectorization process, with respect to the number of tasks n. The complexity of the original Holistic analysis (i.e. before vectorization) can be bounded by O(n²), as it nests two loops that iterate over the whole task-set: one to determine which tasks have higher priority (the summation shown in Eq. (18)), and another outer loop that iterates over each task. Considering that GDPA must execute the analysis 2n times in each iteration to compute the gradient, this yields a total complexity of GDPA of O(n³).

Although the vectorized Holistic analysis of Eq. (22) has O(1) complexity, there are previous setup stages that construct the set Ψ (Eq. (20)) and the necessary vectors and matrices, which can be computed with linear complexity O(n). Therefore, the whole complexity of GDPA with the vectorized analysis can be bounded by O(n).

It is common for a vectorization process to trade off computational complexity against a space complexity penalty. In our case, GDPA without the vectorization optimization offers an O(n) space complexity, while the vectorized version of GDPA has an O(n³) space complexity, due to the 3D matrix HP (Eq. (21)), which has size n × n × 2n.

6. Evaluation

In this section we present the evaluation results of the GDPA algorithm. This section is organized as follows: in Section 6.1 we provide implementation details of GDPA and other algorithms that we have implemented for this comparison; in Section 6.2 we study the execution time speed-ups obtained by vectorizing the Holistic analysis; and in Section 6.3 we evaluate the GDPA algorithm by analyzing its ability to find schedulable solutions and its computation times.

6.1. Implementation details

We implemented GDPA in Python, with the code publicly available in a Github repository [36]. Although this implementation was created to evaluate GDPA, it is meant to be extensible. As such, it provides classes to model real-time systems and interfaces to implement algorithms other than GDPA.

Two versions of the Holistic analysis were implemented: a standard non-vectorized version directly following the original formulation [15], and another vectorized version adopting the methodology described in Section 5. The latter leverages the Numpy [35] numerical library to efficiently perform the vector operations.

Apart from GDPA, the PD and HOPA priority assignment algorithms were also implemented. Moreover, two optimal fixed priority assignment algorithms were implemented: a brute-force algorithm, and another one employing MILP. We consider an algorithm as optimal if it is always capable of finding a schedulable priority assignment if such an assignment exists.

The brute-force algorithm simply evaluates every possible priority ordering. To accelerate this process, it leverages the vectorized Holistic analysis by analyzing batches of 10000 priority assignments at the same time. Nevertheless, this brute-force algorithm is still intractable for all but small systems. For instance, a system composed of 5 processors, with 10 steps mapped to each processor, offers 10!^5 ≈ 6.29 × 10^32 possible priority assignments, which is not feasible even when taking advantage of the vectorized analysis.
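To make the enumeration concrete, a sketch of how all priority orderings could be generated per processor and grouped into batches is shown below; the helper names and the batch-analysis call are placeholders assumed only for this example.

```python
from itertools import islice, permutations, product

def all_priority_assignments(steps_per_processor):
    """Yield every priority assignment of a partitioned system.

    steps_per_processor maps each processor to the list of steps mapped to
    it; an assignment combines one priority ordering per processor.
    """
    per_processor = [permutations(steps) for steps in steps_per_processor.values()]
    yield from product(*per_processor)

def batched(iterable, batch_size):
    """Group an iterable into lists of at most batch_size elements."""
    iterator = iter(iterable)
    while batch := list(islice(iterator, batch_size)):
        yield batch

# Toy example: 3 + 2 steps on two processors -> 3! * 2! = 12 assignments
steps_per_processor = {0: ["t1", "t2", "t3"], 1: ["t4", "t5"]}
for batch in batched(all_priority_assignments(steps_per_processor), 10000):
    # a real implementation would build HP for the batch and run Eq. (22) here
    print(len(batch), "assignments in this batch")
```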
For the MILP algorithm we faced the challenge of the unavailability of feasible linear equations to describe a schedulability test for our system model. Because of this, our implementation of a priority assignment algorithm with MILP only defines restrictions that declare when a fixed priority assignment is valid. For this, it employs a square binary priority matrix, defined with Eq. (19), with the same linear restrictions as in [8]. Each valid priority assignment is evaluated externally using a callback, which invokes a schedulability test to determine whether the valid priority assignment is also schedulable. The algorithm is terminated as soon as a schedulable priority assignment is found. This MILP algorithm is implemented with Gurobi [28]. This implementation of MILP is still optimal (i.e. it will find a schedulable solution if one exists), but may incur a computation time penalty due to the unavailability of feasible linear constraints that completely support our system model. The definition of such linear constraints falls outside the scope of this paper.

To widen the availability of response time analyses, a bridge between Python and the MAST tool [18] was implemented too. This bridge automatically performs the transformation between Python model classes and MAST input and output files, and the execution of the MAST tool executable on those files. It allows any algorithm implemented in MAST to be invoked transparently directly from Python code, and it enables an Offset-Based analysis technique implemented in MAST to be used in the evaluation of the stop condition in GDPA.

We also implemented an automatic generator of synthetic systems. It generates random systems given a number of e2e flows, steps per flow, processors, processor utilization, and the selection ranges for the periods and deadlines. To compute the WCETs, it employs the widely used UUNIFAST algorithm [37], and the periods are selected using a log-uniform distribution. The specific characteristics of the synthetic systems are described within each evaluation (Sections 6.2 and 6.3).

The utilization of a processing resource PR_k, which we denote as L_k, is defined with Eq. (23). The system utilization is defined as the average utilization of its processing resources.

L_k = Σ_{∀ τ_ij ∈ PR_k} C_ij / T_i    (23)

Fig. 3. Total execution times of analyzing a set of priority assignments, normalized to the execution times of holistic, for systems with (a) 20 steps, and (b) 64 steps.

6.2. Vectorization speed-ups

The objective here is to measure the execution time speed-ups gained by vectorizing the Holistic analysis as described in Section 5. For this, we measure the total execution times of analyzing 1, 10, 100, 1000 and 10000 priority assignments, comparing three different approaches:

• holistic: each priority assignment is analyzed sequentially and independently with the Holistic analysis implemented in Python with no vectorization (i.e. Numpy is not used).
• holistic-mast: each priority assignment is analyzed sequentially and independently with the Holistic analysis provided by the MAST tool, leveraging the MAST-Python bridge. The analysis in MAST is written in Ada and compiled into native executables.
• holistic-vector: all the priority assignments are analyzed at the same time leveraging the capabilities of the vectorized Holistic analysis described in Section 5, employing the Numpy library.

We generate synthetic systems with two different total numbers of steps: 20 steps and 64 steps. Every system has a utilization of 70% in every processor. The priority assignments are randomized, but the same ones are analyzed in the 3 approaches. The measured execution times are shown in Fig. 3 for both system sizes. The execution times are normalized to those of holistic.

For systems with 20 steps (Fig. 3(a)), we can observe that when only 1 priority assignment is analyzed, the vectorized analysis offers a slight speed-up over the non-vectorized version. However, as the number of priority assignments increases, the benefits of the vectorization become clear, with total execution times that are 10 times lower than holistic. On the other hand, analyzing with MAST (holistic-mast) incurs execution times 10 times higher than the baseline (holistic). These overheads are due to the necessity of writing and reading a file each time a priority assignment is analyzed with MAST.

For larger systems with 64 steps (Fig. 3(b)), the vectorized analysis is still clearly the faster approach, with execution times that are always more than one order of magnitude faster than holistic. Similarly, holistic-mast also shows execution times around 0.2 times those of holistic. Here the systems are more complex and require more computations to be analyzed. Therefore the native compiled binary of MAST clearly becomes faster than the non-vectorized Python implementation, even considering the necessary file read and write overheads.

6.3. GDPA evaluation

The objective is to compare GDPA against other priority assignment algorithms. We will measure two aspects in this evaluation: the ability of each algorithm to find schedulable solutions, and their respective computation times.

In total the evaluation compares seven priority assignment algorithms, including GDPA with three different initializations:

• pd: the non-iterative PD priority assignment [23].
• hopa: the HOPA iterative algorithm [7].
• brute-force: the brute-force algorithm described in Section 6.1.
• milp: our ad-hoc implementation of a MILP algorithm, described in Section 6.1.
• gdpa-random: GDPA with an initial random priority assignment.
• gdpa-pd: GDPA with an initial PD priority assignment.
• gdpa-hopa: GDPA with an initial HOPA priority assignment.

The evaluation will be performed on a pool of synthetic systems, which were created taking into account the characteristics of publicly available real use cases. For instance, the authors of [38] provide a description of a cruise-control system composed of 11 tasks divided into 2 e2e flows that traverse 2 processing resources. Furthermore, the generic military avionics system described in [39] is composed of 23 tasks running on 2 processing resources. Thus, we generated synthetic systems with 3 different sizes: 16 steps (4 flows with 4 steps each, 4 processors); 30 steps (6 flows with 5 steps each, 5 processors); and 72 steps (12 flows with 6 steps each, 7 processors). Utilizations are swept from 50% to 90% with 20 intermediate utilizations. The WCETs are generated with UUNIFAST. The step-to-processor mapping keeps the load balanced, resulting in all processors having the same utilization. Periods are randomly selected in the range [100, 100000] using a log-uniform distribution. The end-to-end deadline of each flow Γ_i is randomly selected in the range [0.5 · T_i · N_i, T_i · N_i]. To obtain relevant results, 50 systems were generated for each configuration.

To contextualize the problem, for the systems with 16 steps and 4 processors, there are 4!^4 = 331776 possible priority assignments. For the systems with 30 steps and 5 processors there are 6!^5 ≈ 1.93 × 10^14 possible priority assignments. Finally, for the largest systems, which contain 72 steps in 7 processors, the search space is composed of (72/7)!^7 ≈ 9.39 × 10^47 candidate priority assignments. Consequently, the optimal algorithms (brute-force and milp) were only applied to the systems with 16 steps. For the other system sizes, which offer larger search spaces, these algorithms become intractable.

Unless otherwise specified, we employ the Holistic analysis to determine the schedulability of a priority assignment. In the case of GDPA, we always employ the vectorized Holistic analysis to compute the gradients. The initial priority assignment in HOPA is performed with PD. The maximum number of iterations was set to 100 for GDPA and 120 for HOPA. A common configuration of HOPA was used: k_a = [1, 1.8, 3, 1.5] and k_r = [1, 1.8, 3, 1.5].

This evaluation is organized into 3 sections. Section 6.3.1 performs a preliminary evaluation in which appropriate values for the hyper-parameters are determined; the individual contributions of Adam and Gradient Noise to the performance of GDPA are also quantified. Section 6.3.2 evaluates the ability of each priority assignment algorithm to find schedulable solutions. Finally, Section 6.3.3 evaluates the computation times incurred by each of the algorithms under study.

Fig. 4. Number of schedulable solutions found, for systems with (a) 16 steps, (b) 30 steps, and (c) 72 steps.

Fig. 5. Number of schedulable solutions found, applying an Offset-Based analysis to evaluate the Stop Condition.

Table 5
Number of schedulable solutions found for different hyper-parameters, with gdpa-hopa.

β1     β2      η     λ     ε      γ     Schedulable solutions
0.9    0.999   3     1.5   0.1    0.9   867
0.7    0.999   3     1.5   0.1    3     867
0.9    0.7     3     1.5   0.1    0.9   861
0.9    0.999   3     1.5   0.1    1.5   860
0.9    0.999   3     2     0.01   0.9   852
0.9    0.999   0.1   2     0.1    3     844
0.7    0.999   0.1   2     0.01   3     820

Table 6
GDPA optimizers comparison.

GDPA configuration        Schedulable solutions
gdpa-pd (adam + noise)    864
gdpa-pd (adam)            860
gdpa-pd (noise)           854
gdpa-pd (none)            839
hopa                      819

6.3.1. Preliminary evaluation

As shown in Section 4, GDPA offers 6 hyper-parameters that can be used to tweak its behavior: β1, β2, η, λ, ε, and γ. Before carrying out the full evaluation of GDPA, we executed a preliminary evaluation to determine appropriate values for these hyper-parameters.

This preliminary evaluation consisted of executing gdpa-hopa (i.e. GDPA with a HOPA initial assignment), with different values for the hyper-parameters, on the small synthetic systems (16 steps). In total, 1000 systems were studied, with utilizations ranging from 50% to 90%. For each hyper-parameter configuration, the total number of systems for which a schedulable solution was found was recorded. Table 5 collects a representative selection of the results obtained. From these results, it can be concluded that the hyper-parameters have a clear impact on the performance of GDPA. Furthermore, we determined that the following values provide good results, and were chosen for all subsequent evaluations: β1 = 0.9, β2 = 0.999, η = 3, λ = 1.5, ε = 0.1, γ = 0.9. It is worth indicating that this preliminary evaluation is limited in scope, and a more comprehensive study is needed to precisely determine the effect of each hyper-parameter, including more types of systems. This is left for future work.

Additionally, we performed a further evaluation to quantify the individual contributions that the Adam optimizer and Gradient Noise have on the performance of GDPA. For this, we tested GDPA with 4 different configurations: (1) standard GDPA with Adam + Gradient Noise, (2) GDPA with just Adam, (3) GDPA with just Gradient Noise, and (4) GDPA with neither Adam nor Gradient Noise. Every configuration starts with a PD assignment, and the same pool of synthetic systems as in the previous evaluation was used (i.e. 1000 systems with 16 steps each). The number of systems for which a schedulable solution was found for each configuration is shown in Table 6. For added context, the results for HOPA are also included. From these results, it can be confirmed that each optimizer (Adam and Gradient Noise) positively contributes to the gradient descent algorithm in its search for a schedulable priority assignment, with the combination of both (i.e. Adam + Noise) being the scenario that produces the best recorded performance.
hyper-parameters, on the small synthetic systems (16 steps). In total,
optimal algorithms, clearly outperforming HOPA. It is worth bearing
1000 systems were studied, with utilizations ranging from 50% to 90%.
in mind that both gdpa-pd and HOPA start with the same PD priority
For each hyper-parameters configuration, the total number of systems
assignment. In the case of gdpa-random, it shows slightly worse results
for which a schedulable solution was found was recorded. Table 5
than the rest of GDPA variants, but still clearly above HOPA. This
collects a representative selection of the results obtained. From these
indicates that GDPA has the capability of correctly exploring the search
results, it can be concluded that the hyper-parameters have a clear
space, even when starting from very poor priority assignments.
impact on the performance of GDPA. Furthermore, we determined For more complex systems with 30 steps (Fig. 4(b)), the evaluation
that the following values provide good results, and were chosen for shows similar results, although here gdpa-random has a performance
all subsequent evaluations: 𝛽1 = 0.9, 𝛽2 = 0.999, 𝜂 = 3, 𝜆 = 1.5, closer to gdpa-pd. The GDPA variants clearly outperform HOPA, with a
𝜖 = 0.1, 𝛾 = 0.9. It is worth indicating that this preliminary evaluation is slight advantage of gdpa-hopa for higher utilizations. Here brute-force
limited in scope, and a more comprehensive study is needed to precisely and MILP algorithms were not applied due to their high computation
determine the effect of each hyper-parameter, including more types of times. Therefore it is not possible to measure whether there is any room
systems. This is left for future work. to improve above gdpa-hopa. However, considering the differences
Additionally, we performed a further evaluation to quantify the observed in systems with 16 steps, and in general the maximum schedu-
individual contributions that the Adam optimizer and Gradient Noise lable utilizations that are reached with fixed priorities scheduling, we
have on the performance of GDPA. For this, we tested GDPA with 4 dif- expect that here an optimal algorithm would not be able to schedule
ferent configurations: (1) standard GDPA with Adam + Gradient Noise, many more systems than GDPA.

Fig. 6. Average computation time to find a schedulable solution, for systems with (a) 16 steps, (b) 30 steps, and (c) 72 steps.

Fig. 7. Average number of iterations required to find a schedulable solution, for systems with (a) 16 steps, (b) 30 steps, and (c) 72 steps.

The same conclusions can be reached for systems with 72 steps (Fig. 4(c)). To bound the computation times of the evaluation, we only applied gdpa-hopa, as we have confirmed previously that it is the best of the GDPA variants. Here, the difference between gdpa-hopa and HOPA is very similar to that observed for the previous system size. This indicates that the capability of GDPA to find solutions is not noticeably hindered by increasing the size of the search space.

We now test a different configuration of GDPA, in which the stop condition is evaluated using a less pessimistic Offset-Based analysis [16], while keeping the same vectorized Holistic analysis to compute the gradients. The results for systems with 30 steps are shown in Fig. 5. Both PD and HOPA also use the same Offset-Based analysis. This result depicts a very similar situation to that of Fig. 4(b), with the main difference being that the results are slightly better, equally for all algorithms. This is due to the usage of a less pessimistic analysis. This result validates the idea of employing a fast Holistic-type analysis to direct the optimization process (i.e. to compute the gradients), while using a different, less pessimistic but slower analysis to validate the results. Further, this observation also suggests that there is a correlation between the WCRTs of the Holistic analysis and those from the less pessimistic Offset-Based analysis.

6.3.3. Computation times

We have shown that GDPA is able to schedule more systems than HOPA, approximating an optimal algorithm (at least in the situations in which the optimal algorithms are tractable). Here we evaluate the computation times of GDPA in comparison with the other techniques. Fig. 6 shows the average computation times each algorithm required to find a schedulable solution, for systems with 16, 30 and 72 steps. The computation times of GDPA and HOPA include the times required to calculate the initial priority assignments. For instance, the computation times of gdpa-hopa include the computation times required by HOPA to calculate the initial priority assignment.

As a general observation we can confirm that all the GDPA variants required longer computation times than HOPA, especially for systems with utilizations above 70%. This matches the results previously reported in Fig. 4, in which the scheduling improvements of GDPA over HOPA manifested for systems with utilizations higher than 70%. This indicates that the extra computation time required by GDPA is effectively employed in finding more schedulable solutions.

Among the GDPA variants, in Fig. 6(a) we can observe that gdpa-random is clearly the slowest. This is expected, as it usually starts with very poor priority assignments, which should require more iterations to improve upon. Moreover, the reported difference in computation times between gdpa-pd and gdpa-hopa is not significant.

For systems with 90% utilization, which are the most complex to analyze, GDPA required on average around 1 s, 10 s and 1500 s to find a schedulable solution in the systems with 16, 30 and 72 steps respectively. 90% is a very high utilization that is not usually reached in industrial settings. For the more realistic range up to a utilization of 80%, GDPA required on average 100 s to find a solution in the systems with 72 steps.

Focusing on the available comparison with the optimal algorithms (Fig. 6(a)), we can confirm that all the GDPA variants were significantly faster than both the MILP and brute-force algorithms. It is important to note that we have previously shown that for these systems with 16 steps, GDPA demonstrated a capability to find schedulable solutions very close to that of those optimal algorithms (see Fig. 4(a)). Summarizing, although GDPA exhibits scheduling capabilities that are close to optimal, its computational complexity is closer to that of HOPA.

It is important to note that the very high computation times of MILP in this evaluation should not be used to conclude that MILP is not a valid method to assign fixed priorities for systems that follow our system model. Rather, these results signal the need for the definition of linear equations that model a schedulability test with sufficient precision. We believe that, if such equations existed, the performance of MILP here may be greatly improved.

In Fig. 7, we show the average number of iterations each algorithm required to find a schedulable solution. This metric adds more context to the computation times previously reported.
The optimal algorithms are not included here because they are both, in essence, brute-force algorithms in which the number of iterations required can be considered arbitrary. Similarly, PD is not included because it is not an iterative algorithm. In general we observe in the figure that GDPA does not need a high number of iterations to find a schedulable solution. If we focus on the GDPA variants that start with good priority assignments (gdpa-pd and gdpa-hopa), and on the more complex systems with 90% utilization, we can observe that on average they required less than 5, 30 and 55 iterations for systems with 16, 30 and 72 steps respectively. These results highlight that the current maximum number of iterations set for GDPA (i.e. 100) should not noticeably constrain its ability to find schedulable solutions.

In Figs. 7(a) and 7(b) (16 and 30 steps respectively), we confirm that gdpa-random requires more iterations than the other GDPA variants. This observation reaffirms the conclusion that the longer computation times of gdpa-random are due to its need to overcome a worse initial priority assignment.

7. Conclusions and future work

In this paper, we presented a new algorithm to assign fixed priorities in real-time systems, called Gradient Descent Priority Assignment (GDPA). As far as we know, this is the first time a Gradient Descent algorithm has been employed in this particular type of problem.

We evaluated GDPA on a variety of synthetic systems and showed that it is able to find more schedulable solutions than previous custom heuristics such as HOPA. We also showed that GDPA closely approximates the success rate of optimal algorithms, at least in those situations in which the optimal algorithms were tractable. Crucially, the evaluation showed that GDPA achieves this performance while requiring reasonable amounts of computation time. In the more extreme systems tested, with 72 steps and very high processor utilizations of 90%, GDPA was able to find schedulable solutions on average in less than 25 min. For more reasonable utilizations around 80%, which are still considered high, the computation times were on average less than 100 s for the more complex systems tested with 72 steps.

We are planning to extend this evaluation to include a deeper study of the effects of the hyper-parameters on a wider set of systems, including more specific system models with multipath e2e flows. Also, the ability of GDPA to optimize already schedulable solutions will be explored.

As a more general conclusion, this work also hints at the potential of Gradient Descent as an algorithm to optimize real-time systems in general due to its flexibility. In this regard, we plan to generalize the Gradient Descent algorithm to optimize other parameters such as scheduling deadlines for EDF, or the task-to-processor mapping.

CRediT authorship contribution statement

Juan M. Rivas: Writing – review & editing, Writing – original draft, Visualization, Validation, Software, Methodology, Investigation, Formal analysis, Data curation, Conceptualization. J. Javier Gutiérrez: Writing – review & editing, Writing – original draft, Validation, Methodology, Funding acquisition. Ana Guasque: Writing – review & editing, Writing – original draft, Validation, Investigation, Formal analysis. Patricia Balbastre: Writing – review & editing, Writing – original draft, Validation, Formal analysis.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was partially supported by MCIN/AEI/10.13039/501100011033/FEDER "Una manera de hacer Europa", Spain under grants PID2021-124502OB-C41 and PID2021-124502OB-C42 (PRESECREL), and by the Vicerrectorado de Investigación de la Universitat Politècnica de València (UPV) "Aid to First Research Projects", Spain under grants PAID-06-23 and PAID-10-20.

References

[1] C.L. Liu, J.W. Layland, Scheduling algorithms for multiprogramming in a hard-real-time environment, J. ACM 20 (1) (1973) 46–61, http://dx.doi.org/10.1145/321738.321743.
[2] B. Akesson, M. Nasri, G. Nelissen, S. Altmeyer, R.I. Davis, A comprehensive survey of industry practice in real-time systems, Real-Time Syst. 58 (2022) 358–398, http://dx.doi.org/10.1007/s11241-021-09376-1.
[3] R.I. Davis, L. Cucu-Grosjean, M. Bertogna, A. Burns, A review of priority assignment in real-time systems, J. Syst. Archit. 65 (2016) 64–82, http://dx.doi.org/10.1016/j.sysarc.2016.04.002.
[4] K.W. Tindell, A. Burns, A.J. Wellings, Allocating hard real-time tasks: An NP-Hard problem made easy, Real-Time Syst. 4 (1992) 145–165, http://dx.doi.org/10.1007/BF00365407.
[5] E. Azketa, J.P. Uribe, M. Marcos, L. Almeida, J.J. Gutierrez, Permutational genetic algorithm for the optimized assignment of priorities to tasks and messages in distributed real-time systems, in: 2011 IEEE 10th International Conference on Trust, Security and Privacy in Computing and Communications, IEEE, 2011, pp. 958–965, http://dx.doi.org/10.1109/TrustCom.2011.132.
[6] A. Hamann, M. Jersak, K. Richter, R. Ernst, A framework for modular analysis and exploration of heterogeneous embedded systems, Real-Time Syst. 33 (2006) 101–137, http://dx.doi.org/10.1007/s11241-006-6884-x.
[7] J.J. Gutiérrez, M.G. Harbour, Optimized priority assignment for tasks and messages in distributed hard real-time systems, in: Proceedings of Third Workshop on Parallel and Distributed Real-Time Systems, IEEE Comput. Soc. Press, 1995, pp. 124–132, http://dx.doi.org/10.1109/WPDRTS.1995.470498.
[8] Q. Zhu, H. Zeng, W. Zheng, M.D. Natale, A. Sangiovanni-Vincentelli, Optimization of task allocation and priority assignment in hard real-time distributed systems, ACM Trans. Embedd. Comput. Syst. 11 (2013) 1–30, http://dx.doi.org/10.1145/2362336.2362352.
[9] S. Ruder, An overview of gradient descent optimization algorithms, 2016, URL http://arxiv.org/abs/1609.04747.
[10] O.M. Group, UML profile for MARTE: Modeling and analysis of RealTime embedded systems, 2019, OMG Document, v1.2 formal/19-04-01. URL https://www.omg.org/spec/MARTE/1.2.
[11] M.G. Harbour, J.J. Gutiérrez, J.C. Palencia, J.M. Drake, MAST: Modeling and analysis suite for real time applications, in: Proceedings 13th Euromicro Conference on Real-Time Systems, IEEE Comput. Soc, 2001, pp. 125–134, http://dx.doi.org/10.1109/EMRTS.2001.934015.
[12] M.G. Harbour, J.J. Gutiérrez, J.M. Drake, P.L. Martínez, J.C. Palencia, Modeling distributed real-time systems with MAST 2, J. Syst. Archit. 59 (2013) 331–340, http://dx.doi.org/10.1016/j.sysarc.2012.02.001.
[13] F. Eisenbrand, T. Rothvoß, Static-priority real-time scheduling: Response time computation is NP-hard, in: 2008 Real-Time Systems Symposium, IEEE, 2008, pp. 397–406, http://dx.doi.org/10.1109/RTSS.2008.25.
[14] K.W. Tindell, A. Burns, A.J. Wellings, An extendible approach for analyzing fixed priority hard real-time tasks, Real-Time Syst. 6 (1994) 133–151, http://dx.doi.org/10.1007/BF01088593.
[15] J.C. Palencia, J.J. Gutierrez, M.G. Harbour, On the schedulability analysis for distributed hard real-time systems, in: Proceedings Ninth Euromicro Workshop on Real Time Systems, IEEE Comput. Soc, 1997, pp. 136–143, http://dx.doi.org/10.1109/EMWRTS.1997.613774.
[16] J.C. Palencia, M.G. Harbour, Schedulability analysis for tasks with static and dynamic offsets, in: Proceedings 19th IEEE Real-Time Systems Symposium (Cat. No.98CB36279), IEEE Comput. Soc, 1998, pp. 26–37, http://dx.doi.org/10.1109/REAL.1998.739728.
[17] J. Mäki-Turja, M. Nolin, Efficient implementation of tight response-times for tasks with offsets, Real-Time Syst. 40 (2008) 77–116, http://dx.doi.org/10.1007/s11241-008-9050-9.
[18] MAST home page. URL https://mast.unican.es/.
[19] A. Amurrio, J.J. Gutiérrez, M. Aldea, E. Azketa, Priority assignment in hierarchically scheduled time-partitioned distributed real-time systems with multipath flows, J. Syst. Archit. 122 (2022) 102339, http://dx.doi.org/10.1016/j.sysarc.2021.102339.
[20] J.M. Rivas, J.J. Gutiérrez, J.L. Medina, M.G. Harbour, Comparison of memory access strategies in multi-core platforms using MAST, in: International Workshop on Analysis Tools and Methodologies for Embedded and Real-Time Systems, WATERS, 2017.

[21] S. Altmeyer, É. André, S. Dal Zilio, L. Fejoz, M.G. Harbour, S. Graf, J.J. Gutiérrez, R. Henia, D. Le Botlan, G. Lipari, J. Medina, N. Navet, S. Quinton, J.M. Rivas, Y. Sun, From FMTV to WATERS: Lessons learned from the first verification challenge at ECRTS, in: A.V. Papadopoulos (Ed.), 35th Euromicro Conference on Real-Time Systems, ECRTS 2023, in: Leibniz International Proceedings in Informatics (LIPIcs), vol. 262, Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Dagstuhl, Germany, 2023, pp. 19:1–19:18, http://dx.doi.org/10.4230/LIPIcs.ECRTS.2023.19.
[22] J.C. Palencia, M.G. Harbour, J.J. Gutiérrez, J.M. Rivas, Response-time analysis in hierarchically-scheduled time-partitioned distributed systems, IEEE Trans. Parallel Distrib. Syst. 28 (7) (2017) 2017–2030, http://dx.doi.org/10.1109/TPDS.2016.2642960.
[23] J.W.S. Liu, Real-Time Systems, Prentice Hall, Upper Saddle River, NJ, 2000.
[24] J. Oh, C. Wu, Genetic-algorithm-based real-time task scheduling with multiple goals, J. Syst. Softw. 71 (2004) 245–258, http://dx.doi.org/10.1016/S0164-1212(02)00147-4.
[25] M. Yoo, Real-time task scheduling by multiobjective genetic algorithm, J. Syst. Softw. 82 (2009) 619–628, http://dx.doi.org/10.1016/j.jss.2008.08.039.
[26] R. Ayari, I. Hafnaoui, A. Aguiar, P. Gilbert, M. Galibois, J.-P. Rousseau, G. Beltrame, G. Nicolescu, Multi-objective mapping of full-mission simulators on heterogeneous distributed multi-processor systems, J. Def. Model. Simul.: Appl., Methodol., Technol. 15 (2018) 449–460, http://dx.doi.org/10.1177/1548512916657907.
[27] J. Lee, S.Y. Shin, S. Nejati, L.C. Briand, Optimal priority assignment for real-time systems: a coevolution-based approach, Empir. Softw. Eng. 27 (6) (2022), http://dx.doi.org/10.1007/s10664-022-10170-1.
[28] Gurobi Optimization. URL https://www.gurobi.com/.
[29] P. Pazzaglia, A. Biondi, M. Di Natale, Simple and general methods for fixed-priority schedulability in optimization problems, in: 2019 Design, Automation & Test in Europe Conference & Exhibition, DATE, IEEE, 2019, pp. 1543–1548, http://dx.doi.org/10.23919/DATE.2019.8715017.
[30] Y. Zhao, H. Zeng, The virtual deadline based optimization algorithm for priority assignment in fixed-priority scheduling, in: 2017 IEEE Real-Time Systems Symposium, RTSS, 2017, pp. 116–127, http://dx.doi.org/10.1109/RTSS.2017.00018.
[31] A. Neelakantan, L. Vilnis, Q.V. Le, I. Sutskever, L. Kaiser, K. Kurach, J. Martens, Adding gradient noise improves learning for very deep networks, 2015, arXiv preprint arXiv:1511.06807.
[32] D.P. Kingma, J. Ba, Adam: A method for stochastic optimization, in: 3rd International Conference for Learning Representations, San Diego, 2015.
[33] BLAS (Basic Linear Algebra Subprograms). URL https://netlib.org/blas/.
[34] OpenBLAS. URL https://www.openblas.net/.
[35] NumPy. URL https://numpy.org/.
[36] GDPA Repository. URL https://github.com/rivasjm/gdpa.
[37] E. Bini, G.C. Buttazzo, Measuring the performance of schedulability tests, Real-Time Syst. 30 (2005) 129–154, http://dx.doi.org/10.1007/s11241-005-0507-9.
[38] S. Anssi, S. Tucci-Piergiovanni, S. Kuntz, S. Gérard, F. Terrier, Enabling scheduling analysis for AUTOSAR systems, in: 2011 14th IEEE International Symposium on Object/Component/Service-Oriented Real-Time Distributed Computing, 2011, pp. 152–159, http://dx.doi.org/10.1109/ISORC.2011.28.
[39] C.D. Locke, D.R. Vogel, L. Lucas, J.B. Goodenough, Generic avionics software specification, 1990, Draft Specification for Naval Weapons Center, China Lake, CA. IBM Systems Integration Division, Owego, NY.

Juan M. Rivas is an Assistant Professor in the Software Engineering and Real-Time Group at the University of Cantabria (Spain). He received his B.Sc. degree in Telecommunications Engineering and M.Sc. in Computer Science from the University of Cantabria in 2008 and 2009 respectively. He obtained his Ph.D. degree in Computer Science from the same institution in 2015. He has been involved in several national and European research projects, including industrial collaborations, focusing on topics such as the optimization of distributed hard-real-time systems, modeling, and scheduling in novel platforms such as GPUs.

J. Javier Gutiérrez received his B.S. and Ph.D. degrees from the University of Cantabria (Spain) in 1989 and 1995 respectively. He is a Professor in the Software Engineering and Real-Time Group at the University of Cantabria, which he joined in the early 90s. His research activity deals with the scheduling, analysis and optimization of embedded real-time distributed systems (including communication networks and distribution middleware). He has been involved in several research projects building real-time controllers for robots, evaluating Ada for real-time applications, developing middleware for real-time distributed systems, and proposing models along with the analysis and optimization techniques for distributed real-time applications.

Ana Guasque was born in Valencia, Spain. She received a B.S. degree in industrial engineering from the Universitat Politècnica de València (UPV) in 2013, and an M.S. degree in automation and industrial computing from the UPV in 2015. She received a Ph.D. degree in industrial engineering from the UPV in 2019. She is currently working as a researcher at the Universitat Politècnica de València. Her main research interests include real-time operating systems, scheduling, optimization algorithms and real-time control.

Patricia Balbastre is an associate professor of computer engineering at the Universitat Politècnica de València (UPV). She graduated in electronic engineering at the UPV in 1998 and obtained the Ph.D. degree in computer science in 2002. Her main research interests include real-time operating systems, dynamic scheduling algorithms and real-time control.
