Sigmetrics08 Control PDF
7.1 Introduction
Feedback control is central to managing computing systems and networks. For ex-
ample, feedback (or closed loop systems) is employed to achieve response time ob-
jectives by taking resource actions such as adjusting scheduling priorities, memory
allocations, and network bandwidth allocations. Unfortunately, computing practi-
tioners typically employ an ad hoc approach to the design of feedback control, often
Tarek Abdelzaher
Dept. of Comp. Sci., University of Illinois, Urbana-Champaign, IL, e-mail: [email protected]
Yixin Diao
IBM T. J. Watson Research Center, Hawthorne, NY, e-mail: [email protected]
Joseph L. Hellerstein
Developer Division, Microsoft Corp, Redmond, WA, e-mail: [email protected]
Chenyang Lu
Dept. of Comp. Sci. and Eng., Washington University, St. Louis, MO, e-mail: [email protected]
Xiaoyun Zhu
Hewlett Packard Laboratories, Hewlett Packard Corp., Palo Alto, CA, e-mail: xi-
[email protected]
186 T Abdelzaher, Y Diao, JL Hellerstein, C Lu, X Zhu
This section provides a brief overview of control theory for computer scientists with
little background in the area. The focus is on key concepts and fundamental results.
Fig. 7.1 Block diagram of a closed loop system, showing the disturbance input, noise input, transducer, and transduced output.
Karl Åström, one of the most prolific contributors to control theory, states that
the magic of feedback is that it can create a system that performs well from com-
ponents that perform poorly [2]. This is achieved by adding a new element, the con-
troller, that dynamically adjusts the behavior of one or more other elements based
on the measured outputs of the system. We use the term target system to refer to the
elements that are manipulated by one or more controllers to achieve desired outputs.
7 Introduction to Control Theory And Its Application to Computing Systems 187
The elements of a closed loop system are depicted in Figure 7.1. Below, we
describe these elements and the information, or signals, that flow between elements.
Throughout, time is discrete and is denoted by k. Signals are functions of time.
The reference input r(k) is the desired value of the measured output (or of a
transformation of it), such as CPU utilization. For example, r(k) might be 66%.
Sometimes, the reference input is referred to as the desired output or the set point.
The control error e(k) is the difference between the reference input and the mea-
sured output.
The control input u(k) is the setting of one or more parameters that manipulate
the behavior of the target system(s) and can be adjusted dynamically.
The controller determines the setting of the control input needed to achieve the
reference input. The controller computes values of the control input based on
current and past values of control error.
The disturbance input d(k) is any change that affects the way in which the control
input influences the measured output (e.g., running a virus scan or a backup).
The measured output y(k) is a measurable characteristic of the target system such
as CPU utilization and response time.
The noise input n(k) changes the measured output produced by the target system.
This is also called sensor noise or measurement noise.
The transducer transforms the measured output so that it can be compared with
the reference input (e.g., smoothing stochastics of the output).
In general, there may be multiple instances of any of the above elements. For ex-
ample, in clustered systems, there may be multiple load balancers (controllers) that
regulate the loads on multiple servers (target systems).
To illustrate the foregoing, consider a cluster of three Apache Web Servers. The
Administrator may want these systems to run at no greater than 66% utilization so
that if any one of them fails, the other two can absorb the load of the failed server.
Here, the measured output is CPU utilization. The control input is the maximum
number of connections that the server permits as specified by the MaxClients
parameter. This parameter can be manipulated to adjust CPU utilization. Examples
of disturbances are changes in arrival rates and shifts in the type of requests (e.g.,
from static to dynamic pages). Control theory provides design techniques for deter-
mining the values of parameters such as MaxClients so that the resulting system
is stable and settles quickly in response to disturbances.
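To make this concrete, the following sketch simulates such a regulation loop. The linear utilization model (utilization proportional to the number of permitted connections) and all constants are illustrative assumptions, not measurements of an Apache server:

```python
# Sketch: integral control of CPU utilization via a MaxClients-like knob.
# The model "utilization = load * max_clients" is an assumed toy model.

def simulate(target=0.66, steps=30, ki=40.0):
    max_clients = 100.0        # control input u(k)
    load = 0.004               # assumed utilization per permitted connection
    history = []
    for k in range(steps):
        utilization = min(1.0, load * max_clients)  # measured output y(k)
        error = target - utilization                # control error e(k)
        max_clients += ki * error                   # integral control update
        if k == 15:
            load = 0.006       # disturbance: request mix becomes heavier
        history.append(utilization)
    return history

hist = simulate()
```

Despite the disturbance at step 15, the integral action drives the utilization back toward the 66% reference.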
Controllers are designed for some intended purpose or control objective. The
most common objectives are:
regulatory control: Ensure that the measured output is equal to (or near) the
reference input. For example, in a cluster of three web servers, the reference
input might be that the utilization of a web server should be maintained at 66%
to handle fail-over. If we add a fourth web server to the cluster, then we may want
to change the reference input from 66% to 75%.
disturbance rejection: Ensure that disturbances acting on the system do not
significantly affect the measured output. For example, when a backup or virus
scan is run on a web server, the overall utilization of the system is maintained
at 66%. This differs from regulatory control in that we focus on changes to the
disturbance input, not to the reference input.
optimization: Obtain the best value of the measured output, such as optimiz-
ing the setting of MaxClients in the Apache HTTP Server so as to minimize
response times. Here, there is no reference input.
There are several properties of feedback control systems that should be consid-
ered when comparing controllers for computing systems. Our choice of metrics is
drawn from experience with commercial information technology systems. Other
properties may be of interest in different settings. For example, [21] discusses prop-
erties of interest for control of real-time systems.
Below, we motivate and present the main ideas of the properties considered.
A system is said to be stable if for any bounded input, the output is also bounded.
Stability is typically the first property considered in designing control systems
since unstable systems cannot be used for mission critical work.
The control system is accurate if the measured output converges (or becomes
sufficiently close) to the reference input in the case of regulatory control and
disturbance rejection, or the measured output converges to the optimal value in
the case of an optimization objective. Accurate systems are essential to ensuring
that control objectives are met, such as differentiating between gold and silver
classes of service and ensuring that throughput is maximized without exceeding
response time constraints. Typically, we do not quantify accuracy. Rather, we
measure inaccuracy. For a system in steady state, its inaccuracy, or steady state
error is the steady state value of the control error e(k).
The system has short settling times if it converges quickly to its steady state
value. Short settling times are particularly important for disturbance rejection in
the presence of time-varying workloads so that convergence is obtained before
the workload changes.
The system should achieve its objectives in a manner that does not overshoot.
The motivation here is that overshoot typically leads to undershoot and hence to
increased variability in the measured output.
Much of our application of control theory is based on the properties of stability,
accuracy, settling time, and overshoot. We refer to these as the SASO properties.
To elaborate on the SASO properties, we consider what constitutes a stable sys-
tem. For computing systems, we want the output of feedback control to converge,
although it may not be constant due to the stochastic nature of the system. To re-
fine this further, computing systems have operating regions (i.e., combinations of
workloads and configuration settings) in which they perform acceptably and other
operating regions in which they do not. Thus, in general, we refer to the stability of
a system within an operating region. Clearly, if a system is not stable, its utility is
severely limited. In particular, the system's response times will be large and highly
variable, a situation that can make the system unusable.
Fig. 7.2 Response of a stable system to a step change in the reference input. At time 0, the reference
input changes from 0 to 2. The system reaches steady state when its output always lies between the
light dashed lines. Depicted are the steady state error (ess), settling time (ks), and maximum
overshoot (MP).
If the feedback system is stable, then it makes sense to consider the remaining
SASO properties: accuracy, settling time, and overshoot. The vertical lines in Figure 7.2
plot the measured output of a stable feedback system. Initially, the (normalized)
reference input is 0. At time 0, the reference input is changed to rss = 2. The
system responds and its measured output eventually converges to yss = 3, as indicated
by the heavy dashed line. The steady state error ess is −1, where ess = rss − yss.
The settling time of the system ks is the time from the change in input to when the
measured output is sufficiently close to its new steady state value (as indicated by the
light dashed lines). In the figure, ks = 9. The maximum overshoot MP is the (normal-
ized) maximum amount by which the measured output exceeds its steady state value.
In the figure, the maximum value of the output is 3.95 and so (1 + MP )yss = 3.95, or
MP = 32%.
The properties of feedback systems are used in two ways. The first is for analysis
to assess the SASO properties of a system. The second is as design objectives. For
the latter, we construct the feedback system to have acceptable values of steady
state error, settling time, and maximum overshoot. More details on applying control
theory to computing systems can be found in [16].
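These definitions can be turned into a small measurement routine. The step-response trace below is synthetic, and a 2% band around the steady state value is assumed as the definition of "sufficiently close":

```python
# Sketch: estimating the SASO metrics of a Figure 7.2 style step response.
# The trace and the 2% settling band are illustrative assumptions.

def saso_metrics(y, r_ss, band=0.02):
    y_ss = y[-1]                          # steady state value (assumes convergence)
    e_ss = r_ss - y_ss                    # steady state error ess = rss - yss
    m_p = max(y) / y_ss - 1.0             # maximum overshoot (fractional)
    k_s = 0
    for k in range(len(y)):
        # settling time: one past the last time the tail leaves the band
        if any(abs(v - y_ss) > band * abs(y_ss) for v in y[k:]):
            k_s = k + 1
    return e_ss, k_s, m_p

trace = [0.0, 1.8, 3.1, 3.6, 3.3, 3.05, 2.95, 3.0, 3.0, 3.0]
e_ss, k_s, m_p = saso_metrics(trace, r_ss=2.0)
```

Here the output settles at 3 although the reference is 2, so e_ss is −1, mirroring the steady state error in the figure.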
Fig. 7.3 Block diagram of a feedback system to control RPCs in System (RIS) for the IBM Lotus Notes
Domino Server. The Controller has transfer function K(z) = KI·z/(z − 1), the Notes Server
0.47/(z − 0.43), and the Notes Sensor (0.17z − 0.11)/(z − 0.64).
We describe the essentials of control design using the IBM Lotus Domino Server
in [26]. The feedback loop is depicted in Figure 7.3. It consists of the Controller,
the Notes Server, and the Notes Sensor. The control objective is regulation, which is
motivated by administrators who manage the reliability of Notes Servers by regulating
the number of remote procedure calls (RPCs) in the server. This quantity, which
we denote by RIS, roughly corresponds to the number of active users (those with
requests outstanding at the server). Administrators choose a setting for RIS that bal-
ances the competing goals of maximizing throughput by having high concurrency
levels with maximizing server reliability by reducing server loads.
RIS is measured by periodically reading a server log file, which we call the Notes
Sensor. Regulation is accomplished by using the MaxUsers tuning parameter that
controls the number of connected users. The correspondence between MaxUsers
and RIS changes over time, which means that MaxUsers must be updated almost
continuously to achieve the control objective. The controller automatically deter-
mines the value of MaxUsers based on the objective for RIS and the measured
value of RIS obtained from the Notes Sensor.
Our starting point is to model how MaxUsers affects RIS as output by the Notes
Server. We use u(k) to denote the k-th value of MaxUsers, and y(k) to denote the k-th
value of RIS. (Actually, u(k) and y(k) are offsets from a desired operating point.)
We construct an empirical model that relates y(k) to u(k) by applying least squares
regression to data obtained from off-line experiments. (Empirical models can also
be constructed in real time using on-line data.) The resulting model is
y(k + 1) = 0.43y(k) + 0.47u(k), (7.1) whose transfer function is G(z) = 0.47/(z − 0.43).
We want to design a closed loop system that
is stable, accurate, and has short settling times. First, observe that the closed loop
system itself has a transfer function that relates the reference input to the measured
output. We denote this transfer function by F(z). Translating the design objectives
into properties of F(z), we want the poles of F(z) to be close to 0 (which achieves
both stability and short settling times), and we want F(z)'s steady state gain to be
1 (which ensures accuracy since the measured output will be equal to the reference
input). These objectives are achieved by choosing the right Controller.
We proceed as follows. First, we construct a transfer function S(z) for the Notes
Sensor in the same way as was done with the Notes Server. Next, we choose a
parameterized controller. We use an integral controller, which provides incremental
adjustments in MaxUsers. Specifically, u(k + 1) = u(k) + KI e(k), and its transfer
function is K(z) = KI·z/(z − 1). With these two transfer functions, it is straightforward to
construct F(z) [16]. It turns out that an integral controller guarantees that F(z) has
a steady state gain of 1. Thus, the control design reduces to choosing KI such that
the poles of F(z) are close to 0.
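This design step can be sketched numerically. Assuming the transfer functions suggested by Figure 7.3 (server 0.47/(z − 0.43), sensor (0.17z − 0.11)/(z − 0.64), controller K(z) = KI·z/(z − 1)), the closed loop poles are the roots of the characteristic polynomial (z − 1)(z − 0.43)(z − 0.64) + 0.47·KI·z·(0.17z − 0.11):

```python
import numpy as np

def closed_loop_poles(ki):
    # Characteristic polynomial of the closed loop: product of the
    # denominators plus the loop gain KI*z * 0.47 * (0.17z - 0.11).
    open_loop = np.polymul(np.polymul([1, -1], [1, -0.43]), [1, -0.64])
    loop_gain = 0.47 * ki * np.polymul([1, 0], [0.17, -0.11])
    return np.roots(np.polyadd(open_loop, loop_gain))

poles = {ki: closed_loop_poles(ki) for ki in (0.1, 0.5, 1.0)}
# For these gains all poles lie inside the unit circle (stable), and a
# larger KI pulls the dominant pole away from z = 1 (faster settling).
```

Scanning candidate gains this way is a simple stand-in for the formal pole-placement procedure in [16].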
The theory discussed so far addresses linear, time-invariant, deterministic (LTI)
systems with a single input (e.g., MaxUsers ) and a single output (e.g., RIS). There
are many extensions to LTI theory. Adaptive control (e.g., [4]) provides a way to au-
tomatically adapt the controller in response to changes in the target system and/or
workloads. Stochastic control (e.g., [3]) is a framework for going beyond determin-
istic systems. State space and hybrid systems (e.g., [24]) provide a way to address
multiple inputs and multiple outputs as well as complex phase changes. Non-linear
control provides a way to address complex relationships between inputs and out-
puts [29].
This section describes a feedback control approach that achieves the optimization
objective. We study such an approach in the context of memory management in
IBM's DB2 Universal Database Management System. The feedback controller man-
ages memory allocation in real time to respond to workload variation and minimize
system response time.
Figure 7.4 shows the architecture and system operations of a database server that
works with multiple memory pools. The database clients interact with the database
server through the database agents which are computing elements that coordinate
access to the data stored in the database. Since disk accesses are much slower than
main memory accesses, database systems use memory pools to cache disk pages so
as to reduce the number and time of disk input/output operations needed.
The in-memory data are organized in several pools, which are dedicated for differ-
ent purposes and can be of different types and characteristics (e.g., buffer pools,
package cache, sort memory, lock list).
Fig. 7.4 Architecture of the database server: database clients interact with the database agents,
which access data through the memory pools backed by disks; a memory tuner adjusts the memory
allocations based on the response time benefit reported by a sensor.

The memory tuning problem can be cast as an optimization problem. Let ui denote the size of the
i-th of N memory pools, and let the cost function J measure the overall response time cost as a
function of the pool sizes:

J = f(u1, u2, . . . , uN)    (7.2)
The pool sizes are also subject to the total memory budget, giving the equality constraint
g(u1, . . . , uN) = Σ_{i=1}^{N} ui − U = 0 (7.3), where U is the total memory available.
Further, there may be N scalar inequality constraints imposed on the memory pools:

hi(ui) = ui,min − ui ≤ 0    (7.4)

where ui,min is the minimum allowable size of pool i. Applying the first-order optimality
conditions to the Lagrangian L = f + λ g + Σ_{j=1}^{N} μj hj, and noting that the measured
response time benefit of pool i is yi = −∂f/∂ui, we obtain
∂L/∂ui = ∂f/∂ui + λ ∂g/∂ui + Σ_{j=1}^{N} μj ∂hj/∂ui = −yi + λ − μi = 0. Furthermore, μi satisfies the
complementarity condition μi hi = 0 with μi ≥ 0. This implies that when the memory
allocation is optimal and pool sizes are not at the boundaries, the measured outputs
of the memory pools are equal (yi = λ, and μi = 0 since hi < 0). In the case that the
memory allocation is optimal when some pool sizes are at the boundaries, the measured
output from these memory pools may be smaller (yi = λ − μi ≤ λ, and μi ≥ 0
since hi = 0). Since f is a convex function, the optimal solution is unique in that the
local optimum is also the global optimum.
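The equal-benefit condition can be verified on a toy instance. Assume, purely for illustration, that pool i contributes response time a_i/u_i, so the benefit is the negative marginal cost y_i = a_i/u_i²; minimizing Σ a_i/u_i under Σ u_i = U gives u_i proportional to √a_i, at which point all y_i coincide with the multiplier λ:

```python
import math

# Toy instance of the equal-benefit result: with f(u) = sum(a_i / u_i)
# and a total budget U, the optimum allocates u_i proportional to
# sqrt(a_i), and the marginal benefits y_i = a_i / u_i**2 all equal lambda.

a = [1.0, 4.0, 9.0]
U = 12.0
scale = U / sum(math.sqrt(x) for x in a)     # 12 / (1 + 2 + 3) = 2
u = [scale * math.sqrt(x) for x in a]        # optimal sizes 2, 4, 6
y = [x / s ** 2 for x, s in zip(a, u)]       # marginal benefits, all 0.25
```

Any interior reallocation away from this split raises the total cost, since it moves memory from a pool with higher marginal benefit to one with lower.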
We design a multiple-input multiple-output (MIMO) feedback controller to
equalize the measured output. Such an approach allows us to exploit well estab-
lished techniques for handling dynamics and disturbances (from changes in work-
loads) and to incorporate the cost of control (throughput reductions due to load
imbalance and resource resizing) into the design. The feedback control system is
defined as follows (where matrices are denoted by boldface uppercase letters and
vectors by boldface lowercase):

y(k + 1) = A y(k) + B (u(k) + dI(k))    (7.5)

e(k) = ((1/N) 1N,N − I) (y(k) + dO(k))    (7.6)

eI(k + 1) = eI(k) + e(k)    (7.7)

u(k) = KP e(k) + KI eI(k)    (7.8)

The first equation represents a state space model [14], which is a local linear
approximation of the concave memory-benefit relationship. Although most computing
systems are inherently nonlinear, a local linear approximation can be effective,
especially given the presence of system noise and the ability to adapt the model on
line. The N × 1 vector y(k) denotes the measured output (i.e., response time benefit),
the N × 1 vector u(k) represents the control input (i.e., memory pool size), and
the N × 1 vector dI(k) indicates possible disturbances applied to the control inputs
(e.g., adjustments made to enforce the equality and inequality resource constraints).
The N × N matrices A and B contain state space model parameters that can be
obtained from measured data and system identification [20].
Equation (7.6) specifies the N × 1 control error vector e(k), where I is the N × N
identity matrix and 1N,N is the N × N matrix of all ones. The N × 1 vector dO(k)
indicates possible disturbances applied to the measured outputs (e.g., measurement
noises that are not characterized by the deterministic model). Implied by this equation
is that we define the average measured output ȳ(k) = (1/N) Σ_{i=1}^{N} yi(k) as the
control reference for all measured outputs, and the i-th control error
ei(k) = ȳ(k) − yi(k). Note that in
contrast to having a static value or external signal as the reference input, we specify
the reference as a linear transformation of the measured outputs. The control objec-
tive is to make ei (k) = 0, that is, equalizing the measured outputs (i.e., yi (k) = y j (k)
for any i and j) so as to maximize the total saved response time.
The dynamic state feedback control law is defined in Equation (7.8), and the
integral control error eI (k) is the N 1 vector representing the sum of the control
errors as defined in Equation (7.7). The N N matrices KP and KI are controller
parameters to be chosen (through controller design) in order to stabilize the closed
loop system and achieve the SASO performance criteria regarding convergence and
settling time.
We design the controller and choose the control parameters in a way that considers
the cost of control: both the cost of transient memory imbalances and the
cost of changes in memory allocations [10]. Reducing memory imbalance quickly
generally requires an aggressive control strategy with a short settling time for moving
the memory from imbalance to balance. However, an overly aggressive controller can
overreact to random fluctuations and thus incur additional allocation-change cost.
We handle this trade-off by exploiting optimal linear quadratic regulator (LQR)
control [15]. LQR chooses control parameters that minimize the quadratic cost
function

J = Σ_{k=1}^{∞} ( [e(k); eI(k)]ᵀ Q [e(k); eI(k)] + uᵀ(k) R u(k) )    (7.9)
over an infinite time horizon while satisfying the dynamics defined in Equations
(7.5)-(7.8). The cost function includes the control errors e(k) and eI(k) and the
control input u(k). The former are related to the cost of transient resource imbalances,
and the latter to the cost of changing resource allocations. The matrices Q and R
determine the trade-off. Intuitively, if Q is large compared to R, the controller will make
big changes in resource allocations and hence can react quickly to disturbances. On
the other hand, if R is large compared to Q, the controller is much more conservative
since there is a high cost for changing resource allocations.
With Q and R defined, the control parameters KP and KI can be computed in the
usual way by solving the Riccati equation [4]. Hence, the controller design problem
is to select the proper weighting matrices Q and R which quantify the cost of control.
We achieve this by developing a cost model for the performance impact
of control and constructing Q and R in a systematic way [10].
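As a sketch of this step, the gains for given Q and R can be computed by iterating the discrete-time Riccati equation to a fixed point. The one-pool model below, with state [e(k), eI(k)], uses made-up dynamics rather than an identified DB2 model:

```python
import numpy as np

def dlqr(A, B, Q, R, iters=500):
    # Value iteration on the discrete algebraic Riccati equation:
    # P <- Q + A' P (A - B K), with K the associated optimal gain.
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K

A = np.array([[0.7, 0.0],     # assumed error dynamics (illustrative)
              [1.0, 1.0]])    # integrator state accumulating the error
B = np.array([[0.5],
              [0.0]])
Q = np.eye(2)                 # weight on transient imbalance
R = np.array([[1.0]])         # weight on allocation changes
K = dlqr(A, B, Q, R)
closed_poles = np.linalg.eigvals(A - B @ K)
```

Scaling R up relative to Q yields smaller gains and gentler allocation changes, matching the trade-off described above.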
Although the cost model and LQR framework provide a systematic way to study
the cost of control, they are better suited to off-line use for analyzing the target
system and designing the controller prior to operation. Further simplification is
needed to facilitate real time adaptation when the workload is unknown in advance
and can change over time. This also helps to manage a large set of memory pools
where the number of pools is varying.
This simplification is achieved using a distributed control architecture and adap-
tive pole placement techniques. The model is built and the controller is designed
locally for each individual memory pool; the only connection between different
pools is the average measured output, which serves as the common control reference.
Specifically, a first-order single-input single-output (SISO) model,
yi(k + 1) = bi(k) ui(k), (7.10)
is built on line for the i-th memory pool. This is equivalent to having A = 0 and B =
diag([b1 , . . . , bN ]) in Equation (7.5), while the disturbance term dI (k) is enlarged to
include the modeling uncertainty. Having a set of SISO models simplifies the model
structure and parameter, so that on line modeling techniques such as recursive least
squares can be effectively applied with less computational complexity [20].
The controller is also built individually for each pool:

ui(k + 1) = ui(k) − ((1 − p)/bi(k)) ( yi(k) − (1/N) Σ_{j=1}^{N} yj(k) )    (7.11)

The controller takes the form of integral control, a simplification of Equation
(7.8) obtained by setting KP = 0 and KI = diag([(1 − p)/b1(k), . . . , (1 − p)/bN(k)]).
The control parameter (1 − p)/bi(k) is designed through adaptive pole placement,
so that it is adapted whenever a different model parameter bi(k) is estimated on line.
With these reasonable simplifications, the distributed architecture makes the controller
agile to workload and resource variations, and increases its robustness to
measurement uncertainties and possibly uneven control intervals. For example, although
in general for a database server the system dynamics may not be negligible
(i.e., an increase of buffer pool size may not immediately result in a response time
benefit decrease, as time is needed to fill up the added buffer space) and cross
memory pool impact does exist (i.e., an increase of sort memory will not only bring
down the benefit for sort memory but also that for the buffer pool that stores temporary
sort spill pages), our experimental results confirm the control performance of
this distributed controller.
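The distributed controller of Equation (7.11) can be sketched on a toy linear plant. The gains b_i below are assumed, the model estimate is updated with a naive one-point fit rather than recursive least squares, and the total-memory constraint is omitted for simplicity:

```python
# Sketch: distributed integral control (Eq. 7.11) equalizing the benefits
# y_i of N pools with an assumed plant y_i = b_i * u_i. The b_i are
# unknown to the controller and estimated from observed input/output.

N = 3
b_true = [0.8, 1.5, 2.4]      # unknown plant gains (illustrative)
b_est = [1.0] * N             # controller's model parameters b_i(k)
p = 0.5                       # desired closed loop pole, 0 <= p < 1
u = [10.0] * N                # initial pool sizes
for k in range(40):
    y = [b * ui for b, ui in zip(b_true, u)]    # measured benefits
    y_bar = sum(y) / N                          # common reference
    for i in range(N):
        b_est[i] = y[i] / u[i]                  # naive one-point model fit
        u[i] -= (1 - p) / b_est[i] * (y[i] - y_bar)   # Eq. (7.11)
final_y = [b * ui for b, ui in zip(b_true, u)]
spread = max(final_y) - min(final_y)
```

Once the estimates are accurate, each step shrinks the deviation of every y_i from the average by the factor p, so the benefits equalize geometrically.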
Figure 7.5 evaluates the performance of the feedback controller under an on line
transaction processing (OLTP) workload. The OLTP workload consists of a large
number of concurrent requests, each of which has very modest resource demands;
we use 20 buffer pools to contain data and index for the database tables and 50
database clients to generate the load. Figure 7.5(a) shows the throughput (measured
in transactions per unit time) that indicates the performance impact of buffer pool
re-sizings. Figure 7.5(b) and (c) display the memory allocations and response time
benefits for the controlled buffer pools (as indicated by the 20 solid lines in the
plot). Initially, the database memory is not properly allocated: most of the memory
has been allocated to one buffer pool, while the other buffer pools are set at the
minimum size. The controller adjusts the size of buffer pools so as to equalize the
response time benefits of the pools. We see that even for a large number of memory
pools the controller converges in approximately 80 intervals. Further, our studies in
[10] show that the controller's actions increase throughput by a factor of three.
Fig. 7.5 Feedback control of DB2 memory under an OLTP workload, over 200 control intervals of
60 seconds each: (a) OLTP throughput, (b) memory pool sizes, and (c) response time benefit.
Distributed real-time embedded (DRE) systems must control the CPU utilization
of multiple processors to prevent overload and meet deadlines in the face of fluctuating
workloads. We present the End-to-end Utilization CONtrol (EUCON) algorithm,
which controls the CPU utilization of all processors in a DRE system by dynamically
adjusting the invocation rates of periodic tasks. A DRE system comprises m
end-to-end periodic tasks {Ti | 1 ≤ i ≤ m} executing on n processors {Pi | 1 ≤ i ≤ n}.
Task Ti is composed of a chain of subtasks {Tij | 1 ≤ j ≤ ni} running on multiple
processors. The execution of a subtask Tij is triggered by the completion of its
predecessor Ti,j−1. Hence all the subtasks of a task are invoked at the same rate. For
example, on a Real-Time CORBA middleware a task may be implemented as a
sequence of remote operation requests to distributed objects, where each remote
operation request corresponds to a subtask. Each subtask Ti j has an estimated exe-
cution time ci j known at deployment time. However, the actual execution time of a
subtask may differ from ci j and vary at run time. The rate of Ti can be dynamically
adjusted within a range [Rmin,i , Rmax,i ]. A task running at a higher rate contributes
higher utility at the cost of higher CPU utilization. For example, both video stream-
ing and digital control applications usually deliver better performance when running
at higher rates.
As shown in Figure 7.6, EUCON is composed of a centralized controller, and a
utilization monitor and a rate modulator on each processor. A separate TCP connec-
tion connects the controller with the pair of utilization monitor and rate modulator
on each processor. The user inputs to the controller include the utilization set points,
B = [B1 . . . Bn ]T , which specify the desired CPU utilization of each processor, and
the rate constraints of each task. The measured output is the CPU utilization of all
processors, u(k) = [u1(k) . . . un(k)]T. The control input is the vector of changes to
task rates, Δr(k) = [Δr1(k) . . . Δrm(k)]T, where Δri(k) = ri(k) − ri(k − 1) (1 ≤ i ≤ m). The
goal of EUCON is to regulate the CPU utilizations of all processors so that they remain
close to their respective set points by adjusting the task rates, despite variations
in task execution times at run time.
Fig. 7.6 The EUCON feedback control loop: a model predictive controller takes as user inputs the
utilization set points B1, . . . , Bn and the task rate constraints [Rmin,i, Rmax,i]; it receives the
measured utilizations u1(k), . . . , un(k) from a utilization monitor (UM) on each processor of the
distributed system (m tasks, n processors), and sends the rate changes Δr1(k), . . . , Δrm(k) to a
rate modulator (RM) on each processor. Precedence constraints link the subtasks of each task.
DRE systems pose several challenges to utilization control. First, the utilization
control problem is multi-input-multi-output (MIMO) in that the system needs to
regulate the CPU utilization of multiple processors by adjusting the rates of multiple
tasks. More importantly, the CPU utilizations of different processors are coupled to
each other due to the correlation among subtasks belonging to the same task: changing
the rate of a task will affect the utilization of all the processors hosting its
subtasks, because the subtasks must execute at the same rate. Therefore the CPU
utilization of different processors cannot be controlled independently. Finally,
the control is subject to actuator constraints, as the rate of a task must remain
within an application-specific range.
To deal with inter-processor coupling and rate constraints, EUCON adopts Model
Predictive Control (MPC) [23], an advanced control technique used extensively in
industrial process control. Its major advantage is that it can deal with coupled MIMO
control problems with constraints on the actuators. The basic idea of MPC is to
optimize an appropriate cost function defined over a time interval in the future. The
controller employs a model of the system which is used to predict the behavior
over P sampling periods called the prediction horizon. The control objective is to
select an input trajectory to minimize the cost subject to the actuator constraints.
An input trajectory includes the control inputs in the following M sampling periods,
Δr(k|k), Δr(k + 1|k), . . . , Δr(k + M − 1|k), where M is called the control horizon. The
notation Δr(k + 1|k) means that Δr(k + 1) depends on the conditions at time k.
Once the input trajectory is computed, only the first element (Δr(k|k)) is applied as
the control input to the system. In the next step, the prediction horizon slides one
sampling period and the input trajectory is computed again based on the measured
output (u(k)).
Before designing the controller for EUCON, we derive a dynamic model that
characterizes the relationship between the control input Δr(k) and the measured
output u(k). First, we model the utilization ui(k) of one processor Pi. Let Δrj(k)
denote the change to the task rate, Δrj(k) = rj(k) − rj(k − 1). We define the estimated
change to utilization, Δbi(k), as:

Δbi(k) = Σ_{Tjl ∈ Si} cjl Δrj(k)

where Si represents the set of subtasks located at processor Pi. Note Δbi(k) is based
on the estimated execution times. Since the actual execution times may differ from
their estimates, we model the utilization ui(k) as:
u(k + 1) = u(k) + G Δb(k)    (7.14)

where the diagonal matrix G = diag(g1, . . . , gn) contains the system gains, gi being
the ratio between the change in the actual utilization of Pi and its estimate Δbi(k).
Writing the estimated changes in terms of the rate changes, Δb(k) = F Δr(k) (7.15),
where F is the n × m matrix with fij = Σ_{Tjl ∈ Si} cjl. Based on this model, the
controller minimizes the following cost function over the prediction horizon:

V(k) = Σ_{i=1}^{P} ‖ u(k + i|k) − ref(k + i|k) ‖² + Σ_{i=0}^{M−1} ‖ Δr(k + i|k) − Δr(k + i − 1|k) ‖²    (7.16)
where P is the prediction horizon, and M is the control horizon. The first term in the
cost function represents the tracking error, i.e., the difference between the utilization
vector u(k + i|k) and a reference trajectory ref(k + i|k). The reference trajectory de-
fines an ideal trajectory along which the utilization vector u(k + i|k) should change
from the current utilization u(k) to the utilization set points B. Our controller is de-
signed to track the following exponential reference trajectory so that the closed-loop
system behaves like a linear system:
ref(k + i|k) = B − e^(−i·Ts/Tref) (B − u(k))    (7.17)

where Ts is the sampling period and Tref is the time constant that specifies the speed
of system response. A smaller Tref causes the system to converge faster to the set
point. By minimizing the tracking
error, the closed loop system will converge to the utilization set point if the system
is stable. The second term in the cost function (7.16) represents the control penalty,
which causes the controller to reduce the changes to the control input.
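The reference trajectory of Equation (7.17) is easy to tabulate for one processor; the set point, initial utilization, Ts, and Tref below are illustrative values:

```python
import math

# Exponential reference trajectory of Eq. (7.17) for a single processor.
B, u0 = 0.7, 0.4          # set point and current utilization (assumed)
Ts, Tref = 4.0, 8.0       # sampling period and time constant (assumed)

traj = [B - math.exp(-i * Ts / Tref) * (B - u0) for i in range(6)]
# The trajectory starts at u0 and approaches B geometrically; a smaller
# Tref makes the approach faster.
```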
The controller minimizes the cost function (7.16) under the rate constraints based
on an approximate system model. This constrained optimization problem can be
transformed to a standard constrained least-squares problem. The controller can
then use a standard least-squares solver to solve this problem on-line [22].
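A single controller step can be sketched as a bounded least-squares problem. A real implementation would use a true constrained solver; here, purely as an illustration with an assumed gain matrix, the unconstrained least-squares solution is clipped to the rate limits, which only approximates the exact constrained optimum:

```python
import numpy as np

# Sketch: choose rate changes dr minimizing ||F dr - (B - u)|| subject to
# box limits, via unconstrained least squares followed by clipping.
# F (assumed) maps task rate changes to processor utilization changes.

F = np.array([[0.2, 0.4],
              [0.5, 0.1],
              [0.3, 0.3]])
u = np.array([0.85, 0.75, 0.80])   # current utilizations (3 processors)
B = np.array([0.70, 0.70, 0.70])   # utilization set points

dr, *_ = np.linalg.lstsq(F, B - u, rcond=None)
dr = np.clip(dr, -0.2, 0.2)        # enforce actuator (rate) limits
predicted = u + F @ dr             # model-predicted utilizations
```

Because all processors are over their set points, the computed rate changes are decreases, and the predicted utilizations move toward B.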
Note that the system model described in (7.14) and (7.15) cannot be used directly
by the controller because the system gains G are unknown. The controller assumes
G = I in (7.14), i.e., the actual utilization change is the same as its estimate. Although this
approximate model may behave differently from the real system, as proven in [22],
the closed loop system can maintain stability and track the utilization set points as
long as the actual G remains within a certain range. Furthermore, this range can be
established using stability analysis of the closed-loop system.
EUCON has been implemented in FC-ORB [31], a distributed middleware for
DRE systems. We now summarize the representative experimental results presented
in [31]. All tasks run on a Linux cluster composed of four Pentium-IV machines.
The EUCON controller is located on another Pentium-IV machine. The workload
comprises 12 tasks with a total of 25 subtasks. In the first experiment shown in Fig-
ure 7.7(a), the average execution times of all subtasks change simultaneously. The
execution times of all subtasks increase by 50% at 600 seconds; EUCON responds
to the overload by reducing task rates, which causes the utilization of every processor
to converge to its set point within 100 seconds (25 sampling periods). At 1000
seconds, the utilization of every processor drops sharply due to a 56% decrease in the
execution times of all subtasks. EUCON increases task rates until the utilizations
reconverge to their set points. In the second experiment shown in Figure 7.7(b), only
the average execution times of the subtasks on one of the processors experience the
same variations as in the first run, while all the other subtasks maintain the same av-
erage execution times. As shown in Figure 7.7(b) the utilization of every processor
converges to its set point after the variation of execution times at 600 seconds and
1000 seconds, respectively. These results demonstrate that EUCON can effectively
control the utilization of multiple processors under varying execution times, while
handling inter-processor coupling and rate constraints.
Fig. 7.7 The CPU utilization of all the processors in a Linux cluster when subtask execution times
change on all four processors (top figure) and only one processor (bottom figure)
7.5.1 Introduction
Data centers today play a major role in providing on-demand computing to en-
terprise applications supporting key business processes including supply chain, e-
commerce, payroll, customer relationship management, etc. These applications typ-
ically employ a multi-tier architecture where distinct components of a single appli-
cation, e.g., the web tier, the application tier, and the database tier, spread across
multiple servers. In recent years, there has been wide adoption of server virtualiza-
tion in data centers due to its potential to reduce both infrastructure and operational
costs. Figure 7.8 shows an example scenario where multiple multi-tier applications
share a common pool of physical servers. Each physical server contains multiple vir-
tual containers, and each virtual container hosts a specific component of a multi-tier
application. Here a virtual container can be a hypervisor-based virtual machine
(e.g., VMware, Xen), an operating system level container (e.g., OpenVZ, Linux
VServer), or a workload group (e.g., HP Global Workload Manager, IBM Enterprise
Workload Manager). Although the grouping of application tiers can be arbitrary in
general, we specifically consider the case where the same tiers from different ap-
plications are hosted on the same physical server. This is a common scenario for
shared hosting environments for potential savings in software licensing costs.
Fig. 7.8 Multiple multi-tier applications sharing a pool of virtualized servers: each tier of each application runs in its own virtual container with an actuator (A), and each application has an end-to-end QoS sensor (S)
Consider the system in Figure 7.8, where N (N = 3) virtualized servers are used
to host M 3-tier applications. When one or more of the virtualized servers become
overloaded, the workload management tool needs to dynamically allocate the shared
server resources to individual tiers of the M applications in a coordinated fashion
such that a specified level of QoS differentiation can be maintained. Next, we de-
scribe how this problem can be cast into a feedback control problem. For simplicity,
we assume that only a single resource on a server (e.g., CPU) may become a bottle-
neck. The approach described here can be generalized to handle multiple resource
bottlenecks.
Each virtual container has an actuator (box A in Figure 7.8) associated with it,
which can allocate a certain percentage of the shared server resource to the appli-
cation component running in the container. This is referred to as resource entitle-
ment. At the beginning of each control interval k, the control input u(k) is fed into
the actuators, where u_{i,j}(k) denotes the resource entitlement for tier j of application
i during interval k. Since Σ_{i=1}^{M} u_{i,j}(k) = 1 for 1 ≤ j ≤ N, there are a total of (M−1)·N
independent variables. Hence, u(k) is an (M−1)·N-dimensional vector.
Each application has a QoS sensor (see Figure 7.8) that measures some end-to-
end performance (e.g., mean response time, throughput) at the end of each control
interval. Let q_i(k) denote the QoS measurement for application i during interval
k−1. We then define the measured output, y(k), to be the normalized QoS ratios for
individual applications, where y_i(k) = q_i(k)/Σ_{m=1}^{M} q_m(k). Since Σ_{i=1}^{M} y_i(k) = 1, only M−1
of such y_i(k)'s are independent. As a result, the system output y(k) is an (M−1)-
dimensional vector.
The goal of the feedback controller is to automatically determine the appropriate
value for each ui, j (k), such that each yi (k) can track its reference input, ri (k), the
desired QoS ratio for application i when the system is overloaded.
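The dimensionality argument can be checked with toy numbers (all values below are illustrative):

```python
import numpy as np

M, N = 3, 3   # three applications, three tiers (as in Figure 7.8)

# Resource entitlements u[i, j]: each tier's column sums to 1 across
# applications, so only (M - 1) * N of the M * N entries are independent.
u = np.array([[0.5, 0.3, 0.4],
              [0.3, 0.4, 0.4],
              [0.2, 0.3, 0.2]])
assert np.allclose(u.sum(axis=0), 1.0)

# Normalized QoS ratios from raw per-application measurements.
q = np.array([1.2, 0.8, 2.0])   # hypothetical QoS measurements q_i(k)
y = q / q.sum()                 # normalized ratios y_i(k)
assert np.isclose(y.sum(), 1.0) # hence only M - 1 of the ratios are independent
```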
We now describe the adaptive optimal controller we presented in [19] for the service
differentiation problem. A block diagram of the closed-loop control system is shown
in Figure 7.9. The controller consists of two key modules: a model estimator that
learns and periodically updates a linear model between the resource entitlements
for individual application tiers and the measured QoS ratios, and an optimal con-
troller that computes the optimal resource entitlements based on estimated model
parameters and a quadratic cost function.
We use the following linear, auto-regressive MIMO model to represent the input-
output relationship in the controlled system:
y(k+1) = Σ_{l=1}^{n} A_l y(k+1−l) + Σ_{m=0}^{n−1} B_m u(k−m).    (7.18)
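A one-step prediction with this model takes only a few lines; the matrices and signals below are illustrative, not identified from a real system:

```python
import numpy as np

def arx_predict(A, B, y_hist, u_hist):
    """One-step prediction with the MIMO ARX model (7.18).

    A: [A_1, ..., A_n]              weights on past outputs
    B: [B_0, ..., B_{n-1}]          weights on past inputs
    y_hist: [y(k), ..., y(k+1-n)]   most recent output first
    u_hist: [u(k), ..., u(k+1-n)]   most recent input first
    """
    y_next = sum(A_l @ y_l for A_l, y_l in zip(A, y_hist))
    return y_next + sum(B_m @ u_m for B_m, u_m in zip(B, u_hist))

# Hypothetical first-order (n = 1) model with two outputs and two inputs.
A = [0.5 * np.eye(2)]
B = [np.diag([1.0, 2.0])]
y_next = arx_predict(A, B, [np.array([0.2, 0.4])], [np.array([0.1, 0.1])])
```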
The controller aims to steer the system into a state of optimum reference tracking,
while penalizing large changes in the control variables. W and Q
are weighting matrices on the tracking errors and the changes in the control actions,
respectively. They are commonly chosen as diagonal matrices. Their relative magni-
tude provides a trade-off between the responsiveness and the stability of the control
system.
The optimal control law, u*(k), can be derived by first explicitly expressing the
dependency of the cost function J on u(k), and then solving the equation ∂J/∂u(k) = 0.
As a result, we get
where
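As a hedged sketch of this derivation for the simplest one-step case (n = 1), assuming the quadratic cost J = ||W(y(k+1) − r(k+1))||² + ||Q(u(k) − u(k−1))||² implied by the description of W and Q above:

```python
import numpy as np

def optimal_u(A1, B0, W, Q, y, r, u_prev):
    """u*(k) minimizing J = ||W(y(k+1) - r)||^2 + ||Q(u(k) - u_prev)||^2
    under the one-step model y(k+1) = A1 y(k) + B0 u(k),
    obtained by setting dJ/du(k) = 0."""
    WB = W @ B0
    H = WB.T @ WB + Q.T @ Q                       # positive definite if Q has full rank
    g = WB.T @ (W @ (r - A1 @ y)) + Q.T @ Q @ u_prev
    return np.linalg.solve(H, g)

# Illustrative 2x2 example (all matrices and signals are hypothetical).
A1 = 0.5 * np.eye(2)
B0 = np.array([[1.0, 0.2], [0.1, 0.8]])
W, Q = np.diag([1.0, 2.0]), 0.3 * np.eye(2)
y, r, u_prev = np.array([0.4, 0.6]), np.array([0.5, 0.5]), np.array([0.2, 0.2])
u_star = optimal_u(A1, B0, W, Q, y, r, u_prev)
```

At the returned u_star the gradient of J vanishes, which is exactly the condition ∂J/∂u(k) = 0 used in the text.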
Our controller design has been validated on a two-node testbed hosting two in-
stances of the RUBiS application [1], an online auction benchmark. We use a two-
tier implementation consisting of an Apache web server and a MySQL database
(DB) server. Each application tier is hosted in a Xen virtual machine. The web
node is used to host two web tiers, and the DB node is used to host two DB
tiers. For this application, CPU is the only potential resource bottleneck. We use the
credit-based CPU scheduler in the hypervisor of Xen 3.0.3 unstable branch [7] as
the actuator in our control loop. It implements proportional fair sharing of the CPU
capacity among multiple virtual machines.
We choose a control interval of 20 seconds, which offers a good balance be-
tween responsiveness of the controller and predictability of the measurements. For
each RUBiS application i, we use mean response time per interval (RTi (k)) as the
QoS metric, and the normalized RT ratio, y(k) = RT1 (k)/(RT1 (k) + RT2 (k)), as the
measured output. The reference input, r(k), indicates the desired level of QoS dif-
ferentiation between the two applications. Note that both y(k) and r(k) are scalars
in this example.
In the first experiment, we varied the reference input, r(k), from 0.3 to 0.5 then
to 0.7. Each reference value was used for a period of 60 control intervals.
Figure 7.10(a) shows the measured per-interval throughput in requests per second
(top) and the mean response time in seconds (middle) for the two applications, as
well as the normalized RT ratio y(k) against the reference input r(k) (bottom) over
a period of 180 control intervals (one hour). The vertical dashed lines indicate the
two step changes in the reference input. As we can see, the measured output was
able to track the changes in the reference input fairly closely. The performance of
both applications also behaved as we expected. For example, an r(k) value of 0.3
gave preferential treatment to application 1, where application 1 achieved higher
throughput and lower average response time than application 2 did. When r(k) was
set at 0.5, both applications achieved comparable performance. Finally, as r(k) was
increased to 0.7, application 2 was able to achieve a higher level of performance
than application 1 did, which was consistent with our expectation.
Figure 7.10(b) shows the corresponding CPU entitlements and resulting CPU
consumptions of individual application tiers. As we can see, as r(k) went from 0.3
to 0.5 to 0.7, our controller allocated less and less CPU capacity to both tiers in
application 1, and more CPU capacity to application 2.
In the second experiment, we fixed the target RT ratio at r(k) = 0.7, and varied
the intensity of the workload for application 1 from 300 to 500 concurrent users.
This effectively created varying resource demands in both tiers of application 1. Ex-
perimental results showed that the controller was able to allocate the CPU capacity
on both nodes accordingly, and always maintained the normalized RT ratio near the
reference value, in spite of the change in the workload.
In this section, we described how control theory can be applied to the design
of automated workload management solutions for a virtualized data center. In par-
ticular, as one or more virtualized servers become overloaded, our controller can
dynamically reallocate the shared server resources to individual application tiers so
that a specified level of QoS differentiation is maintained.
Fig. 7.10 (a) Throughput, mean response time, and normalized RT ratio of the two applications under step changes in the reference input; (b) CPU entitlements and consumptions of the Web and DB tiers of applications 1 and 2
The following case study is motivated by the importance of energy saving in multi-
tier Web server farms. In large server farms, it is reported that 23-50% of the revenue
is spent on energy [13, 6]. In order to handle peak load requirements, server farms
are typically over-provisioned based on offline analysis. A considerable amount
of energy can be saved by reducing resource consumption during non-peak condi-
tions. Significant research efforts have been expended on applying dynamic voltage
scaling (DVS) to computing systems in order to save power while meeting time or
performance constraints [13, 6, 12, 28, 27, 33].
In this section, we describe adaptive techniques for energy management in server
farms based on optimization and feedback control. We specifically illustrate the im-
portance of joint adaptation. We show that in large-scale systems, the existence of
several individually stable adaptive components may result in a collectively unstable
system. For example, a straightforward combination of two energy-saving policies
may result in a larger energy expenditure than that with either policy in isolation. We
illustrate this problem by exploring a combination of a DVS policy (that controls fre-
quency, f, of machines in a server farm given their delay D¹) and an independently
designed machine On/Off policy (that increases the number of machines m in the
server farm when the delay is increased and removes machines when the delay is
decreased). We then provide a solution to avoid the unstable interaction between the
two policies.
Figure 7.11 shows experimental results from a three-tier Web server farm testbed.
Four different energy saving configurations are compared: the On/Off policy, the
DVS policy, the combination of On/Off + DVS (exhibiting adverse interaction) and
finally an optimized policy that we explain later in this section. It is clearly demon-
strated that when the workload increases, the combined On/Off + DVS policy spends
much more energy than all other policies.
The adverse interaction arises because the DVS policy reduces the frequency of a
processor, increasing system utilization, which increases end-to-end delay, causing
the On/Off policy to turn more machines on.
In this section, we describe how to design feedback control mechanisms that are
free of adverse interactions, optimize energy and respect end-to-end resource and
timing constraints. Our solution methodology is divided into three steps:
1. Formulate the optimization problem: Optimization is performed with respect
to the available feedback control knobs subject to (i) resource constraints, and (ii)
end-to-end timing constraints:
1 Observe that changing the frequency of a processor also changes the associated core voltage. There-
fore, we interchangeably use changing frequency (level) and changing DVS (level) throughout
this paper.
Fig. 7.11 Comparison of total system power consumption for different adaptive policies in the
Web server case study.
min_{x_1,...,x_n} f(x_1,...,x_n)
subject to g_j(x_1,...,x_n) ≤ 0,  j = 1,...,m,    (7.20)
where f is the common objective function2; g_j(·), j = 1,...,m, are the resource and
performance constraints related to the application. Introducing Lagrange multipliers
λ_1,...,λ_m, the Lagrangian of the problem is given as:
L(x_1,...,x_n, λ_1,...,λ_m) = f(x_1,...,x_n) + λ_1 g_1(x_1,...,x_n) + ... + λ_m g_m(x_1,...,x_n)    (7.21)
The solution defines a locus of candidate optimal points. A series of feedback loops is then used to traverse that locus in search of a
maximum utility point.
The necessary conditions of optimality are derived by relaxing the original prob-
lem (i.e., where knob settings are discrete) into a continuous problem (where knob
settings are real numbers and functions g(·) and f(·) are differentiable), then using
the Karush-Kuhn-Tucker (KKT) optimality conditions [5], for all i = 1,...,n:

∂f(x_1,...,x_n)/∂x_i + Σ_{j=1}^{m} λ_j ∂g_j(x_1,...,x_n)/∂x_i = 0    (7.22)

Let us call the left-hand side φ_{x_i}. Observe that we have the necessary condition:

φ_{x_1} = φ_{x_2} = ... = φ_{x_n}    (7.23)
We then use a feedback control approach to find the maximum utility point on
the locus that satisfies Equation (7.23). Our feedback control approach is de-
scribed next. We find it useful for the discussion below to also define the average
φ_avg = (φ_{x_1} + ... + φ_{x_n})/n. This average at time k will serve as the set point r(k) for
each individual φ_{x_i}.
∂L/∂U_i = −3 A_i α_i³/(m_i² U_i⁴) + λ_1 m_i/(1 − U_i)² = 0,   for all i,

∂L/∂m_i = −2 A_i α_i³/(m_i³ U_i³) + B_i + λ_1 U_i/(1 − U_i) + λ_2 = 0,   for all i,

λ_1 (Σ_{i=1}^{3} m_i U_i/(1 − U_i) − K) = 0,

λ_2 (Σ_{i=1}^{3} m_i − M) = 0.    (7.26)
Solving for λ_1 and λ_2 then substituting in the first two sets of equations above, we
get after some rearranging:

To simplify the notation, we will use φ(m_i, U_i) to denote α_i⁴(1 − U_i)²/(m_i³ U_i⁴) in the following
discussion. Then the necessary condition for optimality is expressed as
3. Feedback control: It can be easily seen from the necessary condition that,
for slowly varying α_i and m_i, the value of φ(m_i, U_i) will increase as U_i
decreases. On the other hand, φ(m_i, U_i) will decrease if U_i increases. From this,
we can deduce that a smaller value of φ(m_i, U_i) indicates that tier i is overloaded
and, similarly, a larger value of φ(m_i, U_i) indicates that tier i is underloaded. Based
on this observation, we can design a feedback loop in which the utilization and
the number of machines are adjusted (using traditional control-theoretic analysis
techniques described earlier in this tutorial) in the direction that reduces error (i.e.,
enforces Equation (7.28)) while minimizing the energy objective function.
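The feedback step can be sketched in a few lines. The sketch below assumes the reconstructed form φ(m_i, U_i) = α_i⁴(1 − U_i)²/(m_i³ U_i⁴); the per-tier values of α_i, m_i, and U_i, the gain, and the iteration count are all illustrative.

```python
import numpy as np

def phi(alpha, m, U):
    """phi(m_i, U_i) = alpha_i^4 (1 - U_i)^2 / (m_i^3 U_i^4);
    decreases as U_i grows, for fixed alpha and m."""
    return alpha**4 * (1 - U)**2 / (m**3 * U**4)

# Minimal feedback sketch: nudge each tier's utilization so that all phi
# values approach their mean, enforcing the equal-phi optimality condition.
alpha = np.array([1.0, 1.2, 0.9])   # per-tier load terms (hypothetical)
m = np.array([4.0, 3.0, 5.0])       # machines per tier (held fixed here)
U = np.array([0.3, 0.6, 0.5])       # initial utilizations (hypothetical)

def spread(p):
    return p.std() / p.mean()       # coefficient of variation of the phi values

initial = spread(phi(alpha, m, U))
for _ in range(300):
    p = phi(alpha, m, U)
    err = (p - p.mean()) / p.mean()          # high phi => tier is underloaded
    U = np.clip(U + 0.02 * err, 0.05, 0.95)  # raise its utilization, and vice versa
final = spread(phi(alpha, m, U))
```

In a real deployment the utilization adjustment would be actuated through task rates and machine On/Off decisions, with the gain chosen using the control-theoretic analysis techniques described earlier.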
7.6.2 Evaluation
Next, we evaluate five different energy saving approaches: a baseline (no power
management), the Linux On-demand governor [25], and the three control algo-
rithms mentioned above (the Feedback DVS, the Feedback On/Off, and the Feed-
back On/Off & DVS). For the baseline, we set the CPU frequency to the maximum
on all machines. For each test run, 2500 seconds of TPC-W workload are applied,
with a 300-second ramp-up period, a 2000-second measurement interval, and fi-
nally a 200-second ramp-down period. The TPC-W benchmark generates requests
by starting a number of emulated browsers (EBs). We used the shopping mix work-
load consisting of 80% browsing and 20% ordering, which is considered the most
representative mix by the Transaction Processing Performance Council [30]. The user think
time was set to 1.0 seconds. We used 450 ms as the delay set-point for all experi-
Fig. 7.12 Other Metrics: Average Delay, Deadline Miss Ratio, and Throughput
ments. The delay set-point is computed such that if the average delay is kept around
or below it, the miss ratio of the latency constraint is maintained at or below 0.1,
assuming that the end-to-end delay follows an exponential distribution. Figure 7.11
shows that our approach reduces energy consumption (baseline and Linux gover-
nor are not shown). Figure 7.12(a) depicts the average delay of the five algorithms.
Figure 7.12(b) depicts throughput.
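Under the stated exponential assumption, the set-point computation is a one-liner; the latency constraint L below is back-derived from the 450 ms set-point and is therefore an assumption, not a value given in the text.

```python
import math

# Assumption: end-to-end delay is exponentially distributed with mean s.
# Then the miss ratio for a latency constraint L is P(delay > L) = exp(-L / s),
# and requiring exp(-L / s) <= p gives the largest safe mean-delay set-point
#   s = L / ln(1 / p).
p = 0.1                  # target deadline-miss ratio
L = 0.45 * math.log(10)  # latency constraint (seconds) implied by the 450 ms set-point

setpoint = L / math.log(1.0 / p)   # largest mean delay meeting the miss-ratio target
```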
Current trends in computing systems are challenging our ability to engineer systems
that adapt quickly to changes in workloads and resources. Examples addressed in
this paper include: self-tuning memory management in database systems that adapts
to changes in queries and disk contention, dynamic control of resources in real-time
embedded systems that control variations in task resource demands to meet real
time objectives, adapting CPU allocations of virtualized servers in data centers in
response to variations in the user requests, and addressing interactions between con-
trol loops for power management in response to workload variations. Such adapta-
tion is usually addressed by building a closed loop system that dynamically adjusts
resource allocations and other factors based on measured outputs. Control theory
provides a formal approach to designing closed loop systems that is used in many
other fields such as mechanical engineering, electrical engineering, and economics.
This paper provides a brief introduction to key concepts and techniques in control
theory that we have found valuable in the design of closed loops for computing
systems. There has been considerable success to date with applying control theory
to computing systems, including impact on commercial products from IBM, Hewlett
Packard, and Microsoft. However, many research challenges remain. Among these
are the following.
7 Introduction to Control Theory And Its Application to Computing Systems 213
Benchmarks for assessing closed loop designs. While there are well established bench-
marks for steady state workloads of web servers, database systems, and other
widely used applications, assessing the ability of closed loop systems to adapt to
changes in workloads and resources requires the characterizations of transients.
Examples of such characterizations include the magnitude of changes in arrival
rates and/or service times, how quickly changes occur, and how long they persist.
Further, we need efficient ways to generate such workload dynamics that permit
the construction of low cost, low noise benchmarks. Good insights into workload
characteristics will allow us to incorporate more sophisticated techniques, such
as model-based predictive control, which is discussed in Section 4.
Control patterns for software engineering. To make control design accessible to
software practitioners, we need a set of control patterns that provide a con-
venient way to engineer resource management solutions that have good con-
trol properties. By good control properties, we mean considerations such as the
SASO properties (stability, accuracy, settling time, and overshoot) discussed in
Section 2. Two starting points for such patterns are contained in this paper: self-
tuning memory in Section 3, which shows how to use control theory to do load
balancing, and the optimal design of interacting control loops in Section 6.
Scalable control design for distributed systems. Traditionally, control engineer-
ing deals with complex systems by building a single Multiple Input, Multiple
Output closed loop. This approach scales poorly for enterprise software systems
because of the complexity and interactions of components. Helpful here are de-
composition techniques such as those in Section 5 that address virtualized servers
for enterprise computing.
Analysis tools to address interactions between control loops. Feedback control
introduces a degree of adaptive behavior into the system that complicates the con-
struction of component based systems. Analysis tools are needed to understand
and quantify the side-effects of interactions between individually well-optimized
components, as well as any emergent behavior that results from component com-
positions.
Dynamic verification of design assumptions. Feedback loops make assumptions
about causal relations between system variables, such as an admission controller
assuming that request rate and utilization change in the same direction. There is
considerable value in dynamically verifying design assumptions. For example,
one could have a performance assert statement that tests that system variables
change in the expected direction in relation to one another. When violations of
these assumptions are detected, appropriate actions must be taken.
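Such a performance assert can be sketched as a sign-agreement check over a sampling window; the function name and the agreement threshold are invented for illustration.

```python
def same_direction(xs, ys, min_agreement=0.8):
    """Performance assert: check that two system variables (e.g., request
    rate and utilization) change in the same direction over a window.
    Returns True when the design assumption holds; a False result would
    trigger an appropriate corrective action."""
    dx = [b - a for a, b in zip(xs, xs[1:])]
    dy = [b - a for a, b in zip(ys, ys[1:])]
    # Only intervals where both variables actually moved carry evidence.
    moves = [(a, b) for a, b in zip(dx, dy) if a != 0 and b != 0]
    if not moves:
        return True
    agree = sum(1 for a, b in moves if (a > 0) == (b > 0))
    return agree / len(moves) >= min_agreement
```

For example, a window where request rate and utilization rise and fall together passes the check, while a window where utilization moves opposite to the request rate fails it.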
Control of multiple types of resources. Most of the existing applications of con-
trol theory deal with one resource type, for instance, memory in Section 3, and
CPU in Sections 4 and 5. In practice, the performance of applications running in
computing systems depends on multiple resources, such as CPU, memory, net-
work bandwidth and disk I/O. From a control perspective this creates challenges
with interactions between multiple controllers and target systems with different
time constants, delay characteristics, and software interfaces.
References
20. L. Ljung. System Identification: Theory for the User. Prentice Hall, Upper Saddle River, NJ,
second edition, 1999.
21. C. Lu, J. A. Stankovic, T. F. Abdelzaher, G. Tao, S. H. Son, and M. Markley. Performance
specifications and metrics for adaptive real-time systems. In Proceedings of the IEEE Real
Time Systems Symposium, Orlando, 2000.
22. C. Lu, X. Wang, and X. Koutsoukos. Feedback utilization control in distributed real-time
systems with end-to-end tasks. IEEE Transactions on Parallel and Distributed Systems,
16(6):550–561, 2005.
23. J. Maciejowski. Predictive Control with Constraints. Prentice Hall, 1st edition, 2002.
24. K. Ogata. Modern Control Engineering. Prentice Hall, 3rd edition, 1997.
25. V. Pallipadi and A. Starikovskiy. The ondemand governor. In Proceedings of the Linux Sym-
posium, volume 2, 2006.
26. S. Parekh, N. Gandhi, J. Hellerstein, D. Tilbury, J. Bigus, and T. S. Jayram. Using control
theory to achieve service level objectives in performance management. Real-Time Systems
Journal, 23:127–141, 2002.
27. P. Pillai and K. G. Shin. Real-time dynamic voltage scaling for low-power embedded oper-
ating systems. In SOSP '01: Proceedings of the 18th ACM Symposium on Operating Systems
Principles, pages 89–102, New York, NY, USA, 2001. ACM Press.
28. V. Sharma, A. Thomas, T. Abdelzaher, K. Skadron, and Z. Lu. Power-aware QoS management
in Web servers. In RTSS '03: Proceedings of the 24th IEEE International Real-Time Systems
Symposium, page 63, Washington, DC, USA, 2003. IEEE Computer Society.
29. J.-J. E. Slotine and W. Li. Applied Nonlinear Control. Prentice-Hall, 1991.
30. Transaction Processing Performance Council. TPC Benchmark W (Web Commerce).
31. X. Wang, Y. Chen, C. Lu, and X. Koutsoukos. FC-ORB: A robust distributed real-time em-
bedded middleware with end-to-end utilization control. Journal of Systems and Software,
80(7):938–950, 2007.
32. W. Xu, X. Zhu, S. Singhal, and Z. Wang. Predictive control for dynamic resource allocation in
enterprise data centers. In Proceedings of the IEEE/IFIP Network Operations & Management
Symposium, Apr. 2006.
33. W. Yuan and K. Nahrstedt. Energy-efficient soft real-time CPU scheduling for mobile multi-
media systems. In SOSP '03: Proceedings of the 19th ACM Symposium on Operating Systems
Principles, pages 149–163, New York, NY, USA, 2003. ACM Press.
34. X. Zhu, Z. Wang, and S. Singhal. Utility driven workload management using nested control
design. In Proceedings of the American Control Conference, June 2006.