Design of High-Speed and Low-Power Finite-Word-Length PID Controllers
Abstract— ASIC or FPGA implementation of a finite word-length PID controller requires expertise in both control systems and hardware design. In this paper, we only focus on the hardware side of the problem. We show how to design configurable fixed-point PIDs to satisfy applications requiring minimal power consumption, or high control-rate, or both together. As the multiply operation is the engine of PID, we experimented with three algorithms: Booth, modified Booth, and a new recursive multi-bit multiplication algorithm. The latter enables the construction of finely grained PID structures with bit-level and unit-time precision. This feature makes it possible to tailor the PID to the desired performance and power budget. All PIDs are implemented at RTL level as technology-independent reusable IP-cores. They are reconfigurable according to two compile-time constants: set-point word-length and latency. To make PID design easily reproducible, all necessary implementation details are provided and discussed.

Index Terms— Design-Reuse, Embedded Finite-Word-Length (FWL) Controllers, Intellectual Property (IP), Linear Time Invariant (LTI) Systems, Low-Power and Speed Optimization, Proportional-Integral-Derivative (PID)

I. BACKGROUND AND MOTIVATION
The PID is by far the most commonly used feedback controller due to its simple structure and robust performance [1]. An important feature of this controller is that it does not require a precise analytical model of the system that is being controlled, which makes it very attractive for a large class of dynamic systems. While PID is well adapted for linear-time-invariant (LTI) systems [2], it is powerless for non-LTI ones. Nevertheless, some solutions exist, such as partitioning the non-LTI control algorithm into a linear portion and a non-linear portion [3][4][5]. The linear portion represents the major control loop and is computed using an integrated PID, while the non-linear portion, which acts as dynamic compensation to the linear one, is performed in software using a general-purpose microprocessor or a DSP.
In embedded control applications, such as small-scale mobile robots, the control-loop-cycle is very tight and the power budget is very limited. A low sample rate leads to poor and degraded control-performance, and high power consumption shortens the battery lifetime. To cope with these two severe and antagonistic constraints, the need for a PID structure that is both high-speed and low-power is of utmost importance.
Today, design-reuse [6] is a well established design standard that helps designers cope with rapid technology changes and increasing design complexity. It consists of using predesigned, technology-independent, generic and reconfigurable IP-cores [7], most generally implemented at register-transfer-level (RTL).
However, at RTL abstraction level, no significant optimization can be achieved unless it is undertaken at the architectural and especially the algorithmic level. To achieve such a goal, a deep insight into PID arithmetic is necessary.
At this stage, the choice of a numeric representation format is a crucial issue. Compared to floating-point, fixed-point format is the best candidate for optimized designs as it is much simpler to implement, faster, power-efficient and requires far fewer hardware resources. However, the limited dynamic range can be a source of control instability. This problem, referred to as the finite-word-length (FWL) effect, is an active research area that aims to shorten the floating-to-fixed point conversion time while preserving control performances [8][9].
The digital implementation of PID controllers went through several stages of evolution, initially dominated by the use of commercial-off-the-shelf (COTS) components and DSP. But over the past few years, FPGAs have brought a key advantage to digital control: the inherent parallelism of FPGA architecture allows many independent control loops to run at different deterministic rates without relying on shared resources that might slow down their responsiveness, as in the case of COTS and DSP [10][11].
Recent PID related works can be classified into three categories. The biggest one includes works that are straightforward FPGA implementations targeting specific applications: DC-DC converter [12], temperature control [13], motor multi-axis control [14], liquid level control [15], and Xilinx versus Altera FPGA implementation for result comparison [16]. The second category proposes methodologies that analyze the FWL effect on PID controller in order to reduce the number of hardware resources [17][18]. And finally the third category, paradoxically the smallest one despite the large popularity of PID, comprises architecture-optimization works. In [19] low-power serial and parallel multiple-channel PID architectures are proposed for small mobile robots. In this work, the optimization was carried out at macro-level considering several PIDs, rather than at micro-level (optimization of the PID itself). Nevertheless, the whole architecture will deliver much more interesting results if combined with an optimized PID. The second work [20] proposes serial, parallel, and mixed PID architectures incorporating different numbers (1 to 3) of multiplication cores. High power consumption, even with the serial architecture, and a complex control-part are the two major shortcomings of this proposal. Finally, in [21] an attractive optimized PID structure based on distributed arithmetic (DA) is presented. Although the latter exhibits interesting results in terms of resource utilization and power consumption, it suffers from three serious drawbacks: high latency (n+1 clock-cycles for n bit set-point word-length), dependence on FPGA technology as it is essentially based upon FPGA look-up-tables (LUTs),
and inability to handle time-varying PID parameters, since they are precomputed and stored in LUTs. Nevertheless, it is considered as a reference design against which the obtained results are compared under the same conditions.
The objective of this paper is to design optimized FWL-PID structures that overcome all above-mentioned shortcomings, and which are especially dedicated to embedded control applications. The PID cores are described at RTL level. They are highly reconfigurable and technology-independent, offering the possibility to be mapped both on FPGA and ASIC. Two discrete PID forms are considered: the commercial form [22], also called the standard or ISA form, and the incremental form. These two forms went through three successive types of FPGA implementations, using: Booth multiplication algorithm (BMA) [23], modified Booth multiplication algorithm (MBMA) [24], and a newly developed version called recursive multibit recoding multiplication algorithm (RMRMA) [25]. Results show gradual improvements with clear superiority over those provided in [21]. PID control-rate and energy-consumption savings are respectively as follows: 32% and 25% with BMA, 177% and 23% with MBMA, 431% and 20% with RMRMA.
Our previous paper [26] introduced a limited design-space of PID. In this paper, we extended the design-space to accommodate different application cases and provided all necessary implementation details to make the design easily reproducible.
The paper is organized as follows. In this section we outlined the main requirement specifications for embedded PID controller. Section II introduces the two mostly-used discrete versions of PID algorithm. Section III, IV and V deal with BMA, MBMA and RMRMA implementations, respectively. A discussion around the obtained results is given in section VI. Section VII describes the verification method, while Section VIII shows how the FWL-effect is tackled. And finally some concluding remarks in Section XI.

II. THE TWO MOSTLY-USED DISCRETE VERSIONS OF PID
A typical closed-loop system using a PID controller is shown in Fig. 1, where uc(k), y(k), and u(k) are the discrete signal quantities at the kth sampling instant of the reference set-point, the process-feedback measured output, and the PID controller output, respectively.

[Block diagram: uc(k) and y(k) reach the PID controller through the input interface; u(k) drives the process under control through the output interface.]
Fig. 1. Typical closed-loop control system using a PID

In digital control, commercial and incremental forms are the two mostly-used discrete PID versions [1][22]. They are denoted by recurrent equations (1) and (2), respectively, and their corresponding coefficients are grouped in Table I.

$u(k) = P(k) + I(k) + D(k)$    (1)

where $P(k) = A \cdot u_c(k) + B \cdot y(k)$; $I(k) = I(k-1) + C \cdot e(k-1)$; $D(k) = H \cdot D(k-1) + L \cdot f(k)$, with $e(k-1) = u_c(k-1) - y(k-1)$ and $f(k) = y(k) - y(k-1)$.

$u(k) = u(k-1) + A \cdot e(k) + B \cdot e(k-1) + C \cdot e(k-2)$    (2)

with $e(k) = u_c(k) - y(k)$; $e(k-1) = u_c(k-1) - y(k-1)$; $e(k-2) = u_c(k-2) - y(k-2)$.

Equations (1) and (2) are fully detailed in the Appendix.

TABLE I
COEFFICIENTS OF DISCRETE RECURRENT EQUATIONS
Coefficient | Commercial PID | Incremental PID
A | $K_p \cdot b$ | $K_p \left(1 + \frac{T_s}{T_i} + \frac{T_d}{T_s}\right)$
B | $-K_p$ | $-K_p \left(1 + 2\,\frac{T_d}{T_s}\right)$
C | $K_p \frac{T_s}{T_i}$ | $K_p \frac{T_d}{T_s}$
H | $\frac{T_d}{T_d + N\,T_s}$ | _
L | $-\frac{K_p\,T_d\,N}{T_d + N\,T_s}$ | _
Kp is the proportional gain; Ti and Td are the integral and derivative times, respectively; N is the maximum derivative gain; b is the fraction of set-point in proportional term; and Ts is the sampling period.

To satisfy different application cases, two IP versions are developed for each equation: with constant coefficients and with varying coefficients (Fig. 2). The latter requires a host side interface (HSI) to handle the runtime change of the coefficients.

[Block diagrams of the four IP-cores. PID1: inputs Mode, uc(k), y(k), coefficients A, B, C, H, L, Ck, Reset; outputs u(k), Done. PID2: same data ports plus an HSI with Din, Adr, Rw, Cs. PID3: inputs uc(k), y(k), coefficients A, B, C, Ck, Reset; outputs u(k), Done. PID4: same data ports plus an HSI with Din, Adr, Rw, Cs.]
Fig. 2. Various PID IP-cores. (a) commercial PID with constant coefficients; (b) commercial PID with time varying coefficients; (c) incremental PID with constant coefficients; (d) incremental PID with time varying coefficients.

The commercial version allows the three standard PID functioning modes (P, PI, PID) according to the Mode input value. At the end of the u(k) computation, the Done output signal toggles during one clock cycle, and the PID enters sleep mode (whole internal activity stopped except for clocking and HSI) for maximum energy conservation.
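As a floating-point reference sketch of equations (1) and (2) with the coefficients of Table I, the following Python model may help when checking a fixed-point implementation against expected values. The function and class names are illustrative assumptions and are not part of the IP-cores; all past samples are assumed to be zero at reset.

```python
def commercial_coeffs(kp, ti, td, n_gain, b, ts):
    """Coefficients A, B, C, H, L of the commercial form (Table I)."""
    a = kp * b
    b_coef = -kp
    c = kp * ts / ti
    h = td / (td + n_gain * ts)
    l = -kp * td * n_gain / (td + n_gain * ts)
    return a, b_coef, c, h, l


def incremental_coeffs(kp, ti, td, ts):
    """Coefficients A, B, C of the incremental form (Table I)."""
    a = kp * (1 + ts / ti + td / ts)
    b_coef = -kp * (1 + 2 * td / ts)
    c = kp * td / ts
    return a, b_coef, c


class CommercialPID:
    """Equation (1): u(k) = P(k) + I(k) + D(k)."""

    def __init__(self, coeffs):
        self.a, self.b, self.c, self.h, self.l = coeffs
        self.i_prev = 0.0  # I(k-1), assumed zero at reset
        self.d_prev = 0.0  # D(k-1), assumed zero at reset
        self.e_prev = 0.0  # e(k-1), assumed zero at reset
        self.y_prev = 0.0  # y(k-1), assumed zero at reset

    def step(self, uc, y):
        p = self.a * uc + self.b * y
        i = self.i_prev + self.c * self.e_prev
        d = self.h * self.d_prev + self.l * (y - self.y_prev)  # f(k) = y(k) - y(k-1)
        self.i_prev, self.d_prev = i, d
        self.e_prev, self.y_prev = uc - y, y
        return p + i + d


class IncrementalPID:
    """Equation (2): u(k) = u(k-1) + A e(k) + B e(k-1) + C e(k-2)."""

    def __init__(self, coeffs):
        self.a, self.b, self.c = coeffs
        self.u_prev = 0.0  # u(k-1)
        self.e1 = 0.0      # e(k-1)
        self.e2 = 0.0      # e(k-2)

    def step(self, uc, y):
        e = uc - y
        u = self.u_prev + self.a * e + self.b * self.e1 + self.c * self.e2
        self.u_prev, self.e2, self.e1 = u, self.e1, e
        return u
```

With a constant unit error, the incremental model reproduces the positional sum term by term, which gives a quick sanity check on the signs of A, B and C.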
III. BMA BASED PID
A straightforward parallel implementation of PID requires 7 adders/subtractors and 5 multiplication cores for equation (1), and 4 adders/subtractors and 3 multiplication cores for equation (2). In digital hardware, the total gate count scales linearly with word length for an adder core, while it scales quadratically for a multiplier core. Thus, any effort for a low-power optimization of PID must be focused on the implementation of the multiply-and-accumulate (MAC) function (X.Y) [27]. In this work, the optimization effort is rather concentrated on the double MAC function (X.Y+T.Z) called DMAC, considered as the main building block of our PID structures. Equations (1) and (2) are partitioned accordingly.
For FWL-PID, two's complement fixed-point representation is used, which is habitually expressed in Q notation as Qni.nf. The values are coded in ni bits before the point (integer word length including 1 sign bit), and nf bits after the point (fractional word length). The total word [...] multiplication. Let Y be the multiplier:

$Y = -y_{n-1}\,2^{n-1} + \sum_{j=0}^{n-2} y_j\,2^j$    (3)

Equation (3) can also be expressed as follows:

$Y = \sum_{j=0}^{n-1} (y_{j-1} - y_j)\,2^j = \sum_{j=0}^{n-1} Q_j\,2^j$, with $y_{-1} = 0$    (4)

[...] in common. Thus, the DMAC becomes:

$X \cdot Y + T \cdot Z = \sum_{j=0}^{n-1} (Q_j \cdot X)\,2^j + \sum_{j=0}^{n-1} (P_j \cdot T)\,2^j$    (5)

$= \sum_{j=0}^{n-1} \left[ Q_j \cdot X + P_j \cdot T \right] 2^j$    (6)

[Block diagrams of the DMAC and of the optimized DMAC (ODMAC): n-bit inputs X, Y, T, Z; multiplexers driven by (yj-1, yj) and (zj-1, zj) select the partial products Qj.X and Pj.T among "0", X and "0", T; adders with carry-in (Cin); an output register (Reg) holds X.Y+T.Z on 2n+1 bits, for j = 0, n-1. Captions not recoverable.]

[...] their implementation results (Table III) are respectively compared to those of [21]. Comparison was made under identical conditions using the same FPGA device (Spartan XC2S50E-7FT256), although relatively old, as well as the same synthesis-tool version (Xilinx ISE 9.1i). In [21], only a 16-bit word-length commercial version with constant coefficients (without HSI) is implemented. PID1 and PID3 exhibit interesting results: 44%, 25%, and 32% savings and 62%, 35%, and 38% savings in terms of gate count, power, and speed, respectively. PID3 exhibits higher savings but at the expense of control-quality. Latency is the same (17), which is n+1 clock cycles, for all designs (PIDX). Optimizing latency without sacrificing the three other issues is the main objective of the next two sections.
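The Booth recoding of equations (4) to (6) can be checked numerically. The sketch below is a bit-accurate software model only, not the RTL: it assumes n-bit two's complement operands, takes y(-1) = 0, and uses illustrative function names that do not come from the paper.

```python
def twos_complement_bits(value, n):
    """Bits y_0 .. y_{n-1} of an n-bit two's complement integer (LSB first)."""
    if not -(1 << (n - 1)) <= value < (1 << (n - 1)):
        raise ValueError("value does not fit in n bits")
    return [(value >> j) & 1 for j in range(n)]


def booth_digits(value, n):
    """Radix-2 Booth digits Q_j = y_{j-1} - y_j of equation (4), with y_{-1} = 0."""
    bits = twos_complement_bits(value, n)
    digits = []
    previous = 0  # y_{-1}
    for bit in bits:
        digits.append(previous - bit)
        previous = bit
    return digits


def dmac_booth(x, y, t, z, n):
    """X.Y + T.Z accumulated over n steps as in equation (6)."""
    q = booth_digits(y, n)
    p = booth_digits(z, n)
    acc = 0
    for j in range(n):
        # One step adds the shared partial product (Q_j.X + P_j.T) shifted by j.
        acc += (q[j] * x + p[j] * t) << j
    return acc
```

Each loop iteration corresponds to one accumulation step, which is consistent with a latency that grows with n for the BMA-based designs.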
[Block diagrams of the PID architectures, built from ODMAC and MAC blocks, registers (Reg) and adders. Visible signals: uc(k), y(k), e(k-1), f(k), D(k-1), I(k-1), I(k), P(k), coefficients A, B, C, and the output u(k) on 2n+log2(r)+2 bits. Only the caption of Fig. 6 is recoverable.]
Fig. 6. Commercial PID architecture

TABLE III
IMPLEMENTATION RESULT COMPARISON OF BMA-BASED PID
PID Core | Total Gate Count | Power* (mW) | Max. Clock Freq. (MHz) | Latency
PID [21] | 16728 | 456 | 47 | 17
PID1 | 9286 (44%) | 342 (25%) | 62 (32%) | 17
PID2 | 10661 (36%) | 359 (21%) | 61 (30%) | 17
PID3 | 6337 (62%) | 297 (35%) | 65 (38%) | 17
PID4 | 7168 (57%) | 308 (32%) | 62 (32%) | 17
* : Dynamic power consumption at 47MHz; (XX%): saving

IV. MBMA BASED PID
Equation (3) can also be rewritten as follows [24]:

$Y = \sum_{j=0}^{(n/2)-1} (y_{2j-1} + y_{2j} - 2\,y_{2j+1})\,2^{2j} = \sum_{j=0}^{(n/2)-1} Q_j\,2^{2j}$    (7)

DMAC equation becomes:

$X \cdot Y + T \cdot Z = \sum_{j=0}^{(n/2)-1} \left[ Q_j \cdot X + P_j \cdot T \right] 2^{2j}$    (8)

Likewise, n/2 simple partial products are generated (Table IV). Since ODMAC is a reconfigurable RTL block, it is parameterized to suit equation (8). The new adapted ODMAC architecture is depicted in Fig. 7. The only difference is that Mux(8:1) are used instead of Mux(4:1), and a (<<2.j) hardwired shifter instead of (<<1.j). Compared to BMA based PID (Table V), MBMA based one (PID1) shows much more interesting results, since latency is divided by 2 while maintaining stable power consumption and speed. Control rate is drastically improved as it is equal to maximum clock frequency divided by latency. As the discrete commercial form (equation 1) can accommodate the three functioning modes, implementation of PID2 produced the following power consumption values at 47 MHz: 268 mW, 313 mW, and 366 mW for P, PI, and PID functioning modes, respectively.
In view of these improvements, one is encouraged to go further [24] in reducing latency by considering larger slices, such as:

$Y = \sum_{j=0}^{(n/3)-1} (y_{3j-1} + y_{3j} + 2\,y_{3j+1} - 2^2\,y_{3j+2})\,2^{3j} = \sum_{j=0}^{(n/3)-1} Q_j\,2^{3j}$    (9)

[...] circumvent this obstacle is the purpose of the next section.

TABLE IV
MODIFIED BOOTH ALGORITHM
Y2j+1 | Y2j | Y2j-1 | Operation
0 | 0 | 0 | +0
0 | 0 | 1 | +X
0 | 1 | 0 | +X
0 | 1 | 1 | +2X
1 | 0 | 0 | -2X
1 | 0 | 1 | -X
1 | 1 | 0 | -X
1 | 1 | 1 | -0

V. RMRMA BASED PID
Multiplication is a fundamental operation in digital design. Its speed and power requirements are two critical factors limiting the whole system performances (PID in our case). Since the publication of Booth's algorithm in 1951, a huge number of improvements have been proposed, especially after the publication of a generalized version of MBA algorithm accompanied with its proof [29]. Most of the proposals aimed to reduce the number of partial products either by employing digital optimization techniques [30][31][32] or by using larger slices (higher radices) [33]. However, experience showed [34] that beyond 4-bit slices (radix 8), the complexity to generate hard partial [...]
To circumvent the problem of hard partial products in higher radices, the idea proposed in [35] is to apply a recursive Booth recoding on the r-bit slice. While the idea is interesting, it relies upon a complicated mathematical formulation, leading to a complex control circuitry and especially to an exaggerated latency (2n/r).

TABLE V
IMPLEMENTATION RESULT COMPARISON OF MBMA-BASED PID
PID Core | Total Gate Count | Power* (mW) | Max. Clock Freq. (MHz) | Latency
PID [21] | 16728 | 456 | 47 | 17
PID1 | 10642 (36%) | 350 (23%) | 62 (32%) | 9 (47%)
PID2 | 11923 (29%) | 366 (20%) | 61 (30%) | 9 (47%)
PID3 | 7042 (58%) | 303 (33%) | 64 (38%) | 9 (47%)
PID4 | 7795 (53%) | 315 (31%) | 62 (32%) | 9 (47%)
* : Dynamic power consumption at 47MHz; (XX%): saving

According to the multibit recoding algorithm presented in [29], an n-bit two's complement operand Y can be written as:

$Y = \sum_{j=0}^{(n/r)-1} \left( y_{rj-1} + 2^0 y_{rj} + 2^1 y_{rj+1} + 2^2 y_{rj+2} + \cdots + 2^{r-2} y_{rj+r-2} - 2^{r-1} y_{rj+r-1} \right) 2^{rj} = \sum_{j=0}^{(n/r)-1} Q_j\,2^{rj}$    (10)
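The radix-4 recoding of equation (7), with the operation selection of Table IV, can be sketched in software as follows. This is a behavioural model under stated assumptions (even n, two's complement operands, y(-1) = 0); the dictionary and function names are illustrative and do not describe the Mux(8:1) hardware itself.

```python
# Table IV as a lookup: (y_{2j+1}, y_{2j}, y_{2j-1}) -> multiple of X to add.
TABLE_IV = {
    (0, 0, 0): 0, (0, 0, 1): 1, (0, 1, 0): 1, (0, 1, 1): 2,
    (1, 0, 0): -2, (1, 0, 1): -1, (1, 1, 0): -1, (1, 1, 1): 0,
}


def radix4_digits(value, n):
    """Modified Booth digits Q_j of equation (7) for an n-bit operand (n even)."""
    if n % 2 != 0:
        raise ValueError("n must be even")
    if not -(1 << (n - 1)) <= value < (1 << (n - 1)):
        raise ValueError("value does not fit in n bits")
    bits = [(value >> k) & 1 for k in range(n)]
    digits = []
    for j in range(n // 2):
        below = bits[2 * j - 1] if j > 0 else 0  # y_{-1} = 0
        digits.append(TABLE_IV[(bits[2 * j + 1], bits[2 * j], below)])
    return digits


def dmac_radix4(x, y, t, z, n):
    """X.Y + T.Z accumulated over n/2 steps as in equation (8)."""
    acc = 0
    for j, (qj, pj) in enumerate(zip(radix4_digits(y, n), radix4_digits(z, n))):
        acc += (qj * x + pj * t) << (2 * j)  # hardwired shift by 2j
    return acc
```

The loop runs n/2 times instead of n, which mirrors the halving of latency reported for the MBMA-based designs.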
[...] form:

$Y = \sum_{j=0}^{(n/r)-1} \Big[ (y_{rj-1} + y_{rj} - 2\,y_{rj+1})\,2^0 + (y_{rj+1} + y_{rj+2} - 2\,y_{rj+3})\,2^2 + \cdots + (y_{rj+r-5} + y_{rj+r-4} - 2\,y_{rj+r-3})\,2^{2(\frac{r}{2}-2)} + (y_{rj+r-3} + y_{rj+r-2} - 2\,y_{rj+r-1})\,2^{2(\frac{r}{2}-1)} \Big]\,2^{rj}$    (11)

$= \sum_{j=0}^{(n/r)-1} \left[ \sum_{i=0}^{(r/2)-1} (y_{rj-1+2i} + y_{rj+2i} - 2\,y_{rj+1+2i})\,2^{2i} \right] 2^{rj}$    (12)

$= \sum_{j=0}^{(n/r)-1} \left[ \sum_{i=0}^{(r/2)-1} Q_{ji}\,2^{2i} \right] 2^{rj}$    (13)

With $Q_{ji} \in \{-2, -1, 0, 1, 2\}$.
DMAC equation becomes:

$X \cdot Y + T \cdot Z = \sum_{j=0}^{(n/r)-1} \left[ \sum_{i=0}^{(r/2)-1} (Q_{ji} \cdot X + P_{ji} \cdot T)\,2^{2i} \right] 2^{rj}$    (14)

Depending on r value ranging from 2 to n, PIDs with [...]
[...] log2(r/2) adder levels and which is equal to D in the case of MBMA, is slightly increased D+log2(r/2). Note that we are using a logarithmic summation tree and not a linear one (CSA like).
An illustrative serial example with r=4 is described as follows:

$Y = \sum_{j=0}^{(n/4)-1} (y_{4j-1} + y_{4j} + 2\,y_{4j+1} + 2^2\,y_{4j+2} - 2^3\,y_{4j+3})\,2^{4j}$    (15)

$= \sum_{j=0}^{(n/4)-1} \left[ \sum_{i=0}^{1} (y_{4j-1+2i} + y_{4j+2i} - 2\,y_{4j+1+2i})\,2^{2i} \right] 2^{4j}$    (16)

$= \sum_{j=0}^{(n/4)-1} \left[ Q_{j0} + Q_{j1}\,2^2 \right] 2^{4j}$    (17)

$X \cdot Y + T \cdot Z = \sum_{j=0}^{(n/4)-1} \left[ (Q_{j0} X + P_{j0} T) + (Q_{j1} X + P_{j1} T)\,2^2 \right] 2^{4j}$    (18)

[Table fragment; title and header rows not recoverable. The four numeric columns follow the order used in Tables III and V.]
PID2_4 | 22962 (-37%) | 256 (-15%) | 43 (-08%) | 5 (+71%)
PID2_8 | 26073 (-56%) | 204 (+08%) | 37 (-21%) | 3 (+82%)
PID2_16 | 40327 (-141%) | 488 (-119%) | 23 (-51%) | 2 (+88%)
*: Dynamic power consumption at 23MHz; PIDY_X: X = r; (+AB%): saving; (-AB%): overhead

[Block diagram: n-bit inputs X, Y, T, Z; four multiplexers selecting among "0", X, 2X and "0", T, 2T, driven by the bits y4j-1 to y4j+3 and z4j-1 to z4j+3, produce (Qj0.X), (Pj0.T), (Qj1.X), (Pj1.T); the second pair is shifted by << 2; adders with carry-in (Cin) feed a << 4j shifter and an output register (Reg) holding X.Y+T.Z on 2n+2 bits, for j = 0, (n/4)-1.]
Fig. 9. Optimized DMAC architecture for r=4
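Equation (14) can be exercised with a short software model. The sketch below assumes that r is even and divides n, that operands are n-bit two's complement integers, and that any bit below position 0 is zero; the function names are hypothetical. It returns the number of outer iterations (n/r) next to the result, since that count is what the recursion trades against hardware.

```python
def rmrma_digits(value, n, r):
    """Digits Q_ji of equations (12)-(13): n/r slices of r/2 radix-4 digits."""
    if r % 2 != 0 or n % r != 0:
        raise ValueError("r must be even and divide n")
    if not -(1 << (n - 1)) <= value < (1 << (n - 1)):
        raise ValueError("value does not fit in n bits")
    bits = [(value >> k) & 1 for k in range(n)]

    def bit(index):
        return bits[index] if index >= 0 else 0  # y_{-1} = 0

    return [
        [bit(r * j - 1 + 2 * i) + bit(r * j + 2 * i) - 2 * bit(r * j + 1 + 2 * i)
         for i in range(r // 2)]
        for j in range(n // r)
    ]


def dmac_rmrma(x, y, t, z, n, r):
    """X.Y + T.Z per equation (14); returns (result, outer iterations)."""
    q = rmrma_digits(y, n, r)
    p = rmrma_digits(z, n, r)
    acc = 0
    iterations = 0
    for j in range(n // r):
        # Inner sum: r/2 shared partial products, added in parallel in hardware.
        slice_sum = sum((q[j][i] * x + p[j][i] * t) << (2 * i) for i in range(r // 2))
        acc += slice_sum << (r * j)
        iterations += 1
    return acc, iterations
```

For n = 16, choosing r = 4, 8 or 16 gives 4, 2 or 1 outer iterations while the inner sum widens accordingly, which is the speed versus resources trade-off controlled by the compile-time constant r.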
At this stage, a key question arises: among this panoply of PIDs, which one best fits a given application case? The answer to this question is given in the next section.

VI. DISCUSSION
In embedded control, satisfactory control-rate (without performance degradation) at minimum power consumption is the main requirement. To select the most adequate PID for a given application, it is necessary to investigate how speed, power and hardware resources scale versus the r factor for a fixed word length n. Referring to equation (14) and aided by Fig. 9, the ODMAC architecture scales as a binary tree with one stage of r mux(8:1) followed by Log2(r)+1 stages of adders with a total of r adders too. Thus, the total delay cumulated by the critical path, which goes through Log2(r)+2 stages, increases with O(Log(r)) complexity, whilst latency (n/r+1) decreases as O(1/r), which makes the maximum control-rate increase as r increases. This is confirmed by implementation results shown in Table VII and VIII corresponding to PID1 and PID2, respectively. The sole exception to this general rule is PIDX_n/2 which [...] and 2, respectively) and one stage difference in the critical path (n-1 and n, respectively), but an important multiplexer fanin difference (n/4 and n/2, respectively).
In terms of resource occupation, the total complexity grows linearly O(r) as r multiplexers and r adders are required by ODMAC, which is the most resource consuming block of PID architecture. This is also confirmed by the implementation results shown in Table VI. Note that each adder of each level of MAC and ODMAC as well as the two ones at the output of the PID (Fig. 5 and 6) are successively extended by one bit so that the total bit size of the control output u(k) becomes 2n+log2(r)+2. It is necessary to do so to prevent a possible overflow in the data-path, which can cause signal clipping and instabilities in the closed loop response [37].
As for power consumption, intuitively, one would expect to see PID1_16 of Table VII as being the fastest and the most power-hungry too, for the reason that it exhibits the smallest latency and the biggest total gate count! While it is almost true for the latter (13 MHz, before the first), it is quite the opposite for the former (244 mW, the smallest [...]
[...] have the required skills to implement and evaluate the controllers using ASIC/FPGAs [17][43]. This is why we propose, as hardware designers, a highly reconfigurable (n, r) and technology-independent FWL PID that can systematically respond to control-engineer demands after having modelled, simulated, and evaluated the performances provided by different bit-width fixed-point representations using Matlab/Simulink environment, and finally opted for an appropriate word-length (n) of the setpoint. As for latency value (r), it depends on the application domain and intended objectives. Precise guidelines on how to choose r value were given in section VI.
Now that (n, r) couple is known, the FWL problem is tackled from hardware side by simply adjusting in the RTL code the two compile-time constants: setpoint bit-size (n) and latency (r). The synthesis of such a PID generates an optimal structure that not only meets the performances specified by control-engineers, but also consumes minimum power and hardware resources. This would not have been possible without the use of the new highly serialisable multi-bit multiplication algorithm (equation 13). The incorporation of equation (13) [25] into equations (1) and (2) as an efficient PID engine, allows the generation of PID architectures classified as regular iterative architectures (RIA) [44], known for their high conformity with the principles of regularity and locality. In addition to equation (13), we propose in [25] several new highly serialisable multiplication algorithms, offering different features in [...]

[Response plots, panels (a), (b), (c) visible: Set Point (Uc) and Plant Measure (Y) versus time; vertical axis: Temperature °C; horizontal axis: Time (s). Caption not recoverable.]
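To illustrate the FWL effect on a single coefficient, the sketch below quantizes a real value to the Qni.nf format described in Section III and converts it back. Rounding to nearest and saturation on overflow are assumptions made for this example, as are the function names; the paper does not state which rounding or overflow policy its cores use.

```python
def to_fixed(value, ni, nf):
    """Quantize value to Qni.nf (ni integer bits including sign, nf fractional bits).

    Assumed policy: round to nearest, saturate on overflow.
    """
    n = ni + nf
    raw = round(value * (1 << nf))
    lowest, highest = -(1 << (n - 1)), (1 << (n - 1)) - 1
    return max(lowest, min(highest, raw))


def from_fixed(raw, nf):
    """Real value represented by the raw two's complement integer."""
    return raw / (1 << nf)


if __name__ == "__main__":
    c = 0.2  # e.g. a coefficient such as C = Kp.Ts/Ti
    raw = to_fixed(c, 4, 12)
    print(raw, from_fixed(raw, 12), abs(c - from_fixed(raw, 12)))
```

Comparing the reconstructed value with the original one gives the quantization error introduced by a given (ni, nf) choice, which is the quantity a control engineer would evaluate before fixing the word length n.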
Incremental form
The standard version of PID controller is described in a differential equation as:

$u(t) = K_p \left( e(t) + \frac{1}{T_i} \int_0^t e(\tau)\,d\tau + T_d\,\frac{de(t)}{dt} \right)$,

where e is the system error ($e(t) = u_c(t) - y(t)$), uc is the command signal (setpoint), y is the process variable (measured [...]
For a small sample interval Ts, the continuous time variable u(t) can be discretized using the following approximations:

$\int_0^{k \cdot T_s} e(t)\,dt \approx \sum_{j=0}^{k} e(j) \cdot T_s$ ; $\frac{de(t)}{dt} \approx \frac{e(k) - e(k-1)}{T_s}$.

k denotes the kth sampling instant (k.Ts). Thus, u(t) can be rewritten as:

$u(k) = K_p \left( e(k) + \frac{1}{T_i} \sum_{j=0}^{k} e(j) \cdot T_s + T_d \cdot \frac{e(k) - e(k-1)}{T_s} \right)$, with $e(k) = u_c(k) - y(k)$, and

$u(k-1) = K_p \left( e(k-1) + \frac{1}{T_i} \sum_{j=0}^{k-1} e(j) \cdot T_s + T_d \cdot \frac{e(k-1) - e(k-2)}{T_s} \right)$

We calculate the difference:

$u(k) - u(k-1) = K_p \left( e(k) - e(k-1) \right) + \frac{K_p}{T_i} \left( \sum_{j=0}^{k} e(j) \cdot T_s - \sum_{j=0}^{k-1} e(j) \cdot T_s \right) + K_p \cdot T_d \left( \frac{e(k) - e(k-1)}{T_s} - \frac{e(k-1) - e(k-2)}{T_s} \right)$

where

$K_p \left( e(k) - e(k-1) \right) = K_p \cdot e(k) - K_p \cdot e(k-1)$ ;

$\frac{K_p}{T_i} \left( \sum_{j=0}^{k} e(j) \cdot T_s - \sum_{j=0}^{k-1} e(j) \cdot T_s \right) = K_p \cdot \frac{T_s}{T_i} \cdot e(k)$ ;

$K_p \cdot T_d \left( \frac{e(k) - e(k-1)}{T_s} - \frac{e(k-1) - e(k-2)}{T_s} \right) = K_p \cdot \frac{T_d}{T_s} \cdot e(k) - K_p \cdot \frac{2 \cdot T_d}{T_s} \cdot e(k-1) + K_p \cdot \frac{T_d}{T_s} \cdot e(k-2)$

Hence:

$u(k) = u(k-1) + K_p \left( 1 + \frac{T_s}{T_i} + \frac{T_d}{T_s} \right) e(k) - K_p \left( 1 + 2\,\frac{T_d}{T_s} \right) e(k-1) + K_p \cdot \frac{T_d}{T_s} \cdot e(k-2)$

$= u(k-1) + A \cdot e(k) + B \cdot e(k-1) + C \cdot e(k-2)$

This latter equation is called the incremental form of the controller. A drawback with the incremental algorithm is that it cannot be used for P or PD controllers.
Commercial form
For better performances of PID, two corrections are performed: limitation of the derivative gain and setpoint weighting. A pure derivative action will induce a very large amplification of measurement noise. The gain of the derivative must thus be limited. This can be done by approximating the transfer function s.Td as follows:

$s \cdot T_d \approx \frac{s \cdot T_d}{1 + s \cdot T_d / N}$,

where N is typically in the range of 3 to 20. In addition, to avoid sudden overshoots due to high variations of the setpoint, only a fraction b of uc acts on the proportional part (b.uc - y). Hence, the improved PID algorithm becomes:

$U(s) = K_p \left( \left( b \cdot U_c(s) - Y(s) \right) + \frac{1}{s \cdot T_i} \left( U_c(s) - Y(s) \right) - \frac{s \cdot T_d}{1 + s \cdot T_d / N} \cdot Y(s) \right)$

U(s) expression is discretized such that the proportional, integral and derivative terms are separately obtained, as follows:

$u(k) = P(k) + I(k) + D(k)$, where $P(k) = K_p \cdot b \cdot u_c(k) - K_p \cdot y(k)$ and $I(k) = I(k-1) + K_p \cdot \frac{T_s}{T_i} \left( u_c(k-1) - y(k-1) \right)$.

To determine the derivative term D(k), we use the differential equation representing the transfer function Gd(s):

$G_d(s) = \frac{U_d(s)}{Y(s)} = -K_p\,\frac{s \cdot T_d}{1 + s \cdot T_d / N}$.

By performing cross products, we get:

$U_d(s) \left( 1 + \frac{s \cdot T_d}{N} \right) = -K_p \cdot Y(s) \cdot s \cdot T_d$.

Applying the inverse Laplace Transform to this latter equation, we obtain:

$u_d(t) = -\frac{T_d}{N} \cdot \frac{du_d(t)}{dt} - K_p \cdot T_d \cdot \frac{dy(t)}{dt}$.

[...]

$u(k) = P(k) + I(k) + D(k)$, with $P(k) = A \cdot u_c(k) + B \cdot y(k)$ ; $I(k) = I(k-1) + C \cdot e(k-1)$ ; $D(k) = H \cdot D(k-1) + L \cdot f(k)$, and

$A = K_p \cdot b$ ; $B = -K_p$ ; $C = K_p \cdot \frac{T_s}{T_i}$ ; $H = \frac{T_d}{T_d + N \cdot T_s}$ ; $L = -\frac{K_p \cdot N \cdot T_d}{T_d + N \cdot T_s}$.
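The step from the time-domain derivative equation of the commercial form to the coefficients H and L can be worked out as follows. This is a sketch that assumes a backward-difference approximation of both derivatives, which is an inference: it is the choice that reproduces the H and L entries of Table I. Here $u_d(k)$ is identified with $D(k)$ and $f(k) = y(k) - y(k-1)$.

```latex
% Assumed backward differences:
%   du_d/dt ~ (D(k) - D(k-1)) / T_s ,   dy/dt ~ (y(k) - y(k-1)) / T_s
\begin{align*}
D(k) &= -\frac{T_d}{N} \cdot \frac{D(k) - D(k-1)}{T_s}
        - K_p \cdot T_d \cdot \frac{y(k) - y(k-1)}{T_s} \\
N\,T_s\,D(k) &= -T_d\,D(k) + T_d\,D(k-1) - K_p\,T_d\,N\,f(k) \\
(T_d + N\,T_s)\,D(k) &= T_d\,D(k-1) - K_p\,T_d\,N\,f(k) \\
D(k) &= \underbrace{\frac{T_d}{T_d + N\,T_s}}_{H}\,D(k-1)
        \;\underbrace{-\,\frac{K_p\,T_d\,N}{T_d + N\,T_s}}_{L}\,f(k)
\end{align*}
```

Since $0 < H < 1$ for positive $T_d$, $N$ and $T_s$, the derivative recursion is a stable first-order filter on $f(k)$.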