FFWD 2ble Pend
www.elsevier.com/locate/automatica
Brief paper
Swing-up of the double pendulum on a cart by feedforward and feedback
control with experimental validation
Knut Graichen ∗ , Michael Treuer, Michael Zeitz
Institut für Systemdynamik, Universität Stuttgart, Pfaffenwaldring 9, 70569 Stuttgart, Germany
Received 29 August 2005; received in revised form 4 April 2006; accepted 10 July 2006
Available online 2 October 2006
Abstract
The swing-up maneuver of the double pendulum on a cart serves to demonstrate a new approach of inversion-based feedforward control
design introduced recently. The concept treats the transition task as a nonlinear two-point boundary value problem of the internal dynamics by
providing free parameters in the desired output trajectory for the cart position. A feedback control is designed with linear methods to stabilize
the swing-up maneuver. The emphasis of the paper is on the experimental realization of the double pendulum swing-up, which reveals the
accuracy of the feedforward/feedback control scheme.
© 2006 Elsevier Ltd. All rights reserved.
Keywords: Double pendulum swing-up; Nonlinear feedforward control; Boundary value problem; Input–output normal form; Linear feedback control;
Experiment
The swing up within a finite time interval $t \in [0, T]$ requires to steer the double pendulum from the initial downward equilibrium

$$ y(0) = 0, \quad \dot{y}(0) = 0, \quad \varphi(0) = [-\pi, -\pi]^T, \quad \dot{\varphi}(0) = 0 \tag{6} $$

to the terminal upward equilibrium

$$ y(T) = 0, \quad \dot{y}(T) = 0, \quad \varphi(T) = 0, \quad \dot{\varphi}(T) = 0. \tag{7} $$

The internal dynamics (5) is weakly asymptotically stable in the downward equilibrium and unstable in the upward position. The ODEs (4)–(5) together with the boundary conditions (BCs) (6)–(7) form a nonlinear two-point BVP for the states $y(t)$, $\dot{y}(t)$ and $\varphi(t)$, $\dot{\varphi}(t)$ that depends on the input trajectory $u(t)$. Its determination is the main objective of the feedforward control design.

The swing-up time $T$ is an important parameter and mainly depends on the constraints (1) and the system dynamics (4)–(5). If $T$ is chosen too small, the cart may violate the constraints. On the other hand, the swing up is not possible arbitrarily slowly due to the fact that no quasi-stationary connection exists between the downward and upward equilibria, i.e. they are not connected by a set of equilibria in between. Hence, the swing-up time $T$ has to be determined appropriately in course of the feedforward control design.

3. Nonlinear feedforward control design

The inversion-based feedforward control design (Devasia, Chen, & Paden, 1996; Graichen et al., 2005) uses the input–output coordinates of the considered system. In case of the double pendulum, the system (4)–(5) is already given in input–output normal form (Isidori, 1995) with the cart position $y$ as the output and the relative degree $r = 2$. The respective cart ODE (4) represents the input–output dynamics, and the ODEs (5) of the angles $\varphi_1$, $\varphi_2$ form the internal dynamics of order $n - r = 4$.

The feedforward control is obtained by inverting the input–output dynamics (Devasia et al., 1996; Graichen et al., 2005). In view of (4), the feedforward control¹

$$ u^*(t) = \ddot{y}^*(t) \tag{8} $$

is simply the second time derivative of the desired output trajectory $y^*(t)$.

3.1. BVP of the internal dynamics

In order to determine the trajectories $\varphi^*(t)$ and $\dot{\varphi}^*(t)$ of the angles, the internal dynamics (5) can be rewritten by inserting the feedforward control (8), i.e.

$$ \ddot{\varphi}^* = \beta(\varphi^*, \dot{\varphi}^*, \ddot{y}^*) \tag{9} $$

$$ \varphi^*(0) = [-\pi, -\pi]^T, \quad \varphi^*(T) = 0, \quad \dot{\varphi}^*|_{t=0,T} = 0. \tag{10} $$

Note that the second time derivative $\ddot{y}^*(t)$ of the output trajectory serves as input to (9). Obviously, the BVP (9)–(10) of the internal dynamics is overdetermined by eight BCs for two second-order ODEs.

The basic idea of the approach presented in Graichen et al. (2005) is to provide a sufficient number of four free parameters in the internal dynamics (9), which are required for its solvability. Thereby, the parameters $p = (p_1, \ldots, p_4)$ are provided in a setup function $\Upsilon(t, p)$ for the output trajectory $y^*(t) = \Upsilon(t, p)$.

3.2. Output trajectory setup with free parameters

The output trajectory $y^*(t) = \Upsilon(t, p)$ has to satisfy the four BCs (6)–(7), which implies that the output trajectory must be at least once differentiable,² i.e. $y^*(t) \in C^1$. The setup function $\Upsilon(t, p)$ is constructed using the cosine series

$$ \Upsilon(t, p) = a_0 + a_1 \cos\frac{\pi t}{T} + \sum_{i=2}^{5} p_{i-1} \cos\frac{i \pi t}{T}, \tag{11} $$

with the free parameters $p = (p_1, \ldots, p_4)$ as the coefficients for the cosine terms with the highest frequencies. The remaining coefficients $a_0 = -p_1 - p_3$ and $a_1 = -p_2 - p_4$ follow from solving the equations stemming from the BCs $\Upsilon(0, p) = 0$ and $\Upsilon(T, p) = 0$. Note that the cosine series directly satisfies $\dot{\Upsilon}(0, p) = \dot{\Upsilon}(T, p) = 0$ due to the sine terms occurring in the first time derivative $\dot{\Upsilon}(t, p)$. Other possible choices for the setup of $\Upsilon(t, p)$ are e.g. polynomials or spline functions (Graichen, Treuer, & Zeitz, 2005).

Remark 1. The swing-up time $T$ is directly affected by the choice of the setup function $y^*(t) = \Upsilon(t, p)$. For instance, the number of times that the output $y^*(t)$ passes through zero ("swinging" of the cart) is limited by the highest frequency of $\Upsilon(t, p)$ in (11). This corresponds to certain regions of $T$ where solutions for the swing-up problem exist. Alternatively, the swing-up time can also be treated as a free parameter (via time transformation) with the remaining three parameters $p = (p_1, p_2, p_3)$ in the setup function $\Upsilon(t, p)$, see Graichen and Zeitz (2005a,b).

3.3. Numerical results

The numerical solution of the BVP (9)–(11) is a standard task in numerics. A particularly convenient way is to use the MATLAB function bvp4c³ designed for the solution of two-point BVPs with unknown parameters. The initial guesses

¹ The asterisk "∗" signifies the feedforward variables.
² If the feedforward control (8) has to be continuous at the transition bounds $t = 0$ and $T$, two further BCs $\ddot{y}(0) = \ddot{y}(T) = 0$ have to be satisfied by the output trajectory $y^*(t) \in C^2$, see e.g. Graichen et al. (2005).
³ http://www.mathworks.com/bvp_tutorial
66 K. Graichen et al. / Automatica 43 (2007) 63 – 71
Fig. 2. Nominal trajectories and parameters p for the swing-up of the double pendulum in three different times T. [Panels over time [s]: cart position $y^*$ [m], cart velocity $\dot{y}^*$ [m/s], input $u^* = \ddot{y}^*$ [m/s²], angles $\varphi_1^*$ and $\varphi_2^*$ [rad], and the free parameters $p = (p_1, \ldots, p_4)$; curves for T = 1.8 s, T = 2.2 s and T = 2.5 s.]
for the trajectories $\varphi^*(t_k)$ and $\dot{\varphi}^*(t_k)$ are linear interpolations between the BCs (10) on a uniform time mesh with 30 points $t_k \in [0, T]$, $k = 1, \ldots, 30$. The initial guess of the free parameters $p$ is $p_i = 0$, $i = 1, \ldots, 4$. bvp4c returns the trajectories $\varphi^*(t) = [\varphi_1^*(t), \varphi_2^*(t)]^T$ and the parameter set $p$, which yields the output trajectory $y^*(t) = \Upsilon(t, p)$ and the feedforward control (8), i.e. $u^*(t) = \ddot{\Upsilon}(t, p)$.

Fig. 2 shows the nominal trajectories for the swing-up maneuver for different swing-up times $T \in \{1.8, 2.2, 2.5\}$ s. The significant influence of the swing-up time $T$ is particularly apparent in the cart trajectories $y^*(t)$, $\dot{y}^*(t)$, and $\ddot{y}^*(t)$. For $T = 1.8$ s, the cart movement $y^*(t)$ is very limited, but the maximum acceleration $\max_t \ddot{y}^*(t) = 19$ m/s² almost hits the respective constraint in (1). In contrast to this, the swing-up time $T = 2.5$ s leads to a different swing-up motion with the large cart displacement $\max_t y^*(t) = 1$ m. A good trade-off between the maximum amplitudes of the trajectories $y^*(t)$, $\dot{y}^*(t)$, $\ddot{y}^*(t)$ with respect to the constraints (1) is obtained for $T = 2.2$ s, which is therefore chosen as swing-up time for the experimental implementation.

Fig. 3 shows snapshots of the pendulum for the swing-up time $T = 2.2$ s to illustrate its motion. It is interesting to mention that both arms of the pendulum are in a hinged position during the swing-up (see sequences 2 and 3 in Fig. 3) and only stretch close to the upward "inverted" position.

controller. In the context of the two-degree-of-freedom control scheme in Fig. 4, the feedforward control $\Sigma_{FF}$ is supported by a state feedback control $\Sigma_{FB}$ with an observer in order to stabilize the system along the nominal trajectories $x^*(t)$ provided by the signal generator $\Sigma^*$. Thereby, a highly accurate feedforward control $\Sigma_{FF}$ is necessary in order to minimize the demands on the feedback part during the swing-up. The accuracy of the nonlinear feedforward control can be enhanced by an optimization-based adjustment of the mechanical parameters in Table 1 with respect to the open-loop experimental results for the nominal feedforward control $u^*(t)$. The experimental setup and the above mentioned points are addressed in the following subsections.

4.1. Experimental construction of the double pendulum

The swing-up maneuver is experimentally realized with the double pendulum in Fig. 5 corresponding to the model parameters in Table 1 and the cart constraints (1).⁴ The incremental angle encoders at the two joints have a resolution of $2\pi/8192$ rad and transmit their information through optical links in the joints to reduce friction. The cart is driven by a toothed belt connected to a synchronous motor. The control algorithm is implemented on a 933 MHz computer with real-time Linux and the sampling
Fig. 3. Snapshots of the nominal swing-up maneuver for T = 2.2 s (see Fig. 2) depicted in eight sequences with increasing darkness of the snapshots as time increases during the respective sequence. [Horizontal axis of each panel: cart position y [m], from −0.6 to 0.6.]
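The cosine-series setup function (11) of Section 3.2 can be illustrated with a short numerical sketch. The code below is not taken from the paper: the function names `setup_coeffs` and `upsilon` and the sample parameter values are hypothetical, and only the series (11) with $a_0 = -p_1 - p_3$, $a_1 = -p_2 - p_4$ is assumed. It evaluates $\Upsilon$ and its first two time derivatives, so that the BCs on $y^*$ and the feedforward control $u^* = \ddot{\Upsilon}$ can be checked for any parameter set.

```python
import numpy as np

def setup_coeffs(p):
    """Coefficients c[0..5] of the cosine series (11): c[0]=a0, c[1]=a1, c[i]=p_{i-1}."""
    p = np.asarray(p, dtype=float)
    a0 = -p[0] - p[2]          # from Upsilon(0, p) = 0 and Upsilon(T, p) = 0
    a1 = -p[1] - p[3]
    return np.concatenate(([a0, a1], p))

def upsilon(t, p, T, deriv=0):
    """Evaluate the deriv-th time derivative (0, 1 or 2) of Upsilon(t, p)."""
    t = np.asarray(t, dtype=float)
    c = setup_coeffs(p)
    w = np.arange(6) * np.pi / T        # angular frequencies i*pi/T, i = 0..5
    arg = np.multiply.outer(t, w)       # shape: t.shape + (6,)
    if deriv == 0:
        basis = np.cos(arg)
    elif deriv == 1:
        basis = -w * np.sin(arg)
    elif deriv == 2:
        basis = -w**2 * np.cos(arg)
    else:
        raise ValueError("deriv must be 0, 1 or 2")
    return basis @ c

# Hypothetical parameter values, chosen only to exercise the functions.
T = 2.2
p = [0.1, -0.2, 0.3, 0.05]
t = np.linspace(0.0, T, 5)
y_ref = upsilon(t, p, T)             # output trajectory y*(t)
u_ff = upsilon(t, p, T, deriv=2)     # feedforward control u*(t) = d^2/dt^2 Upsilon
```

With this parametrization the four cart BCs hold for every choice of $p$, which is why all four parameters remain free for the BVP of the internal dynamics.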
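[Fig. 4: block diagram of the two-degree-of-freedom control scheme with the signal generator $\Sigma^*$, the feedforward control $\Sigma_{FF}$, the feedback control $\Sigma_{FB}$, the plant $\Sigma$ and the observer $\hat{\Sigma}$; signals $y^*$, $u^*$, $x^*$, $\Delta u$, $u$, $\hat{x}$.]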
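The structure of a two-point BVP with free parameters, as solved with bvp4c in Section 3.3, can be sketched with SciPy's `solve_bvp`, which also accepts unknown parameters. The example below is a deliberately simplified stand-in and not the paper's model: it uses a frictionless single pendulum on a cart (assumed internal dynamics $\ell\ddot{\varphi} = g\sin\varphi - \ddot{y}\cos\varphi$, angle measured from the upright position), hypothetical values for $g$, $\ell$ and $T$, and a shortened cosine series with two free parameters, since two first-order states with four BCs require two parameters. No claim is made that the solver converges for these made-up values; the returned status should always be inspected.

```python
import numpy as np
from scipy.integrate import solve_bvp

G, ELL, T = 9.81, 0.3, 2.0   # hypothetical gravity, pendulum length, swing-up time

def ydd(t, p):
    """Second derivative of Upsilon = a0 + a1*cos(w t) + p1*cos(2 w t) + p2*cos(3 w t),
    with a0 = -p1 and a1 = -p2 so that Upsilon(0) = Upsilon(T) = 0."""
    w = np.pi / T
    a1 = -p[1]
    return -(a1 * w**2 * np.cos(w * t)
             + p[0] * (2 * w)**2 * np.cos(2 * w * t)
             + p[1] * (3 * w)**2 * np.cos(3 * w * t))

def internal_dynamics(t, z, p):
    """First-order form of the assumed pendulum dynamics, driven by ydd(t, p)."""
    phi, dphi = z
    return np.vstack((dphi, (G * np.sin(phi) - ydd(t, p) * np.cos(phi)) / ELL))

def boundary_conditions(z0, zT, p):
    """Four BCs: downward rest position at t = 0, upward rest position at t = T."""
    return np.array([z0[0] + np.pi, z0[1], zT[0], zT[1]])

# Initial guess as described in the text: linear interpolation between the BCs
# on a uniform mesh with 30 points, and all free parameters set to zero.
t_mesh = np.linspace(0.0, T, 30)
z_guess = np.vstack((np.linspace(-np.pi, 0.0, t_mesh.size), np.zeros(t_mesh.size)))
sol = solve_bvp(internal_dynamics, boundary_conditions, t_mesh, z_guess, p=np.zeros(2))
print(sol.status, sol.message, sol.p)
```

If the solver reports success, `sol.p` holds the parameters of the cart trajectory and `sol.sol(t)` the corresponding angle trajectory; otherwise a different $T$ or initial guess would have to be tried.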
[Fig. 6: open-loop swing-up with default parameters and with adjusted parameters; angles $\varphi_1$ and $\varphi_2$ [rad] over time [s]; curves: nominal, open-loop.]

Table 3
Adjusted mechanical parameters for the open-loop swing-up of the double pendulum in Fig. 6.

Pendulum link                               Inner (i = 1)   Outer (i = 2)
Distance to center of gravity a_i (m)       0.186           0.195
Mass m_i (kg)                               0.881           0.551
Moment of inertia J_i (N m s²)              0.0141          0.0177
Friction constant d_i (N m s)               0.0034          0.0016

than the swing-up time T = 2.2 s, because the dynamics (13)
According to the two-degree-of-freedom control scheme in Fig. 4, the control

$$ u = u^* + k^T(t)\,(x^* - \hat{x}) \tag{15} $$

comprises the feedforward control $u^*(t)$ and the feedback part $\Delta u = k^T(t)(x^* - \hat{x})$. The calculation of the time-varying feedback gains $k(t)$ is based on an optimal LQ (linear quadratic) feedback design which minimizes the objective functional

$$ I = \Delta x^T(T)\, M\, \Delta x(T) + \int_0^T \left( \Delta x^T Q\, \Delta x + \Delta u\, R\, \Delta u \right) dt, \tag{16} $$

with the symmetric positive semidefinite matrices $M \in \mathbb{R}^{7 \times 7}$, $Q \in \mathbb{R}^{7 \times 7}$ and the positive scalar $R > 0$. The solution $P(t)$, $t \in [0, T]$ of the Riccati ODE, see e.g. Kwakernaak and Sivan (1972), Bertsekas (2000),

$$ \dot{P} = -P A(t) - A(t)^T P + P\, b(t) R^{-1} b(t)^T P - Q, \quad P(T) = M \tag{17} $$

determines the feedback gains

$$ k(t) = R^{-1} b(t)^T P(t). \tag{18} $$

The weighting matrices in (16) are chosen as $Q = \mathrm{diag}(50, 0, 500, 500, 0, 0, 10)$ and $R = 5$ (with consistent units). The choice of the terminal condition $P(T) = M$ for the reverse-time integration of the Riccati equation (17) is a degree of freedom in the LQ design. Thereby, the matrix $M \in \mathbb{R}^{7 \times 7}$ is determined by solving the algebraic Riccati equation following from (17) with $\dot{P} = 0$. The MATLAB function lqr of the control system toolbox is used to calculate $M$, whereas the reverse-time integration of (17) is performed with a standard ODE solver of MATLAB.

Fig. 7 shows the time-varying feedback gains $k_i(t)$, $i = 1, \ldots, 7$ in the time interval $t \in [0, T]$ for the swing-up maneuver with $T = 2.2$ s. During the time interval $t \in [0.5, 1.2]$ s, the gains $k_i(t)$ oscillate and change sign several times. Although the LQ design provides optimal feedback gains $k_i(t)$ for minimizing the cost functional (16) over the time interval $t \in [0, T]$, the large gradients and magnitudes of the gains $k_i(t)$ for $t \in [0.5, 1.2]$ s pose significant demands on the closed-loop control of the double pendulum, leading to large displacements of the cart position $y$. Moreover, the linear time-varying system (14) loses its controllability (see e.g. Silverman & Meadows, 1967; Kailath, 1980) several times in this time interval. For these reasons, the feedback control is turned off for $t \in [0.6, 1.1]$ s by setting the gain vector $k(t)$ to zero. In the bordering intervals $t \in [0.5, 0.6]$ s and $t \in [1.1, 1.2]$ s, the gains $k_i(t)$ are linearly interpolated between zero and the respective gain values at $t = 0.5$ and $1.2$ s in order to smoothly switch the feedback control on and off. Hence, during the time interval $t \in [0.6, 1.1]$ s, the pendulum is steered along the nominal trajectories $x^*(t)$ by the feedforward control without a stabilizing feedback control.

4.4. Experimental results

The implementation of the closed-loop control (15) requires full state information of the double pendulum. A Luenberger observer (O'Reilly, 1983) based on the nonlinear model (4)–(5) is used for the state estimation $\hat{x}$, see Fig. 4. The error dynamics of the observer is designed by eigenvalue assignment point-wise in time with the linearized pendulum model.

Fig. 8 shows the experimental and nominal trajectories of the angles $\varphi(t)$, the cart $y(t)$, $\dot{y}(t)$, and the input $u(t) = \ddot{y}(t)$ for open-loop (also see Fig. 6b) and closed-loop control of the swing-up maneuver. As pointed out in Section 4.2, the open-loop trajectories reveal the good accuracy of the designed feedforward control $u^*(t)$, but the pendulum drifts away at approximately $t = 1.5$ s when it approaches the unstable upward position. In closed-loop mode, the feedback control is smoothly turned on at $t = 1.1$ s and stabilizes the pendulum along the nominal trajectories. The correction $\Delta u$ of the stabilizing feedback in (15) is less than 3 m/s², which reveals the quality of the feedforward control $u^*(t)$ and the effectiveness of the parameter adjustment by solving the optimization problem (12)–(13).

The experimental swing up is easily repeatable without noticeable performance loss after several successive swing-up and swing-down maneuvers (along the accordingly calculated swing-down trajectories). Although the double pendulum is a highly sensitive system, simulations and experiments have shown that the control scheme is robust enough to deal with

[Fig. 7: time-varying feedback gains $k_i(t)$ over the interval $t \in [0, T]$ for T = 2.2 s, with the interval marked "feedback OFF".]
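The reverse-time integration of the Riccati ODE (17) and the gain formula (18) can be sketched as follows. This is an illustration under assumptions, not the authors' MATLAB code: the function name `lq_gains`, the two-state example system, and its weights are hypothetical, and SciPy's `solve_continuous_are` takes the role that the text assigns to lqr for computing the terminal matrix $M$.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.linalg import solve_continuous_are

def lq_gains(A_of_t, b_of_t, Q, R, M, T, n_grid=201):
    """Integrate the Riccati ODE (17) backward from P(T) = M and
    return the time grid and the gains k(t) = R^{-1} b(t)^T P(t) from (18)."""
    n = Q.shape[0]

    def riccati_rhs(t, P_flat):
        P = P_flat.reshape(n, n)
        A, b = A_of_t(t), b_of_t(t)
        dP = -P @ A - A.T @ P + (P @ b @ b.T @ P) / R - Q
        return dP.ravel()

    t_eval = np.linspace(T, 0.0, n_grid)           # decreasing: reverse time
    sol = solve_ivp(riccati_rhs, (T, 0.0), M.ravel(),
                    t_eval=t_eval, rtol=1e-8, atol=1e-10)
    t = sol.t[::-1]                                 # re-sort to increasing time
    P = sol.y.T[::-1].reshape(-1, n, n)
    k = np.array([(b_of_t(ti).T @ Pi / R).ravel() for ti, Pi in zip(t, P)])
    return t, k

# Hypothetical time-invariant example (an unstable second-order system).
A = np.array([[0.0, 1.0], [2.0, 0.0]])
b = np.array([[0.0], [1.0]])
Q = np.diag([10.0, 1.0])
R = 5.0
M = solve_continuous_are(A, b, Q, np.array([[R]]))   # stationary solution of (17)
t_grid, k_grid = lq_gains(lambda t: A, lambda t: b, Q, R, M, T=2.2)
```

For a time-invariant system the gains stay constant because $M$ is a stationary point of (17); with the time-varying linearization along $x^*(t)$ the same routine would return gains that vary over the swing-up.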
Fig. 8. Measured and nominal trajectories of the cart $y(t)$, $\dot{y}(t)$, $u = \ddot{y}(t)$ and the angles $\varphi_1(t)$, $\varphi_2(t)$ for the swing-up of the double pendulum. [Panels over time [s]: y [m], $\dot{y}$ [m/s], $u = \ddot{y}$ [m/s²], $\varphi_1$ [rad], $\varphi_2$ [rad]; curves: nominal, open-loop, closed-loop; the interval marked "feedback OFF" is indicated.]
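The switching of the feedback part described in Section 4.3 (gains set to zero for $t \in [0.6, 1.1]$ s, linear ramps over $[0.5, 0.6]$ s and $[1.1, 1.2]$ s) together with the control law (15) can be sketched as below. The function names `blended_gain` and `control` and the example gain function are hypothetical; only the interval bounds and the interpolation rule are taken from the text.

```python
import numpy as np

def blended_gain(t, k_of_t, t_off=(0.6, 1.1), ramp=0.1):
    """Gain vector with the feedback switched off on t_off and
    linear ramps of width `ramp` on both sides."""
    t0, t1 = t_off
    if t0 <= t <= t1:
        return np.zeros_like(np.asarray(k_of_t(t), dtype=float))
    if t0 - ramp < t < t0:
        # ramp down from the gain value at t0 - ramp (0.5 s) to zero at t0
        return np.asarray(k_of_t(t0 - ramp), dtype=float) * (t0 - t) / ramp
    if t1 < t < t1 + ramp:
        # ramp up from zero at t1 to the gain value at t1 + ramp (1.2 s)
        return np.asarray(k_of_t(t1 + ramp), dtype=float) * (t - t1) / ramp
    return np.asarray(k_of_t(t), dtype=float)

def control(t, u_ff, x_ref, x_hat, k_of_t):
    """Two-degree-of-freedom control law (15): u = u* + k^T(t) (x* - x_hat)."""
    return u_ff + blended_gain(t, k_of_t) @ (np.asarray(x_ref) - np.asarray(x_hat))

# Hypothetical two-state gain function, used only to exercise the code.
k_demo = lambda t: np.array([1.0, 2.0]) * (1.0 + t)
u = control(0.8, u_ff=3.0, x_ref=[0.1, 0.0], x_hat=[0.0, 0.2], k_of_t=k_demo)
```

Inside the switched-off interval the control reduces to the pure feedforward signal, which matches the statement that the pendulum is steered by $u^*(t)$ alone for $t \in [0.6, 1.1]$ s.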
two-point BVP with free parameters is solved numerically with the MATLAB function bvp4c. The mechanical parameters of the double pendulum are adjusted with respect to the measured open-loop trajectories of the swing up, in order to increase the accuracy of the feedforward trajectories. Experimental results of the swing-up maneuver reveal the high performance and accuracy of the tracking control with the nonlinear feedforward and linear feedback control. The applied feedforward control design also makes it possible to account for input constraints (Graichen & Zeitz, 2005a), which is of importance e.g. for mechatronic systems with physical limitations of the actuators.

Acknowledgments

The authors gratefully acknowledge the constructive comments of the reviewers as well as the valuable support of Torsten Nutsch with the double pendulum.

References

Anderson, M. J., & Grantham, W. J. (1989). Lyapunov optimal feedback control of a nonlinear inverted pendulum. Journal of Dynamic Systems, Measurement, and Control, 111, 554–558.
Åström, K. J., & Furuta, K. (2000). Swinging up a pendulum by energy control. Automatica, 36, 287–295.
Bertsekas, D. P. (2000). Dynamic programming and optimal control. 2nd ed., Belmont, MA: Athena Scientific.
Devasia, S., Chen, D., & Paden, B. (1996). Nonlinear inversion-based output tracking. IEEE Transactions on Automatic Control, 41, 930–942.
Fantoni, I., Lozano, R., & Spong, M. W. (2000). Energy based control of the pendubot. IEEE Transactions on Automatic Control, 45, 725–729.
Furuta, K., Kajiwara, H., & Kosuge, K. (1980). Digital control of a double inverted pendulum on an inclined rail. International Journal of Control, 32, 907–924.
Graichen, K., Hagenmeyer, V., & Zeitz, M. (2005). A new approach to inversion-based feedforward control design for nonlinear systems. Automatica, 41, 2033–2041.
Graichen, K., Treuer, M., & Zeitz, M. (2005). Fast side-stepping of the triple inverted pendulum via constrained nonlinear feedforward control design. In Preprints 44th IEEE conference on decision and control & European control conference (CDC-ECC) (pp. 1096–1101). Sevilla, Spain.
Graichen, K., & Zeitz, M. (2005a). Feedforward control design for nonlinear systems under input constraints. In T. Meurer, K. Graichen, & E. D. Gilles (Eds.), Control and observer design for nonlinear finite and infinite dimensional systems, Lecture Notes in Control and Information Sciences, Vol. 322 (pp. 235–252). Berlin: Springer.
Graichen, K., & Zeitz, M. (2005b). Nonlinear feedforward and feedback tracking control with input constraints solving the pendubot swing-up problem. In Preprints 16th IFAC world congress, Prague, CZ.
Hasomed GmbH. Double/triple pendulum: Product documentation. Magdeburg, Germany. www.hasomed.de.
Huang, C.-I., & Fu, L.-C. (2003). Passivity based control of the double inverted pendulum driven by a linear induction motor. In Proceedings of conference on control applications (CCA) (pp. 797–802). Istanbul, Turkey.
Isidori, A. (1995). Nonlinear control systems. 3rd ed., Berlin: Springer.
Kailath, T. (1980). Linear systems. Englewood Cliffs, NJ: Prentice-Hall.
Kwakernaak, H., & Sivan, R. (1972). Linear optimal control systems. New York: Wiley-Interscience.
Mori, S., Nishihara, H., & Furuta, K. (1976). Control of unstable mechanical system – control of pendulum. International Journal of Control, 23, 673–692.
O'Reilly, J. (1983). Observers for linear systems. London: Academic Press.
Rubí, J., Rubio, Á., & Avello, A. (2002). Swing-up control problem for a self-erecting double pendulum. IEE Proceedings of Control Theory and Applications, 149, 169–175.
Silverman, L. M., & Meadows, H. E. (1967). Controllability and observability of time-variable linear systems. SIAM Journal on Control, 5, 64–73.
Spong, M. W. (1995). The swing up problem for the acrobot. IEEE Control Systems Magazine, 15, 49–55.
Wiklund, M., Kristenson, A., & Åström, K. J. (1993). A new strategy for swinging up an inverted pendulum. In Proceedings of 12th IFAC world congress, Vol. 9 (pp. 151–154).
Yamakati, M., Iwashiro, M., Sugahara, Y., & Furuta, K. (1995). Robust swing up control of double pendulum. In Proceedings of American control conference (ACC) (pp. 290–295). Seattle, Washington.
Yamakati, M., Nonaka, K., & Furuta, K. (1993). Swing up control of double pendulum. In Proceedings of American control conference (ACC) (pp. 2229–2233). San Francisco, California.
Zhong, W., & Röck, H. (2001). Energy and passivity based control of the double inverted pendulum on a cart. In Proceedings of conference on control applications (CCA) (pp. 896–901). Mexico City, Mexico.

Michael Treuer (1978) received his Master of Engineering from the University of Auckland, New Zealand in 2004 and his Diploma degree in Engineering Cybernetics from Universität Stuttgart (Germany) in 2005. For his Diploma thesis on constrained feedforward control design for multiple-link pendulums he received the Award of the University's Alumni Foundation ("Freunde der Universität Stuttgart"). Since graduation he is a research assistant and Ph.D. student at the Institute of Process Engineering and Power Plant Technology, Universität Stuttgart. His main research interests are nonlinear modeling, simulation and control design with applications to power plants and power systems.