Adaptive Dynamic Programming-Based Fixed-Time Optimal Control For Wheeled Mobile Robot
1, JANUARY 2025
Abstract—In this study, adaptive dynamic programming (ADP)-based fixed-time optimal trajectory tracking control is investigated for wheeled mobile robots. An ADP-based fixed-time optimal tracking controller is developed based on the critic-only neural network ADP technique, which guarantees that the robot tracks the desired trajectory in fixed time. Firstly, to address the difficulty of solving the Hamilton-Jacobi-Bellman (HJB) equation, a critic neural network is used to estimate the cost function. Meanwhile, a weight update law is designed by using the adaptive control technique, which not only removes the persistent or finite excitation condition, but also enables the fixed-time convergence of the weight estimation error. By using the proposed controller, all error variables can converge to a neighborhood of zero in fixed time. Finally, both simulations and physical experiments indicate that the proposed ADP-based fixed-time optimal controller has a faster convergence rate compared to the two comparison controllers.

Index Terms—Wheeled mobile robots, adaptive dynamic programming (ADP), fixed-time optimal control, critic neural networks.

I. INTRODUCTION
THE wheeled mobile robot, with its superior mobility and flexibility, is gradually becoming an indispensable tool in various fields of modern society. Its efficient fulfillment of tasks requires the precise and stable support of a powerful control algorithm. However, the underactuated and nonlinear characteristics of the wheeled mobile robot introduce numerous complexities and challenges to the development of the control algorithms [1].

Trajectory tracking control, aimed at guiding wheeled mobile robots to follow a preset reference signal, has been a significant focus in the control community. Over the past decade, many control algorithms have been developed to address the trajectory tracking problem of wheeled mobile robots [2], [3], [4], [5], [6], [7]. Considering the possibility of slippage when the robot moves on slopes or uneven terrain, the authors in [2] and [3] proposed adaptive control algorithms with an adaptive compensation mechanism. To solve the difficulties posed by lateral and longitudinal slippage, Chen et al. [4] proposed an improved linear active disturbance rejection control algorithm for a wheeled robot with six wheels. For unicycle mobile robots with perturbation parameters, two sliding-mode control algorithms were proposed in [5] and [6] to achieve trajectory tracking. Additionally, considering wheeled mobile robots with unknown disturbances and unmeasurable velocities, a robust tracking control algorithm based on an extended observer was introduced in [7]. It is important to note that the above research mainly focuses on handling unknown disturbances, slippage, unmeasurable velocity information, and model parameter perturbations. However, these approaches only lead to asymptotic convergence of the system, meaning that the tracking error theoretically takes an infinite amount of time to converge to the equilibrium point or a neighborhood near it.

In recent years, a new control method, finite-time control, which ensures system convergence within a finite time, has gained popularity among scholars [8], [9], [10]. In [8], a fuzzy finite-time control method was proposed for nonlinear systems with actuator failures. In [9], a fast finite-time control scheme was proposed for nonlinear systems with full-state constraints to improve the convergence speed when the energy function value is relatively small. Later, this approach was extended to consensus control of nonlinear multiagent systems [10]. In addition, the finite-time control methods for wheeled mobile robots can be found in [11] and [12]. Li et al. [11] proposed a finite-time control method for time-varying systems and applied it to the trajectory tracking control of wheeled mobile robots. In contrast to [11], which employs full-state feedback, Wu et al. [12] developed a finite-time output feedback tracking control method. It has been demonstrated in numerous studies that finite-time control exhibits better robustness, faster convergence, and superior anti-interference capability compared to non-finite-time control methods [13], [14], [15]. However, the convergence time of finite-time control is highly dependent on the initial system states. To address this, fixed-time control was proposed in [16], where the convergence time has an upper bound independent of the system’s initial states. This control method has since been rapidly developed in the study of nonlinear systems [17], [18], [19], [20], [21]. For instance, Wang et al. [17] proposed a fuzzy fixed-time control scheme for higher-order nonlinear systems with sensor and actuator faults. The scheme was later extended to handle time-varying full-state constraints and actuator hysteresis [18]. Fixed-time control has

Received 30 July 2024; accepted 10 November 2024. Date of publication 21 November 2024; date of current version 29 November 2024. This article was recommended for publication by Associate Editor K. Tahara and Editor C. Gosselin upon evaluation of the reviewers’ comments. This work was supported in part by the National Natural Science Foundation of China under Grant 52175046 and Grant 51939001 and in part by Sichuan Science and Technology Program under Grant 2024YFFK0037 and Grant 2024ZYD0165. (Corresponding author: Qing Guo.)
Chen Wang, Haoran Zhan, and Qing Guo are with the School of Aeronautics and Astronautics, University of Electronic Science and Technology of China, Chengdu 611731, China (e-mail: [email protected]; [email protected]; [email protected]).
Tieshan Li is with the School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China (e-mail: [email protected]).
This letter has supplementary downloadable material available at https://doi.org/10.1109/LRA.2024.3504314, provided by the authors.
Digital Object Identifier 10.1109/LRA.2024.3504314
2377-3766 © 2024 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.
Authorized licensed use limited to: Macquarie University. Downloaded on November 30,2024 at 03:12:42 UTC from IEEE Xplore. Restrictions apply.
WANG et al.: ADP-BASED FIXED-TIME OPTIMAL CONTROL FOR WHEELED MOBILE ROBOT 177
178 IEEE ROBOTICS AND AUTOMATION LETTERS, VOL. 10, NO. 1, JANUARY 2025
(4) is rewritten as
$$\dot{z} = A(z) + B(z)u_o \tag{5a}$$
$$A(z) = [\,\omega_r z_y,\; v_r \sin z_\theta - \omega_r z_x,\; 0\,]^{T} \tag{5b}$$
$$B(z) = \begin{bmatrix} -1 & 0 & 0 \\ z_y & -z_x & -1 \end{bmatrix}^{T} \tag{5c}$$
where $A(z)$ is locally Lipschitz with respect to $z$, and it is assumed that there exists a positive constant $\psi_B$ such that $\|B(z)\| < \psi_B$ [28], [33]. For convenience, $A(z)$ and $B(z)$ will be abbreviated to $A$ and $B$, respectively.

III. ADP-BASED FIXED-TIME OPTIMAL CONTROL DESIGN
The controller design and the system stability analysis require some mathematical lemmas, which are given as follows.
Lemma 1 [20]: For $\forall \chi \in \mathbb{R}$, $\forall \bar{\chi} \in \mathbb{R}$, $o > 0$, $\mu > 0$, and $1 - \mu > 0$, one has
$$|\bar{\chi}|^{\mu}|\chi|^{1-\mu} \le (1-\mu)\,o^{-\frac{\mu}{1-\mu}}|\chi| + \mu o|\bar{\chi}|. \tag{6}$$
Lemma 2 [20]: For $b_i > 0$, $i = 1, 2, \ldots, n$, $0 < a_1 < 1$, and $a_2 \ge 1$, one has
$$\Big(\sum_{i=1}^{n} b_i\Big)^{a_1} \le \sum_{i=1}^{n} b_i^{a_1}, \qquad n^{1-a_2}\Big(\sum_{i=1}^{n} b_i\Big)^{a_2} \le \sum_{i=1}^{n} b_i^{a_2}. \tag{7}$$
Lemma 3 [36]: For $\forall a_1 \in \mathbb{R}$ and $a_2 > 0$, one has
$$0 \le |a_1| - a_1 \tanh\Big(\frac{a_1}{a_2}\Big) \le 0.2785\,a_2. \tag{8}$$

A. Ideal Optimal Controller
The control objective is to design the optimal control law $u_o^{*}$ that effectively minimizes the prescribed cost function
$$J(z) = \int_{t}^{\infty} \big(u_o^{T}(z)Ru_o(z) + z^{T}(\tau)Qz(\tau)\big)\,d\tau \tag{9}$$
where $R \in \mathbb{R}^{2\times 2}$ and $Q \in \mathbb{R}^{3\times 3}$ are positive definite matrices. Assuming that the optimal control law is $u_o^{*}$, which is $C^{1}$, one can obtain the optimal cost function
$$J^{*}(z) = \int_{t}^{\infty} \big(u_o^{*T}(z)Ru_o^{*}(z) + z^{T}(\tau)Qz(\tau)\big)\,d\tau. \tag{10}$$
Then, the HJB equation is obtained as follows:
$$H\big(z, u_o^{*}, \nabla J^{*}(z)\big) = \nabla J^{*T}(z)\,(A + Bu_o^{*}) + z^{T}Qz + u_o^{*T}Ru_o^{*} = 0 \tag{11}$$
where $\nabla J^{*T}(z) = [\partial J^{*}(z)/\partial z]^{T} \in \mathbb{R}^{1\times 3}$, $\partial J^{*}(z)/\partial z = [\partial J^{*}(z)/\partial z_x, \partial J^{*}(z)/\partial z_y, \partial J^{*}(z)/\partial z_\theta]^{T} \in \mathbb{R}^{3\times 1}$.
According to $\partial H(z, u_o^{*}, \nabla J^{*}(z))/\partial u_o^{*} = 0_{2\times 1}$, the ideal optimal control law is derived as
$$u_o^{*} = -\frac{1}{2}R^{-1}B^{T}\nabla J^{*}(z). \tag{12}$$
Finally, applying (12) to (11) yields
$$H\big(z, u_o^{*}, \nabla J^{*}(z)\big) = -\frac{1}{4}\nabla J^{*T}(z)BR^{-1}B^{T}\nabla J^{*}(z) + \nabla J^{*T}(z)A + z^{T}Qz. \tag{13}$$
Remark 1: The analytical solution of the optimal control law (12) requires solving for the partial derivative $\nabla J^{*}(z)$ through the HJB equation (13). Unfortunately, solving (13) directly can be quite challenging, which makes it impractical to use the optimal control law (12) directly to achieve optimal performance for the error system (5).

B. ADP-Based Fixed-Time Optimal Tracking Controller Design
To overcome the difficulty of solving the HJB equation, a critic neural network is used to estimate the optimal cost function $J^{*}(z)$ defined on a compact set $z \in \Theta \subset \mathbb{R}^{3\times 1}$. It can be described as
$$J^{*}(z) = \phi^{T}(z)W + \sigma(z) \tag{14}$$
where $W = [W_1, W_2, \ldots, W_l]^{T} \in \mathbb{R}^{l\times 1}$ is the ideal weight vector; $\sigma(z) \in \mathbb{R}$ is the estimation error; $\phi(z) = [\phi_1(z), \phi_2(z), \ldots, \phi_l(z)]^{T} \in \mathbb{R}^{l\times 1}$ is the basis function vector.
Hence, the optimal control law (12) can be reformulated as
$$u_o^{*} = -\frac{1}{2}R^{-1}B^{T}\big(\nabla\phi^{T}(z)W + \nabla\sigma(z)\big) \tag{15}$$
where $\nabla\phi^{T}(z) = [\partial\phi(z)/\partial z]^{T} \in \mathbb{R}^{3\times l}$, $\partial\phi(z)/\partial z = [\partial\phi_1(z)/\partial z, \partial\phi_2(z)/\partial z, \ldots, \partial\phi_l(z)/\partial z]^{T} \in \mathbb{R}^{l\times 3}$, $\partial\phi_i(z)/\partial z = [\partial\phi_i(z)/\partial z_x, \partial\phi_i(z)/\partial z_y, \partial\phi_i(z)/\partial z_\theta]^{T} \in \mathbb{R}^{3\times 1}$, $i = 1, 2, \ldots, l$, and $\nabla\sigma(z) = \partial\sigma(z)/\partial z \in \mathbb{R}^{3\times 1}$.
Since $W$ is unknown, the corresponding estimation $\hat{W} = [\hat{W}_1, \hat{W}_2, \ldots, \hat{W}_l]^{T} \in \mathbb{R}^{l\times 1}$ is introduced. Thus, the estimation of $u_o^{*}$ is
$$\hat{u}_o = -\frac{1}{2}R^{-1}B^{T}\nabla\phi^{T}(z)\hat{W}. \tag{16}$$
Meanwhile, $\hat{W}$ is updated by the adaptive update law
$$\dot{\hat{W}} = \Gamma\Big(\frac{1}{2}\nabla\phi(z)BR^{-1}B^{T}z - \kappa_1\hat{W} - \kappa_2\hat{W}\hat{W}^{T}\hat{W}\Big) \tag{17}$$
where $\Gamma \in \mathbb{R}^{l\times l}$ is a positive definite matrix, and $\kappa_1$ and $\kappa_2$ are positive constants.
Based on the above design, the ADP-based fixed-time optimal tracking controller is constructed as follows:
$$u_o = \hat{u}_o - B^{\dagger}\Big(\lambda\tanh\Big(\frac{z}{\rho}\Big) + \mu z + \alpha z^{p/q} + \beta z^{3}\Big) \tag{18}$$
where $B^{\dagger}$ is the pseudo-inverse of $B$; $\lambda$, $\mu$, $\alpha$, and $\beta$ are positive definite matrices with appropriate dimensions; $\tanh(z/\rho) = [\tanh(z_x/\rho), \tanh(z_y/\rho), \tanh(z_\theta/\rho)]^{T}$, $z^{3} = [z_x^{3}, z_y^{3}, z_\theta^{3}]^{T}$, $z^{p/q} = [z_x^{p/q}, z_y^{p/q}, z_\theta^{p/q}]^{T}$; $p$ and $q$ are positive odd integers satisfying $p < q$.
Up to now, the fixed-time optimal controller has been designed. To gain a deeper understanding of the controller’s structure, its primary framework is presented in Fig. 2.

C. System Stability Analysis
Theorem 1: Consider the tracking error system described by (5a)–(5c) for the nonholonomic mobile robot. The proposed ADP-based fixed-time optimal tracking controller (18) can stabilize the origin of the error system in fixed time independent of the initial condition. Specifically, the tracking error and the
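$$\begin{aligned}
\dot{V} &= \cdots + \frac{1}{2}R^{-1}B^{T}\nabla\sigma(z) + A - \mu z - \alpha z^{p/q} - \beta z^{3} - \lambda\tanh\Big(\frac{z}{\rho}\Big) - \tilde{W}^{T}\Gamma^{-1}\dot{\hat{W}} \\
&= \tilde{W}^{T}\Big(\frac{1}{2}\nabla\phi(z)BR^{-1}B^{T}z - \Gamma^{-1}\dot{\hat{W}}\Big) \\
&\quad + z^{T}\Big(Bu_o^{*} + A + \frac{1}{2}BR^{-1}B^{T}\nabla\sigma(z) - \mu z - \lambda\tanh\Big(\frac{z}{\rho}\Big) - \alpha z^{p/q} - \beta z^{3}\Big).
\end{aligned} \tag{20}$$
Then, substituting (17) into (20) yields
$$\begin{aligned}
\dot{V} &= \kappa_1\tilde{W}^{T}\hat{W} + \kappa_2\tilde{W}^{T}\hat{W}\hat{W}^{T}\hat{W} \\
&\quad + z^{T}\Big(Bu_o^{*} + A + \frac{1}{2}BR^{-1}B^{T}\nabla\sigma(z) - \mu z - \lambda\tanh\Big(\frac{z}{\rho}\Big) - \alpha z^{p/q} - \beta z^{3}\Big).
\end{aligned} \tag{21}$$
Based on the inequalities (39) and (42) obtained in the appendix, we can get
$$\begin{aligned}
\dot{V} &\le -\delta_1\Big(\frac{1}{2}\tilde{W}^{T}\Gamma^{-1}\tilde{W}\Big)^{\frac{p+q}{2q}} - \varphi_1\Big(\frac{1}{2}\tilde{W}^{T}\Gamma^{-1}\tilde{W}\Big)^{2} \\
&\quad + z^{T}\Big(Bu_o^{*} + \frac{1}{2}BR^{-1}B^{T}\nabla\sigma(z) + A - \mu z - \lambda\tanh\Big(\frac{z}{\rho}\Big) - \alpha z^{p/q} - \beta z^{3}\Big) + \varsigma
\end{aligned} \tag{22}$$
with
$$\varsigma = \frac{1}{2}\kappa_1 W^{T}W + \frac{q-p}{2q}\Big(\frac{p+q}{2q}\Big)^{\frac{p+q}{q-p}} + \sum_{i=1}^{l}\Big(\frac{\kappa_2 W_i^{4}}{12} + \frac{3\kappa_2 W_i^{4}}{4\,(\cdot)_i^{4}}\Big). \tag{23}$$
$$\cdots - \lambda\tanh\Big(\frac{z}{\rho}\Big) \le 0.2785\,\rho\,\iota^{T}\lambda\iota. \tag{24}$$
Then, substituting (24) into (22) yields
$$\begin{aligned}
\dot{V} &\le -\delta_1\Big(\frac{1}{2}\tilde{W}^{T}\Gamma^{-1}\tilde{W}\Big)^{\frac{p+q}{2q}} - \varphi_1\Big(\frac{1}{2}\tilde{W}^{T}\Gamma^{-1}\tilde{W}\Big)^{2} - \delta_2\Big(\frac{1}{2}z^{T}z\Big)^{\frac{p+q}{2q}} - \varphi_2\Big(\frac{1}{2}z^{T}z\Big)^{2} + \bar{\varsigma} \\
&\le -\delta V^{\frac{p+q}{2q}} - \varphi V^{2} + \bar{\varsigma}
\end{aligned} \tag{25}$$
where $\delta_2 = 2^{(p+q)/2q}\vartheta_{\min}(\alpha)$, $\varphi_2 = 4\vartheta_{\min}(\beta)$, $\bar{\varsigma} = \varsigma + 0.2785\,\rho\,\iota^{T}\lambda\iota$, $\delta = \min\{\delta_1, \delta_2\}$, $\varphi = \frac{1}{2}\min\{\varphi_1, \varphi_2\}$ (the factor $\frac{1}{2}$ follows from Lemma 2 with $n = 2$ and $a_2 = 2$), and $\vartheta_{\min}(\alpha)$ and $\vartheta_{\min}(\beta)$ are the minimum eigenvalues of $\alpha$ and $\beta$, respectively.
According to Lemma 2.2 in [37], the Lyapunov function $V$ can converge to the region described as
$$V \le \min\Big\{\Big(\frac{\bar{\varsigma}}{(1-\varepsilon)\delta}\Big)^{\frac{2q}{p+q}},\; \Big(\frac{\bar{\varsigma}}{(1-\varepsilon)\varphi}\Big)^{\frac{1}{2}}\Big\} \tag{26}$$
and the convergence time $T(z(0), \tilde{W}(0))$ satisfies
$$T(z(0), \tilde{W}(0)) \le T_{\max} := \frac{2q}{\delta\varepsilon(q-p)} + \frac{1}{\varphi\varepsilon} \tag{27}$$
where $\varepsilon \in (0, 1)$, and $z(0)$ and $\tilde{W}(0)$ are the initial errors.
According to (19) and (27), as $t \to T_{\max}$, the robot’s tracking error and neural network weight estimation error satisfy
$$\|z\| \le \min\Big\{\sqrt{2}\Big(\frac{\bar{\varsigma}}{(1-\varepsilon)\delta}\Big)^{\frac{q}{p+q}},\; \Big(\frac{4\bar{\varsigma}}{(1-\varepsilon)\varphi}\Big)^{\frac{1}{4}}\Big\}, \tag{28}$$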
$$\|\tilde{W}\| \le \frac{\min\Big\{\sqrt{2}\Big(\dfrac{\bar{\varsigma}}{(1-\varepsilon)\delta}\Big)^{\frac{q}{p+q}},\; \Big(\dfrac{4\bar{\varsigma}}{(1-\varepsilon)\varphi}\Big)^{\frac{1}{4}}\Big\}}{\sqrt{\vartheta_{\min}(\Gamma^{-1})}} \tag{29}$$
where $\vartheta_{\min}(\Gamma^{-1})$ represents the minimum eigenvalue of $\Gamma^{-1}$.
Then, the error between the ideal optimal controller (15) and the ADP-based fixed-time optimal controller (18) needs to be analyzed. According to (16) and (18), one has
$$u_o - u_o^{*} = \frac{1}{2}R^{-1}B^{T}\big(\nabla\phi^{T}(z)\tilde{W} + \nabla\sigma(z)\big) - B^{\dagger}\Big(\lambda\tanh\Big(\frac{z}{\rho}\Big) + \mu z + \alpha z^{p/q} + \beta z^{3}\Big). \tag{30}$$
Notice that $B$, $B^{\dagger}$, $\nabla\phi^{T}(z)$, and $\nabla\sigma(z)$ are function vectors with respect to $z$. Moreover, according to (28) and (29), $z$ and $\tilde{W}$ are bounded. Hence, it follows that $u_o - u_o^{*}$ is bounded, which implies that the fixed-time optimal controller $u_o$ designed using critic neural networks can fully approximate the ideal optimal controller $u_o^{*}$ by adjusting the relevant customized parameters. The proof of Theorem 1 is complete.
Remark 2: In some existing studies on ADP-based optimal control [32], [34], the neural network weight update law is designed based on the gradient descent algorithm, which requires the persistent/finite excitation condition. Although [38] removed the above excitation condition by introducing a positive definite function, it still requires neural networks with the same actor-critic structure as [34]. This complex structure is not favorable for engineering applications. In contrast, only a critic neural network is used in this study, which not only removes the excitation condition, but also simplifies the controller structure.
Remark 3: Although the optimal control performance can be obtained by some existing ADP-based optimal control methods [28], [29], [30], the convergence speed of the system has not been considered. Specifically, they can only guarantee uniformly ultimately bounded stability, so the system takes an infinitely long time to enter the final stable state. In this study, we concurrently consider both the convergence speed and the optimal control performance of the system, ensuring that both aspects are optimized for improved overall performance.
Remark 4: Utilizing the Lyapunov stability theory, it has been shown that both the tracking error and the weight estimation error of the system can converge to the compact sets described by (28) and (29), respectively. According to (27), the convergence time of the system is bounded, and the bound is independent of the system’s initial state. This is in contrast to the uniformly ultimately bounded stability and the finite-time stability [30], [31], [32]. From (27)–(29), one can choose the relevant parameters to achieve larger values of $\delta$ and $\varphi$ and smaller values of $\bar{\varsigma}$, thereby reducing the tracking and estimation errors. On the other hand, convergence can be accelerated by increasing the values of $\delta$ and $\varphi$. For instance, $\delta$ and $\varphi$ can be increased by raising $\vartheta_{\min}(\alpha)$, $\vartheta_{\min}(\beta)$, $\kappa_1$, and $\kappa_2$, and vice versa. However, excessively large values of $\delta$ and $\varphi$ may result in a large control input, potentially causing system chatter or even instability. Therefore, all parameters must be selected carefully to optimize the control performance.

IV. SIMULATIONS AND EXPERIMENTS
In order to verify the feasibility of the proposed optimal controller, simulations and physical experiments are conducted. We compare the proposed controller with two other controllers. One of them is the classical backstepping controller (marked as BC), which is designed as follows [35]:
$$\begin{bmatrix} v \\ \omega \end{bmatrix} = \begin{bmatrix} \xi_1 z_x + v_r\cos z_\theta \\ \omega_r + \xi_2 v_r z_y + \xi_3 v_r\sin z_\theta \end{bmatrix} \tag{31}$$
where $\xi_1$, $\xi_2$, and $\xi_3$ are positive constants; $z_\theta$, $z_x$, and $z_y$ are given in (3). The other one is the ADP-based optimal controller (marked as ADP-UUB) with uniformly ultimately bounded stability, which is obtained by modifying the proposed controller. To be specific, replace (17) and (18) with
$$\dot{\hat{W}} = \Gamma\Big(\frac{1}{2}\nabla\phi(z)BR^{-1}B^{T}z - \kappa_1\hat{W}\Big) \tag{32}$$
and
$$u_o = \hat{u}_o - B^{\dagger}\Big(\lambda\tanh\Big(\frac{z}{\rho}\Big) + \mu z + \alpha z\Big) \tag{33}$$
where the relevant parameter definitions are the same as (17) and (18). In addition, all the remaining designs are consistent with the proposed controller. To ensure the fairness of the comparison, the control parameters of ADP-UUB and the proposed controller are chosen consistently in the follow-up simulations and experiments.
Remark 5: Similar to the proof of Theorem 1, the ADP-UUB allows the Lyapunov function (19) to satisfy the inequality $\dot{V} \le -\delta_1 V + \delta_2$ with $\delta_1 > 0$ and $\delta_2 > 0$, which implies that the ADP-UUB could make the error system (4) uniformly ultimately bounded. Moreover, since the ADP-UUB is designed in the framework of optimal control, the system can also obtain optimal control performance. It is worth mentioning that ADP-UUB allows the robot’s tracking error to converge exponentially [30]. Theoretically, as time tends to infinity, the tracking error converges to a neighborhood of the origin. In contrast, the control method proposed in this study can achieve tracking error convergence in fixed time.

A. Simulation Results
The control task is to drive the mobile robot to follow the reference trajectory defined by $[x_r(0), y_r(0), \theta_r(0)]^{T} = [-2, -1.5, 0]^{T}$, $v_r = 0.3$ m/s, and $\omega_r$ is given as
$$\omega_r = \begin{cases} 0\ \text{rad/s}, & 0\ \text{s} \le t \le 15\ \text{s} \\ 0.2\ \text{rad/s}, & 15\ \text{s} < t \le 5\pi + 15\ \text{s} \\ 0\ \text{rad/s}, & 5\pi + 15\ \text{s} < t \le 5\pi + 30\ \text{s} \\ 0.2\ \text{rad/s}, & 5\pi + 30\ \text{s} < t \le 60\ \text{s}. \end{cases} \tag{34}$$
The initial position of the mobile robot is specified as $[x(0), y(0), \theta(0)]^{T} = [-2.1, -1.6, \pi/20]^{T}$. The basis function vector is designed as $\phi(z) = [z_x^{2}, z_y^{2}, z_\theta^{2}, z_x z_y, z_y z_\theta, z_x z_\theta]^{T}$. Hence, the corresponding partial derivative term is $\nabla\phi^{T}(z) = [2z_x, 0, 0, z_y, 0, z_\theta;\ 0, 2z_y, 0, z_x, z_\theta, 0;\ 0, 0, 2z_\theta, 0, z_y, z_x]$. The initial value of the weight estimation vector is $\hat{W}(0) = [0, 0, 0, 0, 0, 0]^{T}$. The parameters of the proposed optimal controller are chosen as follows: $p = 17$, $q = 19$, $R = 2I_3$, $Q = 10I_3$, $\lambda = \mu = \alpha = \mathrm{diag}(1, 1, 0.4)$, $\beta = \mathrm{diag}(10^{5}, 10^{5}, 1)$, $\rho = 10$, $\Gamma = 2I_6$, $\kappa_1 = \kappa_2 = 0.1$, where $I_N \in \mathbb{R}^{N\times N}$ is an identity matrix. The parameters of BC are selected as $\xi_1 = 3$ and $\xi_2 = \xi_3 = 8$.

Fig. 3. Simulation results: (a) tracking performance; (b) tracking error; (c) neural network weight estimation; (d) control input.
TABLE I
COMPARISON OF COST INDEXES IN SIMULATIONS
Fig. 4. Mobile robot experiment platform.
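The estimated control (16), the update law (17), and the controller (18) can be sketched in a few lines using the basis functions and simulation gains listed above. This is a minimal illustration rather than the authors' implementation: all function names are hypothetical, the odd-ratio power $z^{p/q}$ is evaluated as $\mathrm{sign}(z)|z|^{p/q}$, the pseudo-inverse is taken as $B^{\dagger} = (B^{T}B)^{-1}B^{T}$, and $R$ is assumed to be $2I_2$ so that its size matches $u_o \in \mathbb{R}^{2}$.

```python
import math

# Simulation gains quoted from the text.
P, Q_ODD = 17, 19                    # p, q: positive odd integers with p < q
LAM = MU = ALPHA = (1.0, 1.0, 0.4)   # diagonals of lambda, mu, alpha
BETA = (1e5, 1e5, 1.0)               # diagonal of beta
RHO = 10.0
GAMMA = 2.0                          # Gamma = 2 * I_6
KAPPA1 = KAPPA2 = 0.1
R_SCALE = 2.0                        # ASSUMPTION: R = 2 * I_2 (matches u_o in R^2)


def sig_pow(x, p, q):
    """x^(p/q) for odd p, q, evaluated as sign(x) * |x|^(p/q)."""
    return math.copysign(abs(x) ** (p / q), x)


def grad_phi_T(z):
    """3x6 matrix grad(phi)^T for phi = [zx^2, zy^2, zt^2, zx*zy, zy*zt, zx*zt]."""
    zx, zy, zt = z
    return [[2 * zx, 0.0, 0.0, zy, 0.0, zt],
            [0.0, 2 * zy, 0.0, zx, zt, 0.0],
            [0.0, 0.0, 2 * zt, 0.0, zy, zx]]


def B_T_times(z, g):
    """B(z)^T g, with B(z)^T = [[-1, 0, 0], [zy, -zx, -1]] from (5c)."""
    zx, zy, _ = z
    return [-g[0], zy * g[0] - zx * g[1] - g[2]]


def B_times(z, u):
    """B(z) u for a 2-vector u."""
    zx, zy, _ = z
    return [-u[0] + zy * u[1], -zx * u[1], -u[1]]


def pinv_B_times(z, g):
    """B^dagger g = (B^T B)^{-1} B^T g; det(B^T B) = 1 + zx^2 > 0 for every z."""
    zx, zy, _ = z
    b = B_T_times(z, g)
    det = 1.0 + zx * zx
    return [((zx * zx + zy * zy + 1.0) * b[0] + zy * b[1]) / det,
            (zy * b[0] + b[1]) / det]


def u_hat(z, W_hat):
    """Estimated optimal control (16): -1/2 R^{-1} B^T grad(phi)^T W_hat."""
    g = [sum(row[j] * W_hat[j] for j in range(6)) for row in grad_phi_T(z)]
    return [-0.5 / R_SCALE * b for b in B_T_times(z, g)]


def control(z, W_hat):
    """Controller (18): u_hat minus B^dagger applied to the fixed-time terms."""
    s = [LAM[i] * math.tanh(z[i] / RHO) + MU[i] * z[i]
         + ALPHA[i] * sig_pow(z[i], P, Q_ODD) + BETA[i] * z[i] ** 3
         for i in range(3)]
    return [uh - c for uh, c in zip(u_hat(z, W_hat), pinv_B_times(z, s))]


def W_hat_dot(z, W_hat):
    """Update law (17): Gamma (1/2 grad(phi) B R^{-1} B^T z - k1 W - k2 (W^T W) W)."""
    v = B_times(z, [b / R_SCALE for b in B_T_times(z, z)])  # B R^{-1} B^T z
    G = grad_phi_T(z)                                       # grad(phi) = G^T
    n2 = sum(w * w for w in W_hat)                          # W_hat^T W_hat
    return [GAMMA * (0.5 * sum(G[i][j] * v[i] for i in range(3))
                     - KAPPA1 * W_hat[j] - KAPPA2 * n2 * W_hat[j])
            for j in range(6)]


z0 = [0.1, 0.1, -math.pi / 20]   # hypothetical tracking error, not from the paper
W0 = [0.0] * 6                   # W_hat(0) as in the simulation setup
u0 = control(z0, W0)             # 2-vector of control corrections
dW0 = W_hat_dot(z0, W0)          # 6-vector of weight derivatives
```

In a closed-loop simulation these two functions would be evaluated at every integration step, with `W_hat` advanced by `W_hat_dot`. Because $\beta$ is large, an explicit integrator would likely need a small step size.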
The simulation results for the dynamic performance of the mobile robot are presented in Fig. 3. The motion trajectory of the center of mass in the $x$-$y$ plane is shown in Fig. 3(a). Meanwhile, the position tracking error and heading angle tracking error are depicted in Fig. 3(b). The results show that the proposed ADP-based fixed-time optimal tracking controller exhibits faster convergence and higher steady-state accuracy than the BC and ADP-UUB controllers. The learning process of neural network weight estimation is shown in Fig. 3(c), and it shows that all the weight estimations converge rapidly to different constants. The control input responses for the three controllers are shown in Fig. 3(d). To facilitate a more accurate comparison of the control performance between the three controllers, we define two specific performance indexes. Specifically, one is the control cost $J_c = \int_{t_0}^{t_f} u^{T}(\tau)Ru(\tau)\,d\tau$ and the other is the error cost $J_e = \int_{t_0}^{t_f} z^{T}(\tau)Qz(\tau)\,d\tau$, where $t_0$ and $t_f$ represent the initial and terminal moments, respectively. The performance indexes of the three controllers at different time periods are shown in Table I. The results show that compared to the other two controllers, the proposed controller requires more control cost in the first 30 s to achieve fast convergence of the tracking error. However, the error cost of the proposed controller is the smallest in both time periods.

B. Experiment Results
To test the performance of the proposed method in a practical application, an experimental platform as shown in Fig. 4 is built. The OptiTrack system consists of multiple cameras, which can measure the position and attitude information of a robot in real time and transmit it to the host computer. The robot is QBot 3 manufactured by Quanser and is connected to the host computer via a wireless network. The controller is constructed in the host computer and downloaded to the QBot 3 via the wireless network. The QBot 3 can receive real-time state information from the host computer, thus realizing closed-loop control.
The desired trajectory of the robot is defined as $[x_r(0), y_r(0), \theta_r(0)]^{T} = [1.5, 0, \pi/2]^{T}$, $v_r = 0.3$ m/s, $\omega_r = 0.4$ rad/s. The initial position of the robot is specified as $[x(0), y(0), \theta(0)]^{T} = [0.5, 0, \pi/2]^{T}$. The neural network basis functions are chosen in the same way as in the simulation. The parameters of the proposed optimal controller are chosen as follows: $\hat{W}(0) = [0, 0, 0, 0, 0, 0]^{T}$, $p = 17$, $q = 19$, $R = 2I_3$, $Q = 10I_3$, $\lambda = \mu = \alpha = \mathrm{diag}(0.1, 0.1, 0.2)$, $\beta = \mathrm{diag}(200, 200, 0.3)$, $\rho = 1$, $\Gamma = 2I_6$, and $\kappa_1 = \kappa_2 = 0.1$. The parameters of BC are selected as $\xi_1 = \xi_2 = \xi_3 = 1.5$.
The experimental outcomes are depicted in Table II and Fig. 5. All closed-loop system signals remain bounded. According to the robot motion trajectory depicted in Fig. 5(a), the proposed controller and ADP-UUB can enable the robot to follow the desired trajectory rapidly, while the conventional BC cannot. Moreover, owing to the advantage of fixed-time stability, the proposed controller exhibits a faster convergence speed than that of the ADP-UUB. The tracking error is depicted in Fig. 5(b). Fig. 5(c) shows that the weight estimates of the neural network converge to a stable state after a period of learning. The control signals are depicted in Fig. 5(d). Additionally, a comparison of the cost indexes of the three controllers is shown in Table II. According to the table, the error indexes of the

TABLE II
COMPARISON OF COST INDEXES IN EXPERIMENTS
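The two cost indexes used in Tables I and II, $J_c$ and $J_e$, are integrals of quadratic forms, so they can be approximated from logged samples. The sketch below uses the trapezoidal rule and assumes the weighting matrix is a scalar multiple of the identity, as with the $R$ and $Q$ values quoted above. The function name and the sample signals are hypothetical, and the integration scheme is an assumption since the text does not state how the indexes were computed.

```python
def cost_index(times, samples, weight):
    """Trapezoidal approximation of the integral of x^T (weight * I) x over time.

    times   -- increasing list of sample instants
    samples -- list of vectors x(t_k), one per sample instant
    weight  -- scalar w such that the weighting matrix is w * I
    """
    total = 0.0
    for k in range(1, len(times)):
        f_prev = weight * sum(x * x for x in samples[k - 1])
        f_curr = weight * sum(x * x for x in samples[k])
        total += 0.5 * (f_prev + f_curr) * (times[k] - times[k - 1])
    return total


# Hypothetical logged data on [0, 1] s, for illustration only.
ts = [k / 1000 for k in range(1001)]
u_log = [[1.0, 0.0] for _ in ts]          # constant control input
z_log = [[t, 0.0, 0.0] for t in ts]       # linearly growing error

J_c = cost_index(ts, u_log, 2.0)    # control cost with R = 2 * I
J_e = cost_index(ts, z_log, 10.0)   # error cost with Q = 10 * I
```

For these test signals the exact values are $J_c = 2$ and $J_e = 10/3$, which gives a quick check of the integration step before applying the function to real logs.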
Fig. 5. Experimental results: (a) tracking performance; (b) tracking error; (c) neural network weight estimation; (d) control input.
TABLE III
COST INDEXES FOR DIFFERENT CONTROL GAINS IN EXPERIMENTS
By the definition of Ŵ, one has
$$\tilde{W}_i\hat{W}_i^{3} = \tilde{W}_i\big(W_i^{3} + 3W_i\tilde{W}_i^{2} - 3W_i^{2}\tilde{W}_i - \tilde{W}_i^{3}\big). \tag{35}$$
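The identity (35) is a direct binomial expansion. The short derivation below assumes that the weight estimation error is defined as $\tilde{W} = W - \hat{W}$, which is the sign convention consistent with the term $-\tilde{W}^{T}\Gamma^{-1}\dot{\hat{W}}$ in the Lyapunov derivative; the definition itself does not appear in this part of the text.

```latex
\begin{aligned}
\hat{W}_i &= W_i - \tilde{W}_i, \\
\hat{W}_i^{3} &= (W_i - \tilde{W}_i)^{3}
  = W_i^{3} - 3W_i^{2}\tilde{W}_i + 3W_i\tilde{W}_i^{2} - \tilde{W}_i^{3}, \\
\tilde{W}_i\hat{W}_i^{3} &= \tilde{W}_i W_i^{3} - 3W_i^{2}\tilde{W}_i^{2}
  + 3W_i\tilde{W}_i^{3} - \tilde{W}_i^{4}.
\end{aligned}
```

The last line contains the negative definite term $-\tilde{W}_i^{4}$. The remaining cross terms are the ones that would need to be bounded in terms of $W_i^{4}$ and $\tilde{W}_i^{4}$, for example with Lemma 1.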
Hence, one can obtain the inequality
$$\kappa_1\tilde{W}^{T}\hat{W} \le \frac{1}{2}\kappa_1 W^{T}W - \delta_1\Big(\frac{1}{2}\tilde{W}^{T}\Gamma^{-1}\tilde{W}\Big)^{\frac{p+q}{2q}} + \frac{q-p}{2q}\Big(\frac{p+q}{2q}\Big)^{\frac{p+q}{q-p}} \tag{42}$$
where $\delta_1 = (\kappa_1/\vartheta_{\max}(\Gamma^{-1}))^{(p+q)/2q}$, and $\vartheta_{\max}(\Gamma^{-1})$ represents the maximum eigenvalue of $\Gamma^{-1}$.

REFERENCES
[1] Z. Li, W. Gao, C. Goh, M. Yuan, E. K. Teoh, and Q. Ren, “Asymptotic stabilization of nonholonomic robots leveraging singularity,” IEEE Robot. Autom. Lett., vol. 4, no. 1, pp. 41–48, Jan. 2019.
[2] S. J. Yoo, “Adaptive tracking control for a class of wheeled mobile robots with unknown skidding and slipping,” IET Control Theory Appl., vol. 4, no. 10, pp. 2109–2119, Oct. 2010.
[3] H. Gao, X. Song, L. Ding, K. Xia, N. Li, and Z. Deng, “Adaptive motion control of wheeled mobile robot with unknown slippage,” Int. J. Control, vol. 87, no. 8, pp. 1513–1522, Feb. 2014.
[4] C. Chen, H. Gao, L. Ding, W. Li, H. Yu, and Z. Deng, “Trajectory tracking control of WMRs with lateral and longitudinal slippage based on active disturbance rejection control,” Robot. Auton. Syst., vol. 107, pp. 236–245, Sep. 2018.
[5] M. Mera, H. Ríos, and E. A. Martínez, “A sliding-mode based controller for trajectory tracking of perturbed unicycle mobile robots,” Control Eng. Pract., vol. 102, Sep. 2020, Art. no. 104548.
[6] P. Rochel, H. Ríos, M. Mera, and A. Dzul, “Trajectory tracking for uncertain unicycle mobile robots: A super-twisting approach,” Control Eng. Pract., vol. 122, May 2022, Art. no. 105078.
[7] H. Yang, X. Fan, Y. Xia, and C. Hua, “Robust tracking control for wheeled mobile robot based on extended state observer,” Adv. Robot., vol. 30, no. 1, pp. 68–78, Jan. 2016.
[8] H. Wang, P. X. Liu, X. Zhao, and X. Liu, “Adaptive fuzzy finite-time control of nonlinear systems with actuator faults,” IEEE Trans. Cybern., vol. 50, no. 5, pp. 1786–1797, May 2020.
[9] H. Wang, W. Liu, and M. Tong, “Adaptive fuzzy fast finite-time output-feedback tracking control for switched nonlinear systems with full-state constraints,” IEEE Trans. Fuzzy Syst., vol. 32, no. 3, pp. 958–968, Mar. 2024.
[10] J. Wang, C. Wang, C. L. P. Chen, Z. Liu, and C. Zhang, “Fast finite-time event-triggered consensus control for uncertain nonlinear multiagent systems with full-state constraints,” IEEE Trans. Circuits Syst. I., Reg. Papers, vol. 70, no. 3, pp. 1361–1370, Mar. 2023.
[11] S. Li and Y.-P. Tian, “Finite-time stability of cascaded time-varying systems,” Int. J. Control, vol. 80, no. 4, pp. 646–657, Apr. 2007.
[12] D. Wu, Y. Cheng, H. Du, W. Zhu, and M. Zhu, “Finite-time output feedback tracking control for a nonholonomic wheeled mobile robot,” Aerosp. Sci. Technol., vol. 78, pp. 574–579, Jul. 2018.
[13] H. Wang, K. Xu, and H. Zhang, “Adaptive finite-time tracking control of nonlinear systems with dynamics uncertainties,” IEEE Trans. Autom. Control, vol. 68, no. 9, pp. 5737–5744, Sep. 2023.
[14] Y.-X. Li, Z. Hou, W.-W. Che, and Z.-G. Wu, “Event-based design of finite-time adaptive control of uncertain nonlinear systems,” IEEE Trans. Neural Netw. Learn. Syst., vol. 33, no. 8, pp. 3804–3813, Aug. 2022.
[15] C. P. Vo and K. K. Ahn, “An adaptive finite-time force-sensorless tracking control scheme for pneumatic muscle actuators by an optimal force estimation,” IEEE Robot. Autom. Lett., vol. 7, no. 2, pp. 1542–1549, Apr. 2022.
[16] A. Polyakov, “Nonlinear feedback design for fixed-time stabilization of linear control systems,” IEEE Trans. Autom. Control, vol. 57, no. 8, pp. 2106–2110, Aug. 2012.
[17] H. Wang, J. Ma, X. Zhao, B. Niu, M. Chen, and W. Wang, “Adaptive fuzzy fixed-time control for high-order nonlinear systems with sensor and actuator faults,” IEEE Trans. Fuzzy Syst., vol. 31, no. 8, pp. 2658–2668, Aug. 2023.
[18] H. Wang and L. Shen, “Adaptive fuzzy fixed-time tracking control for nonlinear systems with time-varying full-state constraints and actuator hysteresis,” IEEE Trans. Fuzzy Syst., vol. 31, no. 4, pp. 1352–1361, Apr. 2023.
[19] C. Wang, Q. Guo, J. Wang, Z. Liu, and C. L. P. Chen, “Fixed-time fuzzy control for uncertain nonlinear systems with prescribed performance and event-triggered communication,” IEEE Trans. Circuits Syst. I., Reg. Papers, vol. 71, no. 5, pp. 2362–2371, May 7, 2024.
[20] M. Chen, H. Wang, and X. Liu, “Adaptive practical fixed-time tracking control with prescribed boundary constraints,” IEEE Trans. Circuits Syst. I., Reg. Papers, vol. 68, no. 4, pp. 1716–1726, Apr. 2021.
[21] Y. Zhang and F. Wang, “Observer-based fixed-time neural control for a class of nonlinear systems,” IEEE Trans. Neural Netw. Learn. Syst., vol. 33, no. 7, pp. 2892–2902, Jul. 2022.
[22] J. Zhang, F. Xu, X. Liu, S. Gu, and H. Geng, “Fixed-time dynamic surface control for pneumatic manipulator system with unknown disturbances,” IEEE Robot. Autom. Lett., vol. 7, no. 4, pp. 10890–10897, Oct. 2022.
[23] C. Wang, H. Zhan, Q. Guo, and T. Li, “Distributed neural fixed-time consensus control of uncertain multiple Euler-Lagrange systems with event-triggered mechanism,” IEEE/ASME Trans. Mechatron., early access, Jun. 25, 2024, doi: 10.1109/TMECH.2024.3410299.
[24] Z. Lv, Y. Wu, X.-M. Sun, and Q.-G. Wang, “Fixed-time control for a quadrotor with a cable-suspended load,” IEEE Trans. Intell. Transp. Syst., vol. 23, no. 11, pp. 21932–21943, Nov. 2022.
[25] H. Dong, X. Zhao, Q. Hu, H. Yang, and P. Qi, “Learning-based attitude tracking control with high-performance parameter estimation,” IEEE Trans. Aerosp. Electron. Syst., vol. 58, no. 3, pp. 2218–2230, Jun. 2022.
[26] R. E. Bellman, Dynamic Programming. Princeton, NJ, USA: Princeton Univ. Press, 1957.
[27] D. Liu, D. Wang, F.-Y. Wang, H. Li, and X. Yang, “Neural-network-based online HJB solution for optimal robust guaranteed cost control of continuous-time uncertain nonlinear systems,” IEEE Trans. Cybern., vol. 44, no. 12, pp. 2834–2847, Dec. 2014.
[28] D. Wang and C. Mu, “Adaptive-critic-based robust trajectory tracking of uncertain dynamics and its application to a spring-mass-damper system,” IEEE Trans. Ind. Electron., vol. 65, no. 1, pp. 654–663, Jan. 2018.
[29] C. Mu, Y. Zhang, Z. Gao, and C. Sun, “ADP-based robust tracking control for a class of nonlinear systems with unmatched uncertainties,” IEEE Trans. Syst. Man Cybern. Syst., vol. 50, no. 11, pp. 4056–4067, Nov. 2020.
[30] G. Wen, S. S. Ge, and F. Tu, “Optimized backstepping for tracking control of strict-feedback systems,” IEEE Trans. Neural Netw. Learn. Syst., vol. 29, no. 8, pp. 3850–3862, Aug. 2018.
[31] Y. Li, K. Li, and S. Tong, “Reinforcement learning-based adaptive finite-time performance constraint control for nonlinear systems,” IEEE Trans. Syst. Man Cybern. Syst., vol. 54, no. 2, pp. 1335–1344, Feb. 2024.
[32] H. Dong, X. Zhao, and B. Luo, “Optimal tracking control for uncertain nonlinear systems with prescribed performance via critic-only ADP,” IEEE Trans. Syst. Man Cybern. Syst., vol. 52, no. 1, pp. 561–573, Jan. 2022.
[33] B. Xiao, H. Zhang, Z. Chen, and L. Cao, “Fixed-time fault-tolerant optimal attitude control of spacecraft with performance constraint via reinforcement learning,” IEEE Trans. Aerosp. Electron. Syst., vol. 59, no. 6, pp. 7715–7724, Dec. 2023.
[34] R. Kamalapurkar, H. S. D. Bhasin, and W. E. Dixon, “Approximate optimal trajectory tracking for continuous-time nonlinear systems,” Automatica, vol. 51, pp. 40–48, Jan. 2015.
[35] C.-Y. Chen, T.-H. S. Li, Y.-C. Yeh, and C.-C. Chang, “Design and implementation of an adaptive sliding-mode dynamic controller for wheeled mobile robots,” Mechatron., vol. 19, no. 2, pp. 156–166, Mar. 2009.
[36] J. Zhang, S. Li, and Z. Xiang, “Adaptive fuzzy output feedback event-triggered control for a class of switched nonlinear systems with sensor failures,” IEEE Trans. Circuits Syst. I., Reg. Papers, vol. 67, no. 12, pp. 5336–5346, Dec. 2020.
[37] D. Ba, Y. Li, and S. Tong, “Fixed-time adaptive neural tracking control for a class of uncertain nonstrict nonlinear systems,” Neurocomputing, vol. 363, pp. 273–280, Oct. 2019.
[38] S. Yang, F. Yu, H. Liu, H. Ma, and H. Zhang, “Adaptive-dynamic-programming-based robust control for a quadrotor UAV with external disturbances and parameter uncertainties,” Appl. Sci., vol. 13, no. 23, Nov. 2023, Art. no. 12672.
[39] W. Liu, X. Wang, and S. Li, “Formation control for leader-follower wheeled mobile robots based on embedded control technique,” IEEE Trans. Control Syst. Technol., vol. 31, no. 1, pp. 265–280, Jan. 2023.