Relay-Assisted Federated Edge Learning

In this paper, we study a relay-assisted federated edge learning (FEEL) network under latency and bandwidth constraints. In this network, N users collaboratively train a global model assisted by M intermediate relays and one edge server. We firstly propose partial aggregation and spectrum resource multiplexing at the relays in order to improve the communication of the relay-assisted FEEL system. Furthermore, we derive analytical and asymptotic expressions of the system outage probability and con


Relay-assisted federated edge learning: performance analysis and system optimization
Chen, L., Fan, L., Lei, X., Duong, T. Q., Nallanathan, A., & Karagiannidis, G. K. (2023). Relay-assisted federated
edge learning: performance analysis and system optimization. IEEE Transactions on Communications.
https://doi.org/10.1109/tcomm.2023.3263566

Published in:
IEEE Transactions on Communications

Document Version:
Peer reviewed version

Queen's University Belfast - Research Portal:

Link to publication record in Queen's University Belfast Research Portal

Publisher rights
© 2023 IEEE.
This work is made available online in accordance with the publisher’s policies. Please refer to any applicable terms of use of the publisher.

General rights
Copyright for the publications made accessible via the Queen's University Belfast Research Portal is retained by the author(s) and / or other
copyright owners and it is a condition of accessing these publications that users recognise and abide by the legal requirements associated
with these rights.

Take down policy

The Research Portal is Queen's institutional repository that provides access to Queen's research output. Every effort has been made to
ensure that content in the Research Portal does not infringe any person's rights, or applicable UK laws. If you discover content in the
Research Portal that you believe breaches copyright or violates any law, please contact [email protected].

Download date:30. May. 2023

Relay-Assisted Federated Edge Learning: Performance Analysis and System Optimization
Lunyuan Chen, Lisheng Fan, Xianfu Lei, Trung Q. Duong, Fellow, IEEE, Arumugam Nallanathan, Fellow, IEEE,
and George K. Karagiannidis, Fellow, IEEE

Abstract—In this paper, we study a relay-assisted federated edge learning (FEEL) network under latency and bandwidth constraints. In this network, N users collaboratively train a global model assisted by M intermediate relays and one edge server. We first propose partial aggregation and spectrum resource multiplexing at the relays in order to improve the communication of the relay-assisted FEEL system. Furthermore, we derive analytical and asymptotic expressions of the system outage probability and convergence rate. To improve the system performance, we further optimize the relay-assisted FEEL network by maximizing the number of users who participate in each round of federated learning, through allocation of the wireless bandwidth among users and relays. Specifically, two bandwidth allocation (BA) schemes are proposed, assuming either instantaneous or statistical channel state information (CSI). Simulations show the advantages of the proposed BA schemes over other benchmarks, regarding the accuracy and convergence rate of the considered relay-assisted FEEL network.

Index Terms—Federated learning, edge learning, relay, outage probability, Internet of Things.

L. Chen and L. Fan are both with the School of Computer Science, Guangzhou University, Guangzhou 510006, China (e-mail: [email protected], [email protected]).
X. Lei is with the School of Information Science and Technology, Institute of Mobile Communications, Southwest Jiaotong University, Chengdu 610031, China (e-mail: [email protected]).
T. Q. Duong is with the School of Electronics, Electrical Engineering and Computer Science, Queen's University Belfast, Belfast, BT7 1NN, UK and also with the Department of Electronic Engineering, Kyung Hee University, Yongin-si, Gyeonggi-do 17104, South Korea (e-mail: [email protected]).
A. Nallanathan is with the School of Electronic Engineering and Computer Science, Queen Mary University of London, London, U.K. (e-mail: [email protected]).
G. K. Karagiannidis is with Aristotle University of Thessaloniki, Greece and is also with Cyber Security Systems and Applied AI Research Center, Lebanese American University (AUL), Lebanon (e-mail: [email protected]).
The work of Lisheng Fan was supported in part by NSFC under Grant 62271158 and Grant 62101145; and in part by the Natural Science Foundation of Guangdong Province under Grant 2021A1515011392. The work of Xianfu Lei was supported in part by the National Natural Science Foundation of China under Grants 61971360 and 62271420.
The corresponding author of this paper is L. Fan.

I. INTRODUCTION

Recently, fast-growing applications of the Internet of Things (IoT) have generated an explosive amount of data to drive artificial intelligence (AI), widely applied in wireless communication, image processing and other fields [1]–[4]. Centralized AI applications need to aggregate distributed data from users at the server for training, which is hard to achieve due to privacy concerns. To tackle this issue, an intelligent paradigm, namely federated learning (FL), was proposed to enable multiple users to train a global model without transmitting the sensitive data [5]–[8]. In this framework, the FL server periodically selects some users as the candidates to join each round's training. Then, the selected users calculate the training loss, update the weights and transmit the local models to the server. Once the models are received, the server aggregates them and repeats the whole procedure until it converges [9]–[11].

At the same time, mobile edge computing (MEC) has become one of the most advanced technologies for reducing communication latency and energy consumption [12], [13]. For example, MEC could be used for video transmission to suppress jamming [14], where the compression parameter and power control were optimized by reinforcement learning. Besides, a similar concept was used to decide offloading against jamming attacks and interference in [15], which could achieve a significant reduction in latency and energy consumption. Therefore, FL can be used in MEC scenarios, where the mobile users perform distributed learning and transmit the trained models to be aggregated at the edge server, called federated edge learning (FEEL) [16]–[18]. The FEEL performance depends on the number of users that successfully participate in the federated learning, which is however limited by the communication overhead, due to practical constraints such as latency and bandwidth [19]–[21]. To reduce the communication overhead, a physical-layer quantization scheme was proposed to upload training models, where the trade-off between FEEL performance and quantization ratio was revealed [22]. Also, to further cope with this overhead, the system resources of FEEL networks can be exploited to support more users to successfully participate in the federated learning [23], [24]. For instance, the trade-off between the communication overhead and computational capability was investigated in [25], by dividing the deep model into several sub-models, where the authors enabled heterogeneous mobile users to select models of appropriate size to reduce the amount of transmitted data. In addition, system resources such as bandwidth can be optimized among the users, in order to meet practical requirements such as latency and energy consumption, by exploiting the channel state information (CSI) [26], [27].

Besides the above techniques, relays can be deployed in FEEL to decrease the communication overhead and thus enhance the system communication and learning performance. In recent works, relaying has been proposed as an effective technology in wireless communication systems to extend
coverage and improve reliability without requiring additional power [28]–[30]. In the relay-assisted FEEL, some intermediate relays can be deployed to assist the communication between mobile users and the edge server. In this aspect, a FEEL network which exploits cooperative relaying with service pricing was presented in [31], where the relays only help the data communication during the model update. In addition, a relay-assisted FEEL system was investigated in [32], where multiple relays were used to improve the over-the-air computation performance. Besides assisting the data communication, the relays in the FEEL networks can help perform partial aggregation in order to reduce the total amount of data required for transmission. In this aspect, a two-tier relay-assisted FL framework was proposed in [33], where the relays assisted the model aggregation for the local gradients to achieve a partially synchronized parallel mechanism. In addition, federated learning aggregation was explored in [34] for device-to-device (D2D) communications across the wireless devices, where partial gradient aggregation was used at the relays to assist the uplink. However, so far, to the best of our knowledge, there has been little work on the relay-assisted FEEL system with limited resources, especially about the framework of performance analysis and system optimization.

In this paper, we study a relay-assisted FEEL network under latency and bandwidth constraints, where N users collaboratively train a global model assisted by one edge server and M intermediate relays. For the relay-assisted FEEL system, we propose a novel framework for the performance analysis and system optimization. Specifically, we begin with the first critical question: "How to design a relay-assisted FEEL system that can make full use of the relays in the edge environment with limited resources?". To answer this question, we propose to use partial aggregation and spectrum resource multiplexing at the relays to enhance the communication of the relay-assisted FEEL system. We then study the second important question: "How to evaluate the system performance of the relay-assisted FEEL?". To answer this question, we provide the analysis of outage probability and perform convergence analysis to reveal the impact of outage probability on the convergence rate of federated learning. Driven by the system performance analysis, we come to the third important question: "How to optimize the FEEL performance by scheduling the system bandwidth resources?". To answer this question, we provide instantaneous and statistical bandwidth allocation (BA) schemes, which can be applied depending on the specific requirements of communication and computing scenarios. Simulation results are finally provided to illustrate the advantages of the instantaneous and statistical BA schemes.

II. RELAY ASSISTED FEDERATED EDGE LEARNING

In this section, the system model of the relay-assisted FEEL network is first presented, and then the conventional federated learning is introduced. After that, we present the procedure for the relay selection and partial aggregation.

A. System Model

A relay-assisted FEEL network is shown in Fig. 1, where N users collaboratively train a global model assisted by one edge server ES and M intermediate decode-and-forward (DF) relays¹. Due to severe fading, there is no direct link between the ES and users, i.e., the ES can only communicate with the users via relaying links. Besides assisting the data communication, the relays can also perform the model aggregation, in order to reduce the communication and computing overhead at the server. Let $\mathcal{U} \triangleq \{U_1, U_2, \ldots, U_N\}$ denote the set of N users, where device $U_k \in \mathcal{U}$ has a local trainable dataset $\mathcal{D}_k$ and can perform local stochastic gradient descent (SGD) on $\mathcal{D}_k$. In addition, we use $\mathcal{R} \triangleq \{R_1, R_2, \ldots, R_M\}$ to denote the relay set, in which each relay can receive the local models from the N IoT users through wireless links. Then, the relays can perform partial aggregation and further transmit the aggregated models to the ES for global aggregation through wireless links. Due to the size limitation, each node in the system is assumed to have a single antenna.

Fig. 1. Relay-assisted federated edge learning (FEEL). [Diagram: edge server ES at the top; relays R1, R2, ..., RM in the middle tier; users U1, U2, U3, ..., UN-1, UN at the bottom.]

¹ It is straightforward to adopt the DF relaying protocol to decode and recover the model weights, in order to aggregate models at the relays in this paper. Note that this work can be extended to other relaying protocols, like the AF protocol, with some minor modification. In particular, we can use the summation property of wireless channels and introduce over-the-air computing technology to aggregate models without decoding and re-modulation.

B. Conventional Federated Learning

In the conventional FL network, multiple users with distributed data train a global model assisted by an edge server, where the intermediate relays are not involved. For such a network, the FL can be described by the following problem

$$\min_{w} F(w) = \sum_{k=1}^{N} \frac{|\mathcal{D}_k|}{|\mathcal{D}|} F_k(w), \tag{1}$$

where $w$ represents the global model parameter, $|\mathcal{D}_k|$ denotes the training sample amount in user $U_k$, and $|\mathcal{D}| = \sum_{k=1}^{N} |\mathcal{D}_k|$. Notation $F_k(w)$ is the local loss function of user $U_k$,

$$F_k(w_k) = \frac{1}{|\mathcal{D}_k|} \sum_{x \in \mathcal{D}_k} L(w_k, x). \tag{2}$$
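As a quick sanity check on the weighting in (1) and (2), the following minimal sketch evaluates the global objective as the $|\mathcal{D}_k|/|\mathcal{D}|$-weighted sum of local losses and compares it with the plain average loss over the pooled data. The squared-error loss, the scalar model `w`, the function names, and the toy datasets are assumptions made only for illustration; they are not taken from the paper.

```python
# Illustrative check of the FL objective (1)-(2); loss and data are hypothetical.

def sample_loss(w: float, x: float) -> float:
    """Assumed per-sample loss L(w, x): squared error of a scalar model."""
    return (w - x) ** 2


def local_loss(w: float, dataset: list[float]) -> float:
    """F_k(w) in (2): average per-sample loss over the local dataset D_k."""
    return sum(sample_loss(w, x) for x in dataset) / len(dataset)


def global_loss(w: float, datasets: list[list[float]]) -> float:
    """F(w) in (1): local losses weighted by |D_k| / |D|."""
    total = sum(len(d) for d in datasets)
    return sum(len(d) / total * local_loss(w, d) for d in datasets)


# Three users with unequal amounts of data (toy values).
datasets = [[1.0, 2.0], [3.0], [4.0, 5.0, 6.0]]
w = 2.5

# Pool all samples as a centralized server would, then average the loss.
pooled = [x for d in datasets for x in d]
pooled_loss = local_loss(w, pooled)

# The weighted sum in (1) coincides with the average loss over all samples.
assert abs(global_loss(w, datasets) - pooled_loss) < 1e-12
```

Under these assumptions, the check suggests why (1) weights each user by its share of the data: the federated objective then matches the loss a centralized learner would see on the pooled samples.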

In (2), $w_k$ denotes the model parameter of user $U_k$, and $L(w_k, x)$ is the corresponding loss function. As the data is distributed, it is generally difficult to solve (1) directly. Hence, FL employs an iterative algorithm to train a global model from the users. Specifically, in each round, user $U_k$ calculates the training loss, and then the weights are updated using gradient descent as

$$v_k \leftarrow w_k - \eta \nabla F_k(w_k), \tag{3}$$

where $v_k$ is the updated model parameter of user $U_k$ and $\eta$ denotes the learning rate. After that, the updated local models from multiple users are gathered and aggregated at ES.

C. Relay-assisted FEEL

In the considered relay-assisted FEEL, the intermediate relays cooperatively assist the model exchange between users and ES, to extend the coverage and enhance the transmission reliability. Moreover, the relays can perform the aggregation early to cut down the communication cost.

Next, we present in detail the procedure of the FEEL assisted by the relays under the paradigm of FedAvg. Specifically, the global model parameter is initialized to $w^0$, and then the global model is updated in a number of rounds. At each round, we can divide the model update into the following four steps.

1) User sampling and model broadcast: At this step, ES first selects a group of users for each round $t$. It then broadcasts the global model parameters $w^t$ of the previous round to the selected users with the help of the relays. In particular, ES may uniformly select the user subset $\mathcal{K}$ out of N users without replacement, where $|\mathcal{K}| = K$ is the number of users in the subset $\mathcal{K}$. Note that the uniform selection can be applied to many scenarios where the importance of users is unknown or identical [35]–[37], and it can guarantee the unbiasedness of the model aggregation with full client participation in each round. For other scenarios where the users have different importance, importance-aware scheduling can be adopted to enhance the federated learning performance.

2) Local model update: At this step, user $U_k$ first sets the initial local model parameters as $w_k^{t+1} = w^t$, after receiving the global model parameters $w^t$ from ES. Then, user $U_k$ trains its model on its local dataset. Specifically, user $U_k$ conducts $E$ epochs of SGD on its local dataset, which amounts to $e_k = E\,|\mathcal{D}_k|/b$ SGD iterations in total, where $b$ is the mini-batch size. Therefore, the local model is updated $e_k$ times, and each SGD iteration performs

$$v_k^{t+1,j+1} \leftarrow v_k^{t+1,j} - \eta_{t+1} \nabla F_k(w_k^{t+1,j}; \xi_k), \tag{4}$$

where $j \in \{1, \cdots, e_k\}$ is the local SGD iteration index, and $\xi_k$ is the data batch uniformly chosen from the local dataset $\mathcal{D}_k$.

3) Relay selection and partial aggregation: After finishing the local update, user $U_k$ needs to transmit its updated weight $v_k^{t+1}$ to a selected intermediate relay $R_m$. Let $\mathcal{J}_m$ denote the subset of users uploading to relay $R_m$, and let $|\mathcal{D}_{\mathcal{J}_m}|$ be the total training sample amount in the user subset $\mathcal{J}_m$. After receiving and decoding the local models, relay $R_m$ will aggregate the collected models, where any suitable aggregation method can be applied for the considered relay-assisted federated framework. Without loss of generality, the well-known FedAvg is adopted in this work to aggregate the local trained models, given by

$$w_m^{t+1} = \sum_{U_k \in \mathcal{J}_m} \frac{|\mathcal{D}_k|}{\sum_{U_k \in \mathcal{J}_m} |\mathcal{D}_k|}\, v_k^{t+1}, \tag{5}$$

where the aggregation at the relay $R_m$ is synchronized, which can help reduce the communication overhead and avoid model staleness, in contrast with asynchronous federated learning. Although synchronous federated learning has the limitation of waiting for slow learners, i.e., stragglers, this limitation can be alleviated by setting a latency threshold to drop slow users and by using proper resource allocation to avoid a long waiting time.

Note that in the above FedAvg, the problem of "objective inconsistency" may arise due to the heterogeneity in the size of the local dataset and the number of local SGD iterations among users [38]. This is because the aggregated model will be biased towards the users with more SGD iterations, which eventually affects the federated learning performance. To tackle this problem, we can follow [34], [38] and use normalized cumulative gradients in place of FedAvg, given by

$$w_m^{t+1} = w^t + \left( \sum_{U_k \in \mathcal{J}_m} \frac{|\mathcal{D}_k|}{|\mathcal{D}_{\mathcal{J}_m}|}\, e_k \right) \sum_{U_k \in \mathcal{J}_m} \frac{|\mathcal{D}_k|}{|\mathcal{D}_{\mathcal{J}_m}|}\, \frac{v_k^{t+1} - w^t}{e_k}. \tag{6}$$

If not specified, FedAvg will be used for aggregating the local trained models in the subsequent sections.

4) Global Aggregation: At this step, each relay needs to send its aggregated model to the edge server via the second-hop relaying link. After gathering all the models from the relays, the ES can perform the aggregation as

$$w^{t+1} = \sum_{R_m \in \mathcal{R}} \frac{|\mathcal{D}_{\mathcal{J}_m}|}{\sum_{R_m \in \mathcal{R}} |\mathcal{D}_{\mathcal{J}_m}|}\, w_m^{t+1}. \tag{7}$$

D. Problem Formulation

For the considered relay-assisted FEEL system under latency and bandwidth constraints, we can optimize the system performance through minimizing the global loss function, given by

$$\mathbf{P0}: \ \min \ \frac{1}{|\mathcal{D}|} \sum_{k=1}^{N} \sum_{x \in \mathcal{D}_k} L(w_k, x). \tag{8}$$

However, obtaining an exact expression for the global loss function of FEEL is generally hard, which causes much difficulty in solving the optimization in problem P0. To overcome this difficulty, we turn to analyzing the system performance, as shown in the following section.

III. SYSTEM PERFORMANCE ANALYSIS

A. Latency analysis

The latency is a critical performance metric in the FEEL network, as it determines whether the users can finish the model training and model upload in time or not. When the
devices fail to accomplish uploading in time, the effective number of successfully participating users will decrease, causing deterioration in the convergence of federated learning.

In the considered relay-assisted FEEL, the latency of each IoT device is related to the computational capability, wireless channel quality, and relay selection. The latency of local training and global aggregation may significantly affect the system training performance. Thus, investigating the latency for the considered relay-assisted FEEL is very important.

The total latency of user $U_k$ is denoted as $T_k^{\mathrm{total}}$, which consists of both the local training latency and the uplink latency. Note that the downlink latency is ignored in this work, as it is generally much smaller than the uplink latency, because the transmit power at the server can be much larger. Specifically, the local training latency $T_k^{\mathrm{local}}$ of user $U_k$ is given by

$$T_k^{\mathrm{local}} = \frac{e_k b \rho}{f_k}, \tag{9}$$

where the CPU needs $\rho$ cycles to process one training sample, and $f_k$ denotes the computational capability at user $U_k$. Then, the local trained model needs to be uploaded to ES via the uplink relaying links. In this paper, we perform the relay selection based on the instantaneous CSI of the first-hop relaying links²

$$m_k^* = \arg\max_{1 \le m \le M} |h_{k,m}|^2, \tag{10}$$

where $h_{k,m}$ is the channel parameter of the link $U_k$–$R_m$, and it follows Rayleigh fading with $\mathbb{E}[|h_{k,m}|^2] = \lambda_{k,m}$. The transmission data rate of the link $U_k$–$R_{m_k^*}$ is

$$R^{\mathrm{I}}_{k,m_k^*} = B_k^{\mathrm{I}} \log_2\left(1 + \frac{P_k |h_{k,m_k^*}|^2}{\sigma^2}\right), \tag{11}$$

where $B_k^{\mathrm{I}}$ is the allocated bandwidth of the link $U_k$–$R_{m_k^*}$, $P_k$ denotes the transmit power at user $U_k$, and $\sigma^2$ denotes the variance of AWGN.

² In order to obtain the instantaneous CSI, each user needs to broadcast the transmission request to all relays, and then the users will send some pilot signals to the relays. After that, the relays can estimate the associated channel parameters and execute the relay selection in (10).

Note that the transmission in (11) employs orthogonal frequency resources among users. If multiple users employ the same frequency resource to communicate simultaneously, co-channel interference will arise among the users, and the transmission data rate between user $U_k$ and the selected relay $R_{m_k^*}$ becomes

$$R^{\mathrm{I}}_{k,m_k^*} = B^{\mathrm{II}}_{m_k^*} \log_2\left(1 + \frac{P_k |h_{k,m_k^*}|^2}{\sigma^2 + \sum_{U_i \in \mathcal{J}_{m_k^*},\, i \ne k} P_i |h_{i,m_k^*}|^2}\right), \tag{12}$$

where $B^{\mathrm{II}}_{m_k^*}$ denotes the bandwidth of the link $R_{m_k^*}$–ES. From this expression, we can find that the co-channel interference will deteriorate the transmission data rate, and multiple users will have to collaborate or compete in some other domains, such as the power domain in multiuser NOMA systems.

From (11), the transmission latency from user $U_k$ to relay $R_{m_k^*}$ is given by

$$T^{\mathrm{I}}_{k,m_k^*} = \frac{|\mathcal{L}|}{R^{\mathrm{I}}_{k,m_k^*}}, \tag{13}$$

where $|\mathcal{L}|$ is the size of the uploaded model. After receiving all the model parameters from the user set $\mathcal{J}_{m_k^*}$, relay $R_{m_k^*}$ aggregates the local models according to (5). Then, relay $R_{m_k^*}$ needs to transmit the aggregated model to ES, where the corresponding transmission data rate from relay $R_{m_k^*}$ to ES is

$$R^{\mathrm{II}}_{m_k^*} = B^{\mathrm{II}}_{m_k^*} \log_2\left(1 + \frac{P_{m_k^*} |g_{m_k^*}|^2}{\sigma^2}\right), \tag{14}$$

where $P_{m_k^*}$ denotes the transmit power at relay $R_{m_k^*}$, $g_{m_k^*}$ denotes the instantaneous channel parameter of the link $R_{m_k^*}$–ES, and it follows Rayleigh fading with $\mathbb{E}[|g_{m_k^*}|^2] = \lambda_{m_k^*}$. In this paper, the relays work in a time-division multiplexing mode, where the dual hops share the same frequency resources, i.e.,

$$B^{\mathrm{II}}_{m_k^*} = \sum_{U_k \in \mathcal{J}_{m_k^*}} B_k^{\mathrm{I}}. \tag{15}$$

From (14), the transmission latency from relay $R_{m_k^*}$ to ES is

$$T^{\mathrm{II}}_{m_k^*} = \frac{|\mathcal{L}|}{R^{\mathrm{II}}_{m_k^*}}. \tag{16}$$

In summary, the total latency of user $U_k$ is

$$T_k^{\mathrm{total}} = T_k^{\mathrm{local}} + T^{\mathrm{I}}_{k,m_k^*} + T^{\mathrm{II}}_{m_k^*}. \tag{17}$$

B. Outage Probability Analysis

From the above $T_k^{\mathrm{total}}$, we can start to analyze the outage probability of user $U_k$. To avoid idle time in the FEEL network, a predetermined latency threshold $\gamma_{\mathrm{th}}$ will be set in practice. The user $U_k$ will be dropped from the federated learning if the associated latency $T_k^{\mathrm{total}}$ is above $\gamma_{\mathrm{th}}$. Thus, the effective number of users who can successfully participate in federated learning is given by

$$K_{\mathrm{eff}} = \sum_{k=1}^{K} \mathbb{I}\left(T_k^{\mathrm{total}} \le \gamma_{\mathrm{th}}\right), \tag{18}$$

where $\mathbb{I}(\cdot)$ denotes the indicator function, which returns 1 if the condition is met and 0 otherwise. Accordingly, the expected effective user number is given by

$$\mathbb{E}(K_{\mathrm{eff}}) = \sum_{k=1}^{K} \Pr\left[T_k^{\mathrm{total}} \le \gamma_{\mathrm{th}}\right] = K\left(1 - \frac{1}{K} \sum_{k=1}^{K} \Pr\left[T_k^{\mathrm{total}} > \gamma_{\mathrm{th}}\right]\right). \tag{19}$$

From (19), the system outage probability of the FL is given by

$$P_{\mathrm{out}} = \frac{1}{K} \sum_{k=1}^{K} \Pr\left[T_k^{\mathrm{total}} > \gamma_{\mathrm{th}}\right] = \frac{1}{K} \sum_{k=1}^{K} P_{\mathrm{out},k}. \tag{20}$$
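The latency chain (9)-(17) and the outage definition (18)-(20) can be traced numerically. The sketch below is a minimal Monte Carlo illustration under simplifying assumptions that are not from the paper: identical users, equal per-user bandwidth, orthogonal transmission as in (11), and the hypothetical parameter values shown in the code comments. All function and variable names are illustrative.

```python
import math
import random


def rate(bandwidth_hz: float, tx_power: float, gain: float, noise: float) -> float:
    """Shannon rate as in (11) and (14): B * log2(1 + P * |h|^2 / sigma^2)."""
    return bandwidth_hz * math.log2(1.0 + tx_power * gain / noise)


def total_latency(cycles: float, cpu_hz: float, model_bits: float,
                  rate_hop1: float, rate_hop2: float) -> float:
    """Total latency (17): local training (9) plus both upload hops (13), (16)."""
    if rate_hop1 <= 0.0 or rate_hop2 <= 0.0:
        return math.inf  # a dead link can never deliver the model
    return cycles / cpu_hz + model_bits / rate_hop1 + model_bits / rate_hop2


def simulate_outage(num_users: int, num_relays: int, threshold_s: float,
                    trials: int = 2000, seed: int = 0) -> float:
    """Estimate the system outage probability (20) by simulation."""
    rng = random.Random(seed)
    # Hypothetical parameters, chosen only for illustration.
    user_bw = 1e6                    # B_k^I in Hz, equal for all users
    p_user, p_relay, noise = 1.0, 1.0, 0.1
    lam_hop1, lam_hop2 = 1.0, 1.0    # mean channel gains under Rayleigh fading
    cycles, cpu_hz = 1e8, 1e9        # d_k = e_k * b * rho, and f_k
    model_bits = 1e6                 # |L|

    outages = 0
    for _ in range(trials):
        # Relay selection (10): each user picks its strongest first-hop link.
        # Under Rayleigh fading, |h|^2 is exponential with mean lambda.
        choice, best_gain = [], []
        for _k in range(num_users):
            gains = [rng.expovariate(1.0 / lam_hop1) for _ in range(num_relays)]
            m = max(range(num_relays), key=gains.__getitem__)
            choice.append(m)
            best_gain.append(gains[m])
        relay_gain = [rng.expovariate(1.0 / lam_hop2) for _ in range(num_relays)]
        for k in range(num_users):
            m = choice[k]
            # (15): the relay reuses the bandwidth of the users it serves.
            relay_bw = user_bw * choice.count(m)
            r1 = rate(user_bw, p_user, best_gain[k], noise)
            r2 = rate(relay_bw, p_relay, relay_gain[m], noise)
            if total_latency(cycles, cpu_hz, model_bits, r1, r2) > threshold_s:
                outages += 1  # user k is dropped, cf. the indicator in (18)
    return outages / (trials * num_users)


if __name__ == "__main__":
    for relays in (1, 2, 4):
        print(relays, simulate_outage(num_users=10, num_relays=relays, threshold_s=1.5))
```

In this toy setup the estimated outage need not fall monotonically with the number of relays: more relays improve the first hop through selection, but each relay then serves fewer users and, by (15), has less second-hop bandwidth.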

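In (20), $P_{\mathrm{out},k}$ is the outage probability of $U_k$ in the process of FL, given by

$$P_{\mathrm{out},k} = \Pr\left[T_k^{\mathrm{total}} > \gamma_{\mathrm{th}}\right] = \Pr\left[T_k^{\mathrm{local}} + T^{\mathrm{I}}_{k,m_k^*} + T^{\mathrm{II}}_{m_k^*} > \gamma_{\mathrm{th}}\right]. \tag{21}$$

To analyze the system outage performance of the relay-assisted FEEL, we first need to derive the outage probability of user $U_k$. In practice, the local training latency of user $U_k$ can be regarded as deterministic, as it is not affected by the stochastic nature of the channels. Hence, we can re-write

$$\begin{aligned}
P_{\mathrm{out},k} &= \Pr\left[T^{\mathrm{I}}_{k,m_k^*} + T^{\mathrm{II}}_{m_k^*} > \gamma_{\mathrm{th}} - \frac{d_k}{f_k}\right] \\
&= \Pr\left[\frac{|\mathcal{L}|}{R^{\mathrm{I}}_{k,m_k^*}} + \frac{|\mathcal{L}|}{R^{\mathrm{II}}_{m_k^*}} > \gamma_{\mathrm{th}} - \frac{d_k}{f_k}\right] \\
&= \Pr\left[\frac{1}{|\mathcal{L}|}\, \frac{R^{\mathrm{I}}_{k,m_k^*} R^{\mathrm{II}}_{m_k^*}}{R^{\mathrm{I}}_{k,m_k^*} + R^{\mathrm{II}}_{m_k^*}} < \frac{f_k}{\gamma_{\mathrm{th}} f_k - d_k}\right],
\end{aligned} \tag{22}$$

where $d_k = e_k b \rho$ denotes the CPU cycles needed to finish local training for user $U_k$.

As deriving an exact closed-form solution to $P_{\mathrm{out},k}$ from (22) is generally hard, we use the inequality $xy/(x+y) < \min(x, y)$ for positive $x$ and $y$³, and then obtain a tight upper bound on the left-hand side of the last inequality in (22) as

$$\frac{1}{|\mathcal{L}|}\, \frac{R^{\mathrm{I}}_{k,m_k^*} R^{\mathrm{II}}_{m_k^*}}{R^{\mathrm{I}}_{k,m_k^*} + R^{\mathrm{II}}_{m_k^*}} < \frac{1}{|\mathcal{L}|} \min\left(R^{\mathrm{I}}_{k,m_k^*}, R^{\mathrm{II}}_{m_k^*}\right). \tag{23}$$

³ Note that in this inequality, the approximation error is large when $x$ is equal to $y$, and the approximation accuracy improves when $x$ differs from $y$. In general, $x$ is often different from $y$ due to random wireless channels, resulting in a fine approximation accuracy on average. Due to these reasons, the inequality $xy/(x+y) < \min(x, y)$ is widely used in existing works such as [39]–[41].

Then, substituting (23) into (22), we can obtain the lower bound on the outage probability of user $U_k$, which can be analytically solved, as shown in Theorem 1.

Theorem 1. A lower bound on the outage probability of user $U_k$ is

$$P^{\mathrm{lb}}_{\mathrm{out},k} = 1 - \exp\left(\frac{1 - \exp\left(\frac{f_k |\mathcal{L}| \ln 2}{A_k^{\mathrm{II}} (\gamma_{\mathrm{th}} f_k - d_k)}\right)}{\lambda_{m_k^*} \zeta_{m_k^*}}\right) \times \left[1 - \prod_{m=1}^{M} \left(1 - \exp\left(\frac{1 - \exp\left(\frac{f_k |\mathcal{L}| \ln 2}{B_k^{\mathrm{I}} (\gamma_{\mathrm{th}} f_k - d_k)}\right)}{\lambda_{k,m} \zeta_k}\right)\right)\right], \tag{24}$$

where $\zeta_k = P_k/\sigma^2$ and $\zeta_{m_k^*} = P_{m_k^*}/\sigma^2$ are the transmit SNRs at the user $U_k$ and relay $R_{m_k^*}$, respectively, and $A_k^{\mathrm{II}}$ is given by

$$A_k^{\mathrm{II}} = \sum_{i=1}^{K-1} \left[\binom{K-1}{i} B_k^{\mathrm{I}} + \binom{K-2}{i-1} \left(B_{\mathrm{total}} - B_k^{\mathrm{I}}\right)\right] \left(\frac{1}{M}\right)^{i} \left(1 - \frac{1}{M}\right)^{K-i-1} + B_k^{\mathrm{I}} \left(1 - \frac{1}{M}\right)^{K-1}. \tag{25}$$

Proof: See Appendix A.

Thus, a lower bound on the system outage probability can be obtained in Theorem 2.

Theorem 2. A lower bound on the system outage probability is given by

$$P^{\mathrm{lb}}_{\mathrm{out}} = \frac{1}{K} \sum_{k=1}^{K} P^{\mathrm{lb}}_{\mathrm{out},k} = \frac{1}{K} \sum_{k=1}^{K} \left[1 - \exp\left(\frac{1 - \exp\left(\frac{f_k |\mathcal{L}| \ln 2}{A_k^{\mathrm{II}} (\gamma_{\mathrm{th}} f_k - d_k)}\right)}{\lambda_{m_k^*} \zeta_{m_k^*}}\right) \times \left(1 - \prod_{m=1}^{M} \left(1 - \exp\left(\frac{1 - \exp\left(\frac{f_k |\mathcal{L}| \ln 2}{B_k^{\mathrm{I}} (\gamma_{\mathrm{th}} f_k - d_k)}\right)}{\lambda_{k,m} \zeta_k}\right)\right)\right)\right]. \tag{26}$$

Proof: By applying Theorem 1 into (20), the lower bound on the system outage probability can be proved.

Note that the above bound contains elementary functions only, which can be easily computed. Therefore, the system outage probability can be easily evaluated in the whole range of SNR.

To obtain more insights on the system design of the relay-assisted FEEL, we use (26) to provide an approximate expression for $P^{\mathrm{lb}}_{\mathrm{out}}$ in the high-SNR region,

$$P^{\mathrm{lb}}_{\mathrm{out}} \simeq \frac{1}{K} \sum_{k=1}^{K} \left[1 - \left(1 - \prod_{m=1}^{M} \frac{\exp\left(\frac{f_k |\mathcal{L}| \ln 2}{B_k^{\mathrm{I}} (\gamma_{\mathrm{th}} f_k - d_k)}\right) - 1}{\lambda_{k,m} \zeta_k}\right) \times \left(1 - \frac{\exp\left(\frac{f_k |\mathcal{L}| \ln 2}{A_k^{\mathrm{II}} (\gamma_{\mathrm{th}} f_k - d_k)}\right) - 1}{\lambda_{m_k^*} \zeta_{m_k^*}}\right)\right], \tag{27}$$

where the Taylor series approximation $e^{-x} \simeq 1 - x$ as $x \to 0$ is applied [42]. We further use the approximation $1 - (1 - x)(1 - y) \simeq x + y$ as $x \to 0$, $y \to 0$, and get the asymptotic expression of $P^{\mathrm{lb}}_{\mathrm{out}}$ for high SNR as

$$P^{\mathrm{lb}}_{\mathrm{out}} \simeq \frac{1}{K} \sum_{k=1}^{K} \left( \underbrace{\prod_{m=1}^{M} \frac{\exp\left(\frac{f_k |\mathcal{L}| \ln 2}{B_k^{\mathrm{I}} (\gamma_{\mathrm{th}} f_k - d_k)}\right) - 1}{\lambda_{k,m} \zeta_k}}_{O_1} + \underbrace{\frac{\exp\left(\frac{f_k |\mathcal{L}| \ln 2}{A_k^{\mathrm{II}} (\gamma_{\mathrm{th}} f_k - d_k)}\right) - 1}{\lambda_{m_k^*} \zeta_{m_k^*}}}_{O_2} \right) = P^{\mathrm{asy}}_{\mathrm{out}}. \tag{28}$$

Note that the above asymptotic expression contains two parts, where the first part $O_1$ depends on the transmission between users and relays, while the second part $O_2$ depends on the transmission between relays and edge server. From $P^{\mathrm{asy}}_{\mathrm{out}}$, several insights on the FL system can be obtained:

• The first part $O_1$ decays exponentially with factor $M$, which indicates that the $M$ intermediate relays can be fully exploited.
• When the relay number $M$ is large, the first part $O_1$ approaches 0, and the second part $O_2$ will dominate in the system outage probability, indicating that the transmission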
between the relays and edge server becomes the system Proof: See Appendix B.
bottleneck.
From Theorem 3, we can conclude that for the relay-
• The outage performance of the relay-assisted FEEL sys-
assisted FEELP with partial user participation and user dropout,
tem improves with a larger λk,m and λm∗k , revealing that a N 2 2 2 2 2 2
better transmission channel can enhance FL transmission.
the terms of k=1 pk δk , 6LΓ, 8(e − 1) G , and 4e G H
dominate the convergence performance. Specifically, the term
• Both O1 and O2 are decreasing with respect to BkI and PN 2 2
p δ
k=1 k k is related to the mini-batch SGD used in the
AII
k , indicating that a larger bandwidth of user Uk and local training, and the term 6LΓ is related to non-i.i.d data
intermediate relays m∗k will improve the system outage
distribution of user data. In particular, the convergence upper
performance.
bound decreases monotonically with Γ, and when Γ becomes
zero, i.e., i.i.d. dataset, the term 6LΓ can be removed. More-
C. Convergence Analysis over, the terms 8(e − 1)2 G2 and 4e2 G2 H are both related
The convergence of the relay-assisted FEEL is now ana- to the distributed SGD algorithm and the model aggregation,
lyzed, which is of vital importance for the FL training. For where the term 4e2 G2 H also shows that the effective number
this purpose, we first introduce the following assumptions, of participated users directly affects the convergence upper
Assumption 1: For any user Uk , Fk (·) is µ-strongly convex, bound, revealing that a larger outage probability will dete-
i.e., for any w0 and w1 , riorate the convergence rate seriously. Thus, it is critical to
µ enhance the convergence performance through reducing the
Fk (w1 ) ≥ Fk (w0 ) + (w1 − w0 )T ∇Fk (w0 ) + kw1 − w0 k2 . number of users dropped from the FEEL training, by designing
2
(29) a bandwidth allocation scheme for the considered system.

Assumption 2: For any user Uk , Fk (·) is L-smooth, i.e., for IV. BANDWIDTH ALLOCATION
any w0 and w1 , Inspired by the above convergence results that more users
$$F_k(w_1) \le F_k(w_0) + (w_1 - w_0)^T \nabla F_k(w_0) + \frac{L}{2}\,\|w_1 - w_0\|^2. \tag{30}$$

Assumption 3: For $\xi_k$ uniformly and randomly sampled from the local dataset $D_k$, the variance of the stochastic gradient of user $U_k$ is bounded for all $k$ by

$$\mathbb{E}\,\|\nabla F_k(w;\xi_k) - \nabla F_k(w)\|^2 \le \delta_k^2. \tag{31}$$

Assumption 4: For all users, the expected second-order moment of the norm of the stochastic gradient is uniformly bounded by $\mathbb{E}\,\|\nabla F_k(w;\xi_k)\|^2 \le G^2$.

In addition to the above assumptions, we use the term $\Gamma = F^* - \sum_{k=1}^{N} p_k F_k^*$ to quantify the degree of non-i.i.d. data, where $F^*$ and $F_k^*$ are the minimum values of $F$ and $F_k$, respectively. From the definition of $\Gamma$, the data distribution is i.i.d. if $\Gamma = 0$, and non-i.i.d. otherwise. Moreover, in order to simplify the analysis, we change the timeline to SGD iterations and assume that all users perform the same number $e$ of SGD iterations in the convergence analysis.

From the above assumptions, the convergence performance of the relay-assisted FEEL can be analyzed, which is presented in Theorem 3.

Theorem 3. Under Assumptions 1-4, with $\psi = \max\{8L/\mu,\ e\}$ and $\eta_t = \frac{2}{\mu(\psi+t)}$, the convergence satisfies

$$\mathbb{E}[F(w^T) - F^*] \le \frac{L}{\mu(\psi+T)}\left[\frac{2}{\mu}\left(\sum_{k=1}^{N} p_k^2\delta_k^2 + 6L\Gamma + 8(e-1)^2G^2 + 4e^2G^2H\right) + \frac{\mu\psi}{2}\,\|w^0 - w^*\|^2\right], \tag{32}$$

where $H = \sum_{k=1}^{N} p_k \frac{N-K(1-P_{\text{out}})}{K(1-P_{\text{out}})}$, and $w^0$ is the initial value of the global model weights.

successfully participating in each round's learning process can improve the convergence in Theorem 3, problem P0 is reformulated as maximizing the number of successfully participating users in each round's FL by allocating the wireless bandwidth among users and intermediate relays, given by

$$\begin{aligned}
\text{P1:}\quad \max_{\{B_k^{I},\,B_m^{II}\,|\,U_k\in\mathcal{U},\,R_m\in\mathcal{R}\}}\quad & K_{\text{eff}} = \sum_{k=1}^{K} I\!\left(T_k^{\text{total}} \le \gamma_{th}\right) && \text{(33a)}\\
\text{s.t.}\quad & \sum_{R_m\in\mathcal{R}} B_m^{II} \le B_{\text{total}}, && \text{(33b)}\\
& \sum_{U_k\in J_m} B_k^{I} = B_m^{II}, && \text{(33c)}
\end{aligned}$$

where (33b) and (33c) are the bandwidth constraints at the relays and users, respectively. These two bandwidth constraints also indicate that multiple users collaborate or compete with each other in the frequency domain, which occurs in many application scenarios where the users employ orthogonal frequency resources to communicate, such as OFDMA systems. On the other hand, if the users employ the same frequency resource to communicate simultaneously, co-channel interference arises, and multiple users have to collaborate or compete in some other domain, such as the power domain in multiuser NOMA systems. In this case, the framework of performance analysis and system optimization proposed in this paper is still applicable, and the results in this work can serve as a useful benchmark for federated learning with multiuser interference, which can provide some insights on the system design.

In the following, the optimization problem is solved by exploiting either the instantaneous or the statistical CSI, which provides flexible choices for the system optimization.

A. Instantaneous Bandwidth Allocation

For the instantaneous bandwidth allocation method, the edge server needs to make bandwidth allocation decision at each
time slot, so that the instantaneous bandwidth allocation tends to be used in systems that are sensitive to the performance of communication and training. Due to the indicator function and the coupling of constraints (33b) and (33c), problem P1 is hard to solve directly. Thus, we propose to solve this problem by dividing it into two sub-problems: minimizing the total bandwidth required for the selected users, and choosing some users to be dropped out from the FEEL process. Specifically, for the first sub-problem, we relax problem P1 by removing the bandwidth constraint (33b), so that all the relays can be allocated the required bandwidth, in order to support the selected users to successfully participate in the FEEL process. The first sub-problem can be given by

$$\begin{aligned}
\text{P2:}\quad \min_{\{B_k^{I},\,\alpha_{k,m}\,|\,U_k\in\mathcal{U},\,R_m\in\mathcal{R}\}}\quad & \sum_{R_m\in\mathcal{R}} B_m^{II} && \text{(34a)}\\
\text{s.t.}\quad & T_k^{\text{local}} + \frac{|L|}{\alpha_{k,m} B_m^{II} r_{k,m}^{I}} + \frac{|L|}{B_m^{II} r_m^{II}} \le \gamma_{th},\ \forall U_k\in\mathcal{U}, && \text{(34b)}\\
& \sum_{U_k\in J_m}\alpha_{k,m} = 1, && \text{(34c)}\\
& 0 \le \alpha_{k,m} \le 1, && \text{(34d)}
\end{aligned}$$

where we have $r_{k,m}^{I} = \log_2\!\left(1 + \frac{P_k|h_{k,m}|^2}{\sigma^2}\right)$, $r_m^{II} = \log_2\!\left(1 + \frac{P_m|g_m|^2}{\sigma^2}\right)$, and $\alpha_{k,m}$ is the bandwidth allocation ratio from relay $R_m$ to user $U_k$, which satisfies $0 \le \alpha_{k,m} \le 1$ and $\sum_{U_k\in J_m}\alpha_{k,m} = 1$. Constraint (34b) guarantees that all users can successfully participate in the training process. Constraints (34c) and (34d) are the reformulation of (33c) using $B_k^{I} = \alpha_{k,m}B_m^{II}$ as the bandwidth allocated to user $U_k$ from relay $R_m$. The optimal solution of P2 should satisfy the conditions given in Theorem 4.

Theorem 4. For relay $R_m$, the optimal $B_m^{II*}$ and $\alpha_{k,m}^{*}$ to solve problem P2 should satisfy

$$\begin{cases}
T_k^{\text{local}} + \dfrac{|L|}{\alpha_{k,m}^{*} B_m^{II*} r_{k,m}^{I}} + \dfrac{|L|}{B_m^{II*} r_m^{II}} = \gamma_{th}, & \text{(35a)}\\[2mm]
\sum_{U_k\in J_m}\alpha_{k,m}^{*} = 1, & \text{(35b)}\\[1mm]
0 \le \alpha_{k,m}^{*} \le 1, & \text{(35c)}\\[1mm]
B_m^{II*} \ge 0. & \text{(35d)}
\end{cases}$$

Proof: See Appendix C.

From Theorem 4, we can observe that there is one and only one solution to (35) because of the monotonicity and non-trivial value of $\alpha_{k,m}^{*}$ and $B_m^{II*}$. Moreover, with a given $B_m^{II*}$, we can get the optimal value of $\alpha_{k,m}$ as

$$\alpha_{k,m}^{*} = \frac{r_m^{II}\,|L|}{B_m^{II*}\,(\gamma_{th} - T_k^{\text{local}})\,r_{k,m}^{I}\,r_m^{II} - r_{k,m}^{I}\,|L|}. \tag{36}$$

With (36), we can obtain a numerical value of $B_m^{II*}$ by using an efficient searching algorithm based on the bisection method, as shown in Algorithm 1. In particular, we start the search with the middle point $B_{\text{mid}}$ of an initial range $[B_{\text{lower}}, B_{\text{upper}}]$. With $B_{\text{mid}}$ as the bandwidth allocated to relay $R_m$, we then calculate $\alpha_{k,m}$ for each user $U_k \in J_m$ and sum up all $\alpha_{k,m}$. By comparing $\sum_{U_k\in J_m}\alpha_{k,m}$ with 1, we can halve the search region with $B_{\text{upper}} = B_{\text{mid}}$ if $\sum_{U_k\in J_m}\alpha_{k,m} > 1$, or halve the search region with $B_{\text{lower}} = B_{\text{mid}}$ otherwise. The search process continues until the constraint (35b) is satisfied, which finally outputs the optimal $\alpha_{k,m}^{*}$ for $U_k \in J_m$ and $B_m^{II*}$.

Algorithm 1: Bisection search of $B_m^{II*}$ and $\alpha_{k,m}^{*}$
  Input $B_{\text{total}}$, $J_m$;
  $B_{\text{lower}} = 0$, $B_{\text{upper}} = B_{\text{total}}$;
  while $B_{\text{lower}} < B_{\text{upper}}$ do
    $B_{\text{mid}} = (B_{\text{lower}} + B_{\text{upper}})/2$;
    For $U_k \in J_m$, calculate the bandwidth ratio $\alpha_{k,m}$ according to (36) with $B_{\text{mid}}$;
    if $\sum_{U_k\in J_m}\alpha_{k,m} < 1$ then
      $B_{\text{lower}} = B_{\text{mid}}$;
    else if $\sum_{U_k\in J_m}\alpha_{k,m} > 1$ then
      $B_{\text{upper}} = B_{\text{mid}}$;
    else if $\sum_{U_k\in J_m}\alpha_{k,m} = 1$ then
      $B_m^{II*} = B_{\text{mid}}$, $\alpha_{k,m}^{*} = \alpha_{k,m}$, $U_k \in J_m$;
      break;
    end
  end
  Output $B_m^{II*}$, $\{\alpha_{k,m}^{*}\,|\,U_k \in J_m\}$

We proceed to solve the second sub-problem when the total bandwidth needed exceeds the system total bandwidth, i.e., $\sum_{R_m\in\mathcal{R}} B_m^{II} > B_{\text{total}}$. In this case, the set of participating users should be adjusted and certain users have to be dropped out to satisfy the bandwidth constraint (33b). Here, a greedy algorithm is utilized to solve the second sub-problem. Specifically, the user with the largest $\alpha_{k,m}B_m^{II}$, i.e., the user that occupies the largest bandwidth, is dropped out from the FEEL process. After removing the first dropped user, we continue to solve problem P2 until the constraint $\sum_{R_m\in\mathcal{R}} B_m^{II} \le B_{\text{total}}$ is satisfied. In this way, we finally solve problem P1 with the instantaneous CSI. The greedy based bandwidth allocation algorithm with the instantaneous CSI is summarized in Algorithm 2.

B. Statistical Bandwidth Allocation

Besides the above instantaneous BA method, we also provide a statistical bandwidth allocation, which is performed once for many time slots and is applicable to systems that are sensitive to the computational complexity of bandwidth allocation, at the price of some performance deterioration compared with the instantaneous bandwidth allocation. In this case, we turn problem P1 into optimizing the statistical expectation of the successfully participated user number in
each round's FL, given by

$$\begin{aligned}
\text{P3:}\quad \max_{\{B_k^{I},\,B_m^{II}\,|\,U_k\in\mathcal{U},\,R_m\in\mathcal{R}\}}\quad & \mathbb{E}(K_{\text{eff}}) = K(1 - P_{\text{out}}) && \text{(37a)}\\
\text{s.t.}\quad & \sum_{R_m\in\mathcal{R}} B_m^{II} \le B_{\text{total}}, && \text{(37b)}\\
& \sum_{U_k\in J_m} B_k^{I} = B_m^{II}. && \text{(37c)}
\end{aligned}$$

As obtaining an exact analytical expression for $P_{\text{out}}$ is hard, we employ the derived lower bound $P_{\text{out}}^{lb}$ to approximate the expected number of users successfully participating in FEEL. Thus, we can reformulate P3 into P4, given by

$$\begin{aligned}
\text{P4:}\quad \max_{\{B_k^{I},\,B_m^{II}\,|\,U_k\in\mathcal{U},\,R_m\in\mathcal{R}\}}\quad & K(1 - P_{\text{out}}^{lb}) && \text{(38a)}\\
\text{s.t.}\quad & \sum_{R_m\in\mathcal{R}} B_m^{II} \le B_{\text{total}}, && \text{(38b)}\\
& \sum_{U_k\in J_m} B_k^{I} = B_m^{II}. && \text{(38c)}
\end{aligned}$$

As problem P4 is hard to solve directly, we use particle swarm optimization (PSO), an intelligent algorithm that uses a set of particles to search for an approximate solution. In PSO, there are $I$ particles, and each particle $i$ has three associated vectors: the velocity $v_i$, the position $p_i$, and the best position $pbest_i$. Specifically, $p_i$ is a $K$-dimension vector denoting a feasible solution of bandwidth allocation, where $p_i = \{B_k^{I}\,|\,U_k \in \mathcal{K}\}$, $v_i$ is a $K$-dimension vector of bandwidth variation, where $v_i = \{\Delta B_k^{I}\,|\,U_k \in \mathcal{K}\}$, and $pbest_i$ is a $K$-dimension vector of the best solution to the optimization problem found by particle $i$. Moreover, there is a global vector $gbest$ used to denote the best solution among all the particles. All the position vectors are potential solutions of the optimization problem, evaluated by the fitness function $F_{\text{fitness}}(\cdot)$, measured by $K(1 - P_{\text{out}}^{lb})$.

For particle $i$ at iteration $t$, its velocity is updated as

$$v_i^{t} = \omega v_i^{t-1} + \phi_1\rho_1\left(pbest_i^{t-1} - p_i^{t-1}\right) + \phi_2\rho_2\left(gbest^{t-1} - p_i^{t-1}\right), \tag{39}$$

where $\omega$ denotes the inertia weight of the previous velocity, $\phi_1$ and $\phi_2$ are two acceleration coefficients, and $\rho_1$ and $\rho_2$ are two random variables uniformly distributed in [0,1]. The position of particle $i$ is updated as

$$p_i^{t} = p_i^{t-1} + v_i^{t}. \tag{40}$$

After $T$ iterations of velocity and position updates, the $gbest$ obtained from the $I$ particles can be regarded as a feasible solution to problem P4. The PSO based bandwidth allocation algorithm with the statistical CSI is summarized in Algorithm 3.

Algorithm 2: Greedy based bandwidth allocation algorithm
  Input $\mathcal{U}$, $\mathcal{R}$, $J_m$, $B_{\text{total}}$;
  For $R_m \in \mathcal{R}$, solve $B_m^{II*}$, $\alpha_{k,m}^{*}$ using Algorithm 1;
  while $\sum_{R_m\in\mathcal{R}} B_m^{II*} > B_{\text{total}}$ do
    $U_{k'}, R_{m'} = \arg\max_{U_k\in\mathcal{U},\,R_m\in\mathcal{R}} \alpha_{k,m}^{*}B_m^{II*}$;
    $B_{k'}^{I} = 0$;
    $\mathcal{U} = \mathcal{U} \setminus U_{k'}$, $J_{m'} = J_{m'} \setminus U_{k'}$;
    Solve $B_m^{II*}$, $\alpha_{k,m}^{*}$ using Algorithm 1 with $\mathcal{U}$ and $J_m$, $R_m \in \mathcal{R}$;
  end
  $B_m^{II} = B_m^{II*}$, $B_k^{I} = \alpha_{k,m}^{*}B_m^{II*}$, $U_k \in J_m$, $R_m \in \mathcal{R}$;
  Output $\{B_k^{I}, B_m^{II}\,|\,U_k \in \mathcal{U}, R_m \in \mathcal{R}\}$

Algorithm 3: PSO based bandwidth allocation algorithm
  Input $\mathcal{U}$, $\mathcal{R}$, $J_m$, $B_{\text{total}}$, $I$, $T$, $\omega$, $\phi_1$, $\phi_2$;
  Initialize: create $I$ particles randomly;
  for $t = 1$ to $T$ do
    for $i = 1$ to $I$ do
      Update $v_i^{t}$ by (39), and update $p_i^{t}$ by (40);
      if $F_{\text{fitness}}(p_i^{t}) \le F_{\text{fitness}}(pbest_i^{t})$ then
        $pbest_i^{t} = p_i^{t}$;
      end
      if $F_{\text{fitness}}(p_i^{t}) \le F_{\text{fitness}}(gbest^{t})$ then
        $gbest^{t} = p_i^{t}$;
      end
    end
  end
  $B_m^{II} = \sum_{U_k\in J_m} B_k^{I}$, $R_m \in \mathcal{R}$;
  Output $\{B_k^{I}, B_m^{II}\,|\,U_k \in \mathcal{U}, R_m \in \mathcal{R}\}$

V. SIMULATION RESULTS

In this part, some analytical and simulation results are presented to validate the proposed studies in this paper. In particular, the basic setting of these simulations is introduced, along with some baseline methods used for comparison. We then present simulations that verify the derived analysis of the system outage performance. Further, we conduct more simulations to validate the instantaneous and statistical bandwidth allocation schemes.

A. Simulation Settings

The simulations are performed in the considered relay-assisted FEEL system with a total of 200 users. If not specified, for all simulations, there are 500 communication rounds in total, and 10 users are selected in each communication round. The channels follow Rayleigh flat fading, where the average channel gain of the link $U_k$–$R_m$ is set to $\lambda_{k,m} = (100 + k)/200$, and the average channel gain between the relays and the ES is set to 2. The transmit powers at each user and each relay are set to 0.1 W and 0.5 W, respectively. The computational capability of each user is $1.5\times 10^{7}$ cycles/second. In addition, for the PSO based bandwidth allocation algorithm, we use 30 particles and 50 iterations to search for a feasible solution, where the inertia weight of the previous velocity $\omega$ is 0.5 and the two acceleration coefficients $\phi_1$ and $\phi_2$ are both 0.4.

In practice, for the FL task, the Fashion-MNIST dataset is used to perform a classification task, where 60000 training and
10000 test samples are utilized. There are 10 classes of fashion pictures in the training samples, and the number of training samples allocated to each user is uniformly distributed as $|D_k| \in U(200, 400)$. For the non-i.i.d. setting of the Fashion-MNIST dataset, each user is assigned 2 labels in its local training samples. As to the learning network, we use a CNN composed of two 3 × 3 convolution layers, each followed by a batch normalization layer and a 2 × 2 max pooling layer, two fully connected layers, a dropout layer between the two fully connected layers, and a soft output layer. For the training of the CNN, we use the CrossEntropyLoss as the loss function with $\eta = 0.001$, $b = 30$, and $E = 3$.

Fig. 2. Outage probability of the considered relay-assisted FEEL system versus the transmit SNR. [Plot: outage probability versus transmit SNR (dB); curves: simulation, analytical lower bound, and asymptotic lower bound for M = 1 and M = 2.]
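The non-i.i.d. data split described above (ten classes, two labels per user, and $|D_k|$ drawn from U(200, 400)) can be sketched as follows. This is only an illustrative sketch, not the authors' partitioning code: the function name `partition_two_labels`, the random choice of the two labels, the even split between them, sampling with replacement, and the synthetic balanced label list standing in for Fashion-MNIST are all assumptions.

```python
import random
from collections import defaultdict


def partition_two_labels(labels, num_users, min_size=200, max_size=400,
                         labels_per_user=2, seed=0):
    """Return one list of sample indices per user, each covering only
    `labels_per_user` classes (hypothetical helper, not from the paper)."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    classes = sorted(by_class)

    shards = []
    for _ in range(num_users):
        # |D_k| ~ U(200, 400), taken here as an integer in the closed range.
        size = rng.randint(min_size, max_size)
        chosen = rng.sample(classes, labels_per_user)
        per_label = size // labels_per_user
        shard = []
        for j, c in enumerate(chosen):
            # The last label absorbs the remainder so the shard has `size` samples.
            last = (j == labels_per_user - 1)
            take = size - per_label * (labels_per_user - 1) if last else per_label
            # Assumption: sampling with replacement, so users may share samples.
            shard.extend(rng.choices(by_class[c], k=take))
        shards.append(shard)
    return shards


# Synthetic stand-in for the 60000 training labels (10 balanced classes).
labels = [i % 10 for i in range(60000)]
shards = partition_two_labels(labels, num_users=200)
```

Each entry of `shards` can then be used to build the local dataset $D_k$ of one user; replacing the synthetic `labels` list with the real training labels would be the only change needed under these assumptions.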

To verify the effectiveness of the proposed instantaneous and statistical bandwidth allocation schemes, we compare them with the following baseline methods:
• Ideal FEEL: There is no bandwidth or latency constraint, so that all the selected users can successfully take part in the learning process.
• Uniform allocation (UA): The ES performs uniform bandwidth allocation for all users selected in each communication round.
• Uniform allocation without partial aggregation (UA-wo-PA): The ES performs uniform bandwidth allocation for all users selected in each communication round, and the users upload the model via the selected relay without partial aggregation.

B. Outage Performance Simulation

Fig. 2 depicts the simulated, analytical and asymptotic outage probabilities for the relay-assisted FEEL under the UA method versus the transmit SNR, where the transmit SNR of each user ranges from 15 dB to 35 dB, the transmit SNR of each relay is ten times that of the user, and the total bandwidth of the system is 50 MHz. From Fig. 2, we can find that the analytical lower bound fits well with the simulated one, and the asymptotic lower bound converges to the analytical one at high SNR, which shows the correctness of the derived analytical and asymptotic expressions of the system outage probability. Moreover, all the system outage results improve when the SNR becomes larger, as a larger transmit power at users and relays reduces the latency of the model upload, thus improving the system outage performance. Further to this, the system outage probability improves with a larger $M$, as more relays can help increase the spatial diversity of the wireless links between users and relays.

C. Federated Learning Performance Simulation

Fig. 3. Test accuracy and training loss through aggregating the trained models. [Panels: (a) test accuracy and (b) training loss versus communication round; curves: Instantaneous BA, Statistical BA, UA, UA-wo-PA, Ideal FEEL.]

Fig. 3(a) and Fig. 3(b) illustrate the test accuracy and training loss of the aforementioned BA schemes, where $B_{\text{total}} = 60$ MHz and $\gamma_{th} = 1.2$ s. We can observe from Fig. 3(a) and Fig. 3(b) that both the test accuracy and training loss of all BA schemes converge as the communication round increases. Moreover, the UA-wo-PA performs the worst, because without partial aggregation, more models need to be uploaded through the second hop. Further, the proposed instantaneous
and statistical bandwidth allocation schemes outperform UA, showing the effectiveness of the two bandwidth allocation schemes. Furthermore, the instantaneous bandwidth allocation scheme achieves a better, near-optimal convergence rate and test accuracy than the statistical bandwidth allocation scheme, indicating that the instantaneous CSI can help maximize the number of users who successfully participate in FL at each round more effectively.

Fig. 4. Test accuracy and training loss through aggregating the normalized cumulative gradients. [Panels: (a) test accuracy and (b) training loss versus communication round; curves: Instantaneous BA, Statistical BA, UA, UA-wo-PA, Ideal FEEL.]

Fig. 4(a) and Fig. 4(b) show the test accuracy and training loss of the aforementioned BA schemes versus the communication round through aggregating the normalized cumulative gradients, where $B_{\text{total}} = 60$ MHz and $\gamma_{th} = 1.2$ s. We can observe that the proposed instantaneous and statistical BA schemes outperform UA, confirming the effectiveness of the two bandwidth allocation schemes when aggregating the normalized cumulative gradients. Moreover, aggregating the normalized cumulative gradients improves the test accuracy by 1%-1.5% compared with simply aggregating the trained models as in FedAvg, which demonstrates that the problem of "objective inconsistency" caused by different SGD iterations would deteriorate the federated learning performance, and that using normalized cumulative gradients in the aggregation can help solve the inconsistency problem and enhance the system performance.

Fig. 5. Test accuracy of the several BA schemes versus $\gamma_{th}$. [Plot: test accuracy versus latency threshold $\gamma_{th}$ (s); curves: Instantaneous BA, Statistical BA, UA, and UA-wo-PA for M = 1 and M = 2, and Ideal FEEL.]

Fig. 5 shows the test accuracy of the several BA schemes versus $\gamma_{th}$, where $M \in \{1, 2\}$ and the system latency threshold varies from 0.8 s to 1.8 s. We can observe that for all the aforementioned schemes except the ideal FEEL one, the test accuracy improves with a larger system threshold, as a larger threshold allows more users to successfully participate in FEEL. Moreover, for all the aforementioned schemes, the performance with two relays is better than that with only one relay, since more relays can help improve the model transmission rate. Furthermore, the UA and UA-wo-PA schemes have a lower test accuracy than the instantaneous and statistical BA schemes. In particular, when the latency threshold is low, the relay-assisted FEEL system using the UA and UA-wo-PA schemes cannot even train an effective model. This is because only very few users can successfully participate in FEEL under those schemes. However, the proposed instantaneous and statistical BA schemes can achieve sufficiently good performance for various latency thresholds, which shows that the instantaneous and statistical BA schemes can provide a feasible bandwidth allocation strategy for the relay-assisted FEEL.

Fig. 6 shows the impact of $B_{\text{total}}$ on the test accuracy of the several bandwidth allocation schemes, where the relay number $M \in \{1, 2\}$, $\gamma_{th} = 1.2$ s, and $B_{\text{total}}$ varies from 50 MHz to 100 MHz. The test accuracy improves for all the aforementioned schemes except the ideal FEEL one as $B_{\text{total}}$ increases, indicating that a larger bandwidth can help increase the transmission rate of the models. Moreover, we can see that with the number of relays increasing from 1 to 2, all the bandwidth allocation schemes improve, because more relays can help enhance the outage performance and allow more users to successfully participate in FEEL. Furthermore, the proposed instantaneous and statistical BA schemes
outperform the other bandwidth allocation schemes for a wide range of $B_{\text{total}}$, and they can achieve almost the same accuracy as the ideal FEEL. These results further verify the proposed bandwidth allocation schemes.

Fig. 6. Test accuracy of the several BA schemes versus $B_{\text{total}}$. [Plot: test accuracy versus total bandwidth $B_{\text{total}}$ (MHz); curves: Instantaneous BA, Statistical BA, UA, and UA-wo-PA for M = 1 and M = 2, and Ideal FEEL.]

In Fig. 7, the influence of the relay number on the test accuracy of the several BA schemes is studied, where the relay number varies from 1 to 5, $B_{\text{total}} = 60$ MHz, and $\gamma_{th} = 1.2$ s. We can observe from this figure that all the bandwidth allocation schemes improve with a larger $M$, as spatial diversity and better transmission connections can be provided for model uploading. Moreover, the proposed instantaneous and statistical BA schemes are superior to the other bandwidth allocation schemes, including the UA and UA-wo-PA schemes. In particular, when there are four relays in the network, the proposed instantaneous and statistical BA schemes can achieve a better test accuracy, at least 5.1% and 3.8% higher than that of the UA and UA-wo-PA schemes. These results indicate that the proposed instantaneous and statistical BA schemes can efficiently exploit multiple relays and improve the performance of the relay-assisted FEEL.

Fig. 7. Test accuracy of the several BA schemes versus $M$. [Plot: test accuracy versus relay number $M$; curves: Instantaneous BA, Statistical BA, UA, UA-wo-PA, Ideal FEEL.]

VI. CONCLUSION

In this article, a relay-assisted FEEL system was studied under latency and bandwidth constraints, where we evaluated the system performance by deriving analytical and asymptotic expressions of the system outage probability and the convergence analysis. In order to improve the system performance, we optimized the relay-assisted FEEL network through allocating the wireless bandwidth among users and relays. Specifically, we proposed two bandwidth allocation schemes to maximize the number of successfully participating users in each round's federated learning. Finally, simulations were conducted to verify the instantaneous and statistical bandwidth allocation schemes. The simulation results showed that the proposed instantaneous and statistical BA schemes could outperform the conventional UA and UA-wo-PA schemes, and achieve almost the same performance as the conventional federated learning without latency and bandwidth constraints. In future work, we will study federated learning with multiuser interference for the considered system, where the framework of performance analysis and system optimization proposed in this paper will be applied.

APPENDIX A
PROOF OF THEOREM 1

To prove Theorem 1, we substitute (23) into (22), and then the lower bound of $P_{\text{out},k}$ is

$$\begin{aligned}
P_{\text{out},k} &\ge \Pr\!\left[\frac{1}{|L|}\min\!\left(R_{k,m_k^*}^{I},\,R_{m_k^*}^{II}\right) < \frac{1}{\gamma_{th} - T_k^{\text{local}}}\right]\\
&= 1 - \Pr\!\left[\frac{1}{|L|}\min\!\left(R_{k,m_k^*}^{I},\,R_{m_k^*}^{II}\right) \ge \frac{1}{\gamma_{th} - T_k^{\text{local}}}\right]\\
&= 1 - \left[1 - \Pr\!\left(R_{k,m_k^*}^{I} < \frac{|L|}{\gamma_{th} - T_k^{\text{local}}}\right)\right] \times \left[1 - \Pr\!\left(R_{m_k^*}^{II} < \frac{|L|}{\gamma_{th} - T_k^{\text{local}}}\right)\right].
\end{aligned} \tag{A.1}$$

After some manipulations, we can further have

$$\begin{aligned}
P_{\text{out},k} &\ge 1 - \left[1 - \Pr\!\left(|h_{k,m_k^*}|^2 < \frac{2^{\frac{|L|}{B_k^{I}(\gamma_{th} - T_k^{\text{local}})}} - 1}{\zeta_k}\right)\right] \times \left[1 - \Pr\!\left(|g_{m_k^*}|^2 < \frac{2^{\frac{|L|}{B_{m_k^*}^{II}(\gamma_{th} - T_k^{\text{local}})}} - 1}{\zeta_{m_k^*}}\right)\right]\\
&= 1 - \left[1 - \prod_{m=1}^{M}\Pr\!\left(|h_{k,m}|^2 < \frac{2^{\frac{|L|}{B_k^{I}(\gamma_{th} - T_k^{\text{local}})}} - 1}{\zeta_k}\right)\right] \times \left[1 - \Pr\!\left(|g_{m_k^*}|^2 < \frac{2^{\frac{|L|}{B_{m_k^*}^{II}(\gamma_{th} - T_k^{\text{local}})}} - 1}{\zeta_{m_k^*}}\right)\right].
\end{aligned} \tag{A.2}$$
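As a numerical illustration of the first-hop probability in (A.2), the following sketch compares the product of per-relay probabilities with a Monte Carlo estimate under Rayleigh fading, where each channel gain is exponentially distributed. It is a sketch under stated assumptions rather than the authors' simulation code: the function names are hypothetical, the selected relay is assumed to be the one with the largest first-hop gain (which is what the product form suggests), and the numbers in the usage lines are arbitrary.

```python
import math
import random


def exp_cdf(x, mean):
    """Pr(gain < x) for an exponentially distributed channel gain."""
    return 1.0 - math.exp(-x / mean) if x > 0 else 0.0


def gain_threshold(model_bits, bandwidth_hz, time_budget_s, snr_linear):
    """Smallest gain g such that bandwidth * log2(1 + snr * g) * time >= model_bits."""
    return (2.0 ** (model_bits / (bandwidth_hz * time_budget_s)) - 1.0) / snr_linear


def first_hop_outage_closed_form(threshold, mean_gains):
    """Product over relays: every candidate link falls below the threshold."""
    p = 1.0
    for lam in mean_gains:
        p *= exp_cdf(threshold, lam)
    return p


def first_hop_outage_monte_carlo(threshold, mean_gains, trials=100_000, seed=0):
    """Empirical probability that the best of the M links is below the threshold."""
    rng = random.Random(seed)
    fails = 0
    for _ in range(trials):
        # expovariate takes the rate, i.e. 1 / mean.
        best = max(rng.expovariate(1.0 / lam) for lam in mean_gains)
        if best < threshold:
            fails += 1
    return fails / trials


# Arbitrary illustrative numbers (assumptions, not the paper's settings).
thr = gain_threshold(model_bits=1e6, bandwidth_hz=1e6, time_budget_s=1.0, snr_linear=10.0)
closed = first_hop_outage_closed_form(thr, mean_gains=[0.5, 1.0])
simulated = first_hop_outage_monte_carlo(thr, mean_gains=[0.5, 1.0])
```

With more relays in `mean_gains`, the product shrinks, which matches the spatial diversity argument made for a larger $M$.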
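As $|h_{k,m}|^2$ and $|g_{m_k^*}|^2$ are exponentially distributed with $\mathbb{E}[|h_{k,m}|^2] = \lambda_{k,m}$ and $\mathbb{E}[|g_{m_k^*}|^2] = \lambda_{m_k^*}$, the analytical lower bound on $P_{\text{out},k}$ is written as

$$P_{\text{out},k} \ge 1 - \left[1 - \prod_{m=1}^{M}\left(1 - \exp\!\left(\frac{1 - \exp\!\left(\frac{|L|\ln 2}{B_k^{I}(\gamma_{th} - T_k^{\text{local}})}\right)}{\lambda_{k,m}\zeta_k}\right)\right)\right] \times \mathbb{E}\!\left[\exp\!\left(\frac{1 - \exp\!\left(\frac{|L|\ln 2}{B_{m_k^*}^{II}(\gamma_{th} - T_k^{\text{local}})}\right)}{\lambda_{m_k^*}\zeta_{m_k^*}}\right)\right]. \tag{A.3}$$

We can observe from (A.3) that $\exp\!\left(\left(1 - \exp\!\left(\frac{|L|\ln 2}{B_{m_k^*}^{II}(\gamma_{th} - T_k^{\text{local}})}\right)\right)\Big/\lambda_{m_k^*}\zeta_{m_k^*}\right)$ is concave for a positive $B_{m_k^*}^{II}$. By using Jensen's inequality, we have

$$P_{\text{out},k} \ge 1 - \left[1 - \prod_{m=1}^{M}\left(1 - \exp\!\left(\frac{1 - \exp\!\left(\frac{|L|\ln 2}{B_k^{I}(\gamma_{th} - T_k^{\text{local}})}\right)}{\lambda_{k,m}\zeta_k}\right)\right)\right] \times \exp\!\left(\frac{1 - \exp\!\left(\frac{|L|\ln 2}{A_k^{II}(\gamma_{th} - T_k^{\text{local}})}\right)}{\lambda_{m_k^*}\zeta_{m_k^*}}\right), \tag{A.4}$$

where $A_k^{II}$ is the expected bandwidth of the relay selected by user $U_k$, given in (25). In this way, we have proven Theorem 1.

APPENDIX B
PROOF OF THEOREM 3

Before proving Theorem 3, we first present some notations and lemmas to facilitate the proof. Specifically, we define $\bar{w}^{t} = \sum_{k=1}^{N} p_k w_k^{t}$ and $\bar{v}^{t} = \sum_{k=1}^{N} p_k v_k^{t}$, where $p_k = \frac{|D_k|}{|D|}$. Then, we present two preliminary lemmas used for the proof.

Lemma 1. Using (7) and taking the system outage into account, we can write the aggregated global model at time $t$ as

$$w^{t+1} = w^{t+1-e} + \sum_{k=1}^{N}\frac{I(k\in\mathcal{K},\,T_k^{\text{total}} \le \gamma_{th})\,|D_k|}{\sum_{k\in\mathcal{K}} I(T_k^{\text{total}} \le \gamma_{th})\,|D_k|}\left(v_k^{t+1} - w^{t+1-e}\right). \tag{B.1}$$

Lemma 2. The two-stage aggregation in the relay-assisted FEEL is unbiased, i.e., $\mathbb{E}[\bar{w}^{t}] = \bar{v}^{t}$.

Proof: Using (B.1), we have

$$\begin{aligned}
\mathbb{E}[\bar{w}^{t}] &= \mathbb{E}\!\left[w^{t-e} + \sum_{k=1}^{N}\frac{I(k\in\mathcal{K},\,T_k^{\text{total}} \le \gamma_{th})\,|D_k|}{\sum_{k\in\mathcal{K}} I(T_k^{\text{total}} \le \gamma_{th})\,|D_k|}\left(v_k^{t} - w^{t-e}\right)\right]\\
&= w^{t-e} + \sum_{k=1}^{N}\frac{|D_k|}{\frac{K}{N}(1 - P_{\text{out}})\,|D|}\,\mathbb{E}\!\left[I(k\in\mathcal{K},\,T_k^{\text{total}} \le \gamma_{th})\right]\left(v_k^{t} - w^{t-e}\right)\\
&= w^{t-e} + \sum_{k=1}^{N} p_k\left(v_k^{t} - w^{t-e}\right)\\
&= \sum_{k=1}^{N} p_k v_k^{t} = \bar{v}^{t}.
\end{aligned} \tag{B.2}$$

Then, we proceed with the proof of Theorem 3 by looking into Assumption 2 of Sec. III-C, where we have

$$\mathbb{E}[F(w^{T}) - F^*] \le \frac{L}{2}\,\mathbb{E}\!\left[\|\bar{w}^{t+1} - w^*\|^2\right]. \tag{B.3}$$

Thus, we only need to bound $\mathbb{E}\!\left[\|\bar{w}^{t+1} - w^*\|^2\right]$ for the proof, which can be further written as

$$\begin{aligned}
\mathbb{E}\!\left[\|\bar{w}^{t+1} - w^*\|^2\right] &= \mathbb{E}\!\left[\|\bar{w}^{t+1} - \bar{v}^{t+1} + \bar{v}^{t+1} - w^*\|^2\right]\\
&= \underbrace{\mathbb{E}\!\left[\|\bar{w}^{t+1} - \bar{v}^{t+1}\|^2\right]}_{Q_1} + \underbrace{\mathbb{E}\!\left[\|\bar{v}^{t+1} - w^*\|^2\right]}_{Q_2} + \underbrace{2\,\mathbb{E}\!\left[\left(\bar{w}^{t+1} - \bar{v}^{t+1}\right)^T\left(\bar{v}^{t+1} - w^*\right)\right]}_{Q_3}.
\end{aligned} \tag{B.4}$$

Next, we bound $\mathbb{E}\!\left[\|\bar{w}^{t+1} - w^*\|^2\right]$ part by part. Specifically, for the first part $Q_1$, we have

$$\begin{aligned}
Q_1 &= \mathbb{E}\!\left[\|\bar{w}^{t+1} - \bar{v}^{t+1}\|^2\right]\\
&= \mathbb{E}\!\left[\left\|w^{t+1-e} + \sum_{k=1}^{N}\frac{I(k\in\mathcal{K},\,T_k^{\text{total}} \le \gamma_{th})\,|D_k|}{\sum_{k\in\mathcal{K}} I(T_k^{\text{total}} \le \gamma_{th})\,|D_k|}\left(v_k^{t+1} - w^{t+1-e}\right) - \sum_{k=1}^{N} p_k v_k^{t+1}\right\|^2\right]\\
&= \mathbb{E}\!\left[\left\|\sum_{k=1}^{N} p_k\frac{I(k\in\mathcal{K},\,T_k^{\text{total}} \le \gamma_{th})}{\frac{K}{N}(1 - P_{\text{out}})}\left(v_k^{t+1} - w^{t+1-e}\right) - \sum_{k=1}^{N} p_k\left(v_k^{t+1} - w^{t+1-e}\right)\right\|^2\right]\\
&= \mathbb{E}\!\left[\left\|\sum_{k=1}^{N} p_k\left(\frac{N\cdot I(k\in\mathcal{K},\,T_k^{\text{total}} \le \gamma_{th})}{K(1 - P_{\text{out}})} - 1\right)\left(v_k^{t+1} - w^{t+1-e}\right)\right\|^2\right].
\end{aligned} \tag{B.5}$$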
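The unbiasedness used in this appendix rests on the participation indicator having mean $K(1 - P_{\text{out}})/N$, so that scaling each surviving update by $N/(K(1 - P_{\text{out}}))$ recovers the weighted average in expectation. The sketch below checks this numerically. It is only an illustration under assumptions that are not stated in this form in the paper: users are selected uniformly without replacement, outages are independent with a common probability, and the updates are scalars; all function and variable names are hypothetical.

```python
import random


def aggregate_once(updates, weights, k_selected, p_out, rng):
    """One round: pick K of N users, drop each with probability p_out,
    and sum the surviving weighted updates scaled by N / (K (1 - p_out))."""
    n = len(updates)
    chosen = rng.sample(range(n), k_selected)
    scale = n / (k_selected * (1.0 - p_out))
    total = 0.0
    for k in chosen:
        if rng.random() >= p_out:  # user k meets the latency threshold
            total += weights[k] * scale * updates[k]
    return total


def mean_aggregate(updates, weights, k_selected, p_out, trials=50_000, seed=0):
    """Monte Carlo estimate of the expected aggregate."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(trials):
        acc += aggregate_once(updates, weights, k_selected, p_out, rng)
    return acc / trials


# Toy setting (assumed values): N = 20 users, K = 5 selected, P_out = 0.2.
updates = [float(k) for k in range(1, 21)]      # stands in for v_k - w
sizes = [200 + 10 * k for k in range(20)]       # stands in for |D_k|
weights = [s / sum(sizes) for s in sizes]       # p_k = |D_k| / |D|
target = sum(w * u for w, u in zip(weights, updates))
estimate = mean_aggregate(updates, weights, k_selected=5, p_out=0.2)
```

Here `estimate` should lie close to `target`, while any single round can deviate substantially; that per-round deviation is what the term $Q_1$ quantifies.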
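Using the convexity of the second-order norm, we further have

$$\begin{aligned}
Q_1 &\le \mathbb{E}\!\left[\sum_{k=1}^{N} p_k\left\|\left(\frac{N\cdot I(k\in\mathcal{K},\,T_k^{\text{total}} \le \gamma_{th})}{K(1 - P_{\text{out}})} - 1\right)\left(v_k^{t+1} - w^{t+1-e}\right)\right\|^2\right]\\
&= \sum_{k=1}^{N} p_k\,\frac{N - K(1 - P_{\text{out}})}{K(1 - P_{\text{out}})}\cdot\mathbb{E}\!\left[\|v_k^{t+1} - w^{t+1-e}\|^2\right].
\end{aligned} \tag{B.6}$$

We can write $\mathbb{E}\!\left[\|v_k^{t+1} - w^{t+1-e}\|^2\right]$ in (B.6) as

$$\begin{aligned}
\mathbb{E}\!\left[\|v_k^{t+1} - w^{t+1-e}\|^2\right] &= \mathbb{E}\!\left[\left\|\sum_{i=t+1-e}^{t}\eta_i\nabla F_k(w_k^{i};\xi_k^{i})\right\|^2\right]\\
&\le \mathbb{E}\!\left[e\cdot\sum_{i=t+1-e}^{t}\left\|\eta_i\nabla F_k(w_k^{i};\xi_k^{i})\right\|^2\right]\\
&\le \mathbb{E}\!\left[\eta_{t+1-e}^2\,e\cdot\sum_{i=t+1-e}^{t}\left\|\nabla F_k(w_k^{i};\xi_k^{i})\right\|^2\right]\\
&\le \eta_{t+1-e}^2\,e^2G^2 \le 4\eta_{t+1}^2\,e^2G^2 \le 4\eta_t^2\,e^2G^2,
\end{aligned} \tag{B.7}$$

where we use the Cauchy-Schwarz inequality for the first inequality, and we assume that $\eta_t$ is non-increasing with respect to $t$ and $\eta_t \le 2\eta_{t+e}$ to derive the other inequalities. Then, we can bound the first part $Q_1$ as

$$Q_1 = \mathbb{E}\!\left[\|\bar{w}^{t+1} - \bar{v}^{t+1}\|^2\right] \le \sum_{k=1}^{N} p_k\,\frac{N - K(1 - P_{\text{out}})}{K(1 - P_{\text{out}})}\cdot 4\eta_t^2\,e^2G^2. \tag{B.8}$$

For the second part $Q_2$, its bound can be found in [43], and it still holds for this paper. Thus, according to [43], for any round $t$, if we choose $\psi = \max\{8L/\mu,\ e\}$ and $\eta_t = \frac{2}{\mu(\psi+t)}$, the second part $Q_2$ is bounded as

$$Q_2 = \mathbb{E}\!\left[\|\bar{v}^{t+1} - w^*\|^2\right] \le (1 - \mu\eta_t)\,\mathbb{E}\!\left[\|\bar{w}^{t} - w^*\|^2\right] + \eta_t^2\left(\sum_{k=1}^{N} p_k^2\delta_k^2 + 6L\Gamma + 8(e-1)^2G^2\right). \tag{B.9}$$

For the third part $Q_3$, we can derive from Lemma 2 that $Q_3$ equals 0 due to the unbiasedness of $\bar{w}^{t+1}$. By summarizing the above three parts, we have that, for any round $t$, $\mathbb{E}\!\left[\|\bar{w}^{t+1} - w^*\|^2\right]$ is bounded as

$$\mathbb{E}\!\left[\|\bar{w}^{t+1} - w^*\|^2\right] \le (1 - \mu\eta_t)\,\mathbb{E}\!\left[\|\bar{w}^{t} - w^*\|^2\right] + \eta_t^2\left(\sum_{k=1}^{N} p_k^2\delta_k^2 + 6L\Gamma + 8(e-1)^2G^2 + 4e^2G^2H\right), \tag{B.10}$$

in which $H = \sum_{k=1}^{N} p_k\frac{N - K(1 - P_{\text{out}})}{K(1 - P_{\text{out}})}$. For brevity, we rewrite (B.10) as

$$\Delta_{t+1} \le (1 - \mu\eta_t)\Delta_t + \eta_t^2 C, \tag{B.11}$$

where $\Delta_{t+1} = \mathbb{E}\!\left[\|\bar{w}^{t+1} - w^*\|^2\right]$ and

$$C = \sum_{k=1}^{N} p_k^2\delta_k^2 + 6L\Gamma + 8(e-1)^2G^2 + 4e^2G^2H. \tag{B.12}$$

Then, we use the recurrence method to prove that $\Delta_t \le \frac{v}{\psi+t}$, where $v = \max\left\{\psi\Delta_0,\ \frac{\beta^2C}{\mu\beta - 1}\right\}$ and $\eta_t = \frac{\beta}{\psi+t}$ with $\beta = 2/\mu$. First, for $t = 0$, we have $\Delta_0 \le \frac{v}{\psi+0}$, since $v \ge \psi\Delta_0$. For $t > 0$, assuming that the bound holds at round $t$, we have

$$\begin{aligned}
\Delta_{t+1} &\le (1 - \mu\eta_t)\Delta_t + \eta_t^2C\\
&\le \left(1 - \frac{\mu\beta}{t+\psi}\right)\frac{v}{t+\psi} + \frac{\beta^2C}{(t+\psi)^2}\\
&= \frac{t+\psi-1}{(t+\psi)^2}\,v + \left[\frac{\beta^2C}{(t+\psi)^2} - \frac{\mu\beta - 1}{(t+\psi)^2}\,v\right]\\
&\le \frac{v}{t+\psi+1}.
\end{aligned} \tag{B.13}$$

Therefore, we have

$$\begin{aligned}
\mathbb{E}[F(w^{T}) - F^*] &\le \frac{L}{2}\,\mathbb{E}\!\left[\|\bar{w}^{t+1} - w^*\|^2\right]\\
&\le \frac{L}{\mu(\psi+T)}\left[\frac{2}{\mu}\left(\sum_{k=1}^{N} p_k^2\delta_k^2 + 6L\Gamma + 8(e-1)^2G^2 + 4e^2G^2H\right) + \frac{\mu\psi}{2}\,\|w^0 - w^*\|^2\right].
\end{aligned} \tag{B.14}$$

In this way, we have proven Theorem 3.

APPENDIX C
PROOF OF THEOREM 4

To prove Theorem 4, we start from $\alpha_{k,m} > 0$ and $B_m^{II} > 0$ to have

$$\frac{dR_{k,m}^{I}}{d\alpha_{k,m}} = \frac{d}{d\alpha_{k,m}}\left[\alpha_{k,m}B_m^{II}\log_2\!\left(1 + \frac{P_k|h_{k,m}|^2}{\sigma^2}\right)\right] = B_m^{II}\log_2\!\left(1 + \frac{P_k|h_{k,m}|^2}{\sigma^2}\right) > 0, \tag{C.1}$$

and

$$\frac{dR_{k,m}^{I}}{dB_m^{II}} = \frac{d}{dB_m^{II}}\left[\alpha_{k,m}B_m^{II}\log_2\!\left(1 + \frac{P_k|h_{k,m}|^2}{\sigma^2}\right)\right] = \alpha_{k,m}\log_2\!\left(1 + \frac{P_k|h_{k,m}|^2}{\sigma^2}\right) > 0. \tag{C.2}$$

We can see from (C.1) and (C.2) that $R_{k,m}^{I}$ monotonically increases with $\alpha_{k,m}$ and $B_m^{II}$. Therefore, for $\alpha_{k,m} > 0$ and $B_m^{II} > 0$, we have that $T_{k,m}^{I}$ and $T_m^{II}$ monotonically decrease with $\alpha_{k,m}$ and $B_m^{II}$. Moreover, the system latency is determined by the slowest user. Therefore, we can achieve the optimal solution of P2 if and only if: 1) all of the bandwidth $B_m^{II}$ is allocated (i.e., $\sum_{k\in J_m}\alpha_{k,m} = 1$) and 2) all selected users have the same total latency of $\gamma_{th}$ (i.e.,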
$T_k^{\text{total}} = T_k^{\text{local}} + T_{k,m}^{I} + T_m^{II} = \gamma_{th}$). So the optimal solution can be given by

$$\begin{cases}
T_k^{\text{local}} + \dfrac{|L|}{\alpha_{k,m}^{*}B_m^{II*}r_{k,m}^{I}} + \dfrac{|L|}{B_m^{II*}r_m^{II}} = \gamma_{th},\\[2mm]
\sum_{U_k\in J_m}\alpha_{k,m}^{*} = 1,\\[1mm]
0 \le \alpha_{k,m}^{*} \le 1,\\[1mm]
B_m^{II*} \ge 0.
\end{cases} \tag{C.3}$$

Because of the monotonicity and non-trivial value of $\alpha_{k,m}^{*}$ and $B_m^{II*}$, there is one and only one solution to (C.3), which finishes the proof of Theorem 4.

REFERENCES

[1] W. Saad, M. Bennis, and M. Chen, “A vision of 6G wireless systems: Applications, trends, technologies, and open research problems,” IEEE Netw., vol. 34, no. 3, pp. 134–142, 2020.
[2] S. Deng, H. Zhao, W. Fang, J. Yin, S. Dustdar, and A. Y. Zomaya, “Edge intelligence: The confluence of edge computing and artificial intelligence,” IEEE Internet Things J., vol. 7, no. 8, pp. 7457–7469, 2020.
[3] X. Wang, Y. Han, C. Wang, Q. Zhao, X. Chen, and M. Chen, “In-edge AI: Intelligentizing mobile edge computing, caching and communication by federated learning,” IEEE Netw., vol. 33, no. 5, pp. 156–165, 2019.
[4] W. Zhou, J. Xia, and F. Zhou, “Profit maximization for cache-enabled vehicular mobile edge computing networks,” to appear in IEEE Trans. Veh. Technol., pp. 1–6, 2023.
[5] B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y Arcas, “Communication-efficient learning of deep networks from decentralized data,” in Proc. AISTATS, vol. 54, 2017, pp. 1273–1282.
[6] S. Wang, T. Tuor, T. Salonidis, K. K. Leung, C. Makaya, T. He, and K. Chan, “Adaptive federated learning in resource constrained edge computing systems,” IEEE J. Sel. Areas Commun., vol. 37, no. 6, pp. 1205–1221, 2019.
[7] W. Shi, S. Zhou, Z. Niu, M. Jiang, and L. Geng, “Joint device scheduling and resource allocation for latency constrained wireless federated learning,” IEEE Trans. Wirel. Commun., vol. 20, no. 1, pp. 453–467, 2021.
[8] A. Hammoud, H. Otrok, A. Mourad, and Z. Dziong, “On demand fog federations for horizontal federated learning in iov,” IEEE Trans. Netw. Serv. Manag., vol. 19, no. 3, pp. 3062–3075, 2022.
[9] Z. Zhao, C. Feng, W. Hong, J. Jiang, C. Jia, T. Q. S. Quek, and M. Peng, “Federated learning with non-iid data in wireless networks,” IEEE Trans. Wirel. Commun., vol. 21, no. 3, pp. 1927–1942, 2022.
[10] Y. Zhan, J. Zhang, Z. Hong, L. Wu, P. Li, and S. Guo, “A survey of incentive mechanism design for federated learning,” IEEE Trans. Emerg. Top. Comput., vol. 10, no. 2, pp. 1035–1044, 2022.
[11] M. Wazzeh, H. Ould-Slimane, C. Talhi, A. Mourad, and M. Guizani, “Privacy-preserving continuous authentication for mobile and iot systems using warmup-based federated learning,” IEEE Netw., pp. 1–7, 2022.
[12] G. Zhu, D. Liu, Y. Du, C. You, J. Zhang, and K. Huang, “Toward an intelligent edge: Wireless communication meets machine learning,” IEEE Commun. Mag., vol. 58, no. 1, pp. 19–25, 2020.
[13] W. Zhou, L. Fan, F. Zhou, F. Li, X. Lei, W. Xu, and A. Nallanathan, “Priority-aware resource scheduling for UAV-mounted mobile edge computing networks,” to appear in IEEE Trans. Veh. Technol., pp. 1–6, 2023.
[18] J. Ren, Y. He, D. Wen, G. Yu, K. Huang, and D. Guo, “Scheduling for cellular federated edge learning with importance and channel awareness,” IEEE Trans. Wirel. Commun., vol. 19, no. 11, pp. 7690–7703, 2020.
[19] X. Cao, G. Zhu, J. Xu, Z. Wang, and S. Cui, “Optimized power control design for over-the-air federated edge learning,” IEEE J. Sel. Areas Commun., vol. 40, no. 1, pp. 342–358, 2022.
[20] H. Sun, X. Ma, and R. Q. Hu, “Adaptive federated learning with gradient compression in uplink NOMA,” IEEE Trans. Veh. Technol., vol. 69, no. 12, pp. 16325–16329, 2020.
[21] H. Yang, J. Zhao, Z. Xiong, K. Lam, S. Sun, and L. Xiao, “Privacy-preserving federated learning for UAV-enabled networks: Learning-based joint scheduling and resource management,” IEEE J. Sel. Areas Commun., vol. 39, no. 10, pp. 3144–3159, 2021.
[22] S. Zheng, C. Shen, and X. Chen, “Design and analysis of uplink and downlink communications for federated learning,” IEEE J. Sel. Areas Commun., vol. 39, no. 7, pp. 2150–2167, 2021.
[23] M. Salehi and E. Hossain, “Federated learning in unreliable and resource-constrained cellular wireless networks,” IEEE Trans. Commun., vol. 69, no. 8, pp. 5136–5151, 2021.
[24] J. Xu and H. Wang, “Client selection and bandwidth allocation in wireless federated learning networks: A long-term perspective,” IEEE Trans. Wirel. Commun., vol. 20, no. 2, pp. 1188–1200, 2021.
[25] S. Tang, L. Chen, K. He, J. Xia, L. Fan, and A. Nallanathan, “Computational intelligence and deep learning for next-generation edge-enabled industrial IoT,” IEEE Trans. Netw. Sci. Eng., vol. PP, no. 99, pp. 1–12, 2023.
[26] Z. Zhao, J. Xia, L. Fan, X. Lei, G. K. Karagiannidis, and A. Nallanathan, “System optimization of federated learning networks with a constrained latency,” IEEE Trans. Veh. Technol., vol. 71, no. 1, pp. 1095–1100, 2022.
[27] Y. Wang, Y. Xu, Q. Shi, and T. Chang, “Quantized federated learning under transmission delay and outage constraints,” IEEE J. Sel. Areas Commun., vol. 40, no. 1, pp. 323–341, 2022.
[28] Q. Bie, Y. Liu, Y. Wang, X. Zhao, and X. Y. Zhang, “Deployment optimization of reconfigurable intelligent surface for relay systems,” IEEE Trans. Green Commun. Netw., vol. 6, no. 1, pp. 221–233, 2022.
[29] J. Xia, L. Fan, W. Xu, X. Lei, X. Chen, G. K. Karagiannidis, and A. Nallanathan, “Secure cache-aided multi-relay networks in the presence of multiple eavesdroppers,” IEEE Trans. Commun., vol. 67, no. 11, pp. 7672–7685, 2019.
[30] X. Li, R. Fan, H. Hu, N. Zhang, X. Chen, and A. Meng, “Energy-efficient resource allocation for mobile edge computing with multiple relays,” IEEE Internet Things J., vol. 9, no. 13, pp. 10732–10750, 2022.
[31] S. Feng, D. Niyato, P. Wang, D. I. Kim, and Y.-C. Liang, “Joint service pricing and cooperative relay communication for federated learning,” in 2019 International Conference on Internet of Things (iThings), 2019, pp. 815–820.
[32] Z. Lin, H. Liu, and Y. A. Zhang, “Relay-assisted cooperative federated learning,” IEEE Trans. Wirel. Commun., vol. 21, no. 9, pp. 7148–7164, 2022.
[33] Z. Qu, S. Guo, H. Wang, B. Ye, Y. Wang, A. Y. Zomaya, and B. Tang, “Partial synchronization to accelerate federated learning over relay-assisted edge networks,” IEEE Trans. Mob. Comput., vol. 21, no. 12, pp. 4502–4516, 2022.
[34] S. Hosseinalipour, S. Wang, N. Michelusi, V. Aggarwal, C. G. Brinton, D. J. Love, and M. Chiang, “Parallel successive learning for dynamic distributed model training over heterogeneous wireless networks,” CoRR, vol. abs/2202.02947, 2022.
[35] C. Shen, J. Xu, S. Zheng, and X. Chen, “Resource rationing for wireless federated learning: Concept, benefits, and challenges,” IEEE Commun. Mag., vol. 59, no. 5, pp. 82–87, 2021.
[36] H. Yang, M. Fang, and J. Liu, “Achieving linear speedup with partial worker participation in non-iid federated learning,” in 9th International Conference on Learning Representations, ICLR 2021, 2021.
[14] L. Xiao, Y. Ding, J. Huang, S. Liu, Y. Tang, and H. Dai, “UAV anti-
jamming video transmissions with QoE guarantee: A reinforcement [37] S. P. Karimireddy, S. Kale, M. Mohri, S. J. Reddi, S. U. Stich, and
learning-based approach,” IEEE Trans. Commun., vol. 69, no. 9, pp. A. T. Suresh, “SCAFFOLD: Stochastic controlled averaging for on-
5933–5947, 2021. device federated learning,” CoRR, vol. abs/1910.06378, 2019.
[15] L. Xiao, X. Lu, T. Xu, X. Wan, W. Ji, and Y. Zhang, “Reinforcement [38] J. Wang, Q. Liu, H. Liang, G. Joshi, and H. V. Poor, “Tackling the ob-
learning-based mobile offloading for edge computing against jamming jective inconsistency problem in heterogeneous federated optimization,”
and interference,” IEEE Trans. Commun., vol. 68, no. 10, pp. 6114– in NeurIPS 2020, 2020.
6126, 2020. [39] J. Tong and C. Zhong, “Full-duplex two-way AF relaying systems with
[16] R. Yu and P. Li, “Toward resource-efficient federated learning in mobile imperfect interference cancellation in Nakagami-m fading channels,” Sci.
edge computing,” IEEE Netw., vol. 35, no. 1, pp. 148–155, 2021. China Inf. Sci., vol. 64, no. 8, 2021.
[17] X. Huang, P. Li, R. Yu, Y. Wu, K. Xie, and S. Xie, “Fedparking: A [40] O. Waqar, H. Tabassum, and R. Adve, “Secure beamforming and
federated learning based parking space estimation with parked vehicle ergodic secrecy rate analysis for amplify-and-forward relay networks
assisted edge computing,” IEEE Trans. Veh. Technol., vol. 70, no. 9, pp. with wireless powered jammer,” IEEE Trans. Veh. Technol., vol. 70,
9355–9368, 2021. no. 4, pp. 3908–3913, 2021.
15

Lunyuan Chen received the bachelor’s degree in communication engineering from Xidian University in 2019. He is currently pursuing the master’s degree with the School of Electronics and Communication Engineering, Guangzhou University. His current research interests focus on statistical machine learning and deep learning.

Lisheng Fan received the bachelor’s and master’s degrees from Fudan University and Tsinghua University, China, in 2002 and 2005, respectively, both from the Department of Electronic Engineering. He received the Ph.D. degree from the Department of Communications and Integrated Systems of Tokyo Institute of Technology, Japan, in 2008. He is now a Professor with Guangzhou University. His research interests span the areas of wireless cooperative communications, physical-layer secure communications, interference modeling, and system performance evaluation. Lisheng Fan has published many papers in international journals such as IEEE Transactions on Wireless Communications, IEEE Transactions on Communications, and IEEE Transactions on Information Theory, as well as papers in conferences such as IEEE ICC, IEEE Globecom, and IEEE WCNC. He is a guest editor of EURASIP Journal on Wireless Communications and Networking, and served as the chair of the Wireless Communications and Networking Symposium for Chinacom 2014. He has also served as a member of Technical Program Committees for IEEE conferences such as Globecom, ICC, WCNC, and VTC.

Xianfu Lei received his Ph.D. from Southwest Jiaotong University in 2012. He has been an Associate Professor with the School of Information Science and Technology at Southwest Jiaotong University since 2015. From 2012 to 2014, he worked as a research fellow in the Department of Electrical and Computer Engineering at Utah State University. Dr. Lei’s research interests are 5G/6G networks, cooperative and energy harvesting networks, and physical-layer security. He has been serving as an Area Editor for IEEE Communications Letters and an Associate Editor for IEEE Wireless Communications Letters and IEEE Transactions on Communications. He served as Senior/Associate Editor for IEEE Communications Letters from 2014 to 2019. He received the best paper award in IEEE/CIC ICCC2020, the best paper award in WCSP2018, the WCSP 10-Year Anniversary Excellent Paper Award, IEEE Communications Letters Exemplary Editor 2019, and the Natural Science Award of China Institute of Communications (2019).

Trung Q. Duong (Fellow, IEEE) received his B.Eng. degree in electrical and electronics engineering from Bach Khoa Sai Gon (Vietnam) in 2002, the M.Sc. degree in computer science from Kyung Hee University (South Korea) in 2005, and the Ph.D. degree in telecommunications systems from Blekinge Institute of Technology (Sweden) in 2012. In 2013, he joined Queen’s University Belfast (U.K.) as an academic staff member, where he is now a Chair Professor in Telecommunications. He also holds a prestigious Research Chair of the Royal Academy of Engineering. His current research interests include quantum communications, wireless communications, signal processing, machine learning, and realtime optimisation.
Dr. Duong has served as an Editor/Guest Editor for the IEEE Transactions on Wireless Communications, IEEE Transactions on Communications, IEEE Transactions on Vehicular Technology, IEEE Communications Letters, IEEE Wireless Communications Letters, IEEE Wireless Communications, IEEE Communications Magazine, and IEEE Journal on Selected Areas in Communications. Currently, he is serving as an Executive Editor for IEEE Communications Letters. He received the Best Paper Award at the IEEE VTC-Spring 2013, IEEE ICC 2014, IEEE GLOBECOM 2016, 2019, 2022, IEEE DSP 2017, and IWCMC 2019. He is the recipient of the prestigious Royal Academy of Engineering Research Fellowship (2015-2020) and has won the prestigious Newton Prize 2017.

Arumugam Nallanathan (S’97-M’00-SM’05-F’17) has been Professor of Wireless Communications and Head of the Communication Systems Research (CSR) group in the School of Electronic Engineering and Computer Science at Queen Mary University of London since September 2017. He was with the Department of Informatics at King’s College London from December 2007 to August 2017, where he was Professor of Wireless Communications from April 2013 to August 2017 and a Visiting Professor from September 2017. He was an Assistant Professor in the Department of Electrical and Computer Engineering, National University of Singapore, from August 2000 to December 2007. His research interests include 5G Wireless Networks, Internet of Things (IoT), and Molecular Communications. He has published nearly 500 technical papers in scientific journals and international conferences. He is a co-recipient of the Best Paper Awards presented at the IEEE International Conference on Communications 2016 (ICC’2016), IEEE Global Communications Conference 2017 (GLOBECOM’2017), and IEEE Vehicular Technology Conference 2018 (VTC’2018). He is an IEEE Distinguished Lecturer. He was selected as a Web of Science Highly Cited Researcher in 2016.
He is an Editor for IEEE Transactions on Communications. He was an Editor for IEEE Transactions on Wireless Communications (2006-2011), IEEE Transactions on Vehicular Technology (2006-2017), IEEE Wireless Communications Letters, and IEEE Signal Processing Letters. He served as the Chair for the Signal Processing and Communication Electronics Technical Committee of IEEE Communications Society and as Technical Program Chair and member of Technical Program Committees in numerous IEEE conferences. He received the IEEE Communications Society SPCE outstanding service award 2012 and the IEEE Communications Society RCC outstanding service award 2014.

George K. Karagiannidis (M’96-SM’03-F’14) was born in Pithagorion, Samos Island, Greece. He received the University Diploma (5 years) and Ph.D. degree, both in electrical and computer engineering, from the University of Patras, in 1987 and 1999, respectively. From 2000 to 2004, he was a Senior Researcher at the Institute for Space Applications and Remote Sensing, National Observatory of Athens, Greece. In June 2004, he joined the faculty of Aristotle University of Thessaloniki, Greece, where he is currently Professor in the Electrical & Computer Engineering Dept. and Head of the Wireless Communications & Information Processing (WCIP) Group. He is also Honorary Professor at South West Jiaotong University, Chengdu, China.
His research interests are in the broad area of Digital Communications Systems and Signal Processing, with emphasis on Wireless Communications, Optical Wireless Communications, Wireless Power Transfer and Applications, and Communications & Signal Processing for Biomedical Engineering.
Dr. Karagiannidis has been involved as General Chair, Technical Program Chair, and member of Technical Program Committees in several IEEE and non-IEEE conferences. In the past, he was Editor in several IEEE journals, and from 2012 to 2015 he was the Editor-in-Chief of IEEE Communications Letters. Currently, he serves as Associate Editor-in-Chief of IEEE Open Journal of Communications Society.
Dr. Karagiannidis is one of the highly cited authors across all areas of Electrical Engineering, recognized by Clarivate Analytics as a Web of Science Highly Cited Researcher in the six consecutive years 2015-2020.
